JPH07509795A

JPH07509795A - Advanced large-scale parallel computer

Info

Publication number: JPH07509795A
Application number: JP6505313A
Authority: JP
Inventors: Chin, Danny; Peters, Joseph Edward Jr.; Taylor, Herbert Hudson Jr.
Original assignee: David Sarnoff Research Center, Inc.
Priority date: 1992-08-05
Filing date: 1993-07-14
Publication date: 1995-10-26
Also published as: KR100327712B1; EP0654158A4; KR950703177A; EP0654158A1; WO1994003852A1

Abstract

(57) [Summary] This publication contains application data from before electronic filing, so no abstract data is recorded.

Description

[Detailed description of the invention] Advanced large-scale parallel computer

This invention was made with government support under contract No. MDA-972-90-C-0022. The government has certain rights in this invention.

This invention relates to massively parallel computers, and in particular to massively parallel computers capable of multi-user time-shared operation.

Background of the invention

Supercomputers that perform large sequential operations and supercomputers that perform parallel operations are both known in the art. Massively parallel operation is preferred for applications that depend heavily on very large amounts of data computation and data communication that must be carried out in real time. Examples of such applications include weather modeling and medical imaging. Real-time analysis of the complex scenarios encountered in such applications involves very large data sets.

The prior-art Princeton Engine (PE) is a single-instruction multiple-data (SIMD) linear array processor. The linear array is scalable from 64 to 2048 processors in increments of 64 and, in its largest configuration, achieves a computation rate of 28,672 million instructions per second (MIPS) with a 14 MHz instruction clock. Each processor has local memory and can communicate with its neighbors over two bidirectional channels. Input and output data rates of 1.4 and 1.8 Gbps, respectively, are obtained. The host of the PE is an Apollo/Mentor Graphics workstation, and a high-resolution monitor is used to view the output results.
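As a quick sanity check, the peak rate of the largest configuration is simply the processor count multiplied by the instruction clock. The sketch below assumes one instruction per processor per clock; the function names are illustrative and do not come from the patent.

```python
# Peak-rate arithmetic for a SIMD linear array.
# Assumption: every processor retires one instruction per clock.

def peak_mips(processors, clock_mhz):
    """Aggregate peak rate in millions of instructions per second."""
    return processors * clock_mhz

def valid_configs(step=64, maximum=2048):
    """Array sizes reachable when scaling in fixed increments."""
    return list(range(step, maximum + 1, step))

if __name__ == "__main__":
    print(peak_mips(2048, 14))      # largest configuration at 14 MHz
    print(len(valid_configs()))     # number of possible array sizes
```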

Each processing element PE0 through PEn-1 of the PE contains seven independent internal 16-bit data paths, a 16-bit ALU, a 16-bit multiplier, a three-port register stack with 64 elements, a 16-bit communication port, and up to 640 Kbytes of external SRAM local memory. The register file has one address port for read-only access to the file and a second address port for reading or writing the file. The interprocessor communication bus (IPC) allows data to be exchanged between neighboring processors within one instruction cycle. In each instruction cycle, up to six simultaneous operations are possible (input or output via the I/O bus, simultaneous read and write at the register file, one ALU operation, and a local memory access).

Input data is stored in the local memories M0 through Mn-1 of the processors for each video scan line 0 through v-1, one pixel per processor. Thus, over a frame period, one pixel column of the video frame is stored in each local memory. The local memory is sufficient to store up to 640 columns of 8-bit pixels for frames of 1024 lines. Functional diagram 100 of Figure 1 shows how video frames are distributed across the local memories. Corresponding columns of video frames 0 through z-1 are stored in the same local memory. Temporal algorithms therefore require no communication between processors, that is, between the local memories of neighboring processors. Horizontal filters and statistical gathering operations require data communication between processors via the IPC 200.
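The column-per-processor layout can be sketched in a few lines. This is only an illustration under simplifying assumptions (tiny frames held as Python lists, hypothetical function names); it shows why a temporal operation such as a frame difference touches only one local memory at a time.

```python
# Sketch of the column-per-processor layout: processor p holds column p
# of every scan line. All names are illustrative, not from the patent.

def distribute(frame):
    """frame[line][col] -> memories[col][line]: one pixel column per processor."""
    n_procs = len(frame[0])
    return [[row[p] for row in frame] for p in range(n_procs)]

def temporal_diff(mem_a, mem_b):
    """Frame difference computed independently inside each local memory."""
    return [[b - a for a, b in zip(col_a, col_b)]
            for col_a, col_b in zip(mem_a, mem_b)]

if __name__ == "__main__":
    frame0 = [[1, 2, 3], [4, 5, 6]]   # 2 scan lines, 3 processors
    frame1 = [[2, 2, 5], [4, 9, 6]]
    m0, m1 = distribute(frame0), distribute(frame1)
    print(m0)
    print(temporal_diff(m0, m1))
```

A horizontal filter, by contrast, would need values from neighboring columns, which in this layout live in other processors' memories.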

The IPC can be set to one of four modes: normal, bypass, broadcast send, and broadcast receive. Normal communication takes place between neighbors in the linearly connected array. Data is loaded onto the IPC channel in one instruction and shifted left or right in the next instruction. This mode is very effective for nearest-neighbor computations.

In some cases it is desirable to perform neighborhood operations on a subgrid of the original array. This decimation can be accomplished without compacting the array elements into a smaller connected region. Instead, processors are bypassed, which creates new neighbor connections between the desired elements. Left and right shift operations then pass through the bypassed connection pattern.
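The difference between normal and bypass mode can be illustrated with a small simulation of one right shift. This is a sketch, not the hardware behavior: the edge handling (the leftmost active processor receives nothing) and the function name are assumptions.

```python
# One right shift on a linear array. Bypassed processors pass data
# through without taking part, so active processors become neighbors.

def ipc_shift_right(values, bypass=None):
    """values[i] is the word processor i placed on its IPC channel.

    Returns what each processor receives (None if nothing arrives).
    """
    n = len(values)
    bypass = bypass or [False] * n
    received = [None] * n
    active = [i for i in range(n) if not bypass[i]]
    for src, dst in zip(active, active[1:]):
        received[dst] = values[src]
    return received

if __name__ == "__main__":
    data = [10, 11, 12, 13]
    print(ipc_shift_right(data))                               # normal mode
    print(ipc_shift_right(data, [False, True, False, True]))   # every other PE bypassed
```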

In Figure 2, the PE is interfaced to analog and digital sources and destinations via controller 200. The input and output data channels to the parallel array are 48 bits and 64 bits wide, respectively. These channels run on a 28 MHz clock and interface to six analog-to-digital converters (ADCs) and seven digital-to-analog converters (DACs). The host computer has digital access to load or capture data on these buses for system or algorithm testing.
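Channel width multiplied by clock rate gives the raw bandwidth of each channel. The helper below is a minimal sketch of that arithmetic; it assumes one full-width word is transferred per clock, and its name is illustrative.

```python
# Raw channel bandwidth = width (bits) x clock (MHz), expressed in Gbps.
# Assumption: one full-width word moves on every clock edge.

def channel_gbps(width_bits, clock_mhz):
    return width_bits * clock_mhz / 1000.0

if __name__ == "__main__":
    print(channel_gbps(48, 28))   # input channel
    print(channel_gbps(64, 28))   # output channel
```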

Controller 200 also supplies user-selectable clocks to the ADCs and DACs. Three independent input clocks and four independent output clocks are possible. This allows several different data sources to be read, processed, displayed, and compared simultaneously. The output can be sent to various displays, for example a spectrum analyzer, or taken into embedded real-time system hardware in the background.

The output from parallel processor 202 is user-programmable through a special output multiport random access memory (RAM) structure 204 built into the bit-slice I/O ICs. Local memory accesses are reduced by this unique output structure. The output data stream can also be fed back to the input of the parallel array for additional processing. This feature enables real-time transposition, which is useful for radar processing (corner turns) and for fast rotation of large 3D data sets.

As the scale of advanced problems such as radar processing and television simulation grows, these problems run up against the maximum communication and computation rates of the prior-art massively parallel supercomputers, and the PE is not sufficient to solve them in real time.

Accordingly, there is a need for a larger massively parallel supercomputer with both the bandwidth and the computational performance (I/O bandwidth of up to 1200 Mbytes/sec and a peak computation rate of up to 9.6 Teraops/sec) required to solve such computationally demanding problems. Furthermore, sequential supercomputers are capable of time-shared multi-user operation, whereas prior-art massively parallel supercomputers are not.

Summary of the invention

A parallel computing system is described that consists of N blocks, each containing M processors. Each processor has an arithmetic logic unit (ALU), a local memory, and an input/output (I/O) interface. Each block also contains a controller, which is connected to each of the M processors in the block so as to supply them with a group of identical instructions. The parallel computing system also includes a host processor coupled to the N blocks. The host processor divides the blocks into at least first and second block groups, each containing P blocks. For each group of P blocks, a respectively different group of identical processor instructions is supplied by the host processor to each of the M times P processors.

Brief description of the drawings

Figure 1 shows how video frames are stored in the memory of the prior-art Princeton Engine (PE).

Figure 2 shows a prior-art PE configuration in which the host computer has digital access to load or capture data on the controller buses for system or algorithm testing.

Figure 3 is a schematic diagram of the Sarnoff Engine (SE).

Figure 4 is an expanded view of an engine block (EB), showing the connections of the SE host, controller, local memory, and I/O functions.

Figure 5 shows the physical layout of the system modules.

Figure 6 shows the processor organization of the SE.

Figure 7 shows the use of the SE stride register.

Figure 8 shows an example of the SE modulo arithmetic mode.

Figure 9 shows an example of the SE bounding mode.

Figure 10 shows the SE processor resource usage table.

Figure 11 shows an example of matching two packed data words.

Figure 12 shows a match sequence and its corresponding template.

Figure 13 shows matches found between the match and data sequences.

Figure 14 shows an example of conditional locking.

Figure 15 shows four different modes of the processor instruction word.

Figure 16 shows four different examples of IPC operations.

Figure 17 shows an input slice (4 slices/chip) of the SE input/output memory controller (IOMC).

Figure 18 shows an output slice (4 slices/chip) of the IOMC.

Figure 18a is a block diagram of a preferred image vault (IV) interface circuit.

Figure 19 shows the I/O data format.

Figure 20 shows the video data format.

Figure 21 shows data input captured by the input FIFO (first-in, first-out).

Figure 22 shows an example of an input timing sequence.

Figure 23 shows two views of a processor handling multiple pixels.

Figure 24 shows the transfer of data from the input FIFO to local memory.

Figure 25 shows the FIFO input timing sequence.

Figure 26 shows output FIFO data being loaded from the data output channel.

Figure 27 shows the transfer of data from local memory to the output FIFO.

Figures 27a through 27i are memory-layout array diagrams useful for explaining the operation of the input and output FIFOs.

Figure 28 shows the local OR (LOR) bus.

Figure 29 shows a controller synchronization switch.

Figure 30 shows the conceptual grouping of controllers.

Figure 31 shows the configuration of the synchronization switches for the controllers.

Figure 32 shows an example of barrier synchronization.

Figure 33 shows the elements of the operating system.

Description of the preferred embodiment

To facilitate the description of the invention, an alphabetical list of some of the acronyms used herein is attached.

The SE preferably has 32-bit processors, each of which performs 15 independent programmable operations per instruction and has twice the memory bandwidth (two local memory ports per processor). The maximum system has a total of 8192 processors, and each processor is designed to run on a 100 MHz clock (10 ns instruction cycle) in order to achieve a computational data rate of 819 MIPS and 9.6 x 10^12 operations/second. Another major improvement is that the SE is a multiple-instruction multiple-data (MIMD) machine: there is one controller for every 64 processors, and each controller can broadcast a different instruction stream to its processors. In this architectural organization, synchronization among the controllers is hardware-assisted, and up to 128 MIMD instruction streams are supported. The SE can also operate in a multi-user mode, in which the system can be configured to time-share the machine so as to support several real-time and non-real-time applications without interference among the applications. The system can also be reconfigured into several smaller systems to run applications. Figure 3 is a high-level schematic of the machine organization.

To achieve a 100 MHz system clock, the controller function must be integrated and placed close to the processors. Controller 300 is responsible for broadcasting instructions to processing elements 302 and for maintaining information on processes and signals. Each controller contains an instruction memory and a microsequencer that directs program control flow. Information about active processes is maintained in a process control memory. The requirements of supporting multiple users and enabling MIMD operation are further met by using redundant slices, each containing a number of processors 302, local memories 304, I/O functions 306, and a controller function.

The shaded portion of Figure 3 shows a redundant slice, the EB, within the SE. An EB consists of 64 processors, their respective local memories, I/O functions, and a controller function that includes an interface to host workstation 308.

An EB is physically built from multichip modules containing one controller IC, a program memory module, 16 processor ICs, 16 local memory modules, and 16 IOMC ICs.

Figure 3 also shows the processors connected to IV 320, a large secondary storage array that is accessible from the IOMC of each processor. IV 320 acts as distributed disk storage, with terabytes of capacity at the system level and megabytes of capacity at the processor level. With a data rate of 4 megabytes/second at each processor, data transfer rates of up to 32 gigabytes/second are achieved at the system level. The IV can be used to store relatively long image sequences or large databases.

Figure 4 is an expanded view of the EB, showing the connections of the host, controller, processors, local memories, and I/O functions for up to 128 slices. When the system is reconfigured into smaller systems, each subsystem has a host workstation 400 assigned to it, and each VME bus is connected to its subsystem and remains local. When the maximum SE is used, only the leftmost host workstation/VME bus is active, and the VME buses to the slices are connected to one another in series. The global OR (GOR), local OR (LOR), and neighboring OR (NOR) buses are used to synchronize sets of controllers.

The processors are connected to one another in a linear array via the IPC. As more processors are added to the system, this architecture increases processing power linearly without the penalty of increased overhead. All processors within an EB operate in SIMD mode using a 128-bit instruction word (IW) that is broadcast to the processors. Because conditional locking is available, different operations can be executed on these processors. All processor I/O is memory mapped, and the IOMC handles the transfer of data between local memory and the I/O sources. Each processor and controller also has a dedicated profile counter, and the controller includes a debug interrupt mechanism.

In Figure 5, the SE is built from modules 500, each depicted as a hexagon 50 cm wide and 20 cm thick. Each module 500 contains 16 EBs connected to a power supply, a coolant input, and a coolant drain. Each EB contains 64 processors with their respective local memories and I/O functions, and an EB controller function. One EB is packaged using 16 multichip modules built with advanced memory fabrication technology. Each system module is self-contained and can function as one 1024-processor machine or as sixteen 64-processor machines. Modules can be stacked vertically (eight high) to realize the maximum 8192-processor machine.
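The packaging hierarchy determines the system sizes that can be built. The sketch below just multiplies out the figures given in the text (64 processors per EB, 16 EBs per module, up to 8 modules); the constant and function names are illustrative.

```python
# System sizing from the packaging hierarchy described in the text.
PROCESSORS_PER_EB = 64
EBS_PER_MODULE = 16
MAX_MODULES = 8

def system_size(modules):
    """Engine blocks, controllers (one per EB), and processors for a stack of modules."""
    if not 1 <= modules <= MAX_MODULES:
        raise ValueError("module count out of range")
    ebs = modules * EBS_PER_MODULE
    return {
        "engine_blocks": ebs,
        "controllers": ebs,                       # one instruction stream per EB
        "processors": ebs * PROCESSORS_PER_EB,
    }

if __name__ == "__main__":
    print(system_size(1))
    print(system_size(MAX_MODULES))
```

For the full eight-module stack this gives 128 controllers, which matches the stated limit of 128 MIMD instruction streams.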

Figure 6 shows the processor, which can be implemented on an IC that preferably contains four processors using BiCMOS technology and has a 10 nanosecond instruction cycle. The processor operates on the 128-bit IW received from the controller. The IW specifies 15 independently programmable operations. The processor uses 32-bit data buses and registers, and some data buses and registers can be paired to transfer and store 64-bit data. In addition, some resources, such as the ALU, the register file, and the local memory, operate on 64-bit inputs.

Each processor has a 64-bit ALU 600, a 32-bit multiplier 602, a 32-bit matcher 604, a 32-bit auxiliary ALU 606, a 128-word register file 608, a dual-port local memory addressed by two address generators (AGs) 610-1 and 610-2, an IPC port 612 for communicating with other processors, conditional locking hardware 614, and a dedicated profile counter 616.

To maximize the number of operations per instruction cycle, the integer and floating-point multiplier and ALU units are merged. Many processors have separate integer and floating-point ALUs and achieve parallelism by performing most computation in floating point while the integer ALU is used for memory addressing. Because the SE has two dedicated AGs, and because floating-point and integer operations are usually not performed at the same time, the integer and floating-point units are grouped together, which saves IC area for other resources.

Multiplier 602 can multiply two 32-bit values and produce a 64-bit result in every instruction cycle. The result is stored in the 64-bit P register, which is an input to the ALU so that products can be accumulated. Alternatively, the multiplier can treat its two 32-bit inputs as a single 64-bit value and load the P register with a 64-bit word. This is useful for supplying 64-bit data to the ALU.
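The two uses of the P register can be modeled with plain integer arithmetic. This is a sketch under stated assumptions: operands are treated as unsigned, the first input is taken as the high word in pass-through mode, and the function names are hypothetical.

```python
# Model of the multiplier/P-register paths using masked Python integers.
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def multiply_to_p(a, b):
    """32 x 32 -> 64-bit product (unsigned operands assumed)."""
    return ((a & MASK32) * (b & MASK32)) & MASK64

def load_p(high, low):
    """Pass-through mode: two 32-bit inputs form one 64-bit word (word order assumed)."""
    return ((high & MASK32) << 32) | (low & MASK32)

def accumulate(acc, p):
    """ALU adds the P register into a 64-bit accumulator."""
    return (acc + p) & MASK64

if __name__ == "__main__":
    acc = 0
    for a, b in [(2, 3), (4, 5)]:          # multiply-accumulate: 2*3 + 4*5
        acc = accumulate(acc, multiply_to_p(a, b))
    print(acc)
    print(hex(load_p(1, 2)))
```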

Match unit 604 is included in the processor design because it is well suited to highly data-dependent operations. To optimize the processor instruction word (PIW), multiplier 602 and matcher 604 share the same instruction field. Matcher 604 is a special hardware element that performs match operations on 32-bit packed data. Data is packed when smaller word sizes are formatted into one 32-bit word.

ALU 600 of Figure 6 has 32-bit and 64-bit inputs and two 64-bit accumulators (ACCs). It supports single-cycle integer and floating-point (32-bit and 64-bit) operations. The ACCs are also inputs to the ALU and can be used to hold intermediate values of a computation. The P register and the ACCs serve as 64-bit inputs to ALU 600; all other data sources are 32-bit sources.

Operations that can be performed by ALU 600 include the usual 32-bit and 64-bit unary and binary arithmetic and logic operations, shift operations, and integer/floating-point conversion operations. A multi-cycle integer divide is also available. Supported conditional operations include conditional subtract and Update ACC1 if Zero/NonZero (used to perform conditional writes). Special-purpose operations include the MAXMIN binary operation, which stores the larger value in ACC1 and the smaller value in ACC2, unary operations that find the first zero bit and the first one bit, and absolute value.
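A few of the special-purpose operations can be modeled directly. The sketch below is illustrative only: the bit-scan direction (from the most significant bit) and the exact condition tested by the conditional update are assumptions, since the text does not spell them out.

```python
# Illustrative models of three ALU operations named in the text.

def maxmin(a, b):
    """MAXMIN: return (ACC1, ACC2) = (larger value, smaller value)."""
    return (a, b) if a >= b else (b, a)

def first_one_bit(x, width=32):
    """Index of the first set bit, scanning from the MSB (direction assumed)."""
    for i in range(width - 1, -1, -1):
        if (x >> i) & 1:
            return i
    return None

def update_acc1_if_zero(acc1, new_value, flag):
    """Conditional write: replace ACC1 only when the tested value is zero (assumed semantics)."""
    return new_value if flag == 0 else acc1

if __name__ == "__main__":
    acc1, acc2 = maxmin(3, 7)
    print(acc1, acc2)
    print(first_one_bit(0b1000))
    print(update_acc1_if_zero(5, 9, 0), update_acc1_if_zero(5, 9, 1))
```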

Auxiliary ALU (AuxALU) 606 (Figure 6) is used for 32-bit counting operations. The AuxALU 606 is included in the processor design because counting operations are very common in image-processing applications. Because the extra ALU allows counting operations to be pipelined, a speedup of a factor of 6 is achieved for conditional counting operations. The AuxALU is located near the RI1 port of the register file and has two registers: the AuxALU Data Register (ADR) and the AuxALU Condition Mask Register (ACMR). The ADR holds the AuxALU operand, and the ACMR holds a processor status word (PSW) mask for monitoring conditions.

A special function of the AuxALU is to decrement the value in the ADR and lock the processor on a zero result. This can be used for operations whose execution time is data dependent. When a processor completes its operation, it decrements the value to zero, locks itself (executes NOPs), and asserts its LOR signal to tell the controller that it has finished. When all processors have finished and asserted their LOR signals, the controller unlocks all of the processors and execution continues. This operation is useful for implementing loops that depend on local data conditions across a group of SIMD processors. The LOR is a one-bit wire connecting the processors to the controller; the LOR signal remains low until all processors assert a high signal, which drives the LOR signal high.
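The decrement-and-lock mechanism can be simulated as a loop whose trip count differs per processor. This is a behavioral sketch only; it assumes every initial ADR value is at least 1, and the function name is hypothetical.

```python
# Simulation of a data-dependent loop on a SIMD group using
# decrement-and-lock. The LOR line goes high only when every
# processor has locked itself.

def run_data_dependent_loop(counts):
    """counts[i] is the initial ADR value of processor i (>= 1 assumed).

    Returns (broadcast_cycles, work_done_per_processor).
    """
    adr = list(counts)
    locked = [False] * len(adr)
    work = [0] * len(adr)
    cycles = 0
    while not all(locked):          # LOR stays low until every processor asserts it
        cycles += 1                 # controller broadcasts the loop body once more
        for i in range(len(adr)):
            if locked[i]:
                continue            # locked processors execute NOPs
            work[i] += 1
            adr[i] -= 1
            if adr[i] == 0:
                locked[i] = True    # lock and assert LOR
    return cycles, work

if __name__ == "__main__":
    print(run_data_dependent_loop([3, 1, 2]))
```

The controller broadcasts the loop body as many times as the slowest processor needs, and each processor does only its own share of the work.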

A dedicated 32-bit profile counter 616 (Figure 6) is present on each processor for real-time profiling. In addition, each controller contains a dedicated profile counter 3301 (Figure 33) that is used for real-time profiling. Profiling is ordinarily done by adding instructions to the original program to count the occurrences of events. This type of profiling is not possible in real-time mode, because some program segments, such as communication via the IPC, are time critical. The dedicated profile counters are used to perform profiling without disturbing processor execution.

Each processor profile counter 616 and controller profile counter 3301 is controlled by two bits in the corresponding controller or processor IW to perform one of four functions: load counter value, start counter, stop counter, or reset counter. Reading the count from the counter is controlled as a write operation to a register, and that register receives the result. In Figures 15a through 15d, which show the processor IW formats, the profile counter control field is shown as the two-bit field FCC.

In addition, the number of instructions encountered before the profile counter increments can be changed by setting a two-bit field in the PSW. The four states of this field can be used to make profile counter 616 increment on every instruction, or on every 4th, 16th, or 64th instruction.
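The counter behavior described above (four control functions plus a selectable increment interval) can be sketched as a small class. The mapping of the 2-bit PSW field to the intervals 1, 4, 16, and 64 is an assumed encoding, and the class and method names are illustrative.

```python
# Behavioral sketch of a 32-bit profile counter with a prescaler.

class ProfileCounter:
    PRESCALE = (1, 4, 16, 64)   # selected by the 2-bit PSW field; encoding order assumed

    def __init__(self, psw_field=0):
        self.divisor = self.PRESCALE[psw_field]
        self.count = 0
        self.running = False
        self._pending = 0       # instructions seen since the last increment

    def load(self, value):
        self.count = value & 0xFFFFFFFF

    def start(self):
        self.running = True

    def stop(self):
        self.running = False

    def reset(self):
        self.count = 0
        self._pending = 0

    def tick(self):
        """Call once per executed instruction."""
        if not self.running:
            return
        self._pending += 1
        if self._pending == self.divisor:
            self._pending = 0
            self.count = (self.count + 1) & 0xFFFFFFFF

if __name__ == "__main__":
    counter = ProfileCounter(psw_field=2)   # increment every 16th instruction
    counter.start()
    for _ in range(64):
        counter.tick()
    counter.stop()
    print(counter.count)
```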

Each processor has a 128-word (32-bit words) register file (RF) 608 (Figure 6). Up to four reads and two writes can be performed in each instruction cycle. This increases on-chip parallelism and reduces the memory access bottleneck, providing the bandwidth needed to keep the functional units busy. RF 608 has two input ports (RI1, RI2) and four output ports, which feed registers RO1 through RO4. In each instruction cycle, two 32-bit words can be written to RF 608 and four 32-bit words can be read from RF 608. The register pairs (RO1, RO2) and (RO3, RO4) can also be used as 64-bit register pairs by other processor resources.

Each processor has an 8-megaword, dual-port DRAM memory 304 (Figures 3 and 33). The controller initializes the local memory via one bit of the processor IW. Because each processor has its own local memory, there is no contention among processors for memory. Two 32-bit word memory accesses can be performed in each instruction cycle, which relieves the memory bottleneck and doubles the memory bandwidth available to the processor for computation. A 64-bit value can be accessed by reading or writing its upper and lower words simultaneously. The memory is large enough for a group of 64 processors to store 2 gigabytes, or 64 8K x 8K images. The organization of data across the local memories is the same as in Figure 1.

Two AGs, one for each memory port of each processor, perform the address arithmetic, so the main ALU is not used for addressing operations. The AGs have special addressing modes that eliminate the need to check and/or enforce certain conditions on array accesses, which makes them more efficient.

The AGs perform hardware range checking on arrays and can compute stride updates for array accesses. In addition, several special boundary conditions can be specified for arrays.

The AGs share six sets of addressing registers and use four address arithmetic modes for array accesses: normal mode, modulo arithmetic mode, bounding mode, and butterfly arithmetic mode.

Each processor has conditional-locking hardware that provides a conditional-locking execution mechanism. By letting processors lock themselves through conditional execution of code based on local data conditions, the mechanism allows the SIMD processors to run in a MIMD fashion. A processor's execution state is defined as "locked" when it executes NOPs (no-operations) in place of the instructions sent to it by the instruction sequencer on the controller. The processor continues to execute NOPs until an unlock instruction is encountered in the processor IW.

Instructions that lock or unlock a processor occur within structured segments delimited by 'begin' and 'end' statements. These segments are similar to 'if-then-else' constructs and can be nested. Lock and unlock decisions always relate to the nearest enclosing construct. Conditional-locking code involves no change in control flow: instructions are broadcast in order from the controller, and each processor selects which code to execute based on the lock instructions and its local data conditions. Conditional-locking information is stored in the processor status word. Instructions to save and restore this context are supported for interrupts that require all processors to be unlocked.

The AG is the processor component that computes addresses for accessing local memory. It provides all the basic addressing modes plus additional operations for computing regular array accesses efficiently. There are two AGs per processor, and each local memory port has a dedicated AG. The AGs use a common set of registers to access memory: eight user base registers (UBR0-UBR7), eight user limit registers (ULR0-ULR7), one bank select register (BSR), sixteen base registers (BR0-BR15), sixteen limit registers (LR0-LR15), eight offset registers (OR0-OR7), and eight stride registers (SR0-SR7).

The UBRs and ULRs are used to delimit program data in the eight banks of local memory. The data for a program must be stored contiguously within each bank. The BSR is a 3-bit register used to determine which memory bank is active. The sixteen BRs and LRs are used to delimit array data structures. All array references are relative to a BR, and the LR is used by the AG to determine whether a reference into the array structure is out of range. The eight ORs are used to specify a particular position within an array, and the eight SRs are used to update the offset value by the contents of the SR.

The address word has the following format (field widths in bits are given in parentheses): (1) absolute/UB-relative addressing, (3) bank select, and (20) memory bank address.

The AG operates on 23-bit addresses: the upper 3 bits specify the memory bank, and the lower 20 bits specify one word within that bank's megaword (of 32-bit words) of memory. Because the address is stored in a 32-bit format, there are nine additional bits that are not used for addressing, some of which carry additional information. One bit is used to determine whether the address is absolute or relative to the user base (UB) value. UB-relative addressing is used when accessing program data, and its use makes program data relocatable. If a UB-relative address is greater than the user limit (UL) value or less than zero, an access violation occurs. In absolute addressing mode, the address is not added to the UB value. This mode is used to access shared system information, which is stored in low local memory.
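The address layout just described can be illustrated with a minimal Python sketch. The text gives only the field widths, so the position and polarity of the absolute/UB-relative flag (bit 23, set meaning UB-relative), the per-bank `ubr`/`ulr` lists, and all function names are assumptions made for illustration.

```python
BANK_BITS = 3    # upper 3 bits of the 23-bit address select the bank
WORD_BITS = 20   # lower 20 bits select a word within the bank
# Assumption: the absolute/UB-relative flag sits directly above the address
# bits, and a set bit means "UB-relative". The text does not fix either point.
REL_FLAG_BIT = BANK_BITS + WORD_BITS


class AccessViolation(Exception):
    """Raised when a UB-relative address exceeds the user limit."""


def decode_address(word):
    """Split a 32-bit address word into (ub_relative, bank, offset)."""
    offset = word & ((1 << WORD_BITS) - 1)
    bank = (word >> WORD_BITS) & ((1 << BANK_BITS) - 1)
    ub_relative = bool((word >> REL_FLAG_BIT) & 1)
    return ub_relative, bank, offset


def resolve(word, ubr, ulr):
    """Return (bank, word address) after applying the user base if needed.

    ubr and ulr are hypothetical lists of eight per-bank base/limit values.
    """
    ub_relative, bank, offset = decode_address(word)
    if not ub_relative:
        return bank, offset           # absolute: UB is not added
    if offset > ulr[bank]:
        raise AccessViolation(f"offset {offset} exceeds limit {ulr[bank]}")
    return bank, ubr[bank] + offset   # relocatable program data
```

For instance, with `ubr[5] = 0x1000`, a UB-relative word naming bank 5 and offset 0x123 resolves to word 0x1123 of bank 5, while the same word with the flag clear resolves to word 0x123.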

The AG uses BR0-BR15 and LR0-LR15. A base register (BR) defines the start position of aggregate data such as an array, table, or structure, and a limit register (LR) defines the address bounds of the aggregate. This allows the hardware to perform range checking on every memory access at run time. The BR and LR controls are tied together so that BRx must be used with LRx. Only the lower eight registers, BR0-BR7 and LR0-LR7, can be used in base-limit-offset-stride (BLOS) operations. A BR contains a 24-bit value: 20 bits for the address, 1 bit for absolute/UB-relative addressing, and a 3-bit field that specifies the BR's memory bank. A limit register contains 20 bits that bound the offset relative to the BR.

The AG also contains eight 21-bit offset registers (OR0-OR7) and eight 20-bit stride registers (SR0-SR7). These registers provide an efficient means of repeatedly accessing an array with a regular stride. After a base-offset pair (BRx, ORx) has been used to compute the address of the array element to be accessed, the value in SRx is used to update the offset register, which sets up the next array access. In addition to SR0-SR7, the hardwired constants 0, +1, and -1 are available as stride values. The OR value is updated automatically by the stride value; if the offset does not need to be updated, a stride of zero is specified.

If the new offset value is out of bounds (OOB), the OOB bit is set in the PSW. Only the lower eight registers, BR0-BR7 and LR0-LR7, are used in BLOS operations, and the hardware control is tied together so that BRx must be used with LRx, ORx, and SRx. An SR holds a 21-bit two's-complement value, and an offset register holds a positive 20-bit value.

An example of the use of an SR is shown in FIG. 7, in which the offset is initially 2 and the stride value is 3. The successive array accesses are shown shaded. An address is generated every instruction cycle.

Six addressing modes are available: immediate, register direct, direct, indirect, base relative, and base indexed. The first two addressing modes do not use the AG.

In immediate mode, a value is specified in the immediate field of the PIW for use in a processor operation. In register direct mode, values are read from or written to the register file. A register direct read is performed by specifying a register file address in the RO1, RO2, RO3, or RO4 field of the PIW; the contents of the specified register file location are then loaded into the corresponding ROx register. A register direct write is performed by specifying a register file address in the RI1 or RI2 field of the PIW; the value on the RIx port is written to the specified register file location.

In direct addressing mode, scalar data stored in local memory is accessed by specifying its address in local memory. Storing scalar data in the register file is more efficient, but there are cases in which scalar data must be stored in local memory (for example, register spills, or indirection when a scalar value is referenced through a pointer). The address is specified as a displacement from the UBR, supplied by a direct source (DS). The effective address is computed as: effective address = DS + UBy.

Indirect addressing mode is most often used to dereference pointers to data in memory. A BR is loaded with the address of the data in local memory. Because no offset is needed in this mode, the upper eight registers, BR8-BR15, should be loaded with indirect address values first. This mode is equivalent to base-relative addressing with zero displacement. The effective address is computed as: effective address = BRx + UBy.

Base-relative addressing mode is most often used for accessing structure members and for random accesses within arrays that are not accessed in a regular pattern (such as table lookup). A BR is loaded with the base address of an aggregate data structure such as a structure or an array. The displacement is supplied as an offset via the DS. Because no offset register is needed in this mode, the upper eight registers, BR8-BR15, should be loaded with base-relative address values first. The address arithmetic modes can be used with base-relative addressing. The effective address is computed as: effective address = BRx + DS + UBy.

Base-indexed addressing mode is most often used for arrays that are accessed in a regular pattern. The BR, limit register, offset register, and stride register are loaded with initial values. After the effective address has been generated, the offset value is updated by adding the stride value. Only the lower eight registers, BR0-BR7 and LR0-LR7, can be used for BLOS operations. The address arithmetic modes can be used with base-indexed addressing. The effective address is computed as: effective address = BRx + ORx + UBy, followed by ORx = ORx + SRx.
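As a minimal sketch of base-indexed addressing with the post-update of the offset, the following Python function generates a run of effective addresses. The function and parameter names are hypothetical; only the two formulas (effective address = BRx + ORx + UBy, then ORx = ORx + SRx) come from the text, and no range checking is modeled here.

```python
def base_indexed_addresses(br, offset, stride, ub, count):
    """Generate `count` effective addresses in base-indexed mode.

    Each address is BR + OR + UB; the offset is then advanced by the stride,
    mirroring the post-update described in the text.
    """
    addresses = []
    for _ in range(count):
        addresses.append(br + offset + ub)
        offset += stride          # ORx = ORx + SRx
    return addresses


# Offset starts at 2 and the stride is 3, as in the FIG. 7 example.
# The base address 100 and user base 0 are arbitrary illustrative values.
print(base_indexed_addresses(br=100, offset=2, stride=3, ub=0, count=4))
```

With these illustrative values the array elements touched are at offsets 2, 5, 8, and 11 from the base.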

Four address arithmetic modes (AAMs) are used in conjunction with the base-relative and base-indexed addressing modes. These special-purpose modes reduce the computation involved in common forms of array access. The modes are implemented in hardware and operate on one-dimensional arrays in local memory.

They are modulo arithmetic mode, bounding mode, butterfly arithmetic mode, and normal mode.

Modulo arithmetic mode maps an out-of-bounds array access back into the array using modulo arithmetic; the modulus is given by the limit register value. When an array access is out of bounds, bounding mode supplies the address of a user-specified boundary condition. Butterfly arithmetic mode generates the addresses of all the butterflies for one stage of a fast Fourier transform (FFT). Normal mode does nothing to correct out-of-bounds accesses.

With base-relative addressing, the modulo arithmetic effective address is computed as DS = DS modulo LRx, followed by effective address = BRx + DS + UBy.

In base-indexed addressing mode, the modulo arithmetic effective address is computed as effective address = BRx + ORx + UBy; after the effective address has been generated, the offset is updated by ORx = (ORx + SRx) modulo LRx.

The modulo operation is computed as follows.

X = X - LRx (if X >= LRx)
X = X + LRx (if X < 0)
X = X (otherwise)

In the modulo arithmetic mode example of FIG. 8, a two-dimensional array is distributed across the processors, one column per processor. The processor holding the data for column 30 of array A generates an offset of 150, which is greater than the upper limit of 99 for the 100-element column. In this mode, the limit value is subtracted from the offset to produce a new offset that lies within bounds, namely element 50. The mode also checks whether the offset is less than zero and, if so, adds the limit value to the offset to produce a new offset within the array bounds.
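The three-case modulo operation can be written directly as a short Python function. This is a sketch of the rule as stated, with a hypothetical function name; note that it applies a single correction rather than a full remainder.

```python
def modulo_wrap(x, lr):
    """Apply the single-step modulo correction described in the text.

    lr is the limit register value (the number of elements in the array).
    """
    if x >= lr:
        return x - lr   # past the end: subtract the limit once
    if x < 0:
        return x + lr   # before the start: add the limit once
    return x            # already in bounds


# FIG. 8 example: offset 150 in a 100-element column maps to element 50.
print(modulo_wrap(150, 100))
```

Because only one subtraction or addition is applied, this agrees with Python's `%` operator only for offsets in the range -LRx to 2*LRx - 1; an offset of 250 with a limit of 100 would yield 150, which is still out of bounds.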

In bounding mode, when an array access goes out of bounds, the offset value is replaced with the address location of the boundary condition value. This works as follows: the default boundary condition value is stored at the location immediately following the last array location, that is, at location (BRx + LRx). When an out-of-bounds address is detected, the AG returns the address (BRx + LRx), which is the location of the boundary condition value.

In base-relative addressing mode, the bounding mode effective address is computed as effective address = BRx + bound(DS) + UBy.

In base-indexed addressing mode, the bounding mode effective address is computed as effective address = BRx + ORx + UBy, where ORx = bound(ORx + SRx).

The bounding operation bound(X) is defined as:

X = X (if 0 <= X < LRx)
X = LRx (otherwise)

In the bounding mode example of FIG. 9, a two-dimensional array is distributed across the processors, one column per processor. The processor holding the data for column 30 of array A generates an offset of 120, which is greater than the upper limit of 99 for the 100-element column. Consequently, the value zero is returned when array element A[30,120] is accessed.
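A minimal Python sketch of bounding mode follows. It assumes, as the text describes, that the boundary condition value is stored in the word immediately after the array, at BRx + LRx; the function names and the flat-list model of local memory are illustrative assumptions.

```python
def bound(x, lr):
    """Return x if it is a valid offset, else the boundary-value offset lr."""
    return x if 0 <= x < lr else lr


def read_bounded(memory, br, lr, offset, ub=0):
    """Read one array element in bounding mode (base-relative form)."""
    return memory[br + bound(offset, lr) + ub]


# A 100-element column followed by a boundary condition value of 0,
# modeled as a flat list starting at base address 0.
column = list(range(1, 101)) + [0]
print(read_bounded(column, br=0, lr=100, offset=120))  # out of bounds
print(read_bounded(column, br=0, lr=100, offset=5))    # in bounds
```

Storing the boundary value next to the array means an out-of-bounds access costs no extra instructions: the AG simply substitutes a different address.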

The AG has two modes: an address generation mode, and a setup mode for loading and writing the address registers. The mode is determined by the address generation mode bit in the PIW. Both AGs share this bit, so both AGs are always in the same mode of operation.

In setup mode, the 10-bit AG field in the processor instruction word has the following format (field widths in bits are given in parentheses): (2) read/write enable, NOP; (2) direct source select; (3) register set select; (3) register number.

The 2-bit read/write enable field determines whether a register is read into the AG register file set or written to RAM. When a write to RAM is performed, the corresponding RAM instruction field must also specify a write. When the AG writes a register value to RAM, the write operation overrides the write-data select field of the RAM field. The 2-bit DS select determines the source used for loading data into the AG register file set.

The 3-bit register select chooses the register set to be loaded. The register sets are: 1) UB0-UB7, 2) UL0-UL7, 3) BR0-BR7, 4) BR8-BR15, 5) LR0-LR7, 6) LR8-LR15, 7) OR0-OR7, and 8) SR0-SR7.

The 3-bit register number field selects one register from the active set of eight.

In address generation mode, the PIW has the following format.

(2) Addressing mode
    00  direct addressing
    01  indirect addressing
    10  base-relative addressing
    11  base-indexed addressing
(2) Direct source select (valid for direct and base-relative addressing modes)
(2) AAM select (valid for base-relative and base-indexed addressing modes)
    00  modulo arithmetic mode
    01  bounding mode
    10  normal mode
    11  butterfly arithmetic mode
(2) Stride select (valid for base-indexed addressing mode)
    00  constant 0
    01  constant 1
    10  constant -1
    11  use the stride register specified by the BLOS register select
(3) BLOS register select (valid for base-indexed addressing)
(4) Base register select (valid for indirect and base-relative addressing modes)

Address generation instruction formats:

    Direct address          00 xxdd xxxx
    Indirect address        01 xxxx bbbb
    Base-relative address   10 mmdd bbbb
    Base-indexed address    11 mmss 0bbb

    bbbb: BR select
    0bbb: BLOS select
    dd:   direct source select
    mm:   address arithmetic mode select
    ss:   stride select
    xx:   unused

As an example of the use of the AG, consider the following multiplication of two 3x3 matrices.
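The four address generation instruction formats can be illustrated by a small encoder. This is a sketch under stated assumptions: the leftmost bit of each printed format is taken to be the most significant bit of the 10-bit field, unused (`x`) bits are set to zero, and the function and argument names are hypothetical.

```python
MODES = {
    "direct": 0b00,
    "indirect": 0b01,
    "base_relative": 0b10,
    "base_indexed": 0b11,
}


def encode_ag_field(mode, aam=0, ds=0, stride=0, reg=0):
    """Pack a 10-bit AG field: 2 mode bits, then two 4-bit groups.

    Assumed bit order: mode in bits 9-8, middle group in bits 7-4,
    register select in bits 3-0. Unused bits are written as zero.
    """
    if mode == "direct":              # 00 xxdd xxxx
        middle, low = ds & 0b11, 0
    elif mode == "indirect":          # 01 xxxx bbbb
        middle, low = 0, reg & 0b1111
    elif mode == "base_relative":     # 10 mmdd bbbb
        middle = ((aam & 0b11) << 2) | (ds & 0b11)
        low = reg & 0b1111
    elif mode == "base_indexed":      # 11 mmss 0bbb
        if not 0 <= reg <= 7:
            raise ValueError("BLOS operations use only registers 0-7")
        middle = ((aam & 0b11) << 2) | (stride & 0b11)
        low = reg
    else:
        raise ValueError(f"unknown addressing mode: {mode}")
    return (MODES[mode] << 8) | (middle << 4) | low


# Base-indexed, modulo arithmetic (00), stride from the SR (11), BLOS set 5.
print(format(encode_ag_field("base_indexed", aam=0b00, stride=0b11, reg=5), "010b"))
```

The range check on `reg` in base-indexed mode reflects the restriction that only BR0-BR7 and LR0-LR7 take part in BLOS operations, which is why that format has a fixed leading zero in its register field.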

C1 = A1 x B1 + A2 x B4 + A3 x B7
C4 = A4 x B1 + A5 x B4 + A6 x B7
C7 = A7 x B1 + A8 x B4 + A9 x B7

To increase efficiency, many computing systems store one matrix in row-major form and the other in column-major form in order to reduce address computation and/or maximize cache usage. Because the AG has a stride update capability, however, matrix data can be stored in a consistent form on the SE. In particular, the data for both matrices can be stored in row-major form. The first matrix (whose rows are used in the computation) uses a stride of 1, while the second matrix (whose columns are used in the computation) uses a stride of 3, which is the distance between elements of the same column in a 3x3 array stored in row-major form. Because the AG frees the ALU from address arithmetic, performance increases. In fact, very tight pipelined loops can be formed from the processor resources. FIG. 10 is a resource utilization table showing the efficiency of the processor. The rows of the table represent resources, and the columns represent instruction cycles. Shaded areas represent use of a resource during a particular instruction cycle. The table represents the computation of one column of the result matrix.
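The stride choice for the two row-major matrices can be sketched in Python. The sketch models local memory as one flat list and uses two independent offset/stride pairs, loosely standing in for the two AGs; the function names and the memory layout (A followed by B) are assumptions for illustration, and the pipelining itself is not modeled.

```python
def dot_strided(mem, a_base, a_stride, b_base, b_stride, n):
    """Accumulate n products, stepping each operand by its own stride."""
    acc = 0
    a_off = b_off = 0
    for _ in range(n):
        acc += mem[a_base + a_off] * mem[b_base + b_off]
        a_off += a_stride   # row of A: stride 1
        b_off += b_stride   # column of B: stride n in row-major storage
    return acc


def matmul_row_major(a, b, n):
    """Multiply two n x n matrices that are both stored in row-major order."""
    mem = list(a) + list(b)          # A at address 0, B at address n*n
    a_base, b_base = 0, n * n
    result = []
    for i in range(n):
        for j in range(n):
            result.append(
                dot_strided(mem, a_base + i * n, 1, b_base + j, n, n)
            )
    return result


a = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(matmul_row_major(a, a, 3))
```

No transposed copy of the second matrix is needed: walking a column is just a matter of loading a different stride value.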

The pipelined computation proceeds as follows. In the first instruction cycle, the addresses for the two matrices are generated. In the next instruction cycle, the values are fetched from local memory. Because the memory ports are inputs to the multiplier, these values can be multiplied in the following cycle. The product is then accumulated in the cycle after that. Finally, the resulting matrix element is stored temporarily in the register file. Because a different resource is used for each stage of the computation and the resources operate independently of one another, a tightly packed pipelined computation results.

Using the pipelined method above, an NxN matrix multiplication requires (N^2 + 4N + 6) instructions. The matrix multiplication requires a total of (2N^3 - N^2) arithmetic operations.
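The arithmetic operation count can be checked by counting the work per output element, assuming the standard inner-product formulation: each of the N^2 result elements needs N multiplications and N - 1 additions.

```latex
\[
\underbrace{N^{2}}_{\text{result elements}}
\bigl(\underbrace{N}_{\text{multiplications}} + \underbrace{N-1}_{\text{additions}}\bigr)
= N^{2}(2N-1) = 2N^{3} - N^{2}
\]
```

For the 3x3 example this gives 9 x 5 = 45 arithmetic operations.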

The dedicated hardware matcher 604 efficiently counts the number of matches between data sequences of arbitrary length. The matcher is placed ahead of the ALU so that match counts can be accumulated. The matcher shares an instruction field in the PIW with the multiplier.

Like the multiplier, it uses the X and Y data sources and the P register to store its result. This design was adopted because multiplication operations are orthogonal to match operations (that is, no multiplication is required during matching, and no matching is required during multiplication).

The matcher operates on data sequences of packed data words. Data is packed when two or more smaller data words are placed in one 32-bit word. The matcher can match two 32-bit values in each instruction cycle and records the number of packed-word matches in the P register. The value stored in the P register can then be input to the ALU and accumulated.

Each 32-bit input word is interpreted by the matcher as packed data. Each 32-bit word can represent multiple words of smaller size. The possible match formats for the inputs are: 1 bit, 32 words; 2 bits, 16 words; 3 bits, 10 words; 4 bits, 8 words; 5 bits, 6 words; 6 bits, 5 words; 7 bits, 4 words; 8 bits, 4 words; 16 bits, 2 words; and 32 bits, 1 word.

The 3-, 5-, 6-, and 7-bit word formats have unused bits, which the matcher ignores. The match format is determined by a setup instruction that loads the B register (located in the matcher) with the match format. As an example, consider the two 32-bit data words shown in FIG. 11. If the B register is initialized for a packed word size of 4, five matches will be recorded.
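The matcher's per-cycle operation can be sketched as a function that counts equal packed fields in two 32-bit words. The text does not say where the fields sit within the word, so this sketch assumes they are packed from the least significant bit upward with any unused bits at the top; the function name is hypothetical, and the input values are arbitrary rather than those of FIG. 11.

```python
def count_packed_matches(x, y, width):
    """Count packed fields of `width` bits that are equal in x and y.

    Fields are assumed to start at bit 0; leftover high bits (for widths
    such as 3, 5, 6, and 7) are ignored, as the text says the matcher does.
    """
    fields = 32 // width
    mask = (1 << width) - 1
    return sum(
        ((x >> (i * width)) & mask) == ((y >> (i * width)) & mask)
        for i in range(fields)
    )


# Two arbitrary words that differ in exactly two 4-bit fields.
print(count_packed_matches(0x12345678, 0x12FF5678, width=4))
```

The count produced here corresponds to the value the matcher would place in the P register for accumulation by the ALU.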

Acronyms

1S      one's complement
2S      two's complement
ACC     accumulator
ACMR    auxiliary ALU condition mask register
ADR     auxiliary ALU data register
AG      address generator
ALIN    active locking identifier number
ALU     arithmetic logic unit
AuxALU  auxiliary ALU
BLOS    base-limit-offset-stride
BR      base register
CID     communication identifier
CIDR    communication identifier register
CLCS    conditional locking code segment
DIC     data input channel
DOC     data output channel
DS      direct source
EB      engine block
FAG     frame address generator
FBR     frame base register
FIFO    first-in first-out
FITSR   FIFO input timing sequence register
FOR     frame offset register
FOTSR   FIFO output timing sequence register
GE      greater than or equal to zero
GOR     global OR
GT      greater than
HIO     host input/output
HIOR    host input/output register
HRQ     host request queue
I/F     integer/floating point
I/O     input/output
IC      integrated circuit
IMD     immediate
INE     inexact
IOMC    input/output memory controller
IOQ     input/output queue
IOS     input/output server
IPC     interprocessor communication

Because a match does not necessarily begin on a 32-bit word boundary, aligning packed data sequences is more complex. A set of templates representing every possible starting position of a packed word within a 32-bit word must be defined. In this example there are four packed words per 32-bit word, so a match can begin at the first, second, third, or fourth packed word within a 32-bit word of the D sequence. Four match templates must therefore be defined to cover these cases. FIG. 12 shows the set of match templates used for a seven-character ASCII sequence. The empty portions of the templates are initialized to characters that are not used in the D sequence, which guarantees that no false matches occur.

To perform an exact match, each template is aligned with a portion of the D sequence of the same size. Once all of the templates have been compared with that subsequence, the match sequence comparison is shifted by one 32-bit word relative to the D sequence. This is illustrated in FIG. 13. The procedure guarantees that every character subsequence of D is aligned against the match sequence.

The number of matches is stored in the P register and accumulated by the ALU.

Whether a match is exact is determined by comparing the accumulated match count from the P register with the match sequence length. This comparison is performed in the ALU. In this example, the match result must be compared with 7, the number of characters in the match sequence.

The execution mechanism that gives the processors the ability to execute code conditionally, based on local data conditions, is described next. After an overview of conditional locking, the operation of the processor is described, the hardware requirements for implementing conditional locking are defined, and several pseudocode examples are presented.

A processor's state is 'locked' when it executes NOPs (no-operations) in place of the instructions sent from the controller's instruction sequencer.

逆に、制御装置が送付した命令を実行した場合はプロセッサは°ｕｎｌｏｃｋｅｄ’である。プロセッサがロックの場合には、制御装置がアンロックのコマンドを与えるまでは、Ｎ０Ｐｓを実行する。プロセッサがアンロックの場合には、ＩＰＣはなお演算状態にあり、アンロックの時期を決定するために、特定のブックキーピング操作がプロセッサによってなお実行される。Conversely, if the control device executes the instructions sent, the processor unlocks the d'. If the processor is locked, the control unit issues an unlock command. N0Ps are executed until . If the processor is unlocked, I The PC is still in a computing state and requires a specific book to determine when to unlock. Keeping operations are still performed by the processor.

The conditional locking mechanism is an efficient way to execute conditional code in a SIMD environment. Conditional code can be executed without changing the control flow in ways that would incur additional instruction overhead. The decision to change the processor state is made within a Conditional Locking Code Segment (CLCS). A CLCS is an assembly-language-level construct delimited by begin and end statements. Each CLCS is associated with a Lock ID Number (LIN). Instructions within a CLCS lock and unlock the processor based on information from the PSW.

A CLCS is similar in form to the 'if-then-else' construct supported by most high-level languages. The 'then' and 'else' statement bodies have mutually exclusive execution conditions: a processor executes one of the statement bodies, but not both. CLCSs may be nested but may not overlap (if CLCS1 begins and CLCS2 then begins, CLCS2 must end before CLCS1 ends).

The following operations are provided to lock and unlock the processors conditionally: 1) Begin CLCS; 2) End CLCS; 3) Conditional Lock (on a condition); 4) Conditional Unlock; 5) Conditional Else; 6) Interrupt Unlock; 7) Interrupt Restore; and 8) NOP.

Begin CLCS and End CLCS are used to delimit a CLCS. The Conditional Lock instruction locks a processor if the condition given in the instruction is met. The Conditional Unlock instruction unlocks all processors that are locked on the current (most deeply nested) CLCS. The Conditional Else instruction unlocks all processors that have not executed code within the current CLCS and locks all processors that have executed code within the current CLCS. The Interrupt Unlock instruction is used when an interrupt occurs, or during a context switch, to unlock all processors. Interrupt Restore is used to restore the processor state that existed before the Interrupt Unlock instruction was executed.

A CLCS example is presented to illustrate how a CLCS resembles an if-then-else construct.

    if (condition1)          Begin CLCS
    then                     Conditional Lock (not condition1)
        statement 1              statement 1
    else if (condition2)     Conditional Else
                             Conditional Lock (not condition2)
        statement 2              statement 2
    else                     Conditional Else
        statement 3              statement 3
                             End CLCS

The following hardware support is used by the processor to support conditional locking.

ALIN counter: The Active LIN number (ALIN) is the LIN number of the CLCS currently being executed.

LIN register: The LIN value is the number of the CLCS on which the processor was locked.

When the processor is unlocked, ALIN and LIN are the same.

Cond register: The Cond register holds the PSW that was in effect when the condition locked the processor.

C status bit: The C (Context) status bit, located in the PSW, determines the state of the processor. When the bit is set, the processor is locked; when the bit is not set, the processor is unlocked.

X status bit: The X (Executed) status bit, located in the PSW, enforces mutually exclusive execution of the statement bodies within a CLCS. Inhibiting the X bit disables this mutual exclusivity.

PLIN register: The Previous LIN register (PLIN) is where the LIN is saved when an interrupt occurs.

PCond register: The Previous Condition register (PCond) is where the Cond register is saved when an interrupt occurs.

PC status bit: The Previous Context (PC) status bit is where the C status bit is saved when an interrupt occurs.

PX status bit: The Previous Executed (PX) status bit is where the X status bit is saved when an interrupt occurs.

In general, the ALIN counter is incremented when a CLCS is entered and decremented when a CLCS is exited, so the ALIN value is equivalent to the CLCS nesting level. The ALIN value is the same in every processor, and it is incremented and decremented even when a processor is locked.

A second value, called the LIN value, records which CLCS caused the processor to lock. This information is needed when CLCSs are nested, because it must be determined whether an outer CLCS or an inner CLCS locked the processor. When a processor is unlocked, its LIN value is the same as the ALIN value. When a processor is locked, its LIN value is less than or equal to the ALIN value.

When a processor is conditionally locked, the PSW is stored in the Cond register and the LIN value no longer follows the ALIN value. The C bit is set in the PSW of the locked processor. A conditional unlock instruction in the code unlocks a processor only if its LIN value is the same as the ALIN value, so unlocking instructions always apply to the most deeply nested CLCS. A processor is unlocked by one of four instructions: Conditional Unlock, Conditional Else, Interrupt Unlock, and End CLCS (which signals the end of the conditional code segment).

The X bit is used to enforce mutually exclusive execution within a CLCS. When code is executed within a CLCS, the X bit is set in every unlocked processor. When an 'else' clause of a CLCS is executed, the X bit is used to determine which processors have not yet executed. If a processor's X bit is not yet set, that processor has not executed a statement body within the CLCS.

Next, the way in which the hardware support is used to perform the conditional locking operations is described. For each operation presented, pseudocode describes the operation of the conditional locking hardware, followed by an explanation of how the instruction executes.

Begin CLCS:

    ALIN = ALIN + 1
    IF (C = 0)
        LIN = LIN + 1
        X = 0
    ENDIF

ALIN is incremented (even if the processor is locked). If the processor is unlocked, the LIN value is also incremented. The X bit is reset because no code has yet been executed for the new CLCS.

End CLCS:

    IF (LIN = ALIN)
        LIN = LIN - 1
        X = 1
    ENDIF
    ALIN = ALIN - 1

If LIN = ALIN, the CLCS on which the processor may have locked is being exited, so the processor's C bit can be reset automatically (to unlock the processor).

The LIN value is decremented. If CLCSs are nested, the processor is then executing the statement body of the next most deeply nested CLCS, so the X bit is set. This relationship collapses the X-bit information into a single bit, instead of requiring a number of X bits equal to the CLCS nesting depth.

ALIN must be decremented unconditionally, so it appears outside the IF statement. The pseudocode for "Conditional Lock (condition)" is as follows.

    IF ((C = 0) AND [(condition TRUE) OR (X = 1)])
        C = 1
        Cond = PSW
    ELSE
        X = 1
    ENDIF

If the processor is unlocked and either the condition is true or the X bit is set, the processor is locked and the PSW is stored in the Cond register.

If the condition is false, the processor remains unlocked. Because the CLCS statement body will then be executed, the X bit is set.

Next, consider the following pseudocode for "Conditional Unlock".

    IF ((C = 1) AND (LIN = ALIN) AND (X = 0))
        C = 0
    ENDIF

If the processor is locked on the most deeply nested CLCS and has not yet executed a statement body for that CLCS, the processor is unlocked. If the processor was already unlocked, this instruction has no effect.

Next, consider the following pseudocode for "Conditional Else".

    IF ((C != X) AND (LIN = ALIN))
        C = X
    ENDIF

This instruction unlocks every processor that has not executed in the current CLCS and locks every processor that has executed in the current CLCS, providing the functionality of the 'else' statement in an 'if-then-else' construct.

The pseudocode for "Interrupt Unlock" is as follows.

    PLIN = LIN
    PC = C
    PX = X
    PCond = Cond
    LIN = 0

This instruction saves the state of all of the registers so that the interrupt can use the LIN number without affecting the state of the program. All processors are unlocked so that they can respond to the interrupt.

The pseudocode for "Interrupt Restore" is as follows.

    LIN = PLIN
    C = PC
    X = PX
    Cond = PCond

After the interrupt routine has finished, this instruction restores the state of all of the registers.
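As a non-authoritative sketch, the Begin CLCS, End CLCS, Conditional Lock, and Conditional Else bookkeeping described above can be modelled in software as shown below. The class and method names are hypothetical. Two readings of the text are assumed and marked in the comments: End CLCS clears the C bit when LIN = ALIN, and only unlocked processors take the ELSE branch of Conditional Lock. The Cond register and the interrupt operations are omitted.

```python
from dataclasses import dataclass


@dataclass
class Proc:
    data: int
    C: int = 0       # Context bit: 1 means locked
    X: int = 0       # Executed bit
    LIN: int = 0     # CLCS number on which the processor locked
    trace: str = ""  # statements this processor actually executed


class ProcessorArray:
    def __init__(self, values):
        self.ALIN = 0  # identical in every processor, so kept once here
        self.procs = [Proc(v) for v in values]

    def begin_clcs(self):
        self.ALIN += 1
        for p in self.procs:
            if p.C == 0:
                p.LIN += 1
                p.X = 0

    def end_clcs(self):
        for p in self.procs:
            if p.LIN == self.ALIN:
                p.C = 0  # assumption from the prose: C is reset on exit
                p.LIN -= 1
                p.X = 1
        self.ALIN -= 1   # decremented unconditionally

    def cond_lock(self, cond):
        for p in self.procs:
            if p.C == 0:  # assumption: locked processors are left untouched
                if cond(p.data) or p.X == 1:
                    p.C = 1
                else:
                    p.X = 1

    def cond_else(self):
        for p in self.procs:
            if p.C != p.X and p.LIN == self.ALIN:
                p.C = p.X

    def execute(self, label):
        for p in self.procs:
            if p.C == 0:  # locked processors execute NOPs
                p.trace += label


def run_example(values):
    """if d < 0: S1, else if d == 0: S2, else: S3, written as nested CLCSs."""
    a = ProcessorArray(values)
    a.begin_clcs()
    a.cond_lock(lambda d: not (d < 0))
    a.execute("S1")
    a.cond_else()
    a.begin_clcs()
    a.cond_lock(lambda d: not (d == 0))
    a.execute("S2")
    a.cond_else()
    a.execute("S3")
    a.end_clcs()
    a.end_clcs()
    return a


print([p.trace for p in run_example([-5, 0, 7]).procs])
```

Every processor steps through the same instruction stream, and each one executes exactly one statement body according to its own data, which is the mutual exclusivity that the X bit is described as enforcing.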

FIG. 14 is an example of the translation from high-level-language pseudocode to low-level code. Note that most of the translation is one-to-one and incurs almost no execution overhead. FIG. 14 also illustrates how processors with different data conditions execute different statements. Each processor executes a single statement Sx within the nested if statement, in part because the Cond Else instruction enforces the mutual exclusivity of the conditional statements.

Processors sometimes execute data-dependent code, in which an operation is repeated until a condition is met. LOR synchronization supports this: when a processor has finished executing its operation, it sets the LOR bit in its PSW to signal the controller that it has completed its computation. When all of the processors have signaled the controller and locked, the controller sends a signal that unlocks them and resets the LOR bit, and execution proceeds.

As an example, consider processors computing X^Y where the values of X and Y differ from processor to processor. Computing the result requires (Y-1) multiplications, a number that varies from processor to processor. The controller sends the processors code that repeatedly multiplies the partial product by X until it receives the LOR signal to continue execution. The pseudocode program for this operation is as follows.

    P = 1
    Count = Y + 1
    Begin CLCS
    Repeat until LOR signal received by all processors [
        Decrement-and-Lock-On-Zero(Count)
        P = P * X
    ]
    End CLCS
    Reset LOR bit

In this pseudocode for the Conditional Locking/LOR synchronization operation, Decrement-and-Lock-On-Zero is a special instruction provided by the Auxiliary ALU. This instruction decrements the value in the ADR and locks the processor if the result is zero.
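The loop above can be illustrated with a small lock-step simulation. This is a sketch only: the function name `simd_power` and the list-based processor state are hypothetical, and the LOR signal is modelled simply as "every processor has locked".

```python
def simd_power(xs, ys):
    """Lock-step X**Y for per-processor X and Y (Y >= 0)."""
    n = len(xs)
    p = [1] * n                  # P = 1
    count = [y + 1 for y in ys]  # Count = Y + 1
    locked = [False] * n
    cycles = 0
    # The controller repeats the loop body until all processors raise LOR.
    while not all(locked):
        for i in range(n):
            if locked[i]:
                continue         # a locked processor executes NOPs
            count[i] -= 1        # Decrement-and-Lock-On-Zero(Count)
            if count[i] == 0:
                locked[i] = True  # lock and set the LOR bit
                continue
            p[i] *= xs[i]        # P = P * X
        cycles += 1
    return p, cycles


print(simd_power([2, 3, 5], [3, 0, 2]))
```

In this model the number of broadcast iterations is set by the processor with the largest Y; processors that finish earlier simply idle in the locked state until the controller sees that all of them have signaled.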

In FIG. 15, the PIW, which is defined to be 128 bits long, is broadcast from the controller to the processors under the controller's control. The instruction is sent as two time-multiplexed 64-bit words. A PIW having the IW formats shown in FIG. 15 consists of a plurality of instruction fields.

The P Instruction Field (1 bit) is the parity bit used for error checking. When parity checking is performed, the total number of 1s in an instruction (including the parity bit) is always even. Errors in an instruction can arise while it is being transmitted from the controller to the processors. Note that one-bit errors are detected but two-bit errors are not.
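The even-parity rule, and the reason a two-bit error escapes detection, can be shown with a short sketch. The position of the parity bit (bit 127, above a 127-bit payload) and the function names are assumptions made for illustration; the text does not state where the bit sits in the word.

```python
WORD_BITS = 128  # length of the instruction word given in the text


def parity_bit(payload: int) -> int:
    """Even-parity bit: 1 if the payload holds an odd number of 1s."""
    return bin(payload).count("1") & 1


def with_parity(payload: int) -> int:
    """Attach the parity bit at the assumed top bit position."""
    assert 0 <= payload < (1 << (WORD_BITS - 1))
    return (parity_bit(payload) << (WORD_BITS - 1)) | payload


def parity_ok(word: int) -> bool:
    """True if the whole word, parity bit included, has an even count of 1s."""
    return bin(word).count("1") % 2 == 0


word = with_parity(0b1011)
print(parity_ok(word))          # word as transmitted
print(parity_ok(word ^ 0b1))    # one bit flipped in transit
print(parity_ok(word ^ 0b11))   # two bits flipped in transit
```

A single flipped bit changes the count of 1s from even to odd and is caught, whereas two flipped bits leave the count even and pass the check, as noted in the text.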

The Mode Instruction Field (2 bits) selects one of the four IW formats shown in FIG. 15, namely mode 0, mode 1, mode 2, or mode 3. The modes differ only in the size of the immediate data field in the IW. The space for the immediate field overlaps the instruction fields that specify the RI2 and RO2-RO4 instruction fields, so specifying an immediate value limits the number of data transfers to the RF 608. Several immediate field sizes are defined in order to minimize conflicts with RF 608 accesses.

    Mode   Immediate Field   RIx avail   ROx avail
    0      None              1, 2        1, 2, 3, 4
    1      32 bit            1           1
    2      16 bit            1, 2        1, 4
    3      8 bit             1, 2        1, 3, 4

The R Field (1 bit) signals a refresh of the local memory. The IPC Instruction Field (8 bits) is described later.
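For illustration, the mode table may be read as a lookup from the mode value to the immediate field size and the RIx and ROx fields that remain usable (mode 0: no immediate; mode 1: 32 bits; mode 2: 16 bits; mode 3: 8 bits). The sketch below assumes that reading of the table and treats immediates as unsigned; `MODES` and `smallest_mode` are hypothetical names.

```python
# Mode -> (immediate bits, usable RIx fields, usable ROx fields),
# following the reading of the mode table given in the lead-in.
MODES = {
    0: (0, (1, 2), (1, 2, 3, 4)),
    1: (32, (1,), (1,)),
    2: (16, (1, 2), (1, 4)),
    3: (8, (1, 2), (1, 3, 4)),
}


def smallest_mode(value: int) -> int:
    """Pick the mode with the narrowest immediate field that holds value.

    A narrower immediate overlaps fewer RIx/ROx fields, so more register
    file transfers stay available in the same instruction.
    """
    if value < 0:
        raise ValueError("this sketch only handles unsigned immediates")
    candidates = [(bits, mode) for mode, (bits, _, _) in MODES.items()
                  if bits and value < (1 << bits)]
    if not candidates:
        raise ValueError("value does not fit in a 32-bit immediate")
    return min(candidates)[1]


for v in (0x12, 0x1234, 0x12345678):
    mode = smallest_mode(v)
    bits, ri, ro = MODES[mode]
    print(hex(v), "mode", mode, "RIx", ri, "ROx", ro)
```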

The ALU Instruction Field format (19 bits) is: 1-bit I/F Select (shared with the Multiplier Instruction Field); 8-bit ALU Operation; 2-bit Source A Select; 2-bit Source B Select; 1-bit ACC1 Enable; 1-bit ACC2 Enable; 1-bit ACC1 H/L Select; 1-bit ACC2 H/L Select; and 2-bit Output Shift.

The ALU Instruction Field specifies the operation and the data sources for the ALU. The 1-bit I/F Select specifies whether the ALU operates in integer mode or in floating-point mode. The 8-bit ALU Operation field specifies which ALU function is performed. The 2-bit Source A Select and the 2-bit Source B Select each specify one of four data sources as an ALU operand. Two 1-bit fields determine whether the ACC1 and ACC2 registers are updated; when the ALU is not used by an instruction, the ACC values are preserved. Two 1-bit fields determine whether the upper or the lower 32-bit word of ACC1 and of ACC2 is supplied as input to the other components that use them as a data source. The 2-bit Output Shift field specifies the shift applied to the output of the ALU.
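As a sketch of how such a field might be packed and unpacked in a simulator, the subfield widths listed above can be placed in a table. The bit ordering (first-listed subfield in the most significant position) is an assumption, since the text gives only the widths; the names `ALU_FIELDS`, `pack`, and `unpack` are hypothetical.

```python
# (name, width in bits) in the order listed in the text; the placement of
# the first entry at the most significant end is an assumption.
ALU_FIELDS = [
    ("if_select", 1), ("alu_op", 8), ("src_a", 2), ("src_b", 2),
    ("acc1_enable", 1), ("acc2_enable", 1),
    ("acc1_hl", 1), ("acc2_hl", 1), ("out_shift", 2),
]


def pack(values: dict, layout=ALU_FIELDS) -> int:
    """Concatenate the subfields into one integer, first field highest."""
    word = 0
    for name, width in layout:
        word = (word << width) | (values[name] & ((1 << width) - 1))
    return word


def unpack(word: int, layout=ALU_FIELDS) -> dict:
    """Split an integer back into its named subfields."""
    shift = sum(width for _, width in layout)
    out = {}
    for name, width in layout:
        shift -= width
        out[name] = (word >> shift) & ((1 << width) - 1)
    return out


fields = {"if_select": 1, "alu_op": 0x5A, "src_a": 2, "src_b": 1,
          "acc1_enable": 1, "acc2_enable": 0,
          "acc1_hl": 0, "acc2_hl": 1, "out_shift": 3}
word = pack(fields)
print(f"{word:019b}", unpack(word) == fields)
```

The same two helpers would serve for the other instruction fields by supplying a different layout table.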

Data sources for Source A and Source B:

    Source A:   P      ACC1   RO1   IPC
    Source B:   ACC2   RO2    IMD   MR1

The operations that the ALU 600 can perform include the standard 32-bit and 64-bit unary and binary arithmetic and logical operations, shift operations, integer/floating-point conversion operations, and a multi-cycle integer divide operation. Conditional support operations include conditional subtraction and Update ACC1 if Zero/NonZero (used to perform conditional writes). Special-purpose operations include the MAX/MIN binary operation, which stores the larger value in ACC1 and the smaller value in ACC2, the find-first-zero-bit and find-first-one-bit unary operations, and absolute value.

The Auxiliary ALU (AuxALU) Instruction Field (4 bits) specifies the operation performed by the AuxALU, which is located near the RI1 port of the register file. The AuxALU is used to conditionally increment or decrement the data held in the ADR. The AuxALU operations encoded in the 4-bit operation field are as follows.

1) conditional increment; 2) inverse-conditional increment; 3) unconditional increment; 4) conditional decrement; 5) inverse-conditional decrement; 6) unconditional decrement; 7) conditional decrement and lock if the value is zero; 8) inverse-conditional decrement and lock if the value is zero; 9) unconditional decrement and lock if the value is zero; 10) load ACMR; 11) load ADR; 12) write ACMR; 13) write ADR; and 14) NOP.

To increment or decrement conditionally, a PSW mask must be loaded into the ACMR. If the condition specified by the mask is satisfied, the operation is performed on the value stored in the ADR. The operations that load the ACMR and the ADR read their data from the RI1 port. Because not every condition is represented explicitly in the PSW, there are operations that use the inverse of the condition specified by the mask. (Many conditions are mutually exclusive, such as zero and non-zero.) The operations that decrement and lock on a zero value are used to perform data-dependent operations such as power(x, y) (x raised to the power y): the value y is decremented by the AuxALU while the partial product multiplied by x is computed.

When the value y has been decremented to zero, the processor sets the LOR bit in its PSW and locks. When the controller receives the LOR signal, it sends an instruction to unlock the processors.

The Multiplier/Match Select Instruction Field (1 bit) determines whether the multiplier or the matcher is active. The two resources can never be active at the same time because their instruction fields overlap. When the instruction field specifies one resource, the other resource performs a NOP in that instruction cycle. When both resources must perform a NOP, the matcher is selected and given a NOP instruction.

The Multiplier Instruction Field (6 bits) format is: 1-bit Operation; 2-bit Source X Select; 2-bit Source Y Select; and 1-bit 1s/2s Select.

The 1-bit Operation field selects the operation for the multiplier. The 2-bit Source X Select and Source Y Select fields each select one of four data sources for the X and Y inputs to the multiplier.

The 1-bit 1s/2s Select field determines whether the multiplier operates in one's complement or in two's complement format. The 1-bit I/F Select specifies whether the multiplier operates in integer mode or in floating-point mode. This bit is located in the ALU Instruction Field, so the ALU and the multiplier both operate in the same mode.

The multiplier performs two operations: multiplication, or a direct load of the P register with a 64-bit value (as specified by the 1-bit Operation field). For a direct load, Source X Select and Source Y Select specify the locations of the upper and lower 32-bit words, respectively, that are loaded into the P register.

Data sources for Source X and Source Y:

    Source X:   IMD   MR1   RO3   ACC1
    Source Y:   IPC   MR2   RO4   ACC2

The Matcher Instruction Field (5 bits) format is: 1-bit Operation; 2-bit Source X Select; 2-bit Source Y Select; and 4-bit B Select (this field is mutually exclusive with the Source X and Source Y fields).

The 1-bit Operation field selects the operation for the matcher.

The 2-bit Source X Select and Source Y Select fields select the data sources for the X and Y source inputs. The 4-bit B Select field is mutually exclusive with the Source X and Source Y Select fields and is used in the match setup instruction.

The matcher performs two operations: matching and match setup.

When the matcher performs a match operation, Source X Select and Source Y Select specify the data sources for the X and Y inputs of the matcher. Two 32-bit values are matched in each instruction cycle, and the resulting match count is stored in the P register.

When a match setup operation is specified, the 4-bit B Select field specifies the value to be loaded into the matcher B register. Legal B values include 1-8, 16, 32, and 0 (no change). Specifying no change to the B value means that the multiplier and the matcher are both performing NOPs.

Data sources for Source X and Source Y:

    Source X:   IMD   MR1   RO3   ACC1
    Source Y:   IPC   MR2   RO4   ACC2

The RI1 Instruction Field (11 bits) format is: 7-bit Register File Address; 1-bit Write Enable; and 3-bit Write Data Source.

The RI1 port is used to write values into the 128-word RF 608. The 7-bit Register File Address field specifies the destination RF word. The 1-bit Write Enable field determines whether the specified RF word is updated. The 3-bit Write Data Source field specifies the source of the data transfer. The following registers and fields are sources for the RI1 port.

    RI1:   ACC1   IMD
           P(H)   IPC
           MR1    CR
           RO1    PSW

The RI2 Instruction Field format (10 bits) is: 7-bit Register File Address; 1-bit Write Enable; and 2-bit Write Data Source.

The operation of the RI2 port is identical to that of the RI1 port, except that it uses a 2-bit Write Data Source field. The following registers are sources for the RI2 port.

    RI2:   ACC2   P(L)   R2

The ROx Instruction Field (8 bits per field) format is: 7-bit Register File Address; and 1-bit Read Enable.

The RO1-RO4 registers are used to hold temporarily the values read from the 128-word RF 608. The 7-bit Register File Address field determines which word from the RF 608 is read into the register. The 1-bit Read Enable field determines whether the register is updated.

Each of the registers RO1-RO4 is a data source for other processor components.

    RO1:   ALU(A)   MW1   RI1   R
    RO3:   MPY(X)   MA1   IPCDR
    RO2:   ALU(B)   MW2   RI2   SW
    RO4:   MPY(Y)   MA2   IPCOR/CID

The Immediate (IMD) Field (32 bits, 16 bits, or 8 bits) is present when the Mode field is non-zero. The field size varies with the mode value, and the field overlaps the RIx and ROx fields. The IMD field is used as an input to the following destinations.

    IMD:   MPY(X)   MA1
           ALU(B)   MW2
           RI1      IPCOR/CID

The Address Generator Mode Bit (1 bit) determines whether the AGs operate in address generation mode or in setup mode. Setup mode provides the ability to load and store the AG register sets.

The Address Generator 1, 2 Instruction Fields (10 bits per field) have two modes: setup mode and address generation mode. The mode is determined by the Address Generator Mode Bit.

In AG mode, the Instruction Field has the following format.

2-bit Addressing Modes; 2-bit DS Select (mutually exclusive with the Stride Select field); 2-bit Address Arithmetic Mode Select; 2-bit Stride Select (mutually exclusive with the DS Select field); 4-bit Base Register Select (which overlaps the BLOS Register Select field); and 3-bit BLOS Register Select.

The Address Generator Instruction Field is described fully in connection with the Address Generator. The DS Select sources for the two AGs are as follows.

    AG1 DS:   IMD   MR1   ACC2   RO3
    AG2 DS:   IPC   MR2   ACC1   RO4

In setup mode, the Instruction Field has the following format: 2-bit Read/Write Enable, NOP; 2-bit Direct Source Select; 3-bit Register Select; and 3-bit Register Number.

The 2-bit Read/Write Enable field determines whether a register is being read into the AG register file set or written to RAM. If the write is to RAM, the corresponding RAM instruction field must also specify a write. When the AG writes a register value to RAM, that write overrides the Write Data Select field selection in the RAM field. The 2-bit DS Select chooses the source of the data loaded into the AG register file set.

The 3-bit Register Select chooses the register set to be loaded. The register sets are: 1) UB0-UB7, 2) User Limit Registers (UL0-UL7), 3) BR0-BR7, 4) Base Registers (BR8-BR15), 5) Limit Registers (LR0-LR7), 6) Limit Registers (LR8-LR15), 7) OR0-OR7, and 8) SR0-SR7.

The 3-bit Register Number field selects the active register within the set of eight. Details are given above.

The RAM Instruction Field (3 bits per field) consists of a 1-bit Read/Write field and a 2-bit Write Data Select field.

There are two independent read/write ports to local memory, and two 3-bit instruction fields that independently control access to that memory.

Each Random Access Memory (RAM) field controls access to the local memory attached to the processor.

The 1-bit Read/Write field determines whether a data value is to be read from memory or written to memory. If data is being written, the 2-bit Write Data Select determines the data source whose contents are written to memory. The exception is when the Address Generator is in S mode and is performing a write to RAM.

RAM1 Data Sources:  IPC  MR1  ACC1  RO1
RAM2 Data Sources:  IMD  MR2  ACC2  RO2

The PSW is a 32-bit register in each processor that contains information about the processor state after execution of the last operation. Information about the results of ALU operations, the AGs, and the processor state is found in the PSW.

The following ALU Status Bits (8 bits) are preserved for compatibility with the PE. The two groups of the eight status bits are complementary.

False (F): this bit is constant and always zero.

Carry (C): this bit is set when the ALU generates a carry.

>0 (GT): this bit is set when the ALU result is greater than zero.

>=0 (GE): this bit is set when the ALU result is >= 0.

Valid (VAL): this bit is set when the ALU result is valid.

Underflow (UF): this bit is set when the ALU result underflows.

Overflow (OF): this bit is set when the ALU result overflows.

Zero (Z): this bit is set when the ALU result is zero.

Two additional bits are used for floating point.

Inexact (INE): this bit is set when the floating-point result has been rounded or truncated.

NotANumber (NaN): this bit is set when the word is not a number.

The following Address Generator Status Bits (2 bits) are generated by the AGs. A bit is set when the array offset is outside the bounds of the array.

The bit is set when the next offset is computed to be out of bounds for a BLOS addressing operation, or when the current offset is computed to be out of bounds for any other addressing operation (see the detailed description of the Address Generator above).

OutOfBound1 (OOB1): the array offset is outside the bounds of the array (from Address Generator 1).

OutOfBound2 (OOB2): the array offset is outside the bounds of the array (from Address Generator 2).

The following Processor Conditional Locking Status Bits (4 bits) determine the execution state of the processor and are used by operations that conditionally lock and unlock the processor. See above for details.

Context (C): this status bit locks and unlocks the processor.

PrevContext (PC): when an interrupt occurs, the context bit is saved so that the context can be restored afterwards.

Executed (X): this status bit is used to determine whether the processor has executed at the current LIN number. This bit is used to enable interoperability of conditional execution.

Px: when an interrupt occurs, the Executed bit is saved so that it can be restored afterwards.

The following two bits are used to signal the sequencer.

LOR (LOR): this status bit is sent to the processor's controller to signal that an event has occurred in the processor. For example, the processor may signal the controller that a data-dependent operation has completed. Alternatively, LOR can be used as a 1-bit communication mechanism.

The following IPC Status Bits (2 bits) indicate status information for processor IPC Operations.

IPC Parity Error (IPCP): this bit is set if there is a parity error in the IPC data when an IPC Model operation is executed.

IPC Reduction (IPCR): this bit is set when a reduction operation was needed to process the received data and no reduction was performed.

The following Image Vault Status Bit (1 bit) is used by the Image Vault (IV) to signal that it has finished loading data into local memory.

IV Finished (IVF): this bit is set when the IV data has been loaded.

Of the 32 bits of the status word, 12 are currently undefined.
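The status word layout described above can be tallied to check the count of undefined bits. The sketch below only counts the bits that this passage names; the group labels are illustrative, and the sequencer group is counted as the single LOR bit because that is the only bit the passage names for it.

```python
# Status-bit groups of the 32-bit PSW, using the bit names given in the text.
PSW_GROUPS = {
    "ALU status": ["F", "C", "GT", "GE", "VAL", "UF", "OF", "Z"],
    "floating point": ["INE", "NaN"],
    "address generator": ["OOB1", "OOB2"],
    "conditional locking": ["C", "PC", "X", "Px"],
    # Only LOR is named in this passage, so only one bit is counted here.
    "sequencer signalling": ["LOR"],
    "IPC status": ["IPCP", "IPCR"],
    "Image Vault": ["IVF"],
}

PSW_WIDTH = 32

defined_bits = sum(len(bits) for bits in PSW_GROUPS.values())
undefined_bits = PSW_WIDTH - defined_bits

for group, bits in PSW_GROUPS.items():
    print(f"{group:22s} {len(bits)} bit(s): {', '.join(bits)}")
print("defined:", defined_bits, "undefined:", undefined_bits)
```

Counted this way, the named bits leave exactly the 12 undefined bits stated in the text.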

The IPC is a channel that transfers data between processors. The IPC has linear array network connectivity. Data can move in regular communication patterns such as data shifts, bypasses, and broadcasts, or in arbitrary one-to-one or one-to-many communication patterns. The IPC Logic in each processor is also capable of performing reduction operations on IPC data, such as sum, min, max, and, or, and xor.

Because the IPC is built into the processor design, communication latency is low. Processors connected four positions apart can transfer data once per processor instruction cycle. The linear array connectivity of the IPC restricts communication to one dimension, but this simplifies routing and assembly. IPC reduction operations give the processor additional functionality and increase on-chip parallelism. In addition, there is an operating mode (called IPC Tagged Mode) that supports random-access read/write capability and thereby gives the SE virtual crossbar communication capability.

The IPC is 64 bits wide with two parity bits and operates at 400 MHz, for a throughput of 3.2 GBytes/sec. It is implemented as dual 33-bit channels and operates at one to four times the processor's instruction clock rate.
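The throughput figure follows from the stated width and clock rate. The short calculation below is a sketch of that arithmetic; it assumes that the 3.2 GBytes/sec figure counts only the 64 data bits and not the two parity bits, and the variable names are illustrative.

```python
DATA_BITS = 64           # IPC data width stated in the text
PARITY_BITS = 2          # parity bits stated in the text
CLOCK_HZ = 400_000_000   # 400 MHz IPC clock stated in the text

# Assumption: throughput is quoted for data bits only.
payload_bytes_per_second = DATA_BITS * CLOCK_HZ // 8

# The physical channel carries data plus parity, split into two equal halves.
wire_bits = DATA_BITS + PARITY_BITS
bits_per_channel = wire_bits // 2

print(payload_bytes_per_second / 1e9, "GBytes/sec")
print(wire_bits, "wire bits,", bits_per_channel, "bits per channel")
```

The same numbers account for the dual 33-bit channels: 64 data bits plus 2 parity bits give 66 wire bits, or 33 per channel.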

The IPC operates from two instruction sources. An 8-bit field from the PIW specifies whether the IPC is active and controls the loading and storing of the IPC registers. The other instruction source is the 64-bit IPC Operation Register (IPCOR), which determines the specific IPC operation executed by the processor. This implementation allows each processor to specify a unique IPC operation; IPC operations are MIMD.

The IPC operates in one of two basic modes, IPC Mode and IPC Tagged Mode. In channel mode, the 33-bit IPCs are independently programmable. Each IPC can shift data left or right on the channel, bypass data left or right on the channel, or broadcast data to other processors. FIG. 16a shows a right shift of the IPC. A bypass operation removes processors from the shift operation. In FIG. 16a, processors 5, 6, and 7 are bypassed, and processor 8 receives the data from processor 4. In a broadcast operation, the processor that is the source of the communication sends a value to its neighboring processors. These processors in turn shift the data along the channel (FIG. 16c). Processors defined as broadcast sinks, such as processors 6 and 7 in FIG. 16d, do not pass the data on when they receive it. Processor 7 is both a source and a sink of a local broadcast.
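The shift, bypass, and broadcast behavior described above can be modelled on a plain list of register values. This is a behavioral sketch only, not the hardware implementation: the function names, the `boundary` argument, and the choice to treat sink positions as inclusive are assumptions made for illustration.

```python
def shift_right(values, bypassed=frozenset(), boundary=None):
    """One logical right shift on a linear array of registers.

    Each participating processor receives the value held by the nearest
    participating processor on its left. Bypassed processors are skipped
    and keep their own register contents (an assumption of this sketch).
    """
    result = list(values)
    incoming = boundary  # value entering the leftmost participating processor
    for index, value in enumerate(values):
        if index in bypassed:
            continue
        result[index] = incoming
        incoming = value
    return result


def broadcast(values, source, left_sink, right_sink):
    """Spread the source value over the span bounded by the two sinks.

    Sinks receive the value but do not pass it further, so processors
    outside [left_sink, right_sink] are left untouched.
    """
    if not left_sink <= source <= right_sink:
        raise ValueError("source must lie between the sinks")
    result = list(values)
    for index in range(left_sink, right_sink + 1):
        result[index] = values[source]
    return result


registers = [f"d{i}" for i in range(9)]
after_bypass = shift_right(registers, bypassed={5, 6, 7})
after_broadcast = broadcast(list("abcdefgh"), source=3, left_sink=1, right_sink=6)
print(after_bypass)
print(after_broadcast)
```

With processors 5, 6, and 7 bypassed, processor 8 ends up holding the value that processor 4 held before the shift, which is the case the text describes.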

In IPC Tagged Mode, the IPC operates as a single 66-bit channel. This mode is used to provide arbitrary one-to-one and one-to-many communication. In this mode, a tag called the Communication ID (CID) field is associated with the data. All processors that are to receive the data load the same CID value into their CID register (CIDR). The IPC is then shifted at maximum speed (4 shifts/cycle), and the matching hardware in the IPC Logic loads the IPC Data Register (IPCDR) with the tagged data whenever the CID value of the data matches the value in the CIDR.

Before the IPC operation, processors 0, [illegible], and 4 load tagged data onto the IPC, and every processor specifies the tagged data it is to receive, as shown in table (a) below. After the IPC operation, every processor has received the data associated with the tag specified in its CIDR, as shown in table (b) below.

  CID tag      5    20   10
  Data         A    B    C

  Processor    0    1    2    3    4    5
  CIDR         20   10   5    20   20   20
  IPCDR
  (a) Before the IPC operation

  Processor    0    1    2    3    4    5
  CIDR         20   10   5    20   20   20
  IPCDR        B    C    A    B    B    B
  (b) After the IPC operation

In addition to the 13-bit CID field, the 66-bit IPC Tagged operation word contains a 50-bit data field, a 2-bit tag field, and an even parity bit. The 2-bit tag field is user-defined, except that the tag value 00 is reserved to indicate that the data is invalid. The data field is formatted by the user, and its least significant 32 bits can be masked for IPC reduction operations. Suitable uses of the additional data field bits include a return CID, so that a value can be returned to the processor that initiated the tagged-data operation, or a memory address or array offset, so that the receiving processor can associate a memory location with the received data.

The operation of the IPC is determined from two instruction sources: the 8-bit IPC Instruction Field specified in the PIW, and the 64-bit IPC Operation loaded into the IPCOR. Because the IPC Instruction Field appears in the PIW, it is common to all processors, whereas the specified IPC Operation is local to each processor.

The IPC Instruction Field (8 bits) is located in the PIW. It has the following subfields.

  (1 bit)  Run/Stop
  (2 bits) Load IPCDR (Preserve, Load H&L, Load L, Load H)
  (2 bits) Source Select for IPCDR
  (1 bit)  Load IPCOR (Preserve, Load)
  (1 bit)  Load CIDR (Preserve, Load)
  (1 bit)  Source Select for IPCOR, CIDR

The 1-bit Run/Stop field determines whether the IPC is active on the current instruction. The 2-bit Load IPCDR field determines whether, and how, the 64-bit IPCDR is loaded. The four modes are: preserve the contents of the IPCDR, Load IPCDR(L), Load IPCDR(H), and Load IPCDR(L, H). In the last case, the low word and the high word of the IPCDR are loaded with the same 32-bit value. The 2-bit Source Select determines which source is loaded into the IPCDR. The 32-bit sources are RO3, ACC1, ACC2, and MR1. The 1-bit Load IPCOR field loads or preserves the value of the IPCOR. The 1-bit Load CIDR field loads or preserves the value of the CIDR. The 1-bit Source Select for the IPCOR and CIDR determines which source is loaded. The IPCOR and CIDR share common sources, namely IMD and MR2.
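The six subfields above account for all 8 bits of the IPC Instruction Field (1 + 2 + 2 + 1 + 1 + 1). The decoder below is a minimal sketch of how such a field could be split. The bit ordering (subfields packed in the listed order, most significant first), the numeric encoding of the four Load IPCDR modes, and all identifiers are assumptions; the text gives only the widths and meanings.

```python
from typing import NamedTuple


class IpcInstruction(NamedTuple):
    run: int
    load_ipcdr: str
    ipcdr_source: int
    load_ipcor: bool
    load_cidr: bool
    ipcor_cidr_source: int


# Assumed encoding order of the four Load IPCDR modes.
LOAD_IPCDR_MODES = ("preserve", "load_l", "load_h", "load_l_and_h")


def decode_ipc_instruction(field: int) -> IpcInstruction:
    """Split an 8-bit field into its subfields, MSB-first (assumed order)."""
    if not 0 <= field < (1 << 8):
        raise ValueError("the IPC Instruction Field is 8 bits wide")
    return IpcInstruction(
        run=(field >> 7) & 0x1,
        load_ipcdr=LOAD_IPCDR_MODES[(field >> 5) & 0x3],
        ipcdr_source=(field >> 3) & 0x3,
        load_ipcor=bool((field >> 2) & 0x1),
        load_cidr=bool((field >> 1) & 0x1),
        ipcor_cidr_source=field & 0x1,
    )


decoded = decode_ipc_instruction(0b1_11_01_1_0_1)
print(decoded)
```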

Because the operation depends on the data and on the processor, the IPC Operation (64 bits) is stored within the processor. The IPC Operation is a 64-bit value loaded into the IPCOR. It is actually two 32-bit operations, since each operation controls an IPC independently. The upper 32 bits control IPC1 and the lower 32 bits control IPC2. When a 64-bit value is communicated over the IPC, the upper and lower words of the IPC operation must be identical. There are two types of operation: the IPC Operation and the IPC Tagged Operation.

The IPC Operation is similar to the IPC operations supported by the PE.

In this mode, both IPCs are separately programmable. A 64-bit value is sent over the IPC by programming the two channels identically. There are three types of operation: shifting, bypassing, and broadcasting.

The IPC Tagged Operation is designed for arbitrary communication among a set of processors. In this mode, both IPCs must be used together to send a 64-bit word. The word consists of a message number, a CID, and data. The sender of the data assigns a CID to the 64-bit word to be sent, and every processor that is to receive the data must have the same CID loaded in its CIDR. In this way, one-to-one and one-to-many communication protocols are supported. Alternatively, the processor ID is used as the CID, and a range of processors is specified as the receivers of the data. The data format is left to the programmer and can include information such as a return CID.

For an IPC Tagged Operation, after the data is loaded into the IPCDR, the IPC contents are shifted every cycle for a period determined by the sequencer. The sequencer has a user-programmable counter that determines the number of cycles needed to deliver the data to its destination. Each processor compares the CID of the data shifted onto the IPC with the value in its CIDR. If the two CID values match and the tag bits of the word are non-zero, the 64-bit word is loaded into that processor's IPCDR.

The IPC Operation (27 bits) covers IPC operations such as shifting, bypassing, and broadcasting on the IPC. Because these operations control the IPCs independently, two different operations can be performed at once. (The IPC1 instruction is stored in the upper 32 bits of the IPCOR, and the IPC2 instruction is stored in the lower 32 bits.) The IPC Operation has the following 27-bit instruction field format:

a 1-bit Mode field (set to Channel Mode); a 1-bit IPCDR High/Low Select; a 1-bit IPC Speed (1 shift/cycle, 4 shifts/cycle); a 1-bit Enable Boundary Value; a 3-bit Reduction Operation; a 1-bit Left/Right Directional Bit; a 2-bit Operation (Shift, Bypass, Broadcast, NOP); a 1-bit Broadcast Send (Broadcast Send, NOP); a 2-bit Broadcast Receive (Broadcast Receive Left Boundary, Broadcast Receive Right Boundary, Broadcast Receive NOP); a 13-bit Capture Cycles field; and a 1-bit Repeat Operation.

The 1-bit Mode field identifies the instruction as an IPC Operation. The 1-bit IPCDR H/L Select determines whether the high word or the low word of the IPCDR is read by the other processor components. The 1-bit IPC Speed field determines whether the IPC operates at the same speed as the processor (1 shift/cycle) or at four times the processor speed (4 shifts/cycle). There is a 1-bit Enable Boundary Value field that specifies whether the processor shifts its data value to the next processor. Enabling boundary values prevents interference between several independent IPC operations that use the IPC at the same time. The 3-bit Reduction Operation field is common to both modes.

The IPC Operation has a 1-bit field that determines whether the direction of the IPC is left or right. The 2-bit Operation field determines whether a shift, bypass, broadcast, or NOP is performed. If a broadcast operation is being performed, the 1-bit Broadcast Send field determines whether the processor is the originator of the broadcast. The 2-bit field determines how the processor participates in receiving the broadcast. The processor either receives the data value and passes it on to the adjacent processor on the IPC or, if one of the boundary specifications is selected, acts as a sink for the broadcast value. Broadcast Receive Left Boundary specifies that the processor is the leftmost processor on the IPC to receive the data. Broadcast Receive Right Boundary specifies that it is the rightmost processor. Because there are 32 bits in an IPC Operation, 5 bits per channel are currently unused.
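The field widths listed for the channel-mode IPC Operation can be checked against the 27-bit total and the 5 unused bits per 32-bit channel word. The sketch below also packs and unpacks a word from named fields. The most-significant-first bit order and all field identifiers are assumptions for illustration; the text specifies only widths.

```python
# (name, width) pairs in the order the text lists them.
CHANNEL_OP_FIELDS = [
    ("mode", 1),
    ("ipcdr_high_low_select", 1),
    ("ipc_speed", 1),
    ("enable_boundary_value", 1),
    ("reduction_operation", 3),
    ("direction", 1),
    ("operation", 2),
    ("broadcast_send", 1),
    ("broadcast_receive", 2),
    ("capture_cycles", 13),
    ("repeat_operation", 1),
]

CHANNEL_WORD_BITS = 32


def pack(fields, values):
    """Pack named values into one integer, first field most significant."""
    word = 0
    for name, width in fields:
        value = values.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bit(s)")
        word = (word << width) | value
    return word


def unpack(fields, word):
    """Inverse of pack(): recover the named values from the integer."""
    values = {}
    for name, width in reversed(fields):
        values[name] = word & ((1 << width) - 1)
        word >>= width
    return values


used_bits = sum(width for _, width in CHANNEL_OP_FIELDS)
unused_bits = CHANNEL_WORD_BITS - used_bits

example = {"ipc_speed": 1, "operation": 2, "capture_cycles": 100}
round_trip = unpack(CHANNEL_OP_FIELDS, pack(CHANNEL_OP_FIELDS, example))
print(used_bits, unused_bits, round_trip["capture_cycles"])
```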

FIG. 16 is a high-level diagram of the IPC performing shift, bypass, and broadcast operations. The registers represent the IPCDR of each processor.

The top diagram illustrates a right shift of the bus. The second diagram illustrates a bypass operation in which three processors are bypassed. In this example, a bypass pattern is specified that makes the first and fifth processors (counting from the left) logically adjacent. A single shift to the right moves the data from the first processor to the fifth processor. (It should be understood that this operation does not necessarily take place in one instruction cycle. If many processors are bypassed, several instructions may be needed to shift the data to the next logically connected processor.) In the third diagram, the third processor from the left is broadcasting its value. In the bottom diagram, several processors are broadcasting. The second and fourth processors from the left execute Broadcast Send instructions, while the third processor executes Broadcast Receive Right Boundary and the fourth processor executes Broadcast Receive Left Boundary. This is the way to specify the sinks for a broadcast and to prevent local broadcasts from interfering with one another.

The IPC Tagged Operation (62 bits) permits arbitrary communication among a set of processors. In this operation the IPC is used as a single 64-bit channel. For a tagged operation, a counter in the sequencer is loaded with the number of cycles needed to complete the communication. When the counter reaches zero, the IPC communication is complete and the sequencer is signalled that the communication has finished. The tagged operation has two data formats, and the format determines how the CID is interpreted.

The IPC Tagged Operation has a 62-bit instruction field format: a 1-bit Mode field (set to Tagged Mode); a 1-bit IPCDR High/Low Select; a 1-bit IPC Speed (1 shift/cycle, 4 shifts/cycle); a 1-bit Enable Boundary Value; a 3-bit Reduction Operation; a 1-bit IPC Data Range format; an 11-bit Left Shift Cycle x 4; an 11-bit Right Shift Cycle x 4; and a 32-bit Reduction Mask.

In the IPC Tagged Operation, the 1-bit Mode field identifies the instruction. The 1-bit IPCDR H/L Select determines whether the high word or the low word of the IPCDR is read by the other processor components. The 1-bit IPC Speed field determines whether the IPC operates at the same speed as the processor (1 shift/cycle) or at four times the processor speed (4 shifts/cycle). There is a 1-bit Enable Boundary Value field that specifies whether the processor should shift its data value to the next processor. Enabling boundary values prevents interference between several independent IPC operations that use the IPC at the same time. The 3-bit Reduction Operation field is common to both modes.

The 1-bit IPC Data Range format specifies one of the two legal data formats for interpreting the CID value. There are two 11-bit fields that specify how far the data is shifted to the left and to the right of the processor.

The specified value is scaled by four, so specifying a number in a field means that the data is shifted four times that number in that direction. The 32-bit reduction mask is applied to the least significant 32 bits of data in the IPCDR and specifies which bits of the word are subject to the reduction operation. There are two undefined bits.
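The tagged-operation widths and the scaling of the shift fields can be worked through numerically. The following sketch assumes that a shift field holds the smallest value whose fourfold shift count covers the required distance; the helper name `shift_field_for` and the field identifiers are hypothetical.

```python
# (name, width) pairs for the IPC Tagged Operation as listed in the text.
TAGGED_OP_FIELDS = [
    ("mode", 1),
    ("ipcdr_high_low_select", 1),
    ("ipc_speed", 1),
    ("enable_boundary_value", 1),
    ("reduction_operation", 3),
    ("data_range_format", 1),
    ("left_shift_cycles_x4", 11),
    ("right_shift_cycles_x4", 11),
    ("reduction_mask", 32),
]

OPERATION_REGISTER_BITS = 64
SHIFT_FIELD_BITS = 11
SHIFT_SCALE = 4  # a field value n means 4 * n shifts


def shift_field_for(distance):
    """Smallest field value whose scaled shift count reaches `distance`."""
    if distance < 0:
        raise ValueError("distance must be non-negative")
    value = -(-distance // SHIFT_SCALE)  # ceiling division without floats
    if value >= (1 << SHIFT_FIELD_BITS):
        raise ValueError("distance exceeds the 11-bit shift field")
    return value


defined_bits = sum(width for _, width in TAGGED_OP_FIELDS)
undefined_bits = OPERATION_REGISTER_BITS - defined_bits
max_shift = SHIFT_SCALE * ((1 << SHIFT_FIELD_BITS) - 1)

print(defined_bits, undefined_bits, max_shift, shift_field_for(10))
```

A distance of 10 positions therefore needs a field value of 3, which shifts the data 12 positions in that direction.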

In Tagged Data format 1, the CID value is interpreted as a communication ID number. Any processor with a matching CID number in its CIDR receives the data.

The 64-bit data word format of the IPC is: a 1-bit Even Parity Bit; a 13-bit CID field; a 2-bit Tag bit field; and a 50-bit Data field.

In this format, the 1-bit Even Parity bit is used to detect errors. The 13-bit CID field contains the value to be matched by the destination processors. There is a user-defined 2-bit tag field. If that field is non-zero, the 64-bit word contains meaningful data. (Although the tag bits are user-defined, the tag bit pattern '00' is reserved.) The 50-bit field is for data. It is the programmer's responsibility to decide the data format.
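The format 1 word can be modelled as a packed integer with an even parity bit. This is a sketch under stated assumptions: the fields are packed in the listed order with parity as the most significant bit, and even parity is taken to mean that the whole word, parity bit included, contains an even number of ones. The function names are illustrative.

```python
CID_BITS, TAG_BITS, DATA_BITS = 13, 2, 50
BODY_BITS = CID_BITS + TAG_BITS + DATA_BITS  # 65 bits below the parity bit


def pack_tagged_word(cid, tag, data):
    """Build parity | CID | tag | data (assumed bit order)."""
    for name, value, width in (
        ("cid", cid, CID_BITS),
        ("tag", tag, TAG_BITS),
        ("data", data, DATA_BITS),
    ):
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
    body = (cid << (TAG_BITS + DATA_BITS)) | (tag << DATA_BITS) | data
    parity = bin(body).count("1") & 1  # makes the total count of ones even
    return (parity << BODY_BITS) | body


def unpack_tagged_word(word):
    """Return (cid, tag, data, valid); raise on a parity error."""
    if bin(word).count("1") % 2:
        raise ValueError("parity error")
    data = word & ((1 << DATA_BITS) - 1)
    tag = (word >> DATA_BITS) & ((1 << TAG_BITS) - 1)
    cid = (word >> (DATA_BITS + TAG_BITS)) & ((1 << CID_BITS) - 1)
    return cid, tag, data, tag != 0  # tag 00 is reserved for invalid data


word = pack_tagged_word(cid=20, tag=1, data=0xABCDEF)
print(unpack_tagged_word(word))
```

Note that the four listed widths add up to 66 bits, which matches the single 66-bit channel described for IPC Tagged Mode.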

One suitable data format for the 50 bits is an 11-bit Return CID address and 11-bit Data. Another suitable data format is (18) an Offset into an array; and (32) Data (to be stored into or read from the array).

An example of communicating with IPC Tagged Data format 1 is shown in tables (a) and (b) below. Table (a) shows the state before the IPC Operation, with every processor having a CID value loaded in its CIDR.

  CID tag      5    20   10
  Data         A    B    C

  Processor    0    1    2    3    4    5
  CIDR         20   10   5    20   20   20
  IPCDR
  (a) Before the IPC Operation

Next, the processors place their local data, together with a CID tag, into the IPCDR.

The bus is shifted at high speed, and the matching hardware attempts to match the CID value of the tagged data against the CIDR value. On a match, the data is loaded into the IPCDR. After the IPC Operation, the result is therefore as shown in table (b).

  Processor    0    1    2    3    4    5
  CIDR         20   10   5    20   20   20
  IPCDR        B    C    A    B    B    B
  (b) After the IPC Operation

If several words of data on the IPC have the same CID value, the resulting value in the IPCDR depends on the IPC reduction operator.
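The matching step in this example can be reproduced with a few lines of code. The sketch below models only the end result of the communication, not the cycle-by-cycle shifting, and it assumes each CID appears on at most one tagged word, so no reduction is needed. The function name and the tuple layout are illustrative.

```python
def tagged_delivery(cidr, tagged_words):
    """Return the IPCDR contents after a format 1 tagged operation.

    cidr         -- CIDR value of each processor, indexed by processor number
    tagged_words -- iterable of (cid, tag, data) words placed on the IPC
    """
    ipcdr = [None] * len(cidr)
    for cid, tag, data in tagged_words:
        if tag == 0:
            continue  # tag 00 is reserved: the word carries no valid data
        for processor, wanted in enumerate(cidr):
            if wanted == cid:
                ipcdr[processor] = data
    return ipcdr


cidr = [20, 10, 5, 20, 20, 20]
words = [(5, 1, "A"), (20, 1, "B"), (10, 1, "C")]
result = tagged_delivery(cidr, words)
print(result)
```

Running the sketch on the CIDR values and tags of table (a) yields the IPCDR row of table (b).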

The format of the 64-bit data on the IPC is: (1) Even Parity Bit; (13) CID field; (2) Tag bit field; (8) Range field; and (42) Data field.

The 1-bit Even Parity bit is used to detect errors. The 13-bit CID field contains a processor ID value; in this mode, the CID field is loaded with the processor ID. There is a user-defined 2-bit tag bit field; if that field is non-zero, the 64-bit word carries meaningful data. (Although the tag bits are user-defined, the tag bit pattern '00' is reserved.) The 8-bit Range field specifies a contiguous range of processors: the Range value indicates that the processors between [CID] and [CID+Range] receive the data. For the 42-bit data field, the programmer must decide the data format.
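The range rule can be illustrated with a short Python sketch. It assumes that every processor's CIDR holds its own logical processor number and that both ends of the interval [CID, CID+Range] are included; the function name `range_targets` is hypothetical.

```python
def range_targets(cid, range_value, num_processors):
    """Return the processors that accept a format-2 word.

    Assumes CIDR of processor p holds p, and that the interval
    [cid, cid + range_value] is inclusive at both ends.
    """
    return [p for p in range(num_processors) if cid <= p <= cid + range_value]


# A CID of 2 with a Range of 3 in a seven-processor array.
print(range_targets(2, 3, 7))  # [2, 3, 4, 5]
```

Under these assumptions, a CID of 2 with a Range of 3 selects processors 2 through 5.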

The example shown in the two tables below illustrates how IPC Tagged Data format 2 works. In this example, processor 0 sends data 'A' to processors 2 through 5. First, as shown in the first table, each processor loads its logical processor number into CIDR, and processor 0 specifies a CID of 2 with a range of 3.

Initial values:
Processor   0    1    2    3    4    5    6
CIDR        0    1    2    3    4    5    6
Range       3    .    .    .    .    .    .

After the IPC operation, as shown in the second table, processors 2 through 5 hold the correct data value.

After the IPC operation:
Processor   0    1    2    3    4    5    6
Received    .    .    A    A    A    A    .

The Reduction Operation Field (3 bits) is common to both the IPC Channel and Tagged operations. It specifies the reduction operation to be performed on the IPC data. If the field specifies that no reduction is to occur and a reduction turns out to be needed, a bit in the PSW is set. There are eight reduction operations:

1) XOR        5) Max
2) AND        6) Min
3) OR         7) Sum
4) Replace    8) Sort

In Channel Mode, the data field is the 32-bit value of IPCDR for the particular channel (channel 11 uses the upper 32 bits of IPCDR and channel 12 uses the lower 32 bits). In Tagged Mode, the data field is variable and is defined by the 32-bit Reduction Mask, which is applied to the least significant 32 bits of the data. The specified reduction operation is performed on the word received over the IPC bus and on IPCDR. The result of the operation is supplied to the processor as signal IPC. In FIG. 6, signal IPC can be written to local memory via AG 610-1, stored in a register of RF 608, or applied as the Y operand of match unit 604 or multiplier 602, or as the A operand of ALU 600.

A data reduction operation proceeds as follows. The data value received by IPC logic 612 over the IPC bus is one operand, and the value held in IPCDR is the other operand. Once the operation has been performed, the result is stored in IPCDR, replacing its initial contents. Two of the operations listed above, Replace and Sort, are easier to understand with a short explanation. With the Replace operation, the value received over the IPC bus replaces the initial contents of IPCDR. With the Sort operation, the larger operand occupies the 32 MSB positions of IPCDR and the smaller operand occupies the 32 LSB positions.
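The eight reduction operations can be modeled as a small Python function. This is a minimal sketch under stated assumptions, not the circuit's definition: operands are treated as unsigned 32-bit values, Sum wraps around at 32 bits, and the string operation names and the function `ipc_reduce` are introduced only for illustration.

```python
MASK32 = 0xFFFFFFFF


def ipc_reduce(op, ipcdr, received):
    """Combine the IPCDR operand with the word received over the IPC bus.

    Assumptions (not stated in the text): unsigned 32-bit operands and
    32-bit wraparound for SUM. SORT returns a 64-bit value with the
    larger operand in the upper 32 bits and the smaller in the lower 32.
    """
    a, b = ipcdr & MASK32, received & MASK32
    if op == "XOR":
        return a ^ b
    if op == "AND":
        return a & b
    if op == "OR":
        return a | b
    if op == "REPLACE":
        return b
    if op == "MAX":
        return max(a, b)
    if op == "MIN":
        return min(a, b)
    if op == "SUM":
        return (a + b) & MASK32
    if op == "SORT":
        return (max(a, b) << 32) | min(a, b)
    raise ValueError(f"unknown reduction operation: {op}")


print(hex(ipc_reduce("SORT", 3, 9)))  # 0x900000003
```

The Sort case shows why the result needs the full width of IPCDR: both operands survive, ordered by magnitude.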

The IOMC performs all data transfers between the SE and all external sources. The SE is organized into cylinders, each cylinder containing a processor, local memory, and an IOMC. Because of this organization, the only form of communication between the IOMC and the processor is through local memory. Processor I/O is therefore memory mapped, and it is the job of the controller and the IOMC to ensure that data transfers between external sources and local memory are carried out properly.

The IOMC is connected to three main I/O Channels: the Data Input Channel (DIC); the Data Output Channel (DOC); and the Host I/O Channel (HIOC). These handle data transfers with the video sources, the video destinations, and the host workstation, respectively. The DIC and DOC are connected to the IOMC through processor interfaces called the Input Slice and the Output Slice.

The Host I/O Bus (HIO) is a 32-bit bidirectional channel that connects the Host Workstation to the IOMCs. The channel connects the IOMCs of the linear array to the host, which is located at the left end of the HIO. The channel has a data rate of 200 MB/sec.

The DIC is a 48-bit unidirectional channel that connects the IOMCs to up to four Video Sources simultaneously. The DIC consists of four independently controlled 12-bit serial channels, and each channel operates from a different clock (so that each channel can read a different Video Source).

The DIC connects the IOMCs of the linear array to the Video Sources located at the left end of the DIC. The channel carries data along the bus from left to right. The DIC is connected to the IOMC through the Input Slice.

The channel operates at a maximum speed of 86 MHz and has a data rate of 1.2 GB/sec.

The DOC is a 48-bit unidirectional channel that connects the IOMCs to up to four Video Destinations simultaneously. Like the DIC, the DOC consists of four independently controlled 12-bit serial channels, and each channel operates from a different clock (so that each channel can write to a different Video Destination). When the video input and output channels carry data in the same format at the same speed, there is a mode in which the DOC operates from the DIC clock. The DOC connects the IOMCs of the linear array to the Video Destinations located at the left end of the DOC. The bus carries data from right to left. The DOC is connected to the IOMC through the Output Slice. The channel operates at a maximum speed of 86 MHz and has a data rate of 1.2 GB/sec.

In FIG. 17, the Input Slice is the IOMC interface to the DIC. It consists of an Input Controller 1700, a total of four 64 x 32-bit FIFOs 1702-1 through 1702-4 (one for each DIC channel), and the hardware that interfaces with the DIC. Each of the FIFOs 1702-1 through 1702-4 includes a formatter (FMT) that converts the 12-bit input into a 32-bit output. Data from the DIC is either directed through the FIFO to local memory or passed along the DIC to the Input Slice of the next IOMC in the linear array. Alternatively, data from the Output Slice of the previous IOMC can be routed to the Input Slice of the IOMC. Controller 1700 has two functions: controlling the data loaded from the DIC into FIFOs 1702-1 through 1702-4, and transferring data from FIFOs 1702-1 through 1702-4 to local memory.

In FIG. 18, the Output Slice is the IOMC interface to the DOC. It consists of an Output Controller 1800, a total of four 64 x 32-bit FIFOs 1802-1 through 1802-4 (one for each DOC channel), and the hardware that interfaces with the DOC. Each of the FIFOs 1802-1 through 1802-4 includes a formatter (FMT) that converts the 32-bit input into a 12-bit output. Data from local memory is directed through FIFOs 1802-1 through 1802-4 to the DOC, or DOC data from the previous IOMC is passed on to the next IOMC. Controller 1800 has two functions: transferring data from local memory to the Output FIFOs 1802-1 through 1802-4, and sending the output of FIFOs 1802-1 through 1802-4 to the DOC.

The HIO is used for non-real-time data transfers between local memory and the host workstation. It supports interactive control and algorithm modification by the user, as well as activities such as program loading, input, and output. The host channel supports both scalar and vector data transfers.

Data transferred between the host and memory is buffered. Buffering is necessary because the host and the SE operate at different clock rates and are not tightly coupled. The host reads and writes data in the Operating System (OS) Buffer over the VME bus. The IOMC reads and writes data in the OS Buffer over the HIO bus. One set of registers is present in each of the IOMCs and controllers connected to the linear array (see FIG. 3). Data is shifted over the HIO bus at a rate of 200 MB/sec. The OS Board coordinates use of the OS Buffer to guarantee that data is transferred correctly between the host and local memory. The OS Buffer and OS Board are described in detail later. The HIO bus is a 32-bit bidirectional path connected in series to all IOMCs and controllers. Data is written to the OS Buffer by loading it into the HIO Register (HIOR) and shifting it to the left on the HIO bus until it is stored in the OS Buffer. Similarly, a write to an IOMC is performed by reading the OS Buffer and shifting the data to the right until it reaches the destination HIOR.

Two kinds of data are sent over the bus: vector data and scalar data. Vector data is an array of 32-bit data whose size equals the number of processors. The data is sent onto the HIO in reverse order, so the first data word is destined for the rightmost processor and the last data word is destined for the leftmost processor. In this way all of the data arrives at the processors in the same cycle.
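The reverse-order trick for vector writes can be checked with a simple shift-register model in Python. The sketch assumes one HIOR per processor, with index 0 at the left end next to the host, and one right shift per cycle; the function name `hio_vector_write` is hypothetical.

```python
def hio_vector_write(vector):
    """Shift a vector onto a chain of HIOR registers, one word per cycle.

    hior[0] models the leftmost IOMC (adjacent to the host). The host
    sends the elements in reverse order, and each cycle every word moves
    one register to the right.
    """
    hior = [None] * len(vector)
    for word in reversed(vector):      # word for the rightmost processor first
        hior = [word] + hior[:-1]      # shift right by one position
    return hior


data = [10, 11, 12, 13]
print(hio_vector_write(data))  # [10, 11, 12, 13]
```

After as many shifts as there are processors, element i sits in the HIOR of processor i, so every processor can latch its word in the same cycle.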

Scalar data is sent to an IOMC by specifying the destination ID number (PROC#NUM) in the host word sent to all IOMC processors and placing a single word of data on the HIO bus. No shifting takes place in scalar mode; the host bus operates as a true bus monitored by all processors.

To determine whether it is the destination of the scalar data, each IOMC compares the PROC#NUM in the host word with the value in its PIDR.

The IOMC has a 42-bit Host Command, received from the controller, that specifies the HIO operation. The HIO Command has the following instruction fields:

(1) Vector/Scalar Select
(1) Host Read/Write Shift
(1) Shift Enable (Load HIOR Enable)
(13) Processor ID
(1) Memory Enable
(1) Memory Read/Write
(23) Memory Address Field
(1) Load PIDR Enable

The 1-bit Vector/Scalar Select field determines whether the data on the bus is vector or scalar. The 1-bit Host Read/Write Shift field shifts data to the left on the bus for a read by the host and to the right on the bus for a write by the host. The leftmost IOMC shifts data to the host on a Host Read shift; the rightmost processor is the last processor, so it shifts data off the bus. The 1-bit Shift Enable field shifts data from the bus into HIOR. The 13-bit Processor ID field is used in scalar mode for comparison with the value in PIDR; if the values match, the data is loaded into HIOR. The 1-bit Memory Enable field enables local memory access. The 1-bit Memory Read/Write field specifies whether the memory access is a read or a write. The 23-bit Memory Address field specifies the local bank and address that take part in the Host Read/Write.
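The field widths listed above add up to 42 bits, which can be made concrete with a small pack/unpack sketch in Python. The bit ordering is an assumption: the text gives only the widths, so the sketch simply places the fields most-significant first in the order they are listed. The field names and the functions `pack_hio_command` and `unpack_hio_command` are hypothetical.

```python
# (name, width) pairs in the order listed in the text; MSB-first packing
# is an assumption made for this sketch.
HIO_FIELDS = [
    ("vector_scalar_select", 1),
    ("host_rw_shift", 1),
    ("shift_enable", 1),
    ("processor_id", 13),
    ("memory_enable", 1),
    ("memory_rw", 1),
    ("memory_address", 23),
    ("load_pidr_enable", 1),
]


def pack_hio_command(**values):
    """Pack named field values into a single 42-bit integer."""
    word = 0
    for name, width in HIO_FIELDS:
        value = values.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word = (word << width) | value
    return word


def unpack_hio_command(word):
    """Split a 42-bit integer back into a dict of field values."""
    fields = {}
    for name, width in reversed(HIO_FIELDS):
        fields[name] = word & ((1 << width) - 1)
        word >>= width
    return fields


cmd = pack_hio_command(processor_id=5, memory_enable=1, memory_address=0x1234)
print(unpack_hio_command(cmd)["processor_id"])  # 5
```

A scalar write to processor 5, for example, would carry 5 in the Processor ID field so that only the IOMC whose PIDR matches loads its HIOR.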

Because the PROC#NUM value is not hardwired into the cylinder, the 1-bit Load PIDR Enable field needs to be exercised only during SE initialization. During initialization, the host transmits an array of PROC#NUM values as a vector. (Vector data transfers do not use the PROC#NUM value.) Once a value has been received by HIOR, it must be loaded into PIDR. Load PIDR Enable loads PIDR with the contents of HIOR. The PROC must also be initialized with its PROC#NUM. For this purpose PROC#NUM is also written to local memory, so that the PROC can read the value and initialize itself with PROC#NUM.

Video Input is sent to the SE over the 48-bit unidirectional DIC.

The DIC is in fact four independently controlled 12-bit channels, each of which can read from a different Video Input. Conceptually, the Video Input is at the left end of the DIC and is connected to the leftmost IOMC processor.

All IOMC processors are connected in series by the DIC, and the rightmost IOMC is the last processor on the bus. Data is transferred along the DIC from left to right.

The IOMC/DIC interface is called the Input Slice and is controlled by the IOMC Input Controller. The Input Controller performs two basic functions: it transfers data from the DIC to the Input FIFO (Video Capture), and it transfers data from the Input FIFO to local memory (Video to Memory Transfer). Video Capture is carried out autonomously, based on the synchronization signals sent by the data source. A Video to Memory Transfer is carried out when the controller is interrupted by a Video Interrupt.

At present, the SE supports seven video data formats for representing pixels. The formats are Composite Video; Y.C (Luminance/Chroma) (Multiplexed); Y.C (Luminance/Chroma) (Dedicated Channel); RGB (Multiplexed); RGB (Dedicated Channel); Feedback; and a 32-bit RGB format with four 8-bit fields.

Data to be loaded into local memory is taken from the 12-bit serial channel and packed into a 32-bit word by the formatter located in front of the Input FIFO. Data with multiple fields enters the formatter time-division multiplexed. The format is determined by the Input Controller and is changed through the Video Capture Setup Instruction.

In FIG. 19, the formatter has three pixel formats. Every video format listed above corresponds to one of these pixel formats. Pixel format 1 is a single 12-bit data field located in the lower 12 bits of the 32-bit word. Pixel format 2 has two 12-bit data fields, each filling the lower 12 bits of one 16-bit halfword. Pixel format 3 has three data fields: two 10-bit fields and one 12-bit field, with the 12-bit field in the lower 12 bits of the word.
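The three pixel formats amount to three ways of placing 12-bit channel samples into a 32-bit word. The Python sketch below shows one plausible layout. The placement of the 12-bit fields follows the text; the order of the two 10-bit fields within the upper 20 bits of format 3 is an assumption, and the function names are hypothetical.

```python
def pack_format1(value12):
    """Pixel format 1: one 12-bit field in bits 11..0."""
    return value12 & 0xFFF


def pack_format2(upper12, lower12):
    """Pixel format 2: a 12-bit field in the low 12 bits of each halfword."""
    return ((upper12 & 0xFFF) << 16) | (lower12 & 0xFFF)


def pack_format3(first10, second10, value12):
    """Pixel format 3: two 10-bit fields in bits 31..12, 12-bit field in 11..0.

    The ordering of the two 10-bit fields is assumed for illustration.
    """
    return ((first10 & 0x3FF) << 22) | ((second10 & 0x3FF) << 12) | (value12 & 0xFFF)


print(hex(pack_format2(0xABC, 0x123)))       # 0xabc0123
print(hex(pack_format3(0x3FF, 0, 0xFFF)))    # 0xffc00fff
```

In format 2 the top four bits of each halfword stay zero, which is why a Luminance/Chroma pair can later be split on a 16-bit boundary.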

FIG. 20 shows the different video modes supported by the SE. In Composite Video mode, Composite Video is sent over a 12-bit channel as a 12-bit value. The value is loaded into the lower 12 bits of a 32-bit word. This is pixel format 1, as specified by the Input Controller.

In Luminance/Chroma (Y.C) mode, the information is encoded as two 12-bit values and transmitted over a 12-bit channel. The two values are time-division multiplexed into a 32-bit word by the formatter. The luminance value is loaded into the lower 12 bits of the upper 16-bit halfword, and the chroma value is loaded into the lower 12 bits of the lower 16-bit halfword. This is pixel format 2, as specified by the Input Controller.

In Y.C (Luminance/Chroma, Dedicated Channel) mode, each of the two components has a dedicated channel. Each of the two 12-bit values is loaded by the formatter into the lower 12 bits of a 32-bit word. This is pixel format 1 (the same format as Composite Video).

In RGB (Multiplexed) mode, the RGB signal is encoded as three 10-bit values transmitted over a 12-bit channel. The three values are time-division multiplexed into a 32-bit word by the formatter. The upper 20 bits are loaded with the red and green components, and the lower 10 bits are loaded with the blue component. This is pixel format 3, as specified by the Input Controller, with a 10-bit value loaded into the lower 12-bit field.

In RGB (Dedicated Channels) mode, each color component is given a dedicated channel. Each of the three 12-bit values is loaded by the formatter into the lower 12 bits of a 32-bit word. This is pixel format 1 (the same format as Composite Video).

Feedback mode is used to feed back 32-bit values. The word is broken into two 10-bit values and one 12-bit value. The three values are time-division multiplexed into a 32-bit word by the formatter. The 12-bit value occupies the lower 12 bits of the 32-bit word, and the two 10-bit values occupy the upper 20 bits. This is pixel format 3, as specified by the Input Controller.

The four-field RGB format has four 8-bit fields representing the R, G, and B components and one further component of the video signal. For transmission over the DIC and DOC, the word is broken into two 10-bit values and one 12-bit value. The three values are time-division multiplexed into a 32-bit word by the formatter. This is pixel format 3, as specified by the Input Controller.

In FIG. 21, the Video Capture Commands are used for the process of "capturing" video data from the DIC and loading it into Input FIFO 2100. Two-dimensional video input frame data is transmitted over the serial DIC. The frame is read onto the DIC line by line, from left to right, in the same way a page is read. The job of each IOMC Input Controller 2102 is to determine which pixels on the DIC are loaded into its local memory 2104.

Because the video input cannot be delayed, the operation of capturing pixels from the DIC and loading them into Input FIFO 2100 is carried out automatically by Input Controller 2102, independently of the sequencer instruction stream. The Input Controller uses the H (Horizontal Synchronization Signal), F (Frame Synchronization Signal), and video clock signals, together with the parameters provided by the DIC Input Timing Sequencer Register 2106 (ITSR), to decide when to load another pixel from the DIC into the FIFO. Each channel has its own set of signals and its own ITSR.

The Input Controller has one set of two counters per channel: the Pixel Counter and the Line Counter. These counters operate on the H, F, and video clock signals and are used to determine the position of a pixel within the frame of the video input. The Pixel Counter indicates the horizontal position of the pixel on the DIC within the current line of video input. The Line Counter indicates the vertical position of the current line of video input on the DIC within the frame.

The DIC operates at the video clock rate, and another 12-bit value is clocked into the DIC Register (DICR) each time the video clock signal is asserted. The Pixel Counter is incremented every one, two, or three video clock cycles, depending on whether the pixel format has one, two, or three data fields. The H signal, which marks the start of a new line of video, increments the Line Counter and resets the Pixel Counter. An F signal is generated each time a frame of video is transmitted on the DIC. When it is generated, both the Pixel Counter and the Line Counter are reset. The F signal also signals the Frame Address Generator (FAG) to change the frame buffer address. The SE uses an arbitrary buffering scheme with a minimum of two buffers (double buffering), so that one video frame is processed by the processors while another frame is loaded by the IOMC. The main advantage of arbitrary buffering is that earlier frames of data can persist for many frames after they have been loaded. This is necessary for programs that use temporal data.
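The interplay of the video clock, H, and F signals with the two counters can be sketched as a small Python class. This is an illustrative model under the assumption that the counters start at zero and that the Pixel Counter advances after the last field of each pixel has been clocked in; the class and method names are hypothetical.

```python
class ChannelCounters:
    """Pixel and Line counters for one DIC channel (illustrative model)."""

    def __init__(self, fields_per_pixel):
        # 1, 2, or 3 data fields, depending on the pixel format
        self.fields_per_pixel = fields_per_pixel
        self.pixel = 0
        self.line = 0
        self._clocks = 0

    def video_clock(self):
        """One 12-bit value is clocked in; advance the pixel when complete."""
        self._clocks += 1
        if self._clocks == self.fields_per_pixel:
            self._clocks = 0
            self.pixel += 1

    def h_sync(self):
        """Start of a new line: increment Line Counter, reset Pixel Counter."""
        self.line += 1
        self.pixel = 0
        self._clocks = 0

    def f_sync(self):
        """Start of a new frame: reset both counters."""
        self.line = 0
        self.pixel = 0
        self._clocks = 0


counters = ChannelCounters(fields_per_pixel=3)
for _ in range(6):
    counters.video_clock()
print(counters.pixel, counters.line)  # 2 0
```

With a three-field pixel format, six video clocks advance the Pixel Counter by two, which matches the one-increment-per-three-clocks rule in the text.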

The ITSR is used to determine when pixels are read from the DIC into the Input FIFO. It also specifies parameters such as the number of consecutive pixels to read and how often the read is repeated.

The 27-bit DIC ITSR format is used to specify the parameters of how data on the DIC is read into the Input FIFO. The register has four fields: (2) Pixel Data Format, (13) Initial Pixel Position, (6) Number of Pixels, and (6) Pixel Repeat Interval.

The 2-bit Pixel Data Format field selects the pixel format used by the channel formatter. It is needed to determine how often the Pixel Counter is incremented relative to the video clock signal.

The 13-bit Initial Pixel Position field determines when the first pixel is read from the DIC for each line of video input. It specifies the horizontal position of the pixel on the current line. The value of this field is compared with the value of the Pixel Counter; when the two values match, the pixel is loaded into the FIFO.

The 6-bit Number of Pixels field determines the number of consecutive pixels read into the FIFO. This value is loaded into the NumPix Counter when the Initial Pixel Position matches the Pixel Counter or when the Pixel Repeat Interval Counter (PRI Counter) has been decremented to zero. The NumPix Counter is decremented each time the Pixel Counter is incremented, and the Input Controller loads pixels into the FIFO until the counter reaches zero. The H signal resets the NumPix Counter.

The 6-bit Pixel Repeat Interval field specifies how often a group of contiguous pixels is read in. When the Initial Pixel Position field is matched by the Pixel Counter value, the Pixel Repeat Interval is loaded into the PRI Counter. Each time the Pixel Counter is incremented, the PRI Counter is decremented. When the PRI Counter has been decremented to zero, the PRI Counter and the NumPix Counter are reloaded.

The H signal resets the PRI Counter and the NumPix Counter.

An ITSR is illustrated as an example of a pixel input format (see FIG. 22). The illustration is an example in which the Initial Pixel Position is 1, the Number of Pixels is 3, and the PRI is 11.
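The counter behavior described for the ITSR fields can be walked through for one line of video with a short Python sketch. It follows one plausible reading of the text: the NumPix and PRI counters are loaded when the Pixel Counter equals the Initial Pixel Position, both count down once per pixel, and both are reloaded when the PRI Counter reaches zero. The exact cycle at which the hardware reloads is not given here, so the timing in the sketch is an assumption, as is the function name `capture_line`.

```python
def capture_line(initial_pos, num_pixels, repeat_interval, line_length):
    """Return the pixel positions on one line that are loaded into the FIFO.

    Assumed timing: counters are loaded when the Pixel Counter equals
    initial_pos, decremented once per pixel, and reloaded when the PRI
    counter reaches zero.
    """
    num_pix = 0
    pri = 0
    active = False
    captured = []
    for pos in range(line_length):
        if pos == initial_pos:
            num_pix, pri, active = num_pixels, repeat_interval, True
        if active:
            if num_pix > 0:
                captured.append(pos)   # pixel goes into the Input FIFO
                num_pix -= 1
            pri -= 1
            if pri == 0:               # start the next group of pixels
                num_pix, pri = num_pixels, repeat_interval
    return captured


# Parameters of the example: position 1, three pixels, interval 11.
print(capture_line(1, 3, 11, 30))  # [1, 2, 3, 12, 13, 14, 23, 24, 25]
```

Under this reading, the example parameters make the IOMC take three contiguous pixels out of every eleven, starting at position 1.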

FIG. 23 illustrates the difference between the two problem spaces. The SE has the ability to acquire multiple contiguous pixels. In the PE, each processor received only one pixel, so the problem space was distributed across the local memories modulo the number of processors. In a 1024-processor system (in FIG. 23, one and the same processor of the 1024 is indicated by the single-hatched areas, and the next processor of the 1024 is indicated by the double-hatched areas), every 1024th column of a frame of video resides on the same processor. Under scheme 1 of FIG. 23, which is the PE case, processing the first 1024 pixels of a 2048-pixel scan line uses all the processors of a 1024-processor system, and processing the second 1024 pixels of that 2048-pixel scan line then requires all of the same 1024 processors again. The SE can operate according to scheme 1, and it can also operate according to scheme 2 of FIG. 23. In that case the system design gains additional flexibility, and slices of up to 64 columns can reside on the same processor.
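The two ways of distributing image columns can be compared with a small Python sketch. Scheme 1 follows directly from the modulo distribution described in the text. The formula used for scheme 2 (contiguous slices of `slice_width` columns assigned to consecutive processors) is an assumption made for illustration, since the text states only that slices of up to 64 columns can reside on one processor; both function names are hypothetical.

```python
def owner_scheme1(column, num_processors):
    """Scheme 1: columns are dealt out modulo the number of processors."""
    return column % num_processors


def owner_scheme2(column, num_processors, slice_width):
    """Scheme 2 (assumed layout): contiguous slices of columns per processor."""
    if not 1 <= slice_width <= 64:
        raise ValueError("slice width is limited to 64 columns")
    return (column // slice_width) % num_processors


# 2048-pixel scan line on a 1024-processor system.
print(owner_scheme1(0, 1024), owner_scheme1(1024, 1024))        # 0 0
print(owner_scheme2(0, 1024, 2), owner_scheme2(1, 1024, 2))     # 0 0
```

With scheme 1, columns 0 and 1024 land on the same processor; with a slice width of 2 under the assumed scheme 2 layout, adjacent columns 0 and 1 share a processor instead.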

FIG. 24, Video to Memory Transfer (FIFO Read), shows how the contents of the Input FIFO 2400 are read into local memory. The FIFO 2400 is loaded continuously as Video Capture commands execute. To store the contents of the FIFO 2400 in memory 2402, the controller 2404 must be interrupted by an interrupt program 2406 before the memory transfer begins. This is done via an interrupt that is invoked each time another line of video has been clocked into the DIC.

Five registers are defined for each channel to determine how and where the data in the Input FIFO is stored in local memory. Four of these registers are used by the FAG to generate the effective base address of the active frame buffer. As shown, the FAG uses the Frame Pointer (FPTR) + Address Counter 2408. The fifth register is the FIFO Input Timing Sequence Register 2410 (FITSR), which describes how data is stored in the frame buffer (located in local memory).

The Video to Memory Transfer Instruction is a multicycle instruction. When it executes, a specified number of pixels is transferred from the FIFO 2400 to local memory 2402. The parameters of the instruction are stored in the FITSR 2410.

The FIFO Input Timing Sequence Register 2410 (FITSR) format (32 bits) specifies the parameters for how data in the Input FIFO is read into local memory. The register has four fields; a fifth field is taken from the ITSR. The fields are: (11) Initial Frame Offset, (6) Delta Offset, (11) Modulo L, (4) Wait Cycles, and (6) (from the ITSR) Number of Pixels.

The 11-bit Initial Frame Offset specifies where the first element is stored in the frame, relative to the frame base value. For example, if an offset of 8 is specified, the image represented in local memory is shifted 8 lines down from the image as it appears at the video source. The 6-bit Delta Offset specifies an additional vertical offset that is added to the address on each operation. The 11-bit Modulo L field determines when the vertical position calculation wraps around; the field holds the (limit) value L.

In FIG. 25, with an Initial Frame Offset of 2, a Delta Offset of 3, and a Modulo L value of 16, successive data transfers appear on lines 2, 5, 8, 11, 14, 1, 4, 7, 10, 13, and so on.
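
The wrap-around in this example can be reproduced with a few lines of Python. This is only a sketch of the arithmetic described above, not the hardware; the function name and its parameters are hypothetical, chosen to mirror the FITSR field names.

```python
def transfer_lines(initial_frame_offset, delta_offset, modulo_l, count):
    """Return the line offsets used by `count` successive transfers.

    Assumption: each transfer advances the offset by Delta Offset and
    wraps at Modulo L, as the text describes for FIG. 25.
    """
    lines = []
    offset = initial_frame_offset
    for _ in range(count):
        lines.append(offset)
        offset = (offset + delta_offset) % modulo_l
    return lines


# Values taken from the FIG. 25 example in the text.
print(transfer_lines(2, 3, 16, 10))
```

With the FIG. 25 values the sixth transfer lands on line 1 because 14 + 3 = 17 wraps to 1 modulo 16.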

The 4-bit Wait Cycles field determines the number of additional clock cycles to wait before the transfer is completed. This field is used when slow local memory is accessed.

The 6-bit Number of Pixels field specifies the number of pixels to be transferred to local memory. Because this value is always the same as for Video Capture, the parameter is not stated explicitly in the FITSR, although it is still a parameter of the Video to Memory Transfer.

The IOMC has a single FAG that is used by all channels. Because there is only one port to local memory, only one FAG is needed for all of the Video Input and Output Sources. Each channel has five registers: the Frame Base Register (FBR), the Frame Offset Register (FOR), the Frame Stride Register (FSR), the Frame Limit Register (FLR), and the Pixel Offset Register (POR).

The frame buffers for each channel are allocated in contiguous local memory. Every frame buffer must be the same size. The FBR and FLR specify the first and last locations of the memory allocated to the frame buffers. The FSR contains the frame size. The FOR contains the offset of the active frame buffer relative to the FBR value. (A frame buffer is active if it is the buffer currently being loaded with data.) Each time another frame of data is loaded, the FAG computes FOR = (FOR + FSR) modulo FLR to generate the next buffer offset. The base address for the active frame is computed as Effective Address = FBR + FOR.
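
The buffer rotation described above can be sketched as a small Python model. The class and attribute names are hypothetical, and one assumption is made loudly: FLR is used directly as the modulus in FOR = (FOR + FSR) modulo FLR, that is, as the total size of the buffer region, even though the text also describes FLR as marking the last allocated location.

```python
class FrameAddressGenerator:
    """Toy model of the per-channel frame registers (illustrative only)."""

    def __init__(self, fbr, fsr, flr):
        self.fbr = fbr          # Frame Base Register: start of the buffer region
        self.fsr = fsr          # Frame Stride Register: size of one frame buffer
        self.flr = flr          # Frame Limit Register: assumed to be the wrap modulus
        self.frame_offset = 0   # FOR ("for" is a Python keyword, hence the long name)

    def base_address(self):
        # Effective Address = FBR + FOR
        return self.fbr + self.frame_offset

    def next_frame(self):
        # FOR = (FOR + FSR) modulo FLR, applied each time a new frame is loaded
        self.frame_offset = (self.frame_offset + self.fsr) % self.flr
        return self.base_address()


# Three equally sized buffers of 0x100 words starting at 0x1000 (made-up values).
fag = FrameAddressGenerator(fbr=0x1000, fsr=0x100, flr=0x300)
bases = [fag.base_address()] + [fag.next_frame() for _ in range(3)]
print([hex(b) for b in bases])
```

With three buffers the base address cycles through the three buffer starts and returns to the first on the fourth frame, which is the behavior a triple-buffering scheme would need.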

The POR is used to reference a location within the active frame buffer. The POR is updated according to the parameters described in the FITSR. Each time a new frame starts (when the F signal is generated), the POR is initialized to the Initial Frame Offset. When a Video to Memory Data Transfer instruction is executed, the pixel offset is calculated.

POR = Initial Frame Offset (immediately after the F signal)
POR = (POR + Delta Stride) modulo L

The Initial Frame Offset, Delta Stride, and L values are all specified in the FITSR. In addition, the POR is incremented once for each pixel transferred (as specified by the Number of Pixels field).

The Effective Local Memory address for a location in the active frame buffer is calculated as EA = FBR + FOR + POR.
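
As a worked illustration of the address calculation, the following Python sketch steps the pixel offset across several transfers and forms the effective address for each one. It assumes that the effective address is the sum of the frame base, the frame offset, and the pixel offset, that the per-transfer stride is the Delta Offset field of the FITSR, and that one pixel is moved per transfer, so the separate per-pixel increment is left out. All names and numeric values are hypothetical.

```python
def effective_addresses(fbr, frame_offset, initial_frame_offset,
                        delta_stride, modulo_l, transfers):
    """List the local-memory address used by each successive transfer."""
    pixel_offset = initial_frame_offset  # value loaded right after the F signal
    addresses = []
    for _ in range(transfers):
        # EA = frame base + active-buffer offset + pixel offset
        addresses.append(fbr + frame_offset + pixel_offset)
        # Advance by the stride and wrap at the limit value L
        pixel_offset = (pixel_offset + delta_stride) % modulo_l
    return addresses


# Made-up register contents: base 1000, active buffer at offset 256.
print(effective_addresses(1000, 256, 2, 3, 16, 6))
```

The last address in the printed list is lower than the first because the pixel offset has wrapped from 14 to 1.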

Video Input Operation Setup briefly describes how the ITSR, FITSR, and FAG registers for each channel are initialized with new values. Because all of these values are user-specified and processor-dependent, the addressing information must come from the processor. To simplify initialization and modification of these registers, a section of local memory is reserved. The processor writes the data to dedicated memory locations that the IOMC can read.

Because reserved memory locations are used for initialization, the instructions that change video parameters are executed as system calls. Some parameters cannot, and should not, be specified by the user. For example, the system needs protection against a user attempting to update the parameters of a channel that is in use by another application. A system call, which can protect against such situations, is therefore the appropriate way to perform video input operation setup.

Video Output is sent over the 36-bit unidirectional DOC. The DOC consists of three independently controlled 12-bit channels, each of which writes to a different Video Output.

Conceptually, the Video Output is at the left end of the DOC and is connected to the leftmost IOMC. All IOMCs are connected in series by the DOC, and the rightmost IOMC is the last processor on the bus. Data travels on the DOC from right to left.

The IOMC/DOC interface is called the Output Slice and is controlled by the IOMC Output Controller. The Output Controller performs two basic functions: transferring data from local memory to the Output FIFO (Memory to Video Transfer), and transferring data from the Output FIFO to the DOC (Video Display).

Memory to Video Transfer is executed when the sequencer is interrupted by a Video Interrupt. Video Display is executed autonomously by the Output Controller, based on the synchronization sent by the output data channel source.

Details of the definitions of video formats and pixel formats are given above. Seven video data formats are supported: Composite Video, Luminance/Chroma, Luminance/Chroma (Dedicated Channel), RGB, RGB (Dedicated Channels), ARGB, and Feedback.

Video Data residing in the Output FIFO is packed into 32-bit words.

When the FIFO receives a signal from the Output Controller to send its data onto the 12-bit serial DOC, the formatter located in the Output FIFO unpacks the data (according to the pixel format specified by the Output Controller) and time-demultiplexes it onto the DOC. This pixel formatter in the Output FIFO performs the inverse of the operation performed by the pixel formatter located in the Input FIFO.

Video Display Commands describes the process of displaying video data by outputting pixels onto the DOC. The process is complicated by the fact that a two-dimensional video output frame must be transmitted over the serial DOC. Pixels are clocked onto the DOC line by line, from left to right, in the same way that a page is written. It is the role of each IOMC Output Controller to determine when its Output FIFO is clocked onto the DOC.

In FIG. 26, the operation of writing the contents of the FIFO 2600 to the DOC is performed automatically by the Output Controller 2602, independently of the controller's instruction stream, so that video output is not delayed. The Output Controller 2602 uses the H, F, and video clock signals, together with the parameters specified in the DOC Output Timing Sequence Register 2604 (OTSR), to determine when to write another pixel from the Output FIFO 2600 to the DOC.

The Output Controller 2602 has, for each channel, a set of three counters that are incremented by the H, F, and video clock signals. These counters determine the pixel position within the output video frame. The Pixel Counter indicates the horizontal position of the pixel on the DOC for the current line of video output. The Line Counter determines the vertical position of the pixel on the DOC for the current line of video output.

The DOC operates at the video clock rate: each time the video clock signal is asserted, another 12-bit value is clocked onto the DOC. The Pixel Counter increments every one, two, or three video clock cycles, depending on whether the pixel format has one, two, or three data fields. The H signal, which marks the start of a new line of video, increments the Line Counter and resets the Pixel Counter. The F signal is generated each time a frame of video is completed; when it occurs, both the Pixel Counter and the Line Counter are reset. The F signal also signals the FAG to change the frame buffer address. The SE can use any buffering scheme. The minimum number of buffers is two (double buffering), so that one video frame is processed by the processors while another frame is displayed.

The OTSR 2604 is used to determine when pixels are loaded from the Output FIFO 2600 onto the DOC. It also specifies parameters such as the number of consecutive pixels to write.

The DOC OTSR 2604 (27 bits) specifies the parameters for how data in the Output FIFO 2600 is written to the DOC. It has the same format as the ITSR. The register has four fields: (2) Pixel Data Format, (13) Initial Pixel Position, (6) Number of Pixels, and (6) Pixel Repeat Interval.

The 2-bit Pixel Data Format field selects the pixel format used by the channel formatter. This is needed to determine how often the Pixel Counter increments relative to the video clock signal.

The 13-bit Initial Pixel Position field determines when, after the H signal is generated, the next pixel in the Output FIFO is loaded onto the DOC. It specifies the horizontal position that the pixel has in the video output frame. The value of this field is compared with the value of the Pixel Counter; when the two values match, the pixel is loaded onto the DOC. The 6-bit Number of Pixels field determines the number of consecutive pixels to load onto the DOC. This value is loaded into the NumPix Counter when the Initial Pixel Position matches the Pixel Counter, or when the Pixel Repeat Interval Counter (PRI Counter) decrements to zero.

The NumPix Counter decrements on each Pixel Counter increment, and the Output Controller loads pixels onto the DOC until the counter reaches zero. The H signal resets the NumPix Counter.

The 6-bit Pixel Repeat Interval field specifies how often the run of consecutive pixels is written to the DOC. When the Initial Pixel Position field matches the Pixel Counter value, the Pixel Repeat Interval is loaded into the PRI Counter.

Each time the Pixel Counter increments, the PRI Counter decrements. When the PRI Counter reaches zero, the PRI Counter and the NumPix Counter are reloaded. The H signal resets the PRI Counter and the NumPix Counter.
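
The interplay of the Pixel Counter, NumPix Counter, and PRI Counter may be easier to follow as a short simulation of one video line. The Python sketch below is a hypothetical model, not the hardware: it assumes one Pixel Counter step per loop iteration, that all counters start at zero after the H signal, and that the Pixel Repeat Interval is at least as large as the Number of Pixels. The function name and the example values are illustrative.

```python
def output_positions(initial_pixel_position, number_of_pixels,
                     pixel_repeat_interval, line_width):
    """Return the horizontal positions at which one IOMC drives the DOC."""
    positions = []
    numpix = 0   # NumPix Counter, reset by the H signal
    pri = 0      # PRI Counter, reset by the H signal
    for pixel_counter in range(line_width):
        if pixel_counter == initial_pixel_position:
            # A match loads both counters from the OTSR fields.
            numpix = number_of_pixels
            pri = pixel_repeat_interval
        if numpix > 0:
            positions.append(pixel_counter)  # a pixel is loaded onto the DOC
            numpix -= 1
        if pri > 0:
            pri -= 1
            if pri == 0:
                # The interval has elapsed: reload both counters for the next run.
                pri = pixel_repeat_interval
                numpix = number_of_pixels
    return positions


# Illustrative values: start at position 1, runs of 3 pixels, repeated every 11.
print(output_positions(1, 3, 11, 30))
```

In this model each IOMC emits short runs of pixels at a fixed spacing along the line, which is how several IOMCs with different Initial Pixel Positions could interleave their output on the shared channel.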

The OTSR specifies the same type of pixel formatting for output as the ITSR does for input. The only difference is that the inverse operation is performed.

FIG. 27 shows how local memory 2700 is written to the Output FIFO 2702 in a Memory to Video Transfer (FIFO Write). The FIFO 2702 is emptied continuously as Video Display commands execute. To load the contents of memory into the Output FIFO 2702, the controller 2704 must be interrupted by an interrupt program 2706, which then proceeds with the memory transfer. This is done via an interrupt that is invoked each time another line of video is ready to be clocked onto the DOC.

Five registers are defined for each channel to determine which data in local memory 2700 is loaded into the Output FIFO, and in what order. Four of these registers are used by the FAG to generate the effective base address for the active frame buffer. The fifth register is the FIFO Output Timing Sequence Register 2708 (FOTSR), which describes how data is read from the frame buffer (located in local memory).

The Memory to Video Transfer Instruction is a multicycle instruction. When it executes, a specified number of pixels is transferred from local memory 2700 to the Output FIFO. The parameters for the instruction are stored in the FOTSR 2708, as described below.

The FOTSR format (32 bits) specifies the parameters for how data in local memory is written into the Output FIFO. It has the same format as the FITSR 2410. The register has four fields, and a fifth field is read from the OTSR 2604: (11) Initial Frame Offset, (6) Delta Offset, (11) Modulo L, (4) Wait Cycles, and (6) (from the OTSR) Number of Pixels.

The 11-bit Initial Frame Offset specifies an additional vertical offset that is added to the frame when the image is displayed. For example, if an offset of 8 is specified, the output image is displayed at the video output destination 8 lines lower than it appears in local memory.

The 6-bit Delta Offset specifies the vertical offset added to the address on each operation. Given a Delta Offset of 2, the first transfer has a vertical offset of zero, the second a vertical offset of 2, the third a vertical offset of 4, and so on.

The 11-bit Modulo L field determines when the vertical position calculation wraps around; the field holds the (limit) value L. For example, with an Initial Frame Offset of zero, a Delta Offset of 4, and a Modulo L value of 15, successive data transfers appear on lines 0, 4, 8, 12, 1, 5, 9, 13, 2, 6, and so on.

The 4-bit Wait Cycles field determines the number of additional clock cycles to wait before the transfer is completed. This field is used when slow local memory is accessed.

The 6-bit Number of Pixels field specifies the number of pixels transferred to the Output FIFO. Because this value is always the same as for Video Display, the parameter is not stated explicitly in the FOTSR; it is nevertheless still a parameter of the Memory to Video Transfer.
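
To make the 32-bit FOTSR layout concrete, the following Python sketch packs the four fields into one word, unpacks them again, and then replays the wrap-around example given for the Modulo L field. The field widths come from the text; the bit ordering (first-listed field in the most significant bits), the unsigned treatment of every field, and all function names are assumptions made only for illustration.

```python
# (name, width in bits), in the order the text lists them; widths sum to 32.
FOTSR_FIELDS = (
    ("initial_frame_offset", 11),
    ("delta_offset", 6),
    ("modulo_l", 11),
    ("wait_cycles", 4),
)


def pack_fotsr(values):
    """Pack a dict of field values into a single 32-bit integer."""
    word = 0
    for name, width in FOTSR_FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    return word


def unpack_fotsr(word):
    """Recover the field values from a packed 32-bit integer."""
    values = {}
    for name, width in reversed(FOTSR_FIELDS):
        values[name] = word & ((1 << width) - 1)
        word >>= width
    return values


def line_sequence(word, count):
    """Vertical offsets of `count` successive transfers for a packed register."""
    fields = unpack_fotsr(word)
    offset = fields["initial_frame_offset"]
    lines = []
    for _ in range(count):
        lines.append(offset)
        offset = (offset + fields["delta_offset"]) % fields["modulo_l"]
    return lines


settings = {"initial_frame_offset": 0, "delta_offset": 4,
            "modulo_l": 15, "wait_cycles": 1}
register = pack_fotsr(settings)
print(hex(register), line_sequence(register, 10))
```

Because the widths are 11, 6, 11, and 4 bits, the four fields fill the 32-bit word exactly, which is consistent with the Number of Pixels value being held in a different register.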

Video Output Operation Setup briefly describes how the OTSR, FOTSR, and FAG registers for each channel are initialized with new values. Because all of these values are user-specified and processor-dependent, the addressing information must be received from the processor. To simplify initialization and modification of these registers, a section of local memory is reserved. The processor writes the data to dedicated memory locations that the IOMC can read.

Because reserved memory locations are used for initialization, the instructions that change video parameters are executed as system calls. Some parameters cannot, and should not, be specified by the user. For example, the system needs protection against a user attempting to update the parameters of a channel that is in use by another application. A system call, which can protect against such situations, is therefore the appropriate way to perform video output operation setup.

The feedback capability of the SE allows data from the Output FIFO 2600 to be written to the data output channel (DOC) and read into the Input FIFO 2400, so that the SE can manipulate data in memory without involving other parts of the processor. FIGS. 27a through 27i illustrate two memory operations: one rotates an array of values, and the other transposes an array of values. The memory organization is illustrated in FIG. 27a. For simplicity, only four processors (0 through 3) are shown, each addressing four memory locations.

In FIG. 27a, the four memory locations are addressed by each processor using offsets 0 through 3. The examples below are for the case in which P processors operate on an N×N matrix of values. Thus, in FIGS. 27a through 27i, N and P both equal 4.

To perform the array transpose operation, the FOTSR, OTSR, FITSR, and ITSR registers are set as follows. In the FOTSR register, the Initial Frame Pointer Offset field is set to (P+1) modulo N, the Delta Offset field is set to +1, the Modulo L field is set to N, the Wait Cycles field is set to 1, and the Number of Pixels field is set to 1. In the OTSR register, the Initial Pixel Position field is set to P, the Pixel Repeat Interval field is set to N, and the Number of Pixels field is set to 1. These registers control the Output FIFO so that it supplies values from the array to the data output channel (DOC) in the order shown in FIG. 27b.

In the FITSR register, the Initial Frame Pointer Offset field is set to (P−1+N) modulo N, the Delta Offset field is set to −1, the Modulo L field is set to N, the Wait Cycles field is set to 1, and the Number of Pixels field is set to 1. In the ITSR register, the Initial Pixel Position field is set to (P−1+N) modulo N, the Pixel Repeat Interval field is set to N, and the Number of Pixels field is set to 1. These registers control the Input FIFO so that it stores data values from the DOC into the array in the order shown in FIG. 27b. FIGS. 27d and 27e show the pixels A through P in the memory array before and after the array transpose operation, respectively.

Another useful memory operation is the array rotation operation. In this operation, the contents of the array are rearranged as if the array had been rotated by 90°. To perform the array rotation operation, the FOTSR register is set so that the Initial Frame Pointer field is P, the Delta Offset field is +1, the Modulo L field is N, the Wait Cycles field is 1, and the Number of Pixels field is 1. The OTSR register is set so that the Initial Pixel Position field is P, the Pixel Repeat Interval is N, and the Number of Pixels is 1. With these register settings, the Output FIFO 2600 supplies values from the array to the DOC in the order shown in FIG. 27f.

To complete the operation, the FITSR register is set so that the Initial Frame Pointer Offset field is (N−P−1), the Delta Offset field is +1, the Modulo L field is N, the Wait Cycles field is 1, and the Number of Pixels field is 1. The ITSR register is set so that the Initial Pixel Position field is (N−P−1), the Pixel Repeat Interval field is N, and the Number of Pixels field is 1. Using these values, the Input FIFO 2400 stores data values from the DOC into the array in the order shown in FIG. 27g. The result of this operation is to move the data values A through P, shown in FIG. 27h, to the rotated positions shown in FIG. 27i. It is contemplated that other configurations of the registers that control the input and output FIFOs 2400/2600 may be used to perform other remapping operations.

FIG. 18a shows the circuitry used to interface each of the processors with the IV through its associated IOMC. The IV 320 is an array of disk drives, one for each processor in the SE. It is contemplated, however, that other types of secondary memory, such as flash memory, magnetic bubble memory, or even random access memory devices, may be used as the IV 320.

In FIG. 18a, each disk drive has serial input and output ports that connect to the portion of the IOMC that interfaces with the IV. These may be, for example, standard RS-232 connections. Data is supplied to the serial input port through a parallel-to-serial (P/S) interface 1816, and data transmitted via the serial output connection is sent to a serial-to-parallel (S/P) interface 1818. On the IOMC side, data is sent to the P/S 1816 and received from the S/P 1818 through a 39-bit FIFO buffer of about 1 kilobit. The 39 bits comprise 32 data bits and 7 bits of error detection code (EDC).

The FIFO 1810 also receives control information (that is, data addresses) from the disk drives of the IV and supplies control information to them. This control information enters the data stream through the control circuit 1820. The address values transferred through the control circuit 1820 are 23-bit values, each corresponding to an individual 32-bit data word.

Thus, a typical disk drive holds up to 32 megabits of data. Data transferred through the FIFO 1810 is either supplied by the 32-bit EDC encoder 1812 or supplied to the 32-bit EDC decoder 1814, depending on whether the data is being written to or read from the IV. The EDC decoder 1814 also provides a 1-bit error signal indicating that an error was detected in the decoded data. In response to this signal, the processor either attempts to access the data again or simply reports the error to the controller or host.
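
The capacity implied by the 23-bit address values and 32-bit data words can be checked with simple arithmetic. The Python sketch below only multiplies out the figures stated in the text; the variable names are hypothetical, and the 39-bit stored word (32 data bits plus 7 EDC bits) is taken from the FIFO description.

```python
ADDRESS_BITS = 23      # width of the address values passed through circuit 1820
DATA_BITS = 32         # payload bits per addressed word
EDC_BITS = 7           # error detection code bits carried with each word

addressable_words = 2 ** ADDRESS_BITS
payload_bits = addressable_words * DATA_BITS
payload_mebibytes = payload_bits // 8 // 2 ** 20
stored_bits = addressable_words * (DATA_BITS + EDC_BITS)

print(addressable_words, payload_bits, payload_mebibytes, stored_bits)
```

Under these figures the payload comes to 2^28 bits, which is 32 mebibytes. The capacity stated in the text may therefore be meant in bytes rather than bits, although that is an inference from the arithmetic alone.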

In FIG. 18a, there are four 32-bit output channels and four 32-bit input channels for sending data to and receiving data from the IV 320. As shown in FIGS. 17 and 18, these channels are multiplexed into the local memory for the input/output slices.

The IV can be used to hold relatively long image sequences or to hold large databases. The high data bandwidth that results from the high degree of parallelism allows rapid image acquisition for image processing applications and fast database searches for database applications.

The SE has MIMD capability. There is one controller for every 64 processors, and each controller can broadcast a different instruction stream to its processors. This organization provides multiple MIMD instruction streams, with hardware support for synchronization between controllers. Synchronization is required between a controller and the processors under its control, and between controllers. The LOR bus is used to synchronize the processors to their controller, and the LOR bus, the Global OR (GOR) bus, and the Neighboring LOR (NOR) bus are used for synchronization between controllers.

Processor synchronization is necessary for operations whose completion time depends on local processor data. For example, all processors may have to repeat a section of code until a local data condition becomes false.

Consequently, the controller must keep broadcasting the loop code until every processor has finished executing it. The LOR signal is used to signal when the controller can stop broadcasting the loop code and continue program execution. A processor uses the LOR signal to tell its controller that an event has occurred. The LOR bus is a single line from each processor to its controller.

The value on the LOR bus is initially low. Each processor asserts its LOR signal high by setting the LOR bit in its PSW (see FIG. 28). When all processors have asserted their LOR signals, the LOR bus value goes high, signaling the controller that all processors are synchronized.
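The data-dependent loop described above can be sketched as a small simulation. This is only an illustration under stated assumptions: the function name `run_data_dependent_loop` and the use of a per-processor iteration count to stand in for the "local data condition" are hypothetical, and the bus is modeled simply as the logical AND of the per-processor LOR bits.

```python
def run_data_dependent_loop(local_iterations):
    """Count how many times the controller broadcasts the loop body.

    local_iterations[i] is the number of loop passes processor i needs
    before its local condition becomes false (a stand-in for real data).
    """
    remaining = list(local_iterations)
    lor_bits = [False] * len(remaining)   # LOR bit in each processor's PSW
    broadcasts = 0
    while True:
        # A processor whose condition is already false asserts its LOR signal.
        for i, left in enumerate(remaining):
            if left == 0:
                lor_bits[i] = True
        # The LOR bus goes high only when every processor asserts.
        if all(lor_bits):
            return broadcasts
        # Otherwise the controller broadcasts the loop code once more;
        # processors that are already finished ignore it.
        broadcasts += 1
        remaining = [left - 1 if left > 0 else 0 for left in remaining]
```

Under this model the controller broadcasts the loop as many times as the slowest processor requires, which is the behavior the LOR bus is meant to detect.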

By definition, a MIMD program has multiple instruction streams that execute asynchronously and independently of one another. From time to time these instruction streams are synchronized so that computed results can be shared. In the SE, each controller can execute a different instruction stream, and the controllers are synchronized by the mechanism described below. In FIG. 29, each controller has switches that combine the LOR and NOR signals. The switch network is connected so that only groups of consecutive controllers can be synchronized with one another. Each controller can set the configuration of its switches in software. FIG. 30 shows a conceptual grouping of seven controllers, and FIG. 31 shows the switch settings for that switch network configuration.

Synchronization between controllers works as follows. The LOR/NOR bus formed by the switch network is implemented so that the bus signal goes high only when every sequencer on the bus asserts a high signal. When a controller reaches the point in its code at which it must synchronize with other controllers, it issues a command to its processors to set the LOR bit of the PSW. Because all of its processors set the bit, this action drives the LOR bus high. The controller then enters a wait state, waiting for the bus defined by the switch network to go high. The NOR signal indicates whether a neighboring controller has set its LOR signal. When all controllers have asserted their LOR signals, the bus defined by the switch network goes high and the controllers are synchronized.

The GOR bus is a bus that connects all of the controllers. It is used in situations that require global synchronization of the controllers. One example is when the SE is in time-sharing mode and the context of a new program is being loaded. GOR synchronization is required to guarantee that a SIMD program begins execution in synchrony. Another example is when a MIMD program terminates. One stream may finish early, but the controller should wait for all streams to finish before signaling that the program has ended. As an example of the use of the LOR/NOR switch network, consider the low-level MIMD programming construct known as barrier synchronization. When an instruction stream reaches a barrier, it must wait until the other instruction streams participating in the barrier synchronization have reached the barrier.

When a controller encounters a barrier in its instruction sequence, it sends a high signal to the LOR/NOR bus and waits for the bus to go high. When all controllers have reached the barrier synchronization point, the bus goes high and the participating controllers are synchronized. As an example, consider FIG. 32. In this example, sequencers 1 and 2 are synchronized first, and then sequencers 1, 2, and 3 are synchronized.
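The grouping of consecutive controllers and the barrier release condition can be sketched as follows. This is a simplified model, not the circuit of FIGS. 29 to 31: each switch is assumed to be a single boolean that joins a controller to its right-hand neighbor, and the names `groups_from_switches` and `released_controllers` are hypothetical.

```python
def groups_from_switches(closed):
    """Split controllers 0..n into groups of consecutive controllers.

    closed[i] is True when the (assumed) switch between controller i and
    controller i + 1 joins them onto the same LOR/NOR bus segment.
    """
    groups = [[0]]
    for i, joined in enumerate(closed):
        if joined:
            groups[-1].append(i + 1)
        else:
            groups.append([i + 1])
    return groups


def released_controllers(groups, asserted):
    """Return the controllers whose bus segment has gone high.

    A segment goes high only when every controller in the group has
    asserted its signal, i.e. has reached the barrier.
    """
    return {c for g in groups if all(asserted[x] for x in g) for c in g}


# Hypothetical seven-controller configuration: {0,1}, {2,3,4}, {5}, {6}.
switches = [True, False, True, True, False, False]
groups = groups_from_switches(switches)
at_barrier = [True, True, True, False, True, False, True]
released = released_controllers(groups, at_barrier)
```

In this example controllers 0, 1, and 6 proceed past the barrier, while controllers 2 and 4 keep waiting for controller 3.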

Time sharing is the normal mode of operation for the SE. Because there are two memory ports through which a context can be saved and loaded, the context switch time between programs is short (about 250 instruction cycles).

Programs can also run on a subset of the architecture. Because the SE is designed as a scalable architecture, the system can be reconfigured at the EB level to operate as several smaller systems. Programs can then be loaded and run on the newly configured system subsets, so that resources can be allocated to smaller problems.

The hardware support for the real-time operating system is the OS Board, which responds to requests from the SE host workstation and from the controllers.

The OS Board contains hardware queues that buffer requests from the host workstation and the controllers. The OS Board also controls a buffer memory that can be read and written by the host, the controller, and the IOMCs. Because the host and the SE operate at quite different clock rates and are only loosely coupled, the OS Board must arbitrate how data is transferred between the two systems.

The controllers also contain additional hardware support for a multi-user environment. Each controller has a job queue containing the pending jobs that are scheduled for execution. (The OS Board broadcasts each scheduled job to all controllers, where it is entered in the job queue.) Each controller has a process table memory and polling hardware. The process table memory contains information about the processes resident on the SE. The polling hardware determines when real-time jobs must be scheduled for execution.

The OS Board has a Motorola 68040 processor or an equivalent device, which runs an operating system program that continuously monitors the OS Board queues. It waits for requests from the active programs and from the host, and responds to one request at a time. Requests have priorities.

Some activities, such as scheduling a real-time program for execution, must be carried out immediately. Other activities, such as loading a program, are not bound to real time and have a lower priority. Queued requests have a low priority and are serviced when the OS Board reads an entry from a queue. High-priority requests that must be handled immediately are serviced by interrupting the OS Board processor program.

To guarantee that a set of real-time programs can run together, the RAP uses the following rule when deciding whether a newly submitted real-time program can run compatibly with the existing real-time programs: the sum of the real-time program execution times (including overhead such as context switching) must be less than the reference (shortest) frame time of the real-time programs (the time to load a frame buffer). This guarantees that each program can execute once for each reference frame.

This is a conservative estimate. Most real-time programs use a longer frame time, so the condition overestimates the number of times a program must run. The execution time of a real-time program is determined through profiling and user estimates.

Scheduling of real-time programs is a first-priority request. A real-time job becomes ready to run each time another frame buffer is loaded. Each controller has a polling register that polls the frame synchronization (F sync) signals of the Data Input Channels. Each time a job completes (because a real-time program has finished executing or a non-real-time job's time slice has ended), this register is read and reset, and the jobs associated with the F sync signals are scheduled. If more than one F sync signal has been read, the jobs are scheduled shortest frame time first. If there is no new F sync signal and fewer than two jobs are scheduled for execution, the OS Board schedules an available non-real-time job for execution.

The controller has 14 components that interact with operating system activity: the Polling Register, the Poll Correspondence Table (PCT), the Job Queue (JQ), the Job Finished Signal, the Time Quantum Register (TQ), the Time Slice Counter (TSC), the Process Table Memory (PTM), the Process Base Registers (PBRs), the I/O Request Signal, the I/O Ready Signal, the HIOR, the Instruction Memory, the PC Stack Memory, and the Loop Stack Memory.

The Polling Register is a four-bit register in which each bit corresponds to a Data Input Channel. The register is used to poll whether a Frame Synchronization (F sync) signal has been received since the last poll. An atomic instruction is used to read and reset the register.
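The reason for an atomic read-and-reset can be illustrated in software. The sketch below is an analogy only: the hardware register is modeled by a hypothetical `PollingRegister` class, and a lock stands in for the atomicity that the hardware instruction provides, so that an F sync arriving between the read and the reset cannot be lost.

```python
import threading


class PollingRegister:
    """Software model of a four-bit polling register (one bit per input channel)."""

    CHANNELS = 4

    def __init__(self):
        self._bits = 0
        self._lock = threading.Lock()

    def frame_sync(self, channel: int) -> None:
        """Record that an F sync signal arrived on the given channel."""
        if not 0 <= channel < self.CHANNELS:
            raise ValueError("channel must be 0..3")
        with self._lock:
            self._bits |= 1 << channel

    def read_and_reset(self) -> int:
        """Return the pending bits and clear them as one indivisible step."""
        with self._lock:
            bits, self._bits = self._bits, 0
        return bits
```

If the read and the reset were two separate steps, a signal set after the read but before the reset would be cleared without ever being seen, which is exactly the missed F sync the text warns about.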

When an F sync signal is received, the corresponding bit is set in the Polling Register, indicating that another frame of data has been loaded into the system and that the real-time jobs that use the data can be scheduled.

The PCT is used to relate the polling signals recorded in the Polling Register to the real-time programs that use the Data Input Channels.

The JQ is a queue implemented in hardware that contains the job numbers of the next jobs to be executed. The JQ receives jobs from the OS Board, which determines which jobs are scheduled. The controller removes the head of the Job Queue when it is ready to execute a job.

The Job Finished (JF) Signal is the signal that controller 0 sends to the OS Board when the current job has completed execution of its time slice. The signal is sent to the OS Board so that additional jobs can be scheduled.

The TQ is used to determine the amount of time allotted to non-real-time jobs when real-time jobs are also running. The TQ value is loaded into the TSC when a non-real-time job is scheduled to run next.

The TSC is used to count the number of cycles a program has been allotted for execution. The time slice value is loaded into the counter and is decremented on every instruction cycle. When the counter reaches zero, program execution is suspended and the controller prepares to run the next scheduled program. The Time Slice for a real-time program is obtained from the Process Control Table (PCT). The Time Slice for a non-real-time program is obtained from the TQ.
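The countdown behavior of the TSC can be sketched as follows. The function names `time_slice_for` and `run_time_slice`, and the dictionary used as a stand-in for the table lookup, are hypothetical; the sketch only mirrors the rule that a real-time job takes its slice from the table and a non-real-time job takes it from the TQ.

```python
def time_slice_for(job, table_slices, time_quantum):
    """Pick the slice: from the table for real-time jobs, else from the TQ."""
    return table_slices.get(job, time_quantum)


def run_time_slice(time_slice, cycles_needed):
    """Run a program until it finishes or the counter reaches zero.

    Returns (cycles_executed, suspended), where suspended is True when the
    counter ran out before the program finished.
    """
    tsc = time_slice          # value loaded into the Time Slice Counter
    executed = 0
    while tsc > 0 and executed < cycles_needed:
        executed += 1         # one instruction cycle runs
        tsc -= 1              # and the counter is decremented
    return executed, executed < cycles_needed
```

For example, a job that needs 250 cycles but is given a slice of 100 is suspended after 100 cycles, whereas a job that needs only 40 cycles completes within the same slice.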

The PTM contains program information for each program. This includes information such as the program context and the Base Addresses of the program in the Instruction and Local (Data) Memories.

The PBRs are a set of 16 registers, each of which holds the base address of the PTM Entry for the job it represents. There is a hardware-imposed limit of 16 programs (real-time and non-real-time) that can be resident on the SE at any given time.

For example, PBR 5 holds the base address of the process table entry for job 5.

The I/O Request Signal is sent from a controller requesting I/O to the IOQ located on the OS Board. The request is the job number of the program that requires I/O. When the OS Board examines the request, the job is scheduled for I/O. Host input/output is described in more detail below.

The I/O Ready Signal is used to signal the OS Board that the program running on the controller has finished loading or reading the information in the OS Buffer. Host input/output is described in more detail below.

The HIOR is the register that the controller accesses when it needs to exchange data with the Host Workstation. It is part of the HIO Bus.

The Instruction Memory is where the instructions for a program reside on the SE. The memory is multi-ported, so that one program can load the memory while another program is being read out (for execution).

The Program Counter (PC) Stack Memory is where information for function calls is maintained during program execution. The PC Stack has 16 sets of three dedicated registers, one set for each user program: the PC Base, PC Limit, and Stack Pointer Registers. These registers are used to delimit and access a program's data in the memory. The memory is multi-ported, so that one program can load the memory while another program is being read out (for execution).

The Loop Stack Memory is where the information used with the special loop hardware is stored during program execution. The Loop Stack has 16 sets of three dedicated registers, one set for each user program: the Loop Base, Loop Limit, and Loop Stack Pointer Registers. These registers are used to delimit and access a program's data in the memory. The memory is multi-ported, so that one program can load the memory while another program is being read out (for execution).

The only interaction the IOMC has with operating system activity is reading and writing the HIOR, which is part of the HIO Bus. The decision to read or write this register is sent from the controller to the IOMC.

The host has three components that interact with operating system activity: the Host Request Signal, the Host Signal, and the Host Bus. In addition, the software running on the host is responsible for allocating the resources of the machine and for servicing HIO requests by reading and writing data to files or terminals.

The Host Request Signal is the signal the host sends to the OS Board to add a job to the Host Request Queue. Requests include loading a program, killing a program, and reloading a program. The Host Signal is the signal the host sends to the OS Board to indicate that the host has finished reading or writing the OS Buffer.

The HIO is a 32-bit bidirectional bus that connects all of the IOMCs (and the control signals) in series. Data is written to the OS Buffer by loading it into an HIOR and shifting the HIO bus to the left until the data is stored in the OS Buffer. Similarly, a write to an IOMC is performed by reading the OS Buffer and shifting the data to the right until it reaches the destination HIOR.
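The shift-based transfers on the HIO bus can be modeled as a simple shift register chain. This is a sketch under assumptions: the OS Buffer is taken to sit at the left end of the chain (index 0 is the HIOR nearest to it), and the function names are hypothetical.

```python
def shift_left_to_os_buffer(hior_values, source_index):
    """Shift the chain left until the word from source_index reaches the OS Buffer.

    Returns (word_delivered, number_of_shifts). Words that fall off the left
    end earlier are ignored in this sketch.
    """
    regs = list(hior_values)
    delivered = None
    shifts = 0
    for _ in range(source_index + 1):
        delivered = regs[0]          # word leaving the left end of the chain
        regs = regs[1:] + [0]        # every HIOR passes its word to the left
        shifts += 1
    return delivered, shifts


def shift_right_from_os_buffer(chain_length, word, dest_index):
    """Shift a word read from the OS Buffer right until it sits in HIOR dest_index."""
    regs = [0] * chain_length
    incoming = word
    for _ in range(dest_index + 1):
        regs = [incoming] + regs[:-1]  # every HIOR passes its word to the right
        incoming = 0
    return regs
```

With four HIORs holding 10, 20, 30, and 40, delivering the word from index 2 takes three left shifts; writing a word to index 2 likewise takes three right shifts.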

The RAP resides on the host, maintains resource allocation information, and determines whether a newly submitted program can be run. The RAP maintains information about the Physical Specification of the system and information about the Current State of the system. The Physical Specification information includes the Total Number of Functioning Processors, the Physical Data Memory Size, and the Physical Instruction Memory Size.

The Current System State information includes the Instruction Memory Map, the Local Memory Map, the PC Stack Map, the Loop Stack Map, the I/O Resource Map, and the Reference Frame Time Map. The first four maps provide information about the allocation of the different memories. These maps are used to determine the amount of fragmentation present in the various memories. If resources exist for a program but memory fragmentation prevents the program from being loaded contiguously, the Resource Allocator can send a request to relocate non-real-time programs. A real-time program cannot be relocated, because there is no guarantee that it would be completely relocated by the next time it needs to run. The I/O Resource Map shows which I/O Resources are in use. Finally, the Reference Frame Time Map is used to determine the instruction budget for real-time processing and to determine whether there is sufficient time to run non-real-time jobs.

The Reference Frame is defined as the shortest frame time of all active real-time programs. The Resource Allocator operates according to the following rule, called the Reference Frame Rule.

If the sum of the instruction counts of all real-time programs (including the real-time program under consideration) is less than or equal to the size of the Reference Frame, the real-time program is scheduled for execution. What this rule states is that if all programs can run under the strictest assumption (that each program must execute once per Reference Frame), then all programs can run under more relaxed conditions (in which some programs have slower frame rates, meaning that they actually execute less than once per Reference Frame Time).

If a real-time program submitted to the RAP changes the Reference Frame Rate (that is, the submitted program runs more frequently than any currently running program), the instruction budget must be recalculated according to the new frame rate. If the running programs together with the program under consideration do not exceed this instruction budget, the program is loaded and the Reference Frame is updated. Similarly, when the program operating at the Reference Frame Rate terminates, a new Reference Frame is determined, which has the effect of increasing the instruction budget.
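The Reference Frame Rule amounts to a simple admission test, which can be sketched as follows. The class and function names are hypothetical, and both quantities are assumed to be expressed in instruction cycles so that they can be compared directly. Taking the minimum over the candidate as well as the active programs covers the case where the new program shortens the Reference Frame.

```python
from dataclasses import dataclass


@dataclass
class RealTimeProgram:
    name: str
    instruction_count: int   # cycles per run, including context-switch overhead
    frame_time: int          # cycles between successive frames


def admit(active, candidate):
    """Return True if the candidate can run alongside the active programs."""
    programs = list(active) + [candidate]
    # The Reference Frame is the shortest frame time, candidate included.
    reference_frame = min(p.frame_time for p in programs)
    # Strictest assumption: every program runs once per Reference Frame.
    total = sum(p.instruction_count for p in programs)
    return total <= reference_frame


active = [
    RealTimeProgram("A", 3000, 10000),
    RealTimeProgram("B", 4000, 20000),
]
fast = RealTimeProgram("C", 2000, 8000)    # would shorten the Reference Frame
slow = RealTimeProgram("D", 2000, 12000)   # leaves the Reference Frame at 10000
```

Here program C is rejected because the total of 9000 cycles exceeds the new 8000-cycle budget, while program D is accepted because the same total fits in the existing 10000-cycle Reference Frame.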

Whenever a real-time job is loaded or terminated, the TQ located in the controller is updated. The TQ value determines how long a non-real-time job runs when real-time jobs are also running. The calculation of the Time Quantum value is described below.

When a program terminates, the RAP releases the resources used by that program. Typically this involves updating the maps to reflect the newly available resources, recalculating the Reference Frame if necessary, and updating the "105" Program running on the host.

In Real-Time Job Scheduling, after receiving from the controller the JF Signal indicating that the last job has finished executing its time slice, the OS Board decrements the Job Counter, which is located on the OS Board and tracks the number of jobs in the Job Queue.

The OS Board determines which real-time jobs are ready to run.

This is accomplished by examining the Polling Register of controller 0 to determine whether any Frame Synchronization (F sync) signals have been generated since the last time the OS Board examined the register. When an F sync signal is received, the corresponding bit is set in the Polling Register, indicating that another frame of data has been loaded into the system. The real-time jobs that use that data can then be scheduled. The OS Board uses an atomic instruction to read and reset the Polling Register so that it does not miss any F sync signal.

Next, the OS Board consults the PCT to determine the relationship between the bits of the Polling Register word and the real-time jobs that use the Data Input Channels. The PCT has four entries. Each entry contains a job number and the Frame Rate of the Input Channel.

Every real-time job whose F sync signal has been polled is scheduled on the JQ of each controller. When more than one job is scheduled, the priority is determined from the Frame Rate information for the jobs specified by the PCT. The priority is Fastest Frame Rate First.
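The lookup from polled bits to scheduled jobs can be sketched as follows. The representation of a PCT entry as a `(job_number, frame_rate)` tuple and the function name `schedule_real_time` are assumptions made for illustration; unused channels are represented by `None`.

```python
def schedule_real_time(poll_bits, pct):
    """Return job numbers to enqueue, ordered Fastest Frame Rate First.

    poll_bits is the four-bit value read from the Polling Register and
    pct is a list of four entries, one per Data Input Channel.
    """
    ready = [
        pct[channel]
        for channel in range(4)
        if (poll_bits >> channel) & 1 and pct[channel] is not None
    ]
    # Higher frame rate means shorter frame time, so it is scheduled first.
    ready.sort(key=lambda entry: entry[1], reverse=True)
    return [job for job, _rate in ready]


# Hypothetical table: channel 2 is unused.
pct = [(3, 30.0), (7, 60.0), None, (5, 15.0)]
order = schedule_real_time(0b1011, pct)
```

With F sync received on channels 0, 1, and 3, the jobs are queued in the order 7, 3, 5.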

The Job Counter (JCTL) is incremented by the number of jobs added to the JQ.

If no real-time job has been scheduled, the OS Board determines whether a non-real-time job should be scheduled. The OS Board examines the Job Counter. If there are fewer than two jobs in the JQ, an entry is taken from the Non-Real-Time Job Queue (NRTQ) and added to the JQ. This condition is maintained so that the JQ always has a job ready to run. If only one job is running on the system in multi-user mode, a dummy job is scheduled for execution. Because the execution of system jobs, such as a host request to load a program, may need to be scheduled, this condition is maintained even when there is only one active job.
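The rule that the JQ should always hold at least two jobs can be sketched as a small top-up routine. The function name, the use of `deque` objects for the two queues, and the job number chosen for the dummy job are all assumptions for illustration.

```python
from collections import deque

DUMMY_JOB = 0  # hypothetical job number reserved for the dummy job


def top_up_job_queue(jq, nrtq, min_jobs=2):
    """Add non-real-time jobs (or a dummy job) until the JQ holds min_jobs entries."""
    while len(jq) < min_jobs:
        if nrtq:
            jq.append(nrtq.popleft())   # take the next non-real-time job
        else:
            jq.append(DUMMY_JOB)        # nothing else to run: schedule the dummy
    return jq
```

For instance, a JQ holding only job 4 with jobs 9 and 11 waiting in the NRTQ becomes [4, 9], and job 11 stays in the NRTQ for a later pass.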

The OS Board Resident Program responds to requests from the host (via the HRQ), to requests from the controllers (via the I/O Queue), and to the signal to schedule another job (the Job Finished signal). Operating system programs are often modeled as an endless loop that polls for requests and services them as they arise. That model is used for the OS Board program, which must constantly check for new requests.

The first priority of the operating system program is the Job Scheduling activity. Because the time constraints associated with running real-time jobs are strict, another job must be scheduled immediately when the Job Finished signal is generated. Moreover, the lower-priority requests (I/O and host requests) can take many frame times to execute, so Job Scheduling should not have to wait behind one of these requests. Job Scheduling is therefore implemented as an interrupt to the operating system program.

The next priority level is the response to host requests and to controller requests for I/O. At this point, there is no compelling reason to rank one of these above the other. Because the priority system is implemented entirely in software, the question of their relative priority is deferred to the future.

The following is simplified pseudocode for the operating system program.

main program
    loop forever [
        examine HRQ
            update SSR if new job request pending
            if not empty, process host request
        examine IOQ
            update SSR if new job request pending
            if not empty, process I/O request
    ]

scheduling interrupt: (on JF signal)
    [
        schedule a real time job
        if no real time job available and less than 2 jobs
            schedule a non-real time job
        if a new job request is on HRQ or IOQ, reset SSR entry
    ]

Conventional debugging techniques pose problems for highly parallel real-time programs. These techniques typically add instructions to the program to aid debugging. Such code, however, may interfere with time-critical code segments such as IPC operations and delayed branching. In the SE, a Debug Interrupt (DI) bit in the controller IW is used to mark an instruction at which the program is interrupted and control passes to the Debug Interrupt Handler, which is located in a reserved area of the controller's instruction memory. This hardware support for debugging provides a breakpoint facility that lets the program run without any additional code being inserted. When the designated bit is found set in a program IW, control is transferred automatically to the debug routine. While the debug routine is executing, the operator can use the host workstation 400 to examine the state of the controller, local memory variables, and the registers of any processor. On exit from the debug routine, control returns to the program. Because the interrupt is based on a single bit of the controller IW, the debug routine can be invoked at any instruction in the controller program.
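The breakpoint mechanism can be sketched in software as follows. This is only an illustration under stated assumptions. The position of the DI bit (`DI_BIT`), the representation of instruction words as Python integers, and the names `set_breakpoint` and `run_program` are all hypothetical. The sketch also assumes that the handler runs before the marked instruction executes, which the text does not specify.

```python
# Hypothetical position of the Debug Interrupt (DI) bit in an instruction word.
DI_BIT = 1 << 31

def set_breakpoint(program, pc):
    """Mark the instruction at index pc by setting its DI bit in place."""
    program[pc] |= DI_BIT

def run_program(program, execute, debug_handler):
    """Run each instruction word, diverting to the handler when DI is set.

    No instructions are inserted into the program, so its length and the
    positions of its instructions are unchanged by setting a breakpoint.
    """
    for pc, iw in enumerate(program):
        if iw & DI_BIT:
            debug_handler(pc)      # operator inspects state here
        execute(iw & ~DI_BIT)      # execute the instruction without the DI bit

# Example: break at the second instruction of a three-instruction program.
program = [0x01, 0x02, 0x03]
set_breakpoint(program, 1)
hits, executed = [], []
run_program(program, executed.append, hits.append)
```

The point the sketch tries to capture is that marking an instruction changes one bit of an existing word rather than adding code, so time-critical sequences keep their original layout.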

There are also many other problem domains that involve large data sets and require both heavy computation and heavy data communication. Examples include weather modeling, medical imaging, computer vision, molecular modeling, and VLSI simulation, as well as neural networks, volume visualization, and polygon rendering.

It will be understood that the invention has been described in terms of the apparatus and method of operation set forth herein. Modifications can readily be made by those skilled in the art without departing from the spirit and scope of the invention.

[Drawings: FIGURE 4, FIGURE 5, FIGURE 8, FIGURE 9, FIGURE 13, FIGURE 16, FIGURE 17, FIGURE 18, FIGURE 19, FIGURE 20, FIGURE 23, FIGURE 25, FIGURE 26, FIGURE 32, FIGURE 33. Legible labels include match sequences; I/O data formats (Data Format 1, 12 bits: one I/O datum in one FIFO word; Data Format 2, 24 bits: two 12-bit I/O data in one FIFO word; Data Format 3, 32 bits: three I/O data in one FIFO word); and video formats (Composite Video, 12 bits; RGB (Multiplexed), 10 bits, 10 bits, 10 bits; RGB (Dedicated Channels), 12 bits).]

Continuation of front page

(72) Inventor: Peters, Joseph Edward Jr., 7 Austin Drive, East Brunswick, New Jersey 08816, United States

(72) Inventor: Taylor, Herbert Hudson Jr., 5 Timberland Road, Pennington, New Jersey 08534, United States

Claims

1. A parallel computing system comprising: N blocks, where N is an integer, each block including M processors, where M is an integer, each processor including an arithmetic logic unit (ALU), a local memory, and an input/output (I/O) interface, and control means coupled to provide the same group of instructions to each of the M processors; and host means for selectively coupling the control means of the N blocks into at least first and second groups of blocks, each group comprising P blocks, where P is an integer less than N, such that respective different groups of processor instructions are provided to the groups and, within each group of P blocks, the same group of processor instructions is provided to each of the M times P processors.

2. The system of claim 1, wherein: each of the M processors in a block includes an interprocessor communication (IPC) channel by which the processor can transfer data values to other ones of the M processors in the block; the control means of each block includes means for programming the M processors in that block to define boundaries such that each processor within any one boundary can communicate via the IPC channel with another processor within that boundary; the host means includes means for selectively combining the IPC channels of the N blocks to form a data communication bus for each of the groups of P blocks; the IPC channels connect the M processors in each block in a predetermined sequence; and the control means of each block includes means for selectively programming each IPC channel either a) to pass a data value received from one processor to the next processor in the sequence without receiving the data value, or b) to receive the data value, so that a data value sent by one processor can be received by one or more processors in the sequence.

3. The system of claim 1, wherein each processor includes: means for indicating a local data condition within the processor; and means for conditionally executing the instructions provided by the control means, based on the indicated local data condition.

4. A parallel computing system comprising: a source of a system clock signal; a plurality of processors, each including an arithmetic logic unit (ALU), means for indicating a local data condition within the processor, a local memory, an input/output (I/O) interface, and a profiling counter having a count value that, when the counter is enabled, is incremented in response to the system clock signal; control means, responsive to control instructions, coupled to provide the same group of processor instructions to each of the plurality of processors; and host means for providing the control instructions and the processor instructions to the control means; wherein each processor instruction includes a field used to enable and disable the profiling counter in each of the plurality of processors.

5. The system of claim 4, wherein: the control means includes a separate profiling counter having a count value that is incremented only when the counter is enabled; each of the control instructions includes a field for selectively enabling and disabling the profiling counter of the control means; the control means, in response to a first one of the control instructions, loads its profiling counter with one of an immediate value and a value obtained from a data register associated with the control means and, in response to a second one of the control instructions, stores the count value in the data register; and each of the processors, in response to a first processor instruction, loads the profiling counter of the processor with one of an immediate value and a value obtained from the local memory of the processor and, in response to a second processor instruction, stores the count value in the local memory.

6. A processor suitable for use in a parallel computing system, comprising: memory means for storing operand values; an arithmetic logic unit (ALU) that performs arithmetic and logic operations on the operand values; a multiplier, separate from the ALU, that generates the arithmetic product of first and second operand values; and a match unit, separate from the ALU, that counts matches between a bit pattern and a bit sequence obtained from the memory means and generates a count value indicating the number of matches detected between the bit pattern and subsequences of the bit sequence.

7. The processor of claim 6, wherein the bit pattern to be matched has fewer bits than the bit sequence, and the match unit includes: means for storing a sequence of templates, each representing one possible position at which the bit pattern can match corresponding bits of the bit sequence; and means for comparing the bit sequence with all of the templates in the sequence and providing the number of matches between the bit sequence and the templates as the count value.

8. The processor of claim 6, wherein: the multiplier is coupled to provide the arithmetic product as an input operand to the ALU; the match unit is coupled in parallel with the multiplier; the bit pattern is contained in the first operand value and the bit sequence is contained in the second operand value; only one of the count value generated by the match unit and the arithmetic product generated by the multiplier is presented as an input operand to the ALU at any given time; and the processor is responsive to an instruction word containing a first subfield and a second subfield, the first subfield being used by the ALU to perform one of the arithmetic and logic operations, and the second subfield being used either by the multiplier to generate the arithmetic product or by the match unit to generate the count value.

9. The processor of claim 6, further comprising a first accumulator and a second accumulator, wherein the ALU is coupled to feed both accumulators simultaneously.

10. A processor suitable for use in a parallel computing system, comprising: means for providing a processor instruction word; memory means for holding a plurality of arrays of operand values; an arithmetic logic unit (ALU) having first and second input ports coupled to receive operand values; first address generating means, coupled to the memory means and responsive to a first field of the instruction word, for selecting a respective operand from a first one of the plurality of arrays of operand values and providing it to the first input port of the ALU; and second address generating means, coupled to the memory means and responsive to a second, different field of the instruction word, for selecting a respective operand from a second one of the plurality of arrays of operand values and providing it to the second input port of the ALU.

11. The processor of claim 10, wherein each array of operand values has a lower boundary address and an upper boundary address, and each of the first and second address generating means includes: means for determining whether a generated address value is less than the lower boundary address or greater than the upper boundary address, and is therefore invalid, and for generating an out-of-bounds signal in that case; and means, responsive to the out-of-bounds signal, for converting the invalid address value to a predetermined address value, the predetermined address value addressing a predetermined operand value within the lower and upper boundaries of the array.

12. A parallel computing system comprising: a source of a clock signal having a predetermined frequency; P processors, P being an integer, each processor coupled to the source of the clock signal and including an arithmetic logic unit (ALU) capable of performing at least one arithmetic operation during one period of the clock signal, and a local memory coupled to fetch and store data values in synchronism with the clock signal; control means coupled to provide instructions to each of the P processors; and interprocessor communication (IPC) means, coupled to each of the P processors, for transferring data values between processors, the IPC means including a bus and means for providing a data clock signal, the data clock signal causing the IPC means to transfer one of the data values on the bus during each pulse of the data clock signal; wherein the IPC means includes means, responsive to the control means, for providing the data clock signal either at a first frequency substantially equal to the predetermined frequency or at a second frequency substantially equal to N times the predetermined frequency, N being an integer greater than one.

13. A parallel computing system comprising: P processors, where P is an integer, each processor including an arithmetic logic unit (ALU) and a local memory for holding data values; control means coupled to provide instructions to each of the P processors; and interprocessor communication (IPC) means, coupled to each of the P processors, for transferring data values between the processors, the IPC means including: a bus capable of transferring 2N bits at a time, where N is an integer, each of the data values held in the local memory being an N-bit data value; and IPC logic means, responsive to the control means, for causing the bus to transfer data values in one of first and second opposite directions along the sequence of processors, and for allowing the bus to operate either as first and second separate N-bit buses or as one 2N-bit bus.

14. A parallel computing system comprising: N blocks, where N is an integer, each block including M processors, where M is an integer, each processor being responsive to processor instructions and including an arithmetic logic unit, and control means, responsive to control instructions, coupled to provide the same group of processor instructions to each of the M processors in the block; and host means coupled to provide the control instructions and the processor instructions to the control means of each of the N blocks, the control instructions and processor instructions provided to each block being different from the control instructions and processor instructions provided to each other block.

15. The system of claim 14, wherein each of the M processors in each block includes: an interprocessor communication (IPC) channel, responsive to IPC instructions, for transferring data values among the M processors in the block; means for indicating a local data condition within the processor; and means, responsive to the indicating means, for conditionally executing processor instructions; the system further including means, coupled to the IPC channel of each of the M processors, for providing a separate group of IPC instructions to each IPC channel.

16. A parallel computing system comprising: P processors, where P is an integer greater than 1, each processor including: memory means having N memory locations for holding N data values, where N is an integer greater than 1; output data buffer means, coupled to the memory means, for reading the N data values from the memory locations of the memory means in an order determined by a first output control signal and for providing the read data values at instants determined by a second output control signal; and input data buffer means, coupled to the memory means, for receiving data values at instants determined by a first input control signal and for storing each received data value in a memory location of the memory means determined by a second input control signal; means for coupling the output data buffer means to the input data buffer means such that the data values provided by the output data buffer means are the data values received by the input data buffer means; and programmable control means for providing the first and second output control signals and the first and second input control signals so as to reorder the data values stored in the respective memory means of the P processors.

17. The system of claim 16, wherein: each of the P processors is identified by a unique processor identifier value; the memory means of each processor is responsive to an offset value for accessing the one of the N data values stored at the memory location indicated by the offset value; and the programmable control means includes means for providing the first output control signal and the second input control signal to each of the P processors to specify the offset values used to access the N data values stored in the memory means, the first output control signal and the second input control signal being respectively different functions of P and N.

18. A parallel computing system comprising: M processors, where M is an integer, each processor including an arithmetic logic unit; control means coupled to provide identical processor instructions to each of the M processors; and host means, coupled to the control means, for providing control instructions to the control means and processor instructions to the M processors, the host means including: a process table memory that holds information on real-time and non-real-time processes executed on the parallel computing system; means for determining when a real-time process should run on the parallel computing system; resource allocation means for allocating processors to processes running on the parallel computing system; queuing means for queuing real-time and non-real-time processes to be executed on the parallel computing system; and scheduling means, responsive to a synchronization signal, for removing a process from the queuing means and causing the allocated processors to execute that process.

19. The system of claim 18, wherein the process table memory holds, for each real-time process executed on the parallel computing system, a predicted program execution time and a predicted frame time, and the resource allocation means allocates processors to a new real-time process only when the sum of the program execution times of all processes in the queue, with the program execution time of the new real-time process added, is less than the minimum frame time of any process in the queue.