JP4631442B2 - Processor

Info

Publication number: JP4631442B2
Application number: JP2005006017A
Authority: JP
Inventors: 晃成轟
Original assignee: Seiko Epson Corp
Current assignee: Seiko Epson Corp
Priority date: 2005-01-13
Filing date: 2005-01-13
Publication date: 2011-02-16
Anticipated expiration: 2025-01-13
Also published as: JP2006195705A

Description

The present invention relates to a processor, and more particularly to a processor capable of executing a plurality of tasks or threads in parallel.

Generally, a processor has dedicated hardware for each stage of processing in order to execute pipeline processing. A configuration in which a plurality of such processors are provided and all of them are controlled in an integrated manner is called a multi-thread processor or the like. Many multi-thread processors share part of the hardware configuration among the processors, thereby using hardware resources efficiently and suppressing growth in circuit scale.

A multi-thread processor can execute a plurality of instructions in parallel, so its processing speed can be made significantly higher than that of a single-thread processor.
Prior art relating to multi-thread processors includes, for example, the inventions described in Patent Documents 1 to 3. In the invention of Patent Document 1, a microcomputer composed of a plurality of CPUs (Central Processing Units) is configured to read instructions from a memory.

The invention described in Patent Document 1 divides the memory space into program areas, one per CPU, and stores the offset address of each program area in a register. When a CPU accesses the memory, the program area to be accessed is selected by adding the offset address corresponding to that CPU to the instruction fetch address. According to Patent Document 1, programmers can develop software without being aware of memory banks, and the memory space can be used effectively.

The invention described in Patent Document 2 provides a CPU, an instruction memory from which the CPU reads instructions, a data memory from which data is read, and a shared memory bank that can be mapped as either instruction memory or data memory. According to Patent Document 2, an integrated circuit can be provided that copes with cases where the amount of memory required for the instruction memory and the data memory cannot be estimated accurately during software development.

Furthermore, the invention described in Patent Document 3 provides a plurality of processors and a shared memory accessed by the plurality of processors. Each processor is provided with a local memory whose addresses are contiguous with those of the shared memory, and subpages of the shared memory are stored in this local memory. According to the invention of Patent Document 3, the load of snoop processing between processors can be reduced.
Cited patent documents: JP-A-9-325910; JP-A-2004-171232; JP-A-11-120078

In a multi-thread processor, instructions are read from memory by each of the plurality of processors. A multi-thread processor therefore needs to read instruction sequences faster than a single-thread processor and process them at appropriate timing. If a processor cannot read instructions at appropriate timing, the programs running on all processors of the multi-thread processor may stop while waiting for instructions to be read from memory. A program stopping for this reason is generally called a memory stall.

However, none of Patent Documents 1 to 3 was made with the aim of supplying instructions to a multi-thread processor at appropriate timing. With their configurations it is therefore difficult to achieve a read speed sufficient for operation when the processor is to run at a high speed of, for example, about 200 MHz. Consequently, a processor applying the conventional technology cannot suppress memory stalls when operating at high speed, and the limit on operating speed limits the processing capability of the processor.

Furthermore, Patent Documents 1 to 3 all divide the memory into banks and read instruction sequences in parallel, thereby speeding up instruction reads and preventing memory stalls.
Dividing the memory into banks is advantageous for raising the operating speed of a multi-thread processor. However, banked memory generally stores a different type of data in each bank. Unless an appropriate memory space is set in the banks for every one of the data types, requests from the plurality of processors concentrate on one bank, which degrades processing performance. In addition, some banks may be left with free areas in which no data is stored, lowering memory utilization.

Some prior art also adopts a multi-port configuration to speed up instruction reads. However, a multi-port configuration is relatively costly, so a lower-cost single port is preferable. FIG. 11 shows an example of a processor connected to a banked memory through multiple ports.
The present invention has been made in view of the above points, and an object thereof is to provide a processor that can read instruction sequences at appropriate timing even when operating at relatively high speed, and that thereby achieves high processing capability. A further object is to provide a processor that, because it can read instruction sequences at appropriate timing, avoids memory banking to improve memory utilization, and adopts a single port to limit circuit scale and reduce cost.

To solve the above problems, the processor of the present invention is a processor that fetches instructions from a memory in which instructions are stored, and includes fetch means for fetching instructions and memory control means for transferring an instruction requested by the fetch means to the fetch means that requested it. The fetch means comprises: instruction storage means for accumulating transferred instructions; urgency level setting means for setting the degree of urgency of instruction acquisition in the fetch means, based on the state of the actual instructions, that is, those instructions accumulated in the instruction storage means that are actually executed; and fetch request means for outputting the urgency level set by the urgency level setting means to the memory control means and requesting transfer of an instruction. The memory control means comprises: fetch priority setting means for determining the priority of instruction transfer based on the urgency level input from the fetch request means; and memory access control means for reading the instruction of a fetch request from the memory in accordance with the priority set by the fetch priority setting means.

According to this invention, instructions transferred to the fetch means are accumulated, and the degree of urgency of instruction acquisition in the fetch means can be set based on the state of the actual instructions, that is, those accumulated instructions that are actually executed. The set urgency level is then output to the memory control means to request transfer of an instruction. The memory control means can in turn determine the priority of instruction transfer based on the received urgency level and read the instruction of the fetch request from the memory in accordance with that priority.
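As a rough illustration of the arbitration described above, the following Python sketch picks the pending fetch request with the highest urgency level. The names `FetchRequest` and `select_next`, the numeric encoding (a larger value means more urgent), and the tie-break rule are assumptions made for illustration; the specification does not define them.

```python
from dataclasses import dataclass


@dataclass
class FetchRequest:
    fetch_unit: str  # hypothetical label of the requesting fetch means
    address: int     # instruction address being requested
    urgency: int     # assumed encoding: larger value = more urgent


def select_next(requests):
    """Return the pending request with the highest urgency level.

    Ties go to the request that appears first in the list. This
    tie-break is an assumption; the text does not specify one.
    """
    if not requests:
        return None
    # max() returns the first maximal element it encounters.
    return max(requests, key=lambda r: r.urgency)
```

In this sketch the memory control means would call `select_next` once per memory access, so that only one instruction read is in flight at a time.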

Accordingly, the present invention can provide a processor that reads instruction sequences at appropriate timing without raising the memory access speed, even when operating at relatively high speed, and can thereby prevent memory stalls.
The present invention can also read instruction sequences at appropriate timing without fetching instructions in parallel, which likewise prevents memory stalls. It is therefore possible to provide a processor that improves memory utilization by avoiding banking of instructions and that adopts a single port to limit circuit scale and reduce cost.

The processor of the present invention is also a processor that executes a plurality of instructions in parallel, and includes a plurality of fetch means for fetching instructions and memory control means for transferring an instruction requested by a fetch means to the fetch means that requested it. Each fetch means comprises: instruction storage means for accumulating transferred instructions; urgency level setting means for setting the degree of urgency of instruction acquisition in the fetch means, based on the state of the actual instructions, that is, those instructions accumulated in the instruction storage means that are actually executed; and fetch request means for outputting the urgency level set by the urgency level setting means to the memory control means and requesting transfer of an instruction. The memory control means comprises: fetch priority setting means for determining the priority of instruction transfer based on the urgency level output by the fetch request means; and memory access control means for reading the instruction of a fetch request from the memory in accordance with the priority set by the fetch priority setting means.

According to this invention, each of the plurality of fetch means accumulates the instructions transferred to it, and the degree of urgency of instruction acquisition in the fetch means can be set based on the state of the actual instructions, that is, those accumulated instructions that are actually executed. The set urgency level is then output to the memory control means to request transfer of an instruction. The memory control means can in turn determine the priority of instruction transfer based on the received urgency level and read the instruction of the fetch request from the memory in accordance with that priority.

Accordingly, the present invention can provide a processor that reads instruction sequences at appropriate timing without raising the memory access speed, even when operating at relatively high speed, and can thereby prevent memory stalls.
Further, even when instructions are fetched in parallel, the present invention can read instruction sequences at appropriate timing and prevent memory stalls. It is therefore possible to provide a processor that, even as a so-called multi-thread processor executing instructions in parallel, improves memory utilization by avoiding banking of instructions and adopts a single port to limit circuit scale and reduce cost.

Further, the present invention is characterized in that the urgency level setting means sets the urgency level according to the number of accumulated instructions that are actually executed.
According to this invention, fetches by a fetch means that is likely to suffer a memory stall for lack of instructions to process can be executed preferentially. Memory stalls can therefore be prevented effectively.
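One way to picture this rule is a function that maps the number of accumulated actual instructions to an urgency level. The sketch below is only an illustration: the function name `urgency_from_count`, the default capacity of 3, and the linear mapping are assumptions, not values taken from the claims.

```python
def urgency_from_count(stored: int, capacity: int = 3) -> int:
    """Map the number of accumulated instructions to an urgency level.

    Assumed linear mapping: an empty buffer yields the highest level
    (equal to capacity) and a full buffer yields 0.
    """
    if not 0 <= stored <= capacity:
        raise ValueError("stored must be between 0 and capacity")
    return capacity - stored
```

With this mapping, a fetch means whose buffer has run dry always outranks one that still holds instructions.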

Further, in the processor of the present invention, when the fetch storage means accumulates a plurality of execution-scheduled instructions of differing data lengths, the urgency level setting means determines the number of accumulated instructions based on the data lengths of the execution-scheduled instructions.
According to this invention, even when variable-length instructions are handled, the number of instructions accumulated in the fetch storage means can be determined accurately. Instruction sequences can therefore be read at more appropriate timing.
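The following sketch shows why data length matters when counting accumulated instructions: the same number of buffered bytes can hold different numbers of instructions. The 2-byte and 4-byte encoding in `toy_length` is purely hypothetical and is not taken from the specification.

```python
def toy_length(first_byte: int) -> int:
    """Hypothetical encoding: top bit set means a 4-byte instruction,
    otherwise a 2-byte instruction."""
    return 4 if first_byte & 0x80 else 2


def count_complete_instructions(buffer: bytes, length_of=toy_length) -> int:
    """Count whole instructions in a buffer of variable-length instructions.

    length_of must return a positive length for the instruction that
    starts with the given byte. A trailing partial instruction is not
    counted, because it cannot be executed yet.
    """
    count = 0
    pos = 0
    while pos < len(buffer):
        n = length_of(buffer[pos])
        if pos + n > len(buffer):
            break  # incomplete instruction at the end of the buffer
        count += 1
        pos += n
    return count
```

Eight buffered bytes may therefore count as four short instructions or two long ones, which changes the urgency level derived from the count.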

Further, the processor of the present invention further includes branch instruction detection means for detecting that a branch instruction has been fetched by the fetch means, and the urgency level setting means sets the urgency level when the branch instruction detection means detects that the fetch means corresponding to that urgency level setting means has fetched a branch instruction.
According to this invention, fetches by a fetch means whose accumulated instructions are cleared as a result of executing a branch instruction can be executed preferentially. Memory stalls can therefore be prevented effectively.

Further, in the processor of the present invention, when, among the plurality of fetch means, a second fetch means requests a second instruction after a first fetch means has requested a first instruction but before transfer of the first instruction is complete, and transfer of the second instruction completes before transfer of the first instruction, the priority setting means updates the instruction transfer priority of the first instruction to a higher rank.

According to this invention, when one fetch overtakes another, the earlier-requested, lower-urgency fetch is prevented from being kept waiting so long that it effectively can never be executed.
Further, the processor of the present invention is characterized in that the memory control means and the memory are connected through a single port, and one of the fetch means at a time is made to fetch instructions one by one.

According to this invention, it is possible to provide a processor that adopts a single port to limit circuit scale and reduce cost.
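The overtaking rule described above, combined with one-at-a-time service over a single port, can be sketched as follows. The class `Pending`, the function `serve_one`, and the choice to raise an overtaken request by exactly one level are assumptions for illustration; the text says only that the priority is updated to a higher rank.

```python
from dataclasses import dataclass


@dataclass
class Pending:
    unit: str     # hypothetical label of the requesting fetch means
    urgency: int  # assumed encoding: larger value = served sooner
    order: int    # arrival order of the request (smaller = earlier)


def serve_one(pending):
    """Serve one request and promote every request it overtook.

    The request with the highest urgency is served; among equals the
    earliest arrival wins. Any request that arrived earlier than the
    served one but is still waiting has been overtaken, so its urgency
    is raised by one (assumed step size).
    """
    chosen = max(pending, key=lambda p: (p.urgency, -p.order))
    pending.remove(chosen)
    for p in pending:
        if p.order < chosen.order:
            p.urgency += 1
    return chosen
```

Because each overtaking raises the waiting request's level, a low-urgency request eventually reaches the top and is served, which is the starvation-avoidance effect the text describes.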

Embodiments 1 and 2 of a processor according to the present invention will be described below with reference to the drawings.
(Embodiment 1)
FIG. 1 is a diagram for explaining a processor common to Embodiments 1 and 2 of the present invention. The processor of Embodiment 1 includes a plurality of processors A, B, C, and D, which share part of their hardware resources. In this processor, the processors A to D can execute instructions in parallel.

The processor of Embodiment 1 includes a memory 101 in which instructions are stored, and reads (fetches) instructions from the memory 101. To this end, the processor includes a processor main body 1 having a plurality of fetch units 107a, 107b, 107c, and 107d that fetch instructions, and a memory control unit 103 that transfers an instruction requested by any of the fetch units 107a to 107d to the fetch unit that requested it.

The memory 101 of Embodiment 1 has a non-banked configuration: the storage area holding instructions is not divided into banks. Such a memory 101 can store many instructions linearly in the storage area and can use the storage area efficiently.
In Embodiment 1, the memory control unit 103 is connected to the memory 101 through a single port and reads instruction data from the memory one item at a time. This limits the scale and cost of the circuitry connecting to the memory 101.

Each of the processors A to D included in the processor main body 1 has a fetch unit, a decode unit that interprets instructions fetched by the fetch unit, and an ALU (Arithmetic and Logical Unit) that executes the operations of decoded instructions. In Embodiment 1, each of the processors A to D has its own one of the fetch units 107a to 107d, while the decode unit 113 and the ALU 115 are shared.

Each of the fetch units 107a to 107d fetches from the memory 101 instructions scheduled for execution according to a program or the like. Each includes a fetch buffer that accumulates fetched instructions and a fetch buffer management unit that manages the accumulation state of instructions in the fetch buffer. Some fetched instructions, once accumulated in the fetch buffer, are cleared by a branch or similar event and are never actually executed. In this specification, an instruction that is actually executed is sometimes called an actual instruction to distinguish it from one that is not.

Furthermore, the processors A to D include program control units 105a, 105b, 105c, and 105d corresponding to the fetch units 107a to 107d, respectively. Each of the program control units 105a to 105d includes a program counter 209 and calculates program addresses. Each also includes a branch determination unit 210: when a branch instruction is decoded, the branch is determined once the branch condition status is ready, the branch address is passed to the program counter, and the branch destination address is output as the instruction address. Based on the accumulation state of actual instructions in the fetch buffer, as managed by the fetch buffer management unit, each program control unit sets the degree of urgency (urgency level) of instruction data transfer for the corresponding fetch unit.

In Embodiment 1, the program control units 105a to 105d generate a signal indicating the set urgency level (urgency signal) and output it to the memory control unit 103 together with a signal requesting instruction data transfer. Because outputting the urgency signal and the transfer request signal constitutes the request for instruction data transfer, the program control units 105a to 105d function as the fetch request means of Embodiment 1.

In the above configuration, one program control unit, one fetch buffer, and one fetch buffer management unit are provided for each of the processors A to D. Hereinafter, the program control units are denoted 105a, 105b, 105c, and 105d, with the suffixes a, b, c, and d in the reference numerals indicating the correspondence to the processors A to D. Likewise, the fetch buffers are denoted 109a, 109b, 109c, and 109d, and the fetch buffer management units are denoted 111a, 111b, 111c, and 111d.

Meanwhile, the memory control unit 103 sets the priority of instruction data transfer based on the urgency level indicated by the urgency signal, and executes the requested fetch by reading instructions from the memory 101 in accordance with the set priority.
The illustrated processor of Embodiment 1 further includes register units 117a to 117d that store the results of operations by the ALU 115. The register units 117a to 117d are provided one for each of the processors A to D and store the operation results and the like of the corresponding processor.

Next, the configuration and operation of the processor main body 1 and the memory control unit 103 described above will be explained in more detail.
(1) Processor main body
FIG. 2 shows the configuration of the processor main body 1 in more detail, in particular the fetch unit 107a and program control unit 105a of processor A, and the decode unit 113 and ALU 115 shared with the other processors. The processors A to D included in the processor main body 1 all have the same configuration: each has one program control unit and one fetch unit, and they share the decode unit 113 and the ALU 115. For simplicity, the following describes the configuration and operation of processor A; the description of the other processors is the same and is partly omitted.

The following describes the fetch unit 107a, the program control unit 105a, and the decode unit 113 and ALU 115 shared with the other processors, and then describes the operation of each with reference to FIG. 2.
I Fetch unit 107a
Data of instructions scheduled for execution (instruction data) fetched by the fetch unit 107a is accumulated in the fetch buffer 109a. The fetch buffer 109a has three buffers 200, 201, and 202, and can hold a plurality of instruction data items at the same time. The fetch buffer management unit 111a determines the accumulation state of instruction data scheduled for execution in the three buffers 200, 201, and 202, and can detect information including the number of instruction data items accumulated in the fetch buffer 109a.

The fetch buffer management unit 111a detects the accumulation state of instruction data in the fetch buffer 109a, generates an urgency signal, and outputs it to the program control unit 105a. In Embodiment 1, the fetch buffer management unit 111a constantly monitors the state of the fetch buffer 109a and outputs a signal E (Empty) and a signal EL (Empty_Level) to the program control unit 105a as state signals. The signal E indicates that no instruction data remains in the fetch buffer 109a. The signal EL indicates the number of instruction data items currently accumulated in the fetch buffer 109a. Specific examples of the signals E and EL are described later.

Instruction data fetched by the fetch unit 107a is first accumulated in the buffer 202 of the fetch buffer 109a. The fetch buffer management unit 111a advances the instruction data accumulated in the buffer 202 step by step toward the buffer 200, and stores subsequently fetched instruction data in whichever buffer has become free. The instruction data held in the buffer 200 is output to the decode unit 113 at the next operation and used for execution of an operation in the ALU 115.
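The behavior of the three-entry fetch buffer and its state signals can be modeled in a few lines. This is a simplified sketch: the class name `FetchBuffer` and its methods are hypothetical, the slot-by-slot movement from buffer 202 toward buffer 200 is abstracted into a first-in, first-out queue, and the encodings of E and EL (a boolean and a plain count) are assumptions, since the text defers concrete examples.

```python
from collections import deque


class FetchBuffer:
    """First-in, first-out model of fetch buffer 109a (buffers 200 to 202)."""

    CAPACITY = 3  # three buffers: 200, 201, 202

    def __init__(self):
        self._slots = deque()

    def push(self, instruction):
        """Accept newly fetched instruction data if a buffer is free."""
        if len(self._slots) >= self.CAPACITY:
            raise OverflowError("fetch buffer is full")
        self._slots.append(instruction)

    def pop_for_decode(self):
        """Hand the oldest entry (the one in buffer 200) to the decode unit."""
        return self._slots.popleft() if self._slots else None

    def clear(self):
        """Discard all entries, for example when a branch is taken."""
        self._slots.clear()

    @property
    def E(self) -> bool:
        """Signal E: True when no instruction data remains."""
        return len(self._slots) == 0

    @property
    def EL(self) -> int:
        """Signal EL: number of instruction data items currently held."""
        return len(self._slots)
```

A program control unit in this model would read `E` and `EL` each cycle and derive the urgency level it sends along with its transfer request.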

II Decode unit 113 and ALU 115
The decode unit 113 decodes the instruction data held in the buffer 200. It interprets the instruction data and sends it to the ALU 115 together with the result of the interpretation. The ALU 115 executes the operation of the instruction data received from the decode unit 113 and stores the result in the register unit 117a.

The register unit 117a includes a register file 206 and a branch condition status register 205. The register file 206 stores operation results, data loaded for operations, and the like. The branch condition status register 205 stores the state of the result of an operation executed by the ALU 115, such as the outcome of a magnitude comparison, the outcome of an equality comparison, and the sign of an addition or subtraction result.

When a branch instruction is decoded, the register unit 117a outputs a branch address I and a branch address J to the program control unit 105a. In addition, the branch condition status register 205 outputs a branch condition status signal, indicating the status of the branch condition, to the program control unit 105a.
III Program control unit 105a
The program control unit 105a includes an instruction data read control unit 207, a program counter 209, and a branch determination unit 210. The instruction data read control unit 207 performs overall control of fetching in processor A. The program counter 209 is a counter indicating the progress of the program being executed by processor A, and functions as fetch means together with the fetch unit 107a. The branch determination unit 210 receives the branch address I from the register unit 117a, the branch address J from the ALU 115, and the branch condition status signal, and detects that a branch instruction has been decoded by the corresponding decode unit 113.

命令データ読み出し制御部２０７は、メモリ制御部１０３に命令データ転送要求信号である信号Ｓreq（RD_REQ）を出力する。信号Ｓreqを入力したメモリ制御部１０３は、命令データ転送の要求によってメモリ１０１から命令データを読み出し、フェッチを実行する。命令の読出しに成功した場合、メモリ制御部１０３は、信号Ｓreqに対して信号Ｓready（RD_READY）を返す。プログラムカウンタ２０９は、信号Ｓreqによって要求された命令データのメモリ１０１のアドレスを示す命令アドレス信号を出力する。 The instruction data read control unit 207 outputs a signal Sreq (RD_REQ) that is an instruction data transfer request signal to the memory control unit 103. The memory control unit 103 that has received the signal Sreq reads instruction data from the memory 101 in response to a request for instruction data transfer, and executes fetching. If the instruction is successfully read, the memory control unit 103 returns a signal Sready (RD_READY) in response to the signal Sreq. The program counter 209 outputs an instruction address signal indicating the address in the memory 101 of the instruction data requested by the signal Sreq.

また、実施形態１の命令データ読み出し制御部２０７は、緊急度決定テーブル２０８を備えている。命令データ読み出し制御部２０７は、分岐判定部から分岐制御信号を用いて分岐を実行することが決定した場合、緊急度を最も高め命令データを取得する。一方、分岐が成立しない場合、信号Ｅ及び信号ＥLを緊急度決定テーブル２０８に対照し、フェッチ部１０７ａの命令データ転送の緊急度を決定する。そして、決定した緊急度を示す信号Ｐ（PRIORITY）を生成する。そして、信号Ｐを、信号Ｓreq、命令アドレス信号と共にメモリ制御部１０３に出力する。 Further, the instruction data read control unit 207 of the first embodiment includes an urgency determination table 208. When the branch control signal from the branch determination unit 210 indicates that a branch is to be taken, the instruction data read control unit 207 requests instruction data with the highest urgency. When no branch is taken, it looks up the signal E and the signal EL in the urgency determination table 208 to determine the urgency of the instruction data transfer for the fetch unit 107a, and generates a signal P (PRIORITY) indicating the determined urgency. The signal P is output to the memory control unit 103 together with the signal Sreq and the instruction address signal.

図３は、緊急度決定テーブル２０８の一例を示した図である。図３の緊急度決定テーブル２０８は、バッファ２００〜２０２が、いずれも３２ｂｉｔの容量を持ち、命令データが１命令１６ｂｉｔの固定長データであるとした場合の例を示している。このような例では、バッファ２００〜２０２の各々が２命令を蓄積することが可能であって、フェッチ・バッファ１０９ａは、最大６個の命令データを蓄積することができる。 FIG. 3 is a diagram showing an example of the urgency determination table 208. The urgency determination table 208 in FIG. 3 shows an example in which the buffers 200 to 202 all have a 32-bit capacity and the instruction data is fixed-length data of one instruction 16 bits. In such an example, each of the buffers 200 to 202 can store two instructions, and the fetch buffer 109a can store a maximum of six instruction data.

図３に示した緊急度決定テーブル２０８は、フェッチ・バッファ１０９ａの状態と、各状態に応じた命令データの緊急度とを示している。フェッチ・バッファ１０９ａの状態としては、フェッチ部１０７ａが分岐命令による分岐が成立したことと、フェッチ・バッファ１０９ａに蓄積されている命令データの数（残り命令数）とが設定されている。 The urgency determination table 208 shown in FIG. 3 shows the state of the fetch buffer 109a and the urgency of the instruction data corresponding to each state. As the state of the fetch buffer 109a, the fact that the fetch unit 107a has taken a branch by a branch instruction and the number of instruction data stored in the fetch buffer 109a (the number of remaining instructions) are set.
実施形態１では、緊急度決定テーブル２０８に示すように、フェッチ部１０７ａに蓄積されている実行予定の命令データの蓄積数に応じて緊急度が示す緊急性の度合いを設定している。なお、実施形態１では、フェッチ・バッファ１０９ａに蓄積されている命令データの蓄積数がより少ない場合により高い緊急度を設定するものとした。 In the first embodiment, as shown in the urgency determination table 208, the degree of urgency indicated by the urgency is set according to the number of instruction data to be executed that are stored in the fetch unit 107a. In the first embodiment, a higher urgency level is set when the number of instruction data stored in the fetch buffer 109a is smaller.

また、実施形態１では、分岐命令をデコードしたことをデコード部１１３が検出したことに基づいても緊急度を設定するものとした。なお、実施形態１では、分岐命令にあっても割り込みによる分岐と分岐命令による分岐とを区別し、割込みによる分岐命令がデコードされたとき、より高い緊急度の緊急度信号を生成する。 In the first embodiment, the urgency is also set when the decoding unit 113 detects that a branch instruction has been decoded. The first embodiment further distinguishes a branch caused by an interrupt from a branch caused by a branch instruction, and generates an urgency signal with a higher urgency when a branch caused by an interrupt is decoded.
以上述べたプロセッサＡは、以下のように動作する。すなわち、フェッチ・バッファ管理部１１１ａは、フェッチ・バッファ１０９ａに蓄積されている実行予定の命令データの数を示す信号ＥLを生成し、常時プログラム制御部１０５ａに出力している。また、フェッチ・バッファ１０９ａに蓄積されている命令データがなくなった場合、信号Ｅを出力し、フェッチ・バッファ１０９ａにおける命令データの蓄積状態を命令データ読み出し制御部２０７に通知する。 The processor A described above operates as follows. The fetch buffer management unit 111a generates a signal EL indicating the number of instruction data awaiting execution in the fetch buffer 109a, and constantly outputs it to the program control unit 105a. When no instruction data remains in the fetch buffer 109a, it outputs a signal E to notify the instruction data read control unit 207 of the storage state of the fetch buffer 109a.
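The status reporting described above can be illustrated with a small software model. This is only a sketch under stated assumptions: the class and method names (`FetchBuffer`, `push`, `pop`, `signal_el`, `signal_e`) are hypothetical, the capacity of six instructions is taken from the FIG. 3 example (three 32-bit buffers holding 16-bit instructions), and the actual unit is a hardware circuit rather than software.

```python
from collections import deque


class FetchBuffer:
    """Toy model of fetch buffer 109a (names are illustrative only)."""

    def __init__(self, num_buffers=3, buffer_bits=32, instruction_bits=16):
        # Capacity follows the FIG. 3 example: 3 buffers x (32 / 16) = 6.
        self.capacity = num_buffers * (buffer_bits // instruction_bits)
        self._queue = deque()

    def push(self, instruction):
        """Store a fetched instruction; return False if no slot is free."""
        if len(self._queue) >= self.capacity:
            return False
        self._queue.append(instruction)
        return True

    def pop(self):
        """Hand the oldest instruction to the decode stage (None if empty)."""
        return self._queue.popleft() if self._queue else None

    @property
    def signal_el(self):
        """Signal EL: number of instructions still awaiting execution."""
        return len(self._queue)

    @property
    def signal_e(self):
        """Signal E: asserted when no instruction data remains."""
        return not self._queue
```

In this model `signal_el` is always readable, mirroring the constant output of signal EL, while `signal_e` becomes true only when the buffer has drained.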

一方、デコード部１１３は、命令データを解釈し、この結果から分岐命令がデコードされたことを検出することができる。この場合、デコード部１１３は、分岐デコード信号を出力し、分岐命令がデコードされたことを分岐判定部２１０に通知する。さらに、レジスタ部１１７ａおよびＡＬＵ１１５は、分岐アドレスＩ、分岐アドレスＪ、分岐条件ステータス信号を分岐判定部２１０に出力する。 On the other hand, the decoding unit 113 can interpret the instruction data and detect from the result that the branch instruction has been decoded. In this case, the decoding unit 113 outputs a branch decode signal and notifies the branch determination unit 210 that the branch instruction has been decoded. Further, the register unit 117a and the ALU 115 output the branch address I, the branch address J, and the branch condition status signal to the branch determination unit 210.

分岐判定部２１０は、分岐デコード信号、分岐アドレスＩ及び分岐アドレスＪ、分岐条件ステータス信号を入力する。そして、分岐アドレスＩ及び分岐アドレスＪから分岐先のアドレスである分岐アドレスを算出する。さらに、分岐判定部２１０は、分岐条件ステータス信号に基づいて分岐制御信号を生成する。分岐制御信号は、プログラムカウンタ２０９に入力されると共に、命令データ読み出し制御部２０７に入力される。 The branch determination unit 210 receives a branch decode signal, a branch address I and a branch address J, and a branch condition status signal. Then, a branch address which is a branch destination address is calculated from the branch address I and the branch address J. Further, the branch determination unit 210 generates a branch control signal based on the branch condition status signal. The branch control signal is input to the program counter 209 and to the instruction data read control unit 207.

命令データ読み出し制御部２０７は、信号E、信号ＥLを入力し、緊急度決定テーブル２０８に対照する。信号Ｅが入力された場合、命令データ読み出し制御部２０７は、フェッチ・バッファ１０９ａに蓄積されている命令データがないと判定し、緊急度２を示す信号Ｐを生成する。また、命令データ読み出し制御部２０７は、信号ＥLが入力された場合、信号ＥLが示すフェッチ・バッファ１０９ａに蓄積されている命令データの数に応じて緊急度３から８のいずれかを示す信号Ｐを生成する。 The instruction data read control unit 207 receives the signal E and the signal EL and looks them up in the urgency determination table 208. When the signal E is input, the instruction data read control unit 207 determines that no instruction data is stored in the fetch buffer 109a and generates a signal P indicating urgency 2. When the signal EL is input, it generates a signal P indicating one of the urgency levels 3 to 8, depending on the number of instruction data stored in the fetch buffer 109a as indicated by the signal EL.

また、命令データ読み出し制御部２０７は、分岐判定部２１０から分岐制御信号を入力し、緊急度決定テーブル２０８に照合する。そして、分岐判定部２１０によって自装置に対応するフェッチ部１０７ａが分岐命令をデコードしたことが検出された場合、フェッチ・バッファ１０９ａの残り命令数が１以上である場合よりも緊急度が高い緊急度信号を生成している。 In addition, the instruction data read control unit 207 receives a branch control signal from the branch determination unit 210 and collates it with the urgency determination table 208. When the branch determination unit 210 detects that the fetch unit 107a corresponding to the own device has decoded the branch instruction, the urgency level is higher than when the remaining number of instructions in the fetch buffer 109a is 1 or more. The signal is generated.

分岐命令が実行した場合に残り命令数が１以上である場合よりも緊急度が高い緊急度信号を生成する理由は、処理の分岐が生じたとき、フェッチ・バッファ１０９ａに蓄積されている命令データが全てクリアされるためである。つまり、分岐命令がデコードされ、この命令が実行された場合、フェッチ・バッファ１０９ａは空になる。このため、本実施形態では、命令データがフェッチされない場合には直ちにメモリ・ストールが発生する可能性が生じるため比較的高い緊急度の信号Ｐを生成するものとした。 The reason for generating an urgency level signal having a higher urgency level than when the remaining number of instructions is 1 or more when a branch instruction is executed is that the instruction data stored in the fetch buffer 109a when a processing branch occurs This is because all are cleared. That is, when a branch instruction is decoded and this instruction is executed, the fetch buffer 109a is emptied. For this reason, in this embodiment, when instruction data is not fetched, a memory stall may occur immediately, so that the signal P with a relatively high urgency level is generated.
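The urgency selection of the first embodiment can be sketched as a small lookup function. The sketch rests on several assumptions that the text does not spell out: the function name and parameters are hypothetical, a smaller number is treated as more urgent, branch requests use the two highest levels 0 and 1 with the interrupt-driven branch taking level 0 (the embodiment only states that the interrupt case is the more urgent of the two), and levels 3 to 8 are assumed to map linearly to 1 to 6 remaining instructions. The actual contents of FIG. 3 may differ.

```python
def determine_urgency(branch_taken, by_interrupt, signal_e, signal_el):
    """Return the urgency carried by signal P (0 is treated as most urgent)."""
    if branch_taken:
        # A taken branch empties the fetch buffer, so it outranks every
        # "instructions remaining" case. Interrupt branches rank highest.
        return 0 if by_interrupt else 1
    if signal_e or signal_el == 0:
        return 2  # fetch buffer is empty
    # ASSUMPTION: levels 3..8 correspond linearly to 1..6 remaining
    # instructions; fewer remaining instructions means higher urgency.
    if not 1 <= signal_el <= 6:
        raise ValueError("signal_el out of range for the FIG. 3 example")
    return signal_el + 2
```

For example, a processor with one instruction left would request with urgency 3, while one with a full buffer of six would request with urgency 8.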

さらに、本実施形態では、命令データ読み出し制御部２０７が、分岐制御信号が示す分岐命令の割込みによる分岐、命令による分岐の別を判断し、判断の結果に基づいて緊急度０または緊急度１の信号Ｐを生成する。生成された信号Ｐは、信号Ｓreq及び命令アドレス信号と共にメモリ制御部１０３に出力される。 Furthermore, in this embodiment, the instruction data read control unit 207 determines whether the branch indicated by the branch control signal is caused by an interrupt or by a branch instruction, and generates a signal P of urgency 0 or urgency 1 according to the result. The generated signal P is output to the memory control unit 103 together with the signal Sreq and the instruction address signal.
以上、実施形態１では、命令データが固定長データであるものとして実施形態１のプロセッサの動作を述べた。しかし、実施形態１は、固定長の命令データを取り扱う構成に限定されるものでなく、可変長の命令データを扱うことも可能である。可変長の命令データを取り扱う場合、プログラム制御部１０５ａは、例えばアドレス信号の出力時、あるいは信号Ｓreadyの入力時に命令データのデータ長を検出する。 The operation of the processor of the first embodiment has been described above on the assumption that the instruction data is fixed-length data. However, the first embodiment is not limited to a configuration that handles fixed-length instruction data, and can also handle variable-length instruction data. When handling variable-length instruction data, the program control unit 105a detects the data length of the instruction data, for example when the address signal is output or when the signal Sready is input.

そして、命令データ読み出し制御部２０７が、検出されたデータ長と信号ＥLに含まれる命令データのフェッチ・バッファ１０９ａにおける蓄積量とを対照し、フェッチ・バッファ１０９ａに現実に蓄積されている可変長の命令データ数を得る。 Then, the instruction data read control unit 207 compares the detected data length with the amount of instruction data stored in the fetch buffer 109a, as indicated by the signal EL, and obtains the number of variable-length instruction data actually stored in the fetch buffer 109a.
（２）メモリ制御部 (2) Memory control unit
図４は、メモリ制御部１０３の構成をより詳細に示した図であって、メモリ制御部１０３とメモリ１０１とを示している。プロセッサ本体１のプロセッサＡ〜Ｄとの間でデータを授受するプロセッサインターフェイス部４０１を備えている。プロセッサインターフェイス部４０１には、プロセッサ本体１のプロセッサＡ〜Ｄが備えるプログラム制御部１０５ａ〜１０５ｄの各々から信号Ｓreq、信号Ｐが入力される。また、プロセッサインターフェイス部４０１には、プログラム制御部１０５ａ〜１０５ｄの各々から信号Ｓreqが要求する命令データのアドレスがアドレス信号として入力される。 FIG. 4 is a diagram showing the configuration of the memory control unit 103 in more detail, and shows the memory control unit 103 and the memory 101. The memory control unit 103 includes a processor interface unit 401 that exchanges data with the processors A to D of the processor body 1. A signal Sreq and a signal P are input to the processor interface unit 401 from each of the program control units 105a to 105d included in the processors A to D of the processor body 1. The processor interface unit 401 also receives, from each of the program control units 105a to 105d, the address of the instruction data requested by the signal Sreq as an address signal.

メモリ制御部１０３は、プロセッサインターフェイス部４０１を介して信号Ｓreq、信号Ｐ、命令アドレスにより、プロセッサＡ〜Ｄごとに、フェッチ時にアクセスすべきメモリ１０１のアドレス、命令データ転送の緊急度の信号を受け取る。そして、要求されたデータ転送が実行されたことを示す信号Ｓreadyと共に命令データをプロセッサＡ〜Ｄの各々に送出する。 Through the signal Sreq, the signal P, and the instruction address received via the processor interface unit 401, the memory control unit 103 obtains, for each of the processors A to D, the address of the memory 101 to be accessed at fetch time and the urgency of the instruction data transfer. It then sends the instruction data to each of the processors A to D together with the signal Sready, which indicates that the requested data transfer has been executed.

メモリ制御部１０３は、プログラム制御部１０５ａによって入力された信号Ｐが示す緊急度に基づいて、要求されたデータ転送の優先順位を設定する優先順位設定手段として機能する要求リストテーブル４０４及びアドレスデコード部４０２を備えている。また、設定された優先順位にしたがってメモリ１０１から命令を読み出すメモリアクセス制御部４０５を備えている。 The memory control unit 103 includes a request list table 404 and an address decoding unit 402, which function as priority setting means that sets the priority of each requested data transfer based on the urgency indicated by the signal P input from the program control unit 105a. It also includes a memory access control unit 405 that reads instructions from the memory 101 in accordance with the set priority order.

図５は、要求リストテーブル４０４の一例を示す図である。図示した要求リストテーブル４０４は、緊急度に応じて優先度が設定されたメモリ１０１に対するアクセス要求の一覧を示している。図中に示したプロセッサＩＤとは、プロセッサＡ〜Ｄの各々を識別するためのＩＤであり、図５中にはＡ、Ｂ、Ｃ、Ｄとして示す。緊急度とは、信号Ｐによって表されるプロセッサへのデータ転送の緊急性である。現在の優先度とは、各プロセッサの緊急度に応じて設定された競合するフェッチ間の優先度である。要求リストテーブル４０４において、命令データ転送要求は、優先度が高いものから順に配列されて小さいアクセスリスト番号が割り当てられている。なお、アクセスリスト番号は、リストのポインタである。 FIG. 5 is a diagram illustrating an example of the request list table 404. The illustrated request list table 404 shows a list of access requests to the memory 101 in which priorities are set according to the urgency level. The processor ID shown in the figure is an ID for identifying each of the processors A to D, and is shown as A, B, C, and D in FIG. The urgency is the urgency of data transfer to the processor represented by the signal P. The current priority is the priority between competing fetches set according to the urgency of each processor. In the request list table 404, the instruction data transfer requests are arranged in descending order of priority and assigned a small access list number. The access list number is a list pointer.

以上述べた構成は、以下のように動作する。すなわち、信号Ｓreqは、要求リストテーブル４０４、優先順位制御部４０３、アドレスデコード部４０２の各々に出力される。信号Ｓreqは出力されたプロセッサを特定するためのプロセッサＩＤを含んでいて、プロセッサＩＤは、要求リストテーブル４０４に設定される。要求リストテーブル４０４では、信号Ｓreqが入力されたことによって命令データ転送要求がなされたことを検出すると共に、命令データ転送を要求したプロセッサが特定される。 The configuration described above operates as follows. That is, the signal Sreq is output to each of the request list table 404, the priority order control unit 403, and the address decoding unit 402. The signal Sreq includes a processor ID for specifying the output processor, and the processor ID is set in the request list table 404. In the request list table 404, it is detected that an instruction data transfer request has been made by the input of the signal Sreq, and the processor that has requested the instruction data transfer is specified.

また、信号Ｓreqを出力したプロセッサは、同時に信号Ｐ及び命令アドレスを出力する。信号Ｐは、優先順位制御部４０３に入力される。優先順位制御部４０３は、信号Ｐに基づいて信号Ｓreqのデータ転送の緊急度を判定する。判定された緊急度は、要求リストテーブル４０４に出力され、命令データ転送の要求に対応して設定される。一方、命令アドレスは、アドレスデコード部４０２に入力される。アドレスデコード部４０２は、入力された命令アドレスを要求リストテーブル４０４に出力し、要求リストテーブル４０４は命令アドレスを命令データ転送要求に対応させて設定する。 The processor that has output the signal Sreq outputs the signal P and the instruction address at the same time. The signal P is input to the priority control unit 403. The priority control unit 403 determines the urgency of data transfer of the signal Sreq based on the signal P. The determined urgency level is output to the request list table 404 and set in response to a request for command data transfer. On the other hand, the instruction address is input to the address decoding unit 402. The address decoding unit 402 outputs the input instruction address to the request list table 404, and the request list table 404 sets the instruction address corresponding to the instruction data transfer request.

命令データ転送要求は、緊急度が高いものからアクセスリスト番号が付されて要求リストテーブル４０４に設定される。また、プロセッサＩＤは、要求に対応して要求リストテーブル４０４に設定される。 Instruction data transfer requests are assigned access list numbers in descending order of urgency and set in the request list table 404. The processor ID is set in the request list table 404 in association with each request.
以上述べた動作により、命令データ転送を要求したプロセッサを特定するＩＤ及び命令データ転送の緊急度が、緊急度にしたがう順番で要求リストテーブル４０４に設定される。優先順位制御部４０３は、要求リストテーブル４０４に設定された命令データ転送の要求に対し、緊急度にしたがって優先順位を付す。 Through the operation described above, the ID identifying the processor that requested the instruction data transfer and the urgency of that transfer are set in the request list table 404 in order of urgency. The priority control unit 403 assigns priorities to the instruction data transfer requests set in the request list table 404 according to their urgency.

また、フェッチ部１０７ａ〜１０７ｄは、要求リストテーブル４０４の設定がいったんなされた後も順次命令データ転送を要求する。後になされた命令データ転送要求の緊急度が先に優先順位が設定された命令データ転送要求よりも高い緊急度を持つ場合、優先順位制御部４０３は、要求リストテーブル４０４に優先度更新情報を出力し、いったん設定された優先順位を更新する。このため、要求リストテーブル４０４に設定された優先順位は、以降になされた命令データ転送要求の緊急度に応じて変動することになる。 The fetch units 107a to 107d sequentially request instruction data transfer even after the request list table 404 is set once. If the urgency level of the command data transfer request made later has a higher urgency level than the command data transfer request for which the priority is set first, the priority control unit 403 outputs priority update information to the request list table 404 Then, the priority order once set is updated. For this reason, the priority set in the request list table 404 varies depending on the urgency of the command data transfer request made thereafter.

すなわち、優先順位制御部４０３は、要求された命令データ転送の緊急度を要求リストテーブルに設定する。そして、先に緊急度が設定された命令データ転送要求による命令データ転送が実行される以前であって、かつ、より高い緊急度を示す信号Ｐが後に入力された場合、この情報を優先度更新情報として要求リストテーブル４０４に出力する。要求リストテーブル４０４では、優先度更新情報に基づいて、緊急度がより高い命令データ転送要求に対し、先に要求された緊急度がより低い命令データ転送要求よりも高い優先順位を設定する。 That is, the priority control unit 403 sets the urgency of each requested instruction data transfer in the request list table 404. If a signal P indicating a higher urgency is input later, before the transfer for an earlier request whose urgency has already been set has been executed, the priority control unit 403 outputs this information to the request list table 404 as priority update information. Based on the priority update information, the request list table 404 gives the more urgent request a higher priority than the less urgent request made earlier.

一方、緊急度がより低い命令データ転送要求の優先順位は、後になされた緊急度がより高い命令データ転送要求によって低下する。このような場合、優先順位制御部４０３は、要求リストテーブル４０４を更新し、未実行の命令データ転送要求の要求リストテーブル４０４における優先順位を新たに設定する。新たに設定された優先順位の情報は、要求リストテーブル４０４から現在優先度として優先順位制御部４０３に通知される。現在優先度は、以降の優先順位制御部４０３における優先順位の制御に使用される。 On the other hand, the priority of the instruction data transfer request with the lower urgency level is lowered by the instruction data transfer request with the higher urgency level made later. In such a case, the priority order control unit 403 updates the request list table 404 and newly sets a priority order in the request list table 404 for an unexecuted instruction data transfer request. The information on the newly set priority is notified from the request list table 404 to the priority control unit 403 as the current priority. The current priority is used for priority control in the subsequent priority control unit 403.

図６は、メモリ制御部１０３による優先順位の設定を変更する処理を説明するためのフローチャートである。優先順位制御部４０３は、優先順位設定の処理を開始するにあたり、先ず、要求リストテーブル４０４を初期化する（Ｓ６０１）。そして、メモリ制御部１０３は、信号Ｓreqによって命令データ転送要求があるか否か判断する（Ｓ６０２）。命令データ転送要求がある場合（Ｓ６０２：Ｙｅｓ）、さらにアクセスすべきアドレス等の情報を入力する（Ｓ６０３）。また、ステップＳ６０２において命令データ転送要求がないと判断された場合（Ｓ６０２：Ｎｏ）、命令データ転送の要求が起こるまで待機する。 FIG. 6 is a flowchart for explaining processing for changing the priority order setting by the memory control unit 103. Prior to starting the priority setting process, the priority control unit 403 first initializes the request list table 404 (S601). Then, the memory control unit 103 determines whether there is an instruction data transfer request based on the signal Sreq (S602). When there is an instruction data transfer request (S602: Yes), information such as an address to be accessed is further input (S603). If it is determined in step S602 that there is no instruction data transfer request (S602: No), the process waits until an instruction data transfer request occurs.

次に、メモリ制御部１０３は、要求リストテーブル４０４のアクセスリスト番号ｋをＮ−１と定義する（Ｓ６０４）。なお、Ｎは、図５に示した要求リストテーブル４０４に設定されている命令データ転送要求の総数である。図５に示したアクセスリスト番号は、０から始まっている。このため、アクセスリスト番号ｋの最大値は、Ｎ−１によって得られる。 Next, the memory control unit 103 sets the access list number k of the request list table 404 to N−1 (S604). Here, N is the total number of instruction data transfer requests set in the request list table 404 shown in FIG. 5. The access list numbers shown in FIG. 5 start from 0, so the maximum value of the access list number k is N−1.

次に、メモリ制御部１０３は、ステップＳ６０２で要求された命令データ転送の緊急度が、アクセスリスト番号ｋの緊急度よりも高いか否か判断する（Ｓ６０５）。比較の結果、今回要求された命令データ転送の緊急度がアクセスリスト番号ｋの緊急度よりも高くない場合（Ｓ６０５：Ｎｏ）、今回要求された命令データ転送の優先順位を、要求リストテーブル４０４のアクセスリスト番号ｋに設定されていた優先順位の直後の順位に設定する（Ｓ６０９）。 Next, the memory control unit 103 determines whether or not the urgency level of the command data transfer requested in step S602 is higher than the urgency level of the access list number k (S605). As a result of the comparison, if the urgency level of the command data transfer requested this time is not higher than the urgency level of the access list number k (S605: No), the priority order of the command data transfer requested this time is set in the request list table 404. The order immediately after the priority set in the access list number k is set (S609).

ステップＳ６０５、６０９の処理によれば、要求された命令データ転送の緊急度が、アクセスリスト番号ｋの緊急度と等しい場合、等しい緊急度を持つ命令データ転送要求のうち、後に要求された命令データ転送を先に要求された命令データ転送の直後に設定することができる。 According to steps S605 and S609, when the urgency of the requested instruction data transfer equals the urgency of access list number k, the later request is placed immediately after the earlier request among the requests having the same urgency.
また、メモリ制御部１０３は、ステップＳ６０５において、今回要求された命令データ転送の緊急度がアクセスリスト番号ｋの緊急度よりも高いと判断した場合（Ｓ６０５：Ｙｅｓ）、アクセスリスト番号ｋの優先度を下げるよう要求リストテーブル４０４を更新する（Ｓ６０６）。そして、アクセスリスト番号ｋから１を減じてｋとし（Ｓ６０７）、アクセスリスト番号ｋから１を減じ結果、ｋが０になったか否か判断する（Ｓ６０８）。 If the memory control unit 103 determines in step S605 that the urgency of the instruction data transfer requested this time is higher than the urgency of access list number k (S605: Yes), it updates the request list table 404 so as to lower the priority of access list number k (S606). It then subtracts 1 from the access list number k (S607) and determines whether k has become 0 as a result (S608).
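Steps S604 to S609 amount to inserting the new request into a list that is kept in priority order. The following is a minimal sketch of that idea, not the circuit itself. The function name `insert_request` and the dictionary fields are hypothetical, a smaller urgency number is assumed to be more urgent, and the loop here also compares against entry 0, which is an assumption about the intended behaviour rather than a literal transcription of the flowchart.

```python
def insert_request(request_list, processor_id, urgency, address):
    """Insert a request into a list kept in priority order (index 0 first).

    Scanning from the tail mirrors steps S604 to S609: entries less urgent
    than the new request move back one place, and the new request lands
    right after the first entry that is at least as urgent, so requests
    with equal urgency keep their arrival order.
    """
    k = len(request_list) - 1  # S604: start at the last entry (N - 1)
    while k >= 0 and urgency < request_list[k]["urgency"]:  # S605
        k -= 1  # S606/S607: entry k drops one rank, move to the next
    entry = {"processor": processor_id, "urgency": urgency, "address": address}
    request_list.insert(k + 1, entry)  # S609: place right after entry k
    return k + 1  # access list number assigned to the new request
```

With this ordering, the memory access control unit would simply serve the entry at access list number 0 next.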

ステップＳ６０８の判断の結果、メモリ制御部１０３は、アクセスリスト番号ｋが０でない場合に要求された命令データ転送の緊急度がアクセスリスト番号ｋの緊急度よりも高いか否か再び判断する（Ｓ６０８：Ｎｏ）。また、アクセスリスト番号ｋが０になった場合（Ｓ６０８：Ｙｅｓ）、次の命令データ転送要求があるか否かを判断する。 As a result of the determination in step S608, if the access list number k is not 0 (S608: No), the memory control unit 103 again determines whether the urgency of the requested instruction data transfer is higher than the urgency of access list number k. If the access list number k has become 0 (S608: Yes), it determines whether there is a next instruction data transfer request.
次に、以上述べたようにして設定された優先度に基づいて行われる実施形態１のプロセッサの動作を説明する。 Next, the operation of the processor of the first embodiment, performed based on the priority set as described above, will be described.

図７、図８は、プロセッサ動作を説明するためのタイミングチャートである。図７は、命令データ転送要求の優先順位を緊急度に応じて設定した場合の動作を示すタイミングチャートである。図８は、実施形態１との比較のため掲げた従来のマルチスレッドプロセッサの動作を示すタイミングチャートである。 FIGS. 7 and 8 are timing charts for explaining the processor operation. FIG. 7 is a timing chart showing the operation when the priority of instruction data transfer requests is set according to urgency. FIG. 8 is a timing chart showing the operation of a conventional multi-thread processor, given for comparison with the first embodiment.
図７、図８のいずれにおいても、プロセッサＡ〜Ｄの各々について信号Ｓreq、信号Ｐ、信号Ｓreadyのタイミング及び命令データ読み出しのタイミングを示し、最上段にプロセッサのクロックサイクル（ＣＬＫ）を示している。図８〜図１０において、各信号を発生するプロセッサをＡ、Ｂ、Ｃ、Ｄで示す。 Both FIG. 7 and FIG. 8 show, for each of the processors A to D, the timing of the signal Sreq, the signal P, and the signal Sready and the timing of instruction data reads, with the processor clock cycle (CLK) shown at the top. In FIGS. 8 to 10, the processors that generate the signals are denoted by A, B, C, and D.

プロセッサＡ_ＲＥＱはプロセッサＡが発生する信号Ｓreqである。また、プロセッサＡ_ＰＲＩＯＲＩＴＹはプロセッサＡが発生する信号Ｐを示し、プロセッサＡ_ＲＥＡＤＹはプロセッサＡが発生する信号Ｓreadyである。さらに、プロセッサＡ_ＤＡＴＡは、命令データ転送要求によって命令データが読み出されるタイミングを指す。なお、以上の表記は、プロセッサＡの部分をプロセッサＢ、プロセッサＣ、プロセッサＤに置き換えることにより、プロセッサＡ以外のプロセッサによって発生される信号を表すものとする。 The processor A_REQ is a signal Sreq generated by the processor A. The processor A_PRIORITY indicates a signal P generated by the processor A, and the processor A_READY is a signal Sready generated by the processor A. Further, the processor A_DATA indicates the timing at which the instruction data is read by the instruction data transfer request. The above notation represents a signal generated by a processor other than processor A by replacing processor A with processor B, processor C, and processor D.

図７によれば、先ず、１サイクル目でプロセッサＡがメモリ制御部１０３に対して緊急度１の命令データ転送要求をし、次サイクルの２サイクル目で要求に応じた命令データの読み出しが開始される。そして、命令データ読み出し完了のタイミングで信号Ｓreadyが発生し、プロセッサ本体１の側に命令データ転送が実行されたことを通知することが分かる。 According to FIG. 7, the processor A first issues an instruction data transfer request with urgency 1 to the memory control unit 103 in the first cycle, and reading of the requested instruction data starts in the following cycle, the second. The signal Sready is then generated when the instruction data read completes, notifying the processor body 1 that the instruction data transfer has been executed.

また、図７に示すように、実施形態１では、プロセッサＣが３サイクル目で緊急度２の命令データ転送要求をし、プロセッサＣの命令データ転送要求によってなされる命令データの読出しが完了する以前にプロセッサＡが緊急度０の命令データ転送要求をしている。このとき、メモリ制御部１０３は、プロセッサＣの命令データ転送要求に応じる命令データの読出しよりもプロセッサＡの命令データ転送要求に応じる命令データの読出しを先に実行する。 Also, as shown in FIG. 7, in the first embodiment the processor C issues an instruction data transfer request with urgency 2 in the third cycle, and the processor A issues a request with urgency 0 before the read for the processor C's request has completed. In this case, the memory control unit 103 executes the read for the processor A's request before the read for the processor C's request.

一方、図８に示した従来のフェッチ動作では、プロセッサＡ〜Ｄによってなされた命令データ転送要求を順に実行している。このような従来の動作に比べ、図７に示した実施形態１のプロセッサの動作では、フェッチ・バッファに命令データが残り少ないプロセッサの命令データ転送を優先的に実行することができる。このため、従来のプロセッサとメモリに対するアクセス速度が同じであってもプロセッサ全体を円滑に動作させ、メモリ・ストール等の不具合を防ぐことができる。 On the other hand, in the conventional fetch operation shown in FIG. 8, the instruction data transfer requests made by the processors A to D are executed in order. Compared to such a conventional operation, in the operation of the processor of the first embodiment shown in FIG. 7, the instruction data transfer of the processor with less instruction data remaining in the fetch buffer can be preferentially executed. For this reason, even if the access speed to the conventional processor and the memory is the same, the entire processor can be operated smoothly, and problems such as memory stalls can be prevented.

さらに、円滑に動作できる実施形態１のプロセッサは、メモリ１０１を非バンク構成としても適切なタイミングでフェッチ部１０７ａ〜１０７ｄが命令を命令データ転送することができる。このため、メモリにデータをリニアに割り当て、バンク分けするよりもメモリ１０１の使用効率を高めることができる。また、実施形態１のプロセッサは、円滑に動作できるため、メモリ１０１との間をシングルポートで接続しても充分メモリ・ストール等の不具合が発生することを抑えることができる。このため、メモリとプロセッサとの間の接続にかかる回路規模をより小型化し、コストをも低廉化することに有利である。 Furthermore, because the processor of the first embodiment operates smoothly, the fetch units 107a to 107d can transfer instruction data at appropriate timing even when the memory 101 has a non-banked configuration. Data can therefore be allocated linearly in the memory, which uses the memory 101 more efficiently than dividing it into banks. For the same reason, problems such as memory stalls can be sufficiently suppressed even when the processor is connected to the memory 101 through a single port. This is advantageous for reducing the circuit scale of the connection between the memory and the processor, and hence the cost.

なお、以上述べた実施形態１では、本発明のプロセッサを命令が並列に実行できるマルチスレッドプロセッサ等の構成としている。しかし、実施形態１のプロセッサは命令を並列に実行できる構成に限定されるものではなく、シングルプロセッサとして構成することも可能である。 In the first embodiment described above, the processor of the present invention is configured as a multi-thread processor or the like that can execute instructions in parallel. However, the processor according to the first embodiment is not limited to a configuration capable of executing instructions in parallel, and may be configured as a single processor.

（実施形態２） (Embodiment 2)
ところで、実施形態１では、比較的緊急度が低い命令データ転送要求が、後により緊急度の高い命令データ転送要求が次々となされた場合に長時間実行されない可能性が生じる。つまり、実施形態１のように、命令データ転送要求の優先順位を緊急度にのみ応じて設定した場合、図７に示したように、３サイクル目で要求された緊急度２のプロセッサＣの命令データ転送は、後になされたプロセッサＡ及びプロセッサＢによる命令データ転送要求によって１０サイクル目まで実行されることがない。このような動作によれば、比較的緊急度の低い命令データ転送要求が長時間待機させられて実質的に実行できなくなる可能性がある。 In the first embodiment, an instruction data transfer request with a relatively low urgency may not be executed for a long time if requests with higher urgency are subsequently issued one after another. That is, when the priority of instruction data transfer requests is set according to urgency alone, as in the first embodiment, the instruction data transfer of the processor C with urgency 2 requested in the third cycle is, as shown in FIG. 7, not executed until the tenth cycle because of the requests made later by the processor A and the processor B. With such operation, a request with a relatively low urgency may be kept waiting so long that it effectively cannot be executed.

実施形態２は、メモリ・ストール防止の効果を維持しながらこのような実施形態１の可能性をなくすため、あるフェッチ部（第１フェッチ部）が命令データ転送（第１フェッチ）を要求した後であって、かつ第１の命令データ転送が完了する以前に他のフェッチ部（第２フェッチ部）がメモリ１０１に命令データ転送（第２フェッチ）を要求し、第２の命令データ転送が第１命令データ転送よりも先に実行された場合、優先順位制御部４０３は、第１の命令データ転送の優先順位の設定をより高位の順位に更新するものとした。 The second embodiment removes this possibility while keeping the memory-stall prevention of the first embodiment. Suppose a fetch unit (first fetch unit) requests an instruction data transfer (first fetch) and, before the first transfer completes, another fetch unit (second fetch unit) requests an instruction data transfer (second fetch) from the memory 101. If the second transfer is executed before the first, the priority control unit 403 updates the priority setting of the first transfer to a higher rank.

すなわち、例えば、プロセッサＡがメモリ制御部１０３に対して緊急度３の命令データ転送要求をした後であって、かつ要求が受け付けられる以前にプロセッサＣが緊急度１の命令データ転送を要求したものとする。このような場合、実施形態２では、実施形態１と同様に、緊急度がより高いプロセッサＣの命令データ転送要求を優先し、プロセッサＣがプロセッサＡよりも先に命令データを命令データ転送する。 That is, for example, after processor A makes a urgency 3 command data transfer request to memory control unit 103 and before the request is accepted, processor C requests urgency 1 command data transfer. And In such a case, in the second embodiment, as in the first embodiment, the instruction data transfer request of the processor C having a higher degree of urgency is given priority, and the processor C transfers the instruction data before the processor A.

The urgency of processor A's instruction data transfer request is then raised by one level, to urgency 2. In Embodiment 2, executing a later-requested instruction data transfer ahead of an earlier-requested one is also referred to as overtaking of an instruction data transfer request.

Next, suppose for example that processor D requests an instruction data transfer of urgency 1. The priority control unit 403 compares urgency 2 of processor A's request with urgency 1 of processor D's request and gives the more urgent request of processor D a higher priority than that of processor A. The urgency of processor A's request is then raised by one more level, to urgency 1.

Next, if for example processor C requests an instruction data transfer of urgency 1, the priority control unit 403 compares urgency 1 of processor A's request with urgency 1 of processor C's request. The two urgency levels are now equal. In this case Embodiment 2 gives priority to the request made earlier, so the priority of processor A's request is set higher than that of processor C's request.
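The escalation sequence above can be sketched in a few lines of Python. This is a minimal model under stated assumptions, not the circuit of the priority control unit 403: the names `Request`, `PriorityController`, `request`, and `grant` are hypothetical, smaller urgency numbers are treated as more urgent with 0 assumed to be the most urgent level, and urgency is raised at the moment a later request is granted ahead of an earlier one.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Request:
    processor: str
    urgency: int   # smaller number = more urgent (0 assumed most urgent)
    seq: int       # arrival order, used to break ties


@dataclass
class PriorityController:
    """Hypothetical model of priority control with overtaking (Embodiment 2)."""
    pending: list = field(default_factory=list)
    _seq: count = field(default_factory=count)

    def request(self, processor: str, urgency: int) -> None:
        self.pending.append(Request(processor, urgency, next(self._seq)))

    def grant(self) -> Request:
        # Most urgent first; among equal urgency, the earliest request wins.
        winner = min(self.pending, key=lambda r: (r.urgency, r.seq))
        self.pending.remove(winner)
        # Every earlier request still waiting has just been overtaken:
        # raise its urgency by one level (never past level 0).
        for r in self.pending:
            if r.seq < winner.seq:
                r.urgency = max(0, r.urgency - 1)
        return winner


# Replay of the example in the text.
ctl = PriorityController()
ctl.request("A", 3)
ctl.request("C", 1)
order = [ctl.grant().processor]       # C overtakes A; A is raised 3 -> 2
ctl.request("D", 1)
order.append(ctl.grant().processor)   # D overtakes A; A is raised 2 -> 1
ctl.request("C", 1)
order.append(ctl.grant().processor)   # tie at urgency 1: A asked first
order.append(ctl.grant().processor)   # the remaining request from C
print(order)  # ['C', 'D', 'A', 'C']
```

Sorting on the pair (urgency, arrival order) keeps the tie-breaking rule in one place, so a request that has been raised to the same urgency as a newer one is always served first.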

FIG. 9 illustrates the request list table of Embodiment 2. In the request list table of FIG. 9, the instruction data transfer request of urgency 4 from processor C is set to a higher priority than instruction data transfer requests with higher urgency. This setting arises because the urgency of a request that started out relatively low was raised each time the request was overtaken, until it acquired a higher priority than more urgent requests made later.

FIG. 10 is a timing chart showing the operation in which, when an instruction data transfer request is overtaken, the urgency of the overtaken request is raised. In FIG. 10, as in FIG. 7, processor C issues an instruction data transfer request of urgency 2 in the third cycle. Before the instruction data transfer for this request is executed, processor A issues an instruction data transfer request of urgency 0.

In this case, in the operation of FIG. 10, as in that of FIG. 7, the memory control unit 103 lets processor A's instruction data transfer proceed ahead of processor C's request. However, because priorities are set with overtaking taken into account, the urgency of the request that processor C made in the third cycle is changed from 2 to 1. As a result, in the example of FIG. 10, the urgency of processor C's request becomes equal to that of the still later request from processor B (urgency 1). When urgency levels are equal, the request made earlier takes priority. The memory control unit 103 therefore executes processor C's request ahead of processor B's request.

As described above, the operation shown in FIG. 10 shortens the waiting time of processor C's instruction data transfer request compared with the operation shown in FIG. 7, and prevents instruction data transfer requests with relatively low urgency from becoming practically unexecutable.

In Embodiment 2, when an instruction data transfer is overtaken, the urgency of the earlier request is raised. This suppresses processor memory stalls while preventing requests with relatively low urgency from becoming practically unexecutable.

In Embodiment 2 as described above, when an instruction data transfer is overtaken, the urgency of the overtaken transfer is raised by one level. Embodiment 2 is not limited to this configuration, however; the urgency may be raised by any number of levels according to the specifications and operation of the processor and the processing to be executed.
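The difference between the FIG. 7 and FIG. 10 orderings, and the adjustable increment, can be illustrated with a simplified replay. The sketch below is an assumption-laden model rather than the patented circuit: the function name `service_order`, the event tuples, and the `step` parameter are hypothetical, cycle-level timing is not modeled, and urgency 0 is assumed to be the most urgent level. Setting `step=0` gives urgency-only arbitration as in Embodiment 1, and `step=1` gives the one-level increase described for Embodiment 2.

```python
def service_order(events, step=1):
    """Return the order in which requests are granted.

    events: list of ("req", processor, urgency) or ("grant",) tuples.
    step:   urgency levels gained by a request each time it is overtaken;
            step=0 reproduces urgency-only arbitration.
    Smaller urgency numbers are more urgent; 0 is assumed to be the floor.
    """
    pending = []    # entries are [urgency, arrival_index, processor]
    granted = []
    arrivals = 0
    for ev in events:
        if ev[0] == "req":
            pending.append([ev[2], arrivals, ev[1]])
            arrivals += 1
        else:
            # Lists compare element by element: urgency, then arrival order.
            winner = min(pending)
            pending.remove(winner)
            granted.append(winner[2])
            for entry in pending:
                if entry[1] < winner[1]:    # entry was overtaken
                    entry[0] = max(0, entry[0] - step)
    return granted


# Simplified version of the scenario in the text: C asks first (urgency 2),
# A (urgency 0) arrives before C is served, and B (urgency 1) arrives later.
fig10_like = [
    ("req", "C", 2),
    ("req", "A", 0),
    ("grant",),
    ("req", "B", 1),
    ("grant",),
    ("grant",),
]

print(service_order(fig10_like, step=0))  # ['A', 'B', 'C']
print(service_order(fig10_like, step=1))  # ['A', 'C', 'B']
```

With `step=0`, processor C is served last because B's later request is more urgent. With `step=1`, C is raised to urgency 1 when A overtakes it, ties with B, and wins the tie as the earlier request.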

FIG. 1 is a diagram for explaining a processor common to Embodiment 1 and Embodiment 2 of the present invention.
FIG. 2 is a diagram showing the configuration of the processor main body shown in FIG. 1 in more detail.
FIG. 3 is a diagram showing an example of the urgency determination table shown in FIG. 2.
FIG. 4 is a diagram showing the configuration of the memory control unit shown in FIG. 1 in more detail.
FIG. 5 is a diagram showing an example of the request list table of Embodiment 1 shown in FIG. 4.
FIG. 6 is a flowchart for explaining the process of changing the priority setting in Embodiment 1.
FIG. 7 is a timing chart showing the operation of the processor of Embodiment 1.
FIG. 8 is a timing chart showing the operation of a conventional multithread processor, given for comparison with Embodiment 1.
FIG. 9 is a diagram showing an example of the request list table of Embodiment 2.
FIG. 10 is a timing chart showing the operation of the processor of Embodiment 2.
FIG. 11 is a diagram illustrating a configuration in which a multithread processor accesses a memory divided into banks.

Explanation of symbols

1 processor main body, 101 memory, 103 memory control unit
105a, 105b, 105c, 105d program control units
107a, 107b, 107c, 107d fetch units
109a, 109b, 109c, 109d fetch buffers
111a fetch buffer management unit, 113 decode unit
117a, 117b, 117c, 117d register units
200, 201, 202 buffers
205 branch condition status register, 206 register file
207 instruction data read control unit, 208 urgency determination table
209 program counter, 210 branch determination unit
401 processor interface unit, 402 address decode unit
403 priority control unit, 404 request list table, 405 memory access control unit

Claims

A processor that fetches instructions from a memory in which the instructions are stored, the processor comprising:
fetch means for fetching an instruction; and
memory control means for transferring an instruction requested by the fetch means to the fetch means that requested the instruction,
wherein the fetch means comprises:
instruction storage means for storing transferred instructions;
urgency setting means for setting an urgency indicating how urgently the fetch means needs to acquire an instruction, based on the state of an actual instruction, which is the instruction actually being executed among the instructions stored in the instruction storage means; and
fetch request means for outputting the urgency set by the urgency setting means to the memory control means and requesting transfer of an instruction, and
wherein the memory control means comprises:
fetch priority setting means for determining the priority of instruction transfer based on the urgency output by the fetch request means; and
memory access control means for reading an instruction relating to a fetch request from the memory according to the priority set by the fetch priority setting means.

A processor that executes a plurality of instructions in parallel, the processor comprising:
a plurality of fetch means for fetching instructions; and
memory control means for transferring an instruction requested by one of the fetch means to the fetch means that requested the instruction,
wherein each fetch means comprises:
instruction storage means for storing transferred instructions;
urgency setting means for setting an urgency indicating how urgently the fetch means needs to acquire an instruction, based on the state of an actual instruction, which is the instruction actually being executed among the instructions stored in the instruction storage means; and
fetch request means for outputting the urgency set by the urgency setting means to the memory control means and requesting transfer of an instruction, and
wherein the memory control means comprises:
fetch priority setting means for determining the priority of instruction transfer based on the urgency output by the fetch request means; and
memory access control means for reading an instruction relating to a fetch request from the memory according to the priority set by the fetch priority setting means.

The processor according to claim 1, wherein the urgency setting means sets the urgency according to the number of instructions stored in the instruction storage means.

The processor according to claim 2, wherein, when a second fetch means among the plurality of fetch means requests a second instruction after a first fetch means has requested a first instruction and before transfer of the first instruction is completed, and transfer of the second instruction is completed before the transfer of the first instruction, the fetch priority setting means updates the priority setting for the transfer of the first instruction to a higher rank.

The processor according to claim 1, wherein the memory control means is connected to the memory through a single port and transfers an instruction to one of the fetch means at a time.