JPH04333929A - Cache memory control system

Info

Publication number: JPH04333929A
Application number: JP3105440A
Authority: JP
Inventors: Yasumasa Nakada; 中田　恭正
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 1991-05-10
Filing date: 1991-05-10
Publication date: 1992-11-20

Abstract

PURPOSE: To speed up program processing by keeping the instruction sequence of a program loop under execution resident in a cache memory until processing of the loop ends. CONSTITUTION: When the condition of a backward branch instruction is satisfied and the size from the branch target instruction through the branch instruction (the loop size) is smaller than the cache memory size, the address of the branch instruction, the address of the branch target and the loop size are set in registers 11, 12 and 13. When the same branch instruction is executed again and its condition is satisfied, a comparator 14 and an AND gate 18 detect that a loop is being executed and turn a flag 20 on. When the flag 20 is on, a cache controller 4 fetches the loop instruction sequence indicated by the registers 11 to 13 from a main memory 2 and writes it into a cache memory 3, and the loop instruction sequence in the cache memory 3 is not evicted until the comparator 14 detects that the address of the instruction to be executed exceeds the address of the branch instruction.

Description

[Detailed description of the invention]

[0001]

[Field of Industrial Application] The present invention relates to an information processing apparatus equipped with a cache memory, and more particularly to a cache memory control system suited to executing program loops.

[0002]

[Prior Art] Recent information processing apparatuses are generally equipped with a cache memory that holds a copy of part of the contents of a memory such as main memory.

[0003] In this type of information processing apparatus, when a memory access is requested, the apparatus checks whether the target instruction (or data) is present in the cache memory, that is, whether the access is a cache hit or a miss. On a cache hit, the target instruction (or data) is fetched from the cache memory, which speeds up access to the instruction (or data).

[0004] On a miss, a memory access is performed to read a block of a predetermined size containing the target instruction (or data), and the block is stored in the cache memory, so that subsequent accesses to instructions (or data) near the one that missed are faster. When the block is stored in the cache memory, whatever block previously occupied that storage location is evicted.
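
The conventional block replacement described above can be illustrated with a minimal sketch. The direct-mapped organization, the four-line capacity, the four-address block size, and the names `BlockCache` and `access` are assumptions made purely for illustration; the publication does not specify how the conventional cache is organized.

```python
class BlockCache:
    """Toy direct-mapped cache that fills one block per miss (assumed organization)."""

    def __init__(self, num_lines=4, block_size=4):
        self.num_lines = num_lines
        self.block_size = block_size
        self.tags = [None] * num_lines   # block number currently held by each line
        self.evictions = 0

    def access(self, address):
        """Return True on a hit; on a miss, load the whole block and return False."""
        block = address // self.block_size
        line = block % self.num_lines
        if self.tags[line] == block:
            return True
        if self.tags[line] is not None:
            self.evictions += 1          # the block previously stored here is evicted
        self.tags[line] = block
        return False


cache = BlockCache()
results = [cache.access(a) for a in (0, 1, 2, 16, 0)]
print(results)   # [False, True, True, False, False]
```

In this sketch, addresses 1 and 2 hit because the miss on address 0 loaded their block, while the access to address 16 evicts that block, so the final access to address 0 misses again.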

[0005]

[Problems to Be Solved by the Invention] As described above, instructions (or data) have conventionally been stored from main memory into the cache memory in units of blocks of a predetermined size. With a scheme that replaces cache contents block by block, however, even when a program contains a loop whose entire instruction sequence could be stored in the cache memory, part of that content may be invalidated or overwritten by other data, which increases the program's processing time.

[0006] The present invention has been made in view of the above circumstances. Its object is to provide a cache memory control system in which, when the instruction sequence making up the program loop being executed fits entirely in the cache memory, that instruction sequence can be reliably kept resident in the cache memory until processing of the loop ends, thereby speeding up program processing.

[0007]

[Means for Solving the Problems] The present invention is characterized in that, when a backward branch instruction is taken, it detects that a program loop consisting of an instruction sequence that fits in the cache memory is being executed, stores the loop's entire instruction sequence in the cache memory, and prevents that instruction sequence from being evicted from the cache memory until the end of the loop's processing is detected.

[0008]

[Operation] In the above configuration, when a backward branch instruction is taken, the CPU detects that the instruction sequence from the branch target instruction (indicated by the branch target address) through the branch instruction fits in the cache memory, and holds in registers the instruction address of the branch instruction, the branch target address, and the size of the instruction sequence from the branch target instruction through the branch instruction. When the same branch instruction is executed again and the branch is taken, the CPU detects that the branch instruction is being used to form a program loop, that is, that a program loop is being executed, and requests the cache controller to store the loop's instruction sequence in the cache memory.

[0009] In response, the cache controller fetches the program loop's instruction sequence from, for example, main memory and stores it in the cache memory in accordance with the contents of the registers. While the register contents are valid (that is, while the program loop is executing), the cache controller responds to access requests from the CPU not by so-called caching but by an ordinary memory access that fetches the corresponding instruction of the program loop and sends it to the CPU.

[0010] The register contents are cleared when the instruction address of the instruction being executed exceeds the branch instruction address held in the register, on the assumption that processing of the program loop has ended (that execution has left the loop). Once the register contents are cleared, the cache controller performs caching for access requests from the CPU and, on a cache miss, stores the block containing the requested instruction in the cache memory.

[0011]

[Embodiment] FIG. 1 is a block diagram showing one embodiment of an information processing apparatus having a cache memory to which the present invention is applied.

[0012] In FIG. 1, reference numeral 1 denotes a CPU that forms the core of the apparatus and executes instructions; 2 denotes a main memory used to store various programs and data; 3 denotes a cache memory (instruction cache) that holds a copy of part of the contents of main memory 2, for example part of a program; and 4 denotes a cache controller that accesses cache memory 3 or main memory 2 in response to main memory access requests from CPU 1. The cache memory that holds a copy of part of the data in main memory 2 (the data cache) is omitted from FIG. 1.

[0013] CPU 1 includes a register 11 (REG#1) for holding the instruction address of a conditional branch instruction that specifies a backward branch target, a register 12 (REG#2) for holding the branch target address specified by that conditional branch instruction, and a register 13 (REG#3). Register 13 holds the size of the instruction sequence from the instruction at the branch target address specified by the conditional branch instruction through the branch instruction itself (hereinafter, the loop size). Registers 11 to 13 can be read by cache controller 4.

[0014] CPU 1 also includes a comparator 14 that compares the instruction address of the instruction being executed (the instruction about to be executed) with the contents of register 11 (REG#1), namely the branch instruction address. When comparator 14 detects that the instruction address of the instruction being executed matches the contents of register 11, it outputs a loop-forming branch instruction detection signal 15, indicating that a branch instruction forming a program loop is being executed. When it detects that the instruction address of the instruction being executed exceeds the contents of register 11, it outputs a loop end detection signal 16, indicating the end of the program loop's processing. Loop end detection signal 16 is used as a clear signal for clearing the contents of registers 11 to 13 (REG#1 to #3).
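
The behavior of comparator 14 described above can be sketched as a small function. The function name, the use of `None` to represent a cleared register 11, and the choice that a cleared register produces neither signal are assumptions for illustration; the publication does not state how a cleared register is encoded.

```python
def comparator_14(current_address, reg1):
    """Model of comparator 14; returns (signal_15, signal_16).

    reg1 is the content of register 11, or None when cleared (an assumed encoding).
    """
    if reg1 is None:
        # Assumption: a cleared register produces neither signal.
        return (False, False)
    signal_15 = current_address == reg1   # loop-forming branch instruction detected
    signal_16 = current_address > reg1    # loop end detected
    return (signal_15, signal_16)


print(comparator_14(0x1020, 0x1020))  # (True, False)
print(comparator_14(0x1024, 0x1020))  # (False, True)
print(comparator_14(0x1010, 0x1020))  # (False, False)
```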

[0015] CPU 1 further includes an AND gate 18 and a loop instruction sequence storage flag (F) 20. AND gate 18 takes the logical AND of the loop-forming branch instruction detection signal 15 from comparator 14 and a branch taken signal 17, which indicates that the condition of a conditional branch instruction was satisfied when it was executed (that the branch was taken). Flag 20 is set (turned on) in response to the output signal 19 of AND gate 18 (hereinafter, the loop-executing detection signal). Flag 20 instructs cache controller 4 to store the instruction sequence of the program loop specified by the contents of registers 11 to 13 (the loop instruction sequence) in cache memory 3 and to access the cache memory without caching. Flag 20 is reset (turned off) in response to the output signal of an OR gate 22, which takes the logical OR of loop end detection signal 16 and a REG rewrite signal 21 indicating that registers 11 to 13 are being rewritten.
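
The set and reset conditions of flag 20 can be written out as a short sketch. The class name `LoopFlag`, the `update` method, and the rule that a reset takes priority when set and reset arrive together are assumptions; the publication describes only the gates and the signals that drive them.

```python
class LoopFlag:
    """Model of loop instruction sequence storage flag 20 with gates 18 and 22."""

    def __init__(self):
        self.on = False

    def update(self, signal_15, signal_17, signal_16=False, signal_21=False):
        signal_19 = signal_15 and signal_17   # AND gate 18: loop-executing detection
        reset = signal_16 or signal_21        # OR gate 22: loop end or REG rewrite
        if reset:
            self.on = False                   # assumption: reset wins over set
        elif signal_19:
            self.on = True
        return self.on


flag = LoopFlag()
first = flag.update(signal_15=True, signal_17=False)    # branch reached, not taken
second = flag.update(signal_15=True, signal_17=True)    # branch reached and taken
third = flag.update(signal_15=False, signal_17=False, signal_16=True)  # loop end
print(first, second, third)  # False True False
```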

[0016] Next, the operation of the configuration of FIG. 1 is described with reference to the flowcharts of FIGS. 2 to 4. First, when CPU 1 executes a backward conditional branch instruction (that is, a conditional branch instruction whose target, when the condition is satisfied, is at an address before the branch instruction) and the condition is satisfied so that the branch is taken, it subtracts the branch target address from the address of the branch instruction (which gives the absolute value of the relative address) and adds the size of the branch instruction itself (its instruction length). The result is the size of the instruction sequence from the branch target instruction through the branch instruction, that is, the loop size on the assumption that the branch instruction forms a program loop (step S1 in FIG. 2).
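
The calculation of step S1 can be shown as a short worked example. The function name and the sample addresses are hypothetical, and the check that rejects a non-backward branch is an added assumption.

```python
def loop_size(branch_address, target_address, branch_length):
    """Step S1: size of the instruction sequence from the target through the branch."""
    if target_address >= branch_address:
        # Assumption: only backward branches are considered loop candidates.
        raise ValueError("not a backward branch")
    return (branch_address - target_address) + branch_length


print(loop_size(0x1020, 0x1000, 4))  # 36
```

Here the branch at address 0x1020 jumps back to 0x1000, a distance of 32, and the 4-unit branch instruction itself brings the loop size to 36.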

[0017] Next, CPU 1 compares the calculated loop size with the size of cache memory 3 (the cache size). If it detects that the loop size is no larger than the cache size, that is, that the instruction sequence of the program loop fits in cache memory 3, and further that registers 11 to 13 (REG#1 to #3) are all cleared (steps S2 and S3 in FIG. 2), it sets the address of the branch instruction being executed in register 11, the branch target address specified by that branch instruction in register 12, and the calculated loop size in register 13 (step S4 in FIG. 2).

[0018] Even when some values are already set in registers 11 to 13, if the calculated loop size is larger than the loop size indicated by register 13 (REG#3) (and no larger than the cache size), CPU 1 outputs an active REG rewrite signal 21 (steps S5 and S6 in FIG. 2) and then performs the register setting of step S4. Registers 11 to 13 are thereby rewritten with information about a new program loop that is larger than the one previously set. At this time, the output signal of OR gate 22 becomes true in response to the active REG rewrite signal 21, which turns loop instruction sequence storage flag 20 off.
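
The register setting flow of FIG. 2 (steps S1 to S6) can be sketched as follows. The `LoopRegisters` class, the function name, the use of `None` for a cleared register, and the sample addresses are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LoopRegisters:
    reg1: Optional[int] = None  # register 11: branch instruction address
    reg2: Optional[int] = None  # register 12: branch target address
    reg3: Optional[int] = None  # register 13: loop size

    def is_cleared(self):
        return self.reg1 is None and self.reg2 is None and self.reg3 is None


def on_backward_branch_taken(regs, branch_address, target_address,
                             branch_length, cache_size):
    """Steps S1 to S6 of FIG. 2; returns True if REG rewrite signal 21 is asserted."""
    size = (branch_address - target_address) + branch_length   # S1
    if size > cache_size:                                      # S2: loop does not fit
        return False
    rewrite = False
    if not regs.is_cleared():                                  # S3
        if size <= regs.reg3:                                  # S5: not a larger loop
            return False
        rewrite = True                                         # S6
    regs.reg1, regs.reg2, regs.reg3 = branch_address, target_address, size  # S4
    return rewrite


regs = LoopRegisters()
r1 = on_backward_branch_taken(regs, 0x1020, 0x1010, 4, cache_size=64)
r2 = on_backward_branch_taken(regs, 0x1018, 0x1000, 4, cache_size=64)
r3 = on_backward_branch_taken(regs, 0x2000, 0x1000, 4, cache_size=64)
print(r1, r2, r3, regs.reg3)  # False True False 28
```

The first call fills the cleared registers with a 20-unit loop, the second replaces it with a larger 28-unit loop and asserts the rewrite signal, and the third is ignored because that loop exceeds the assumed 64-unit cache.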

[0019] When the branch is taken (the branch condition is satisfied), CPU 1 executes the instruction sequence from the branch target instruction onward in order. Each instruction is supplied by cache controller 4 as follows, in response to an instruction access request that CPU 1 issues to cache controller 4 together with the instruction address.

[0020] That is, when cache controller 4 receives an instruction access request from CPU 1, it checks the state of loop instruction sequence storage flag 20 (step S21 in FIG. 4). If the flag is off, as it is at this point in the present example, it accesses cache memory 3 by so-called caching (step S23 in FIG. 4), obtains the requested instruction, and outputs it to CPU 1. On a miss, of course, the block containing the requested instruction is read from main memory 2 and written to the corresponding area of cache memory 3.

[0021] The instruction sequence from the branch target instruction onward is executed in order in this way. Eventually the instruction whose address equals the instruction address held in register 11, that is, the conditional branch instruction that triggered the setting of registers 11 to 13, is executed again, and comparator 14 outputs an active loop-forming branch instruction detection signal 15.

[0022] If the condition of the re-executed conditional branch instruction is satisfied at this time, a decoder (not shown) outputs an active branch taken signal 17 indicating that the condition is satisfied. As a result, AND gate 18 outputs an active loop-executing detection signal 19, indicating that the re-executed conditional branch instruction is a branch instruction forming a program loop and therefore that a program loop is being executed, and loop instruction sequence storage flag 20 turns on.

[0023] When loop instruction sequence storage flag 20 turns on, cache controller 4 executes the loop instruction sequence storage process shown in the flowchart of FIG. 3, as described below.

[0024] That is, cache controller 4 first reads the contents of registers 12 and 13 (REG#2, #3) in CPU 1, namely the branch target address and the loop size (step S11 in FIG. 3). It then fetches from main memory 2, in order, a loop instruction sequence of the size indicated by register 13, starting at the branch target address indicated by register 12 (step S12 in FIG. 3).

[0025] Cache controller 4 then writes the instruction sequence fetched from main memory 2, that is, the instruction sequence (loop instruction sequence) from the branch target instruction specified by the branch target address held in register 12 through the branch instruction specified by the instruction address held in register 11, into cache memory 3 in order starting from its first address (step S13 in FIG. 3). Note that in this embodiment, once execution of a program loop is detected, the loop's entire instruction sequence is written to cache memory 3, which differs from the block write performed on a miss in conventional caching.
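
Steps S11 to S13 can be sketched as a copy from main memory into the start of the cache memory. Modeling both memories as Python lists indexed by address, and the function name `store_loop`, are assumptions made for illustration.

```python
def store_loop(main_memory, cache_memory, reg2, reg3):
    """Copy the loop instruction sequence into the cache, starting at cache address 0.

    reg2 is the branch target address and reg3 the loop size (step S11).
    """
    loop = main_memory[reg2:reg2 + reg3]   # S12: fetch the loop from main memory
    cache_memory[0:len(loop)] = loop       # S13: write from the first cache address
    return cache_memory


main_memory = list(range(100, 200))        # toy contents: address a holds 100 + a
cache_memory = [None] * 16
store_loop(main_memory, cache_memory, reg2=40, reg3=6)
print(cache_memory[:8])  # [140, 141, 142, 143, 144, 145, None, None]
```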

[0026] Meanwhile, when the condition of the re-executed branch instruction is satisfied, CPU 1 executes the instruction sequence from the branch target instruction onward in order. Each instruction is supplied by cache controller 4 as follows, in response to an instruction access request that CPU 1 issues to cache controller 4 together with the instruction address.

[0027] That is, when cache controller 4 receives an instruction access request from CPU 1, it checks the state of loop instruction sequence storage flag 20 (step S21 in FIG. 4). If the flag is on, as in this example, it subtracts the contents of register 12 (REG#2), namely the branch target address, from the requested instruction address, uses the result as the address in cache memory 3, and read-accesses cache memory 3 in the same way as an ordinary memory access (step S22 in FIG. 4) to obtain the requested instruction and output it to CPU 1. In this case the cache hit/miss detection mechanism used for caching is suppressed or ignored and no caching takes place, so no part of the executing loop instruction sequence written to cache memory 3 is evicted.
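
The access path of FIG. 4 (steps S21 to S23) can be sketched as follows. The function name, the list-based memories, and the `cached_fetch` callable that stands in for the conventional caching path are hypothetical names introduced for illustration.

```python
def fetch_instruction(address, flag_on, reg2, cache_memory, cached_fetch):
    """Return the instruction at `address`, following steps S21 to S23 of FIG. 4."""
    if flag_on:                                # S21: flag 20 is on
        # S22: address the cache directly, offset by the branch target address,
        # with no hit/miss check and therefore no replacement.
        return cache_memory[address - reg2]
    return cached_fetch(address)               # S23: conventional caching path


main_memory = [f"insn@{a}" for a in range(64)]
cache_memory = main_memory[40:46]              # loop stored from cache address 0
pinned = fetch_instruction(42, True, 40, cache_memory, lambda a: main_memory[a])
normal = fetch_instruction(50, False, None, cache_memory, lambda a: main_memory[a])
print(pinned, normal)  # insn@42 insn@50
```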

[0028] CPU 1 executes the loop instruction sequence in order in this way. Suppose that eventually the branch instruction specified by the instruction address held in register 11 is executed, the branch is not taken, and access to the next instruction is requested. Because the instruction address of that instruction is greater than the instruction address held in register 11 (the address of the branch instruction), comparator 14 outputs an active loop end detection signal 16 indicating the end of the program loop's processing (exit from the loop). As a result, the contents of registers 11 to 13 (REG#1 to #3) are cleared and loop instruction sequence storage flag 20 turns off. Once flag 20 is off, cache controller 4 accesses cache memory 3 by caching in response to instruction access requests from CPU 1 (steps S21 and S23 in FIG. 4).
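
The life cycle of flag 20 over a whole loop can be traced with a small simulation. This is a simplified sketch under stated assumptions: a single counted loop, a fixed instruction length, a loop that is taken to fit in the cache, and only register 11 modeled. The function name and the trace format are hypothetical.

```python
def trace_loop(branch_address, target_address, iterations, instruction_length=4):
    """Return (address, flag_on) for each instruction fetched in a counted loop."""
    reg1 = None      # register 11; registers 12 and 13 are omitted in this sketch
    flag = False     # loop instruction sequence storage flag 20
    events = []
    pc = target_address
    remaining = iterations
    while pc <= branch_address + instruction_length:
        if reg1 is not None and pc > reg1:     # comparator 14: loop end signal 16
            reg1, flag = None, False
        events.append((pc, flag))
        if pc == branch_address and remaining > 1:   # branch taken
            if reg1 == pc:
                flag = True                    # signals 15 and 17 through AND gate 18
            elif reg1 is None:
                reg1 = pc                      # step S4 (loop assumed to fit)
            remaining -= 1
            pc = target_address
        else:
            pc += instruction_length
    return events


events = trace_loop(branch_address=8, target_address=0, iterations=3)
print([flag for _, flag in events])
# [False, False, False, False, False, False, True, True, True, False]
```

In this trace the first taken branch sets register 11, the second turns the flag on, the third pass through the loop is served with the flag on, and the first fetch beyond the branch clears the register and the flag.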

[0029] Although the above embodiment provides register 13 (REG#3) to hold the loop size, when the instruction length is fixed the loop size can be obtained from the contents of registers 11 and 12, so register 13 is not strictly necessary. The processing of CPU 1 shown in the flowchart of FIG. 2 can also be readily implemented with hardware circuitry.

[0030]

[Effects of the Invention] As described in detail above, according to the present invention, when a backward branch instruction is taken, it is detected that a program loop consisting of an instruction sequence that fits in the cache memory is being executed, the loop's entire instruction sequence is stored in the cache memory, and that instruction sequence is kept from being evicted from the cache memory until the end of the loop's processing is detected. The instructions making up the program loop can therefore be fetched at high speed, and program processing is accelerated.

[Brief explanation of the drawing]

[FIG. 1] A block diagram showing one embodiment of an information processing apparatus having a cache memory to which the present invention is applied.

[FIG. 2] A flowchart for explaining the register setting operation in CPU 1 of FIG. 1.

[FIG. 3] A flowchart for explaining the loop instruction sequence storage operation in cache controller 4 of FIG. 1.

[FIG. 4] A flowchart for explaining the cache access operation in cache controller 4 of FIG. 1.

[Explanation of symbols]

1: CPU; 2: main memory; 3: cache memory; 4: cache controller; 11: register (REG#1); 12: register (REG#2); 13: register (REG#3); 14: comparator (first and second detection means); 18: AND gate (first detection means); 20: loop instruction sequence storage flag (F).

Claims

[Claims]

1. A cache memory control system for an information processing apparatus equipped with a cache memory, comprising: first detection means for detecting that a program loop consisting of an instruction sequence that fits in the cache memory is being executed; means for storing the entire instruction sequence of the program loop in the cache memory in accordance with the detection result of the first detection means; and second detection means for detecting the end of processing of the program loop, wherein the instruction sequence of the program loop is not evicted from the cache memory until the end is detected by the second detection means.

2. A cache memory control system for an information processing apparatus equipped with a cache memory, comprising: a first register for holding the instruction address of a branch instruction that specifies a backward branch target; a second register for holding the branch target address specified by the branch instruction; setting means for setting information in the first and second registers upon detecting, when the branch instruction is taken, that the instruction sequence from the branch target instruction through the branch instruction fits in the cache memory; first detection means for detecting that a program loop consisting of an instruction sequence that fits in the cache memory is being executed, by detecting that the branch instruction at the instruction address held in the first register is taken; means for storing in the cache memory, in accordance with the detection result of the first detection means, the entire instruction sequence from the instruction at the branch target address held in the second register through the branch instruction at the instruction address held in the first register; and second detection means for detecting the end of processing of the program loop by detecting that the address of the instruction being executed exceeds the instruction address held in the first register, wherein the instruction sequence of the program loop is not evicted from the cache memory until the end is detected by the second detection means.

3. A cache memory control system for an information processing apparatus equipped with a cache memory, comprising: a first register for holding the instruction address of a branch instruction that specifies a backward branch target; a second register for holding the branch target address specified by the branch instruction; a third register for holding the size of the instruction sequence from the instruction at the branch target address specified by the branch instruction through the branch instruction; setting means for setting information in the first to third registers when, upon the branch instruction being taken, it is detected that the instruction sequence from the branch target instruction through the branch instruction fits in the cache memory, and either no valid information is set in the first to third registers or the size indicated by the third register is smaller than the size of the instruction sequence from the instruction at the branch target address specified by the taken branch instruction through that branch instruction; first detection means for detecting that a program loop consisting of an instruction sequence that fits in the cache memory is being executed, by detecting that the branch instruction at the instruction address held in the first register is taken; means for storing in the cache memory, in accordance with the detection result of the first detection means, the entire instruction sequence from the instruction at the branch target address held in the second register through the branch instruction at the instruction address held in the first register; and second detection means for detecting the end of processing of the program loop by detecting that the address of the instruction being executed exceeds the instruction address held in the first register, wherein the instruction sequence of the program loop is not evicted from the cache memory until the end is detected by the second detection means.

4. A cache memory control system in which the contents of the first and second registers are cleared in response to detection, by the second detection means, of the end of processing of the program loop.