JPH1011288A - Microprocessor - Google Patents

Microprocessor

Info

Publication number
JPH1011288A
Authority
JP
Japan
Prior art keywords
instruction
sequence
changed
instruction sequence
execution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
JP16301996A
Other languages
Japanese (ja)
Other versions
JP3547562B2 (en)
Inventor
Masatake Fujii
方毅 藤井
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Toshiba Corp
Original Assignee
Toshiba Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Toshiba Corp
Priority to JP16301996A
Publication of JPH1011288A
Application granted
Publication of JP3547562B2
Anticipated expiration
Expired - Fee Related

Landscapes

  • Advance Control (AREA)

Abstract

PROBLEM TO BE SOLVED: To provide a microprocessor that shortens processing time by executing first any instruction that, judged by the substantive content of the processing, can be executed ahead, even when it follows an instruction whose resource has not been acquired. SOLUTION: A series of instructions, including an instruction whose resource could not be acquired by an execution device 3, is fetched by an instruction fetcher 1, decoded by an instruction decoder 2, and stored in an instruction buffer 5. An optimizing device 6 changes the execution order of this sequence according to a prescribed algorithm, evaluates the execution time of the sequences, and stores the changed sequence in a modified instruction buffer 7. When the changed sequence is evaluated as advantageous, the execution device 3 executes it, and the optimizing device 6 re-evaluates and re-changes the changed sequence according to the resource acquisition status.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical field of the invention] This invention relates to a microprocessor that improves the execution throughput of an instruction sequence by changing its execution order in a situation where subsequent instructions cannot be executed because an external or internal resource cannot be acquired and instruction fetch has stopped.

[0002]

[Prior art] In a scalar processor, if the data for an operand had not yet been determined when an instruction was to be executed, the processor stalled until the value was determined. In a pipelined processor, instruction fetch and decode were still carried out for subsequent instructions until the operand fetch of the stalled instruction took place. In more advanced processors, the instruction cycle continued until the execution stage for that data was reached.

[0003] In a superscalar processor with in-order issue and in-order completion, or with in-order issue and out-of-order completion, if the first of the simultaneously fetched instructions could not obtain its data, all subsequent instructions stalled at operand fetch. With out-of-order completion, particularly at the execution stage, the operand stage and the execution stage of a subsequent instruction were both carried out concurrently when that instruction had no true dependency and no output dependency.

[0004] In a superscalar processor with out-of-order issue and out-of-order completion, the operand stage and the execution stage of a subsequent instruction were both carried out ahead of the stalled instruction when the subsequent instruction had no true dependency, output dependency, or anti-dependency. In a processor with multiple functional units, of which the superscalar processor is typical, instructions that use the same kind of unit could execute simultaneously, but when the number of units was exceeded a stall occurred and the processor waited until it was resolved.
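
The three dependency kinds named in this paragraph (true, output, and anti-dependency) can be made concrete with a small sketch. The `Instr` record and the `dependency_kinds` helper are illustrative names that do not appear in the patent, and the register names are chosen only for the example.

```python
from typing import FrozenSet, NamedTuple, Set


class Instr(NamedTuple):
    """Hypothetical instruction record: the locations it reads and writes."""
    text: str
    reads: FrozenSet[str]
    writes: FrozenSet[str]


def dependency_kinds(earlier: Instr, later: Instr) -> Set[str]:
    """Return the kinds of dependency that `later` has on `earlier`."""
    kinds: Set[str] = set()
    if earlier.writes & later.reads:
        kinds.add("true")    # read-after-write
    if earlier.writes & later.writes:
        kinds.add("output")  # write-after-write
    if earlier.reads & later.writes:
        kinds.add("anti")    # write-after-read
    return kinds


load = Instr("load r1", frozenset({"mem"}), frozenset({"r1"}))
use = Instr("add r1,r2 -> r3", frozenset({"r1", "r2"}), frozenset({"r3"}))
free = Instr("add r4,r5 -> r6", frozenset({"r4", "r5"}), frozenset({"r6"}))

print(dependency_kinds(load, use))   # the consumer must wait for the load
print(dependency_kinds(load, free))  # empty set: may be issued ahead
```

An instruction for which the helper returns an empty set against every stalled predecessor is the kind that a conventional out-of-order machine may issue ahead.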

[0005] For an instruction with a memory-read operand, the data could not be used while the cache memory serving as the resource missed and the fetch from main memory was in progress. The instruction therefore stalled, and subsequent instructions that use its result could not be executed either. Likewise, when there was a conditional branch instruction that depends on the result of some arithmetic operation, the branch target was not determined, and all subsequent instructions were treated as dependent: they were fetched by branch prediction but not executed.

[0006] On the other hand, in a superscalar processor that issues instructions out of order, when a stall occurred because a resource was not allocated, subsequent instructions with no dependency were executed first, and execution of the stalled instruction resumed when the resource was obtained. The dependencies examined at that time were the dependencies exactly as specified by the fetched instruction sequence. From the viewpoint of what a group of instructions actually computes, there were cases in which data relationships that need not be honored were being treated as dependencies.

[0007]

[Problem to be solved by the invention] As described above, in a conventional scalar processor a stall occurs when a resource required to execute an instruction cannot be acquired; subsequent instructions cannot be executed, and processing efficiency falls.

[0008] In a conventional superscalar processor, on the other hand, when a resource needed to execute an instruction could not be acquired, subsequent instructions that had no dependency on the blocked instruction were executed first. However, when dependencies between instructions were examined, they were not examined from the viewpoint of the substantive processing that the instructions jointly implement. Consequently, subsequent instructions that would be executable from that viewpoint were regarded as dependent by the conventional method and were not executed, which lowered processing efficiency.

[0009] This invention was made in view of the above. Its object is to provide a microprocessor that shortens processing time by executing first those instructions that, from the viewpoint of the substantive processing content, can be executed ahead, even if they follow an instruction whose resource has not been acquired.

[0010]

[Means for solving the problem] To achieve the above object, the invention according to claim 1 comprises: an instruction decoder that decodes a fetched instruction sequence and, when a resource required to execute a decoded instruction cannot be acquired, attaches resource information to that instruction and outputs it; an execution device that receives an instruction decoded by the decoder, or an instruction of an instruction sequence whose execution order has been changed, acquires the resources needed to execute the instruction and executes it, and, when a resource cannot be acquired, outputs resource information to the instruction decoder and outputs resource change information indicating whether the resource has been acquired; an instruction buffer that stores the instruction sequence decoded by the decoder and the instruction sequence obtained by changing its execution order; an instruction optimizing device that changes, according to preset rules, the execution order of a predetermined instruction sequence that starts with and includes the instruction whose resource has not been acquired, generates one or more instruction sequences that perform processing equivalent to that of the original sequence, evaluates the execution times of the sequences before and after the change, and, if there is a changed sequence evaluated as having a shorter execution time than the original, selects it, stores it in the instruction buffer and supplies it to the execution device, and re-changes and re-evaluates the execution order of the sequence stored in the instruction buffer on the basis of the resource change information output from the execution device each time a changed instruction is executed; and a modified instruction buffer that stores the one or more reordered instruction sequences generated by the instruction optimizing device.

[0011] The invention according to claim 2 is the microprocessor according to claim 1, characterized in that, when a branch instruction is executed, optimization is evaluated for the branch-predicted, prefetched instruction sequence; if it is judged advantageous, branch prediction is performed on the changed instruction sequence; and if an instruction sequence containing a branch instruction has been optimized, the decision is made by weighing the evaluation of the changed instruction sequences together with the evaluation of the candidates given by ordinary branch prediction.

[0012] The invention according to claim 3 is the microprocessor according to claim 1, characterized in that, when an interrupt occurs, only the instruction buffer is saved; during interrupt processing, optimization evaluation is performed on the current interrupt processing; and on return to the original instructions, the resources are interpreted as of that moment and an optimized instruction sequence is generated by changing the execution order.

[0013]

[Embodiments of the invention] FIG. 1 is a diagram showing the configuration of a microprocessor according to one embodiment of the invention according to claim 1.

[0014] In FIG. 1, the microprocessor comprises an instruction fetcher 1 that fetches instructions, an instruction decoder 2 that decodes the instructions fetched by the instruction fetcher 1, an execution device 3 that executes the instructions decoded by the instruction decoder 2, and an instruction sequence interpreting device 4 that receives the instructions decoded by the instruction decoder 2, interprets them, and changes their execution order.

[0015] When a resource required to execute a decoded instruction cannot be acquired, the instruction decoder 2 attaches the resource information supplied by the execution device 3 to the instruction whose resource cannot be acquired, and passes that instruction, together with the series of decoded instructions that follow it, to the instruction sequence interpreting device 4.

[0016] When executing an instruction decoded by the instruction decoder 2 or an instruction supplied by the instruction sequence interpreting device 4, the execution device 3 acquires the resources needed for execution. If a resource cannot be acquired, it outputs resource information indicating that fact to the instruction decoder 2. It also outputs resource change information, which indicates whether a resource has been acquired, to the instruction sequence interpreting device 4.

[0017] The instruction sequence interpreting device 4 comprises: an instruction buffer 5, which stores the instruction sequence decoded by the instruction decoder 2, or the sequence whose execution order has been changed, together with an evaluation value of the execution time it requires, and from which the stored sequence is supplied to the execution device 3; an optimizing device 6, which generates one or more instruction sequences by changing the execution order of the sequence stored in the instruction buffer 5; and a modified instruction buffer 7, which stores the one or more reordered sequences produced by the optimizing device 6. In this embodiment there is a single modified instruction buffer 7, and it stores a single instruction sequence reordered by the optimizing device 6.

[0018] The optimizing device 6 receives from the instruction buffer 5 a predetermined instruction sequence that starts with and includes the instruction whose resource has not been acquired. It changes the execution order of this sequence according to preset rules and generates an instruction sequence that performs processing equivalent to that of the original. It evaluates the execution times of the sequences before and after the change and stores the changed sequence, together with its evaluation value, in the modified instruction buffer 7. If the changed sequence is evaluated as having a shorter execution time than the original, the optimizing device 6 selects it, stores it in the instruction buffer 5, and supplies it to the execution device 3. Each time a changed instruction is executed, the optimizing device 6 re-changes and re-evaluates the execution order of the changed sequence in the instruction buffer 5 on the basis of the resource change information output from the execution device 3.
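
The evaluate-and-select step described in this paragraph can be sketched as follows. The patent does not say how execution time is estimated, so the cost model used here (in-order execution, one cycle per instruction, and an instruction cannot start before its pending operand arrives at cycle `ready_at`) is an assumption made purely for illustration, as are all function and variable names.

```python
from typing import Dict, List, Sequence, Set, Tuple


def estimate_cycles(seq: Sequence[str],
                    needs: Dict[str, Set[str]],
                    ready_at: Dict[str, int]) -> int:
    """Assumed cost model: one cycle per instruction, executed in order;
    an instruction waits until every operand it needs has arrived."""
    t = 0
    for ins in seq:
        for operand in needs[ins]:
            t = max(t, ready_at.get(operand, 0))  # operands not listed are ready
        t += 1
    return t


def select_sequence(original: Sequence[str],
                    candidates: Sequence[Sequence[str]],
                    needs: Dict[str, Set[str]],
                    ready_at: Dict[str, int]) -> Tuple[List[str], int]:
    """Keep the original unless a candidate is estimated to be strictly faster."""
    best, best_cost = list(original), estimate_cycles(original, needs, ready_at)
    for cand in candidates:
        cost = estimate_cycles(cand, needs, ready_at)
        if cost < best_cost:
            best, best_cost = list(cand), cost
    return best, best_cost


# Hypothetical situation: operand "a" is expected to arrive at cycle 10.
needs = {"i1": {"a"}, "i2": {"b"}, "i3": {"c"}, "i4": {"d"}}
ready_at = {"a": 10}
original = ["i1", "i2", "i3", "i4"]
candidates = [["i2", "i3", "i4", "i1"]]

print(estimate_cycles(original, needs, ready_at))               # 14
print(select_sequence(original, candidates, needs, ready_at))   # (['i2', 'i3', 'i4', 'i1'], 11)
```

Re-evaluation on new resource change information corresponds, in this sketch, to calling `select_sequence` again with an updated `ready_at` table.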

[0019] In this configuration, the instruction sequence fetched by the instruction fetcher 1 is taken in, and when resource acquisition in the execution device 3 fails, the sequence is held in the instruction buffer 5 as unexecuted. While subsequent instructions continue to be fetched, the optimizing device 6, which extracts certain fixed patterns, examines the sequence held in the instruction buffer 5, extracts the instruction patterns that can be executed ahead, and evaluates the execution speed of the remaining sequence. If the result is judged advantageous, the executable part of the modified sequence is executed.

[0020] Among the instructions that out-of-order issue interprets as dependent and holds waiting, the device interprets the meaning of the computation performed by the sequence, picks out the parts whose dependencies need not be considered, and processes those that can be executed ahead. When a resource has been acquired, that information is sent to the instruction sequence interpreting device 4, the evaluation value in the instruction buffer 5 is updated, and the instruction order is optimized again.

[0021] Next, with reference to FIG. 2, the operation of the microprocessor shown in FIG. 1 when executing the computation Y = a + b + c + d is described.

[0022] For the computation Y = a + b + c + d, suppose the compiler generates the following code.

  add A,a,A (instruction 1)
  add A,b,A (instruction 2)
  add A,c,A (instruction 3)
  add A,d,A (instruction 4)

Here A denotes the register that holds the running sum.

[0023] When this sequence is executed and the execution device 3 cannot obtain data a, the instruction decoder 2 marks data a as unknown and sends instruction 1 to the instruction buffer 5; ordinarily the subsequent instructions 2, 3, and 4 then wait. Even a processor with an out-of-order issue mechanism regards instructions 2, 3, and 4 as having a true dependency on register A and holds them waiting.

[0024] However, the original computation Y is a sum, which is symmetric under the associative law, so it may start from any of the additions. When the compiler knows the order in advance it generates code accordingly, but when the order is determined in real time or as the outcome of an extremely complex computation, the order of the instruction sequence in memory cannot be fixed beforehand.

[0025] For this reason, at the moment resource information is output from the execution device 3 and a stall occurs, the optimizing device 6 recognizes this instruction pattern and optimizes the order. Resources whose availability has not yet been determined are treated as available. For example, if instructions 2, 3, and 4 have already been decoded when it becomes known that the operand resource for data a has not been acquired, the optimizing device 6 changes the execution order of the sequence to

  add A,b,A (instruction 2)
  add A,c,A (instruction 3)
  add A,d,A (instruction 4)
  add A,a,A (instruction 1)

attaches an evaluation value, and writes the result to the modified instruction buffer 7. By comparing the evaluation value before the reordering (reference numeral 8) with the evaluation value after it (reference numeral 9), the changed sequence is judged more effective, and its instructions are copied from the bottom of the modified instruction buffer 7 into the instruction buffer 5 and transferred to the execution device 3.
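
A minimal sketch of this reordering follows, assuming every instruction in the chain has the form `add A,x,A` so that only the operand `x` distinguishes one instruction from another. The names `reorder_chain` and `run` are hypothetical. The reordering preserves the result because integer addition is associative and commutative; floating-point addition would not in general give bit-identical results, a point the text does not address.

```python
from typing import Dict, List, Set, Tuple

Add = Tuple[int, str]  # (instruction number, operand added into register A)


def reorder_chain(chain: List[Add], unavailable: Set[str]) -> List[Add]:
    """Move additions whose operand is not yet available to the end,
    keeping the relative order within each group."""
    ready = [ins for ins in chain if ins[1] not in unavailable]
    blocked = [ins for ins in chain if ins[1] in unavailable]
    return ready + blocked


def run(chain: List[Add], values: Dict[str, int], acc: int = 0) -> int:
    """Execute the accumulate chain once all operand values are known."""
    for _, operand in chain:
        acc += values[operand]
    return acc


chain = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]
reordered = reorder_chain(chain, unavailable={"a"})
print([number for number, _ in reordered])  # [2, 3, 4, 1]

values = {"a": 5, "b": 7, "c": 11, "d": 13}
print(run(chain, values) == run(reordered, values))  # True
```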

[0026] If data b also turns out to be undetermined, resource change information is sent to the instruction sequence interpreting device 4, the instruction buffer 5 is re-evaluated and re-optimized, and the modified instruction buffer 7 is rewritten. The instruction sequence is evaluated and optimized in real time by repeating this process.

[0027] Next, an example of operation when executing the computation Y = (a + b)(a - b) is described with reference to FIG. 3.

[0028] Suppose the compiler generates the following code for the computation Y = (a + b)(a - b).

  add a,b,c (instruction 1)
  sub a,b,d (instruction 2)
  mul c,d,c (instruction 3)

[0029] If data a is undetermined when this sequence is taken into the instruction buffer 5, the optimizing device 6 groups these instructions and converts them into the following sequence (Y = a^2 - b^2).

  mul b,b,d (instruction 4)
  mul a,a,c (instruction 5)
  sub d,c,c (instruction 6)

[0030] These are evaluated and transferred to the instruction buffer 5. Instruction 4 no longer depends on data a and can be executed. After it executes, instructions 5 and 6 remain and can be executed as soon as data a is determined. Before the optimization, two additions/subtractions and one multiplication were waiting; after it, only one multiplication and one subtraction remain, so the computation can finish soon after data a is determined. Thus, if instruction 4 can be executed on an idle arithmetic unit and data a remains undetermined in the meantime, the changed sequence is evaluated as more effective and is selected.
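
The rewrite relies on the identity (a + b)(a - b) = a^2 - b^2. The sketch below checks that identity on a range of integers and records how many operations must wait for data a before and after the change. The function names are illustrative only. The subtraction is written as `a*a - b*b` to match the stated formula; the operand order of the `sub` mnemonic is not defined in the text, so that mapping is an assumption.

```python
def original(a: int, b: int) -> int:
    c = a + b              # instruction 1: needs a
    d = a - b              # instruction 2: needs a
    return c * d           # instruction 3: needs both results above


def early_part(b: int) -> int:
    return b * b           # instruction 4: needs only b, can run during the stall


def late_part(a: int, b_squared: int) -> int:
    c = a * a              # instruction 5: needs a
    return c - b_squared   # instruction 6: needs the result of instruction 5


# Operations that cannot complete until data a arrives.
WAITING_ON_A_BEFORE = 3    # add, sub, mul
WAITING_ON_A_AFTER = 2     # mul, sub

for a in range(-5, 6):
    for b in range(-5, 6):
        assert original(a, b) == late_part(a, early_part(b))
print("equivalent on all tested inputs")
```

The identity also holds under fixed-width wraparound integer arithmetic, but with floating-point operands the two forms can round differently, so a real implementation would need to decide whether such rewrites are acceptable.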

[0031] Thus, in the above embodiment, when a processor runs a program in which stalls frequently arise from resource acquisition constraints during instruction execution, there is inherent slack time in which to optimize the instruction sequence, and using that time improves computation speed.

[0032] When a branch instruction is executed, optimization is evaluated for the branch-predicted, prefetched instruction sequence. If it is judged advantageous, branch prediction is performed on the modified sequence. If a sequence that contains a branch instruction has been optimized, the decision is made by weighing the evaluation of the modified sequences together with the evaluation of the candidates given by ordinary branch prediction.

[0033] When an interrupt occurs, only the instruction buffer 5 is saved, and during interrupt processing the optimization evaluation is applied to the current interrupt processing. On return to the original instructions, the resources are interpreted as of that moment, and an optimized instruction sequence is generated by changing the execution order.

[0034]

[Effect of the invention] As described above, according to this invention, while an instruction whose resource has not been acquired is waiting, the dependencies between that instruction and the instructions following it are examined, and instructions that can be executed ahead from the viewpoint of the substantive processing content are executed first, even though they follow the waiting instruction. As a result, the instructions awaiting execution can be executed quickly once the resource is acquired, and instruction processing time is shortened.

[Brief description of the drawings]

FIG. 1 is a diagram showing the configuration of a microprocessor according to one embodiment of the invention according to claim 1.

FIG. 2 is a diagram showing one example of the operation of the microprocessor shown in FIG. 1.

FIG. 3 is a diagram showing another example of the operation of the microprocessor shown in FIG. 1.

[Explanation of symbols]

1 Instruction fetcher
2 Instruction decoder
3 Execution device
4 Instruction sequence interpreting device
5 Instruction buffer
6 Optimizing device
7 Modified instruction buffer
8, 9 Evaluation values

Claims (3)

[Claims]

1. A microprocessor comprising:
an instruction decoder that decodes a fetched instruction sequence and, when a resource required to execute a decoded instruction cannot be acquired, attaches resource information to the instruction whose resource cannot be acquired and outputs it;
an execution device that receives an instruction decoded by the decoder, or an instruction of an instruction sequence whose execution order has been changed, acquires the resources needed to execute the instruction and executes it, and, when a resource cannot be acquired, outputs resource information to the instruction decoder and outputs resource change information indicating whether the resource has been acquired;
an instruction buffer that stores the instruction sequence decoded by the decoder and the instruction sequence obtained by changing the execution order of that sequence;
an instruction optimizing device that changes, according to preset rules, the execution order of a predetermined instruction sequence that starts with and includes the instruction whose resource has not been acquired, generates one or more instruction sequences that perform processing equivalent to that of the sequence before the change, evaluates the execution times of the sequences before and after the change, and, if there is a changed sequence evaluated as having a shorter execution time than the sequence before the change, selects that sequence, stores it in the instruction buffer and supplies it to the execution device, and re-changes and re-evaluates the execution order of the sequence stored in the instruction buffer on the basis of the resource change information output from the execution device each time a changed instruction is executed; and
a modified instruction buffer that stores the one or more reordered instruction sequences generated by the instruction optimizing device.

2. The microprocessor according to claim 1, wherein, when a branch instruction is executed, optimization is evaluated for the branch-predicted, prefetched instruction sequence; if it is judged advantageous, branch prediction is performed on the changed instruction sequence; and if an instruction sequence containing a branch instruction has been optimized, the decision is made by weighing the evaluation of the changed instruction sequences together with the evaluation of the candidates given by ordinary branch prediction.

3. The microprocessor according to claim 1, wherein, when an interrupt occurs, only the instruction buffer is saved; during interrupt processing, optimization evaluation is performed on the current interrupt processing; and on return to the original instructions, the resources are interpreted as of that moment and an optimized instruction sequence is generated by changing the execution order.
JP16301996A 1996-06-24 1996-06-24 Microprocessor Expired - Fee Related JP3547562B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP16301996A JP3547562B2 (en) 1996-06-24 1996-06-24 Microprocessor

Publications (2)

Publication Number Publication Date
JPH1011288A true JPH1011288A (en) 1998-01-16
JP3547562B2 JP3547562B2 (en) 2004-07-28

Family

ID=15765655

Family Applications (1)

Application Number Title Priority Date Filing Date
JP16301996A Expired - Fee Related JP3547562B2 (en) 1996-06-24 1996-06-24 Microprocessor

Country Status (1)

Country Link
JP (1) JP3547562B2 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2000284970A (en) * 1999-03-29 2000-10-13 Matsushita Electric Ind Co Ltd Program converting device and processor
US7071865B2 (en) 2002-05-27 2006-07-04 Canon Kabushiki Kaisha Display apparatus having a remote control device with a track pad unit
US7522087B2 (en) 2002-05-27 2009-04-21 Canon Kabushiki Kaisha Remote control device

Also Published As

Publication number Publication date
JP3547562B2 (en) 2004-07-28

Similar Documents

Publication Publication Date Title
US6697932B1 (en) System and method for early resolution of low confidence branches and safe data cache accesses
JP5889986B2 (en) System and method for selectively committing the results of executed instructions
JP3662258B2 (en) Central processing unit having a DSP function decoder having an X86 DSP core and mapping X86 instructions to DSP instructions
US7281250B2 (en) Multi-thread execution method and parallel processor system
US5692170A (en) Apparatus for detecting and executing traps in a superscalar processor
US20060168432A1 (en) Branch prediction accuracy in a processor that supports speculative execution
JP2001282549A (en) Device and method for converting program and recording medium
US7711934B2 (en) Processor core and method for managing branch misprediction in an out-of-order processor pipeline
JPH10133873A (en) Processor and method for speculatively executing condition branching command by using selected one of plural branch prediction system
JPH04275628A (en) Arithmetic processor
JP2001517333A (en) Self-modifying code processor
US8499293B1 (en) Symbolic renaming optimization of a trace
JP2006313422A (en) Calculation processing device and method for executing data transfer processing
WO2020034753A1 (en) Method for executing instructions in cpu
US7849292B1 (en) Flag optimization of a trace
WO2002008893A1 (en) A microprocessor having an instruction format containing explicit timing information
JP2000322257A (en) Speculative execution control method for conditional branch instruction
US7634641B2 (en) Method and apparatus for using multiple threads to spectulatively execute instructions
JP3518510B2 (en) Reorder buffer management method and processor
JP3779012B2 (en) Pipelined microprocessor without interruption due to branching and its operating method
US7937564B1 (en) Emit vector optimization of a trace
US20030005422A1 (en) Technique for improving the prediction rate of dynamically unpredictable branches
JPH1011288A (en) Microprocessor
JP2004038255A (en) Instruction control method and processor
JPH1196005A (en) Parallel processor

Legal Events

Date Code Title Description
A977 Report on retrieval

Free format text: JAPANESE INTERMEDIATE CODE: A971007

Effective date: 20031222

A131 Notification of reasons for refusal

Effective date: 20040113

Free format text: JAPANESE INTERMEDIATE CODE: A131

A521 Written amendment

Effective date: 20040315

Free format text: JAPANESE INTERMEDIATE CODE: A523

TRDD Decision of grant or rejection written
A01 Written decision to grant a patent or to grant a registration (utility model)

Effective date: 20040406

Free format text: JAPANESE INTERMEDIATE CODE: A01

A61 First payment of annual fees (during grant procedure)

Effective date: 20040414

Free format text: JAPANESE INTERMEDIATE CODE: A61

FPAY Renewal fee payment (prs date is renewal date of database)

Free format text: PAYMENT UNTIL: 20080423

Year of fee payment: 4

FPAY Renewal fee payment (prs date is renewal date of database)

Free format text: PAYMENT UNTIL: 20090423

Year of fee payment: 5

FPAY Renewal fee payment (prs date is renewal date of database)

Free format text: PAYMENT UNTIL: 20100423

Year of fee payment: 6

LAPS Cancellation because of no payment of annual fees