JP3392545B2

JP3392545B2 - Instruction sequence optimizer

Info

Publication number: JP3392545B2
Application number: JP26139494A
Authority: JP
Inventors: 出進博井; 田尊吉
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 1994-09-30
Filing date: 1994-09-30
Publication date: 2003-03-31
Anticipated expiration: 2018-03-31
Also published as: JPH08101777A

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to an instruction sequence optimizing apparatus for optimizing, for example, a control program of an information processing device.

[0002]

[Prior Art] With the recent development of multimedia, portable information processing devices such as book computers, notebook computers, and mobile phones have come into widespread use.

[0003] FIG. 26 shows a schematic configuration of the control unit of such a portable information processing device. In the CPU (Central Processing Unit) 2610 shown in the figure, the execution unit 2611 executes each instruction constituting the control program. The input/output unit 2612 sequentially outputs the addresses of the instructions to be executed by the execution unit 2611 to the address bus 2631, and fetches the instruction corresponding to each address from the instruction bus 2632. The register unit 2613 temporarily stores data generated as the execution unit 2611 executes instructions.

[0004] In the program memory 2620, meanwhile, the storage unit 2621 stores in advance the instructions constituting the control program. The input/output unit 2622 reads the instruction corresponding to the address input from the address bus out of the storage unit 2621 and outputs it to the instruction bus 2632.

[0005] In this configuration, when the CPU 2610 performs control, the address of the instruction to be executed by the execution unit 2611 is first sent via the address bus 2631 from the input/output unit 2612 of the CPU 2610 to the input/output unit 2622 of the program memory 2620. The program memory 2620 then reads the instruction corresponding to the specified address from the storage unit 2621 and outputs it from the input/output unit 2622. The input/output unit 2612 of the CPU 2610 receives this instruction via the instruction bus 2632 and passes it to the execution unit 2611. The execution unit 2611 executes the instruction, and the information processing device is controlled in this way.

[0006] If data that must be stored temporarily arises while the execution unit 2611 is executing instructions, the data is stored in the register unit 2613 in the CPU 2610 and read out as needed.

[0007] The control program used in an information processing device such as that shown in FIG. 26 is created in advance and stored in the program memory 2620 when the device is manufactured. In developing such a control program, various optimizations are applied by the means that compiles a high-level language or assembly language to produce the object program (that is, the program stored in the program memory 2620). Known optimizations include, for example, those that shorten the execution time of the control program and those that reduce the memory area used to store it.

[0008]

[Problems to Be Solved by the Invention] Reduced power consumption has long been demanded of information processing devices such as those described above. For portable devices in particular, lower power consumption is required in order to extend the time for which they can be used continuously. For non-portable devices as well, lower power consumption is required from the viewpoint of environmental protection and reduced energy use.

[0009] The power consumption P of the internal circuitry of an information processing device is expressed by the following formula.

[0010] P = α · C · Vdd² · n · f + Ps

Here, α is the activity rate, C is the capacitance of the entire circuit, Vdd is the supply voltage, n is the number of circuit elements, f is the operating frequency, and Ps is the standby power consumption.
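As a rough illustration of how the terms of this formula interact, the following Python sketch evaluates it literally. The function name and every numeric value are hypothetical and chosen only for the example; they do not come from the patent.

```python
def circuit_power(alpha, c, vdd, n, f, p_standby):
    """Evaluate P = alpha * C * Vdd^2 * n * f + Ps exactly as written above.

    alpha: activity rate, c: capacitance (F), vdd: supply voltage (V),
    n: number of circuit elements, f: operating frequency (Hz),
    p_standby: standby power (W). All example inputs are hypothetical.
    """
    return alpha * c * vdd ** 2 * n * f + p_standby


# Hypothetical operating point: only the activity rate differs between the two calls.
baseline = circuit_power(0.5, 1e-12, 3.0, 1000, 1e6, 0.001)
less_active = circuit_power(0.25, 1e-12, 3.0, 1000, 1e6, 0.001)
```

With the other parameters fixed, halving the activity rate halves the first term and leaves Ps unchanged.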

[0011] Conventionally, power consumption has been reduced by designing the hardware so that the value of each of these parameters is small.

[0012] However, such hardware measures alone have not been able to reduce power consumption sufficiently.

[0013] The present invention has been made in view of these drawbacks of the prior art. Its object is to provide an instruction sequence optimizing apparatus that can perform optimization for reducing power consumption at the stage when a control program for an information processing device is created.

[0014]

[Means for Solving the Problems]

(1) The instruction sequence optimizing apparatus according to the first invention optimizes a program for use by an information processing device that has a program memory storing the program and an arithmetic processing unit that fetches the program from the program memory via an instruction bus. The apparatus comprises: instruction sequence analysis means for analyzing the mutual dependencies among the instructions constituting the program; instruction sequence changing means for reducing the Hamming distance between the bit strings that appear on the instruction bus when the instructions are transferred from the program memory to the arithmetic processing unit, by changing the order of the instructions within a range that does not affect the dependencies analyzed by the instruction sequence analysis means; and block dividing means for dividing the program into basic blocks and sending the resulting basic blocks to the instruction sequence analysis means. The instruction sequence changing means determines the instruction order within a block taking into account the Hamming distance between the last bit string of the basic block whose instruction order was determined immediately before and the first bit string of the basic block whose instruction order is currently being determined.

(2) The instruction sequence optimizing apparatus according to the second invention optimizes a program for use by an information processing device that has a plurality of registers for temporarily storing data, a program memory storing the program, and an arithmetic processing unit that writes data to and reads data from the registers in accordance with instructions fetched from the program memory via an instruction bus. The apparatus comprises: register number recognition means for recognizing the register numbers in each instruction constituting the program; register valid range recognition means for recognizing the valid range of each register number recognized by the register number recognition means; and instruction sequence changing means for reducing the Hamming distance between the bit strings that appear on the instruction bus when instructions containing the register numbers are transferred from the program memory to the arithmetic processing unit, by changing the register numbers within a range that does not affect the valid ranges recognized by the register valid range recognition means.

(3) The instruction sequence optimizing apparatus according to the third invention optimizes a program for use by an information processing device that has a program memory storing the program and an arithmetic processing unit that fetches the program from the program memory via an instruction bus. The apparatus comprises: storage means for storing, for some or all of the instructions constituting the program, other bit patterns that denote the same instruction; and instruction sequence changing means for reducing the Hamming distance between the bit strings that appear on the instruction bus when the instructions are transferred from the program memory to the arithmetic processing unit, by replacing instructions in the program with the bit patterns stored in the storage means.

(4) The instruction sequence optimizing apparatus according to the fourth invention optimizes a program for use by an information processing device that has a program memory storing the program and an arithmetic processing unit that fetches the program from the program memory via an instruction bus. The apparatus comprises: selection means for selecting, for an instruction or instruction sequence in the program, another instruction or instruction sequence that yields the same processing result; and instruction sequence changing means for reducing the Hamming distance between the bit strings that appear on the instruction bus when the instruction or instruction sequence is transferred from the program memory to the arithmetic processing unit, by replacing the instruction or instruction sequence in the program with the one selected by the selection means.

(5) The instruction sequence optimizing apparatus according to the fifth invention optimizes a program for use by an information processing device that has a program memory storing the program and an arithmetic processing unit that fetches the program from the program memory via an instruction bus. The apparatus comprises: selection means for selecting, for an instruction or instruction sequence in the program, another instruction or instruction sequence that yields the same processing result; calculation means for estimating, taking the Hamming distance into account, the power consumed on the instruction bus when the instruction or instruction sequence in the program, and the one selected by the selection means, are transferred from the program memory to the arithmetic processing unit; and instruction sequence changing means for reducing the power consumption estimated by the calculation means, by replacing the instruction or instruction sequence in the program with the one selected by the selection means.

[0015]

[Action]

(1) In the instruction sequence optimizing apparatus according to the first invention, the mutual dependencies among the instructions constituting the program are analyzed, and the order of the instructions is changed within a range that does not affect these dependencies so as to reduce the Hamming distance between the bit strings that appear on the instruction bus when the instructions are transferred from the program memory to the arithmetic processing unit. The power consumption of the information processing device can therefore be reduced.

(2) In the instruction sequence optimizing apparatus according to the second invention, the register numbers in each instruction constituting the program are recognized, the valid range of each register number is then recognized, and the register numbers are changed within a range that does not affect these valid ranges so as to reduce the Hamming distance between the bit strings that appear on the instruction bus when instructions containing the register numbers are transferred. The power consumption of the information processing device can therefore be reduced.

(3) In the instruction sequence optimizing apparatus according to the third invention, an instruction in the program is replaced with another bit pattern that denotes the same instruction so as to reduce the Hamming distance between the bit strings that appear on the instruction bus when the instruction is transferred from the program memory to the arithmetic processing unit. The power consumption of the information processing device can therefore be reduced.

(4) In the instruction sequence optimizing apparatus according to the fourth invention, an instruction or instruction sequence in the program is replaced with another instruction or instruction sequence that yields the same processing result so as to reduce the Hamming distance between the bit strings that appear on the instruction bus when the instruction or instruction sequence is transferred from the program memory to the arithmetic processing unit. The power consumption of the information processing device can therefore be reduced.

(5) In the instruction sequence optimizing apparatus according to the fifth invention, the power consumption is estimated both for an instruction or instruction sequence in the program and for another instruction or instruction sequence that yields the same processing result, and the one with the lower estimated power consumption is used as the replacement. The power consumption of the information processing device can therefore be reduced.

[0016]

[Embodiments] Embodiments of the present invention are described below with reference to the drawings.

[0017] (Embodiment 1) As Embodiment 1, an embodiment of the first invention (corresponding to claims 1 to 4) is described.

[0018] FIG. 1 is a flowchart schematically showing the concept of the first invention (corresponding to claim 1). As shown in the figure, in the first invention the mutual dependencies among the instructions constituting the program are first analyzed using the instruction sequence analysis means (step S100). The order of the instructions is then changed so that the Hamming distance between the bit strings appearing on the instruction bus is reduced without affecting these dependencies (step S101). This change of instruction order reduces the power consumed on the instruction bus.
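The quantity being reduced can be illustrated with a minimal Python sketch. It assumes that each instruction is available as an unsigned integer and that the bus is 32 bits wide (the width used in the later example); the names `WORD_BITS` and `hamming_distance` are hypothetical and introduced only for this illustration.

```python
WORD_BITS = 32  # assumed bus width, matching the 32-bit example in the text


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions at which two instruction words differ."""
    mask = (1 << WORD_BITS) - 1          # keep only the bits that fit on the bus
    return bin((a ^ b) & mask).count("1")  # XOR marks differing bits; count them


all_lines_toggle = hamming_distance(0x00000000, 0xFFFFFFFF)
two_lines_toggle = hamming_distance(0b1010, 0b0110)
```

Each differing bit corresponds to one bus line that changes level between two consecutive transfers, which is why a smaller distance is associated with lower bus activity.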

[0019] In step S101, the power consumption reducing means (corresponding to the "instruction sequence changing means" of the first invention) first changes the instruction order and determines the resulting Hamming distance (step S102). The determined Hamming distance is then compared with a predetermined reference value (step S103). The reference value may be a value fixed in advance, or it may be the smallest Hamming distance determined so far by the power consumption reducing means. If the determined Hamming distance is smaller than the reference value, this instruction order is output as the optimization result and the optimization ends (step S104). If the determined Hamming distance is larger than the reference value, the same processing (steps S102 and S103) is repeated for another instruction order.
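The loop of steps S102 to S104 with a fixed reference value might be sketched in Python as follows. This is only an illustration under stated assumptions: the candidate orders are assumed to have been filtered already so that they respect the dependencies, a distance equal to the reference value is treated as not accepted (the text does not say), and all function names are hypothetical.

```python
from itertools import permutations


def hamming(a, b):
    """Number of differing bits between two instruction words."""
    return bin(a ^ b).count("1")


def total_hamming(sequence):
    """Sum of Hamming distances between consecutive words of a sequence."""
    return sum(hamming(x, y) for x, y in zip(sequence, sequence[1:]))


def first_order_below(candidate_orders, reference):
    """Return the first candidate whose total distance is below the reference."""
    for order in candidate_orders:      # S102: take another instruction order
        distance = total_hamming(order)
        if distance < reference:        # S103: compare with the reference value
            return order, distance      # S104: output this order and stop
    return None                         # assumption: no candidate qualified


# Hypothetical 8-bit words with no dependencies, so every permutation is a candidate.
words = (0x00, 0xFF, 0x01, 0xFE)
found = first_order_below(permutations(words), reference=10)
```

Here the search stops at the third permutation, `(0x00, 0x01, 0xFF, 0xFE)`, whose total distance is 9, without examining the remaining orders.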

[0020] Next, a concrete example of FIG. 1 (corresponding to claims 1 to 3) is described using the flowchart of FIG. 2.

[0021] The description here takes as an example the case where the first invention is applied to optimizing a control program used in an information processing device whose execution unit and instruction bus are both 32 bits wide (see FIG. 14(a)).

[0022] In FIG. 2, step S200 divides the control program into basic blocks. A basic block is a program block, such as a sequence of expressions or assignment statements, in which no branch leaves from the middle of the block and no branch from outside enters the middle of it. Once the division into basic blocks is complete, the instruction dependencies and register dependencies are analyzed for each instruction sequence in each basic block, and the ranges within which instructions can be exchanged without violating the causality of the processing are identified. An identifier for distinguishing each range identified in this way is then added to the program.
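One common way to perform such a division is to mark "leader" instructions and cut the program at each leader. The Python sketch below assumes a hypothetical representation in which each instruction is a tuple `(label, mnemonic, branch_target)`; it ignores calls, returns, and architecture-specific details such as branch delay slots, and it is not claimed to be the method used in the patent.

```python
def split_basic_blocks(instructions):
    """Split a list of (label, mnemonic, branch_target) tuples into basic blocks.

    A new block starts at the first instruction, at every branch target,
    and at every instruction that follows a branch.
    """
    if not instructions:
        return []
    targets = {target for _, _, target in instructions if target is not None}
    leaders = {0}
    for index, (label, _, target) in enumerate(instructions):
        if label in targets:                       # a branch may enter here
            leaders.add(index)
        if target is not None and index + 1 < len(instructions):
            leaders.add(index + 1)                 # control may leave before here
    starts = sorted(leaders)
    ends = starts[1:] + [len(instructions)]
    return [instructions[s:e] for s, e in zip(starts, ends)]


# Hypothetical five-instruction program containing one loop.
program = [
    (None, "mov", None),
    ("loop", "add", None),
    (None, "cmp", None),
    (None, "bne", "loop"),
    (None, "ret", None),
]
blocks = split_basic_blocks(program)
```

For this input the result is three blocks: the instruction before the loop, the three-instruction loop body ending in the branch, and the final instruction.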

[0023] The processing in step S200 is also used in other, previously known optimizations, so a detailed description is omitted here. Documents disclosing such processing include, for example, the following.

[0024] Z. Li and P.-C. Yew, "Efficient Interprocedural Analysis for Program Parallelization and Restructuring," Proc. ACM SIGPLAN PPEALS, pp. 85-97, 1988.

S. Jain and C. Thompson, "An Efficient Approach to Dataflow Analysis in a Multiple Pass Global Optimizer," Proc. SIGPLAN '88 Conf. on Prog. Lang. Design and Implementation (PLDI '88), pp. 154-163, 1988.

Next, in step S201, optimization is performed for each basic block. The processing procedure of this step is described below.

[0025] First, initialization is performed in step S202. In FIG. 2, the variable LastCom holds the last instruction of the basic block that was optimized previously. At initialization, a default value (an instruction bit string) is assigned to LastCom. Any value may be used as the default, for example all bits "1" or all bits "0". In practice, however, it is preferable to use as the default the instruction that statistically appears most frequently, or the final instruction of the header program that is eventually linked (the prologue program, that is, the program by which the OS starts the target user-written program and to which control passes to return to the OS after execution ends) or of a runtime routine.

[0026] In the subsequent steps S203 to S212, processing is performed to minimize the Hamming distance. The procedure shown here uses a simple and reliable exhaustive (brute-force) method: the order of the instructions is changed successively and every possible pattern is tried. For basic blocks with a complex data structure it is also effective to use a special technique for faster processing, but the present invention is not limited in this respect.

[0027] In step S203, it is determined whether all processing has finished. Because the optimization in this embodiment is performed per basic block, it finishes when every basic block has been processed. If the determination shows that unprocessed basic blocks remain, the processing from step S204 onward is executed.

[0028] In step S204, it is determined whether the optimization within the basic block has finished. This optimization is performed by permuting the execution order of the instructions in the basic block. Processing of the basic block of interest finishes when every possible reordering (permutation) that causes no contradiction in instruction order dependencies, register dependencies, and the like has been tried, and the process then proceeds to step S213. If not all reorderings have been tried, the optimization from step S205 onward continues.

[0029] In step S205, the initialization that precedes optimization within the block is performed. In FIG. 2, the variable Hd_sum is the sum of the Hamming distances between instructions within the basic block of interest, and the variable Hd_boun is the Hamming distance between LastCom and the first instruction of the basic block of interest. The variable Hd_total is the sum of Hd_sum and Hd_boun, and the variable Hd_min is the minimum value of Hd_total. As initialization, "∞" is assigned to Hd_min, where "∞" denotes a value larger than any number.

[0030] In step S206, the sum of the Hamming distances within the basic block of interest is obtained and assigned to Hd_sum. This sum is easily obtained by comparing the bit patterns of adjacent instructions and adding "1" to the running total for every bit position at which the corresponding bits differ.

[0031] In step S207, in the same way as in step S206, the Hamming distance between the last instruction of the previously processed basic block (the variable LastCom) and the first instruction of the basic block currently of interest is obtained and assigned to Hd_boun. This operation makes optimization across basic block boundaries possible.

[0032] In step S208, the total Hamming distance between instructions for the current instruction sequence pattern, that is, the sum of Hd_sum and Hd_boun, is calculated and assigned to Hd_total.

[0033] In step S209, the value of Hd_total obtained in step S208 is compared with Hd_min, the smallest Hd_total obtained in the trials so far. If Hd_total ≥ Hd_min, step S212 and the subsequent steps are executed in order to try a different instruction sequence pattern. If Hd_total < Hd_min, steps S210 and S211 are executed first, followed by step S212 and the subsequent steps.

[0034] In steps S210 and S211, the variables are updated. In step S210, the value of Hd_total is assigned to Hd_min. In step S211, the instruction sequence pattern corresponding to this Hd_min is stored in the variable MinHdSequence.

[0035] In step S212, the instruction sequence is permuted. An instruction sequence that has not yet been tried is generated by changing the order of the instructions in the basic block, and the process returns to step S204. The reordering is based on the results of the analysis performed in step S200, so that no contradiction in causality arises.

[0036] Optimization is achieved by repeating the processing of steps S203 to S212 described above.

[0037] When the trials for all instruction sequence patterns have finished (step S204), step S213 is executed. In step S213, the optimized instruction sequence (stored in the variable MinHdSequence) is output as the optimization result.

[0038] In step S214, the variable LastCom is updated: the last instruction stored in the variable MinHdSequence is assigned to LastCom.

[0039] The processing from step S203 onward is then executed for the next basic block in the same manner as described above.
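Steps S202 to S214 can be summarized in a short Python sketch. It keeps the variable names of FIG. 2 in lower-case form, but everything else is an assumption made for illustration: each non-empty basic block is given as a list of instruction words plus a set of `(earlier, later)` index pairs standing in for the dependency analysis of step S200, Hd_min is initialized once per block, and the function names are hypothetical.

```python
from itertools import permutations


def hamming(a, b):
    """Number of differing bits between two instruction words."""
    return bin(a ^ b).count("1")


def respects(order, deps):
    """True if every (earlier, later) index pair keeps its relative order."""
    position = {index: place for place, index in enumerate(order)}
    return all(position[a] < position[b] for a, b in deps)


def optimize_block(words, deps, last_com):
    """Try every legal order of one non-empty block (S204 to S212)."""
    hd_min = float("inf")                                   # S205
    min_hd_sequence = None
    for order in permutations(range(len(words))):           # S212: next pattern
        if not respects(order, deps):
            continue                                        # would break causality
        seq = [words[i] for i in order]
        hd_sum = sum(hamming(x, y) for x, y in zip(seq, seq[1:]))  # S206
        hd_boun = hamming(last_com, seq[0])                 # S207
        hd_total = hd_sum + hd_boun                         # S208
        if hd_total < hd_min:                               # S209
            hd_min = hd_total                               # S210
            min_hd_sequence = seq                           # S211
    return min_hd_sequence, hd_min


def optimize_program(blocks, default_last_com=0):
    """Process the blocks in order, carrying LastCom across boundaries."""
    last_com = default_last_com                             # S202
    result = []
    for words, deps in blocks:                              # S203
        best, _ = optimize_block(words, deps, last_com)
        result.append(best)                                 # S213
        last_com = best[-1]                                 # S214
    return result


# Hypothetical 8-bit words; the second block has one dependency (0 before 1).
example = optimize_program([([0xFF, 0x00, 0x0F], set()), ([0xF0, 0x0E], {(0, 1)})])
```

Because every permutation is generated and then filtered, the cost grows factorially with block size, which is consistent with the remark in the text that faster special techniques may be worthwhile for complex blocks.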

[0040] Next, an actual application of the optimization of this embodiment is described, taking the program shown in FIG. 3 as an example.

[0041] The program of FIG. 3 computes an inner product of integers and is written in the C language.

[0042] FIGS. 4 to 7 show the assembly source program listing obtained by compiling the program of FIG. 3 with the "SunSPARC C compiler". In FIGS. 4 to 7, the first column shows the line number, the second column the address, the third column the object code, and the fourth column the assembly source. C101 to C313 and B1 to B6 are reference labels used in the explanation. Assembly syntax, mnemonics, and the like are known art and are not described here; they are disclosed in, for example, the following document.

【００４３】SPARC International, Inc., The SPARC Architecture Manual, Version 8, Prentice-Hall, Inc., A Simon and Schuster Company, Englewood Cliffs, New Jersey 07632.
The input to the optimization apparatus of this embodiment is assembly source such as that shown in FIGS. 4 to 7. The basic blocks of this assembly source are found in step S200 of FIG. 2. As a result, the program is divided into six basic blocks, labeled B1 to B6 in FIGS. 4 to 7.

【００４４】Next, the dependencies among the instructions in each basic block are analyzed. FIGS. 8 and 9 are directed graphs showing the results of the dependency analysis. FIG. 8 is the directed graph of basic block B1 (see FIG. 4); C101 to C104 in FIG. 4 correspond to nodes N101 to N104 in FIG. 8, respectively. Similarly, FIG. 9 is the directed graph of basic block B3 (see FIG. 5); C301 to C313 in FIG. 5 correspond to nodes N301 to N313 in FIG. 9, respectively. Nodes N100 and N300 are dummy nodes representing the top of each directed graph, and nodes N105 and N314 are dummy nodes representing the bottom. Arcs A101 to A105 and A301 to A314 represent dependencies between instructions: instructions must not be executed in an order opposite to the direction of an arrow. The ordering constraints among the instructions in a basic block are managed by building such a directed graph.

【００４５】First, the optimization of the first basic block B1 is described. Basic block B1 is the prologue, which saves the internal state needed to execute the program. As the directed graph of FIG. 8 shows, the four instructions C101 to C104 can be executed only in this order, so they are output unchanged.

【００４６】The optimization of the second basic block B2 is not described here, but when processing of B2 finishes, its last instruction C204, a "nop", is stored in LastCom.

【００４７】Next, the third basic block B3 is optimized. In B3, as the directed graph of FIG. 9 shows, the instruction groups "C301", "C302 to C304", "C305", and "C306 to C308" can be reordered relative to one another, but the instructions within "C302 to C304" and within "C306 to C308" cannot be reordered. In addition, C309 to C313 can be executed only in this order and cannot be executed before C301 to C308.

【００４８】First, the optimization steps S205 to S211 of FIG. 2 are performed for the instruction sequence pattern exactly as in FIG. 5, that is, C301, C302, ..., C313. At this point LastCom holds the "nop" that is the last instruction C204 of basic block B2, as noted above, and the minimum Hamming distance Hd_min has been initialized to "∞".

【００４９】FIG. 10 shows the bit patterns of the last instruction C204 of basic block B2 and of instructions C301 to C313 of basic block B3. Calculating Hd_sum, Hd_boun, and Hd_total for these bit patterns (see steps S206 to S208 of FIG. 2) gives Hd_sum = 161 and Hd_boun = 13, and therefore Hd_total = 174.
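
The three quantities can be sketched in a few lines of Python. This is a minimal illustration only, assuming that Hd_boun is the Hamming distance between LastCom and the first instruction of the candidate sequence and that Hd_sum covers the adjacent pairs inside the block; the function names and the 8-bit toy words are hypothetical and are not the bit patterns of FIG. 10.

```python
def hamming(a, b):
    """Number of bit positions in which two instruction words differ."""
    return bin(a ^ b).count("1")

def hd_total(last_com, sequence):
    """Return (Hd_sum, Hd_boun, Hd_total) for one candidate ordering."""
    # Hd_sum: distances between adjacent instructions inside the block.
    hd_sum = sum(hamming(x, y) for x, y in zip(sequence, sequence[1:]))
    # Hd_boun: distance across the block boundary (assumed definition).
    hd_boun = hamming(last_com, sequence[0])
    return hd_sum, hd_boun, hd_sum + hd_boun

# Toy 8-bit words standing in for instructions.
example = hd_total(0b00000000, [0b00001111, 0b00001100, 0b11001100])
```

With the values reported for FIG. 10, the same addition gives 161 + 13 = 174.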

【００５０】Since the minimum Hamming distance Hd_min = ∞, the comparison in step S209 (see FIG. 2) results in Hd_total = 174 being assigned to Hd_min (step S210), and the instruction sequence shown in FIG. 5 is stored in the variable MinHdSequence (step S211).

【００５１】Next, the same processing (steps S205 to S211) is performed for each pattern in which the instruction order has been changed.

【００５２】When trials for all permissible reorderings of the instruction sequence are complete, the instruction sequence stored in the variable MinHdSequence is output as the result of the instruction sequence optimization (step S213).
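
The trial loop over reorderings can be sketched as follows. This is a hedged illustration, not the apparatus itself: it assumes a brute-force enumeration of permutations filtered by the dependency arcs, and the instruction names, 4-bit words, and the single dependency are hypothetical.

```python
from itertools import permutations

def hamming(a, b):
    return bin(a ^ b).count("1")

def respects(order, deps):
    """True if every (before, after) pair in deps keeps its relative order."""
    pos = {name: i for i, name in enumerate(order)}
    return all(pos[a] < pos[b] for a, b in deps)

def optimize_block(last_com, code, deps):
    """Try every dependency-respecting order; return (Hd_min, MinHdSequence)."""
    hd_min, min_hd_sequence = float("inf"), None
    for order in permutations(code):             # iterates over instruction names
        if not respects(order, deps):
            continue                             # ordering forbidden by the graph
        words = [last_com] + [code[n] for n in order]
        hd_total = sum(hamming(x, y) for x, y in zip(words, words[1:]))
        if hd_total < hd_min:                    # keep the best ordering so far
            hd_min, min_hd_sequence = hd_total, list(order)
    return hd_min, min_hd_sequence

code = {"I1": 0b1111, "I2": 0b0000, "I3": 0b1110, "I4": 0b0001}
deps = [("I1", "I4")]                            # I1 must run before I4
best = optimize_block(0b0000, code, deps)
```

Enumerating all permutations grows factorially with block size, so this form is only practical for small basic blocks.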

【００５３】FIG. 11 shows the bit pattern of the instruction sequence output in step S213. For reference, FIG. 12 shows the worst bit pattern (that is, the one for which the total Hamming distance is largest). The total Hamming distance Hd_total is 130 in FIG. 11 and 196 in FIG. 12. Thus, in this embodiment, the optimization reduced the number of switching events on the instruction bus during execution of basic block B3 to 74.7% of the value before optimization, and to 66.3% of the worst case.
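
The two percentages follow directly from the totals quoted above (174 before optimization, 130 after, 196 in the worst case); the variable names in this short check are illustrative only.

```python
hd_original, hd_optimized, hd_worst = 174, 130, 196

vs_original = 100 * hd_optimized / hd_original   # relative to the FIG. 5 order
vs_worst = 100 * hd_optimized / hd_worst         # relative to the FIG. 12 order
print(f"{vs_original:.1f}% of original, {vs_worst:.1f}% of worst case")
```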

【００５４】The fourth to sixth basic blocks B4 to B6 are then optimized in the same way, but since the order of instructions cannot be changed within any of blocks B4 to B6, they are output unchanged and processing ends.

【００５５】Next, a modification of the instruction sequence optimization apparatus of this embodiment (corresponding to claim 4) is described with reference to FIG. 13.

【００５６】Some instructions in a program contain "do not care" bits in their instruction format, that is, bits whose value, "1" or "0", does not affect the operation of the instruction. For example, in basic block B3 above, the 12th through 6th bits (bits <11:5>) of instructions C303, C304, C307, and C308 are "do not care" bits (see FIG. 10). By suitably changing the values of such bits, the Hamming distance between adjacent instructions can sometimes be reduced.

【００５７】FIG. 13 is a flowchart showing an example of processing that reduces the Hamming distance by changing the values of "do not care" bits.

【００５８】By executing the processing of FIG. 13 in place of step S206 of FIG. 2, the "do not care" bits are taken into account and the Hamming distance can be reduced further.

【００５９】In FIG. 13, step S1301 initializes the variable Hd_sum to the initial value "∞".

【００６０】Next, step S1302 determines whether this processing is complete. In this processing, the trials described below are carried out while the values of all the "do not care" bits are varied. When every combination of "1" and "0" for the "do not care" bits has been tried, the processing ends.

【００６１】Step S1303 computes the sum of the Hamming distances between adjacent instructions for the current "do not care" bit pattern and assigns it to the variable Hd_sum_current.

【００６２】Step S1304 compares Hd_sum with Hd_sum_current. If Hd_sum ≧ Hd_sum_current, step S1305 assigns the value of Hd_sum_current to Hd_sum, and processing proceeds to step S1306. If Hd_sum < Hd_sum_current, processing proceeds directly to step S1306 without executing step S1305.

【００６３】Step S1306 changes the "do not care" bit pattern to one that has not yet been tried.
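
Steps S1301 to S1306 can be sketched as an exhaustive search over the "do not care" bits. The sketch below is an assumption-laden illustration: the per-instruction mask representation, the function name, and the 8-bit toy words are hypothetical, and the step numbers appear only as comments to show the correspondence.

```python
from itertools import product

def hamming(a, b):
    return bin(a ^ b).count("1")

def fill_dont_care(words, masks):
    """words: instruction words; masks: per-word masks of "do not care" bits.
    Tries every 0/1 assignment and returns (Hd_sum, best_words)."""
    # (word index, bit index) of every "do not care" bit.
    free = [(i, b) for i, m in enumerate(masks)
            for b in range(m.bit_length()) if (m >> b) & 1]
    hd_sum, best = float("inf"), None                      # S1301
    for bits in product((0, 1), repeat=len(free)):         # S1302 / S1306
        trial = list(words)
        for (i, b), v in zip(free, bits):
            trial[i] = (trial[i] & ~(1 << b)) | (v << b)
        hd_sum_current = sum(hamming(x, y)
                             for x, y in zip(trial, trial[1:]))  # S1303
        if hd_sum >= hd_sum_current:                       # S1304
            hd_sum, best = hd_sum_current, trial           # S1305
    return hd_sum, best

# The upper four bits of the middle word are "do not care".
result = fill_dont_care([0b11110000, 0b00000000, 0b11110011],
                        [0, 0b11110000, 0])
```

In this toy case the sum drops from 10 to 2. The number of trials doubles with each "do not care" bit, so an exhaustive search of this kind is only feasible when few such bits are present.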

【００６４】Applying this processing to, for example, the pre-optimization program of basic block B3 (see FIG. 10) reduces the Hamming distance totals Hd_sum and Hd_total by 10.

【００６５】By storing a control program optimized as described above in the program memory of an information processing device and using it to control the CPU and so on, the power consumed on the instruction bus can be reduced.

【００６６】The procedure shown in this embodiment is a one-pass method that applies the optimization sequentially to the input data. Consequently, the instruction sequence finally obtained may not have the minimum Hamming distance. This is because, when optimizing at a basic-block boundary, only the last instruction of the basic block immediately preceding the block of interest is considered. To reduce the Hamming distance further, one could, for example, also take into account the basic block following the block of interest, the block after that, and so on. However, to optimize quickly with simple processing, it is preferable to divide the program into basic blocks and process each one in turn.

【００６７】When both the optimization of this embodiment and other optimizations (for example, optimization to shorten the execution time of the control program, or to reduce the memory area used to store it) are applied to the assembly source, the effect of this embodiment is obtained regardless of the order in which the optimizations are performed. To obtain the effect most fully, however, it is desirable to perform the optimization of this embodiment last. Performing it last could alter the results of the other optimizations, but this can be prevented in the dependency analysis phase of step S100 (see FIG. 1): constraints are set in directed graphs such as those of FIGS. 8 and 9 so that the results of the other optimizations are not changed.

【００６８】The embodiment described above took as its example the optimization of a control program used in an information processing device whose execution unit and instruction bus are both 32 bits wide. That is, the control program optimized in this embodiment was assumed to control an information processing device that, as shown in FIG. 14(a), reads instructions one at a time and fetches, issues, decodes, and executes them. The Hamming distance to be considered in the optimization is therefore the Hamming distance between adjacent instructions, as shown in FIG. 15(a).

【００６９】However, as shown in FIG. 14(b), many CPUs today can read, fetch, and issue several instructions at once. With such a CPU, the Hamming distance to be considered during optimization is not that between adjacent instructions but that between instructions assigned to the same field and the same bit positions of the instruction bus. That is, when optimizing a control program for a CPU that reads two instructions at a time, as in FIG. 14(b), the optimization should reduce the Hamming distance between every other instruction (for example, C301 and C303, C302 and C304), as shown in FIG. 15(b). In this case, if bit strings are formed by concatenating the instructions two at a time, the optimization apparatus of this embodiment can be used as it is.

【００７０】FIG. 16(a) shows a device configured to read four instructions at a time. In this case too, as shown in FIG. 16(b), the optimization should reduce the Hamming distance between each instruction and the one four positions later (for example, C301 and C305, C302 and C306). If bit strings are formed by concatenating the instructions four at a time, the optimization apparatus of this embodiment can be used as it is.
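
The concatenation idea can be sketched as follows: the instructions fetched together are packed into one wide bus word, and the Hamming distance is then taken between consecutive bus words. This is only an illustration; the function name, the zero padding of an incomplete final group, and the 4-bit toy instructions are assumptions.

```python
def hamming(a, b):
    return bin(a ^ b).count("1")

def bus_transitions(instrs, per_fetch, width=32):
    """Total bit toggles on a bus carrying `per_fetch` instructions per fetch."""
    # Pad to a multiple of per_fetch (padding with 0 is an assumption).
    padded = list(instrs) + [0] * (-len(instrs) % per_fetch)
    bus_words = []
    for i in range(0, len(padded), per_fetch):
        word = 0
        for ins in padded[i:i + per_fetch]:
            word = (word << width) | ins      # concatenate into one bus word
        bus_words.append(word)
    return sum(hamming(x, y) for x, y in zip(bus_words, bus_words[1:]))

toy = [0xF, 0x0, 0xF, 0x0]                    # 4-bit toy instructions
single = bus_transitions(toy, 1, width=4)     # one instruction per fetch
double = bus_transitions(toy, 2, width=4)     # two instructions per fetch
```

The same sequence that toggles heavily on a single-instruction bus causes no toggling at all when two instructions share the bus, which is why the pairing of instructions, not their adjacency, is what matters for a wide fetch.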

【００７１】In some information processing devices, the bandwidth of the internal instruction bus differs from that of the external instruction bus. FIG. 17 shows a case in which the internal instruction bus is 128 bits wide while the external instruction bus is 32 bits wide. In such a case, the Hamming distance should be reduced between instructions four positions apart (with three instructions in between) for the internal bus, and between adjacent instructions for the external bus. Which to prioritize, or how to compromise between the two, can be decided as appropriate according to, for example, how much each contributes to the reduction in power consumption.

【００７２】(Embodiment 2) As Embodiment 2, an embodiment of the second invention (corresponding to claims 5 and 6) is described.

【００７３】Here, the case in which the execution unit and the instruction bus are both 32 bits wide is taken as an example.

【００７４】The instruction sequence optimization apparatus of this embodiment optimizes the control program by focusing on the variables to be assigned to registers. That is, it examines the amount of bit change over the section of the instruction sequence in which a variable appears and assigns the register number that minimizes it.

【００７５】In this embodiment, in expressions such as c = a - b and c = a / b, a is called the source, b the target, and c the destination; the register holding the value of a is called the source register, the register holding the value of b the target register, and the register holding the value of c the destination register. In a 32-bit instruction, counting from the MSB at the left, bits 1 to 10 and bits 21 to 27 form the instruction code field, bits 11 to 15 the destination register field, bits 16 to 20 the source register field, and bits 27 to 32 the target register field.

【００７６】The valid range of the data stored under a given register number extends from the instruction that stores the data into the register, in which that register number appears as the destination register number, to the instruction that needs the stored data, in which that register number appears as a source register or target register. Naturally, a single register number is reused as temporary storage for several variables or data items for the sake of efficiency. Therefore, to find the range over which one register number holds one particular register datum in the program, it is necessary to analyze the register allocation table produced when the compiler's optimizer assigns register numbers and to create a valid range table.

【００７７】The valid range table can easily be created from the register allocation table. A compiler normally generates a data flow graph or dependency graph from the source program and uses this graph to allocate registers for holding the data of variables or the intermediate results of temporary computations. Conventionally, when a variable appeared, its register was entered in the allocation table, and when the data flow graph or dependency graph showed that the variable was no longer needed, its entry was deleted from the register allocation table. In this way the compiler determined the address range over which the data held in the register of a given number is valid, that is, the period from when valid data is written by a load instruction or by the write of an operation result until the last operation instruction or store instruction that needs that data appears.
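
A valid range table of this kind can be sketched for straight-line code as follows. This is a simplified illustration under stated assumptions: the tuple format (address, destination, source, target), the function name, and the sample program are hypothetical; branches are not handled; and registers that are only read in the fragment (values defined earlier) are ignored.

```python
def valid_ranges(program):
    """program: list of (address, dest, src, tgt) register numbers (None if unused).
    Returns a list of (register, start_address, end_address) valid ranges."""
    open_def = {}      # register -> [address of write, address of last use]
    ranges = []
    for addr, dest, src, tgt in program:
        # Operands are read before the destination is written.
        for reg in (src, tgt):
            if reg in open_def:
                open_def[reg][1] = addr
        if dest is not None:
            if dest in open_def:           # a new write closes the previous range
                start, end = open_def[dest]
                ranges.append((dest, start, end))
            open_def[dest] = [addr, addr]
    for reg, (start, end) in open_def.items():
        ranges.append((reg, start, end))
    return sorted(ranges, key=lambda r: (r[1], r[0]))

program = [
    (0, 2, 0, 1),      # r2 <- r0 op r1
    (1, 5, 2, 3),      # r5 <- r2 op r3
    (2, 2, 5, 5),      # r2 <- r5 op r5 (r2 reused for new data)
    (3, None, 2, 5),   # store-like instruction reading r2 and r5
]
table = valid_ranges(program)
```

Register 2 appears twice in the resulting table because it holds two different data items, which is the reuse of register numbers described above.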

【００７８】When assigning register numbers, suppose for example that the source program is

    c = a + b
    c = c * d

with the assignment a: register 0, b: register 1, c: register 2, d: register 3. The compiled program may be

    (1) add r0, r1, r2
        mul r2, r3, r2

but the product c alone may instead be stored in register 5, that is,

    (2) add r0, r1, r2
        mul r2, r3, r5

In the latter case, several register numbers are assigned to the source-program variable c. Conventionally, as in program (1), assignment to multiple register numbers has been avoided as far as possible, because of limits on register resources, whenever data need not be saved from a register, except where registers have a special meaning or function, such as the global registers of the SPARC register windows. In this embodiment, by contrast, assignment to multiple registers is performed, as in program (2). This allows the evaluation range of an evaluation target to be split, so each evaluation range becomes narrower and the number of evaluation targets increases. On the other hand, where registers have a special meaning or function as mentioned above, data can be reassigned only to a register with the same function as the one originally assigned, so the range of choices is narrower and care is required.

【００７９】After the valid range table has been created, the Hamming distance is computed for the register number of interest, and another assignable register number is reassigned. During this reassignment, it is also possible to try assigning the data to further registers in order to split the evaluation range.

【００８０】In this embodiment, several register numbers can also be evaluated simultaneously. If only one register number is evaluated at a time, the number of register numbers available as replacements is limited; by evaluating several register numbers simultaneously, those register numbers can be treated as mutually interchangeable and the replacement can be optimized accordingly.

【００８１】Furthermore, the reassignment of registers in the register file may be performed either when the compiler performs register allocation or after allocation has once been performed.

【００８２】Next, a concrete example of the instruction sequence optimization apparatus of this embodiment is described with reference to FIGS. 18 to 20.

【００８３】FIG. 18 is a flowchart explaining the procedure of the optimization process performed by the instruction sequence optimization apparatus of this embodiment.

【００８４】First, a source program written in a high-level language or assembly language is compiled and other optimizations are applied, producing intermediate code (assembly code) (step S1801). FIG. 19(a) shows an example of intermediate code obtained in this way.

【００８５】Next, a register allocation table is created, and a valid range table is created from it (step S1802). FIG. 20 shows the valid range table for the program of FIG. 19(a).

【００８６】Then the data of interest for the current trial is selected (step S1803), and the register number assigned to it is replaced with the register number that gives the lowest Hamming distance (step S1804). Here the Hamming distances of the instructions at the boundaries of the valid range are also taken into account, so that the Hamming distance at these boundaries is reduced as well.

【００８７】Consider the case in which the data of interest is the data assigned to register number 0x1c. The valid range table of FIG. 20 shows that the valid range of this data runs from address 0101 to address 1000, so it suffices to evaluate addresses 0101 to 1000 for this data. In the program of FIG. 19(a), the instruction at address 1101 stores new data in register number 0x1c, but this lies outside the evaluation range and is not evaluated.

【００８８】In the program of FIG. 19(a), the sum of the Hamming distances around register number 0x1c is 14. Searching for other register numbers that minimize this sum shows that replacing 0x1c with register number 0 or 2 brings the sum down to 8. Because the valid range of register number 2 overlaps, register number 0x1c is replaced with register number 0. This yields the program shown in FIG. 19(b).
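
The replacement step S1804 can be sketched as a search over candidate register numbers. Everything concrete in this sketch is hypothetical: the `encode` packing is a simplified stand-in rather than the field layout of this embodiment, the program is not that of FIG. 19, and the candidate list is assumed to have already been filtered for registers whose valid ranges overlap.

```python
def hamming(a, b):
    return bin(a ^ b).count("1")

def encode(ins):
    """Hypothetical packing: opcode | rd | rs | rt, 5 bits per register."""
    op, rd, rs, rt = ins
    return (op << 15) | (rd << 10) | (rs << 5) | rt

def rename(program, lo, hi, old, new):
    """Rename `old` to `new` over the valid range lo..hi (instruction indices)."""
    out = list(program)
    for i in range(lo, hi + 1):
        op, rd, rs, rt = out[i]
        if i == lo:                  # defining instruction: destination only
            rd = new if rd == old else rd
        else:                        # later instructions: uses only
            rs = new if rs == old else rs
            rt = new if rt == old else rt
        out[i] = (op, rd, rs, rt)
    return out

def cost(program, lo, hi):
    """Hamming distance summed over the range and its two boundaries."""
    first, last = max(lo - 1, 0), min(hi + 1, len(program) - 1)
    words = [encode(p) for p in program[first:last + 1]]
    return sum(hamming(x, y) for x, y in zip(words, words[1:]))

def best_register(program, lo, hi, old, candidates):
    """Return (cost, register) for the cheapest choice, `old` included."""
    return min((cost(rename(program, lo, hi, old, r), lo, hi), r)
               for r in [old] + list(candidates))

program = [(1, 3, 1, 2), (1, 28, 1, 2), (2, 4, 28, 1), (2, 5, 28, 2), (1, 6, 1, 2)]
choice = best_register(program, 1, 3, 28, [0, 7])
```

In this toy program, register 28 (0x1c) costs 26 bit toggles over its range, register 7 costs 18, and register 0 costs 16, so register 0 is selected.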

【００８９】When the replacement in the program is complete, it is determined whether optimization has been completed for all data (step S1805). If data remains that has not been optimized, steps S1803 to S1805 are executed for that data. If optimization has been completed for all data, the optimized program is output and the optimization process ends.

【００９０】According to this embodiment, by storing a control program optimized as described above in the program memory of an information processing device and using it to control the CPU and so on, the power consumed on the instruction bus can be reduced.

【００９１】This embodiment has been described for the case in which the execution unit and the instruction bus are both 32 bits wide, but the second invention can of course also be applied where the addresses of several words are transferred simultaneously. For example, when transfers are made on a 4-word boundary, the Hamming distances to the instruction 4 words before and the instruction 4 words after the instruction of interest should be evaluated.

【００９２】(Embodiment 3) Next, as Embodiment 3, an embodiment of the third invention (corresponding to claim 7) is described.

【００９３】This embodiment is described taking as an example the application of the third invention to the "add" instruction.

【００９４】When instruction function codes are defined, taking the SPARC instruction function codes as a reference, the "add" instruction is "000000". Because instructions such as "add" occur very frequently, in this embodiment the instruction set is defined so that "111111" as well as "000000" denotes the "add" instruction. That is, in an information processing device that uses a control program optimized by the instruction sequence optimization apparatus of this embodiment, the CPU's instruction decoder is configured to decode both "000000" and "111111" as the "add" instruction. The compiler can then place either "000000" or "111111" in the instruction function code field when generating object code. Suppose the instruction function code fields of the instructions before and after an "add" instruction are "001110" and "110110". The Hamming distance is 7 if "000000" is chosen as the instruction function code of the "add" instruction and 5 if "111111" is chosen. In this case, therefore, the compiler selects "111111" as the instruction function code of the "add" instruction.
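
The choice between the two codes can be checked with a short sketch. The function and constant names are illustrative; the bit patterns are the ones given in the paragraph above.

```python
def hamming(a, b):
    return bin(a ^ b).count("1")

def choose_opcode(prev_code, next_code, aliases):
    """Pick the alias closest to both neighbours; returns (distance, alias)."""
    return min((hamming(prev_code, a) + hamming(a, next_code), a)
               for a in aliases)

ADD_ALIASES = (0b000000, 0b111111)    # both codes decode as "add"
distance, code = choose_opcode(0b001110, 0b110110, ADD_ALIASES)
print(distance, format(code, "06b"))  # prints "5 111111"
```

With "000000" the distances to the two neighbours are 3 and 4, giving 7; with "111111" they are 3 and 2, giving 5.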

【００９５】Next, a concrete example of the instruction sequence optimization apparatus of this embodiment is described with reference to FIGS. 21 and 22.

【００９６】FIG. 21 is a flowchart explaining the procedure of the optimization process performed by the instruction sequence optimization apparatus of this embodiment.

【００９７】First, a source program written in a high-level language or assembly language is compiled and other optimizations are applied, producing intermediate code (assembly code) (step S2101). At this point the instruction function code of every "add" instruction is assumed to be "000000".

【００９８】次に、各命令について、本実施例を適用す
る命令であるか否か、すなわち命令機能コードのフィー
ルドを複数割り当てられている命令（ここでは“ａｄ
ｄ”命令）であるか否かを、判断する（ステップＳ２１
０２）。Next, for each instruction, it is determined whether the instruction is one to which this embodiment applies, that is, whether it is an instruction to which a plurality of instruction function codes are assigned (here, the "add" instruction) (step S2102).

【００９９】そして、本実施例を適用する命令であると
判断された場合は、この命令に対して、置換が可能なビ
ットパターン（ここでは“１１１１１１”）を選出する
（ステップＳ２１０３）。If the instruction is determined to be one to which this embodiment applies, the bit patterns that can be substituted for it ("111111" in this case) are selected (step S2103).

【０１００】さらに、この命令に対応するビットパター
ンのすべてについて、その前後の命令とのハミング距離
を算出し、互いに比較することによって、ハミング距離
が最低となるようなビットパターンを選択する（ステッ
プＳ２１０４）。図２２に、“ａｄｄ”命令についての
最適化処理を行ったプログラムの例を示す。この例で
は、先に現れた“ａｄｄ”命令では命令機能コードを
“００００００”とした方がハミング距離が小さいので
置換を行わず、後に現れた“ａｄｄ”命令では命令機能
コードを“１１１１１１”とした方がハミング距離が小
さいので置換を行っている。Furthermore, for every bit pattern corresponding to this instruction, the Hamming distance to the preceding and succeeding instructions is calculated, and the results are compared to select the bit pattern that minimizes the Hamming distance (step S2104). FIG. 22 shows an example of a program that has been optimized for the "add" instruction. In this example, for the "add" instruction that appears first, the Hamming distance is smaller with the instruction function code "000000", so no replacement is performed; for the "add" instruction that appears later, the Hamming distance is smaller with "111111", so the replacement is performed.

【０１０１】プログラムの置換が終了すると、続いて、
すべてのデータについて最適化が終了したか否かを判定
する（ステップＳ２１０５）。そして、最適化が終了し
ていないデータが残っている場合には、そのデータにつ
いてステップＳ２１０３〜Ｓ２１０５を実行する。一
方、すべてのデータについて最適化が終了している場合
には、最適化処理後のプログラムを出力し、最適化処理
を終了する。When the replacement in the program is completed, it is then determined whether optimization has been completed for all data (step S2105). If data that has not yet been optimized remains, steps S2103 to S2105 are executed for that data. If optimization has been completed for all data, the optimized program is output and the optimization processing ends.
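The flow of steps S2102 to S2105 can be sketched as a single pass over the instruction function code fields. This is a minimal illustration under stated assumptions, not the patented implementation: the table `ALIASES`, the function `optimize_function_codes`, the greedy left-to-right order, and the sample code sequence are all hypothetical, and only the function code field of each instruction is modelled.

```python
def hamming(a, b):
    """Number of differing bit positions between two equal-length bit strings."""
    return sum(x != y for x, y in zip(a, b))

# Hypothetical alias table: every bit pattern the decoder accepts for an instruction.
# In this embodiment only "add" ("000000") has a second pattern ("111111").
ALIASES = {"000000": ("000000", "111111")}

def optimize_function_codes(codes):
    """Greedy left-to-right pass corresponding to steps S2102 to S2105."""
    out = list(codes)
    for i, code in enumerate(out):
        candidates = ALIASES.get(code)       # S2102: does the embodiment apply?
        if candidates is None:
            continue                          # no alias registered, keep as is

        def cost(pattern):                    # S2104: distance to both neighbours
            total = 0
            if i > 0:
                total += hamming(out[i - 1], pattern)
            if i + 1 < len(out):
                total += hamming(pattern, out[i + 1])
            return total

        out[i] = min(candidates, key=cost)    # S2103/S2104: keep the cheapest pattern
    return out                                # S2105: all instructions processed

program = ["000001", "000000", "000010", "001110", "000000", "110110"]
print(optimize_function_codes(program))
```

With this made-up sequence, the first "add" keeps "000000" (cost 2 against 10) and the second is replaced by "111111" (cost 5 against 7), which mirrors the behaviour described for FIG. 22.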

【０１０２】本実施例によれば、以上説明したようにし
て最適化を行った制御プログラムを情報処理装置のプロ
グラムメモリに格納し、この制御プログラムを用いてＣ
ＰＵ等の制御を行うことにより、命令バスにおける消費
電力を低減させることが可能となる。According to this embodiment, the control program optimized as described above is stored in the program memory of the information processing apparatus, and the CPU and other components are controlled using this control program, which makes it possible to reduce power consumption on the instruction bus.

【０１０３】（実施例４）次に、実施例４として、第４
の発明の一実施例（請求項８に対応する）について説明
する。(Embodiment 4) Next, as Embodiment 4, an embodiment of the fourth invention (corresponding to claim 8) will be described.

【０１０４】本実施例では、１種類の動作を行うための
実現方法が複数ある場合に、前後の命令とのハミング距
離が最も小さくなるように、その実現方法に係わる命令
を選択する。例えば、レジスタ０ｘ０ｄにデータ“０”
を書き込む場合、ＳＰＡＲＣのようにレジスタ番号０が
書き込みは意味がないが読み出しはデータ“０”を出力
する特別なレジスタとして定義されている場合、その実
現方法としては、以下のようなものがある。これらの実
現方法（すなわち命令）のうちで、前後の命令とのハミ
ング距離が最も小さくなるものを選択して、その命令を
置き換えることとする。In this embodiment, when there are a plurality of ways to implement one type of operation, the instruction for the implementation is selected so that the Hamming distance to the preceding and succeeding instructions is minimized. For example, consider writing data "0" to register 0x0d. If, as in SPARC, register number 0 is defined as a special register for which writes have no effect but reads return data "0", the implementations listed below are available. Among these implementations (that is, instructions), the one with the smallest Hamming distance to the preceding and succeeding instructions is selected, and the instruction is replaced with it.

【０１０５】ｍｏｖｒ０，ｒｄ（０ｘ０ｄにデータ“０”を移す命令）ａｄｄｒ０，ｒ０，ｒｄ（０＋０を０ｘ０ｄに格納させる命令）ｍｕｌｒ？，ｒ０，ｒｄ（ある値に０を掛けた値を０ｘ０ｄに格納させる命令）ｍｕｌｒ０，ｒ？，ｒｄ（０にある値を掛けた値を０ｘ０ｄに格納させる命令）ｘｏｒｒ？，ｒ？，ｒｄ（ある値と、これと同じ値との排他的論理和を取った結
果を０ｘ０ｄに格納させる命令）ｓｌｌｒ０．ｒ？，ｒｄ（０をある値だけ右にシフトさせた値を０ｘ０ｄに格納
させる命令）ｓｒｌｒ０，ｒ？，ｒｄ（０をある値だけ左にシフトさせた値を０ｘ０ｄに格納
させる命令）また、他の具体例としては、イミディエイト加算命令に
よるものがある。イミディエイト加算に対して、イミデ
ィエイト部分を２の補数としたイミディエイト減算は、
演算機能としてはまったく同じである。例えば、ａ＝ｂ＋５と、ａ＝ｂ−（−５）とは、同じ演算として扱われる。ここで、イミディエイ
ト加算とイミディエイト減算とを置き換えた場合、イミ
ディエイトデータを表す命令のフィールドが反転するの
で、両式を置き換えることによってハミング距離を低減
することができる場合がある。
mov r0, rd (moves data "0" to 0x0d)
add r0, r0, rd (stores 0 + 0 in 0x0d)
mul r?, r0, rd (stores in 0x0d the value obtained by multiplying some value by 0)
mul r0, r?, rd (stores in 0x0d the value obtained by multiplying 0 by some value)
xor r?, r?, rd (stores in 0x0d the exclusive OR of some value with the same value)
sll r0, r?, rd (stores in 0x0d the value obtained by shifting 0 left by some amount)
srl r0, r?, rd (stores in 0x0d the value obtained by shifting 0 right by some amount)
Another specific example uses the immediate addition instruction. An immediate subtraction whose immediate part is the two's complement of that of an immediate addition performs exactly the same operation. For example, a = b + 5 and a = b - (-5) are treated as the same operation. When an immediate addition and an immediate subtraction are interchanged, the instruction field that holds the immediate data is inverted, so interchanging the two forms can in some cases reduce the Hamming distance.
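The effect of interchanging immediate addition and immediate subtraction on the immediate field can be illustrated as follows. This is a sketch under assumptions: the 13-bit field width is borrowed from SPARC's signed immediate field, the neighbouring field value is made up, and the change in the instruction function code itself is ignored.

```python
IMM_BITS = 13                      # assumed width of the immediate field
MASK = (1 << IMM_BITS) - 1

def encode_imm(value: int) -> str:
    """Two's-complement encoding of a signed immediate as a bit string."""
    return format(value & MASK, f"0{IMM_BITS}b")

def hamming(a: str, b: str) -> int:
    """Number of differing bit positions between two equal-length bit strings."""
    return sum(x != y for x, y in zip(a, b))

add_field = encode_imm(5)          # a = b + 5
sub_field = encode_imm(-5)         # a = b - (-5)
print(add_field)                   # 0000000000101
print(sub_field)                   # 1111111111011
print(hamming(add_field, sub_field))    # 12 of 13 bits differ

# Hypothetical immediate field of a neighbouring instruction, mostly ones.
neighbour = "1111111110000"
print(hamming(add_field, neighbour))    # 11
print(hamming(sub_field, neighbour))    # 3
```

Because the two encodings are almost bitwise complements, whichever one is closer to the neighbouring bit strings can be chosen without changing the result of the computation.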

【０１０６】本実施例において、置き換えを行う候補を
選出するためには、例えば、いわゆるライブラリを予め
作製しておき、ある命令を評価するときにこのライブラ
リに置き換え候補が登録されているか否かを検索すれば
よい。検索の結果、置き換え候補が検索された場合に
は、この置き換え候補を採用した場合とハミング距離を
比較する。また、イミディエイト加算とイミディエイト
減算とを置き換えるためには、ライブラリに、イミディ
エイト加算の置き換え候補として、イミディエイト減算
を登録しておけばよい。このとき、イミディエイトデー
タを変換する方式或いは手順もライブラリに登録してお
けば、イミディエイト減算が検索されることによって変
換方式・手順も得られるようにすることができる。例え
ば、命令フィールドをイミディエイト減算に置き換える
とともに、イミディエイトデータを２の補間を取ったも
のに置き換えるといった手順を採用することができる。
そして、このような手順で得られた命令データを、ハミ
ング距離の比較対象として採用する。また、単にライブ
ラリの検索を行うのではなく、検索が可能であるか否か
を判断した後で、可能である場合には検索を行うことと
してもよい。In this embodiment, replacement candidates can be selected by, for example, preparing a so-called library in advance and, when an instruction is evaluated, searching this library to see whether a replacement candidate is registered for it. If a replacement candidate is found, the Hamming distance obtained with the candidate is compared with that of the original instruction. To allow immediate addition and immediate subtraction to be interchanged, immediate subtraction is registered in the library as a replacement candidate for immediate addition. If the method or procedure for converting the immediate data is registered in the library as well, the conversion method or procedure is obtained at the same time as the immediate subtraction is found. For example, a procedure can be adopted in which the instruction field is replaced with that of immediate subtraction and the immediate data is replaced with its two's complement. The instruction data obtained by such a procedure is then used in the Hamming distance comparison. Alternatively, rather than always searching the library, it may first be determined whether a search is possible, and the search performed only when it is.
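One way to picture such a library is a table that maps an instruction to its replacement candidates together with the conversion to apply to the immediate data. The sketch below is an assumption made for illustration: the names `LIBRARY` and `candidates`, and the simplified representation of an instruction as a (mnemonic, immediate) pair, are not taken from the patent.

```python
from typing import Callable, Dict, List, Tuple

Instr = Tuple[str, int]     # (mnemonic, immediate): a deliberately simplified instruction

# Hypothetical library: mnemonic -> list of (replacement mnemonic, immediate conversion).
LIBRARY: Dict[str, List[Tuple[str, Callable[[int], int]]]] = {
    "addi": [("subi", lambda imm: -imm)],   # a = b + imm  is the same as  a = b - (-imm)
    "subi": [("addi", lambda imm: -imm)],
}

def candidates(instr: Instr) -> List[Instr]:
    """Return the instruction itself plus every equivalent registered in the library."""
    mnemonic, imm = instr
    found = [instr]
    for new_mnemonic, convert in LIBRARY.get(mnemonic, []):
        # The conversion procedure is stored with the candidate, so a successful
        # lookup also yields the converted immediate data.
        found.append((new_mnemonic, convert(imm)))
    return found

print(candidates(("addi", 5)))   # [('addi', 5), ('subi', -5)]
print(candidates(("ld", 0)))     # [('ld', 0)]  (nothing registered)
```

Each candidate returned this way would then be encoded and compared by Hamming distance against the neighbouring instructions.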

【０１０７】次に、本実施例の命令列最適化装置の具体
的な例について、図２３および図２４を用いて説明す
る。Next, a concrete example of the instruction sequence optimizing apparatus of this embodiment will be described with reference to FIGS. 23 and 24.

【０１０８】図２３は、本実施例の命令列最適化装置が
行う最適化処理の手順を説明するためのフローチャート
である。FIG. 23 is a flow chart for explaining the procedure of the optimizing process performed by the instruction sequence optimizing apparatus of this embodiment.

【０１０９】まず、高級言語或いはアセンブリ言語で作
製されたソース・プログラムをコンパイルし、さらに他
の最適化処理を施すことにより、中間コード（アセンブ
リ・コード）を作製する（ステップＳ２３０１）。First, an intermediate code (assembly code) is created by compiling a source program created in a high-level language or an assembly language and further performing another optimization process (step S2301).

【０１１０】次に、各命令について、本実施例を適用す
る命令であるか否か、すなわちライブラリに置き換え候
補が登録されている命令であるか否かを、判断する（ス
テップＳ２３０２）。Next, it is determined whether or not each instruction is an instruction to which the present embodiment is applied, that is, whether or not a replacement candidate is registered in the library (step S2302).

【０１１１】そして、本実施例を適用する命令であると
判断された場合は、ライブラリの検索を行って、置換が
可能な命令を選出する（ステップＳ２３０３）。また、
このとき、命令動作を解析して同等の命令を生成するこ
ととしてもよい。If the instruction is determined to be one to which this embodiment applies, the library is searched to select replaceable instructions (step S2303). Alternatively, at this point the operation of the instruction may be analyzed to generate an equivalent instruction.

【０１１２】さらに、本実施例を適用する命令およびラ
イブラリで検索された命令について、その前後の命令と
のハミング距離を算出する。そして、各算出結果を互い
に比較することによって、ハミング距離が最低となるよ
うな命令を選択する（ステップＳ２３０４）。図２４に
おいて、（ａ）は本実施例による最適化を行う前のプロ
グラム例であり、（ｂ）は最適化後のプログラム例であ
る。同図において“ａｄｄｉ”命令（イミディエイト加
算命令）を“ｓｕｂｉ”命令（イミディエイト減算命
令）に置き換えることにより、その前後の命令との間の
ハミング距離を２６から２０に低減させることができ
た。Furthermore, for the instruction to which this embodiment applies and for each instruction found in the library, the Hamming distance to the preceding and succeeding instructions is calculated. The results are then compared to select the instruction that minimizes the Hamming distance (step S2304). In FIG. 24, (a) is an example program before the optimization of this embodiment and (b) is the example program after optimization. In the figure, replacing the "addi" instruction (immediate addition) with the "subi" instruction (immediate subtraction) reduced the Hamming distance to the preceding and succeeding instructions from 26 to 20.

【０１１３】命令の置換が終了すると、続いて、すべて
のデータについて最適化が終了したか否かを判定する
（ステップＳ２３０５）。そして、最適化が終了してい
ないデータが残っている場合には、そのデータについて
ステップＳ２３０３〜Ｓ２３０５を実行する。一方、す
べてのデータについて最適化が終了している場合には、
最適化処理後のプログラムを出力し、最適化処理を終了
する。When the replacement of the instruction is completed, it is then determined whether optimization has been completed for all data (step S2305). If data that has not yet been optimized remains, steps S2303 to S2305 are executed for that data. If optimization has been completed for all data, the optimized program is output and the optimization processing ends.

【０１１４】本実施例によれば、以上説明したようにし
て最適化を行った制御プログラムを情報処理装置のプロ
グラムメモリに格納し、この制御プログラムを用いてＣ
ＰＵ等の制御を行うことにより、命令バスにおける消費
電力を低減させることが可能となる。According to this embodiment, the control program optimized as described above is stored in the program memory of the information processing apparatus, and the CPU and other components are controlled using this control program, which makes it possible to reduce power consumption on the instruction bus.

【０１１５】（実施例５）次に、実施例５として、第５
の発明の一実施例（請求項９に対応する）について説明
する。(Embodiment 5) Next, as Embodiment 5, an embodiment of the fifth invention (corresponding to claim 9) will be described.

【０１１６】本実施例では、１種類の動作を行うための
実現方法が複数ある場合に、作動する機能ブロックが小
さく、消費電力が小さくなるなるように、その実現方法
に係わる命令を置き換える。すなわち、データ線のばら
つき、使用する機能ブロックの消費電力などを考慮し、
総合的な消費電力が最小となるように、命令の置き換え
を行う。置き換えの方法としては、上述の実施例４の場
合と同様、ライブラリを使用することができる。例え
ば、レジスタ０ｘ０ｄにデータ“０”を書き込む場合、
採用する命令と使用する機能ブロックとの関係は、表１
のようになる。これらの命令のうちで、消費電力が最も
小さくなるものを選択して、その命令を置き換えること
とする。In this embodiment, when there are a plurality of ways to implement one type of operation, the instruction is replaced with the implementation for which the functional blocks that operate are small and power consumption is low. That is, instructions are replaced so that total power consumption is minimized, taking into account factors such as variation on the data lines and the power consumption of the functional blocks used. As in Embodiment 4 above, a library can be used for the replacement. For example, when writing data "0" to register 0x0d, the relationship between the instruction adopted and the functional blocks used is as shown in Table 1. Among these instructions, the one with the lowest power consumption is selected, and the instruction is replaced with it.

【０１１７】[0117]

【表１】次に、本実施例の命令列最適化装置の具体的な例につい
て、図２５を用いて説明する。[Table 1] Next, a specific example of the instruction sequence optimizing apparatus of this embodiment will be described with reference to FIG.

【０１１８】図２５は、本実施例の命令列最適化装置が
行う最適化処理の手順を説明するためのフローチャート
である。FIG. 25 is a flow chart for explaining the procedure of the optimizing process performed by the instruction sequence optimizing apparatus of this embodiment.

【０１１９】まず、高級言語或いはアセンブリ言語で作
製されたソース・プログラムをコンパイルし、さらに他
の最適化処理を施すことにより、中間コード（アセンブ
リ・コード）を作製する（ステップＳ２５０１）。First, an intermediate code (assembly code) is created by compiling a source program created in a high-level language or an assembly language and further performing another optimization process (step S2501).

【０１２０】次に、各命令について、本実施例を適用す
る命令であるか否か、すなわちライブラリに置き換え候
補が登録されている命令であるか否かを、判断する（ス
テップＳ２５０２）。Then, it is determined whether or not each instruction is an instruction to which the present embodiment is applied, that is, whether or not the replacement candidate is registered in the library (step S2502).

【０１２１】そして、本実施例を適用する命令であると
判断された場合は、ライブラリの検索を行って、置換が
可能な命令を選出する（ステップＳ２５０３）。また、
このとき、命令動作を解析して同等の命令を生成するこ
ととしてもよい。If the instruction is determined to be one to which this embodiment applies, the library is searched to select replaceable instructions (step S2503). Alternatively, at this point the operation of the instruction may be analyzed to generate an equivalent instruction.

【０１２２】さらに、本実施例を適用する命令およびラ
イブラリで検索された命令について、消費電力を試算す
る。そして、各算出結果を互いに比較することによっ
て、消費電力が最低となるような命令を選択する（ステ
ップＳ２５０４）。Furthermore, power consumption is estimated for the instruction to which this embodiment applies and for each instruction found in the library. The estimates are then compared to select the instruction with the lowest power consumption (step S2504).
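The estimate in step S2504 can be pictured as a cost model that adds the cost of the functional blocks an instruction activates to the switching cost on the instruction bus. Everything numeric below is hypothetical: the contents of Table 1 are not reproduced in this text, so the block names, the relative costs, the toggle counts, and the names `BLOCK_COST`, `ZERO_CANDIDATES`, and `estimated_power` are assumptions made purely for illustration.

```python
# Hypothetical relative energy cost of each functional block (illustrative numbers only).
BLOCK_COST = {"register_file": 1.0, "alu": 2.0, "shifter": 3.0, "multiplier": 8.0}

# Hypothetical library entries: ways to write 0 into a register and the
# functional blocks each one is assumed to activate.
ZERO_CANDIDATES = {
    "mov r0, rd":     ["register_file"],
    "add r0, r0, rd": ["register_file", "alu"],
    "sll r0, r0, rd": ["register_file", "shifter"],
    "mul r0, r0, rd": ["register_file", "multiplier"],
}

BUS_COST_PER_TOGGLE = 0.5   # assumed cost of one bit transition on the instruction bus

def estimated_power(blocks, bus_toggles):
    """Functional-block cost plus instruction-bus switching cost."""
    return sum(BLOCK_COST[b] for b in blocks) + BUS_COST_PER_TOGGLE * bus_toggles

def select(candidates, toggles):
    """Pick the candidate with the lowest estimated total power."""
    return min(candidates,
               key=lambda name: estimated_power(candidates[name], toggles[name]))

# Made-up Hamming distances of each candidate to its neighbouring instructions.
toggles = {"mov r0, rd": 8, "add r0, r0, rd": 3, "sll r0, r0, rd": 4, "mul r0, r0, rd": 2}
print(select(ZERO_CANDIDATES, toggles))
```

With these invented figures the "add" form is selected: it activates more logic than "mov" but causes fewer bus transitions, which shows why the two contributions are evaluated together rather than separately.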

【０１２３】命令の置換が終了すると、続いて、すべて
のデータについて最適化が終了したか否かを判定する
（ステップＳ２５０５）。そして、最適化が終了してい
ないデータが残っている場合には、そのデータについて
ステップＳ２５０３〜Ｓ２５０５を実行する。一方、す
べてのデータについて最適化が終了している場合には、
最適化処理後のプログラムを出力し、最適化処理を終了
する。When the replacement of the instruction is completed, it is then determined whether optimization has been completed for all data (step S2505). If data that has not yet been optimized remains, steps S2503 to S2505 are executed for that data. If optimization has been completed for all data, the optimized program is output and the optimization processing ends.

【０１２４】本実施例によれば、以上説明したようにし
て最適化を行った制御プログラムを情報処理装置のプロ
グラムメモリに格納し、この制御プログラムを用いてＣ
ＰＵ等の制御を行うことにより、命令バスにおける消費
電力を低減させることが可能となる。According to this embodiment, the control program optimized as described above is stored in the program memory of the information processing apparatus, and the CPU and other components are controlled using this control program, which makes it possible to reduce power consumption on the instruction bus.

【０１２５】[0125]

【発明の効果】以上詳細に説明したように、本発明によ
れば、情報処理装置用制御プログラムの作成段階で消費
電力を低減させるための最適化処理を行うことができ
る、命令列最適化装置を提供することができる。As described in detail above, the present invention makes it possible to provide an instruction sequence optimizing device that can perform optimization to reduce power consumption at the stage of creating a control program for an information processing device.

[Brief description of drawings]

【図１】実施例１の概念を概略的に示すフローチャート
である。FIG. 1 is a flowchart schematically showing the concept of the first embodiment.

【図２】図１を具体化した例を示すフローチャートであ
る。FIG. 2 is a flowchart showing an example in which FIG. 1 is embodied.

【図３】実施例１に係わる命令列最適化装置で最適化さ
れるプログラムの一例を示す図である。FIG. 3 is a diagram showing an example of a program optimized by the instruction sequence optimizing apparatus according to the first embodiment.

【図４】図３に示したプログラムをコンパイルしたアセ
ンブリ・ソース・プログラムリストを示す図である。FIG. 4 is a diagram showing an assembly source program list obtained by compiling the program shown in FIG. 3.

【図５】図３に示したプログラムをコンパイルしたアセ
ンブリ・ソース・プログラムリストを示す図である。FIG. 5 is a diagram showing an assembly source program list obtained by compiling the program shown in FIG. 3.

【図６】図３に示したプログラムをコンパイルしたアセ
ンブリ・ソース・プログラムリストを示す図である。FIG. 6 is a diagram showing an assembly source program list obtained by compiling the program shown in FIG. 3.

【図７】図３に示したプログラムをコンパイルしたアセ
ンブリ・ソース・プログラムリストを示す図である。FIG. 7 is a diagram showing an assembly source program list obtained by compiling the program shown in FIG. 3.

【図８】図３における基本ブロックの依存関係の解析結
果を示す有向グラフである。FIG. 8 is a directed graph showing the result of analyzing the dependency relations of the basic blocks in FIG. 3.

【図９】図３における基本ブロックの依存関係の解析結
果を示す有向グラフである。FIG. 9 is a directed graph showing the result of analyzing the dependency relations of the basic blocks in FIG. 3.

【図１０】図３における基本ブロックのビット・パター
ンを示す図である。FIG. 10 is a diagram showing the bit patterns of a basic block in FIG. 3.

【図１１】実施例１による最適化処理後のビット・パタ
ーンを示す図である。FIG. 11 is a diagram showing a bit pattern after optimization processing according to the first embodiment.

【図１２】実施例１による最適化処理の効果を説明する
ためのビット・パターンを示す参考図である。FIG. 12 is a reference diagram showing a bit pattern for explaining the effect of the optimization processing according to the first embodiment.

【図１３】実施例１の変形例を説明するためのフローチ
ャートである。FIG. 13 is a flowchart for explaining a modified example of the first embodiment.

【図１４】（ａ）、（ｂ）ともに、実施例１で最適化さ
れたプログラムを使用する装置の一構成例を示す概念図
である。FIGS. 14(a) and 14(b) are conceptual diagrams each showing a configuration example of an apparatus that uses the program optimized in the first embodiment.

【図１５】（ａ）、（ｂ）ともに図１４に示した装置で
使用するプログラムのビット・パターンを示す図であ
る。FIGS. 15(a) and 15(b) are diagrams each showing bit patterns of a program used in the apparatus shown in FIG. 14.

【図１６】（ａ）は実施例１で最適化されたプログラム
を使用する装置の一構成例を示す概念図、（ｂ）は
（ａ）に示した装置で使用するプログラムのビット・パ
ターンを示す図である。FIG. 16(a) is a conceptual diagram showing a configuration example of an apparatus that uses the program optimized in the first embodiment, and FIG. 16(b) is a diagram showing bit patterns of the program used in the apparatus shown in (a).

【図１７】実施例１で最適化されたプログラムを使用す
る装置の一構成例を示す概念図である。FIG. 17 is a conceptual diagram showing a configuration example of an apparatus using the program optimized in the first embodiment.

【図１８】実施例２の命令列最適化装置が行う最適化処
理の手順を説明するためのフローチャートである。FIG. 18 is a flowchart illustrating a procedure of an optimization process performed by the instruction sequence optimization device according to the second exemplary embodiment.

【図１９】（ａ）は実施例２の命令列最適化装置が行う
最適化処理で使用する中間コードのプログラム例を示す
図、（ｂ）は（ａ）のプログラムを最適化した結果を示
す図である。FIG. 19(a) is a diagram showing an example program in intermediate code used in the optimization processing performed by the instruction sequence optimizing apparatus of the second embodiment, and FIG. 19(b) is a diagram showing the result of optimizing the program of (a).

【図２０】図１９（ａ）に示したプログラムの有効範囲
テーブルを示す図である。FIG. 20 is a diagram showing a valid range table for the program shown in FIG. 19(a).

【図２１】実施例３の命令列最適化装置が行う最適化処
理の手順を説明するためのフローチャートである。FIG. 21 is a flow chart for explaining a procedure of optimization processing performed by the instruction sequence optimization device of the third embodiment.

【図２２】実施例３の最適化処理を“ａｄｄ”命令につ
いて行ったプログラムを示す図である。FIG. 22 is a diagram showing a program in which the optimization processing of the third embodiment is performed for the “add” instruction.

【図２３】実施例４の命令列最適化装置が行う最適化処
理の手順を説明するためのフローチャートである。FIG. 23 is a flowchart for explaining a procedure of optimization processing performed by the instruction sequence optimization device of the fourth embodiment.

【図２４】（ａ）は実施例４による最適化を行う前のプ
ログラムの一例を示す図、（ｂ）は（ａ）のプログラム
を最適化した後のプログラムを示す図である。FIG. 24(a) is a diagram showing an example of a program before optimization according to the fourth embodiment, and FIG. 24(b) is a diagram showing the program of (a) after optimization.

【図２５】実施例５の命令列最適化装置が行う最適化処
理の手順を説明するためのフローチャートである。FIG. 25 is a flowchart illustrating a procedure of optimization processing performed by an instruction string optimization device according to a fifth exemplary embodiment.

【図２６】携帯型情報処理装置の制御部の概略構成を示
すブロック図である。FIG. 26 is a block diagram showing a schematic configuration of a control unit of the portable information processing device.

[Explanation of symbols]

２６１０　ＣＰＵ
２６１１　実行ユニット
２６１２　入出力部
２６１３　レジスタ部
２６２０　プログラムメモリ
２６２１　記憶部
２６３１　アドレスバス
２６３２　命令バス

2610 CPU
2611 Execution unit
2612 Input/output unit
2613 Register unit
2620 Program memory
2621 Storage unit
2631 Address bus
2632 Instruction bus

フロントページの続き
(56)参考文献　特開平５−135187（ＪＰ，Ａ）　特開平５−250269（ＪＰ，Ａ）　特開平５−165727（ＪＰ，Ａ）　特開平８−101773（ＪＰ，Ａ）
(58)調査した分野(Int.Cl.⁷，ＤＢ名)　G06F 9/45　G06F 1/32

Continuation of the front page
(56) References: JP-A-5-135187 (JP, A); JP-A-5-250269 (JP, A); JP-A-5-165727 (JP, A); JP-A-8-101773 (JP, A)
(58) Fields searched (Int. Cl.⁷, DB name): G06F 9/45; G06F 1/32

Claims

(57) [Claims]

1. An instruction sequence optimizing apparatus for optimizing a program for use by an information processing apparatus having a program memory for storing the program and an arithmetic processing unit for fetching the program from the program memory via an instruction bus, the instruction sequence optimizing apparatus comprising: instruction sequence analyzing means for analyzing the mutual dependency relations among the instructions constituting the program; instruction sequence changing means for reducing the Hamming distance between bit strings appearing on the instruction bus when the instructions are transferred from the program memory to the arithmetic processing unit, by changing the order of the instructions within a range that does not affect the dependency relations analyzed by the instruction sequence analyzing means; and basic block dividing means for dividing the program into basic blocks and then sending the basic blocks to the instruction sequence analyzing means, wherein the instruction sequence changing means performs the order determination processing within a basic block in consideration of the Hamming distance between the last bit string of the basic block for which the instruction order determination processing was performed immediately before and the first bit string of the basic block for which the instruction order determination processing is currently being performed.

2. The instruction sequence optimizing apparatus according to claim 1, wherein, when an instruction includes a bit string that is not taken into account when the program is executed by the arithmetic processing unit, the signal values of that bit string are changed so that the Hamming distance to the bit strings before and after it is reduced.

3. An instruction sequence optimizing apparatus for optimizing a program for use by an information processing apparatus comprising a plurality of registers for temporarily storing data, a program memory for storing the program, and an arithmetic processing unit for writing data to and reading data from the registers in accordance with instructions fetched from the program memory via an instruction bus, the instruction sequence optimizing apparatus comprising: register number recognizing means for recognizing the register number in each instruction constituting the program; register valid range recognizing means for recognizing the valid range of each register number recognized by the register number recognizing means; and instruction sequence changing means for reducing the Hamming distance between bit strings appearing on the instruction bus when an instruction including the register number is transferred from the program memory to the arithmetic processing unit, by changing the register number within a range that does not affect the valid range recognized by the register valid range recognizing means.

4. The instruction sequence optimizing apparatus according to claim 3, wherein the instruction sequence changing means comprises: formulating means for formulating, for each register number recognized by the register number recognizing means, the register numbers that can replace it without affecting the valid range recognized by the register valid range recognizing means; selecting means for selecting, from among the register number recognized by the register number recognizing means and the register numbers formulated by the formulating means, the register number that minimizes the Hamming distance between bit strings appearing on the instruction bus; and replacing means for replacing the register number in the program with the register number selected by the selecting means.

5. An instruction sequence optimizing apparatus for optimizing a program for use by an information processing apparatus having a program memory for storing the program and an arithmetic processing unit for fetching the program from the program memory via an instruction bus, the instruction sequence optimizing apparatus comprising: storage means for storing, for some or all of the instructions constituting the program, another bit pattern that denotes the same instruction; and instruction sequence changing means for reducing the Hamming distance between bit strings appearing on the instruction bus when the instruction is transferred from the program memory to the arithmetic processing unit, by replacing the instruction in the program with the bit pattern stored in the storage means.

6. An instruction sequence optimizing apparatus for optimizing a program for use by an information processing apparatus having a program memory for storing the program and an arithmetic processing unit for fetching the program from the program memory via an instruction bus, the instruction sequence optimizing apparatus comprising: selecting means for selecting, for an instruction or instruction sequence in the program, another instruction or instruction sequence that yields the same processing result; and instruction sequence changing means for reducing the Hamming distance between bit strings appearing on the instruction bus when the instruction or instruction sequence is transferred from the program memory to the arithmetic processing unit, by replacing the instruction or instruction sequence in the program with the instruction or instruction sequence selected by the selecting means.

7. An instruction sequence optimizing apparatus for optimizing a program for use by an information processing apparatus having a program memory for storing the program and an arithmetic processing unit for fetching the program from the program memory via an instruction bus, the instruction sequence optimizing apparatus comprising: selecting means for selecting, for an instruction or instruction sequence in the program, another instruction or instruction sequence that yields the same processing result; arithmetic means for estimating, for the instruction or instruction sequence in the program and for the instruction or instruction sequence selected by the selecting means, the power consumption on the instruction bus when that instruction or instruction sequence is transferred from the program memory to the arithmetic processing unit; and instruction sequence changing means for replacing the instruction or instruction sequence in the program with whichever of the instruction or instruction sequence selected by the selecting means and the instruction or instruction sequence in the program has the smaller power consumption as estimated by the arithmetic means.