JP2002304302A

JP2002304302A - Optimizing device and optimizing method for microprocessor object code, and recording medium recorded with optimizing program

Info

Publication number: JP2002304302A
Application number: JP2001104924A
Authority: JP
Inventors: Manabu Ezaki; 学江崎
Original assignee: Renesas Micro Systems Co Ltd
Current assignee: Renesas Micro Systems Co Ltd
Priority date: 2001-04-03
Filing date: 2001-04-03
Publication date: 2002-10-18
Anticipated expiration: 2021-04-03
Also published as: JP3758984B2

Abstract

PROBLEM TO BE SOLVED: To relocate data codes in object code so as to improve the execution speed of a program even when the data codes lie outside the displacement range. SOLUTION: In this optimizing device, a simulator 1 analyzes the data access instructions in the primary object code F1. The simulator 1 has a data access information generation unit 14 that outputs each data address and size to data access information F3; a data relocation unit 15 that refers to the data access information F3, sorts the data codes in descending order of access frequency per address, selects the data code of maximum size at the same address as a selected data code, relocates the selected data codes in a cache area in descending order of access frequency, and corrects the instruction codes; and a secondary object code generation unit 16 that generates the relocated data and the corrected instruction codes as secondary object code F4.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to an apparatus and a method for optimizing object code for a microprocessor, and to a recording medium on which an optimization program is recorded. More particularly, it relates to an apparatus, a method, and a recorded optimization program that analyze the instruction codes of primary object code generated by a compiler that optimizes object code for a microprocessor, and that generate secondary object code whose instruction codes perform access with a single instruction.

[0002]

2. Description of the Related Art

Conventionally, this kind of optimization of object code for a microprocessor is realized by a data processing device reading and executing a compiling program, or compiler. It is used to take the code of a source program (the primary object code generated by the compiler) as input and to output object code (secondary object code) that raises the execution speed of the program on the target microprocessor.

[0003] Here, the primary object code generated by the compiler means a program written in a high-level programming language such as C, FORTRAN, or COBOL that has been translated into machine-language instruction codes and data codes for the target CPU.

[0004] For example, for pipelined microprocessors, an instruction code scheduling scheme has been proposed that raises program execution speed while avoiding conflicts between instructions. Among instructions that do not depend on one another, the compiler automatically places earlier those instructions that have a short delay spent waiting for the results of other instructions.

[0005] At present, microprocessors commonly place a high-speed, small-capacity cache memory between the CPU and main memory to speed up access to data codes held in main memory. Even so, execution still slows down because of cache misses, in which the required data code is not present in the cache. To reduce this slowdown, schemes have been devised in which the compiler performs instruction scheduling to reduce the cache miss penalty during instruction execution.

[0006] FIG. 13(A) is a block diagram of a compiling apparatus, which is the conventional apparatus for optimizing object code for a microprocessor described in Japanese Patent Application Laid-Open No. 10-333916. Referring to this figure, the conventional apparatus comprises a compiling unit 2, which compiles input code F5 (a program for compilation recorded on a recording medium) using profile data F2 to generate object code F100, and a simulator 100, a machine or simulator that simulates the object code F100 and generates the profile data F2.

[0007] The input code F5 is written in a high-level language such as C, JAVA (registered trademark), or FORTRAN.

[0008] The compiling unit 2 comprises, as software functional means that execute the input code F5, a front end 21 that receives the input code F5 and performs lexical analysis and syntax analysis on it, and a back end 22 described below.

[0009] The back end 22 comprises a code scheduling unit 221, which schedules instruction codes so as to make the cache miss penalty as small as possible based on the profile data F2 (the result of simulating the object code F100), and an object code generation unit 222, which generates object code F100 executable on the simulator 100 based on the scheduling result of the code scheduling unit 221.

[0010] FIG. 13(B) is a block diagram of a configuration example of the simulator 100. Referring to this figure, the simulator 100 comprises an instruction code analysis unit 11, which analyzes the instruction codes of the primary object code F100 (the object code generated by the compiling unit 2); an instruction simulation unit 12, which executes the analyzed instruction codes; and a profile data generation unit 13, which generates profile data and stores it in the profile data F2.

[0011] Next, the operation of the conventional apparatus for optimizing object code for a microprocessor is described with reference to FIGS. 13(A) and 13(B). First, the front end 21 of the compiling unit 2 receives the input code F5, performs lexical analysis and syntax analysis on it, and supplies the analysis result to the back end 22.

[0012] Next, the code scheduling unit 221 of the back end 22, when enabled, schedules instruction codes so as to make the cache miss penalty as small as possible based on the profile data F2. When disabled, it is inactive and does nothing.

[0013] When enabled, the code scheduling unit 221 first analyzes the profile data F2, which is a record of CPU operation obtained by executing the object code F100 on the simulator 100. It detects the portions where cache miss penalties occur and generates cache operation information for use by the code scheduling execution unit 224. It then reschedules the instruction codes to reduce the cache miss penalties detected from this cache operation information.

[0014] Here, the cache operation information indicates, for each operating clock, whether an operation of reading cache-missed data from main memory into the cache is being performed.

[0015] The object code generation unit 222 receives the result of the instruction code rescheduling by the code scheduling unit 221, generates object code executable on the simulator 100, and outputs it to the object code F100.

[0016] In the simulator 100, the instruction code analysis unit 11 first analyzes the instruction codes (hereinafter, primary instruction codes) of the object code F100, which is the primary object code generated by the compiling unit 2. Next, the instruction simulation unit 12 executes the analyzed primary instruction codes. Finally, the profile data generation unit 13 generates profile data and outputs it to the profile data F2.

[0017] Thus, in the prior art, the cache miss penalties that slow execution are analyzed based on the profile data F2 obtained by execution on the simulator 100. Code scheduling is then performed to reschedule the object code F100 so as to make the cache miss penalty as small as possible, and the object code generation unit 222 generates the final object code F100, thereby improving its execution speed.

[0018] In the data code access instructions of a microprocessor's machine language, the displacement (offset) of the data code that can be accessed is generally limited. For example, the displacement may take only a value of up to 16 bits. For this reason, a conventional apparatus for optimizing object code for a microprocessor sets a pointer at an arbitrary position in the data code area and stores that pointer in a register dedicated to it, so that the data code area is accessed using an offset (displacement) from that register. One known technique places as many small data codes as possible near the pointer, so that as many data codes as possible can be accessed with a single data code access instruction, which improves execution speed.
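As an illustration of the displacement limit described in paragraph [0018], the following Python sketch checks whether a data code can be reached from the pointer with one instruction. The function names are hypothetical, and treating the 16-bit displacement as a signed value is an assumption; the text states only that the displacement is limited to 16 bits.

```python
DISP_BITS = 16  # displacement width used as the example in the text


def fits_in_displacement(target, pointer, bits=DISP_BITS):
    """Return True if `target` is reachable from `pointer` with one access.

    Assumption: the displacement is a signed two's-complement field.
    """
    offset = target - pointer
    return -(1 << (bits - 1)) <= offset < (1 << (bits - 1))


def count_single_instruction(pointer, addresses):
    """Count how many data codes can be accessed with a single instruction."""
    return sum(1 for a in addresses if fits_in_displacement(a, pointer))
```

Under these assumptions, placing small data codes close to the pointer raises the value returned by `count_single_instruction`, which is the effect the conventional technique aims for.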

[0019] However, a data code that lies outside the accessible displacement range (hereinafter, an out-of-displacement-range data code) still cannot be accessed with one instruction and must be accessed with multiple instructions. Consequently, a program that frequently accesses out-of-displacement-range data codes suffers reduced execution speed.

[0020] Next, the reason this problem arises is explained with reference to FIG. 14(A), an explanatory diagram of the data code area for a microprocessor whose data code access instructions can take only a value of up to 16 bits as a displacement (hereinafter, 16-bit displacement), and FIGS. 14(B) and 14(C), which show examples of instruction codes.

[0021] Referring to FIG. 14(A), the pointer P1 shown in the figure points to an arbitrary position in the data code area and is used to access the data code area at high speed. "Sdata" is placed in the data code area 901, which can be accessed from the pointer P1 with a 16-bit displacement. "Sdata" can therefore be accessed with a single data code access instruction ld.w, as shown in FIG. 14(B). This ld.w instruction fetches the value of Sdata stored at displacement $Sdata from [gp], which denotes the pointer of FIG. 14(A), and stores it in general-purpose register r20.

[0022] However, "Data" and "Data1" in FIG. 14(A) are placed in the data code areas 900 and 901, respectively, which cannot be accessed from the pointer P1 with a 16-bit displacement. As shown in FIG. 14(C), they cannot be expressed by the 16-bit displacement field of a data code access instruction, so two instructions are required. First, the movhi instruction extracts the upper 16 bits of Data and stores them in general-purpose register r1. Next, the data code access instruction ld.w expresses a 32-bit displacement from the lower 16 bits of Data and general-purpose register r1, fetches the value of Data, and stores it in general-purpose register r20. As a result, a program that frequently accesses data codes placed in the data code areas 900 and 901, which cannot be accessed with a 16-bit displacement, suffers reduced execution speed.
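The two-instruction sequence of paragraph [0022] amounts to splitting a 32-bit displacement into an upper and a lower 16-bit half. The sketch below shows that arithmetic in Python; it is not the instruction encoding of any particular processor. It assumes the lower half is sign-extended when it is added, so the upper half is adjusted when bit 15 of the lower half is set. The function names are hypothetical.

```python
def split_displacement(disp):
    """Split a 32-bit displacement into (high, low).

    `low` is a signed 16-bit value and `high` an unsigned 16-bit value such
    that (high << 16) + low equals `disp` modulo 2**32.
    Assumption: the access instruction sign-extends its 16-bit displacement.
    """
    disp &= 0xFFFFFFFF
    low = disp & 0xFFFF
    if low >= 0x8000:
        low -= 0x10000  # interpret the lower half as a signed value
    high = ((disp - low) >> 16) & 0xFFFF
    return high, low


def instructions_needed(disp):
    """Return 1 if a signed 16-bit displacement suffices, otherwise 2."""
    return 1 if -0x8000 <= disp <= 0x7FFF else 2
```

A program whose hot data all yield `instructions_needed(...) == 2` pays the extra instruction on every access, which is the slowdown the paragraph describes.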

[0023] The prior art also includes techniques that allow specifying which data code is placed where in the data code area. However, because there is no means of knowing how often each data code is accessed, it has been difficult to place frequently accessed data codes efficiently in the data code area that is accessible with a 16-bit displacement.

[0024]

PROBLEMS TO BE SOLVED BY THE INVENTION

The conventional apparatus and method for optimizing object code for a microprocessor, and the recording medium recording the optimization program, described above analyze the instruction codes of the primary object code generated by the compiling unit (hereinafter, primary instruction codes), then execute the analyzed primary instruction codes, and finally generate profile data. Based on the profile data obtained by execution on the simulator, they analyze the cache miss penalties that slow execution, perform code scheduling that reschedules the object code so as to make the cache miss penalty as small as possible, and have the object code generation unit generate the final object code, thereby improving its execution speed. However, a data code outside the accessible displacement range (hereinafter, an out-of-displacement-range data code) cannot be accessed with one instruction and must be accessed with multiple instructions. A program that frequently accesses out-of-displacement-range data codes therefore has the drawback of reduced execution speed.

[0025] An object of the present invention is to provide an apparatus and a method for optimizing object code for a microprocessor, and a recording medium recording an optimization program, that relocate the data codes in object code and improve program execution speed even for out-of-displacement-range data codes.

[0026]

MEANS FOR SOLVING THE PROBLEMS

The apparatus for optimizing object code for a microprocessor according to claim 1 comprises a compiling unit that compiles input code (a program for compilation recorded on a recording medium) using profile data to generate primary object code, and a simulator that simulates the primary object code and generates the profile data. The apparatus is characterized in that the simulator: analyzes the instruction codes of the primary object code generated by the compiling unit and performs instruction code execution, that is, executes the processing corresponding to each instruction code; detects frequently accessed data codes based on data access information that records the number of accesses to data codes made by the instruction code execution, per address and per size of the accessed data code; relocates them to a cache area, which is a data code area accessible with one instruction, to generate secondary object code; and analyzes the instruction codes of the secondary object code and performs the instruction code execution, thereby enabling high-speed simulation.

[0027] The invention according to claim 2 is the apparatus for optimizing object code for a microprocessor according to claim 1, wherein the simulator comprises: an instruction code analysis unit that analyzes the instruction codes of either the primary object code, which is the object code generated by the compiling unit, or the secondary object code, which is the object code generated by the simulator (hereinafter, object code); an instruction simulation unit that executes the analyzed instruction codes; a profile data generation unit that generates the profile data based on the result of executing the instruction codes; a data access information generation unit that analyzes the data access instructions in the primary object code and outputs each data access address (hereinafter, address) and data access size (hereinafter, size) to data access information; a data relocation unit that refers to the data access information, sorts the data codes in descending order of access frequency per address, selects the data code of maximum size at the same address as a selected data code, relocates the selected data codes in the cache area in descending order of access frequency, and corrects the instruction codes; and a secondary object code generation unit that generates, as the secondary object code, the data relocated and the instruction codes corrected by the data relocation unit.
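The selection performed by the data relocation unit of claim 2 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the data access information is modeled as `(address, size, count)` tuples, the access frequency of an address is taken to be the sum of the counts of all sizes recorded at that address, and ties are broken by the lower address. The function name and the tuple layout are hypothetical.

```python
from collections import defaultdict


def select_data_codes(access_info):
    """Pick one data code per address and order them by access frequency.

    access_info: iterable of (address, size, count) entries.
    Returns a list of (address, max_size, total_count), most frequent first.
    """
    total = defaultdict(int)  # access count per address (assumed: summed over sizes)
    max_size = {}             # largest access size seen at each address
    for address, size, count in access_info:
        total[address] += count
        max_size[address] = max(size, max_size.get(address, 0))
    # Descending frequency; equal frequencies ordered by address (an assumption).
    return sorted(
        ((a, max_size[a], total[a]) for a in total),
        key=lambda entry: (-entry[2], entry[0]),
    )
```

Selecting the maximum size per address keeps a single relocated data code large enough for every access width recorded at that address.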

[0028] The invention according to claim 3 is the apparatus for optimizing object code for a microprocessor according to claim 2, wherein the data access information generation unit comprises: a data access instruction analysis unit that analyzes the data access instructions in the primary object code and detects the data access address and size; and a data access information output unit that outputs the data address and the data access size detected by the data access instruction analysis unit to the data access information.

[0029] The invention according to claim 4 is the apparatus for optimizing object code for a microprocessor according to claim 2, wherein the data relocation unit comprises: a data relocation execution unit that refers to the data access information, sorts the data codes in descending order of access frequency per address, selects the data code of maximum size, relocates it in the cache area in descending order of access frequency, appends the post-relocation address to each entry of the data access information, and outputs the result as relocation information; and an instruction code correction unit that, when the object code read out is a data code access instruction and its access address matches a pre-relocation access address in the relocation information, corrects the instruction code by replacing the access address of the object code with the post-relocation access address.

[0030] The method for optimizing object code for a microprocessor according to claim 5 compiles input code (a program for compilation recorded on a recording medium) using profile data to generate primary object code, and simulates the primary object code to generate the profile data. The method is characterized in that the simulation: analyzes the instruction codes of the primary object code generated by the compilation and performs instruction code execution, that is, executes the processing corresponding to each instruction code; detects frequently accessed data codes based on data access information that records the number of accesses to data codes made by the instruction code execution, per address and per size of the accessed data code; relocates them to a cache area, which is a data code area accessible with one instruction, to generate secondary object code; and analyzes the instruction codes of the secondary object code and performs the instruction code execution, thereby enabling high-speed simulation.

[0031] The method for optimizing object code for a microprocessor according to claim 6 compiles input code (a program for compilation recorded on a recording medium) using profile data to generate primary object code, and simulates the primary object code to generate the profile data. The method is characterized by comprising: an instruction code analysis step of analyzing the instruction codes of the primary object code; an instruction simulation step of executing the analyzed primary instruction codes; a profile data generation step of generating the profile data based on the result of executing the primary instruction codes; a data access information generation step of analyzing the data access instructions among the instruction codes, detecting the access address and access size of the data, and storing the detected access address and access size in data access information; a data relocation step of detecting the data codes in descending order of access frequency based on the data access information generated in the data access information generation step, which records the number of accesses to the data codes per access address and per access size, relocating them in descending order of access frequency to a cache area, which is a data code area accessible with one instruction, and correcting the instruction codes; and a secondary object code generation step of generating the object code corrected in the data relocation step as secondary object code.

[0032] The invention according to claim 7 is the method for optimizing object code for a microprocessor according to claim 5, wherein the instruction code analysis step comprises: a primary/secondary object code determination step of determining whether the object code input to the simulation is the primary object code generated by the compilation or the secondary object code generated by the simulation; a primary object code analysis step of analyzing the instruction codes of the primary object code if the primary/secondary object code determination step finds primary object code; and a secondary object code analysis step of analyzing the instruction codes of the secondary object code if the primary/secondary object code determination step finds secondary object code.

[0033] The invention according to claim 8 is the method for optimizing object code for a microprocessor according to claim 6, wherein the data access information generation step comprises: a data access instruction analysis step of analyzing the data access instruction among the instruction codes and detecting the data access address and the access size of that data access instruction; and a data access information output step of outputting the detected data access address and access size to the data access information and incrementing the access count in the data access information.

[0034] The invention according to claim 9 is the method for optimizing object code for a microprocessor according to claim 6, wherein the data relocation step comprises: a data relocation execution step of referring to the data access information, sorting the data codes in descending order of access frequency per address, selecting the data code of maximum access size as a selected data code, relocating the selected data codes in the cache area in descending order of access frequency, appending the post-relocation address to each entry of the data access information, and outputting the result as relocation information; and an instruction code correction step of reading the object code one instruction code at a time and, when the object code is a data code access instruction and its access address matches a pre-relocation access address in the relocation information, correcting the instruction code by replacing the access address of the matching data code access instruction with the post-relocation access address.

[0035] The invention according to claim 10 is the method for optimizing object code for a microprocessor according to claim 8, wherein the data access instruction analysis step comprises: a data access instruction determination step of determining whether an instruction among the instruction codes is a data access instruction and, if it is not, proceeding to an instruction code end determination step; an address and size extraction step of extracting the data access address and the access size from the data access instruction if the data access instruction determination step finds a data access instruction; a matching entry search step of searching the data access information for an entry that matches the data access address and access size extracted in the address and size extraction step; a matching entry presence determination step of proceeding to a matching entry access count increment step if the data access information contains a matching entry; the matching entry access count increment step of incrementing the access count of the matching entry; a new entry addition step of, if the matching entry presence determination step finds no matching entry in the data access information, newly adding to the data access information an entry for the data access address and access size extracted in the address and size extraction step; and the instruction code end determination step of determining whether the instruction codes have ended, returning to the instruction code analysis step if they have not, and proceeding to the data relocation step if they have.
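The loop of claim 10 can be sketched in Python as below. This is an illustrative sketch only: the executed instruction stream is modeled as `(is_data_access, address, size)` tuples, the data access information as a dictionary keyed by `(address, size)`, and a newly added entry is assumed to start with an access count of 1, which the claim does not state. The function name is hypothetical.

```python
def build_data_access_info(trace):
    """Count data accesses per (address, size) pair.

    trace: iterable of (is_data_access, address, size), one per executed
           instruction (a hypothetical decoded form).
    Returns a dict {(address, size): access_count}.
    """
    info = {}
    for is_data_access, address, size in trace:
        if not is_data_access:
            continue  # not a data access instruction: go on to the end check
        key = (address, size)
        if key in info:
            info[key] += 1  # matching entry found: increment its access count
        else:
            info[key] = 1   # no matching entry: add one (initial count assumed 1)
    return info  # the loop ending corresponds to the instruction code end check
```

Keying on both address and size keeps accesses of different widths to the same address as separate entries, matching the per-address and per-size recording described in the claims.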

【００３６】また、請求項１１記載の発明は、請求項９
記載のマイクロプロセッサ用目的コードの最適化方法に
おいて、前記データ再配置実行ステップが、前記データ
アクセス情報に基づきアドレス毎のアクセス頻度の降順
に前記データコードをソートしソートデータを生成する
データソートステップと、前記ソートデータからアクセ
ス頻度の降順にデータコードを取り出すデータ取り出し
ステップと、取り出したアクセスアドレスと同一アドレ
スの前記データコードの中で最大アクセスサイズのデー
タコードを検索し選択データコードとして選択する最大
サイズエントリ検索ステップと、前記選択データコード
を前記キャッシュ領域に移動するキャッシュ領域移動ス
テップと、前記データアクセス情報に配置後アドレスを
付加し、再配置情報として出力する再配置情報出力ステ
ップと、前記キャッシュ領域の空領域が無いかの判定を
行い前記キャッシュ領域の空領域がまだ残っている場合
後述のデータ終了判定ステップへ進み、前記キャッシュ
領域の空領域がなくなった場合、後述の非キャッシュ領
域移動ステップへ進む空きキャッシュ領域無し判定ステ
ップと、前記ソートデータの全アドレスの終了であれ
ば、前記命令コード補正ステップへ進み、全アドレス終
了でなければ、前記データ取り出しステップへ戻り、以
上の処理を繰り返す前記データ終了判定ステップと、前
記ソートデータの残りのアクセスアドレスのデータコー
ドを低速でのアクセス可能なデータコード領域である非
キャッシュ領域へ移動し、前記再配置情報へ前記再配置
情報を出力し、前記命令コード補正ステップに進む前記
非キャッシュ領域移動ステップとを有することを特徴と
するものである。The invention according to claim 11 is the method for optimizing object code for a microprocessor according to claim 9, wherein the data relocation execution step comprises: a data sort step of sorting the data codes in descending order of per-address access frequency based on the data access information to generate sorted data; a data fetch step of fetching data codes from the sorted data in descending order of access frequency; a maximum-size entry search step of searching the data codes at the same address as the fetched access address for the one with the largest access size and selecting it as the selected data code; a cache area move step of moving the selected data code to the cache area; a relocation information output step of appending the post-relocation address to the data access information and outputting the result as relocation information; a no-free-cache-area determination step of determining whether the cache area has run out of free space, proceeding to the data end determination step described below if free space remains and to the non-cache area move step described below if none remains; the data end determination step of proceeding to the instruction code correction step if all addresses in the sorted data have been processed, and otherwise returning to the data fetch step and repeating the above processing; and the non-cache area move step of moving the data codes at the remaining access addresses in the sorted data to a non-cache area, which is a data code area accessible at low speed, outputting their relocation information, and proceeding to the instruction code correction step.

【００３７】また、請求項１２記載の発明は、請求項９
記載のマイクロプロセッサ用目的コードの最適化方法に
おいて、前記命令コード補正ステップが、前記一次目的
コードから１命令コードを取り出す１命令コード取り出
しステップと、前記命令コードを最後まで読み出したか
の判定を行い、最後まで読み出したならば、終了し、ま
だ読み込むべき命令コードが残っていれば、後述のアク
セス命令判定ステップへ進む命令コード終了判定ステッ
プと、前記１命令コード取り出しステップで読み出した
前記命令コードが前記データコードアクセス命令でなけ
れば、前記１命令コード取り出しステップへ戻り、前記
命令コードが前記データコードアクセス命令であれば、
次の一致検索ステップへ進む前記アクセス命令判定ステ
ップと、前記再配置情報内を検索し、前記データコード
アクセス命令のアクセスアドレスと前記再配置情報内の
配置前アドレスとが一致するエントリである一致エント
リを探す前記一致検索ステップと、前記一致検索ステッ
プで前記一致エントリが見つかった場合、次の置換ステ
ップへ進み、前記１命令コード取り出しステップへ戻
り、以上の処理を反復し、前記一致検索ステップで前記
一致エントリが見つからなかった場合、前記１命令コー
ド取り出しステップへ戻り、以上の処理を反復する一致
判定ステップと、データコードアクセス命令のアドレス
を前記一致エントリの配置後アドレスに置き換える前記
置換ステップとを有することを特徴とするものである。The invention according to claim 12 is the method for optimizing object code for a microprocessor according to claim 9, wherein the instruction code correction step comprises: a one-instruction-code fetch step of fetching one instruction code from the primary object code; an instruction code end determination step of determining whether the instruction codes have been read to the end, terminating if so and proceeding to the access instruction determination step described below if instruction codes remain to be read; the access instruction determination step of returning to the one-instruction-code fetch step if the instruction code read in that step is not a data code access instruction, and proceeding to the following match search step if it is; the match search step of searching the relocation information for a matching entry, that is, an entry whose pre-relocation address matches the access address of the data code access instruction; a match determination step of proceeding to the following replacement step and then returning to the one-instruction-code fetch step to repeat the above processing if the match search step found a matching entry, and returning directly to the one-instruction-code fetch step to repeat the above processing if it did not; and the replacement step of replacing the address of the data code access instruction with the post-relocation address of the matching entry.
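The correction loop of claim 12 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the instruction representation (a dict with `is_data_access` and `addr` keys), the function name `correct_instructions`, and the use of a dictionary keyed by pre-relocation address for the relocation information are all hypothetical.

```python
def correct_instructions(instructions, relocation):
    """Return a corrected copy of the instruction list.

    instructions: list of dicts; data access instructions carry
                  'is_data_access': True and an 'addr' field (assumed layout).
    relocation:   dict mapping pre-relocation address -> post-relocation address.
    """
    corrected = []
    for ins in instructions:                    # fetch one instruction; stop at the end
        ins = dict(ins)                         # copy so the primary code stays untouched
        if ins.get("is_data_access"):           # access instruction determination
            new_addr = relocation.get(ins["addr"])  # match search
            if new_addr is not None:            # match determination
                ins["addr"] = new_addr          # replacement
        corrected.append(ins)
    return corrected


primary = [
    {"op": "load", "is_data_access": True, "addr": 0x9000},
    {"op": "add", "is_data_access": False},
    {"op": "store", "is_data_access": True, "addr": 0xC000},
]
secondary = correct_instructions(primary, {0x9000: 0x0100})
```

Instructions whose address has no matching entry (here the store to 0xC000) pass through unchanged, which corresponds to the "no matching entry" branch of the match determination step.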

【００３８】また、請求項１３記載の発明は、請求項９
記載のマイクロプロセッサ用目的コードの最適化方法に
おいて、前記データ再配置実行ステップが、前記データ
アクセス情報に基づきアドレス毎のアクセス頻度の降順
に前記データコードをソートしソートデータを生成する
データソートステップと、前記ソートデータからアクセ
ス頻度の降順にデータコードを取り出すデータ取り出し
ステップと、取り出したアクセスアドレスと同一アドレ
スの前記データコードの中で最大アクセスサイズのデー
タコードを検索し選択データコードとして選択する最大
サイズエントリ検索ステップと、前記キャッシュ領域の
空領域が無いかの判定を行い前記キャッシュ領域の空領
域がまだ残っている場合後述のキャッシュ領域移動ステ
ップへ進み、前記キャッシュ領域の空領域がなくなった
場合、後述の非キャッシュ領域移動ステップへ進む空き
キャッシュ領域無し判定ステップと、前記選択データコ
ードを前記キャッシュ領域に移動するキャッシュ領域移
動ステップと、前記ソートデータの残りのアクセスアド
レスのデータコードを低速でのアクセス可能なデータコ
ード領域である非キャッシュ領域へ移動する前記非キャ
ッシュ領域移動ステップと、前記データアクセス情報に
配置後アドレスを付加し、再配置情報として出力する再
配置情報出力ステップと、前記ソートデータの全アドレ
スの終了であれば、前記命令コード補正ステップへ進
み、全アドレス終了でなければ、前記データ取り出しス
テップへ戻り、以上の処理を繰り返す前記データ終了判
定ステップとを有することを特徴とするものである。The invention according to claim 13 is the method for optimizing object code for a microprocessor according to claim 9, wherein the data relocation execution step comprises: a data sort step of sorting the data codes in descending order of per-address access frequency based on the data access information to generate sorted data; a data fetch step of fetching data codes from the sorted data in descending order of access frequency; a maximum-size entry search step of searching the data codes at the same address as the fetched access address for the one with the largest access size and selecting it as the selected data code; a no-free-cache-area determination step of determining whether the cache area has run out of free space, proceeding to the cache area move step described below if free space remains and to the non-cache area move step described below if none remains; the cache area move step of moving the selected data code to the cache area; the non-cache area move step of moving the data codes at the remaining access addresses in the sorted data to a non-cache area, which is a data code area accessible at low speed; a relocation information output step of appending the post-relocation address to the data access information and outputting the result as relocation information; and a data end determination step of proceeding to the instruction code correction step if all addresses in the sorted data have been processed, and otherwise returning to the data fetch step and repeating the above processing.
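The relocation flow of claim 13 (check for free cache space first, then move) might look like the following Python sketch. Several points are assumptions made for the example and are not stated in the source: per-address access frequency is taken as the sum of the counts over all access sizes at that address; "no free space" is interpreted as "the selected data code does not fit in the remaining cache area"; alignment is ignored; and the names `relocate`, `cache_base`, `cache_size`, and `noncache_base` are hypothetical.

```python
def relocate(access_info, cache_base, cache_size, noncache_base):
    """Assign post-relocation addresses in descending order of access frequency.

    access_info: dict mapping (address, size) -> access count.
    Returns a list of relocation records (dicts with old/size/count/new).
    """
    freq = {}       # per-address frequency (assumed: sum over all sizes)
    max_size = {}   # largest access size seen at each address
    for (addr, size), count in access_info.items():
        freq[addr] = freq.get(addr, 0) + count
        max_size[addr] = max(max_size.get(addr, 0), size)

    # Data sort step: descending frequency, ties broken by address for determinism.
    order = sorted(freq, key=lambda a: (-freq[a], a))

    next_cache = cache_base
    cache_end = cache_base + cache_size
    next_noncache = noncache_base
    cache_full = False
    relocation = []
    for addr in order:                          # data fetch step
        size = max_size[addr]                   # maximum-size entry search step
        if not cache_full and next_cache + size <= cache_end:
            new_addr = next_cache               # cache area move step
            next_cache += size
        else:
            cache_full = True                   # all remaining data go to non-cache
            new_addr = next_noncache            # non-cache area move step
            next_noncache += size
        relocation.append(                      # relocation information output step
            {"old": addr, "size": size, "count": freq[addr], "new": new_addr}
        )
    return relocation


info = {(0x9000, 4): 10, (0x9000, 2): 5, (0xA000, 4): 3, (0xB000, 8): 1}
records = relocate(info, cache_base=0x100, cache_size=8, noncache_base=0x8000)
```

With an 8-byte cache area, the two most frequently accessed addresses fill it and the third falls through to the non-cache area.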

【００３９】また、請求項１４記載の発明は、請求項９
記載のマイクロプロセッサ用目的コードの最適化方法に
おいて、前記命令コード補正ステップが、前記再配置情
報から１エントリを取り出す再配置情報取り出しステッ
プと、前記再配置情報の終わりまで検索したか否かの判
定を行い、前記再配置情報の終わりであれば終了し、前
記再配置情報の終わりでなければ次の第１の１命令コー
ド取り出しステップへ進む再配置情報終了判定ステップ
と、前記一次目的コードから１命令コードを取り出す前
記第１の１命令コード取り出しステップと、前記１命令
コード取り出しステップで読み出した前記命令コードが
前記データコードアクセス命令でなければ、前記１命令
コード取り出しステップへ戻り、前記命令コードが前記
データコードアクセス命令であれば、次の配置後アドレ
ス置換ステップへ進むアクセス命令判定ステップと、前
記再配置情報内を検索し、前記データコードアクセス命
令のアクセスアドレスと前記再配置情報内の配置前アド
レスとが一致する一致エントリを探し、前記一致エント
リの配置後アドレスへの置換処理を行う前記配置後アド
レス置換ステップと、前記一次目的コードから１命令コ
ードを取り出す第２の１命令コード取り出しステップ
と、前記命令の終了か否かの判定を行い、終了でなけれ
ば前記アクセス命令判定ステップへ戻り、以下の処理を
反復し、終了であれば前記再配置情報取り出しステップ
へ戻り以下の処理を反復する命令コード終了判定ステッ
プとを有することを特徴とするものである。The invention according to claim 14 is the method for optimizing object code for a microprocessor according to claim 9, wherein the instruction code correction step comprises: a relocation information fetch step of fetching one entry from the relocation information; a relocation information end determination step of determining whether the relocation information has been searched to the end, terminating if so and proceeding to the following first one-instruction-code fetch step if not; the first one-instruction-code fetch step of fetching one instruction code from the primary object code; an access instruction determination step of returning to the one-instruction-code fetch step if the instruction code read in that step is not a data code access instruction, and proceeding to the following post-relocation address replacement step if it is; the post-relocation address replacement step of searching the relocation information for a matching entry whose pre-relocation address matches the access address of the data code access instruction and replacing that address with the post-relocation address of the matching entry; a second one-instruction-code fetch step of fetching one instruction code from the primary object code; and an instruction code end determination step of determining whether the instructions have ended, returning to the access instruction determination step to repeat the subsequent processing if not, and returning to the relocation information fetch step to repeat the subsequent processing if so.
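Claim 14 inverts the loop nesting: the outer loop walks the relocation entries and the inner loop scans the instruction codes. The Python sketch below illustrates that ordering. It is only a sketch under assumptions: the dict-based instruction layout, the `(old, new)` tuple form of a relocation entry, and the name `correct_by_entry` are hypothetical. Comparing against the unmodified primary code and writing into a separate copy is a design choice of this example, made so that an address rewritten for one entry is not rewritten again by a later entry.

```python
def correct_by_entry(primary, relocation_entries):
    """Outer loop over relocation entries, inner scan over the primary code.

    primary:            list of dicts ('is_data_access', 'addr' are assumed keys).
    relocation_entries: list of (pre_relocation_addr, post_relocation_addr).
    """
    secondary = [dict(ins) for ins in primary]      # corrected copy
    for old, new in relocation_entries:             # relocation information fetch step
        for i, ins in enumerate(primary):           # scan the primary instruction codes
            if not ins.get("is_data_access"):       # access instruction determination
                continue
            if ins["addr"] == old:                  # matches the pre-relocation address
                secondary[i]["addr"] = new          # post-relocation address replacement
    return secondary


code = [
    {"op": "load", "is_data_access": True, "addr": 0x100},
    {"op": "load", "is_data_access": True, "addr": 0x200},
    {"op": "nop", "is_data_access": False},
]
# The second entry's old address equals the first entry's new address.
fixed = correct_by_entry(code, [(0x100, 0x200), (0x200, 0x300)])
```

Because matching is done against the primary code, the first load ends at 0x200 and is not moved on to 0x300 by the second entry.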

【００４０】請求項１５記載の発明のマイクロプロセッ
サ用目的コードの最適化プログラムを記録した記録媒体
は、コンパイル用プログラムである入力コードをプロフ
ァイルデータを用いてコンパイルして一次目的コードを
生成し、前記一次目的コードをシミュレーションし前記
プロファイルデータを生成するマイクロプロセッサ用目
的コードの最適化プログラムを記録した記録媒体におい
て、前記シミュレーションが、前記コンパイルにより生
成した前記一次目的コードの命令コードを解析してこの
命令コード対応処理の実行である命令コード実行を行
い、前記命令コード実行によるデータコードのアクセス
回数をアドレス及びアクセス対象のデータコードのサイ
ズ毎に記録したデータアクセス情報に基づきアクセス頻
度の高いデータコードを検出し、１命令でアクセス可能
なデータコード領域であるキャッシュ領域に再配置して
二次目的コードを生成し、前記二次目的コードの命令コ
ードを解析して前記命令コード実行を行うことにより、
高速のシミュレーション実行を可能とすることを特徴と
するものである。The recording medium of the invention according to claim 15, on which a program for optimizing object code for a microprocessor is recorded, is a recording medium recording an optimization program that compiles input code, which is a program to be compiled, using profile data to generate primary object code, and simulates the primary object code to generate the profile data, wherein the simulation analyzes the instruction codes of the primary object code generated by the compilation and performs instruction code execution, that is, execution of the processing corresponding to each instruction code; detects frequently accessed data codes based on data access information in which the number of data code accesses caused by the instruction code execution is recorded for each address and each size of the accessed data code; relocates them to a cache area, which is a data code area accessible with one instruction, to generate secondary object code; and analyzes the instruction codes of the secondary object code and performs the instruction code execution, thereby enabling high-speed simulation.

【００４１】請求項１６記載の発明のマイクロプロセッ
サ用目的コードの最適化プログラムを記録した記録媒体
は、コンパイル用プログラムである入力コードをプロフ
ァイルデータを用いてコンパイルして一次目的コードを
生成し、前記一次目的コードをシミュレーションし前記
プロファイルデータを生成するマイクロプロセッサ用目
的コードの最適化プログラムを記録した記録媒体におい
て、前記一次目的コードの命令コードの解析を行う命令
コード解析ステップと、解析した前記一次命令コードの
実行を行う命令シミュレーションステップと、前記一次
命令コードの実行の結果に基づき前記プロファイルデー
タの生成を行うプロファイルデータ生成ステップと、前
記命令コード中のデータアクセス命令の解析を行い、デ
ータのアクセスアドレスとアクセスサイズを検出し、検
出したデータの前記アクセスアドレスと前記アクセスサ
イズをデータアクセス情報に格納するデータアクセス情
報生成ステップと、前記データアクセス情報生成ステッ
プで生成した前記データコードのアクセス回数を前記ア
クセスアドレス及び前記アクセスサイズ毎に記録した前
記データアクセス情報に基づきアクセス頻度の降順に前
記データコードを検出して１命令でアクセス可能なデー
タコード領域であるキャッシュ領域に前記アクセス頻度
の降順に再配置し命令コードを補正するデータ再配置ス
テップと、前記データ再配置ステップで補正した前記目
的コードを二次目的コードとして生成する二次目的コー
ド生成ステップとを有することを特徴とするものであ
る。The recording medium of the invention according to claim 16, on which a program for optimizing object code for a microprocessor is recorded, is a recording medium recording an optimization program that compiles input code, which is a program to be compiled, using profile data to generate primary object code, and simulates the primary object code to generate the profile data, the program comprising: an instruction code analysis step of analyzing the instruction codes of the primary object code; an instruction simulation step of executing the analyzed primary instruction codes; a profile data generation step of generating the profile data based on the result of executing the primary instruction codes; a data access information generation step of analyzing the data access instructions in the instruction codes, detecting the access address and access size of the data, and storing the detected access address and access size in data access information; a data relocation step of detecting the data codes in descending order of access frequency based on the data access information, in which the number of data code accesses generated in the data access information generation step is recorded for each access address and access size, relocating them in descending order of access frequency to a cache area, which is a data code area accessible with one instruction, and correcting the instruction codes; and a secondary object code generation step of generating the object code corrected in the data relocation step as secondary object code.

【００４２】[0042]

【発明の実施の形態】次に、本発明の実施の形態につい
て図面を参照して詳細に説明する。Next, embodiments of the present invention will be described in detail with reference to the drawings.

【００４３】本実施の形態のマイクロプロセッサ用目的
コードの最適化装置及び最適化方法は、記録媒体に記録
されたコンパイル用プログラムである入力コードをプロ
ファイルデータを用いてコンパイルして一次目的コード
を生成するコンパイル部と、上記一次目的コードをシミ
ュレーションし上記プロファイルデータを生成するシミ
ュレータとを備えるマイクロプロセッサ用目的コードの
最適化装置において、上記シミュレータが、上記コンパ
イル部が生成した上記一次目的コードの命令コードを解
析してこの命令コード対応処理の実行である命令コード
実行を行い、上記命令コード実行によるデータコードの
アクセス回数をアドレス及びアクセス対象のデータコー
ドのサイズ毎に記録したデータアクセス情報に基づきア
クセス頻度の高いデータコードを検出し、１命令でアク
セス可能なデータコード領域であるキャッシュ領域に再
配置して二次目的コードを生成し、この二次目的コード
の命令コードを解析して上記命令コード実行を行うこと
により、高速のシミュレーション実行を可能とすること
を特徴とするものである。The device and method for optimizing object code for a microprocessor according to the present embodiment relate to an optimizing device that includes a compiling unit, which compiles input code (a program to be compiled, recorded on a recording medium) using profile data to generate primary object code, and a simulator, which simulates the primary object code to generate the profile data. The simulator analyzes the instruction codes of the primary object code generated by the compiling unit and performs instruction code execution, that is, execution of the processing corresponding to each instruction code; detects frequently accessed data codes based on data access information in which the number of data code accesses caused by the instruction code execution is recorded for each address and each size of the accessed data code; relocates them to a cache area, which is a data code area accessible with one instruction, to generate secondary object code; and analyzes the instruction codes of the secondary object code and performs the instruction code execution, thereby enabling high-speed simulation.

【００４４】ここで、コンパイラが生成した一次目的コ
ードとは、Ｃ言語、ＦＯＲＴＲＡＮ，ＣＯＢＯＬなどの
高級プログラミング言語を、目的とするＣＰＵ上の機械
語命令コード及びデータコードに翻訳したものを意味す
る。Here, the primary object code generated by the compiler means code written in a high-level programming language such as C, FORTRAN, or COBOL that has been translated into machine-language instruction codes and data codes for the target CPU.

【００４５】また、１命令でアクセス可能なデータコー
ド領域、すなわち、キャッシュ領域とは、一般的なキャ
ッシュメモリ（以下キャッシュ）だけではなく、例え
ば、従来技術の図４で説明したポインタＰ１の１６ビッ
トディスプレースメント範囲の領域９０１のようなデー
タコード記憶領域を意味する。The data code area accessible with one instruction, that is, the cache area, means not only a general cache memory (hereinafter, cache) but also a data code storage area such as area 901, the 16-bit displacement range of pointer P1 described with reference to FIG. 4 in the prior-art discussion.
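As a rough illustration of what "accessible with one instruction" means for a displacement-addressed area such as area 901, the following Python sketch checks whether an address lies within a 16-bit displacement of a base pointer. It assumes a signed displacement (-32768 to +32767); the source states only that the range is 16 bits, so the signedness, the function name `reachable_in_one_instruction`, and the sample pointer value are assumptions.

```python
def reachable_in_one_instruction(pointer, address, disp_bits=16):
    """Return True if address is within a signed disp_bits displacement of pointer."""
    lowest = -(1 << (disp_bits - 1))        # -32768 for 16 bits
    highest = (1 << (disp_bits - 1)) - 1    # +32767 for 16 bits
    return lowest <= address - pointer <= highest


p1 = 0x10000  # hypothetical value of pointer P1
near = reachable_in_one_instruction(p1, p1 + 32767)
far = reachable_in_one_instruction(p1, p1 + 32768)
```

Data codes for which this check fails are the ones that would need more than one instruction to reach from the pointer, which is why the embodiment moves frequently accessed data into the reachable area.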

【００４６】次に、本発明の実施の形態を図１３と共通
の構成要素には共通の参照文字／数字を付して同様にブ
ロックで示す図２（Ａ）を参照すると、この図に示す本
実施の形態のマイクロプロセッサ用目的コードの最適化
装置は、記録媒体に記録されたコンパイル用プログラム
である入力コードＦ５をプロファイルデータＦ２を用い
てコンパイルして一次目的コードＦ１を生成するコンパ
イル部２と、一次目的コードＦ１をシミュレーションし
プロファイルデータＦ２を生成するとともに二次目的コ
ードＦ４を生成するマシン又はシミュレータであるシミ
ュレータ１とを備える。Referring next to FIG. 2(A), which shows the embodiment of the present invention in block form with components common to FIG. 13 given the same reference characters/numerals, the device for optimizing object code for a microprocessor according to the present embodiment includes a compiling unit 2 that compiles input code F5, a program to be compiled recorded on a recording medium, using profile data F2 to generate primary object code F1, and a simulator 1, a machine or simulator that simulates the primary object code F1 to generate the profile data F2 and also generates secondary object code F4.

【００４７】シミュレータ１は、一次目的コードＦ１か
ら読み出した一次命令コードを解析してこの一次命令コ
ードの命令コード実行をシミュレーションしてプロファ
イルデータＦ２を生成し、命令コード実行に伴うデータ
コードのアクセス回数をアドレス及びアクセスサイズ毎
に記録したデータアクセス情報Ｆ３を生成し、このデー
タアクセス情報Ｆ３に基づきアクセス頻度の高いデータ
コードを検出し、１命令でアクセス可能なデータコード
領域に再配置し、二次目的コードＦ４を生成する。The simulator 1 analyzes the primary instruction codes read from the primary object code F1 and simulates their execution to generate profile data F2; generates data access information F3, which records the number of data code accesses caused by instruction code execution for each address and access size; detects frequently accessed data codes based on this data access information F3; relocates them to a data code area accessible with one instruction; and generates secondary object code F4.

【００４８】入力コードＦ５は、例えば、Ｃ言語、ＪＡ
ＶＡ言語、ＦＯＲＴＲＡＮ言語などの高級言語で記述し
たものである。The input code F5 is written in a high-level language such as C, Java, or FORTRAN.

【００４９】コンパイル部２は、入力コードＦ５の実行
を行うソフトウェア機能手段として、入力コードＦ５の
供給を受けこの入力コードＦ５の字句解析及び構文解析
を行うフロントエンド２１と、後述するバックエンド２
２とを備える。The compiling unit 2 includes, as software function means for processing the input code F5, a front end 21 that receives the input code F5 and performs lexical analysis and syntax analysis on it, and a back end 22 described below.

【００５０】バックエンド２２は、一次目的コードＦ１
／二次目的コードＦ４のシミュレーション結果であるプ
ロファイルデータＦ２を基にキャッシュミスペナルティ
をできるだけ小さくするために命令コードのスケジュー
リングを行うコードスケジューリング部２２１と、コー
ドスケジューリング部２２１のコードスケジューリング
結果に基づきシミュレータ１上で実行可能な一次目的コ
ードＦ１を生成する目的コード生成部２２２とを備え
る。The back end 22 includes a code scheduling unit 221 that schedules instruction codes so as to minimize the cache miss penalty, based on profile data F2, which is the simulation result of the primary object code F1 or the secondary object code F4, and an object code generation unit 222 that generates primary object code F1 executable on the simulator 1 based on the code scheduling result of the code scheduling unit 221.

【００５１】コードスケジューリング部２２１の構成例
をブロックで示す図２（Ｂ）を参照すると、このコード
スケジューリング部２２１は、一次目的コードＦ１のシ
ミュレーション結果であるプロファイルデータＦ２を解
析しキャッシュミスペナルティ発生部分を検出するとと
もにコードスケジューリング実行部２２４で利用するた
めのキャッシュ動作情報を出力するプロファイルデータ
解析部２２３と、上記キャッシュ動作情報に基づき検出
したキャッシュミスペナルティ軽減のための命令コード
の再スケジューリングを行うコードスケジューリング実
行部２２４とを備える。Referring to FIG. 2(B), which shows a configuration example of the code scheduling unit 221 in block form, the code scheduling unit 221 includes a profile data analysis unit 223 that analyzes profile data F2, the simulation result of the primary object code F1, detects where cache miss penalties occur, and outputs cache operation information for use by the code scheduling execution unit 224; and the code scheduling execution unit 224, which reschedules instruction codes based on that cache operation information so as to reduce the detected cache miss penalties.

【００５２】ここで、このキャッシュ動作情報とは、キ
ャッシュミスしたデータを主記憶からキャッシュに読み
込む動作が各動作クロックで行われている否かを表わす
情報である。Here, the cache operation information is information indicating whether or not the operation of reading the cache-missed data from the main memory into the cache is performed at each operation clock.

【００５３】本実施の形態を特徴付けるシミュレータ１
の構成を図１３（Ｂ）と共通の構成要素には共通の参照
文字／数字を付して同様にブロックで示す図１（Ａ）を
参照すると、従来と共通のコンパイル部２で生成した目
的コードである一次目的コードＦ１又はシミュレータ１
が生成した二次目的コードＦ４（以下、一次又は二次目
的コードの両方を指す場合を目的コードＦＸと呼ぶ）の
命令コードの解析を行う命令コード解析部１１と、解析
した命令コードの実行を行う命令シミュレーション部１
２と、プロファイルデータの生成を行いプロファイルデ
ータファイルＦ２（以下特記ない限りファイルを省略
し、例えばプロファイルデータＦ２等と呼ぶ）に格納す
るプロファイルデータ生成部１３とに加えて、コンパイ
ル部２で生成した一次目的コードＦ１中のデータアクセ
ス命令の解析を行い、このデータアクセスアドレス（以
下アドレス）とデータアクセスサイズ（以下サイズ：デ
ータアクセスアドレスとデータアクセスサイズをデータ
アドレスとサイズと省略）をデータアクセス情報Ｆ３に
出力するデータアクセス情報生成部１４と、データアク
セス情報Ｆ３を参照してアドレス毎のアクセス頻度の高
い順（降順）にデータコードをソートし、同一アドレス
での最大サイズのデータコードを選択データコードとし
て選択し、１命令でアクセス可能なデータコード領域
（以下キャッシュ領域）にアクセス頻度の降順に選択デ
ータコードを再配置し命令コードを補正するデータ再配
置部１５と、データ再配置部１５が再配置したデータ及
び補正した命令コードを二次目的コードＦ４として生成
する二次目的コード生成部１６とを備える。Referring to FIG. 1(A), which shows in block form the configuration of the simulator 1 that characterizes the present embodiment, with components common to FIG. 13(B) given the same reference characters/numerals, the simulator 1 includes an instruction code analysis unit 11 that analyzes the instruction codes of the primary object code F1, which is the object code generated by the same compiling unit 2 as in the conventional art, or of the secondary object code F4 generated by the simulator 1 (hereinafter, object code FX when referring to either the primary or the secondary object code); an instruction simulation unit 12 that executes the analyzed instruction codes; and a profile data generation unit 13 that generates profile data and stores it in profile data file F2 (hereinafter, unless otherwise noted, "file" is omitted, as in profile data F2). In addition, it includes a data access information generation unit 14 that analyzes the data access instructions in the primary object code F1 generated by the compiling unit 2 and outputs each data access address (hereinafter abbreviated as data address, or simply address) and data access size (hereinafter, size) to data access information F3; a data relocation unit 15 that refers to the data access information F3, sorts the data codes in descending order of per-address access frequency, selects the largest data code at each address as the selected data code, relocates the selected data codes in descending order of access frequency to a data code area accessible with one instruction (hereinafter, cache area), and corrects the instruction codes; and a secondary object code generation unit 16 that generates the data relocated and the instruction codes corrected by the data relocation unit 15 as secondary object code F4.

【００５４】データアクセス情報生成部１４の構成をブ
ロックで示す図１（Ｂ）を参照すると、このデータアク
セス情報生成部１４は、コンパイル部２で生成した一次
目的コード中のデータアクセス命令の解析を行い、アド
レス及びサイズを検出するデータアクセス命令解析部１
４１と、データアクセス命令解析部で検出したデータア
ドレスとサイズをデータアクセス情報Ｆ３に出力するデ
ータアクセス情報出力部１４２とを備える。Referring to FIG. 1(B), which shows the configuration of the data access information generation unit 14 in block form, the data access information generation unit 14 includes a data access instruction analysis unit 141 that analyzes the data access instructions in the primary object code generated by the compiling unit 2 and detects their addresses and sizes, and a data access information output unit 142 that outputs the data addresses and sizes detected by the data access instruction analysis unit to the data access information F3.

【００５５】データ再配置部１５の構成をブロックで示
す図１（Ｃ）を参照すると、このデータ再配置部１５
は、データアクセス情報Ｆ３を参照しアドレス毎にアク
セス頻度の降順にデータコードをソートして最大サイズ
のデータコードを選択し、キャッシュ領域にアクセス頻
度順に再配置し、再配置後のアドレスをデータアクセス
情報Ｆ３にそれぞれ付加し再配置情報として出力するデ
ータ再配置実行部１５１と、読み出した目的コードＦＸ
がデータコードアクセス命令でありかつその（目的コー
ドＦＸの）アドレスが再配置情報の再配置前のアドレス
と一致する場合に目的コードのアクセスを配置後のアド
レスに置き換え命令コードを補正する命令コード補正部
１５２とを備える。Referring to FIG. 1(C), which shows the configuration of the data relocation unit 15 in block form, the data relocation unit 15 includes a data relocation execution unit 151 that refers to the data access information F3, sorts the data codes in descending order of per-address access frequency, selects the largest data code at each address, relocates the selected data codes to the cache area in order of access frequency, and appends each post-relocation address to the data access information F3 to output relocation information; and an instruction code correction unit 152 that, when an instruction read from the object code FX is a data code access instruction whose address matches a pre-relocation address in the relocation information, corrects the instruction code by replacing the accessed address with the post-relocation address.

【００５６】次に、図１、図２、及び本実施の形態のシ
ミュレータ１の処理をフローチャートで示す図３を参照
して本実施の形態の動作について説明すると、まず、コ
ンパイル部２のフロントエンド２１は、入力コードＦ５
の供給を受けこの入力コードＦ５の字句解析及び構文解
析を行い、解析結果をバックエンド２２に供給する。Next, the operation of the present embodiment is described with reference to FIG. 1, FIG. 2, and FIG. 3, a flowchart of the processing of the simulator 1 of the present embodiment. First, the front end 21 of the compiling unit 2 receives the input code F5, performs lexical analysis and syntax analysis on it, and supplies the analysis result to the back end 22.

【００５７】次に、バックエンド２２のコードスケジュ
ーリング部２２１は、有効設定された場合に、プロファ
イルデータＦ２を基にキャッシュミスペナルティをでき
るだけ小さくするために命令コードのスケジューリング
を行う。無効設定された場合は不動作となり、何も実行
しない。Next, the code scheduling unit 221 of the back end 22, when enabled, schedules instruction codes based on the profile data F2 so as to minimize the cache miss penalty. When disabled, it is inactive and does nothing.

【００５８】有効設定の場合、まず、コードスケジュー
リング部２２１のプロファイルデータ解析部２２３は、
目的コードＦＸをシミュレータ１で実行して得たＣＰＵ
動作の記録であるプロファイルデータＦ２を解析しキャ
ッシュミスペナルティ発生部分を検出するとともにコー
ドスケジューリング実行部２２４で利用するためのキャ
ッシュ動作情報を出力し、コードスケジューリング実行
部２２４に供給する。次に、コードスケジューリング実
行部２２４は、供給を受けたキャッシュ動作情報に基づ
きプロファイルデータ解析部２２３で検出したキャッシ
ュミスペナルティを軽減するための命令コードの再スケ
ジューリングを行う。When enabled, the profile data analysis unit 223 of the code scheduling unit 221 first analyzes profile data F2, the record of CPU operation obtained by executing the object code FX on the simulator 1, detects where cache miss penalties occur, and outputs cache operation information for use by the code scheduling execution unit 224, supplying it to that unit. Next, the code scheduling execution unit 224 reschedules the instruction codes based on the supplied cache operation information so as to reduce the cache miss penalties detected by the profile data analysis unit 223.

【００５９】目的コード生成部２２２は、コードスケジ
ューリング実行部２２４の命令コードの再スケジューリ
ング結果を受け、シミュレータ１上で実行可能な目的コ
ードを生成し、一次目的コードＦ１に出力する。The object code generation unit 222 receives the instruction code rescheduling result from the code scheduling execution unit 224, generates object code executable on the simulator 1, and outputs it to the primary object code F1.

【００６０】シミュレータ１では、まず、命令コード解
析部１１が、コンパイル部２が生成した一次目的コード
Ｆ１の命令コード（以下一次命令コード）の解析を行う
（命令コード解析ステップＳ１）。次に、命令シミュレ
ーション部１２が、解析した一次命令コードの実行を行
う（命令シミュレーションステップＳ２）。次に、プロ
ファイルデータ生成部１３が、一次命令コードの実行結
果に基づきプロファイルデータの生成を行い、プロファ
イルデータＦ２へ出力する（プロファイルデータ生成ス
テップＳ３）。ここまでは、従来と同様の処理である。In the simulator 1, the instruction code analysis unit 11 first analyzes the instruction codes of the primary object code F1 generated by the compiling unit 2 (hereinafter, primary instruction codes) (instruction code analysis step S1). Next, the instruction simulation unit 12 executes the analyzed primary instruction codes (instruction simulation step S2). Next, the profile data generation unit 13 generates profile data based on the execution result of the primary instruction codes and outputs it to profile data F2 (profile data generation step S3). The processing up to this point is the same as in the conventional art.

【００６１】次に、データアクセス情報生成部１４は、
命令コード中のデータアクセス命令の解析を行い、デー
タのアドレスとサイズを検出し、検出したデータのアド
レスとサイズをデータアクセス情報Ｆ３に格納する（デ
ータアクセス情報生成ステップＳ４）。Next, the data access information generation unit 14 analyzes the data access instructions in the instruction codes, detects the address and size of the data, and stores the detected address and size in the data access information F3 (data access information generation step S4).

【００６２】まず、データアクセス命令解析部１４１
は、有効設定された場合、命令コード中のデータアクセ
ス命令の解析を行い、このデータアクセス命令のデータ
アドレスとサイズを検出する（データアクセス命令解析
ステップＳ４１）。次に、データアクセス情報出力部１
４２は、有効設定された場合、データアクセス命令解析
部１４１で検出したデータアドレスとサイズをデータア
クセス情報Ｆ３に出力し、データアクセス情報Ｆ３内の
アクセス回数をインクリメントする（データアクセス情
報出力ステップＳ４２）。First, when enabled, the data access instruction analysis unit 141 analyzes a data access instruction in the instruction codes and detects its data address and size (data access instruction analysis step S41). Next, when enabled, the data access information output unit 142 outputs the data address and size detected by the data access instruction analysis unit 141 to the data access information F3 and increments the access count in the data access information F3 (data access information output step S42).

【００６３】データアクセス情報生成部１４で生成され
るデータアクセス情報Ｆ３の内容例を説明図で示す図７
（Ａ）を参照すると、このデータアクセス情報は、「ア
ドレス」、「アクセスサイズ」、「アクセス回数」とか
ら構成されている。ここで、「アドレス」は、アクセス
時のデータコードの格納アドレスを示し、「アクセスサ
イズ」は、アクセス時のデータコードのサイズを示し、
「アクセス回数」は、データコードのアクセス回数を示
す。Referring to FIG. 7(A), which illustrates an example of the contents of the data access information F3 generated by the data access information generation unit 14, the data access information consists of "address", "access size", and "access count". Here, "address" is the storage address of the data code at the time of access, "access size" is the size of the data code at the time of access, and "access count" is the number of times the data code has been accessed.

【００６４】データ再配置部１５は、有効設定された場
合、データアクセス情報生成部１４が生成したデータコ
ードのアクセス回数をアドレス及びアクセスサイズ毎に
記録したデータアクセス情報Ｆ３に基づき、アクセス頻
度の高い順、すなわち降順にデータコードを検出して１
命令でアクセス可能なデータコード領域、すなわち、キ
ャッシュ領域にアクセス頻度の降順に再配置し命令コー
ドを補正する（データ再配置ステップＳ５）。When enabled, the data relocation unit 15 detects the data codes in order of decreasing access frequency based on the data access information F3, in which the number of data code accesses generated by the data access information generation unit 14 is recorded for each address and access size, relocates them in descending order of access frequency to the data code area accessible with one instruction, that is, the cache area, and corrects the instruction codes (data relocation step S5).

【００６５】このデータ再配置ステップＳ５は、まず、
データ再配置実行部１５１で、データアクセス情報Ｆ３
を参照して、アドレス毎にアクセス頻度の降順にデータ
コードをソートして最大アクセスサイズのデータコード
を選択データコードとして選択し、この選択データコー
ドをキャッシュ領域にアクセス頻度の降順に再配置し、
再配置後のアドレスをデータアクセス情報Ｆ３にそれぞ
れ付加し再配置情報として出力する（データ再配置実行
ステップＳ５１）。図７（Ｂ）は再配置情報の記述例を
示す。In the data relocation step S5, the data relocation execution unit 151 first refers to the data access information F3, sorts the data codes in descending order of per-address access frequency, selects the data code with the largest access size as the selected data code, relocates the selected data codes to the cache area in descending order of access frequency, and appends each post-relocation address to the data access information F3 to output relocation information (data relocation execution step S51). FIG. 7(B) shows an example of the relocation information.

【００６６】次に、命令コード補正部１５２で、目的コ
ードＦＸを１命令コードずつ読み出し、この目的コード
ＦＸがデータコードアクセス命令でありかつそのアドレ
スが再配置情報の再配置前のアドレスと一致する場合、
一致したデータコードアクセス命令のアドレスを配置後
のアドレスに置き換えて命令コードを補正する（命令コ
ード補正ステップＳ５２）。Next, the instruction code correction unit 152 reads the object code FX one instruction code at a time and, when an instruction is a data code access instruction whose address matches a pre-relocation address in the relocation information, corrects the instruction code by replacing the address of the matching data code access instruction with the post-relocation address (instruction code correction step S52).

【００６７】二次目的コード生成部１６は、命令コード
補正部１５２により補正された目的コードを二次目的コー
ドＦ４として生成する（二次目的コード生成ステップＳ
６）。The secondary object code generation unit 16 generates the object code corrected by the instruction code correction unit 152 as secondary object code F4 (secondary object code generation step S6).

【００６８】データアクセス命令解析部１４１のデータ
アクセス命令解析ステップＳ４１とデータアクセス情報
出力部１４２のデータアクセス情報出力ステップＳ４２
の各々の処理をそれぞれフローチャートで示す図４を併
せて参照してデータアクセス情報生成ステップＳ４の詳
細処理について説明すると、データアクセス命令解析ス
テップＳ４１は、まず、データアクセス命令判定ステッ
プＳ４１１で、命令シミュレーション部１２が実行した
命令コードがデータアクセス命令であるか否かの判定を
行う。データアクセス命令であれば次のアドレスサイズ
取り出しステップＳ４１２へと進む。データアクセス命
令でなければ命令コード終了判定ステップＳ４２５へと
進む。The detailed processing of the data access information generation step S4 is described with reference also to FIG. 4, which shows flowcharts of the data access instruction analysis step S41 of the data access instruction analysis unit 141 and the data access information output step S42 of the data access information output unit 142. In the data access instruction analysis step S41, the data access instruction determination step S411 first determines whether the instruction code executed by the instruction simulation unit 12 is a data access instruction. If it is, the process proceeds to the address/size extraction step S412. If it is not, the process proceeds to the instruction code end determination step S425.

[0069] Next, the address/size extraction step S412 extracts the data address and size from the data access instruction.

[0070] Next, the matching entry search step S421 searches the data access information F3, which is direct-access storage data held inside the device, for an entry matching the data address and size extracted in the address/size extraction step S412.

[0071] Next, if the matching entry determination step S422 finds a matching entry in the data access information F3, processing proceeds to the matching entry access count increment step S423, which increments the access count of that entry. If step S422 finds no matching entry in the data access information, processing proceeds to the new entry addition step S424, which adds a new entry holding the data address and size extracted in the address/size extraction step S412 to the data access information F3.

[0072] Finally, the instruction code end determination step S425 determines whether the end of the instruction code has been reached. If not, processing returns to the instruction code analysis step S1. If so, processing proceeds to the data relocation step S5.
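
For illustration only, steps S411 to S425 can be sketched in Python as follows. The `Instruction` record, the dictionary standing in for the data access information F3, and the initial count of 1 for a new entry are assumptions of this sketch; the specification does not prescribe a data layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instruction:
    """Hypothetical decoded instruction as seen by the simulator."""
    is_data_access: bool
    address: int = 0
    size: int = 0


def build_data_access_info(executed):
    """Count accesses per (address, size) pair, following S411 to S425."""
    info = {}  # stands in for data access information F3
    for insn in executed:                 # loop ends at S425 (end of code)
        if not insn.is_data_access:       # S411: not a data access instruction
            continue
        key = (insn.address, insn.size)   # S412: extract address and size
        if key in info:                   # S421/S422: matching entry found
            info[key] += 1                # S423: increment its access count
        else:
            info[key] = 1                 # S424: add a new entry (count 1 assumed)
    return info
```

Keying the table on the pair (address, size) rather than on the address alone is what later allows accesses of different sizes to the same address to be told apart.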

[0073] Next, the detailed processing of the data relocation step S5 is described with additional reference to FIGS. 5 and 6, flowcharts showing the data relocation execution step S51 of the data relocation execution unit 151 and the instruction code correction step S52 of the instruction code correction unit 152, respectively. First, in the data relocation execution step S51, a data sort step S511 sorts the data codes in descending order of access frequency for each address, based on the data access information F3, and outputs the result to sort data F6.

[0074] Next, a data extraction step S512 takes data codes from the sort data F6 in descending order of access frequency. A maximum size entry search step S513 searches the data codes that share the extracted address and selects the one with the largest access size. A cache area move step S514 moves this data code to the cache area, which is a data area accessible with one instruction. A relocation information output step S515 then appends the post-relocation address to the data access information and outputs the result as relocation information F7. FIG. 7(B) is an explanatory diagram showing an example of the relocation information F7.

[0075] Next, if the no-free-cache-area determination step S516 finds that free space still remains in the cache area, processing proceeds to the data end determination step S518. If all addresses in the sort data have been processed, processing proceeds to the instruction code correction step S52. Otherwise, processing returns to the data extraction step S512 and the above processing is repeated.

[0076] If the no-free-cache-area determination step S516 finds that no free space remains in the cache area, processing proceeds to the remaining data move step S517. This step moves the data codes at the remaining addresses in the sort data to the non-cache area, a data code area that can be accessed only at low speed, outputs their relocation information to the relocation information F7, and proceeds to the instruction code correction step S52.
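
A minimal Python sketch of the data relocation execution step S51 (S511 to S518) is given below for illustration. It makes several assumptions that the specification does not state: the data access information is a dictionary mapping (address, size) to an access count, an address is ranked by its most frequently accessed entry, the cache area is "full" once the next free offset reaches its end, and alignment is ignored. How an item larger than the remaining cache space is handled is not described in the text and is not modelled here.

```python
def relocate_data(info, cache_base, cache_size, noncache_base):
    """Return {pre-relocation address: post-relocation address}.

    The returned mapping stands in for relocation information F7.
    """
    best_count, max_size = {}, {}
    for (addr, size), count in info.items():
        best_count[addr] = max(best_count.get(addr, 0), count)
        max_size[addr] = max(max_size.get(addr, 0), size)  # S513: largest size
    # S511: descending access frequency per address (ranking rule assumed)
    ordered = sorted(best_count, key=lambda a: -best_count[a])

    relocation = {}
    cache_next, noncache_next = cache_base, noncache_base
    cache_end = cache_base + cache_size
    cache_full = False
    for addr in ordered:                          # S512, loop ends at S518
        size = max_size[addr]
        if not cache_full:
            relocation[addr] = cache_next         # S514: move into the cache area
            cache_next += size
            cache_full = cache_next >= cache_end  # S516: any free space left?
        else:
            relocation[addr] = noncache_next      # S517: remainder goes elsewhere
            noncache_next += size
    return relocation
```

The check is made after each move, mirroring the order S514, S515, S516 in the description above.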

[0077] In the instruction code correction step S52, a one-instruction-code extraction step S521 first takes one instruction code from the primary object code F1.

[0078] Next, the instruction code end determination step S522 determines whether the instruction codes have been read to the end. If so, the instruction code correction step S52 ends and processing proceeds to the secondary object code generation step S6.

[0079] If the instruction code end determination step S522 finds that instruction codes remain to be read, processing proceeds to the access instruction determination step S523. If the instruction code read in the one-instruction-code extraction step S521 is not a data code access instruction, processing returns to step S521 and the next instruction code is read. If the instruction code is a data code access instruction, processing proceeds to the match search step S524, which searches the relocation information F7 for an entry whose pre-relocation address matches the address of the data code access instruction.

[0080] Finally, if the match determination step S525 finds that the match search step S524 located a matching entry, processing proceeds to the replacement step S526, which replaces the address of the data code access instruction with the post-relocation address of the matching entry. Processing then returns to the one-instruction-code extraction step S521 and the above processing is repeated. If no matching entry was found in the match search step S524, processing likewise returns to step S521 and the above processing is repeated.
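
The instruction code correction step S52 (S521 to S526) can be sketched as follows, again only as an illustration. The `Insn` record and the use of a dictionary from pre-relocation address to post-relocation address in place of the relocation information F7 are assumptions of this sketch.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Insn:
    """Hypothetical instruction record; only the fields the sketch needs."""
    opcode: str
    is_data_access: bool = False
    address: int = 0


def correct_instruction_codes(primary_code, relocation):
    """Rewrite data access addresses according to the relocation mapping."""
    corrected = []
    for insn in primary_code:                    # S521/S522: read to the end
        if insn.is_data_access:                  # S523: data code access?
            new_addr = relocation.get(insn.address)  # S524/S525: match search
            if new_addr is not None:
                insn = replace(insn, address=new_addr)  # S526: replace address
        corrected.append(insn)
    return corrected
```

The function returns a new list and leaves the input untouched, so the primary object code remains available for comparison with the corrected code.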

[0081] For example, suppose the microprocessor object code optimization device of this embodiment compiles input code that accesses the same data area with different access sizes, as in the C-language union code example shown in FIG. 12(A). The data access information F3 then becomes as shown in FIG. 12(B), and among data codes at the same address, the access count of a smaller data code may exceed that of a larger data code.

[0082] In this case, because the data relocation step S5 gives priority to frequently accessed data codes, the 2-byte data code of No. 2 becomes the relocation candidate. However, the data code area of No. 2 is also accessed as the 4-byte data code of No. 1. So that the 4-byte data code is not split, the 4-byte data code of No. 1, which has the largest access size, is selected for relocation. In other words, when the same data area is accessed with different access sizes, the data code area is relocated using the data code accessed with the larger access size as one unit.
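
The selection rule for a union-style access pattern can be illustrated with a small Python sketch. The entry numbers follow the description above, but the address and access counts are hypothetical values chosen for the example and are not taken from FIG. 12(B).

```python
# Hypothetical entries in the style of data access information F3:
# (number, address, size in bytes, access count)
entries = [
    (1, 0x2000, 4, 10),   # 4-byte access to the union
    (2, 0x2000, 2, 40),   # 2-byte access to the same address, more frequent
]


def relocation_unit(entries):
    """Pick the entry that is actually relocated for the hottest address."""
    hottest = max(entries, key=lambda e: e[3])           # most accessed entry
    same_address = [e for e in entries if e[1] == hottest[1]]
    return max(same_address, key=lambda e: e[2])         # largest access size


chosen = relocation_unit(entries)
```

Here the 2-byte entry drives the priority, while the 4-byte entry determines the unit that is moved, so the wider access is never split across areas.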

[0083] As described above, in the simulator 1 that characterizes this embodiment, the data access information generation unit 14 generates the data access information F3 and the data relocation unit 15 relocates frequently accessed data to the cache area. Consequently, when the object code generated by the secondary object code generation unit 16 is run again on the simulator 1, accesses to frequently accessed data code areas become faster and the program execution speed improves.

[0084] The effect is particularly significant when loop processing is performed on data codes in a data code area. For example, in a program that needs 100 loop iterations to initialize the values in a data code area to 0, if the object code can access the data code area with one instruction instead of two, program execution becomes faster by the execution time of 100 instructions.

[0085] Next, the data access information generation step S4A that characterizes the second embodiment of the present invention is described with reference to FIG. 8, a flowchart in which components common to FIG. 4 carry the same reference characters and numerals. The data access information generation step S4A of this embodiment differs from the data access information generation step S4 of the first embodiment in two respects. It uses external data access information F3A held in an external storage device in place of the data access information F3, which is internal direct-access storage data, and it has a matching entry search step S421A that searches the external data access information F3A in place of the matching entry search step S421.

[0086] The processing of the data access information generation step S4A is described with reference to FIG. 8, focusing on the differences from the first embodiment. The data access instruction determination step S411 and the address/size extraction step S412 perform the same processing as in the first embodiment. That is, step S411 determines whether the instruction is a data access instruction, and step S412 extracts the address and size from the data access instruction.

[0087] Next, the matching entry search step S421A searches the external data access information F3A, which is external storage data, for an entry matching the extracted data address and size.

[0088] Thereafter, the matching entry determination step S422, the matching entry access count increment step S423, the new entry addition step S424, and the instruction code end determination step S425 perform the same processing as in the first embodiment.
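
One way to picture the second embodiment is to keep the access table in a file instead of in memory. The sketch below uses Python's standard `sqlite3` module purely as a stand-in for the external storage device; the table name `access_info`, the column names, and the choice of SQLite are all assumptions of this example and do not come from the specification.

```python
import sqlite3


def open_access_info(path):
    """Open (or create) a table standing in for external data access information F3A."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS access_info ("
        "address INTEGER, size INTEGER, hits INTEGER, "
        "PRIMARY KEY (address, size))"
    )
    return conn


def record_access(conn, address, size):
    """Steps S421A to S424 for one data access."""
    row = conn.execute(
        "SELECT hits FROM access_info WHERE address = ? AND size = ?",
        (address, size),
    ).fetchone()                              # S421A/S422: search external table
    if row is None:                           # S424: add a new entry
        conn.execute(
            "INSERT INTO access_info VALUES (?, ?, 1)", (address, size)
        )
    else:                                     # S423: increment the access count
        conn.execute(
            "UPDATE access_info SET hits = hits + 1 "
            "WHERE address = ? AND size = ?",
            (address, size),
        )
    conn.commit()
```

Passing a file path to `open_access_info` keeps the table outside the process, so it is not bounded by the simulator's working memory.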

[0089] Next, the data relocation execution step S51A that characterizes the third embodiment of the present invention is described with reference to FIG. 9, a flowchart in which components common to FIG. 5 carry the same reference characters and numerals. The data relocation execution step S51A of this embodiment differs from the data relocation execution step S51 of the first embodiment as follows. After the maximum size entry search step S513, it performs the no-free-cache-area determination step S516, followed by either the cache area move step S514, which moves the data to the cache area when step S516 finds free cache space, or the non-cache area move step S517, which moves the data to the non-cache area (an area other than the cache area) when no free cache space remains. After the cache area move step S514 or the non-cache area move step S517, it performs the relocation information output step S515 and the data end determination step S518.

[0090] The processing of the data relocation execution step S51A of this embodiment is described with reference to FIG. 9, focusing on the differences from the first embodiment. First, the data sort step S511 sorts the data codes in descending order of access frequency for each address, based on the data access information F3, and outputs the result to the sort data F6. Next, the data extraction step S512 takes data codes from the sort data F6 in descending order of access frequency, and the maximum size entry search step S513 searches the data codes that share the extracted address and selects the one with the largest access size.

[0091] Next, if the no-free-cache-area determination step S516 finds that free cache space still remains, processing proceeds to the cache area move step S514, which moves the data code selected in the maximum size entry search step S513 to the cache area. If no free cache space remains, processing proceeds to the non-cache area move step S517, which moves the selected data code to the non-cache area.

[0092] Thereafter, as in the first embodiment, the relocation information output step S515 appends the post-relocation address to the data access information and outputs the result as relocation information, and the data end determination step S518 determines whether processing has finished.
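
The per-item check of the third embodiment can be sketched in Python as below. The sketch assumes the input is already sorted by descending access frequency with the largest access size chosen per address (S511 to S513), and it interprets "free cache space remains" as "this item fits in the remaining space". That interpretation is an assumption; the specification only says that free space is checked before each move.

```python
def relocate_with_fit_check(ordered, cache_base, cache_size, noncache_base):
    """ordered: list of (address, size) in descending access frequency.

    Returns {pre-relocation address: post-relocation address}.
    """
    relocation = {}
    cache_next, noncache_next = cache_base, noncache_base
    cache_end = cache_base + cache_size
    for addr, size in ordered:                 # S512, loop ends at S518
        if cache_next + size <= cache_end:     # S516: checked before moving
            relocation[addr] = cache_next      # S514: move to the cache area
            cache_next += size
        else:
            relocation[addr] = noncache_next   # S517: move to the non-cache area
            noncache_next += size
        # S515: the entry written above is this item's relocation information
    return relocation
```

Under this reading, a small item that comes after a larger one that did not fit can still be placed in the cache area, which the bulk move of the first embodiment would not allow.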

[0093] Next, the instruction code correction step S52A that characterizes the fourth embodiment of the present invention is described with reference to FIG. 10, a flowchart in which components common to FIG. 6 carry the same reference characters and numerals, focusing on the differences from the first embodiment. In the instruction code correction step S52A of this embodiment, a relocation information extraction step S527 first takes one entry from the relocation information F7.

[0094] Next, a relocation information end determination step S528 determines whether the relocation information F7 has been searched to its end. If so, the step ends and processing proceeds to the secondary object code generation step S6. If not, processing proceeds to the one-instruction-code extraction step S521.

[0095] Next, the one-instruction-code extraction step S521 takes one instruction code from the head of the primary object code F1.

[0096] Next, in the access instruction determination step S523, if the instruction code read in the one-instruction-code extraction step S521 is not a data code access instruction, processing returns to step S521 and the next instruction code is read. If the instruction code is a data code access instruction, processing proceeds to the post-relocation address replacement step S530, which searches the relocation information for a matching entry whose pre-relocation address matches the access address of the data code access instruction and replaces that address with the post-relocation address of the matching entry.

[0097] Next, a one-instruction-code extraction step S531 takes the next instruction from the primary object code.

[0098] Finally, the instruction code end determination step S522 determines whether the instructions have ended. If not, processing returns to the access instruction determination step S523 and the subsequent processing is repeated. If so, processing returns to the relocation information extraction step S527 and the subsequent processing is repeated.
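
The loop order of the fourth embodiment, with the relocation entries in the outer loop and the instructions in the inner loop, can be sketched as follows. The tuple layouts are hypothetical, and the sketch simplifies S530 to a comparison against the entry currently taken in S527. Matching is done against the unmodified primary object code and the result is written to a separate list, which is an assumption of this sketch.

```python
def correct_by_entry(primary_code, relocation_entries):
    """primary_code: list of (opcode, address), address None if not a data access.
    relocation_entries: list of (pre-relocation address, post-relocation address).
    """
    corrected = list(primary_code)
    for old, new in relocation_entries:                       # S527/S528
        for i, (opcode, address) in enumerate(primary_code):  # S521, S531, S522
            if address is not None and address == old:        # S523, S530
                corrected[i] = (opcode, new)
    return corrected
```

Reading from the original code while writing to a copy avoids a chained rewrite when one entry's post-relocation address equals another entry's pre-relocation address.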

[0099] Next, the instruction code analysis step S1 that characterizes the fifth embodiment of the present invention is described with reference to FIG. 11, a flowchart of its processing. The instruction code analysis step S1 determines whether the object code input to the simulator 1 (the simulation input) is the primary object code F1 generated as the result of compilation by the compiling unit 2 or the secondary object code F4 generated by the simulator 1 (primary/secondary object code determination step S11).

[0100] If the input is the primary object code F1 generated by the compiling unit 2, processing proceeds to the primary object code analysis step S12, which analyzes the instruction codes of the primary object code F1.

[0101] If the input is the secondary object code F4 generated by the simulator 1, processing proceeds to the secondary object code analysis step S13, which analyzes the instruction codes of the secondary object code F4.

[0102]

[Effects of the Invention] As described above, in the microprocessor object code optimization device, optimization method, and recording medium storing the optimization program of the present invention, the simulator that characterizes the invention analyzes the instruction codes of the primary object code generated by the compiling unit and performs instruction code execution, which is the execution of the processing corresponding to each instruction code. Based on data access information in which the number of data code accesses caused by this execution is recorded for each address and for each size of the accessed data code, the simulator detects frequently accessed data codes and relocates them to the cache area, a data code area accessible with one instruction, to generate secondary object code. By analyzing the instruction codes of the secondary object code and performing the instruction code execution, accesses to frequently accessed data code areas become faster and the program execution speed improves.

[0103] The effect is particularly significant when loop processing is performed on data codes in a data code area. For example, in a program that needs 100 loop iterations to initialize the values in a data code area to 0, if the object code can access the data code area with one instruction instead of two, program execution becomes faster by the execution time of 100 instructions.

[Brief description of the drawings]

FIG. 1 is a block diagram showing the simulation device that characterizes the microprocessor object code optimization device according to the first embodiment of the present invention.

FIG. 2 is a block diagram showing the microprocessor object code optimization device of this embodiment.

FIG. 3 is a flowchart showing an example of the simulation method, which is the processing operation of the simulation device of this embodiment.

FIG. 4 is a flowchart showing the detailed processing of the data access instruction analysis processing in FIG. 3.

FIG. 5 is a flowchart showing the detailed processing of the data relocation processing by the data relocation unit 15 in FIG. 3.

FIG. 6 is a flowchart showing the detailed processing of the data relocation processing by the data relocation unit.

FIG. 7 is an explanatory diagram showing an example of the relocation information.

FIG. 8 is a flowchart showing the detailed processing of the data access information generation unit that characterizes the second embodiment of the present invention.

FIG. 9 is a flowchart showing the detailed processing of the data access information analysis unit that characterizes the third embodiment of the present invention.

FIG. 10 is a flowchart showing the detailed processing of the code correction processing unit that characterizes the fourth embodiment of the present invention.

FIG. 11 is a flowchart showing the detailed processing of the instruction code analysis unit that characterizes the fifth embodiment of the present invention.

FIG. 12 is an explanatory diagram showing an example of the data access information for union code in the C language.

FIG. 13 is a block diagram showing an example of a conventional microprocessor object code optimization device.

FIG. 14 is an explanatory diagram showing an example of data code areas and instruction codes for a microprocessor whose data code access instructions can take a displacement of at most 16 bits.

[Explanation of symbols]

1, 100: Simulator
2: Compiling unit
11: Instruction code analysis unit
12: Instruction simulation unit
13: Profile data generation unit
14: Data access information generation unit
15: Data relocation unit
16: Secondary object code generation unit
21: Front end
22: Back end
141: Data access instruction analysis unit
142: Data access information output unit
151: Data relocation execution unit
152: Instruction code correction unit
221: Code scheduling unit
222: Object code generation unit
223: Profile data analysis unit
224: Code scheduling execution unit
F1: Primary object code
F2: Profile data
F3, F3A: Data access information
F4: Secondary object code
F5: Input code
F6: Sort data
F7: Relocation information
F100: Object code

Claims

[Claims]

1. A microprocessor object code optimization device comprising: a compiling unit that compiles input code, which is a compile program recorded on a recording medium, using profile data to generate primary object code; and a simulator that simulates the primary object code to generate the profile data, wherein the simulator analyzes the instruction codes of the primary object code generated by the compiling unit and performs instruction code execution, which is the execution of the processing corresponding to each instruction code, detects frequently accessed data codes based on data access information in which the number of data code accesses caused by the instruction code execution is recorded for each address and for each size of the accessed data code, relocates the detected data codes to a cache area, which is a data code area accessible with one instruction, to generate secondary object code, and analyzes the instruction codes of the secondary object code and performs the instruction code execution, thereby enabling high-speed simulation execution.

2. The microprocessor object code optimization device according to claim 1, wherein the simulator comprises: an instruction code analysis unit that analyzes the instruction codes of either the primary object code, which is object code generated by the compiling unit, or the secondary object code, which is object code generated by the simulator; an instruction simulation unit that executes the analyzed instruction codes; a profile data generation unit that generates the profile data based on the result of the instruction code execution; a data access information generation unit that analyzes data access instructions in the primary object code and outputs the data access address (hereinafter, address) and the data access size (hereinafter, size) to the data access information; a data relocation unit that refers to the data access information, sorts the data codes in descending order of access frequency for each address, selects the data code of the maximum size at the same address as a selected data code, relocates the selected data codes in the cache area in descending order of access frequency, and corrects the instruction codes; and a secondary object code generation unit that generates the data relocated by the data relocation unit and the corrected instruction codes as the secondary object code.

3. The microprocessor object code optimization device according to claim 2, wherein the data access information generation unit comprises: a data access instruction analysis unit that analyzes the data access instructions in the primary object code and detects the data access address and the size; and a data access information output unit that outputs the data address and the data access size detected by the data access instruction analysis unit to the data access information.

4. The microprocessor object code optimization device, wherein the data relocation unit comprises: a data relocation execution unit that refers to the data access information, sorts the data codes in descending order of access frequency for each address, selects the data code of the maximum size, relocates it in the cache area in descending order of access frequency, appends the post-relocation address to the data access information, and outputs the result as relocation information; and an instruction code correction unit that, when the object code that is read is a data code access instruction and its access address matches a pre-relocation access address in the relocation information, corrects the instruction code by replacing the access address of the object code with the post-relocation access address.

5. A method of optimizing microprocessor object code in which input code, which is a compile program recorded on a recording medium, is compiled using profile data to generate primary object code and the primary object code is simulated to generate the profile data, wherein the simulation analyzes the instruction codes of the primary object code generated by the compilation and performs instruction code execution, which is the execution of the processing corresponding to each instruction code, detects frequently accessed data codes based on data access information in which the number of data code accesses caused by the instruction code execution is recorded for each address and for each size of the accessed data code, relocates the detected data codes to a cache area, which is a data code area accessible with one instruction, to generate secondary object code, and analyzes the instruction codes of the secondary object code and performs the instruction code execution, thereby enabling high-speed simulation execution.

6. A method of optimizing microprocessor object code in which input code, which is a compile program recorded on a recording medium, is compiled using profile data to generate primary object code and the primary object code is simulated to generate the profile data, the method comprising: an instruction code analysis step of analyzing the instruction codes of the primary object code; an instruction simulation step of executing the analyzed primary instruction codes; a profile data generation step of generating the profile data based on the result of executing the primary instruction codes; a data access information generation step of analyzing data access instructions in the instruction codes, detecting the data access address and the access size, and storing the detected access address and access size in data access information; a data relocation step of detecting data codes in descending order of access frequency based on the data access information generated in the data access information generation step, in which the access count of each data code is recorded for each access address and access size, relocating them in descending order of access frequency to a cache area, which is a data code area accessible with one instruction, and correcting the instruction codes; and a secondary object code generation step of generating the object code corrected in the data relocation step as secondary object code.

7. The method of optimizing object code for a microprocessor according to claim 6, wherein the instruction code analyzing step comprises: a primary/secondary object code determining step of determining whether the object code input to the simulation is the primary object code generated by the compilation or the secondary object code generated by the simulation; a primary object code analyzing step of analyzing the instruction codes of the primary object code if the primary/secondary object code determining step determines that the input is the primary object code; and a secondary object code analyzing step of analyzing the instruction codes of the secondary object code if the primary/secondary object code determining step determines that the input is the secondary object code.

8. The method of optimizing object code for a microprocessor according to claim 6, wherein the data access information generating step comprises a data access instruction analyzing step of analyzing the data access instructions among the instruction codes, detecting the data access address and access size of each data access instruction, outputting the detected data access address and access size to the data access information, and incrementing the access count in the data access information.

9. The method of optimizing object code for a microprocessor according to claim 6, wherein the data rearrangement step comprises: a data rearrangement execution step of referring to the data access information, sorting the data codes in descending order of access frequency for each address, selecting, among the data codes at the same address, the data code having the maximum access size as a selected data code, relocating the selected data codes to the cache area in descending order of access frequency, and adding the post-relocation address to the data access information and outputting it as relocation information; and an instruction code correction step of correcting the instruction codes by, when an instruction code of the primary object code is a data code access instruction and its access address matches a pre-relocation access address in the relocation information, replacing the access address of that data code access instruction with the post-relocation access address.

10. The method of optimizing object code for a microprocessor according to claim 8, wherein the data access instruction analyzing step comprises: a data access instruction determination step of determining whether an instruction code is a data access instruction and, if it is not, proceeding to an instruction code end determination step; an address and size extraction step of extracting the data access address and access size from the data access instruction if the data access instruction determination step determines that the instruction code is a data access instruction; a corresponding entry search step of searching the data access information for an entry corresponding to the data access address and access size; a corresponding entry presence determination step of proceeding to a corresponding entry access count increment step if the data access information contains a corresponding entry; the corresponding entry access count increment step of incrementing the access count of the corresponding entry; a new entry adding step of, if the corresponding entry presence determination step finds no corresponding entry in the data access information, adding to the data access information, as a new entry, the data access address and access size extracted in the address and size extraction step; and the instruction code end determination step of determining whether the end of the instruction codes has been reached, returning to the instruction code analyzing step if it has not, and proceeding to the data rearrangement step if it has.
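
The counting loop recited in claims 8 and 10 can be illustrated with a minimal Python sketch. It is not taken from the specification: the `Instr` record, the `DATA_ACCESS_OPCODES` set and the function name `build_access_info` are hypothetical names chosen for illustration, and the sketch assumes that a newly added entry starts with an access count of 1, which the claim does not state.

```python
from typing import Dict, Iterable, NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    """Hypothetical decoded instruction code (illustrative only)."""
    opcode: str
    address: Optional[int] = None  # data access address, if any
    size: Optional[int] = None     # access size in bytes, if any


# Assumption: these opcodes stand in for "data access instructions".
DATA_ACCESS_OPCODES = {"LOAD", "STORE"}


def build_access_info(instrs: Iterable[Instr]) -> Dict[Tuple[int, int], int]:
    """Record an access count for each (address, size) pair."""
    access_info: Dict[Tuple[int, int], int] = {}
    for instr in instrs:
        # Data access instruction determination step.
        if instr.opcode not in DATA_ACCESS_OPCODES:
            continue
        # Address and size extraction step.
        key = (instr.address, instr.size)
        # Corresponding entry search and presence determination steps.
        if key in access_info:
            access_info[key] += 1   # access count increment step
        else:
            access_info[key] = 1    # new entry adding step (count of 1 assumed)
    # Reaching the end of the loop corresponds to the end determination step.
    return access_info
```

Keying the table on the (address, size) pair keeps accesses of different widths to the same address as separate entries, which is what later allows the entry with the maximum access size to be chosen for each address.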

11. The method of optimizing object code for a microprocessor according to claim 9, wherein the data rearrangement execution step comprises: a sorting step of sorting the data codes in descending order of access frequency for each address based on the data access information to generate sort data; a data fetching step of fetching data codes from the sort data in descending order; a maximum size entry search step of searching the data codes at the same address as the fetched access address for the data code having the maximum access size and selecting it as the selected data code; a cache area moving step of moving the selected data code to the cache area; a relocation information output step of adding the post-relocation address to the data access information and outputting it as relocation information; a cache free area determination step of determining whether free space remains in the cache area, proceeding to a data end determination step if free space remains, and proceeding to a non-cache area moving step if the free space is exhausted; the data end determination step of proceeding to the instruction code correction step if all addresses of the sort data have been processed, and otherwise returning to the data fetching step to repeat the above processing; and the non-cache area moving step of moving the data codes at the remaining access addresses of the sort data to a non-cache area, which is a data code area accessible at low speed, outputting the result to the relocation information, and proceeding to the instruction code correction step.
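
The data rearrangement execution step can be sketched as follows. This is a simplified illustration under stated assumptions, not the claimed implementation: the function name `relocate` and its parameters are hypothetical; the access frequency of an address is assumed to be the sum of the counts of all sizes recorded at that address; data codes are packed back to back with no alignment; and, unlike claim 11, the sketch checks whether a data code fits before placing it and sends every later data code to the non-cache area once one fails to fit.

```python
from typing import Dict, Tuple


def relocate(
    access_info: Dict[Tuple[int, int], int],
    cache_base: int,
    cache_size: int,
    noncache_base: int,
) -> Dict[int, int]:
    """Return relocation information: old address -> new address."""
    # Collapse entries per address: total count and maximum access size.
    per_addr: Dict[int, Tuple[int, int]] = {}
    for (addr, size), count in access_info.items():
        total, max_size = per_addr.get(addr, (0, 0))
        per_addr[addr] = (total + count, max(max_size, size))

    # Sorting step: descending access frequency (ties keep insertion order).
    order = sorted(per_addr, key=lambda a: per_addr[a][0], reverse=True)

    relocation: Dict[int, int] = {}
    cache_next = cache_base
    cache_end = cache_base + cache_size
    noncache_next = noncache_base
    cache_full = False

    for addr in order:                      # data fetching step
        size = per_addr[addr][1]            # maximum size entry selected
        if not cache_full and cache_next + size <= cache_end:
            relocation[addr] = cache_next   # cache area moving step
            cache_next += size
        else:
            cache_full = True               # free space exhausted
            relocation[addr] = noncache_next  # non-cache area moving step
            noncache_next += size
    return relocation
```

For example, with a 5-byte cache area starting at address 0, an address accessed five times with size 1 is placed at 0, an address accessed three times with maximum size 4 is placed at 1, and a remaining 4-byte data code is placed at the start of the non-cache area.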

12. The method of optimizing object code for a microprocessor according to claim 9, wherein the instruction code correction step comprises: a one instruction code fetching step of fetching one instruction code from the primary object code; an instruction code end determination step of determining whether the instruction codes have been read to the end, terminating the operation if they have, and proceeding to an access instruction determination step if an instruction code remains to be read; the access instruction determination step of returning to the one instruction code fetching step if the instruction code is not a data code access instruction, and proceeding to a match search step if it is a data code access instruction; the match search step of searching the relocation information for a matching entry, that is, an entry whose pre-relocation address matches the access address of the data code access instruction, proceeding to a replacement step if a matching entry is found, and returning to the one instruction code fetching step to repeat the above processing if no matching entry is found; and the replacement step of replacing the address of the data code access instruction with the post-relocation address of the matching entry, and returning to the one instruction code fetching step to repeat the above processing.
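
The instruction code correction step amounts to a single pass over the instruction codes with a lookup in the relocation information. The sketch below is illustrative only: `Instr`, `DATA_ACCESS_OPCODES` and `correct_instructions` are hypothetical names, and the relocation information is assumed to be a simple mapping from the address before relocation to the address after relocation.

```python
from typing import Dict, Iterable, List, NamedTuple, Optional


class Instr(NamedTuple):
    """Hypothetical decoded instruction code (illustrative only)."""
    opcode: str
    address: Optional[int] = None  # data access address, if any


# Assumption: these opcodes stand in for "data code access instructions".
DATA_ACCESS_OPCODES = {"LOAD", "STORE"}


def correct_instructions(
    instrs: Iterable[Instr], relocation: Dict[int, int]
) -> List[Instr]:
    """Rewrite access addresses using old address -> new address entries."""
    corrected: List[Instr] = []
    for instr in instrs:                     # one instruction code fetching step
        # Access instruction determination step, then match search step.
        if instr.opcode in DATA_ACCESS_OPCODES and instr.address in relocation:
            # Replacement step: substitute the address after relocation.
            instr = instr._replace(address=relocation[instr.address])
        corrected.append(instr)
    return corrected                         # loop end = end determination step
```

Instructions that are not data code access instructions, and data code access instructions with no matching entry, pass through unchanged, mirroring the "return to the fetching step" branches of the claim.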

13. The method of optimizing object code for a microprocessor according to claim 9, wherein the data rearrangement execution step comprises: a sorting step of sorting the data codes in descending order of access frequency for each address based on the data access information to generate sort data; a data fetching step of fetching data codes from the sort data in descending order; a maximum size entry search step of searching the data codes at the same address as the fetched access address for the data code having the maximum access size and selecting it as the selected data code; a cache free area determination step of determining whether free space remains in the cache area, proceeding to a cache area moving step if free space remains, and proceeding to a non-cache area moving step if the free space is exhausted; the cache area moving step of moving the selected data code to the cache area; the non-cache area moving step of moving the data codes at the remaining access addresses of the sort data to a non-cache area, which is a data code area accessible at low speed; a relocation information output step of adding the post-relocation address to the data access information and outputting it as relocation information; and a data end determination step of proceeding to the instruction code correction step if all addresses have been processed, and otherwise returning to the data fetching step to repeat the above processing.

14. The method of optimizing object code for a microprocessor according to claim 9, wherein the instruction code correction step comprises: a relocation information extracting step of extracting one entry from the relocation information; a relocation information end determination step of determining whether the relocation information has been searched to the end, terminating the processing if it has, and proceeding to a first one instruction code fetching step if it has not; the first one instruction code fetching step of fetching one instruction code; an access instruction determination step of returning to the one instruction code fetching step if the fetched instruction code is not a data code access instruction, and proceeding to a post-relocation address replacement step if it is a data code access instruction; the post-relocation address replacement step of checking whether the access address of the data code access instruction matches the pre-relocation address in the relocation information and, on a match, replacing the access address with the post-relocation address; a second one instruction code fetching step of fetching the next instruction code; and an instruction code end determination step of determining whether the instruction codes have ended, returning to the access instruction determination step to repeat the subsequent processing if they have not, and returning to the relocation information extracting step to repeat the subsequent processing if they have.

15. A recording medium on which is recorded a program for optimizing object code for a microprocessor, the program compiling input code, being a program to be compiled, using profile data to generate primary object code, and simulating the primary object code to generate the profile data, wherein the simulation: analyzes the instruction codes of the primary object code generated by the compilation and executes the processing corresponding to each instruction code; detects frequently accessed data codes based on data access information in which the number of accesses made to data codes by executing the instruction codes is recorded for each address and size of the accessed data code; relocates the detected data codes to a cache area, which is a data code area accessible by one instruction, to generate secondary object code; and analyzes and executes the instruction codes of the secondary object code, thereby enabling high-speed simulation execution.

16. A recording medium on which is recorded a program for optimizing object code for a microprocessor, the program compiling input code, being a program to be compiled, using profile data to generate primary object code, and simulating the primary object code to generate the profile data, the program comprising: an instruction code analyzing step of analyzing the instruction codes of the primary object code; an instruction simulation step of executing the analyzed instruction codes; a profile data generating step of generating the profile data based on the result of executing the instruction codes; a data access information generating step of analyzing the data access instructions among the instruction codes, detecting the access address and access size of the data, and storing the detected access address and access size in data access information; a data rearrangement step of detecting data codes in descending order of access frequency, based on the data access information in which the data access information generating step has recorded the access count of each data code for each access address and access size, relocating the data codes in that order to a cache area, which is a data code area accessible by one instruction, and correcting the instruction codes; and a secondary object code generating step of generating, as secondary object code, the object code corrected in the data rearrangement step.