JP4892387B2

JP4892387B2 - Execution path detection apparatus, information processing apparatus, execution path detection method, program, and recording medium

Info

Publication number: JP4892387B2
Application number: JP2007094075A
Authority: JP
Inventors: 則文吉松; 真吉田; 和彰村上; 敦浩須賀
Original assignee: Kyushu University NUC; Fujitsu Ltd; Fukuoka Industry Science and Technology Foundation
Current assignee: Kyushu University NUC; Fujitsu Ltd; Fukuoka Industry Science and Technology Foundation
Priority date: 2007-03-30
Filing date: 2007-03-30
Publication date: 2012-03-07
Anticipated expiration: 2027-03-30
Also published as: JP2008250867A

Description

The present invention relates to an execution path detection device, an information processing device, an execution path detection method, a program, and a recording medium.

Techniques that exploit program locality have been proposed to speed up processors that execute application programs and to reduce their power consumption. These techniques find the instruction sequences that are critical to program execution and optimize the processing of those sequences. One way to find such critical instruction sequences is to use the execution history of the branch instructions they contain.

For example, Patent Document 1 discloses a branch history recording device that detects the instructions forming a loop structure by focusing on the number of executions of branch instructions that branch in the return direction, that is, the direction opposite to the one in which the instructions of a sequence are executed, and that records the path through the loop judged to be executed many times. In other words, Patent Document 1 assumes that locality resides in the loop structures of a program and detects that path as the execution path of a frequently executed instruction sequence.

Patent Document 2, for example, discloses an estimation device that provides a profiler table, which stores the branch history of branch instructions, and a return branch destination table, which stores return-direction branch destination addresses and branch counts. The device detects the execution path of a frequently executed instruction sequence based on the branch destination addresses in the return branch destination table and the branch history stored in the profiler table.
Patent Document 1: JP 2000-148482 A
Patent Document 2: JP 2005-92532 A

However, the branch history recording device disclosed in Patent Document 1 requires processing by software. Detecting the execution path of a frequently executed instruction sequence therefore takes a long time, and it is difficult to identify the execution path of the desired instruction sequence while the processor is running.

The estimation device disclosed in Patent Document 2 must record a very large number of branch histories and branch destination addresses, so the storage area needed for them becomes enormous. If the storage area is reduced, processing is interrupted by the need to save data and the like, which lengthens the time needed to detect the execution path of the desired instruction sequence. Furthermore, because the processing performed by the SW profiler unit of Patent Document 2 is complex, it is difficult to implement in hardware and has to be implemented in software. As with Patent Document 1, it is therefore difficult to identify the execution path of the desired instruction sequence while the processor is running.

The present invention has been made in view of the technical problems described above. Its object is to provide an execution path detection device, an information processing device, and an execution path detection method that can detect the execution path of a frequently executed instruction sequence at high speed with a small memory capacity, together with a program that causes a computer to carry out the execution path detection method and a recording medium on which the program is recorded.

To solve the above problems, the present invention relates to
an execution path detection device for detecting the execution path of a frequently executed instruction sequence, including:
an execution history table that stores the number of executions of the instruction at the head address of each instruction sequence block, where an instruction sequence containing a branch instruction constitutes one block;
a branch history table that stores the branch history of the branch instruction; and
a branch history management unit that collects the branch history of the branch instruction,
wherein, on condition that the number of executions exceeds a given threshold, the branch history management unit starts collecting the branch history and outputs, based on the branch history, path information for identifying the execution path.

The execution path detection device according to the present invention may
include a counter that counts the number of executions of the instruction at the head address of the instruction sequence block, and
the execution history table may
store the number of executions for each head-address instruction.

According to any of the above inventions, the device exploits the locality of the application program: it first detects the repeated portion of a frequently executed instruction sequence and only then collects the branch history of that portion. There is therefore no need to collect an enormous branch history in order to detect the execution path of a frequently executed instruction sequence in the application program. As a result, the execution path of a frequently executed instruction sequence can be detected at high speed with a small memory capacity while limiting any loss of detection accuracy.
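The two-phase behavior described above (count head-address executions first, collect branch history only after a threshold is exceeded) can be sketched in software. This is a minimal illustrative model, not the claimed hardware: the class name `TwoPhaseProfiler`, its method names, and the use of Python dictionaries for the two tables are all assumptions made for the example.

```python
from collections import defaultdict


class TwoPhaseProfiler:
    """Hypothetical software model of the two-phase detection scheme."""

    def __init__(self, threshold):
        self.threshold = threshold            # the "given threshold" in the text
        self.exec_counts = defaultdict(int)   # execution history: head address -> count
        self.branch_counts = defaultdict(int) # branch history: (branch, target) -> count
        self.hot_head = None                  # head address that exceeded the threshold

    def on_block_head(self, head_addr):
        """Phase 1: count executions of a block's head instruction."""
        if self.hot_head is not None:
            return                            # already collecting branch history
        self.exec_counts[head_addr] += 1
        if self.exec_counts[head_addr] > self.threshold:
            self.hot_head = head_addr         # triggers branch history collection

    def on_branch(self, branch_addr, target_addr):
        """Phase 2: record branches only after a hot head has been found."""
        if self.hot_head is not None:
            self.branch_counts[(branch_addr, target_addr)] += 1

    def path_info(self):
        """Return, for each recorded branch, its most frequently taken target."""
        best = {}
        for (branch, target), n in self.branch_counts.items():
            if branch not in best or n > best[branch][1]:
                best[branch] = (target, n)
        return {branch: target for branch, (target, _) in best.items()}
```

Branches seen before the threshold is exceeded are simply ignored in this model, which is what keeps the amount of recorded branch history small.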

In the execution path detection device according to the present invention,
at least part of the storage area of the execution history table may overlap the storage area of the branch history table.

In the execution path detection device according to the present invention,
at least part of the information to be registered in the branch history table may be written to the storage area of the execution history table.

In any of the above inventions, the execution history table is first used to detect the repeated portion of a frequently executed instruction sequence, and the branch history of that portion is then collected in the branch history table. The execution history table is therefore no longer needed once branch history collection begins. Because the execution history table and the branch history table share a storage area, the storage capacity required for the two tables can be reduced.
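The idea of sharing one storage area between the two tables can be illustrated with a small model in which the same slots hold execution counts first and branch history afterwards. The class `SharedHistoryStorage`, the direct-mapped indexing by address modulo table size, and the slot layouts are assumptions for this sketch; the text does not specify how the shared area is organized.

```python
class SharedHistoryStorage:
    """Hypothetical model: one storage area used first as the execution
    history table and later reused as the branch history table."""

    def __init__(self, size):
        self.slots = [None] * size  # the single shared storage area
        self.mode = "exec"          # "exec" or "branch"

    def count_head(self, head_addr):
        """Execution-history phase: a slot holds (head_addr, count)."""
        if self.mode != "exec":
            raise RuntimeError("storage already reused for branch history")
        i = head_addr % len(self.slots)
        entry = self.slots[i]
        count = entry[1] + 1 if entry and entry[0] == head_addr else 1
        self.slots[i] = (head_addr, count)
        return count

    def start_branch_history(self):
        """Reuse the same slots; the execution counts are no longer needed."""
        for i in range(len(self.slots)):
            self.slots[i] = None
        self.mode = "branch"

    def record_branch(self, branch_addr, target_addr):
        """Branch-history phase: a slot holds (branch_addr, target_addr, count)."""
        if self.mode != "branch":
            raise RuntimeError("branch history collection has not started")
        i = branch_addr % len(self.slots)
        entry = self.slots[i]
        same = entry and entry[:2] == (branch_addr, target_addr)
        count = entry[2] + 1 if same else 1
        self.slots[i] = (branch_addr, target_addr, count)
        return count
```

The total number of slots never grows when the mode changes, which is the storage saving the paragraph describes.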

In the execution path detection device according to the present invention,
the execution history table may
store the number of executions of the instruction at the branch destination address of a branch instruction that branches in the return direction.

According to the present invention, only the execution history of return-direction branch instructions is collected, so a portion that is executed repeatedly at high frequency can be detected with a small memory capacity.
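As a rough illustration of counting only return-direction branches, the sketch below tallies the destinations of taken branches whose target lies at or before the branch itself. Treating "return direction" as `target_addr <= branch_addr` is an assumption of this example, as are the function name and the input format.

```python
def count_backward_targets(branches):
    """Count executions per destination for return-direction branches only.

    branches: iterable of (branch_addr, target_addr) pairs for taken branches.
    """
    counts = {}
    for branch_addr, target_addr in branches:
        # Assumed test for "return direction": the target is not ahead of the branch.
        if target_addr <= branch_addr:
            counts[target_addr] = counts.get(target_addr, 0) + 1
    return counts
```

Forward branches never create an entry, so the table only grows with the number of distinct loop heads.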

In the execution path detection device according to the present invention,
the execution history table may store the head address and the number of executions, and
the branch history management unit may
start collecting the branch history using the head address stored in association with the number of executions that exceeded the threshold.

According to the present invention, the branch history is collected for the high-frequency repeated portion detected with the execution history table, so the branch history table can be built with a small memory capacity.

In the execution path detection device according to the present invention,
the branch history table may
store its information in a set-associative manner,
each set may
store the head address, the target address of the branch instruction contained in the instruction sequence block, and the branch count, and
of the target addresses stored for the branch instruction, the target address of the set with the largest branch count may be output.

In the execution path detection device according to the present invention,
the branch history management unit may
output, as path information, the target address obtained by searching the branch history table using the branch destination address of the branch instruction as an index.

According to any of the above inventions, the target address can be output based on the branch history table, so the execution path can be identified at high speed without unnecessary software processing.
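A set-associative branch history table that returns the most frequently taken target can be modeled as follows. This is a sketch under stated assumptions: the class name, indexing by head address modulo the number of sets, the two-way default, and the evict-lowest-count replacement policy are all hypothetical choices that the text does not specify.

```python
class BranchHistoryTable:
    """Hypothetical set-associative table of [head_addr, target_addr, count]."""

    def __init__(self, num_sets=4, ways=2):
        self.num_sets = num_sets
        self.ways = ways
        # Each set holds at most `ways` entries.
        self.sets = [[] for _ in range(num_sets)]

    def record(self, head_addr, target_addr):
        """Increment the count for (head_addr, target_addr), allocating if needed."""
        entries = self.sets[head_addr % self.num_sets]
        for entry in entries:
            if entry[0] == head_addr and entry[1] == target_addr:
                entry[2] += 1
                return
        if len(entries) == self.ways:
            # Assumed replacement policy: evict the entry with the lowest count.
            entries.remove(min(entries, key=lambda e: e[2]))
        entries.append([head_addr, target_addr, 1])

    def hottest_target(self, head_addr):
        """Return the stored target with the largest branch count, or None."""
        entries = [e for e in self.sets[head_addr % self.num_sets]
                   if e[0] == head_addr]
        if not entries:
            return None
        return max(entries, key=lambda e: e[2])[1]
```

Because the lookup is a bounded search within one set, it maps naturally onto a hardware comparison across ways rather than a software scan of the whole history.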

In the execution path detection device according to the present invention,
when the branch destination of the branch instruction is not a head address recorded in the execution history table, or when the number of executions of the branch instruction does not reach a predetermined count within a given sampling time,
the branch history management unit may
invalidate the information stored in the branch history table.

According to the present invention, after branch history collection has started, the device detects that the high-frequency repeated execution did not occur and invalidates the branch history table, so path information is not collected needlessly.
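The two invalidation conditions can be written as a single predicate. The function and parameter names below are hypothetical, and representing the end of the sampling time as a boolean flag is a simplification made for this sketch.

```python
def should_invalidate(branch_target, recorded_heads, branches_seen,
                      required_branches, sampling_expired):
    """Return True when the collected branch history should be discarded.

    Condition 1: the branch destination is not a recorded head address.
    Condition 2: the sampling time ended before enough branches were executed.
    """
    left_region = branch_target not in recorded_heads
    too_few = sampling_expired and branches_seen < required_branches
    return left_region or too_few
```

For example, `should_invalidate(0x500, {0x100}, 5, 10, False)` is true because the branch went somewhere other than a recorded head address.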

The present invention also relates to an information processing device including:
any of the execution path detection devices described above; and
a central processing unit that executes an application program,
wherein the processing of the application program is optimized based on the path information from the execution path detection device.

The present invention also relates to an information processing device including:
any of the execution path detection devices described above;
a central processing unit that executes an application program; and
a path processing unit that optimizes the processing of the application program based on the path information from the execution path detection device.

According to any of the above inventions, it is possible to provide an information processing device whose processing is optimized by applying an execution path detection device that can detect the execution path of a frequently executed instruction sequence at high speed with a small memory capacity.

The present invention also relates to
an execution path detection method for detecting the execution path of a frequently executed instruction sequence, including the steps of:
storing, in an execution history table, the number of executions of the instruction at the head address of each instruction sequence block, where an instruction sequence containing a branch instruction constitutes one block;
starting to collect the branch history of the branch instruction on condition that the number of executions exceeds a given threshold, and storing the branch history in a branch history table; and
outputting, based on the branch history, path information for identifying the execution path.

In the execution path detection method according to the present invention,
at least part of the storage area of the execution history table may overlap the storage area of the branch history table.

In the execution path detection method according to the present invention,
the head address may be
the branch destination address of a branch instruction that branches in the return direction,
the execution history table may store the head address and the number of executions, and
collection of the branch history may be started using the head address stored in association with the number of executions that exceeded the threshold.

In the execution path detection method according to the present invention,
the branch history table may
store its information in a set-associative manner,
each set may
store the head address, the target address of the branch instruction contained in the instruction sequence block, and the branch count, and
of the target addresses stored for the branch instruction, the target address of the set with the largest branch count may be output.

In the execution path detection method according to the present invention,
the target address obtained by searching the branch history table using the branch destination address of the branch instruction as an index may be output as path information.

In the execution path detection method according to the present invention,
when the branch destination of the branch instruction is not a head address recorded in the execution history table, or when the number of executions of the branch instruction does not reach a predetermined count within a given sampling time,
the information stored in the branch history table may be invalidated.

According to any of the above inventions, it is possible to provide an execution path detection method that can detect the execution path of a frequently executed instruction sequence at high speed with a small memory capacity.

The present invention also relates to
a program for causing a computer to execute any of the execution path detection methods described above.

According to the present invention, it is possible to provide a program that causes a computer to carry out an execution path detection method capable of detecting the execution path of a frequently executed instruction sequence at high speed with a small memory capacity.

The present invention also relates to
a computer-readable recording medium on which the program described above is recorded.

According to the present invention, it is possible to provide a recording medium on which is recorded a program that causes a computer to carry out an execution path detection method capable of detecting the execution path of a frequently executed instruction sequence at high speed with a small memory capacity.

Embodiments of the present invention are described in detail below with reference to the drawings. The embodiments described below do not unduly limit the content of the present invention as set out in the claims, and not all of the configurations described below are essential constituent features of the present invention.

1. Information Processing Device
FIG. 1 is a block diagram showing a configuration example of the information processing device according to this embodiment.

The information processing device 100 includes a central processing unit (CPU) 10, an accelerator 20, and a profiler (execution path estimation device, execution path detection device) 30. The information processing device 100 also includes a bus 50, over which the CPU 10, the accelerator 20, and the profiler 30 exchange data and addresses. A memory 110 is provided outside the information processing device 100 and is connected to the bus 50 via a memory bus 52, so that the CPU 10, the accelerator 20, and the profiler 30 can access the memory 110 via the bus 50 and the memory bus 52.

The CPU 10 reads an application program stored in its internal memory or in the memory 110 and executes the instructions of that program.

The accelerator 20 performs, in place of the CPU 10, processing that the CPU 10 would otherwise execute. The processing time of the accelerator 20 is shorter than that of the CPU 10. The functions of the accelerator 20 are implemented in hardware or software. When they are implemented in software, the accelerator 20 has its own CPU and memory, and that CPU implements the functions of the accelerator 20 by reading an accelerator program stored in that memory or in the memory 110 of FIG. 1 and executing the corresponding processing.

Such an accelerator 20 produces the processing result of a program using an instruction sequence of the program that has been optimized, for example, for shorter processing time or lower power consumption. For example, by executing an instruction sequence from which the unnecessary instructions of the program to be processed by the CPU 10 have been removed, the accelerator 20 obtains in less time the same result that the CPU 10 would obtain by executing the program. As another example, by processing the instruction sequence of the program in parallel, the accelerator 20 obtains in less time the same result that the CPU 10 would obtain.

The profiler (in a broad sense, an execution path detection device) 30 detects (estimates) with high accuracy, while a program is running, a frequently executed path with a loop structure (an execution path, instruction path, or hot path) and outputs hot path information (in a broad sense, path information), which identifies the detected path. The profiler 30 detects the repeatedly executed instruction sequences among the instruction sequences that make up the program executed by the CPU 10 and identifies the frequently executed path based on the branch history of those sequences.

A hot path processing unit 120 is also provided outside the information processing device 100. Based on the hot path information from the profiler 30, the hot path processing unit 120 identifies the frequently executed loop-structure path and optimizes the processing of that path, thereby shortening the processing time of the information processing device 100 or reducing its power consumption during processing.

For example, the hot path processing unit 120 optimizes the instruction sequence so that the path identified from the hot path information of the profiler 30 is processed in parallel, or optimizes it by deleting instructions that need not be executed. The hot path processing unit 120 then stores the optimized instruction sequence in the accelerator 20 or in a memory accessed by the accelerator 20.

Alternatively, the hot path processing unit 120 may change the hardware configuration of the accelerator 20 so that the path identified from the hot path information of the profiler 30 is processed in less time, or so that the information processing device 100 consumes less power when processing that path. In this case, the accelerator 20 is a dynamically reconfigurable circuit, and the hot path processing unit 120 supplies the accelerator 20 with the hardware configuration information needed for dynamic reconfiguration.

Next, an overview of the processing of the information processing device 100 of FIG. 1 is given.

In this embodiment, the execution path of an instruction sequence is considered in units of instruction sequence blocks called basic blocks.

FIG. 2 is an explanatory diagram of basic blocks in this embodiment.

With respect to the execution path of the instruction sequence that makes up an application program, the instructions of the program can be divided into branch instructions, which change the flow of execution, and all other instructions. Because the instructions are executed in order from lower to higher addresses, the instruction sequence can be divided into blocks at each branch instruction that changes the flow of execution. Each such block is a basic block. A basic block is, for example, a block of instructions that starts with the instruction following the final branch instruction of the preceding basic block and ends with the first branch instruction encountered as execution proceeds toward higher addresses.

FIG. 2 shows basic blocks BB1, BB2, ..., BBK (K is an integer of 3 or more). Basic block BB1 starts with the instruction “inst1” at its head address BSA1, executes its instructions in order in the direction of execution, and ends with the branch instruction “bne” at address BIA1. If the branch instruction “bne” is a conditional branch instruction, the branch destination address is BSA3 when the condition is true and BSA2, the address following BIA1, when the condition is false.

Basic block BB2 starts with the instruction “inst2” at its head address BSA2, executes its instructions in order in the direction of execution, and ends with the branch instruction “br” at address BIA2. Because address BSA2 is a branch destination address of the branch instruction at address BIA1, address BSA2 is the branch target address BTA1.

Basic block BB3 starts with the instruction “inst3” at its head address BSA3, executes its instructions in order in the direction of execution, and likewise ends with a branch instruction (not shown) at address BIA3. Because address BSA3 is a branch destination address of the branch instruction at address BIA1, address BSA3 is the branch target address BTA2.

Accordingly, an instruction execution path can be expressed as the path through basic blocks BB1 and BB2 or as the path through basic blocks BB1 and BB3.
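The block partitioning described for FIG. 2 can be sketched as a simple pass over an address-ordered instruction list that closes a block at every branch instruction. The function name, the `(address, mnemonic)` representation, and the example addresses are assumptions for illustration; only the mnemonics “inst1”, “bne”, “inst2”, “br”, and “inst3” come from the text. Note that this follows the definition given here (a block ends at each branch) and does not additionally split blocks at branch targets.

```python
def split_basic_blocks(instructions, is_branch):
    """Split an address-ordered instruction list into blocks ending at branches.

    instructions: list of (address, mnemonic) in ascending address order.
    is_branch: callable that returns True for branch mnemonics.
    """
    blocks, current = [], []
    for addr, mnemonic in instructions:
        current.append((addr, mnemonic))
        if is_branch(mnemonic):
            blocks.append(current)   # a branch closes the current block
            current = []
    if current:
        blocks.append(current)       # trailing instructions without a closing branch
    return blocks


# Hypothetical addresses; mnemonics follow the FIG. 2 description.
program = [(0x00, "inst1"), (0x04, "bne"),
           (0x08, "inst2"), (0x0C, "br"),
           (0x10, "inst3")]
blocks = split_basic_blocks(program, lambda m: m in {"bne", "br"})
head_addresses = [block[0][0] for block in blocks]
```

Here `head_addresses` holds the start address of each block, which plays the role of BSA1, BSA2, and BSA3 in the description.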

FIG. 3 shows an outline flow of an example of the processing of the information processing device 100 of FIG. 1.

First, the profiler 30 of the information processing device 100 collects the execution history of the CPU 10 as it runs the application program and detects the portion of the program's instruction sequence that is executed repeatedly at high frequency (step S10). The instruction sequence of the application program can be divided into basic blocks, each containing one branch instruction, and the execution path of the program can be identified from the flow between basic blocks. As a result, as shown in FIG. 4, the path that returns from basic block F to basic block A, for example, is detected as the portion executed repeatedly at high frequency. FIG. 4 shows basic blocks A to F, and program execution is assumed to flow from basic block A toward any of basic blocks B to G.

As shown in FIG. 3, the information processing device 100 next starts collecting the branch history of the frequently executed portion detected in step S10. The information processing device 100 then outputs hot path information that identifies the path, based on the collected branch history information (step S11).

The hot path processing unit 120 receives the hot path information from the profiler 30 and optimizes the processing of the hot path as described above (step S12), which ends the series of processes (END).

FIG. 5 is an explanatory diagram of the hot path of FIG. 3.

FIG. 5 shows the same basic blocks A to F as FIG. 4. Each basic block includes an instruction FI at the basic block start address (BSA) and a branch instruction BI at the branch instruction address (BIA), which is the last address of the basic block. An arrow connecting two basic blocks indicates that the flow of processing was changed by a branch instruction, and the number attached to the arrow indicates how many times that branch was taken.

For example, the branch instruction BI of basic block A has two branch destinations, which indicates that it is a conditional branch instruction. The arrow from basic block A to basic block B indicates, for example, that the branch target address when the condition of the branch instruction BI of basic block A is "true" is the start address BSA of basic block B. The history shows that the flow of processing changed from basic block A to basic block B 800 times. Likewise, the arrow from basic block A to basic block C indicates, for example, that the branch target address when the condition is "false" is the start address BSA of basic block C. The history shows that the flow of processing changed from basic block A to basic block C 200 times.

In contrast, the branch instruction BI of basic block D, for example, has a single branch destination, which indicates that it is an unconditional branch instruction. The history shows that this unconditional branch was taken 700 times.

To collect such history as history information, the basic block start address BSA, the branch target address BTA (a target address in a broad sense), the branch instruction address BIA, and the branch count are managed in a table. The route with the largest branch counts between basic blocks is then identified as the hot path. In FIG. 5, the route through basic blocks A, B, D, and F is the hot path, and it is identified by outputting, as hot path information, the sequence of start addresses of basic blocks A, B, D, and F.
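The route selection described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `hot_path`, the use of block names in place of addresses, and the branch counts for blocks B, C, E, and F are assumptions made for the example. Only the counts 800, 200, and 700 come from the description of FIG. 5.

```python
from collections import defaultdict

def hot_path(records, start):
    """Follow the most frequently taken branch out of each basic block."""
    successors = defaultdict(list)
    for bsa, _bia, bta, count in records:
        successors[bsa].append((count, bta))
    path, current = [], start
    # Stop when the walk returns to a block already on the path (a loop)
    # or reaches a block with no recorded branch.
    while current not in path and current in successors:
        path.append(current)
        current = max(successors[current])[1]  # target with the largest COUNT
    if current not in path:
        path.append(current)
    return path

# Hypothetical table rows: (BSA, BIA, BTA, COUNT).
records = [
    ("A", "A.BI", "B", 800),   # from FIG. 5
    ("A", "A.BI", "C", 200),   # from FIG. 5
    ("B", "B.BI", "D", 700),   # assumed
    ("C", "C.BI", "E", 200),   # assumed
    ("D", "D.BI", "F", 700),   # from FIG. 5
    ("E", "E.BI", "F", 200),   # assumed
    ("F", "F.BI", "A", 900),   # assumed loop-back branch
]
print(hot_path(records, "A"))
```

With these assumed counts the walk visits A, B, D, and F and stops when the branch from F leads back to A.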

As a result, the processing of the information processing apparatus 100 can be optimized selectively, for example by devoting greater optimization effort to the instruction sequence on the path through basic blocks A, B, D, and F.

Thus, in this embodiment, attention is paid to the locality of the program: a repeated portion of the instruction sequence that is executed at high frequency is detected first, and only then is the branch history of that repeated portion collected. This removes the need to collect an enormous branch history in order to detect the hot path of an application program. According to this embodiment, the execution path of a frequently executed instruction sequence can therefore be detected quickly and with a small memory capacity, while limiting any loss of path detection accuracy.

2. Description of the Profiler
Next, the profiler 30 of FIG. 1, which performs this hot path detection, is described.

FIG. 6 shows an outline of the configuration of the profiler 30 of FIG. 1.

The profiler 30 includes a control unit 200 and a management table unit 300. The control unit 200 includes an execution history management unit 210 and a branch history management unit 220. The management table unit 300 includes an execution history table 310 and a branch history table 320.

The functions of the control unit 200 may be realized by software or by dedicated hardware. When the control unit 200 is realized by software, a CPU of the profiler 30 performs the processing after reading a program that implements the execution path detection method described later. It is also desirable that, in the management table unit 300, the execution history table 310 and the branch history table 320 share a storage area, as described later.

The execution history management unit 210 performs control for managing the execution history table 310.

FIG. 7 shows an outline of the configuration of the execution history table 310 of FIG. 6.

The execution history table 310 has a plurality of entries. Each entry stores, in association with a basic block start address BSA, the execution count COUNT of the instruction at that start address. In other words, the number of times the instruction at each basic block start address BSA is executed is counted, and the execution history table 310 stores that count for each such instruction.

Here, the start address BSA is the target address of a backward branch (a branch in the return direction). The execution history table 310 may therefore register that address as a backward branch target address rather than as a basic block start address. Collecting only the execution history of backward branch instructions in this way makes it possible to detect frequently repeated execution portions with a small memory capacity.

Accordingly, the execution history management unit 210 performs management control such as registering, adding, and searching entries in the execution history table 310. Each entry holds the start address BSA of a basic block (an instruction sequence block, that is, an instruction sequence containing a branch instruction treated as one block) and the execution count of the instruction at that start address.

In FIG. 6, the branch history management unit 220 starts collecting the branch history of branch instructions on condition that the execution count of the instruction at any one of the basic block start addresses managed in the execution history table 310 has exceeded a given threshold TH1, and thereafter performs the processing of collecting the branch history.

FIG. 8 shows an outline of the configuration of the branch history table 320 of FIG. 6.

The branch history table 320 has a plurality of entries. Each entry stores, in association with a basic block start address BSA, the branch instruction address BIA, the branch target address BTA of that branch instruction, and the branch count COUNT of that branch instruction. Here, the start address BSA may be the target address of either a backward branch or a forward branch.

Accordingly, the branch history management unit 220 performs management control such as registering, adding, and searching, in the branch history table 320, the basic block start address BSA, the branch instruction address, the branch target address of that branch instruction, and the branch count. The branch history management unit 220 also outputs hot path information (path information for identifying a frequently executed execution path) based on the branch history registered in the branch history table 320.

In FIG. 8, the branch instruction address BIA need not be stored in the branch history table 320. Storing it, however, allows the branch target address BTA to be held as offset information relative to the branch instruction address BIA, which reduces the storage area needed for the branch target address BTA.
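The offset representation mentioned above can be illustrated with a short sketch. The function names and the addresses are hypothetical, and the text does not specify the offset width or encoding, so this only shows the round trip between an absolute BTA and an offset relative to BIA.

```python
def encode_bta(bia, bta):
    """Represent the branch target as a signed offset from the branch instruction."""
    return bta - bia

def decode_bta(bia, offset):
    """Recover the absolute branch target address from BIA and the stored offset."""
    return bia + offset

# Hypothetical addresses for a short backward branch.
bia, bta = 0x0040_1238, 0x0040_1200
offset = encode_bta(bia, bta)
assert decode_bta(bia, offset) == bta
print(hex(bta), "is stored as offset", offset)
```

For branches within a loop body the offset is typically small compared with a full address, which is the source of the storage saving the text describes.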

In this embodiment, a repeated portion of the instruction sequence that is executed at high frequency is first detected using the execution history table 310, and the branch history of that repeated portion is then collected in the branch history table 320. The execution history table 310 is therefore no longer needed once branch history collection begins. For this reason, in this embodiment the execution history table 310 and the branch history table 320 share a storage area, which reduces the storage capacity that must be provided in the management table unit 300.

FIGS. 9(A) and 9(B) are explanatory diagrams of the management table unit 300 in this embodiment.

FIG. 9(A) shows an outline of the configuration of the shared table of the management table unit 300. The shared table has a plurality of entries. Each entry stores, in association with a basic block start address BSA, the branch instruction address BIA, the branch target address BTA of that branch instruction, and the branch count COUNT of that branch instruction. In other words, the shared table has the same configuration as the branch history table 320 shown in FIG. 8.

In this embodiment, the function of the execution history table 310 shown in FIG. 7 is realized using the shared table shown in FIG. 9(A).

That is, as shown in FIG. 9(B), the branch instruction address BIA and branch target address BTA fields of the shared table of FIG. 9(A) are disabled (the hatched portion of FIG. 9(B)). As a result, when the shared table functions as the execution history table 310, only the basic block start address BSA field and the COUNT field are enabled. The shared table shown in FIG. 9(A) can thus realize the functions of both the execution history table 310 and the branch history table 320.

In this embodiment, therefore, at least part of the storage area of the execution history table 310 overlaps the storage area of the branch history table 320. In addition, at least part of the information to be registered in the branch history table 320 can be written into the storage area of the execution history table 310.
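One way to picture the shared table is a single entry layout whose BIA and BTA fields are ignored while the table acts as the execution history table. The sketch below is an assumption-laden illustration: the class names, the `reinit` method, the mode constants, and the initial count of 1 are all hypothetical and do not come from the patent text.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SharedEntry:
    bsa: int                      # basic block start address
    count: int = 0                # execution count or branch count
    bia: Optional[int] = None     # disabled in execution-history mode
    bta: Optional[int] = None     # disabled in execution-history mode

class SharedTable:
    EXEC, BRANCH = "exec", "branch"

    def __init__(self):
        self.mode = self.EXEC
        self.entries = []

    def reinit(self, mode):
        """Clear the shared storage and switch its role."""
        self.mode = mode
        self.entries = []

    def add(self, bsa, bia=None, bta=None):
        if self.mode == self.EXEC:
            bia = bta = None      # the BIA and BTA fields are not used
        entry = SharedEntry(bsa, 1, bia, bta)   # initial count of 1 is assumed
        self.entries.append(entry)
        return entry

table = SharedTable()
exec_entry = table.add(0x100, bia=0x120, bta=0x100)   # BIA/BTA are dropped
table.reinit(SharedTable.BRANCH)
branch_entry = table.add(0x100, bia=0x120, bta=0x100)
```

Because both roles use the same storage, switching roles discards the earlier contents, which matches the reinitialization steps described for the profiler flow.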

In the following, the management table unit 300 is assumed to have the shared table configuration shown in FIG. 9(A).

2.1 Profiler Processing Example
FIG. 10 shows an outline flow of the processing of the profiler 30 in this embodiment.

First, the branch history management unit 220 of the profiler 30 performs initialization so that the shared table is used as the execution history table, as shown in FIG. 9(B) (step S20).

Next, the execution history management unit 210 collects the execution history using the shared table functioning as the execution history table (step S21). It then determines whether high-frequency repetition is present by checking whether the execution count of the instruction at a basic block start address stored in the shared table has exceeded the given threshold TH1 (step S22).

When it is determined in step S22 that there is no high-frequency repetition (step S22: N), the execution history management unit 210 returns to step S21 and continues collecting the execution history.

When it is determined in step S22 that there is high-frequency repetition (step S22: Y), the execution history management unit 210 notifies the branch history management unit 220 accordingly. The execution history management unit 210 or the branch history management unit 220 then performs initialization so that the shared table is used as the branch history table, as shown in FIG. 9(A) (step S23).

The branch history management unit 220 then starts collecting the branch history using the start address stored in association with the execution count that exceeded the threshold TH1. It collects the branch history using the shared table functioning as the branch history table and, each time a branch instruction is input, outputs as path information the branch target address with the largest branch count according to the branch history information (step S24). The branch history management unit 220 registers, adds, and replaces branch instructions and branch target addresses using a known branch history collection technique, and additionally registers the branch count.

Next, each time a branch instruction is input, the branch history management unit 220 determines whether repetition is continuing by detecting whether the branch history table misses and whether the branch target address of the branch instruction matches a basic block start address that was once recorded in the execution history table (step S25). When it is determined that there is no repetition, that is, when the branch target of the branch instruction is not a start address recorded in the execution history table (step S25: Y), the branch history management unit 220 discards (invalidates, initializes) the branch history in the branch history table (step S26) and returns to step S20 (RETURN).

The branch history management unit 220 keeps a count of executed branch instructions. When it is determined in step S25 that repetition is continuing (step S25: N), the branch history management unit 220 determines whether the number of branch instructions executed within a given sampling period is at least a given threshold THX (step S27). When the count within the sampling period is at least THX (step S27: Y), the branch history management unit 220 returns to step S24 and continues collecting the branch history and outputting hot path information. When the count within the sampling period is below THX (step S27: N), the branch history management unit 220 discards (invalidates) the branch history in the branch history table (step S26) and returns to step S20 (RETURN). By checking in step S27 whether the repetition falls short of THX, a hot path need not be detected for every high-frequency repeated portion found in step S22, which improves the accuracy of the hot path information detected by the profiler 30.

As described above, the branch history management unit 220 can discard the branch history when the branch target of a branch instruction is not a start address recorded in the execution history table, or when the number of executed branch instructions does not reach a predetermined count within a given sampling time.
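The two discard conditions can be combined into a single predicate, sketched below. This reflects one reading of steps S25 and S27, namely that execution has left the repeated portion when the branch history table misses and the branch target is not a recorded start address. The function name and parameter names are hypothetical.

```python
def should_discard(table_miss, target_is_recorded_bsa, branches_in_window, thx):
    """Return True when the collected branch history should be discarded.

    table_miss             -- the branch history table lookup missed (S25)
    target_is_recorded_bsa -- the branch target matches a start address once
                              recorded in the execution history table (S25)
    branches_in_window     -- branch instructions executed in the sampling period
    thx                    -- threshold THX (S27)
    """
    left_repeated_portion = table_miss and not target_is_recorded_bsa
    too_few_branches = branches_in_window < thx
    return left_repeated_portion or too_few_branches

# Collection continues while the loop is still running often enough.
print(should_discard(False, True, branches_in_window=50, thx=10))
# The history is discarded once execution leaves the repeated portion.
print(should_discard(True, False, branches_in_window=50, thx=10))
```

When the predicate holds, the flow of FIG. 10 discards the history in step S26 and returns to step S20, where the shared table is reinitialized as the execution history table.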

The processing described above may be realized by a program that causes a computer to function accordingly. In this case, a program for realizing the processing is stored in the memory 110 of FIG. 1 or in a memory (not shown) of the profiler 30, and a CPU (not shown) of the profiler 30 reads the program from that memory to realize the processing. In FIG. 1, the program may also be provided on a computer-readable recording medium instead of, for example, the memory 110 or the memory (not shown) of the profiler 30. This recording medium is a storage medium usable by a computer and stores information such as programs and data; its function can be realized by hardware such as an optical disc (CD, DVD), a magneto-optical disk (MO), a magnetic disk, a hard disk, magnetic tape, or a memory (ROM). The CPU (not shown) of the profiler 30 performs the various processes of the present invention (this embodiment) based on the information stored in this storage medium. That is, the storage medium stores information (programs or data) for executing (realizing) the means of the present invention (this embodiment).

FIG. 11 shows a flowchart of a processing example of the execution history management unit 210.

First, the execution history management unit 210 monitors, for example, the execution of branch instructions by the CPU 10 and waits for the branch target address of a branch instruction to be input as a basic block start address BSA (step S40: N). When a basic block start address BSA is input (step S40: Y), the execution history management unit 210 searches the shared table functioning as the execution history table, using that start address BSA as the index (step S41).

When the search of the shared table in step S41 finds that the start address BSA is already registered (step S42: Y), the execution history management unit 210 increments the execution count recorded in association with that start address BSA (step S43). The execution history management unit 210 then detects whether the execution count updated in step S43 has exceeded the given threshold TH1 (step S44). When the execution count is found to have exceeded TH1 (step S44: Y), the execution history management unit 210 judges that a frequently repeated execution portion exists, sets the branch history collection enable so that branch history collection using that start address BSA begins (step S45), and ends the series of processes (END).

When the search of the shared table in step S41 finds that the start address BSA is not yet registered (step S42: N), the execution history management unit 210 registers the input start address BSA in the shared table functioning as the execution history table (step S46) and returns to step S40. In practice, the start address BSA is compared with the branch target address BTA, and when the branch is in the backward direction, the branch target address BTA is registered as a newly added start address BSA. At this time, stored information in the shared table functioning as the execution history table is replaced as necessary.

When it is detected in step S44 that the execution count has not exceeded the threshold TH1 (step S44: N), the execution history management unit 210 returns to step S40 and waits for the next basic block start address BSA.

As described above, the execution history management unit 210 collects the execution history, detects frequently repeated execution portions based on the collected execution history, and can instruct that branch history collection begin.
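The flow of FIG. 11 can be sketched as a small class. This is an illustrative reading rather than the disclosed circuit: the class and attribute names are hypothetical, the table is an unbounded dictionary (so replacement is not modeled), a newly registered entry is assumed to start with a count of 1, and a branch is treated as backward when its target is not above the start address of the block containing it.

```python
class ExecutionHistory:
    def __init__(self, th1):
        self.th1 = th1
        self.counts = {}              # BSA -> COUNT
        self.collect_enable = False   # branch history collection enable
        self.hot_bsa = None

    def on_branch(self, cur_bsa, bta):
        """Process one taken branch; return True when collection is enabled."""
        if bta not in self.counts:                 # S42: N
            if bta <= cur_bsa:                     # backward branches only
                self.counts[bta] = 1               # S46
            return False
        self.counts[bta] += 1                      # S43
        if self.counts[bta] > self.th1:            # S44
            self.collect_enable = True             # S45
            self.hot_bsa = bta
            return True
        return False

history = ExecutionHistory(th1=3)
# A block starting at 0x180 repeatedly branches back to 0x100.
results = [history.on_branch(0x180, 0x100) for _ in range(4)]
print(results, hex(history.hot_bsa))
```

With TH1 set to 3, the fourth backward branch to the same start address raises the enable flag, and that address becomes the starting point for branch history collection.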

FIG. 12 shows a flowchart of a processing example of the branch history management unit 220.

First, the branch history management unit 220 monitors whether the branch history collection enable is set (step S60: N). When it detects that the enable is set (step S60: Y), the branch history management unit 220 waits for input of the start address BSA of the entry that exceeded the threshold TH1 in step S44 of FIG. 11 (step S61: N). This start address BSA is the branch target address of a branch instruction. When the basic block start address BSA is input (step S61: Y), the branch history management unit 220 searches the shared table functioning as the branch history table, using that start address BSA as the index (step S62).

When the search of the shared table in step S62 finds that the start address BSA is already registered (step S63: Y) and a plurality of entries are registered for that start address BSA (step S64: Y), the branch history management unit 220 outputs the branch target address with the largest branch count as hot path information, increments the branch count stored in association with that branch target address (step S65), and returns to step S60 (RETURN).

When only one entry is registered for the start address BSA in step S64 (step S64: N), the branch history management unit 220 outputs the branch target address that hit as hot path information, increments the branch count stored in association with that branch target address (step S66), and returns to step S60 (RETURN).

When the search of the shared table in step S62 finds that the start address BSA is not yet registered (step S63: N), the branch history management unit 220 registers the input start address BSA in the shared table functioning as the branch history table (step S67) and returns to step S60. At this time, stored information in the shared table functioning as the branch history table is replaced as necessary.

As described above, the branch history management unit 220 collects the branch history and can output a hot path based on the collected branch history. The hot path is output as hot path information that accumulates the branch target addresses obtained by searching the branch history table.
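The flow of FIG. 12 can be sketched in the same spirit. The class below is a simplified assumption: it keeps a dictionary of branch targets and counts per start address, reports the most frequent recorded target on a hit, and then records the branch that was actually taken. The names are hypothetical and replacement is not modeled.

```python
from collections import defaultdict

class BranchHistory:
    def __init__(self):
        self.table = defaultdict(dict)   # BSA -> {BTA: COUNT}
        self.hot_path = []               # accumulated hot path information

    def on_branch(self, bsa, bta):
        """Record one branch from the block at bsa; return the reported target."""
        targets = self.table[bsa]
        hot_target = None
        if targets:                                      # S63: Y
            # S65/S66: report the target with the largest branch count.
            hot_target = max(targets, key=targets.get)
            self.hot_path.append(hot_target)
        # Update the count on a hit, or register a new entry on a miss (S67).
        targets[bta] = targets.get(bta, 0) + 1
        return hot_target

bh = BranchHistory()
for _ in range(3):
    bh.on_branch("A", "B")
bh.on_branch("A", "C")
print(bh.on_branch("A", "B"), bh.table["A"])
```

Even after the less frequent target C is recorded, the reported target for block A remains B, since B has the larger branch count.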

2.2 Profiler Configuration Example
Next, configuration examples of the main parts of the execution history table 310, the branch history table 320, the execution history management unit 210, and the branch history management unit 220 in this embodiment are described.

In this embodiment, the execution history table 310 and the branch history table 320 are each described as storing information in a 2-way set associative organization, but the present invention is not limited to this number of ways or to this configuration of the execution history table 310 and the branch history table 320. For example, the execution history table 310 and the branch history table 320 may store information in a fully associative organization. However, because a conditional branch instruction has two branch destinations, a 2-way configuration can store both the branch target address for the case where the condition is "true" and the branch target address for the case where it is "false". This makes storage management of branch target addresses efficient and avoids providing wasted storage area.

FIGS. 13, 14(A), 14(B), 15(A), and 15(B) show configuration examples of the main parts of the execution history table 310 and the execution history management unit 210 in this embodiment.

The execution history table 310 stores the basic block start address BSA and the execution count COUNT of the instruction at that start address in a 2-way set associative organization. Both ways have the same configuration, and the stored information of each way is replaced under the well-known LRU (Least Recently Used) policy.

For example, the low-order bits of the start address BSA are input to decoders DEC1 and DEC2. Each decoder selects one entry in the table of its way based on those low-order bits. The start address and execution count COUNT of the entry selected in each way are read out. The start addresses read out are held in start address registers BSA1 and BSA2, and the execution counts are held in execution count registers CNT1 and CNT2.

Comparator CMP1 compares the start address BSA with the value of start address register BSA1 and makes hit signal hita1 active (for example, "1") when they match. Comparator CMP2 compares the start address BSA with the value of start address register BSA2 and makes hit signal hita2 active (for example, "1") when they match.

The value of execution count register CNT1 is incremented by incrementer INC1 (a counter in a broad sense). The value of execution count register CNT2 is incremented by incrementer INC2 (a counter in a broad sense).

セレクタＳＥＬ１は、選択制御信号ｓｅｌ１に基づいて、実行回数レジスタＣＮＴ１の値又はインクリメンタＩＮＣ１の出力を選択して、選択出力ＣＮＴｓ１を出力する。そして、選択出力ＣＮＴｓ１により、デコーダＤＥＣ１で選択されたエントリの実行回数ＣＯＵＮＴを更新する。セレクタＳＥＬ２は、選択制御信号ｓｅｌ２に基づいて、実行回数レジスタＣＮＴ２の値又はインクリメンタＩＮＣ２の出力を選択して、選択出力ＣＮＴｓ２を出力する。そして、選択出力ＣＮＴｓ２により、デコーダＤＥＣ２で選択されたエントリの実行回数ＣＯＵＮＴを更新する。 The selector SEL1 selects the value of the execution count register CNT1 or the output of the incrementer INC1 based on the selection control signal sel1, and outputs a selection output CNTs1. Then, the execution count COUNT of the entry selected by the decoder DEC1 is updated by the selection output CNTs1. The selector SEL2 selects the value of the execution count register CNT2 or the output of the incrementer INC2 based on the selection control signal sel2, and outputs the selection output CNTs2. Then, the execution count COUNT of the entry selected by the decoder DEC2 is updated by the selection output CNTs2.

As shown in FIG. 14(A), the execution history management unit 210 or the execution history table 310 includes a hit determination unit 212. The hit determination unit 212 receives the hit signals hita1 and hita2 and outputs the selection control signals sel1 and sel2 and the write enables we1 and we2.

FIG. 14(B) illustrates the operation of the hit determination unit 212 of FIG. 14(A).

When the hit signal hita1 is active, the hit determination unit 212 sets the selection control signal sel1 active so that the execution history table 310 is updated with the output of the incrementer INC1. Likewise, when the hit signal hita2 is active, the hit determination unit 212 sets the selection control signal sel2 active so that the execution history table 310 is updated with the output of the incrementer INC2.

Further, when the hit signals hita1 and hita2 are both inactive, the hit determination unit 212 performs control so that neither the selector SEL1 nor the selector SEL2 produces an output, performs LRU control using the LRU control bits LRU1 and LRU2, and activates one of the write enables we1 and we2 when the branch destination address BTA is in the return direction relative to the head address BSA. When the write enable we1 is active, the head address BSA is written to the entry selected by the decoder DEC1, and "1" is written as the execution count COUNT. When the write enable we2 is active, the head address BSA is written to the entry selected by the decoder DEC2, and "1" is written as the execution count COUNT.
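As a non-limiting illustration, the lookup and update behaviour of the execution history table 310 described above may be modelled in software as follows. This is a minimal sketch, not the circuit of FIG. 13: the class and method names, the number of sets, the use of `bsa % num_sets` as the "lower bits" index, and the single per-set LRU indicator (standing in for the LRU control bits LRU1 and LRU2) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    bsa: Optional[int] = None  # stored head address; None means the entry is empty
    count: int = 0             # execution count COUNT


class ExecutionHistoryTable:
    """Hypothetical software model of a 2-way set-associative execution history table."""

    def __init__(self, num_sets: int = 16) -> None:
        self.num_sets = num_sets
        # ways[0] and ways[1] play the role of the two ways indexed by DEC1 and DEC2.
        self.ways: List[List[Entry]] = [
            [Entry() for _ in range(num_sets)] for _ in range(2)
        ]
        # For each set, the way to overwrite on the next registration (assumed LRU encoding).
        self.lru: List[int] = [0] * num_sets

    def access(self, bsa: int, backward_branch: bool) -> int:
        """Return the updated execution count for bsa, or 0 if nothing is registered."""
        idx = bsa % self.num_sets          # lower bits select one entry per way
        for way in (0, 1):
            entry = self.ways[way][idx]
            if entry.bsa == bsa:           # comparator match: hit signal active
                entry.count += 1           # incrementer output written back
                self.lru[idx] = 1 - way    # the other way becomes the replacement victim
                return entry.count
        # Miss in both ways: register only when the branch goes in the return direction.
        if backward_branch:
            victim = self.lru[idx]
            self.ways[victim][idx] = Entry(bsa, 1)  # write BSA and COUNT = 1
            self.lru[idx] = 1 - victim
            return 1
        return 0
```

In this model, a head address reached by a forward branch is never registered, which mirrors the condition that a write enable is activated only when the branch destination address BTA is in the return direction.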

As shown in FIG. 15(A), the execution history management unit 210 or the execution history table 310 includes a branch history collection start control unit 214 and a threshold setting register 216 in which a threshold TH1 is set. The branch history collection start control unit 214 receives the hit signals hita1 and hita2, the selection outputs CNTs1 and CNTs2, and the threshold TH1, and outputs a branch history collection enable.

FIG. 15(B) illustrates the operation of the branch history collection start control unit 214 of FIG. 15(A).

When the hit signal hita1 is active, the branch history collection start control unit 214 compares the selection output CNTs1 with the threshold TH1 and sets the branch history collection enable active (for example, "1") when the selection output CNTs1 is equal to or greater than the threshold TH1. When the hit signal hita2 is active, the branch history collection start control unit 214 compares the selection output CNTs2 with the threshold TH1 and sets the branch history collection enable active (for example, "1") when the selection output CNTs2 is equal to or greater than the threshold TH1. Further, when the hit signals hita1 and hita2 are both inactive, the branch history collection start control unit 214 sets the branch history collection enable inactive.
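For illustration only, the decision made by the branch history collection start control unit 214 can be written as a small function. The function name and the representation of signals as Python booleans and integers are assumptions; only the comparison against the threshold TH1 and the gating by the hit signals follow the description above.

```python
def branch_history_collection_enable(
    hita1: bool, hita2: bool, cnts1: int, cnts2: int, th1: int
) -> bool:
    """Hypothetical model of the enable output of the collection start control unit."""
    # Way 1 hit: enable when the updated count CNTs1 has reached the threshold TH1.
    if hita1 and cnts1 >= th1:
        return True
    # Way 2 hit: enable when the updated count CNTs2 has reached the threshold TH1.
    if hita2 and cnts2 >= th1:
        return True
    # No hit, or a hit whose count is still below TH1: collection is not started.
    return False
```

A count that belongs to a way without a hit is ignored, so a stale value left in the other way cannot start collection for an unrelated head address.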

FIGS. 16, 17(A), and 17(B) show configuration examples of the main parts of the branch history table 320 and the branch history management unit 220 in this embodiment.

The branch history table 320 is a 2-way set-associative table that stores the head address BSA of a basic block, the branch destination address BTA of the branch instruction of that basic block, and the branch count COUNT to that branch destination address. In FIG. 16, the address BIA of the branch instruction is assumed not to be recorded in the branch history table 320. Both ways have the same configuration, and the information stored in each way is replaced under a known LRU policy.

For example, the lower bits of the head address BSA are input to the decoders DEC3 and DEC4. Each decoder selects one entry of its way's table based on these lower bits. The head address BSA, the branch destination address BTA, and the branch count COUNT of the entry selected in each way are read out. The head addresses read out in this way are held in the head address registers BSA3 and BSA4, the branch destination addresses in the branch destination address registers BTA3 and BTA4, and the branch counts in the branch count registers CNT3 and CNT4.

The comparator CMP3 compares the head address BSA with the value of the head address register BSA3 and sets the hit signal hitb1 active (for example, "1") when they match. The comparator CMP4 compares the head address BSA with the value of the head address register BSA4 and sets the hit signal hitb2 active (for example, "1") when they match.

The value of the branch count register CNT3 is incremented by the incrementer INC3 (a counter in a broad sense). The value of the branch count register CNT4 is incremented by the incrementer INC4 (a counter in a broad sense).

The selector SEL3 selects and outputs the value of the branch destination address register BTA3 or the branch destination address register BTA4 based on the selection control signal sel3. The selected value is added to the hot path information and is also input to the branch history table 320 as the head address of the next basic block.

The selector SEL4 selects either the value of the branch count register CNT3 or the output of the incrementer INC3 based on the selection control signal sel4, and outputs the result as the selection output CNTs3. The branch count COUNT of the entry selected by the decoder DEC3 is then updated with the selection output CNTs3. The selector SEL5 selects either the value of the branch count register CNT4 or the output of the incrementer INC4 based on the selection control signal sel5, and outputs the result as the selection output CNTs4. The branch count COUNT of the entry selected by the decoder DEC4 is then updated with the selection output CNTs4.

As shown in FIG. 17(A), the branch history management unit 220 or the branch history table 320 includes a hit determination unit 222. The hit determination unit 222 receives the hit signals hitb1 and hitb2 and the selection outputs CNTs3 and CNTs4, and outputs the selection control signals sel3, sel4, and sel5 and the write enables we3 and we4.

FIG. 17(B) illustrates the operation of the hit determination unit 222 of FIG. 17(A).

When the hit signal hitb1 is active, the hit determination unit 222 sets the selection control signal sel4 active so that the branch history table 320 is updated with the output of the incrementer INC3. Likewise, when the hit signal hitb2 is active, the hit determination unit 222 sets the selection control signal sel5 active so that the branch history table 320 is updated with the output of the incrementer INC4.

Further, when the hit signals hitb1 and hitb2 are both active, the hit determination unit 222 selects the way with the larger branch count. To do so, when the selection output CNTs3 is equal to or greater than the selection output CNTs4, the hit determination unit 222 sets the selection control signal sel4 active and the selection control signal sel5 inactive. When the selection output CNTs3 is smaller than the selection output CNTs4, the hit determination unit 222 sets the selection control signal sel4 inactive and the selection control signal sel5 active. The hit determination unit 222 also outputs the selection control signal sel3 under the same control as the selection control signal sel4.

Furthermore, when the hit signals hitb1 and hitb2 are both inactive, the hit determination unit 222 performs control so that neither the selector SEL3 nor the selector SEL4 produces an output, performs LRU control using the LRU control bits LRU3 and LRU4, and activates one of the write enables we3 and we4 in order to write the head address BSA input this time. When the write enable we3 is active, the head address BSA is written to the entry selected by the decoder DEC3, and "1" is written as the branch count COUNT. When the write enable we4 is active, the head address BSA is written to the entry selected by the decoder DEC4, and "1" is written as the branch count COUNT.

As described above, the branch history table 320 stores its information in a set-associative manner, and each way (set) stores the head address of a basic block, the branch destination address (target address) of the branch instruction included in that basic block, and the branch count. Among the entries stored for the branch instruction, the branch destination address of the way (set) with the largest branch count can then be output.
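The selection of the more frequently taken target, and its reuse as the head address of the next basic block, can be sketched as follows. This is an illustrative model under stated assumptions: the table is represented as a dictionary keyed by head address that holds the two way entries (the index decoding is abstracted away), and the function names, the `max_len` bound, and the rule of stopping when the path returns to its starting address are hypothetical choices not taken from the embodiment.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class BranchEntry:
    bsa: int    # head address of the basic block
    bta: int    # branch destination (target) address
    count: int  # branch count COUNT


Ways = Tuple[Optional[BranchEntry], Optional[BranchEntry]]


def select_target(way1: Optional[BranchEntry],
                  way2: Optional[BranchEntry],
                  bsa: int) -> Optional[int]:
    """Return the target address of the hitting way with the larger branch count."""
    hit1 = way1 is not None and way1.bsa == bsa
    hit2 = way2 is not None and way2.bsa == bsa
    if hit1 and hit2:
        # Ties go to the first way, matching "CNTs3 equal to or greater than CNTs4".
        return way1.bta if way1.count >= way2.count else way2.bta
    if hit1:
        return way1.bta
    if hit2:
        return way2.bta
    return None


def trace_hot_path(table: Dict[int, Ways], start: int, max_len: int = 64) -> List[int]:
    """Follow the most frequent targets, feeding each one back as the next head address."""
    path = [start]
    bsa = start
    for _ in range(max_len):
        way1, way2 = table.get(bsa, (None, None))
        nxt = select_target(way1, way2, bsa)
        if nxt is None or nxt == start:  # assumed stop rule: miss, or loop closed
            break
        path.append(nxt)
        bsa = nxt
    return path
```

For example, a table in which block 0x100 branches to 0x140 nine times and to 0x120 three times yields a path that continues through 0x140 rather than 0x120.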

3. Others
The present invention is not limited to the above embodiment.

FIG. 18 shows a block diagram of a modification of the information processing apparatus of FIG. 1.

In FIG. 18, the same parts as those in FIG. 1 are denoted by the same reference numerals, and their description is omitted as appropriate. In the information processing apparatus 500 of this modification, the hot path processing unit 120 is provided inside the information processing apparatus 500. The hot path processing unit 120 performs optimization processing based on the hot path information from the profiler 30 and outputs the optimization result to the accelerator 20. Here, the optimization result is, for example, an instruction sequence from which unnecessary code in the hot path has been removed, an instruction sequence modified so that the instructions in the hot path operate in parallel, or hardware configuration information for changing the hardware configuration of the accelerator 20.

The present invention is not limited to the embodiment described above, and various modifications are possible within the scope of the gist of the present invention.

In the inventions according to the dependent claims, some of the constituent features of the claim depended upon may be omitted. In addition, the main part of the invention according to one independent claim may be made dependent on another independent claim.

FIG. 1: Block diagram of a configuration example of the information processing apparatus in this embodiment.
FIG. 2: Explanatory diagram of a basic block in this embodiment.
FIG. 3: Diagram showing an outline flow of an example of the processing of the information processing apparatus of FIG. 1.
FIG. 4: Explanatory diagram of an execution instruction portion that is repeatedly executed at high frequency in this embodiment.
FIG. 5: Explanatory diagram of the hot path of FIG. 3.
FIG. 6: Diagram showing an outline of the configuration of the profiler of FIG. 1.
FIG. 7: Diagram showing an outline of the configuration of the execution history table of FIG. 6.
FIG. 8: Diagram showing an outline of the configuration of the branch history table of FIG. 6.
FIG. 9: FIGS. 9(A) and 9(B) are explanatory diagrams of the management table unit in this embodiment.
FIG. 10: Diagram showing an outline flow of the processing of the profiler in this embodiment.
FIG. 11: Flow diagram of a processing example of the execution history management unit.
FIG. 12: Flow diagram of a processing example of the branch history management unit.
FIG. 13: Diagram showing a configuration example of the main parts of the execution history table and the execution history management unit in this embodiment.
FIG. 14: FIGS. 14(A) and 14(B) are diagrams showing a configuration example of the main parts of the execution history table and the execution history management unit in this embodiment.
FIG. 15: FIGS. 15(A) and 15(B) are diagrams showing a configuration example of the main parts of the execution history table and the execution history management unit in this embodiment.
FIG. 16: Diagram showing a configuration example of the main parts of the branch history table and the branch history management unit in this embodiment.
FIG. 17: Diagram showing a configuration example of the main parts of the branch history table and the branch history management unit in this embodiment.
FIG. 18: Block diagram of a modification of the information processing apparatus of FIG. 1.

Explanation of symbols

10 CPU
20 Accelerator
30 Profiler
50 Bus
100, 500 Information processing apparatus
110 Memory
120 Hot path processing unit
200 Control unit
210 Execution history management unit
212, 222 Hit determination unit
214 Branch history collection start control unit
216 Threshold setting register
220 Branch history management unit
300 Management table unit
310 Execution history table
320 Branch history table

Claims

An execution path detection device for detecting an execution path of an instruction sequence having a high execution frequency, comprising:
an execution history table that stores the number of executions of the instruction at the head address of an instruction sequence block, where an instruction sequence including a branch instruction is treated as one block;
a branch history table that stores a branch history of the branch instruction; and
a branch history management unit that performs a process of collecting the branch history of the branch instruction,
wherein the branch history management unit starts collecting the branch history on the condition that the number of executions exceeds a given threshold, and path information for identifying the execution path is output based on the branch history.

In claim 1,
the device includes a counter that counts the number of times the instruction at the head address of the instruction sequence block is executed, and
the execution history table
stores the number of executions for each instruction at the head address.

In claim 1 or 2,
An execution path detection apparatus characterized in that at least a part of a storage area of the execution history table overlaps with a storage area of the branch history table.

In claim 1 or 2,
An execution path detection apparatus, wherein at least part of information to be registered in the branch history table is written in a storage area of the execution history table.

In any one of claims 1 to 4,
the execution history table
stores the number of executions of the instruction at the branch destination address in the return direction of a branch instruction.

In any one of claims 1 to 5,
the execution history table stores the head address and the number of executions, and
the branch history management unit
starts collecting the branch history using the head address stored in association with the number of executions exceeding the threshold.

In any one of claims 1 to 6,
the branch history table
stores its information by a set associative method,
each set stores
the head address, the target address of a branch instruction included in the instruction sequence block, and the number of branches, and
among the target addresses stored corresponding to the branch instruction, the target address of the set having the largest number of branches is output.

In any one of claims 1 to 7,
the branch history management unit
outputs, as the path information, a target address obtained by searching the branch history table using the branch destination address of the branch instruction as an index.

In any one of claims 1 to 8,
when the branch destination of the branch instruction is not a head address recorded in the execution history table, or when the number of executions of the branch instruction does not reach a predetermined number within a given sampling time,
the branch history management unit
invalidates the information stored in the branch history table.

An information processing apparatus comprising:
the execution path detection device according to any one of claims 1 to 9; and
a central processing unit that executes an application program,
wherein processing of the application program is optimized based on the path information from the execution path detection device.

An information processing apparatus comprising:
the execution path detection device according to any one of claims 1 to 9;
a central processing unit that executes an application program; and
a path processing unit that optimizes processing of the application program based on the path information from the execution path detection device.

An execution path detection method for detecting an execution path of an instruction sequence having a high execution frequency in an information processing apparatus having an execution history table, a branch history table, and a control unit, the method comprising:
a step in which the control unit registers, in the execution history table, the number of executions of the instruction at the head address of an instruction sequence block, where an instruction sequence including a branch instruction is treated as one block;
a step in which the control unit starts collecting a branch history of the branch instruction on the condition that the number of executions exceeds a given threshold, and registers the branch history in the branch history table; and
a step in which the control unit outputs path information for specifying the execution path based on the branch history.

In claim 12,
An execution path detection method, wherein at least a part of a storage area of the execution history table overlaps with a storage area of the branch history table.

In claim 12 or 13,
the head address is
the branch destination address in the return direction of the branch instruction,
the execution history table stores the head address and the number of executions, and
the control unit starts collecting the branch history by using the head address stored in association with the number of executions exceeding the threshold.

In any one of claims 12 to 14,
the branch history table
stores its information by a set associative method,
each set stores
the head address, the target address of a branch instruction included in the instruction sequence block, and the number of branches, and
the method includes outputting, among the target addresses stored corresponding to the branch instruction, the target address of the set having the largest number of branches.

In any one of claims 12 to 15,
the control unit outputs, as the path information, a target address obtained by searching the branch history table using the branch destination address of the branch instruction as an index.

In any one of claims 12 to 16,
when the branch destination of the branch instruction is not a head address recorded in the execution history table, or when the number of executions of the branch instruction does not reach a predetermined number within a given sampling time,
the control unit invalidates the information stored in the branch history table.

A program for causing a computer to execute the execution path detection method according to any one of claims 12 to 17.

A computer-readable recording medium on which the program according to claim 18 is recorded.