JP5630352B2

JP5630352B2 - Register arrangement optimization method, register arrangement optimization program, and register arrangement optimization apparatus

Info

Publication number: JP5630352B2
Application number: JP2011066636A
Authority: JP
Inventors: 神丸　博文; 博文神丸; 永野　裕二; 裕二永野; 政宏角谷
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2011-03-24
Filing date: 2011-03-24
Publication date: 2014-11-26
Anticipated expiration: 2031-03-24
Also published as: JP2012203581A

Description

The present invention relates to a technique for speeding up a data processing apparatus.

A data processing apparatus such as an image encoding apparatus is required to operate at high speed. A program executed by the processor of such an apparatus is therefore tuned to meet its target performance. For example, processing is optimized (tuned) by assigning (placing) frequently used variables, or variables that are reused within a short time, to registers. However, the number of usable registers is limited, although the limit differs from processor to processor, so variables that do not fit in registers are placed in memory, which is slower to access than a register.

The following techniques are known for speeding up processing.
For example, in one technique, on the instruction path from the instruction fetch unit of a processor to its decode unit, an instruction fetched by the instruction fetch unit that is a bytecode is supplied to a bytecode accelerator, converted into native code, and output to the subsequent stage.

In another technique, a program conversion apparatus analyzes, among other things, the positions in a source program at which values are assigned to and read from variables, and allocates hardware resources such as registers and memory to each variable based on the analysis result.

JP 2005-44336 A
JP-A-11-175351

As described above, techniques that speed up processing by assigning frequently used variables, or variables that are reused within a short time, to registers have the following problem.

In a function of a program that performs different processing depending on the input data, for example through branching, the processing route depends on the input data. The frequently used variables and the variables reused within a short time therefore also differ from route to route. For this reason, the above techniques optimize the processing based on, for example, the processing route taken for average input data. However, when the input data changes and a different processing route is taken, the frequently used variables and the variables reused within a short time also change, so the optimization may lower the processing speed instead of raising it.

In view of the above circumstances, an object of the present invention is to provide a register arrangement optimization method, a register arrangement optimization program, and a register arrangement optimization apparatus that can speed up processing even when the input data changes.

According to one aspect of the method, there is provided a method for optimizing the variables placed in a register in a processor that includes a register and a processor core that executes instruction code. In this method, a conversion table containing information on the placement destination and usage frequency of variables defined in a function in a program is referenced, and, based on the usage frequency information, the placement destination of each variable is determined to be the register or memory such that frequently used variables are placed in the register preferentially. The determined placement destinations are reflected in the placement destination information in the conversion table. Based on the placement destination information in the conversion table, pseudo instruction code, in which the placement destination of the operands of an operation instruction is not specified, is converted into instruction code executable by the processor core. The usage frequency information in the conversion table is updated according to the variables used when the processor core executes the converted instruction code.

According to one aspect of the program, there is provided a program for optimizing the variables placed in a register in a processor that includes a register and a processor core that executes instruction code. This program causes a computer to execute the following processing. A conversion table containing information on the placement destination and usage frequency of variables defined in a function in a program is referenced, and, based on the usage frequency information, the placement destination of each variable is determined to be the register or memory such that frequently used variables are placed in the register preferentially. The determined placement destinations are reflected in the placement destination information in the conversion table. Based on the placement destination information in the conversion table, pseudo instruction code, in which the placement destination of the operands of an operation instruction is not specified, is converted into instruction code executable by the processor core. The usage frequency information in the conversion table is updated according to the variables used when the processor core executes the converted instruction code.

According to one aspect of the apparatus, there is provided an apparatus for optimizing the variables placed in a register in a processor that includes a register and a processor core that executes instruction code. This apparatus includes a conversion table and an instruction conversion unit. The conversion table contains information on the placement destination and usage frequency of variables defined in a function in a program. The instruction conversion unit converts pseudo instruction code, in which the placement destination of the operands of an operation instruction is not specified, into instruction code executable by the processor core, based on the placement destination information in the conversion table. The instruction conversion unit also determines, based on the usage frequency information in the conversion table, whether each variable is to be placed in the register or in memory such that frequently used variables are placed in the register preferentially. It reflects the determined placement destinations in the placement destination information in the conversion table. It further updates the usage frequency information in the conversion table according to the variables used when the processor core executes the converted instruction code.
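The frequency-driven placement described in these aspects can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the claimed implementation: the names `VarEntry`, `decide_placement`, and `record_execution` are hypothetical, the conversion table is modeled as a plain dictionary, and ties between equally used variables are broken by name, which the source does not specify.

```python
from dataclasses import dataclass


@dataclass
class VarEntry:
    """One row of a (hypothetical) conversion table."""
    placement: str  # "reg" or "mem"
    use_count: int  # usage frequency observed so far


def record_execution(table, used_variables):
    """Update usage frequencies for the variables touched during execution."""
    for name in used_variables:
        table[name].use_count += 1


def decide_placement(table, num_registers):
    """Give registers to the most frequently used variables, memory to the rest.

    The result is written back into the table (the "reflect" step).
    Ties are broken by variable name; this is an assumption of the sketch.
    """
    ranked = sorted(table, key=lambda name: (-table[name].use_count, name))
    for rank, name in enumerate(ranked):
        table[name].placement = "reg" if rank < num_registers else "mem"


table = {name: VarEntry("mem", 0) for name in ("val0", "val1", "val2")}

# First run: val2 and val0 dominate, so they receive the two registers.
record_execution(table, ["val2", "val2", "val0", "val2", "val0"])
decide_placement(table, num_registers=2)
placements = {name: entry.placement for name, entry in table.items()}
print(placements)

# The input data changes: val1 becomes the hot variable and displaces val0.
record_execution(table, ["val1"] * 5)
decide_placement(table, num_registers=2)
placements_after = {name: entry.placement for name, entry in table.items()}
print(placements_after)
```

Because the usage counts are updated from actual execution, the placement follows the input data instead of being fixed when the program is built.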

The disclosed method, program, and apparatus make it possible to speed up processing even when the input data changes.

FIG. 1 is a diagram illustrating an example of a data processing apparatus.
FIG. 2 is a diagram schematically illustrating a configuration example of a processing program.
FIG. 3 is a diagram schematically illustrating an example of the access speeds of a register, an internal memory, and a main memory unit.
FIG. 4 is a diagram illustrating an example of a problem with a variable placement method.
FIG. 5 is a diagram illustrating an example of the variable placement operation in Technical Example 1.
FIG. 6 is a diagram illustrating an example of a problem of Technical Example 1.
FIG. 7 is a diagram illustrating an example of the variable placement operation in Technical Example 2.
FIG. 8 is a diagram illustrating an example of a problem of Technical Example 2.
FIG. 9 is a diagram illustrating an example of a problem that arises when Technical Example 1 and Technical Example 2 are combined.
FIG. 10 is a diagram illustrating an example of a data processing apparatus including a register arrangement optimization apparatus according to Embodiment 1.
FIG. 11 is a diagram illustrating a conversion example of pseudo instruction code.
FIG. 12 is a diagram illustrating an example of a conversion map table.
FIG. 13 is a flowchart illustrating an example of an operation for updating the past usage frequency information contained in the conversion map table.
FIG. 14 is a diagram explaining an example of a procedure for updating the conversion map table as the processing program is executed.
FIG. 15 is a flowchart illustrating a processing example of the rearrangement process (S201).
FIG. 16(a) is a diagram illustrating an example of variable placement by Method 1, and FIG. 16(b) is a diagram illustrating an example of variable placement by Method 2.
FIG. 17 is a diagram illustrating a configuration example of an image encoding apparatus including the register arrangement optimization apparatus according to Embodiment 1.
FIG. 18 is a diagram schematically illustrating an outline of the processing flow of the image encoding apparatus illustrated in FIG. 17.
FIG. 19 is a diagram illustrating an example of an operation sequence of the image encoding apparatus illustrated in FIG. 17.
FIG. 20 is a diagram illustrating an example of the program development period in the conventional case and the program development period when the register arrangement optimization apparatus according to Embodiment 1 is adopted.
FIG. 21(a) is a flowchart illustrating a basic processing example of an input data feature calculation unit, and FIGS. 21(b) and 21(c) are flowcharts illustrating specific examples thereof.
FIG. 22 is a diagram illustrating a configuration example of an image encoding apparatus including a register arrangement optimization apparatus according to Embodiment 2.
FIG. 23 is a diagram schematically illustrating an outline of the processing flow of the image encoding apparatus illustrated in FIG. 22.
FIG. 24 is a diagram illustrating an example of an operation sequence of the image encoding apparatus illustrated in FIG. 22.
FIG. 25 is a diagram illustrating an example of a plurality of conversion map tables, each containing conversion map table selection value information.
FIG. 26 is a diagram illustrating a configuration example of a computer system.

<Embodiment 1>
FIG. 1 is a diagram illustrating an example of a data processing apparatus.
The data processing apparatus illustrated in FIG. 1 includes a processor CPU (Central Processing Unit) unit 110, a main memory unit 120, a memory controller unit 130, a peripheral input/output I/F (interface) unit 140, a communication I/F (interface) unit 150, and a system bus 160.

The processor CPU unit 110 includes a processor core (arithmetic/control unit) 111, a register 112, an instruction cache 113, a data cache 114, a system bus controller 115, and an internal memory 116. The processor core 111 executes the processing program (instruction code). The register 112 is a high-speed register accessed by the processor core 111. The instruction cache 113 is a cache memory for instruction code. The data cache 114 is a cache memory for data. The system bus controller 115 controls the system bus 160, which connects the units. The internal memory 116 is a small-capacity memory used within the processor CPU unit 110.

The main memory unit 120 includes a processing program storage area 121, a processing data storage area 122, and a processing result storage area 123.
The memory controller unit 130 performs read/write control of the main memory unit 120.

The peripheral input/output I/F unit 140 performs input/output I/F processing with external devices. The communication I/F unit 150 performs I/F processing with external devices via a communication line.
The system bus 160 is an address/data bus connected to the processor CPU unit 110, the memory controller unit 130, the peripheral input/output I/F unit 140, the communication I/F unit 150, and so on.

In the data processing apparatus illustrated in FIG. 1, data to be processed is input from an external device via the peripheral input/output I/F unit 140 or from a communication line via the communication I/F unit 150. The input data is stored in the processing data storage area 122 of the main memory unit 120 via the system bus 160 and the memory controller unit 130. The stored data is processed when the processor CPU unit 110 executes a processing program loaded into the main memory unit 120. The processing program is, for example, an image processing program that performs processing such as filtering or image compression. The instruction code of the processing program, which implements the processing performed by the processor CPU unit 110, is stored in the main memory unit 120 or the internal memory 116. The instruction code is either written directly in assembly language or written in a high-level language such as C and then translated by a compiler. The processor CPU unit 110 is provided with cache memories (the instruction cache 113 and the data cache 114) to speed up memory access, and instruction code to be executed by the processor core 111 is copied into the instruction cache 113 in advance as appropriate. The processor CPU unit 110 is also provided with the register 112 for storing or temporarily holding data to be operated on. An operation instruction executed by the processor core 111 takes data in the register 112 or in memory (the internal memory 116 or the main memory unit 120) as its operands. The operation result is stored in the processing result storage area 123 of the main memory unit 120 via the system bus 160 and the memory controller unit 130. The final result is output, as necessary, to an external device via the peripheral input/output I/F unit 140 or to a communication line via the communication I/F unit 150.

FIG. 2 is a diagram schematically illustrating a configuration example of the processing program.
As illustrated in FIG. 2, the processing program, in which the processing order is described as instruction code, is divided into functions and subroutine modules (for example, function [1] to function [n]), which work together to realize the overall functionality. The variables used in each function are placed in the register 112 or in memory (the internal memory 116 or the main memory unit 120) and are specified as operands of operation instructions. In the example illustrated in FIG. 2, in function [1], for instance, variables [1] to [6], which are specified as operands of operation instructions, are placed in the registers (register-1 to register-5) 112, while variables [7] and [8] are placed in memory (the internal memory 116 or the main memory unit 120).

FIG. 3 is a diagram schematically illustrating an example of the access speeds of the register 112, the internal memory 116, and the main memory unit 120. The example assumes that the main memory unit 120 is a DRAM (Dynamic Random Access Memory).

As illustrated in FIG. 3, the register 112 and the memories (the internal memory 116 and the DRAM) differ in access speed. Compared with the register 112, the DRAM provided outside the processor CPU unit 110 requires tens to hundreds of times the access time, and even the internal memory 116 of the processor CPU unit 110 requires two to four times the access time. Processing is therefore faster when as many of the variables used in a function as possible are placed in the register 112. However, because the number of usable registers is limited, variables that do not fit in the register 112 end up in memory (the internal memory 116 or the main memory unit 120). The register limit differs, for example, from processor to processor.
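The effect of these access-time ratios can be made concrete with a little arithmetic. The following Python sketch uses assumed relative costs chosen from within the ranges stated above (register 1, internal memory 3, DRAM 100); the variable names and access counts are invented for illustration and do not come from the source.

```python
# Relative access costs. The register is the unit; the other two values are
# assumptions picked from the stated ranges (2-4x and tens to hundreds of times).
ACCESS_COST = {"register": 1, "internal_memory": 3, "dram": 100}


def total_access_cost(access_counts, placement):
    """Sum of (number of accesses) x (relative cost of where the variable lives)."""
    return sum(
        count * ACCESS_COST[placement[var]]
        for var, count in access_counts.items()
    )


# Hypothetical function: "a" is accessed 1000 times, "b" only 10 times.
access_counts = {"a": 1000, "b": 10}

hot_in_register = {"a": "register", "b": "dram"}
hot_in_dram = {"a": "dram", "b": "register"}

cost_good = total_access_cost(access_counts, hot_in_register)  # 1000*1 + 10*100
cost_bad = total_access_cost(access_counts, hot_in_dram)       # 1000*100 + 10*1

print(cost_good, cost_bad)  # prints "2000 100010"
```

Under these assumed numbers, placing the wrong variable in the single available register makes the access cost roughly fifty times higher, which is why the choice of which variables occupy registers matters.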

Therefore, in a program that must operate at high speed, processing is optimized to meet the target performance by, for example, assigning more frequently used variables, or variables that are reused within a short time, to registers. One such technique speeds up processing by converting instruction code so that it operates using registers (hereinafter "Technical Example 1"). Another technique reads assembler code, calculates an allocation priority for each variable based on the length of the variable's live range and the sum of the loop levels at which the variable exists, and changes the placement destination of each variable based on the result (hereinafter "Technical Example 2").

However, the placement of variables involves the following problem.
FIG. 4 is a diagram illustrating an example of a problem with a variable placement method.
As illustrated in FIG. 4, the usage frequency and usage order of variables differ depending on which processing route (processing routes A to C) is taken within a function. This is because the function performs processing according to the characteristics of the input data, so a change in the input data changes the processing route. The register arrangement therefore has to be optimized for the best overall result, for example by placing in registers mainly the variables used on the processing route taken for average input data (for example, processing route C). With such a placement, however, the registers are not used appropriately on a processing route taken for non-average input data (for example, processing route A or B), which lowers the processing speed.
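The route dependence described above can be illustrated with a small Python sketch. All numbers are invented for illustration: the per-route usage counts, the two-register limit, and the relative memory cost of 3 (within the 2 to 4 times range given for internal memory) are assumptions, and `pick_registers` and `route_cost` are hypothetical helper names.

```python
# Hypothetical usage counts of four variables on two processing routes.
ROUTE_USAGE = {
    "A": {"v1": 50, "v2": 40, "v3": 1, "v4": 1},   # non-average input data
    "C": {"v1": 1, "v2": 1, "v3": 50, "v4": 40},   # average input data
}
REGISTER_COST, MEMORY_COST = 1, 3  # assumed relative access costs


def pick_registers(usage, num_registers):
    """Return the set of variables that get registers: the most used ones."""
    ranked = sorted(usage, key=lambda v: (-usage[v], v))
    return set(ranked[:num_registers])


def route_cost(usage, in_registers):
    """Relative access cost of running one route under a given placement."""
    return sum(
        n * (REGISTER_COST if v in in_registers else MEMORY_COST)
        for v, n in usage.items()
    )


tuned_for_c = pick_registers(ROUTE_USAGE["C"], 2)  # static tuning for route C
tuned_for_a = pick_registers(ROUTE_USAGE["A"], 2)  # what route A would prefer

cost_a_static = route_cost(ROUTE_USAGE["A"], tuned_for_c)  # 50*3 + 40*3 + 1 + 1
cost_a_tuned = route_cost(ROUTE_USAGE["A"], tuned_for_a)   # 50 + 40 + 3 + 3

print(cost_a_static, cost_a_tuned)  # prints "272 96"
```

In this toy setting, a placement fixed for route C makes route A almost three times as expensive as a placement chosen for route A itself, which is the slowdown the passage attributes to non-average input data.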

FIG. 5 is a diagram illustrating an example of the variable placement operation in Technical Example 1.
As illustrated in FIG. 5, in Technical Example 1 the following operation is performed on the instruction path of the processor. First, instruction code read from the instruction cache (CASHE) 171 is sorted by the instruction fetch unit (FETc) 172, and if it is a floating-point bytecode, it is output to the bytecode accelerator unit (BCA) 173. There it is converted into a register-to-register transfer instruction, which is output to the decode unit (DECc, DECf) 175 via the selector 174. Technical Example 1 speeds up processing through this operation. In the decode unit 175, DECc decodes all instructions, while DECf recognizes and decodes floating-point instructions.

FIG. 6 is a diagram illustrating an example of a problem of Technical Example 1.
As described above, Technical Example 1 converts only some of the instruction codes. Therefore, as illustrated in FIG. 6, for input data A, for example, floating-point operations are present, the placement destinations are converted, and the scheme is effective. For input data B, however, the processing route changes and no floating-point operations are present, so there is no speed-up. Thus, in Technical Example 1, whether floating-point operations are present is fixed by the input data, and the conversion cannot be changed dynamically.

FIG. 7 is a diagram illustrating an example of the variable placement operation in Technical Example 2.
As illustrated in FIG. 7, in Technical Example 2 the following operation is performed in a program conversion apparatus. First, the processing program (assembler code) is read and analyzed, the length of each variable's live range and the sum of the loop levels at which the variable exists are calculated, and the allocation priority of each variable is calculated from these values. The placement destination of each variable is then changed based on the calculated priority. Technical Example 2 speeds up processing through this operation.

FIG. 8 is a diagram illustrating an example of a problem of Technical Example 2.
As described above, in Technical Example 2 the processing program (assembler code) is analyzed, the placement is determined based on the length of each variable's live range in that code and the sum of the loop levels at which the variable exists, and the processing program is then executed. Suppose, as illustrated in FIG. 8, that variable X is judged to be the most frequently used variable in the processing program (see A in FIG. 8) and is placed in a register. If the input data subsequently changes, the most frequently used variable on the execution route that processes the new data may be variable Y instead of variable X (see the shaded portion in B of FIG. 8). Thus, when the input data changes and the execution route of the processing program changes, Technical Example 2 cannot follow the resulting changes in the live ranges of the variables used or in the number of loop iterations.

FIG. 9 is a diagram illustrating an example of a problem that arises when Technical Example 1 and Technical Example 2 are combined.
As illustrated in FIG. 9, in this case the processing program (assembler code or the like) is read, and the placement destinations are converted based on the instruction type (floating point), the length of each variable's live range, and the sum of the loop levels at which the variable exists. Even in this case, however, the allocation between registers and memory cannot be changed (optimized) according to the usage frequency of the variables in a function, which varies with the input data.

In view of the above problems with variable placement, a data processing apparatus including the register arrangement optimization apparatus according to Embodiment 1 has the following configuration.
FIG. 10 is a diagram illustrating an example of a data processing apparatus including the register arrangement optimization apparatus according to this embodiment.

In FIG. 10, elements identical to those illustrated in FIG. 1 are denoted by the same reference numerals. In the data processing apparatus illustrated in FIG. 10, the register arrangement optimization apparatus according to this embodiment includes an instruction conversion unit 117 and a conversion map table 118. The instruction conversion unit 117 is an example of the instruction conversion unit described above in the apparatus aspect, and the conversion map table 118 is an example of the conversion table.

In the data processing apparatus illustrated in FIG. 10, the processor CPU unit 110 further includes the instruction conversion unit 117 and the conversion map table 118, which are provided between the instruction cache 113 and the system bus controller 115. The conversion map table 118 may be provided inside or outside the instruction conversion unit 117 and is stored, for example, in a storage unit (not illustrated) in the processor CPU unit 110.

The instruction code in the processing program stored in the processing program storage area 121 of the main memory unit 120 is abstracted pseudo instruction code in which the operands of operation instructions are not made explicit (the placement destination of the operands is not specified). That is, the pseudo instruction code does not state whether an operand of an operation instruction is placed in the register 112 or in memory (the internal memory 116 or the main memory unit 120). A processing program containing such pseudo instruction code is generated in advance and stored in the processing program storage area 121.

変換マップテーブル１１８は、詳しくは後述するように、処理プログラム内の各関数に定義されている各変数の配置先、平均使用頻度、及び過去の使用頻度等の情報を含む。
命令変換ユニット１１７は、変換マップテーブル１１８に含まれる各変数の配置先の情報に基づいて、擬似命令コードを、プロセッサコア１１１が解釈可能な（実行可能な）命令コード（以下「通常の命令コード」ともいう）に変換する。すなわち、演算命令のオペランドの配置先がレジスタ１１２であるのかメモリ（内部メモリ１１６又はメインメモリ部１２０）であるのかが明示されていない抽象化された擬似命令コードを、それが明示された通常の命令コードに変換する。そして、変換された命令コードを命令キャッシュ１１３にロードする。また、命令変換ユニット１１７は、詳しくは後述するように、変換マップテーブル１１８の更新等の処理も行う。 As will be described in detail later, the conversion map table 118 includes information such as the location of each variable defined in each function in the processing program, the average usage frequency, and the past usage frequency.
The instruction conversion unit 117 converts the pseudo instruction code into instruction code that the processor core 111 can interpret (execute) (hereinafter also called “normal instruction code”), based on the placement information for each variable in the conversion map table 118. That is, abstracted pseudo instruction code, which does not state whether an operand of an arithmetic instruction is placed in the register 112 or in memory (the internal memory 116 or the main memory unit 120), is converted into normal instruction code in which the placement is stated. The converted instruction code is then loaded into the instruction cache 113. As described in detail later, the instruction conversion unit 117 also performs processing such as updating the conversion map table 118.

図１１は、擬似命令コードの変換例を示す図である。
図１１に示した例は、命令変換ユニット１１７が、変換マップテーブル１１８に含まれる各変数の配置先の情報に基づいて、擬似命令コードである加算命令「ADD val0, val1」を、通常の命令コードである加算命令「add r0, r1」又は「add (mem0), r1」に変換した例である。 FIG. 11 is a diagram illustrating a conversion example of the pseudo instruction code.
In the example shown in FIG. 11, the instruction conversion unit 117 converts the addition instruction “ADD val0, val1”, which is pseudo instruction code, into the addition instruction “add r0, r1” or “add (mem0), r1”, which is normal instruction code, based on the placement information for each variable in the conversion map table 118.

ここで、擬似命令コードである加算命令「ADD val0, val1」は、変数１(val0)と変数２(val1)の加算命令を示している。このように、擬似命令コードは、演算命令のオペランドの配置先がレジスタ１１２であるのかメモリ（内部メモリ１１６又はメインメモリ部１２０）であるのかが明示されていない。 Here, an addition instruction “ADD val0, val1” which is a pseudo instruction code indicates an addition instruction of variable 1 (val0) and variable 2 (val1). Thus, the pseudo instruction code does not clearly indicate whether the operand of the operation instruction is placed in the register 112 or in the memory (the internal memory 116 or the main memory unit 120).

一方、変換後の命令コードである加算命令「add r0, r1」は、変数１がレジスタ０(r0)に配置され、変数２がレジスタ１(r1)に配置されている場合の変換例であり、レジスタ０(r0)に配置されている変数１とレジスタ１(r1)に配置されている変数２の加算命令を示している。なお、レジスタ０(r0)及びレジスタ１(r1)は、レジスタ１１２に含まれる領域である。 On the other hand, the addition instruction “add r0, r1”, which is the instruction code after conversion, is an example of conversion when variable 1 is placed in register 0 (r0) and variable 2 is placed in register 1 (r1). The addition instruction of the variable 1 arranged in the register 0 (r0) and the variable 2 arranged in the register 1 (r1) is shown. Note that the register 0 (r0) and the register 1 (r1) are areas included in the register 112.

また、変換後の命令コードである加算命令「add (mem0), r1」は、変数１がメモリ(mem0)に配置され、変数２がレジスタ１(r1)に配置されている場合の変換例であり、メモリ(mem0)に配置されている変数１とレジスタ１(r1)に配置されている変数２の加算命令を示している。なお、メモリ(mem0)は、内部メモリ１１６又はメインメモリ部１２０に含まれる領域である。 An addition instruction “add (mem0), r1”, which is an instruction code after conversion, is a conversion example in the case where variable 1 is arranged in memory (mem0) and variable 2 is arranged in register 1 (r1). Yes, it shows an addition instruction for the variable 1 arranged in the memory (mem0) and the variable 2 arranged in the register 1 (r1). The memory (mem0) is an area included in the internal memory 116 or the main memory unit 120.
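The operand conversion of FIG. 11 can be illustrated with a minimal Python sketch. The function name `convert_operands`, the layout of the placement dictionary, and the textual instruction format are assumptions made for illustration only; the patent describes a hardware unit and does not specify an implementation.

```python
def convert_operands(pseudo_instruction, placement):
    """Turn a pseudo instruction into a normal one using a placement map.

    `placement` maps a variable name to ("register", name) or ("memory", name).
    Both the map layout and this function are hypothetical.
    """
    mnemonic, operand_text = pseudo_instruction.split(maxsplit=1)
    converted = []
    for variable in operand_text.split(","):
        kind, location = placement[variable.strip()]
        # Memory operands are written in parentheses, as in "add (mem0), r1".
        converted.append(f"({location})" if kind == "memory" else location)
    return f"{mnemonic.lower()} {', '.join(converted)}"


both_in_registers = {"val0": ("register", "r0"), "val1": ("register", "r1")}
one_in_memory = {"val0": ("memory", "mem0"), "val1": ("register", "r1")}

print(convert_operands("ADD val0, val1", both_in_registers))  # add r0, r1
print(convert_operands("ADD val0, val1", one_in_memory))      # add (mem0), r1
```

The same pseudo instruction yields different normal instructions depending only on the placement map, which is the property the conversion map table 118 relies on.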

図１２は、変換マップテーブル１１８の一例を示す図である。
図１２に示したように、変換マップテーブル１１８は、処理プログラム内の各関数に定義されている各変数の配置先、平均使用頻度、過去の使用頻度等の情報を含む。但し、図１２では、過去の使用頻度の情報を省略して示している。 FIG. 12 is a diagram illustrating an example of the conversion map table 118.
As shown in FIG. 12, the conversion map table 118 includes information such as the location of each variable defined in each function in the processing program, the average use frequency, and the past use frequency. However, in FIG. 12, the past use frequency information is omitted.

変換マップテーブル１１８に含まれる情報は、処理プログラムの実行に伴って、次のようにして動的に更新（変更）される。
まず、処理プログラム起動時において、各変数の配置先の情報は、例えば、処理プログラム中の各関数内での変数の定義順等によって優先度を設け、その優先度に応じてレジスタ１１２又はメモリ（内部メモリ１１６又はメインメモリ部１２０）に決定される。また、各変数の平均使用頻度及び過去の使用頻度の情報は０とされる。 The information included in the conversion map table 118 is dynamically updated (changed) as follows with the execution of the processing program.
First, when the processing program is started, the placement destination of each variable is set to either the register 112 or memory (the internal memory 116 or the main memory unit 120) according to a priority assigned, for example, by the order in which variables are defined within each function of the processing program. The average usage frequency and the past usage frequency of each variable are set to 0.

その後、処理プログラムの処理対象となるデータが入力され、処理プログラムの実行が開始されると、処理プログラム中の擬似命令コードは、命令変換ユニット１１７により変換されて命令キャッシュ１１３へロードされる。命令変換ユニット１１７による変換の際には、変換マップテーブル１１８が参照され、次のような変換（オペランド変換ともいう）が行われる。変換マップテーブル１１８において配置先がレジスタ（レジスタ−１〜レジスタ−５）１１２となっている変数は、命令コードのオペランドの配置先が該当するレジスタ１１２となるような命令列に変換される。一方、変換マップテーブル１１８において配置先がメモリ（内部メモリ１１６又はメインメモリ部１２０）となっている変数は、命令コードのオペランドの配置先が該当するメモリとなるような命令列に変換される。 Thereafter, when data to be processed by the processing program is input and the execution of the processing program is started, the pseudo instruction code in the processing program is converted by the instruction conversion unit 117 and loaded into the instruction cache 113. At the time of conversion by the instruction conversion unit 117, the conversion map table 118 is referred to, and the following conversion (also referred to as operand conversion) is performed. In the conversion map table 118, a variable whose placement destination is the register (register-1 to register-5) 112 is converted into an instruction sequence in which the placement destination of the instruction code operand is the corresponding register 112. On the other hand, in the conversion map table 118, a variable whose placement destination is a memory (internal memory 116 or main memory unit 120) is converted into an instruction sequence in which the placement destination of the instruction code operand is the corresponding memory.

そして、このようにして変換された命令コードがプロセッサコア１１１に実行されて変数が使用されると、その変数毎に、変換マップテーブル１１８に含まれる過去の使用頻度の情報が更新されると共に、平均使用頻度の情報が更新される。 When the instruction code converted in this way is executed by the processor core 111 and a variable is used, the past usage frequency information in the conversion map table 118 is updated for that variable, and the average usage frequency information is updated as well.

例えば、図１２に示した変換マップテーブル１１８において、関数[1]の変数[2]の平均使用頻度は「５０」であることを示し、関数[1]に定義されている変数の中で平均使用頻度が最も高いことを示している。このような平均使用頻度の情報は、変数の配置先の変更に使用される。例えば、図１２に示した変換マップテーブル１１８において、関数[2]の変数[1]は平均使用頻度が「７」であり、関数[2]の変数[5]は平均使用頻度が「２０」であることを示している。この場合には、関数[2]において、メモリ（内部メモリ１１６又はメインメモリ部１２０）に配置されている変数[5]が、レジスタ１１２に配置されている変数[1]よりも平均使用頻度が高くなっている為、変換マップテーブル１１８において、変数の配置先の入れ替えが行われる。すなわち、変数[5]の配置先がレジスタ１１２となり、変数[1]の配置先がメモリ（内部メモリ１１６又はメインメモリ部１２０）となるように、変換マップテーブル１１８において、変数の配置先の入れ替えが行われる。このようにして変数の配置先の入れ替えが行われることにより、変数の配置先を動的に変更（最適化）することが可能となる。また、このようにして変数の配置先の入れ替えが行われた後の変換マップテーブル１１８に含まれる各変数の配置先の情報に基づいて命令変換ユニット１１７による変換が行われることにより、処理の高速化を図ることができる。 For example, the conversion map table 118 shown in FIG. 12 indicates that the average usage frequency of variable [2] of function [1] is “50”, the highest among the variables defined in function [1]. Such average usage frequency information is used to change where variables are placed. For example, the conversion map table 118 shown in FIG. 12 indicates that variable [1] of function [2] has an average usage frequency of “7” and variable [5] of function [2] has an average usage frequency of “20”. In this case, within function [2], variable [5], which is placed in memory (the internal memory 116 or the main memory unit 120), has a higher average usage frequency than variable [1], which is placed in the register 112, so the placement destinations of the two variables are swapped in the conversion map table 118. That is, the table is updated so that variable [5] is placed in the register 112 and variable [1] is placed in memory (the internal memory 116 or the main memory unit 120). Swapping placement destinations in this way makes it possible to change (optimize) the placement of variables dynamically. 
Further, because the instruction conversion unit 117 performs conversion based on the placement information in the conversion map table 118 after the placement destinations have been swapped in this way, processing can be sped up.
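The swap described for function [2] can be sketched as follows. The table layout, the function name `swap_placement`, and the rule of comparing the least-used register variable with the most-used memory variable are assumptions for illustration; the text only gives the two-variable example with average usage frequencies 7 and 20.

```python
def swap_placement(variables):
    """Swap placements if a memory variable is used more than a register variable.

    `variables` maps a variable name to {"placement": ..., "avg_use": ...}.
    Picking the least-used register variable and the most-used memory
    variable is an assumption, not something the patent specifies.
    """
    in_register = [n for n, v in variables.items() if v["placement"] == "register"]
    in_memory = [n for n, v in variables.items() if v["placement"] == "memory"]
    if not in_register or not in_memory:
        return False
    coldest = min(in_register, key=lambda n: variables[n]["avg_use"])
    hottest = max(in_memory, key=lambda n: variables[n]["avg_use"])
    if variables[hottest]["avg_use"] <= variables[coldest]["avg_use"]:
        return False
    variables[coldest]["placement"] = "memory"
    variables[hottest]["placement"] = "register"
    return True


# Only the two rows quoted in the text for function [2]; other rows are omitted.
function_2 = {
    "variable[1]": {"placement": "register", "avg_use": 7},
    "variable[5]": {"placement": "memory", "avg_use": 20},
}
swapped = swap_placement(function_2)
print(swapped, function_2)
```

After the call, variable [5] is marked as register-resident and variable [1] as memory-resident, so a later conversion pass would emit register operands for the more frequently used variable.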

なお、変換マップテーブル１１８において、プログラム起動時の各変数の配置先の情報は、例えば、それを予め内部メモリ１１６又はメインメモリ部１２０に格納しておき、プログラム起動時に読み出して変換マップテーブル１１８に格納するようにすることも可能である。この場合、プログラム起動時の各変数の配置先の情報は、例えば、処理プログラムの静的な解析結果に基づいて決定される。 In the conversion map table 118, information on the location of each variable at the time of starting the program is stored in the internal memory 116 or the main memory unit 120 in advance, for example, and is read out at the time of starting the program and stored in the conversion map table 118. It is also possible to store them. In this case, the information on the location of each variable at the time of starting the program is determined based on, for example, a static analysis result of the processing program.

図１３は、変換マップテーブル１１８に含まれる過去の使用頻度の情報の更新に係る動作の一例を示すフローチャートである。
図１３に示したように、図示しない命令キャッシュ１１３のコントローラによってリード要求（命令コードのロード指示）が行われると（Ｓ１０１）、次のような動作が行われる。まず、このリード要求に応じて、プロセッサコア１１１が命令コードを実行するために必要な擬似命令コードがメインメモリ部１２０の処理プログラム格納領域１２１からリードされ、命令変換ユニット１１７へ転送される（Ｓ１０２）。命令変換ユニット１１７では、変換マップテーブル１１８に含まれる各変数の配置先の情報に基づいて、オペランド変換により、転送された擬似命令コードが通常の命令コードに変換される（Ｓ１０３）。また、変換された命令コードがプロセッサコア１１１により実行されて変数が使用されると、その変数についての過去の使用頻度がインクリメントされるように、変換マップテーブル１１８に含まれる過去の使用頻度の情報が更新される（Ｓ１０４）。なお、変換マップテーブル１１８において、このようにしてインクリメントされた過去の使用頻度の情報は、詳しくは後述するように、平均使用頻度の情報を更新する際に使用される。 FIG. 13 is a flowchart illustrating an example of an operation related to updating past usage frequency information included in the conversion map table 118.
As shown in FIG. 13, when a read request (an instruction code load instruction) is issued by the controller (not shown) of the instruction cache 113 (S101), the following operations are performed. First, in response to the read request, the pseudo instruction code that the processor core 111 needs in order to execute instruction code is read from the processing program storage area 121 of the main memory unit 120 and transferred to the instruction conversion unit 117 (S102). In the instruction conversion unit 117, the transferred pseudo instruction code is converted into normal instruction code by operand conversion, based on the placement information for each variable in the conversion map table 118 (S103). When the converted instruction code is executed by the processor core 111 and a variable is used, the past usage frequency information in the conversion map table 118 is updated so that the past usage frequency of that variable is incremented (S104). As described in detail later, the past usage frequency incremented in this way is used when the average usage frequency information in the conversion map table 118 is updated.
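The counting step S104 can be pictured with a small sketch. It assumes, purely for illustration, that every pseudo instruction in the list is executed exactly once and that instructions are written as text in the style of FIG. 11; the function name `count_variable_uses` is hypothetical.

```python
from collections import Counter


def count_variable_uses(executed_instructions):
    """Count how often each variable appears as an operand (cf. S104).

    Each entry is assumed to be one executed instruction written as
    "<mnemonic> <operand>, <operand>", which is an illustrative format.
    """
    uses = Counter()
    for instruction in executed_instructions:
        _mnemonic, operand_text = instruction.split(maxsplit=1)
        for variable in operand_text.split(","):
            uses[variable.strip()] += 1
    return uses


uses = count_variable_uses(["ADD val0, val1", "ADD val0, val2"])
print(uses["val0"], uses["val1"], uses["val2"])  # 2 1 1
```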

図１４は、処理プログラムの実行に伴って行われる変換マップテーブル１１８の更新手順の一例を説明する図である。
処理プログラムの実行によって行われるデータ処理は、連続するデータ処理を含む場合がある。一般的に、連続するデータ処理では、一定の処理の区切りを１単位（１処理ユニット）として、それを繰返し実行することで全体処理を実現する。例えば、動画像処理では１画面分の処理を１単位としてこれを繰り返し実行することで、物体認識処理や画像圧縮処理等といった画像処理を実現している。ここでは、そのような連続するデータ処理が行われる場合における変換マップテーブル１１８の更新手順の一例を説明する。 FIG. 14 is a diagram for explaining an example of a procedure for updating the conversion map table 118 performed in accordance with the execution of the processing program.
Data processing performed by executing the processing program may include continuous data processing. In general, continuous data processing treats a fixed segment of processing as one unit (one processing unit) and realizes the overall processing by executing that unit repeatedly. For example, moving image processing treats the processing for one screen as one unit and repeats it to realize image processing such as object recognition and image compression. The following describes an example of a procedure for updating the conversion map table 118 when such continuous data processing is performed.

図１４に示したように、処理プログラムの実行中に行われる１処理ユニット毎の処理では、まず、再配置処理が行われる（Ｓ２０１）。この再配置処理では、詳しくは後述するように、変換マップテーブル１１８に含まれる各変数の平均使用頻度の情報に基づいて各変数の配置先が決定され、それを反映させるように、変換マップテーブル１１８に含まれる各変数の配置先の情報が更新（変更）される。 As shown in FIG. 14, in the processing for each processing unit performed during execution of the processing program, first, rearrangement processing is performed (S201). In this rearrangement processing, as will be described in detail later, the placement destination of each variable is determined based on the information on the average use frequency of each variable included in the transformation map table 118, and the transformation map table is reflected so as to reflect it. Information on the location of each variable included in 118 is updated (changed).

次に、処理プログラムの１処理ユニットが実行される（Ｓ２０２）。ここでは、上述のように、命令変換ユニット１１７によって変換マップテーブル１１８が参照され、擬似命令コードから通常の命令コードへの変換が行われ、変換された命令コードがプロセッサコア１１１によって実行される。 Next, one processing unit of the processing program is executed (S202). Here, as described above, the conversion map table 118 is referred to by the instruction conversion unit 117, conversion from the pseudo instruction code to the normal instruction code is performed, and the converted instruction code is executed by the processor core 111.

次に、頻度平均更新処理が行われる（Ｓ２０３）。この頻度平均更新処理では、Ｓ２０２においてプロセッサコア１１１により命令コードが実行されたときに使用された変数の使用頻度の情報が、１処理ユニット分の過去の使用頻度の情報として、変換マップテーブル１１８に追加される。なお、変換マップテーブル１１８は、図１４に示したように、過去の使用頻度の情報として、現在から過去所定数回（Ｎ回）分の各々の１処理ユニットの実行において使用された変数の使用頻度の情報を含むことができる。例えば、図１４の「過去データ１」は、今回（直近）の１処理ユニットの実行において使用された変数の使用頻度の情報を示し、「過去データＮ」は、（Ｎ−１）処理ユニット前の処理ユニットの実行において使用された変数の使用頻度の情報を示す。このように、プロセッサコア１１１により命令コードが実行されたときに使用される変数に応じて、変換マップテーブル１１８に含まれる過去の使用頻度の情報が更新される。 Next, frequency average update processing is performed (S203). In this processing, the usage frequency of each variable used when the processor core 111 executed instruction code in S202 is added to the conversion map table 118 as past usage frequency information for one processing unit. As shown in FIG. 14, the conversion map table 118 can hold, as past usage frequency information, the usage frequency of the variables used in each of the most recent predetermined number (N) of processing-unit executions. For example, “Past data 1” in FIG. 14 indicates the usage frequency of the variables used in the current (most recent) processing-unit execution, and “Past data N” indicates the usage frequency of the variables used in the processing-unit execution (N−1) units earlier. In this way, the past usage frequency information in the conversion map table 118 is updated according to the variables used when the processor core 111 executes instruction code.

また、Ｓ２０３の頻度平均更新処理では、更に、変換マップテーブル１１８において、過去の使用頻度の情報に基づいて、各関数に定義されている各変数の使用頻度の平均値が求められ、平均使用頻度の情報が更新される。この場合、例えば、現在から過去Ｍ（但し、Ｍ≦Ｎ）回分の使用頻度の情報に基づいて、各関数に定義されている各変数の使用頻度の平均値が求められる。例えば、図１４に示した変換マップテーブル１１８に含まれる関数[2]の変数[3]の平均使用頻度「８」（網掛け部分参照）の情報は、現在から過去６回分の使用頻度（「１２」、「６」、「６」、「８」、「８」、「８」）の平均値から求められたものである。このようにして求められる平均使用頻度の情報を用いることにより、突発的（一時的）な使用頻度の変動による影響を軽減させることができる。 In the frequency average update processing of S203, the average usage frequency of each variable defined in each function is further calculated from the past usage frequency information in the conversion map table 118, and the average usage frequency information is updated. In this case, for example, the average is calculated from the usage frequency information for the most recent M executions (where M ≦ N). For example, the average usage frequency “8” (see the shaded portion) of variable [3] of function [2] in the conversion map table 118 shown in FIG. 14 is the average of the six most recent usage frequencies (“12”, “6”, “6”, “8”, “8”, “8”), that is, 48 / 6 = 8. Using the average usage frequency obtained in this way reduces the influence of sudden (temporary) fluctuations in usage frequency.
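A minimal sketch of the history and averaging in S203 follows, using the six values quoted for variable [3] of function [2]. The class name `UsageHistory`, its parameters, and the choice N = 8 are hypothetical; only the six counts and the resulting average of 8 come from the text.

```python
from collections import deque


class UsageHistory:
    """Keep the last N per-unit usage counts and average the last M of them."""

    def __init__(self, n_keep, m_average):
        if not 1 <= m_average <= n_keep:
            raise ValueError("requires 1 <= M <= N")
        self.m_average = m_average
        # Newest count sits at index 0; the oldest is dropped once N is exceeded.
        self.counts = deque(maxlen=n_keep)

    def record_unit(self, count):
        """Store the usage count observed in the processing unit just executed."""
        self.counts.appendleft(count)

    def average(self):
        """Average over the most recent M counts (0 before any unit has run)."""
        recent = list(self.counts)[: self.m_average]
        return sum(recent) / len(recent) if recent else 0


history = UsageHistory(n_keep=8, m_average=6)
for count in (8, 8, 8, 6, 6, 12):  # oldest first, so 12 is the most recent unit
    history.record_unit(count)
print(history.average())  # 8.0
```

Because the average spans several units, the single count of 12 raises the result only modestly, which is the smoothing effect the passage describes.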

図１５は、上述の再配置処理（Ｓ２０１）の処理例を示すフローチャートである。
図１５に示した処理は、上述の再配置処理（Ｓ２０１）において、１処理ユニット内の各関数に対して行われる処理である。 FIG. 15 is a flowchart illustrating a processing example of the above-described rearrangement processing (S201).
The process shown in FIG. 15 is a process performed for each function in one processing unit in the above-described rearrangement process (S201).

図１５に示した処理において、Ｒｍは、変数の再配置済レジスタを示す。Ｒｎ＿ｍａｘは、関数内の変数を配置可能な最大レジスタ数を示す。ＶＳｉは、関数内の変数を平均使用頻度順に並べたときのｉ番目の変数を示す。 In the processing shown in FIG. 15, Rm indicates a variable relocated register. Rn_max indicates the maximum number of registers in which variables in the function can be arranged. VSi indicates the i-th variable when the variables in the function are arranged in order of average usage frequency.

図１５に示したように、この処理では、まず、変換マップテーブル１１８に含まれる対象となる関数内に定義されている各変数の平均使用頻度の情報が参照される（Ｓ３０１）。なお、本処理が、処理プログラムの起動後２回目以降の１処理ユニットで行われる場合には、Ｓ３０１で参照される情報は、前回の１処理ユニットにおけるＳ２０３で更新された平均使用頻度の情報となる。次に、対象となる関数内に定義されている各変数の平均使用頻度の情報に基づいて平均使用頻度が高い順に並べられた変数ＶＳｉの配置先が、順次レジスタ１１２に決定される（Ｓ３０２）。そして、変数の配置先がレジスタ１１２に決定されたレジスタ数がＲｎ＿ｍａｘに達したら（Ｓ３０３がＹｅｓ）、残りの変数ＶＳｉの配置先がメモリ（内部メモリ１１６又はメインメモリ部１２０）に決定される（Ｓ３０４）。このような処理が、１処理ユニットに含まれる各関数に対して繰り返し行われる。これにより、変換マップテーブル１１８に含まれる各変数の平均使用頻度の情報に基づいて、平均使用頻度の高い変数を優先的にレジスタ１１２に再配置するように、変数の配置先を決定することができる。 As shown in FIG. 15, this processing first refers to the average usage frequency information in the conversion map table 118 for each variable defined in the target function (S301). When this processing is performed in the second or a later processing unit after the processing program starts, the information referred to in S301 is the average usage frequency information updated in S203 of the previous processing unit. Next, the variables VSi, ordered from highest to lowest average usage frequency, are assigned to the register 112 one after another (S302). When the number of registers to which variables have been assigned reaches Rn_max (Yes in S303), the remaining variables VSi are assigned to memory (the internal memory 116 or the main memory unit 120) (S304). This processing is repeated for each function included in one processing unit. As a result, based on the average usage frequency information in the conversion map table 118, placement destinations can be determined so that variables with a high average usage frequency are preferentially relocated to the register 112.
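The relocation of FIG. 15 amounts to ranking the variables of one function by average usage frequency and giving registers to the first Rn_max of them. The sketch below assumes a plain dictionary of averages and hypothetical register labels; the function name `relocate` and the sample values other than 7 and 20 are illustrative.

```python
def relocate(avg_use, rn_max):
    """Assign the rn_max most-used variables of one function to registers.

    `avg_use` maps variable name to average usage frequency (S301).
    Ties keep their original order because Python's sort is stable.
    """
    ranked = sorted(avg_use, key=avg_use.get, reverse=True)  # VS_1, VS_2, ...
    placement = {}
    for index, name in enumerate(ranked):
        if index < rn_max:                      # S302, until S303 says Yes
            placement[name] = f"register-{index + 1}"
        else:                                   # S304
            placement[name] = "memory"
    return placement


placement = relocate({"variable[1]": 7, "variable[2]": 3, "variable[5]": 20}, rn_max=2)
print(placement)
```

With two registers available, the variables with averages 20 and 7 receive registers and the one with average 3 goes to memory.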

なお、静的なコンパイルにおいて、レジスタ割り付けを最適化する手法として、例えば、PolettoとSarkarの“linear scan allocation”という手法が知られている（以下「手法１」という）。この手法１において、例えば、レジスタの使用期間に、上述の動的な情報（変換マップテーブル１１８に含まれる情報）を更に加味するようにすることで、より効率の良い変数の配置が可能となる（以下「手法２」という）。 As a technique for optimizing register allocation in static compilation, for example, a technique called “linear scan allocation” by Poletto and Sarkar (hereinafter referred to as “method 1”) is known. In Method 1, for example, by further adding the above-described dynamic information (information included in the conversion map table 118) to the register usage period, it is possible to arrange variables more efficiently. (Hereinafter referred to as “Method 2”).

図１６(a) は、手法１による変数の配置例を示す図である。図１６(b) は、手法２による変数の配置例を示す図である。
なお、図１６に示した例は、変数Ａのソースコード上の使用終了タイミングが全変数中で最長となっているものの、実際の実行時には判定文のfalse側の処理ルートで使用されておりtrue側の処理ルートでは使用されていない場合の例である。また、図１６において、Ｒ０、Ｒ１は、レジスタに含まれる領域を示す。 FIG. 16(a) is a diagram illustrating an example of variable placement by Method 1. FIG. 16(b) is a diagram illustrating an example of variable placement by Method 2.
In the example shown in FIG. 16, the point at which variable A is last used in the source code is the latest among all variables, but at actual run time variable A is used only on the false-side path of a conditional statement and not on the true-side path. In FIG. 16, R0 and R1 indicate areas included in the register.

図１６(a) に示したように、動的な情報を加味しない手法１では、変数Ａをレジスタからメモリへ追い出すspill動作が行われている。これに対し、図１６(b) に示したように、動的な情報を加味した手法２では、例えば、実行される処理ルートの予測が行われ、判定文で実行されない処理ルートが判断される。その結果、変数Ａのspill動作は行われず、より最適な変数の配置とすることができる。 As shown in FIG. 16(a), Method 1, which does not take dynamic information into account, performs a spill operation that evicts variable A from a register to memory. In contrast, as shown in FIG. 16(b), Method 2, which takes dynamic information into account, predicts, for example, which processing path will be executed and thereby identifies the path of the conditional statement that will not be executed. As a result, variable A is not spilled, and a better variable placement can be obtained.

次に、本実施例に係るレジスタ配置最適化装置を含むデータ処理装置の具体例として、本実施例に係るレジスタ配置最適化装置を画像符号化装置に適用した例を説明する。
図１７は、本実施例に係るレジスタ配置最適化装置を含む画像符号化装置の構成例を示す図である。 Next, as a specific example of the data processing apparatus including the register arrangement optimizing apparatus according to this embodiment, an example in which the register arrangement optimizing apparatus according to this embodiment is applied to an image encoding apparatus will be described.
FIG. 17 is a diagram illustrating a configuration example of an image encoding device including the register arrangement optimizing device according to the present embodiment.

なお、図１７において、図１０に示した要素と同一の要素については、同一の符号を付している。また、図１７に示した画像符号化装置において、本実施例に係るレジスタ配置最適化装置は、図１０に示したデータ処理装置の場合と同様に、命令変換ユニット１１７と変換マップテーブル１１８を含む。 In FIG. 17, elements identical to those shown in FIG. 10 are denoted by the same reference numerals. In the image encoding device shown in FIG. 17, the register arrangement optimizing apparatus according to the present embodiment includes the instruction conversion unit 117 and the conversion map table 118, as in the data processing apparatus shown in FIG. 10.

図１７に示した画像符号化装置において、画像処理プロセッサ２１０の構成は、図１０に示したプロセッサＣＰＵ部１１０と基本的に同じである。また、ビデオ（Video）入力Ｉ／Ｆ（InterFace）部２２０は、外部のカメラ等から入力されるデータの入力処理を行う。ビデオ（Video）出力Ｉ／Ｆ（InterFace）部２３０は、外部のディスプレイ等への出力処理を行う。その他の構成については、図１０に示したデータ処理装置と基本的に同じである。 In the image encoding device shown in FIG. 17, the configuration of the image processor 210 is basically the same as that of the processor CPU unit 110 shown in FIG. A video input I / F (InterFace) unit 220 performs input processing of data input from an external camera or the like. A video output I / F (InterFace) unit 230 performs output processing to an external display or the like. Other configurations are basically the same as those of the data processing apparatus shown in FIG.

続いて、図１７に示した画像符号化装置の動作を、図１８及び図１９を用いて説明する。
図１８は、図１７に示した画像符号化装置の処理の流れの概要を模式的に示す図である。 Next, the operation of the image encoding device illustrated in FIG. 17 will be described with reference to FIGS. 18 and 19.
FIG. 18 is a diagram schematically showing an outline of the processing flow of the image encoding device shown in FIG.

図１８に示したように、カメラ等からの入力画像データは、ビデオ入力Ｉ／Ｆ部２２０、システムバス１６０、及びメモリコントローラ部１３０を介して、メインメモリ部１２０の処理データ格納領域１２２に蓄積される（図１８の矢印Ａ参照）。蓄積された画像データは、処理画像単位に、メモリコントローラ部１３０、システムバス１６０、及びシステムバスコントローラ１１５を介してデータキャッシュ１１４に読み出される（図１８の矢印Ｂ参照）。一方、メインメモリ部１２０の処理プログラム格納領域１２１に格納されている処理プログラム（擬似命令コードを含む）は、メモリコントローラ部１３０、システムバス１６０、及びシステムバスコントローラ１１５を介して命令変換ユニット１１７にロードされる（図１８の矢印Ｃ参照）。命令変換ユニット１１７では、変換マップテーブル１１８が参照され（図１８の矢印Ｄ参照）、その内容に基づいて擬似命令コードが通常の命令コードに変換されて命令キャッシュ１１３へ転送される（図１８の矢印Ｅ参照）。転送された命令コードは、プロセッサコア１１１により読み出され（図１８の矢印Ｆ参照）、その命令コードの実行が、例えばレジスタ１１２に配置されている変数が使用される等して行われる（図１８の矢印Ｇ参照）。このようにして処理プログラムが実行されると、それにより出力データが作成される。そして、その出力データが、データキャッシュ１１４、システムバスコントローラ１１５、及びシステムバス１６０を経由した後、例えば、通信Ｉ／Ｆ部１５０又はビデオ出力Ｉ／Ｆ部２３０を介して出力される（図１８の矢印Ｈ参照）。出力データは、例えば、画像圧縮データ等である。 As shown in FIG. 18, input image data from a camera or the like is accumulated in the processing data storage area 122 of the main memory unit 120 via the video input I/F unit 220, the system bus 160, and the memory controller unit 130 (see arrow A in FIG. 18). The accumulated image data is read into the data cache 114 in units of processed images via the memory controller unit 130, the system bus 160, and the system bus controller 115 (see arrow B in FIG. 18). Meanwhile, the processing program (including pseudo instruction code) stored in the processing program storage area 121 of the main memory unit 120 is loaded into the instruction conversion unit 117 via the memory controller unit 130, the system bus 160, and the system bus controller 115 (see arrow C in FIG. 18). The instruction conversion unit 117 refers to the conversion map table 118 (see arrow D in FIG. 18), converts the pseudo instruction code into normal instruction code based on its contents, and transfers the result to the instruction cache 113 (see arrow E in FIG. 18). The transferred instruction code is read by the processor core 111 (see arrow F in FIG. 18) and executed using, for example, variables placed in the register 112 (see arrow G in FIG. 18). Executing the processing program in this way produces output data. The output data passes through the data cache 114, the system bus controller 115, and the system bus 160 and is then output via, for example, the communication I/F unit 150 or the video output I/F unit 230 (see arrow H in FIG. 18). The output data is, for example, compressed image data.

図１９は、図１７に示した画像符号化装置の動作シーケンスの一例を示す図である。
図１９に示したように、カメラ等からの入力画像データは、ビデオ入力Ｉ／Ｆ部２２０等を介して、メインメモリ部１２０の処理データ格納領域１２２に蓄積され、画像保存が行われる（Ｓ４０１）。次に、メインメモリ部１２０の処理プログラム格納領域１２１に格納されている処理プログラム（擬似命令コードを含む）が処理画像単位で起動し（Ｓ４０２）、命令変換ユニット１１７にロードされる。命令変換ユニット１１７では、変換マップテーブル１１８が参照され（Ｓ４０３）、処理プログラム内の各関数に定義されている各変数の配置先が決定される（Ｓ４０４）。また、擬似命令コードが通常の命令コードに変換される（Ｓ４０５）。変換された命令コードは命令キャッシュ１１３へ転送され、その後、プロセッサコア１１１により読み出されて、その命令コードの実行が行われる（Ｓ４０６）。一方、命令変換ユニット１１７では、プロセッサコア１１１により命令コードが実行されて変数が使用されると、その変数の使用頻度がインクリメントされる（Ｓ４０７）等して変換マップテーブル１１８に含まれる情報が更新される。そして、このようにして処理プログラムが実行され、それにより作成された出力データ（画像データ等）は、例えば、通信Ｉ／Ｆ部１５０又はビデオ出力Ｉ／Ｆ部２３０を介して、通信回線や外部のディスプレイ等に出力される（Ｓ４０８）。 FIG. 19 is a diagram illustrating an example of an operation sequence of the image encoding device illustrated in FIG. 17.
As shown in FIG. 19, input image data from a camera or the like is accumulated in the processing data storage area 122 of the main memory unit 120 via the video input I/F unit 220 and the like, and the image is saved (S401). Next, the processing program (including pseudo instruction code) stored in the processing program storage area 121 of the main memory unit 120 is started for each processed image (S402) and loaded into the instruction conversion unit 117. The instruction conversion unit 117 refers to the conversion map table 118 (S403) and determines the placement destination of each variable defined in each function of the processing program (S404). The pseudo instruction code is then converted into normal instruction code (S405). The converted instruction code is transferred to the instruction cache 113, then read and executed by the processor core 111 (S406). Meanwhile, when the processor core 111 executes instruction code and a variable is used, the instruction conversion unit 117 updates the information in the conversion map table 118, for example by incrementing the usage frequency of that variable (S407). The output data (image data or the like) produced by executing the processing program in this way is output to a communication line, an external display, or the like via, for example, the communication I/F unit 150 or the video output I/F unit 230 (S408).

以上のように、本実施例に係るレジスタ配置最適化装置によれば、プログラム内の関数に定義されている変数の使用頻度に基づいて、使用頻度の高い変数が優先的にレジスタに配置されるように変数の配置先を動的に決定することができる。従って、変数の配置先を動的に最適化することができ、処理の高速化を図ることができる。また、このような最適化を、例えば動作モードの指定等といったユーザの特別な操作無しに、行うことができる。 As described above, according to the register arrangement optimizing apparatus according to the present embodiment, a frequently used variable is preferentially arranged in the register based on the use frequency of the variable defined in the function in the program. Thus, the variable placement destination can be determined dynamically. Therefore, the variable placement destination can be dynamically optimized, and the processing speed can be increased. Further, such optimization can be performed without any special user operation such as designation of an operation mode.

また、プログラムの開発期間を、次のような理由により、短縮することもできる。
図２０は、従来の場合のプログラム開発期間と、本実施例に係るレジスタ配置最適化装置を採用した場合のプログラム開発期間との一例を示す図である。 In addition, the program development period can be shortened for the following reasons.
FIG. 20 is a diagram illustrating an example of a conventional program development period and a program development period when the register arrangement optimizing apparatus according to the present embodiment is employed.

図２０に示したように、例えば、従来の場合において、設計期間に１ヶ月、高速化のための修正期間（チューニング作業）に２ヶ月を要していたとする。この場合、本実施例に係るレジスタ配置最適化装置を採用した場合には、高速化のための修正期間が不要になるため、その分（２ヶ月）の開発期間の短縮を図ることができる。 As shown in FIG. 20, suppose, for example, that the conventional approach required one month for design and two months for speed-up modifications (tuning work). When the register arrangement optimizing apparatus according to the present embodiment is employed, the speed-up modification period becomes unnecessary, so the development period can be shortened by that amount (two months).

また、本実施例に係るレジスタ配置最適化装置を含む画像処理装置においては、例えば、複数フレームの画像が入力された場合に、その複数フレームの画像に対する画像処理に応じた高速化が可能になる。 Further, in an image processing apparatus including the register arrangement optimizing apparatus according to the present embodiment, when, for example, images of a plurality of frames are input, processing can be sped up in a manner suited to the image processing performed on those frames.

＜実施例２＞
実施例２に係るレジスタ配置最適化装置は、入力データ特徴算出部を更に含むと共に、変換マップテーブル１１８として複数の変換マップテーブルを含むことが、実施例１に係るレジスタ配置最適化装置と異なる。なお、入力データ特徴算出部はテーブル選択部の一例である。 <Example 2>
The register arrangement optimizing apparatus according to the second embodiment is different from the register arrangement optimizing apparatus according to the first embodiment in that it further includes an input data feature calculation unit and includes a plurality of conversion map tables as the conversion map table 118. The input data feature calculation unit is an example of a table selection unit.

本実施例に係るレジスタ配置最適化装置において、入力データ特徴算出部は、入力データの特徴を算出し、その算出結果に応じて、複数の変換マップテーブルの中から、対応する変換マップテーブルを選択する等の処理を行う。 In the register arrangement optimizing device according to the present embodiment, the input data feature calculation unit calculates the feature of the input data, and selects a corresponding conversion map table from a plurality of conversion map tables according to the calculation result Perform processing such as.

図２１(a),(b),(c) は、入力データ特徴算出部の処理例を示すフローチャートである。図２１(a) は、その基本的な処理例を示すフローチャートであり、図２１(b),(c) は、その具体例を示すフローチャートである。 FIGS. 21A, 21B, and 21C are flowcharts showing an example of processing of the input data feature calculation unit. FIG. 21A is a flowchart showing an example of the basic processing, and FIGS. 21B and 21C are flowcharts showing specific examples thereof.

As shown in FIG. 21(a), the basic processing of the input data feature calculation unit is as follows. First, when input data is received (S501), a difference of the input data is calculated (S502). Next, the conversion map table corresponding to that difference is selected from among the plurality of conversion map tables (S503), and the selection is notified to the instruction conversion unit 117 (S504). The instruction conversion unit 117 then converts the pseudo instruction code using the conversion map table selected by the input data feature calculation unit.
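The basic flow of FIG. 21(a) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation described in the specification: the names `ConversionUnitStub`, `difference`, `select_table`, and `on_input`, the range-based table choice, and the difference measure (the sum of absolute steps between neighboring samples) are all hypothetical, since the text says only that a table is selected according to a difference of the input data.

```python
from typing import List, Sequence, Tuple


class ConversionUnitStub:
    """Hypothetical stand-in for the instruction conversion unit 117."""

    def __init__(self) -> None:
        self.active_table = "default"

    def notify(self, table_name: str) -> None:
        # S504: the unit is told which conversion map table to use.
        self.active_table = table_name


def difference(data: Sequence[int]) -> int:
    """S502: an assumed difference measure (sum of absolute neighbor deltas)."""
    return sum(abs(b - a) for a, b in zip(data, data[1:]))


def select_table(diff: int, ranges: List[Tuple[int, str]]) -> str:
    """S503: pick the first table whose upper bound covers the difference."""
    for upper_bound, name in ranges:
        if diff <= upper_bound:
            return name
    return ranges[-1][1]  # fall back to the last table if none matches


def on_input(data: Sequence[int], unit: ConversionUnitStub,
             ranges: List[Tuple[int, str]]) -> None:
    """S501 to S504 for one unit of input data."""
    unit.notify(select_table(difference(data), ranges))
```

With `ranges = [(10, "table_low"), (1000, "table_high")]`, a smooth input such as `[1, 2, 3, 4]` selects `table_low`, while a sharply varying input such as `[0, 100, 0]` selects `table_high`.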

Specifically, as shown in FIG. 21(b), for example, when input data (image data) is received (S601), the intra-frame difference of the input data is calculated (S602), and it is determined whether the intra-frame difference is larger than a predetermined value (S603). If the determination result is Yes, the conversion map table for intra-frame processing is selected, and the selection is notified to the instruction conversion unit 117 (S604). The instruction conversion unit 117 then converts the pseudo instruction code using the conversion map table for intra-frame processing selected by the input data feature calculation unit. If the determination result in S603 is No, the instruction conversion unit 117 is notified that the conversion map table is not to be changed (S605). In that case, the instruction conversion unit 117 converts the pseudo instruction code using the conversion map table set as the default.

Alternatively, as shown in FIG. 21(c), for example, when input data (image data) is received (S701), the inter-frame difference of the input data is calculated (S702), and it is determined whether the inter-frame difference is larger than a predetermined value (S703). If the determination result is Yes, the conversion map table for inter-frame processing is selected, and the selection is notified to the instruction conversion unit 117 (S704). The instruction conversion unit 117 then converts the pseudo instruction code using the conversion map table for inter-frame processing selected by the input data feature calculation unit. If the determination result in S703 is No, the instruction conversion unit 117 is notified that the conversion map table is not to be changed (S705). In that case, the instruction conversion unit 117 converts the pseudo instruction code using the conversion map table set as the default.

Through the processing shown in FIG. 21(b) or FIG. 21(c), the conversion map table is switched for an image with a large intra-frame or inter-frame difference, and the pseudo instruction code is converted using the newly selected table. As a result, even when the image data fluctuates, for example at a scene change in a moving image, the placement destination of variables can be switched without affecting the information included in the conversion map table.
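As an illustration of the decisions in FIG. 21(b) and FIG. 21(c), the sketch below computes an inter-frame difference and an intra-frame difference for small grayscale frames and returns either the name of a table to switch to or `None`, meaning that the table is not changed. The difference measures (mean absolute pixel difference), the threshold values, and all identifiers are assumptions made for this example; the text specifies only that the difference is compared with a predetermined value.

```python
from typing import List, Optional

Frame = List[List[int]]  # rows of grayscale pixel values


def inter_frame_difference(prev: Frame, cur: Frame) -> float:
    """Mean absolute difference between co-located pixels of two frames."""
    total = sum(abs(c - p)
                for prev_row, cur_row in zip(prev, cur)
                for p, c in zip(prev_row, cur_row))
    count = sum(len(row) for row in cur)
    return total / count


def intra_frame_difference(frame: Frame) -> float:
    """Mean absolute difference between horizontally adjacent pixels.

    Assumes every row has at least two pixels.
    """
    deltas = [abs(b - a) for row in frame for a, b in zip(row, row[1:])]
    return sum(deltas) / len(deltas)


def choose_table(diff: float, threshold: float, table: str) -> Optional[str]:
    """S603/S703: switch to `table` only if the difference exceeds the threshold."""
    return table if diff > threshold else None
```

For two identical frames the inter-frame difference is zero, so `choose_table` returns `None` and the default table stays in use; a cut from a dark frame to a bright frame exceeds a moderate threshold and returns the table name instead.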

Next, an example in which the register arrangement optimizing apparatus according to the present embodiment is applied to an image encoding apparatus will be described.
FIG. 22 is a diagram showing a configuration example of an image encoding apparatus that includes the register arrangement optimizing apparatus according to the present embodiment.

In FIG. 22, elements identical to those shown in FIG. 17 are denoted by the same reference numerals. In the image encoding apparatus shown in FIG. 22, the register arrangement optimizing apparatus according to the present embodiment includes the instruction conversion unit 117, the plurality of conversion map tables 118, and the input image feature calculation unit 311.

In the image encoding apparatus shown in FIG. 22, the image processor 210 is basically the same as the image processor 210 shown in FIG. 17, except that it further includes the input image feature calculation unit 311 and includes a plurality of conversion map tables 118. The input image feature calculation unit 311 is an example of the input data feature calculation unit described above. The rest of the configuration is basically the same as that of the image encoding apparatus shown in FIG. 17.

Next, the operation of the image encoding apparatus shown in FIG. 22 will be described with reference to FIGS. 23 and 24.
FIG. 23 is a diagram schematically showing an outline of the processing flow of the image encoding apparatus shown in FIG. 22.

As shown in FIG. 23, input image data from a camera or the like is accumulated in the processing data storage area 122 of the main memory unit 120 via the video input I/F unit 220, the system bus 160, and the memory controller unit 130 (see arrow A in FIG. 23). The accumulated image data is read out to the data cache 114, one processing image at a time, via the memory controller unit 130, the system bus 160, and the system bus controller 115 (see arrow B in FIG. 23). At the same time, the image data is input to the input image feature calculation unit 311 (see arrow C in FIG. 23). The input image feature calculation unit 311 calculates the intra-frame difference or inter-frame difference of the input image data. Then, according to the result, it notifies the instruction conversion unit 117 either of the selected conversion map table 118 or that the conversion map table 118 is not to be changed (see arrow D in FIG. 23).
Meanwhile, the processing program (including pseudo instruction code) stored in the processing program storage area 121 of the main memory unit 120 is loaded into the instruction conversion unit 117 via the memory controller unit 130, the system bus 160, and the system bus controller 115 (see arrow E in FIG. 23). In response to the notification from the input image feature calculation unit 311, the instruction conversion unit 117 selects and refers to the corresponding conversion map table 118 (see arrow F in FIG. 23). Based on the contents of that conversion map table 118, the pseudo instruction code is converted into normal instruction code and transferred to the instruction cache 113 (see arrow G in FIG. 23). The transferred instruction code is read by the processor core 111 (see arrow H in FIG. 23) and executed using, for example, variables placed in the register 112 (see arrow I in FIG. 23). Executing the processing program in this way produces output data. The output data passes through the data cache 114, the system bus controller 115, and the system bus 160, and is then output via, for example, the communication I/F unit 150 or the video output I/F unit 230 (see arrow J in FIG. 23). The output data is, for example, compressed image data.

FIG. 24 is a diagram showing an example of an operation sequence of the image encoding apparatus shown in FIG. 22.
As shown in FIG. 24, input image data from a camera or the like is accumulated in the processing data storage area 122 of the main memory unit 120 via the video input I/F unit 220 and so on, and the image is thereby stored (S801). Next, the processing program (including pseudo instruction code) stored in the processing program storage area 121 of the main memory unit 120 is started for each processing image (S802) and loaded into the instruction conversion unit 117. The input image feature calculation unit 311 calculates the difference (intra-frame difference or inter-frame difference) of the input image data (S803). According to the result, it notifies the instruction conversion unit 117 either of the selected conversion map table 118 or that the conversion map table 118 is not to be changed (S804). In response to the notification, the instruction conversion unit 117 selects and refers to the corresponding conversion map table 118 (S805) and determines the placement destination of each variable defined in each function in the processing program (S806). The pseudo instruction code is then converted into normal instruction code (S807). The converted instruction code is transferred to the instruction cache 113, after which it is read and executed by the processor core 111 (S808). When the processor core 111 executes the instruction code and a variable is used, the instruction conversion unit 117 updates the information included in the conversion map table 118, for example by incrementing the usage frequency of that variable (S809). The output data (image data or the like) produced by executing the processing program in this way is output to a communication line, an external display, or the like via, for example, the communication I/F unit 150 or the video output I/F unit 230 (S810).
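Steps S806 and S809 can be illustrated with a small sketch. It assumes that a conversion map table is represented as a dictionary from variable name to a record holding a usage count and a placement, and that a fixed number of registers is available. These representations, and names such as `Entry`, `decide_placement`, and `record_usage`, are hypothetical. Ties are broken by variable name purely to keep the example deterministic.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class Entry:
    usage: int = 0             # usage frequency of the variable
    placement: str = "memory"  # "r0", "r1", ... or "memory"


def decide_placement(table: Dict[str, Entry], num_registers: int) -> None:
    """S806: give registers to the most frequently used variables."""
    ranked = sorted(table, key=lambda name: (-table[name].usage, name))
    for rank, name in enumerate(ranked):
        table[name].placement = f"r{rank}" if rank < num_registers else "memory"


def record_usage(table: Dict[str, Entry], used: Iterable[str]) -> None:
    """S809: increment the usage frequency of each variable that was used."""
    for name in used:
        table[name].usage += 1
```

If variables `a`, `b`, and `c` start with zero counts, and an execution uses `c` three times and `b` once, then with two registers the next placement decision puts `c` and `b` in registers and leaves `a` in memory.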

The input image feature calculation unit 311 can notify the instruction conversion unit 117 of the selected conversion map table 118 by, for example, the following method.
The input image feature calculation unit 311 notifies the instruction conversion unit 117 of the conversion map table selection value corresponding to the selected conversion map table 118. On receiving it, the instruction conversion unit 117 selects the corresponding conversion map table 118 based on that selection value. In this case, each of the plurality of conversion map tables 118 includes its corresponding conversion map table selection value.

FIG. 25 is a diagram showing an example of a plurality of conversion map tables 118, each of which includes a conversion map table selection value.
As shown in FIG. 25, each of the plurality of conversion map tables 118 includes its corresponding conversion map table selection value. For example, the conversion map table 118a includes the selection value "1", and the conversion map table 118b includes the selection value "2". The past usage frequency information is omitted from the conversion map tables 118a and 118b shown in FIG. 25.
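One way to picture the selection value mechanism of FIG. 25 is a lookup keyed by the value stored in each table. The sketch assumes that each table is a small object carrying its own `selection_value`; the class name, the function name, and the placeholder placement data are illustrative only.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ConversionMapTable:
    selection_value: int
    placements: Dict[str, str] = field(default_factory=dict)


def find_table(tables: List[ConversionMapTable], value: int) -> ConversionMapTable:
    """Return the table whose stored selection value matches the notified value."""
    for table in tables:
        if table.selection_value == value:
            return table
    raise KeyError(f"no conversion map table with selection value {value}")
```

Given one table with selection value 1 and another with selection value 2, notifying the value 2 yields the second table, and an unknown value raises `KeyError` rather than silently falling back.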

As described above, with the register arrangement optimizing apparatus according to the present embodiment, the conversion map table 118 used (referred to and updated) by the instruction conversion unit 117 can be a conversion map table 118 suited to the features of the input data. Consequently, the conversion map table 118 is used adaptively according to, for example, the scene of the input image, so that image processing can be sped up.

As with the register arrangement optimizing apparatus according to the first embodiment described above, the program development period can also be shortened.
In addition, in an image processing apparatus that includes the register arrangement optimizing apparatus according to the present embodiment, when, for example, images of a plurality of frames are input, the processing can be sped up in a manner suited to the features of those frames.

In the register arrangement optimizing apparatuses according to the first and second embodiments described above, the past usage frequency and average usage frequency information included in the conversion map table 118 is updated according to the variables used when the processor core 111 executes the instruction code. Alternatively, this update may be performed according to, for example, the variables used when the instruction conversion unit 117 converts the pseudo instruction code.
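The relationship between the past usage frequency and the average usage frequency can be sketched as below. This section does not give the averaging formula, so the sketch assumes a simple arithmetic mean over a bounded window of past per-run counts; the window length and the name `UsageHistory` are hypothetical.

```python
from collections import deque
from typing import Deque


class UsageHistory:
    """Past usage counts of one variable plus the average derived from them."""

    def __init__(self, window: int = 4) -> None:
        self.past: Deque[int] = deque(maxlen=window)  # oldest entries drop out
        self.average: float = 0.0

    def record_run(self, count: int) -> None:
        """Store the count observed in one run, then refresh the average."""
        self.past.append(count)
        self.average = sum(self.past) / len(self.past)
```

The same `record_run` call could be driven either by counts observed during execution or by counts gathered during conversion, which is the alternative the paragraph above describes.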

The register arrangement optimizing apparatuses according to the first and second embodiments described above can also be realized by, for example, the following computer system.
FIG. 26 is a diagram showing a configuration example of the computer system.

As shown in FIG. 26, this computer system includes a CPU 401, a ROM 402, a RAM 403, a communication interface 404, a storage device 405, an input/output device 406, a reading device 407 for a portable storage medium, and a bus 408 to which all of these are connected. ROM stands for Read Only Memory.

Various types of storage devices, such as a hard disk or a magnetic disk, can be used as the storage device 405.
For example, when the register arrangement optimizing apparatus according to the first embodiment is realized by this computer system, a program for the operations (processing) performed by the instruction conversion unit 117 and the like is stored in the storage device 405 or the ROM 402. The conversion map table 118 and the like are stored in the RAM 403. The register arrangement optimizing apparatus according to the first embodiment is then realized by the CPU 401 executing the program.

Similarly, when the register arrangement optimizing apparatus according to the second embodiment is realized by this computer system, a program for the operations (processing) performed by the instruction conversion unit 117, the input data feature calculation unit, and the like is stored in the storage device 405 or the ROM 402. The plurality of conversion map tables 118 and the like are stored in the RAM 403. The register arrangement optimizing apparatus according to the second embodiment is then realized by the CPU 401 executing the program.

Such a program can also be provided by a program provider 409 via a network 410 and the communication interface 404, stored in, for example, the storage device 405, and executed by the CPU 401. It can also be stored in a commercially distributed portable storage medium 411, which is set in the reading device 407, and executed by the CPU 401. Various types of storage media, such as a CD-ROM, a flexible disk, an optical disk, a magneto-optical disk, a DVD, and a USB memory, can be used as the portable storage medium 411.

In this computer system, the communication interface 404, the storage device 405, the input/output device 406, and the reading device 407 may be omitted.
Although embodiments have been described above, the present invention is not limited to those embodiments, and various improvements and modifications can be made without departing from the gist of the present invention.

Regarding the above embodiments, the following additional notes are disclosed.
(Appendix 1)
A register arrangement optimizing method for optimizing variables placed in a register in a processor that includes the register and a processor core that executes instruction code, the method comprising:
referring to a conversion table that includes information on the placement destination and the usage frequency of variables defined in a function in a program, and determining, based on the usage frequency information, whether each variable is to be placed in the register or in a memory so that frequently used variables are preferentially placed in the register;
reflecting the determined placement destination of the variables in the placement destination information included in the conversion table;
converting, based on the placement destination information included in the conversion table, pseudo instruction code in which the placement destination of an operand of an operation instruction is not specified into instruction code executable by the processor core; and
updating the usage frequency information included in the conversion table according to the variables used when the processor core executes the converted instruction code.
(Appendix 2)
The register arrangement optimizing method according to Appendix 1, wherein
the usage frequency information included in the conversion table includes information on an average usage frequency and a past usage frequency of the variables,
in the determining, the placement destination of each variable is determined to be the register or the memory, based on the average usage frequency information included in the conversion table, so that variables with a high average usage frequency are preferentially placed in the register, and
in the updating, the past usage frequency information included in the conversion table is updated according to the variables used when the processor core executes the converted instruction code, and the average usage frequency information included in the conversion table is updated based on the past usage frequency information.
(Appendix 3)
The register arrangement optimizing method according to Appendix 1 or 2, wherein
the conversion table is a conversion table selected from among a plurality of conversion tables according to a feature of input data processed by the processor.
(Appendix 4)
The register arrangement optimizing method according to Appendix 3, wherein
the input data is image data, and
the selected conversion table is a conversion table selected according to an intra-frame difference or an inter-frame difference of the image data.
(Appendix 5)
A register arrangement optimizing program for optimizing variables placed in a register in a processor that includes the register and a processor core that executes instruction code, the program causing a computer to execute a process comprising:
referring to a conversion table that includes information on the placement destination and the usage frequency of variables defined in a function in a program, and determining, based on the usage frequency information, whether each variable is to be placed in the register or in a memory so that frequently used variables are preferentially placed in the register;
reflecting the determined placement destination of the variables in the placement destination information included in the conversion table;
converting, based on the placement destination information included in the conversion table, pseudo instruction code in which the placement destination of an operand of an operation instruction is not specified into instruction code executable by the processor core; and
updating the usage frequency information included in the conversion table according to the variables used when the processor core executes the converted instruction code.
(Appendix 6)
The register arrangement optimizing program according to Appendix 5, wherein
the usage frequency information included in the conversion table includes information on an average usage frequency and a past usage frequency of the variables,
in the determining, the placement destination of each variable is determined to be the register or the memory, based on the average usage frequency information included in the conversion table, so that variables with a high average usage frequency are preferentially placed in the register, and
in the updating, the past usage frequency information included in the conversion table is updated according to the variables used when the processor core executes the converted instruction code, and the average usage frequency information included in the conversion table is updated based on the past usage frequency information.
(Appendix 7)
The register arrangement optimizing program according to Appendix 5 or 6, wherein
the conversion table is a conversion table selected from among a plurality of conversion tables according to a feature of input data processed by the processor.
(Appendix 8)
The register arrangement optimizing program according to Appendix 7, wherein
the input data is image data, and
the selected conversion table is a conversion table selected according to an intra-frame difference or an inter-frame difference of the image data.
(Appendix 9)
A register arrangement optimizing apparatus for optimizing variables placed in a register in a processor that includes the register and a processor core that executes instruction code, the apparatus comprising:
a conversion table that includes information on the placement destination and the usage frequency of variables defined in a function in a program; and
an instruction conversion unit that converts, based on the placement destination information included in the conversion table, pseudo instruction code in which the placement destination of an operand of an operation instruction is not specified into instruction code executable by the processor core,
wherein the instruction conversion unit
determines, based on the usage frequency information included in the conversion table, whether each variable is to be placed in the register or in a memory so that frequently used variables are preferentially placed in the register,
reflects the determined placement destination of the variables in the placement destination information included in the conversion table, and
updates the usage frequency information included in the conversion table according to the variables used when the processor core executes the converted instruction code.
(Appendix 10)
The register arrangement optimizing apparatus according to Appendix 9, wherein
the usage frequency information included in the conversion table includes information on an average usage frequency and a past usage frequency of the variables, and
the instruction conversion unit
determines the placement destination of each variable to be the register or the memory, based on the average usage frequency information included in the conversion table, so that variables with a high average usage frequency are preferentially placed in the register, and
updates the past usage frequency information included in the conversion table according to the variables used when the processor core executes the converted instruction code, and updates the average usage frequency information included in the conversion table based on the past usage frequency information.
(Appendix 11)
The register arrangement optimizing apparatus according to Appendix 9 or 10, further comprising
a table selection unit that selects, from among a plurality of conversion tables, a conversion table according to a feature of input data processed by the processor,
wherein the conversion table referred to and updated by the instruction conversion unit is the conversion table selected by the table selection unit.
(Appendix 12)
The register arrangement optimizing apparatus according to Appendix 11, wherein
the input data is image data, and
the table selection unit selects, from among the plurality of conversion tables, a conversion table according to an intra-frame difference or an inter-frame difference of the image data processed by the processor.
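The conversion step shared by Appendices 1, 5, and 9 can be illustrated with a toy instruction format. Everything about the format is an assumption made for this example: pseudo instructions are tuples such as `("add", "x", "y")` that name variables, register operands are emitted as register names, and memory-resident variables are emitted as `[name]`. A real instruction set would typically also need load and store instructions for memory operands, which this sketch omits.

```python
from typing import Dict, List, Tuple

PseudoInstruction = Tuple[str, ...]  # (opcode, variable, variable, ...)


def convert(pseudo: List[PseudoInstruction],
            placement: Dict[str, str]) -> List[str]:
    """Replace each variable operand with its placement from the table."""
    converted = []
    for opcode, *operands in pseudo:
        resolved = []
        for var in operands:
            place = placement[var]
            # Register placements are used directly; memory ones are bracketed.
            resolved.append(place if place != "memory" else f"[{var}]")
        converted.append(f"{opcode} " + ", ".join(resolved))
    return converted
```

Because the pseudo instructions name variables rather than locations, the same program can be converted again with a different placement dictionary whenever the table contents change, without touching the program itself.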

110 Processor CPU unit
111 Processor core
112 Register
113 Instruction cache
114 Data cache
115 System bus controller
116 Internal memory
117 Instruction conversion unit
118 Conversion map table
120 Main memory unit
121 Processing program storage area
122 Processing data storage area
123 Processing result storage area
130 Memory controller unit
140 Peripheral input/output I/F unit
150 Communication I/F unit
160 System bus
171 Instruction cache
172 Instruction fetch unit
173 Bytecode accelerator unit
174 Selector
175 Decode unit
210 Image processor
220 Video input I/F unit
230 Video output I/F unit
311 Input image feature calculation unit

Claims

1. A register arrangement optimizing method for optimizing the variables arranged in a register of a processor that includes the register and a processor core for executing instruction codes, the method comprising:
referring to a conversion table that includes information on the placement destination and the use frequency of each variable defined in a function in a program, and determining, based on the use frequency information, whether each variable is placed in the register or in memory so that frequently used variables are preferentially placed in the register;
reflecting the determined placement destination in the placement destination information included in the conversion table;
converting, based on the placement destination information included in the conversion table, a pseudo instruction code in which the placement destination of an operand of an operation instruction is not specified into an instruction code executable by the processor core; and
updating the use frequency information included in the conversion table according to the variables used when the processor core executes the converted instruction code.
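
The steps of the method (determine placement, reflect it in the table, convert pseudo instructions, update use frequency) can be illustrated with a small software model. This is a sketch under assumptions, not the claimed hardware: the `Entry` layout, the register names `R0`, `R1`, ..., the `[name]` notation for memory operands, and the tuple form of pseudo instructions are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """One conversion-table row: placement destination and use frequency."""
    location: str = "memory"  # "R0", "R1", ... or "memory"
    use_count: int = 0


def assign_locations(table, num_registers):
    """Give registers to the most frequently used variables; the rest go to memory."""
    ranked = sorted(table, key=lambda name: table[name].use_count, reverse=True)
    for rank, name in enumerate(ranked):
        # Reflect the decision in the table's placement information.
        table[name].location = f"R{rank}" if rank < num_registers else "memory"


def convert(pseudo_code, table):
    """Resolve variable-name operands into register or memory operands."""
    converted = []
    for op, *operands in pseudo_code:
        resolved = []
        for name in operands:
            location = table[name].location
            resolved.append(location if location != "memory" else f"[{name}]")
        converted.append((op, *resolved))
    return converted


def record_usage(pseudo_code, table):
    """Update use frequency for every variable the executed code touched."""
    for _op, *operands in pseudo_code:
        for name in operands:
            table[name].use_count += 1


table = {"a": Entry(use_count=1), "b": Entry(use_count=5), "c": Entry(use_count=3)}
pseudo = [("add", "a", "b"), ("mul", "b", "c")]
assign_locations(table, num_registers=2)
print(convert(pseudo, table))
record_usage(pseudo, table)
```

Because the use counts change after each execution, calling `assign_locations` again before the next conversion lets the placement follow the observed behaviour of the program.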

2. The register arrangement optimizing method according to claim 1, wherein
the use frequency information included in the conversion table includes information on an average use frequency and a past use frequency of each variable,
in the determining, the placement destination of each variable is determined to be the register or the memory, based on the average use frequency information included in the conversion table, so that variables with a high average use frequency are preferentially placed in the register, and
in the updating, the past use frequency information included in the conversion table is updated according to the variables used when the processor core executes the converted instruction code, and the average use frequency information included in the conversion table is updated based on the past use frequency information.

3. The register arrangement optimizing method according to claim 1, wherein
the conversion table is a conversion table selected from a plurality of conversion tables according to the characteristics of input data processed by the processor.

4. The register arrangement optimizing method according to claim 3, wherein
the input data is image data, and
the selected conversion table is a conversion table selected according to an intra-frame difference or an inter-frame difference of the image data.

5. A register arrangement optimization program for optimizing the variables arranged in a register of a processor that includes the register and a processor core for executing instruction codes, the program causing a computer to execute a process comprising:
referring to a conversion table that includes information on the placement destination and the use frequency of each variable defined in a function in a program, and determining, based on the use frequency information, whether each variable is placed in the register or in memory so that frequently used variables are preferentially placed in the register;
reflecting the determined placement destination in the placement destination information included in the conversion table;
converting, based on the placement destination information included in the conversion table, a pseudo instruction code in which the placement destination of an operand of an operation instruction is not specified into an instruction code executable by the processor core; and
updating the use frequency information included in the conversion table according to the variables used when the processor core executes the converted instruction code.

6. A register arrangement optimizing device for optimizing the variables arranged in a register of a processor that includes the register and a processor core for executing instruction codes, the device comprising:
a conversion table that includes information on the placement destination and the use frequency of each variable defined in a function in a program; and
an instruction conversion unit that converts, based on the placement destination information included in the conversion table, a pseudo instruction code in which the placement destination of an operand of an operation instruction is not specified into an instruction code executable by the processor core,
wherein the instruction conversion unit:
determines, based on the use frequency information included in the conversion table, whether each variable is placed in the register or in memory so that frequently used variables are preferentially placed in the register;
reflects the determined placement destination in the placement destination information included in the conversion table; and
updates the use frequency information included in the conversion table according to the variables used when the processor core executes the converted instruction code.