JP7249306B2

JP7249306B2 - Evaluation device, evaluation method and evaluation program

Info

Publication number: JP7249306B2
Application number: JP2020065650A
Authority: JP
Inventors: 晋作清本
Original assignee: KDDI Corp
Current assignee: KDDI Corp
Priority date: 2020-04-01
Filing date: 2020-04-01
Publication date: 2023-03-30
Anticipated expiration: 2040-04-01
Also published as: JP2021163300A

Description

The present invention relates to a device, a method, and a program for evaluating the security of program obfuscation techniques.

In general, software may contain information that should be kept secret from users, such as valuable algorithms and content encryption keys. At the same time, many reverse engineering techniques for analyzing software have been developed. If software is analyzed with these techniques, there is a threat that an unauthorized party will obtain the secret information.
To counter this threat, there is a technique called obfuscation, which makes software difficult to analyze while preserving its specification (see, for example, Patent Document 1).

Patent Document 1: JP 2017-58994 A

However, it has not been possible to quantitatively evaluate the threat that secret information will be extracted from an obfuscated program.

An object of the present invention is to provide an evaluation device, an evaluation method, and an evaluation program capable of quantitatively evaluating the security of an obfuscation technique.

The evaluation device according to the present invention includes: a learning unit that selects an evaluation program from a plurality of programs before obfuscation, assigns a predetermined label to it, and generates a trained model using, as training data, the evaluation program and other programs assigned labels different from the predetermined label; a determination unit that inputs, into the trained model, a program obtained by applying the obfuscation process under evaluation to any one of the evaluation program and the other programs, and acquires a determination result; and an evaluation unit that calculates an evaluation value of the security of the obfuscation process based on the correct answer rate of a plurality of the determination results.

The evaluation unit may calculate the evaluation value of the security of the obfuscation process based on the correct answer rate of the determination results for each of the evaluation program and the other programs.

The evaluation unit may adjust the correct answer rate by weighting the determination results according to the probability of answering correctly when a label is chosen at random.

The learning unit may repeatedly select the evaluation program from the plurality of programs, and the determination unit may acquire the determination results from each of the resulting plurality of trained models.

The evaluation unit may statistically process the correct answer rate of the determination results by acquiring it a plurality of times while changing the evaluation program.

The determination unit may acquire the respective determination results obtained when the obfuscation process is applied to any of the programs a plurality of times, each time with different parameters.

The evaluation unit may statistically process the correct answer rate of the determination results by acquiring it a plurality of times while changing the parameters of the obfuscation process.

The training data and the input to the trained model may be programs converted into two-dimensional images.

The training data and the input to the trained model may be programs converted into vectors.

In the evaluation method according to the present invention, a computer executes: a learning step of selecting an evaluation program from a plurality of programs before obfuscation, assigning a predetermined label to it, and generating a trained model using, as training data, the evaluation program and other programs assigned labels different from the predetermined label; a determination step of inputting, into the trained model, a program obtained by applying the obfuscation process under evaluation to any one of the evaluation program and the other programs, and acquiring a determination result; and an evaluation step of calculating an evaluation value of the security of the obfuscation process based on the correct answer rate of a plurality of the determination results.

The evaluation program according to the present invention causes a computer to function as the evaluation device.

According to the present invention, the security of an obfuscation technique can be evaluated quantitatively.

FIG. 1 is a diagram showing the functional configuration of the evaluation device in the embodiment.
FIG. 2 is a diagram illustrating the processing procedure of the security evaluation method in the embodiment.

An example of an embodiment of the present invention is described below.
FIG. 1 is a diagram showing the functional configuration of the evaluation device 1 according to this embodiment.
The evaluation device 1 is an information processing device (computer) such as a server or a personal computer. In addition to a control unit 10 and a storage unit 20, it includes input/output devices for various data, communication devices, and the like.

The control unit 10 controls the evaluation device 1 as a whole, and implements each function of this embodiment by reading and executing, as appropriate, the various programs stored in the storage unit 20. The control unit 10 may be a CPU.

The storage unit 20 is a storage area for the various programs and data that cause the hardware to function as the evaluation device 1, and may be a ROM, a RAM, a flash memory, a hard disk drive (HDD), or the like. Specifically, the storage unit 20 stores a program (the evaluation program) that causes the control unit 10 to execute each function of this embodiment, a learning model, and the like.

The control unit 10 includes a learning unit 11, a determination unit 12, and an evaluation unit 13.
Through these functional units, the control unit 10 evaluates the security of an obfuscation technique using determination results obtained from a learning model.

The learning unit 11 selects one evaluation program from a plurality of programs before obfuscation and assigns a predetermined label to it. It then trains a DNN (Deep Neural Network) or the like, using as training data this evaluation program and other programs assigned labels different from the predetermined label, to generate a trained model.
The labels may be of two kinds, a label t1 indicating the evaluation program and a label t2 indicating all others, or a separate label (t2, t3, ...) may be assigned to each of the other programs. In either case, the trained model outputs two or more kinds of labels, including t1.
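The two labeling schemes described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `label_programs`, its parameters, and the representation of a program as `bytes` are hypothetical and do not appear in the source.

```python
from typing import List, Tuple

def label_programs(programs: List[bytes], eval_index: int,
                   distinct_others: bool = False) -> List[Tuple[bytes, str]]:
    """Label the evaluation program 't1' and the rest 't2' (or t2, t3, ...).

    Hypothetical helper: names and types are assumptions, not from the source.
    """
    if not 0 <= eval_index < len(programs):
        raise IndexError("eval_index out of range")
    labeled = []
    next_label = 2  # counter used only when the other programs get distinct labels
    for i, prog in enumerate(programs):
        if i == eval_index:
            labeled.append((prog, "t1"))
        elif distinct_others:
            labeled.append((prog, f"t{next_label}"))
            next_label += 1
        else:
            labeled.append((prog, "t2"))
    return labeled
```

Calling `label_programs(programs, 0)` yields the two-label scheme, while `distinct_others=True` gives each of the other programs its own label.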

Here, the training data and the input to the trained model may be, for example, the bit string of a program converted into two-dimensional image data. In this case, the trained model is generated by training a DNN (Deep Neural Network) or the like.
Alternatively, the training data and the input to the trained model may be, for example, a program converted into a vector. In this case, the trained model is generated by training a clustering method such as DBSCAN.
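The source does not specify how a program is turned into an image or a vector, so the following is a minimal sketch under assumptions: each byte becomes one grayscale pixel in a fixed-width image, and the vector is a normalized byte-frequency histogram. The function names and the default width of 16 are hypothetical.

```python
import math
from typing import List

def bytes_to_image(program: bytes, width: int = 16) -> List[List[int]]:
    """Lay the program's bytes out row by row as a grayscale image (0-255 per pixel)."""
    if width <= 0:
        raise ValueError("width must be positive")
    height = max(1, math.ceil(len(program) / width))
    padded = program.ljust(width * height, b"\x00")  # zero-pad the last row
    return [list(padded[r * width:(r + 1) * width]) for r in range(height)]

def bytes_to_vector(program: bytes) -> List[float]:
    """Represent the program as a normalized 256-bin byte-frequency histogram (an assumed encoding)."""
    counts = [0] * 256
    for b in program:
        counts[b] += 1
    total = len(program) or 1  # avoid division by zero for an empty program
    return [c / total for c in counts]
```

Either representation can then be fed to whichever classifier or clustering method is chosen; other encodings (for example, opcode sequences) would fit the description equally well.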

To collect a variety of results for the security evaluation, the learning unit 11 can repeatedly select a new evaluation program from the plurality of programs and provide the determination unit 12 with a trained model based on each selected evaluation program.

The determination unit 12 applies the obfuscation process under evaluation to one or more of the evaluation program and the other programs, inputs the resulting programs into the generated trained model, and acquires label determination results.

In doing so, the determination unit 12 may acquire determination results from each of the plurality of trained models generated for the respective evaluation programs.
The determination unit 12 may also acquire the respective determination results obtained when the obfuscation process is applied to each program a plurality of times with different parameters.

The evaluation unit 13 calculates the strength of the obfuscation process, that is, an evaluation value of its security, based on the correct answer rate of a plurality of determination results.
Here, a determination result may be counted as correct when the obfuscated evaluation program is determined to be t1, or when another obfuscated program is determined to be something other than t1 (for example, t2, t3, ...). Alternatively, when the other programs are assigned separate labels t2, t3, ..., a result may be counted as correct only when the correct label is output for programs other than t1 as well.
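The two definitions of a correct answer can be expressed as a small predicate. The function name `is_correct` and the `strict` flag are hypothetical names introduced for this sketch.

```python
def is_correct(true_label: str, predicted: str, strict: bool = False) -> bool:
    """Decide whether one determination result counts as a correct answer.

    Lenient (default): only the t1 / not-t1 distinction matters.
    Strict: the predicted label must match the true label exactly.
    """
    if strict:
        return predicted == true_label
    return (predicted == "t1") == (true_label == "t1")
```

For example, a program labeled t2 that is classified as t3 counts as correct under the lenient rule but not under the strict one.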

The correct answer rate is calculated, for example, as follows.
(1) The evaluation unit 13 aggregates all determination results produced by the determination unit 12 for the evaluation program and the other programs, and calculates the correct answer rate.
In doing so, the evaluation unit 13 may adjust the correct answer rate by weighting the determination results according to the probability of answering correctly when a label is chosen at random. For example, when a plurality of other programs (label t2) are prepared, a determination result that classifies the evaluation program (t1) as t1 may be weighted more heavily than one that classifies another program (t2) as t2.

(2) The evaluation unit 13 obtains the correct answer rate of the determination results for each selected evaluation program and, by acquiring the correct answer rate a plurality of times while changing the evaluation program, calculates a statistic such as the mean or the maximum.

(3) The evaluation unit 13 obtains the correct answer rate from the determination results acquired while changing the evaluation program and, by further acquiring the correct answer rate a plurality of times while changing the parameters of the obfuscation process, calculates a statistic such as the mean or the maximum.
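The weighting mentioned in (1) is not defined precisely in the source. One possible reading, shown here purely as an assumption, weights each result by the inverse frequency of its true label, so that the single t1 program counts as much as all t2 programs combined. The function name `weighted_correct_rate` is hypothetical.

```python
from collections import Counter
from typing import List, Tuple

def weighted_correct_rate(results: List[Tuple[str, str]]) -> float:
    """Correct answer rate with each result weighted by 1 / (frequency of its true label).

    results: (true_label, predicted_label) pairs.
    The inverse-frequency weighting is an assumed scheme, not taken from the source.
    """
    if not results:
        raise ValueError("no determination results")
    freq = Counter(true for true, _ in results)
    total = 0.0
    correct = 0.0
    for true, pred in results:
        weight = 1.0 / freq[true]
        total += weight
        if pred == true:
            correct += weight
    return correct / total
```

With one t1 result that is correct and three t2 results of which one is correct, the unweighted rate is 2/4, whereas this weighting gives (1 + 1/3) / 2 = 2/3, reflecting the greater importance placed on recognizing the evaluation program.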

FIG. 2 is a diagram illustrating the processing procedure of the security evaluation method according to this embodiment.
In this example, two kinds of labels are assigned to the training data: among the plurality of programs, the evaluation program is labeled t1 and the other programs share the common label t2. A DNN that takes image data as input is used as the learning model.

The learning unit 11 converts each of these programs into image data and inputs it into the DNN for training. The resulting trained model classifies the image data of an input program as t1 or t2 and outputs this as the determination result.

The determination unit 12 converts the evaluation program, obfuscated with the obfuscation technique under evaluation, into image data and inputs it into the trained model, obtaining t1 or t2 as the determination result. A result of t1 is counted as a correct answer.
The determination unit 12 may likewise convert the other programs into image data, input them into the trained model, and count a result of t2 as a correct answer.

If the obfuscation technique under evaluation generates a different obfuscated program from the same program each time, for example by changing a parameter such as a random number, the evaluation program and the other programs are obfuscated repeatedly. The determination unit 12 converts each obfuscated program into an image, inputs it into the trained model, and obtains a determination result in the same way.
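The repeated obfuscation and classification loop might look like the sketch below. The obfuscator and the classifier are passed in as callables because the source does not fix either one; `collect_results`, `obfuscate`, `classify`, `trials`, and `seed` are all hypothetical names, and a 32-bit random integer stands in for the obfuscation parameter.

```python
import random
from typing import Callable, List, Tuple

def collect_results(
    labeled_programs: List[Tuple[bytes, str]],
    obfuscate: Callable[[bytes, int], bytes],
    classify: Callable[[bytes], str],
    trials: int = 10,
    seed: int = 0,
) -> List[Tuple[str, str]]:
    """Obfuscate every program `trials` times with fresh parameters and classify each output.

    Returns (true_label, predicted_label) pairs for later aggregation.
    """
    rng = random.Random(seed)  # reproducible source of obfuscation parameters
    results = []
    for _ in range(trials):
        for program, true_label in labeled_programs:
            param = rng.getrandbits(32)  # assumed stand-in for the obfuscation parameter
            results.append((true_label, classify(obfuscate(program, param))))
    return results
```

In practice `classify` would wrap the image or vector conversion together with the trained model's prediction.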

The evaluation unit 13 then treats a lower correct answer rate among the collected determination results as indicating a more secure obfuscation technique, and calculates an evaluation value such as the reciprocal of the correct answer rate.
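As a minimal sketch of this step, the correct answer rate and its reciprocal can be computed as follows. The function names are hypothetical, and returning infinity for a rate of zero is an assumption made here to keep the reciprocal well defined.

```python
import math
from typing import List, Tuple

def correct_rate(results: List[Tuple[str, str]]) -> float:
    """Fraction of (true_label, predicted_label) pairs where the labels match."""
    if not results:
        raise ValueError("no determination results")
    return sum(1 for true, pred in results if true == pred) / len(results)

def security_score(rate: float) -> float:
    """Reciprocal of the correct answer rate: a lower rate gives a higher score."""
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be in [0, 1]")
    # Assumed convention: a model that is never correct yields an unbounded score.
    return math.inf if rate == 0.0 else 1.0 / rate
```

A rate of 1.0 gives the minimum score of 1.0, meaning the model always re-identifies the obfuscated program.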

As described above, the evaluation unit 13 may swap in a different evaluation program, calculate the correct answer rate in the same way each time, and then calculate the evaluation value based on a statistic of these rates, such as their mean or maximum.
Alternatively, the evaluation unit 13 may calculate the correct answer rate for each parameter of the obfuscation technique and then calculate the evaluation value based on a statistic of these rates, such as their mean or maximum.
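Aggregating several correct answer rates into a single statistic could be sketched as below. The function name `summarize_rates` and the string-valued `statistic` argument are hypothetical; the list may hold one rate per evaluation program or one per obfuscation parameter.

```python
from statistics import mean
from typing import List

def summarize_rates(rates: List[float], statistic: str = "max") -> float:
    """Reduce per-run correct answer rates to a single value (mean or maximum)."""
    if not rates:
        raise ValueError("no correct answer rates")
    if statistic == "mean":
        return mean(rates)
    if statistic == "max":
        return max(rates)
    raise ValueError(f"unsupported statistic: {statistic}")
```

Because a higher correct answer rate means weaker obfuscation, the maximum gives a worst-case view of the technique, while the mean describes its typical behavior.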

According to this embodiment, the evaluation device 1 generates a trained model using, as training data, an evaluation program assigned a predetermined label and other programs assigned different labels; inputs into the trained model a program obtained by applying the obfuscation process to any of the evaluation program and the other programs; and acquires a determination result. The evaluation device 1 then calculates an evaluation value of the security of the obfuscation process under evaluation based on the correct answer rate of a plurality of determination results.
The evaluation device 1 thus judges it dangerous when AI (Artificial Intelligence) classifies an obfuscated program as the same program as before obfuscation. This automates a security evaluation that previously relied on manual work and allows the security of an obfuscation technique to be evaluated quantitatively.

By calculating the evaluation value of the security of the obfuscation process based on the correct answer rate of the determination results for each of the evaluation program and the other programs, the evaluation device 1 can easily collect determination results and appropriately calculate the security evaluation value.
In doing so, the evaluation device 1 adjusts the correct answer rate by weighting the determination results according to the probability of answering correctly when a label is chosen at random. This reflects the importance of each determination result and allows security to be evaluated more appropriately.

The evaluation device 1 may repeatedly select an evaluation program from the plurality of programs and acquire determination results from each of the plurality of trained models generated at each selection.
The evaluation device 1 can thereby use a plurality of evaluation programs to evaluate the security of the obfuscation technique appropriately and from multiple angles.
In addition, the evaluation device 1 can calculate a more appropriate evaluation value by statistically processing the correct answer rate obtained for each selection of the evaluation program.

The evaluation device 1 may acquire the respective determination results obtained when the obfuscation process is performed a plurality of times with different parameters.
The evaluation device 1 can thereby use a plurality of obfuscation results to evaluate the security of the obfuscation technique appropriately and from multiple angles.
In addition, the evaluation device 1 can calculate a more appropriate evaluation value by statistically processing the correct answer rate obtained for each parameter.

By using the bit string of a program converted into two-dimensional image data as the training data and as the input to the trained model, the evaluation device 1 can fit the data to a classification method such as a DNN and appropriately implement the evaluation method.
Likewise, by using a program converted into a vector as the training data and as the input to the trained model, the evaluation device 1 can fit the data to a clustering method such as DBSCAN and appropriately implement the evaluation method.

Although an embodiment of the present invention has been described above, the present invention is not limited to that embodiment. The effects described in the embodiment merely list the most suitable effects arising from the present invention, and the effects of the present invention are not limited to those described in the embodiment.

The evaluation method performed by the evaluation device 1 is realized by software. In that case, the programs constituting the software are installed on an information processing device (computer). These programs may be distributed to users on removable media such as CD-ROMs, or distributed by being downloaded to users' computers over a network. They may also be provided to users' computers as a web service over a network without being downloaded.

1 Evaluation device
10 Control unit
11 Learning unit
12 Determination unit
13 Evaluation unit
20 Storage unit

Claims

1. An evaluation device comprising:
a learning unit that selects an evaluation program from a plurality of programs before obfuscation, assigns a predetermined label to it, and generates a trained model using, as training data, the evaluation program and other programs assigned labels different from the predetermined label;
a determination unit that inputs, into the trained model, a program obtained by applying an obfuscation process under evaluation to any one of the evaluation program and the other programs, and acquires a determination result; and
an evaluation unit that calculates an evaluation value of the security of the obfuscation process based on the correct answer rate of a plurality of the determination results.

2. The evaluation device according to claim 1, wherein the evaluation unit calculates the evaluation value of the security of the obfuscation process based on the correct answer rate of the determination results for each of the evaluation program and the other programs.

3. The evaluation device according to claim 2, wherein the evaluation unit adjusts the correct answer rate by weighting the determination results according to the probability of answering correctly when a label is chosen at random.

4. The evaluation device according to any one of claims 1 to 3, wherein the learning unit repeatedly selects the evaluation program from the plurality of programs, and the determination unit acquires the determination results from each of the resulting plurality of trained models.

5. The evaluation device according to claim 4, wherein the evaluation unit statistically processes the correct answer rate of the determination results by acquiring it a plurality of times while changing the evaluation program.

6. The evaluation device according to any one of claims 1 to 5, wherein the determination unit acquires the respective determination results obtained when the obfuscation process is applied to any of the programs a plurality of times, each time with different parameters.

7. The evaluation device according to claim 6, wherein the evaluation unit statistically processes the correct answer rate of the determination results by acquiring it a plurality of times while changing the parameters of the obfuscation process.

8. The evaluation device according to any one of claims 1 to 7, wherein the training data and the input to the trained model are programs converted into two-dimensional images.

9. The evaluation device according to any one of claims 1 to 7, wherein the training data and the input to the trained model are programs converted into vectors.

10. An evaluation method in which a computer executes:
a learning step of selecting an evaluation program from a plurality of programs before obfuscation, assigning a predetermined label to it, and generating a trained model using, as training data, the evaluation program and other programs assigned labels different from the predetermined label;
a determination step of inputting, into the trained model, a program obtained by applying an obfuscation process under evaluation to any one of the evaluation program and the other programs, and acquiring a determination result; and
an evaluation step of calculating an evaluation value of the security of the obfuscation process based on the correct answer rate of a plurality of the determination results.

11. An evaluation program for causing a computer to function as the evaluation device according to any one of claims 1 to 9.