JPWO2016030927A1 - Process analysis apparatus, process analysis method, and process analysis program

Info

Publication number: JPWO2016030927A1
Application number: JP2016545089A
Authority: JP
Inventors: 匠山本; 鐘治桜井; 清人河内
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 2014-08-28
Filing date: 2014-08-28
Publication date: 2017-04-27
Anticipated expiration: 2034-08-28
Also published as: US20170337378A1; US10325094B2; JP6246377B2; WO2016030927A1; CN106664201A

Abstract

The present invention relates to a process analysis apparatus that analyzes a process executed on an information processing apparatus and extracts the cryptographic logic, such as encryption and decryption functions, used in that process. The process analysis apparatus includes: an execution trace acquisition unit that acquires an execution trace of the process to be analyzed; a block extraction unit that extracts, from the execution trace, blocks, each of which is a processing unit indicating a loop structure; a block information extraction unit that extracts, from each block, block information including input information and output information; and a block information analysis unit that uses the input information or the output information of the block information to generate feature determination information for determining the features of the block's input/output relationship, analyzes the input/output relationship of the block using this feature determination information, and determines a block that exhibits the input/output features of an encryption function or a decryption function to be cryptographic logic.

Description

The present invention relates to a process analysis apparatus that analyzes a process executed on an information processing apparatus and extracts the cryptographic logic, such as encryption and decryption functions, used in that process.

In recent years, "targeted attacks" known as Advanced Persistent Threats (APTs), which single out a specific organization and attack it persistently, have emerged as a new security threat. In an APT, a terminal of the targeted organization is infected with malware via e-mail, and the malware communicates with an external attacker's server to download new attack programs and to send out confidential information from the organization's systems. To detect such security incidents early and prevent the damage from spreading, Security Operation Center (SOC) services, which monitor the various logs of network devices and detect suspicious signs, are needed. Once an incident is found to have occurred, the organization must carry out incident response: investigating the cause and the damage, considering countermeasures, restoring services, and implementing measures to prevent recurrence. In addition, depending on its customers and business partners, the organization may need to make clear which confidential information was leaked and which was not.

When an organization investigates the cause of and damage from an incident, network forensics plays an important role: logs generated by PCs, servers, and network devices, together with packets recorded on the network, are analyzed to determine the malware's intrusion route, the infected terminals, the information accessed, the commands from the attacker, and the information sent outside. Recent malware, however, uses cryptography to conceal its communication. As a result, even when an organization performs network forensics, it is difficult to trace what commands the attacker sent and what information was transmitted outside.

To address this problem, the cryptographic logic and key that the malware used to conceal its communication must be found, and the encrypted communication decrypted. In general, this requires analyzing the binary of the malware program. Most conventional techniques for extracting cryptographic logic, such as the malware analysis system disclosed in Patent Document 1, identify the cryptographic logic and key by searching the execution trace obtained by running the malware for features commonly seen in cryptographic logic. The techniques disclosed in Non-Patent Documents 1 to 9 are also known as techniques for analyzing the binary of a malware program.

Patent Document 1: JP 2013-114637 A

Non-Patent Document 1: Noe Lutz, Towards Revealing Attacker's Intent by Automatically Decrypting Network Traffic, Master Thesis MA-2008-08.
Non-Patent Document 2: Zhi Wang, Xuxian Jiang, Weidong Cui, Xinyuan Wang and Mike Grace, ReFormat: automatic reverse engineering of encrypted messages, Proceedings of the 14th European Conference on Research in Computer Security.
Non-Patent Document 3: Felix Matenaar, Andre Wichmann, Felix Leder and Elmar Gerhards-Padilla, CIS: The Crypto Intelligence System for Automatic Detection and Localization of Cryptographic Functions in Current Malware, Proceedings of the 7th International Conference on Malicious and Unwanted Software (Malware 2012).
Non-Patent Document 4: Xin Li, Xinyuan Wang, Wentao Chang, CipherXRay: Exposing Cryptographic Operations and Transient Secrets from Monitored Binary Execution, IEEE Transactions on Dependable and Secure Computing (preprint), 2012.
Non-Patent Document 5: Felix Grobert, Carsten Willems, and Thorsten Holz, Automated Identification of Cryptographic Primitives in Binary Programs, Proceedings of the 14th International Conference on Recent Advances in Intrusion Detection.
Non-Patent Document 6: Joan Calvet, Jose M. Fernandez, Jean-Yves Marion, Aligot: Cryptographic Function Identification in Obfuscated Binary Programs, Proceedings of the 19th ACM Conference on Computer and Communications Security, CCS 2012.
Non-Patent Document 7: Intel, Pin - A Dynamic Binary Instrumentation Tool, https://software.intel.com/en-us/articles/pin-a-dynamic-binary-instrumentation-tool
Non-Patent Document 8: Bitblaze, TEMU: The BitBlaze Dynamic Analysis Component, http://bitblaze.cs.berkeley.edu/temu.html
Non-Patent Document 9: Jordi Tubella and Antonio Gonzalez, Control Speculation in Multithreaded Processors through Dynamic Loop Detection, In Proceedings of the Fourth International Symposium on High-Performance Computer Architecture, pp. 14-23, 1998.

In conventional techniques such as that of Patent Document 1, a large amount of unrelated logic is extracted as cryptographic logic candidates. The malware analyst must then exclude the unrelated logic by hand, which takes considerable time and effort. A high-accuracy cryptographic logic extraction technique that suppresses the extraction of unrelated logic is therefore needed.

The present invention has been made to solve the above problem. Its object is to identify the cryptographic logic used by malware with high accuracy by analyzing the execution trace of the malware on the basis of the features of the cryptographic logic used by malware that encrypts files and communications.

To solve the problem described above, the process analysis apparatus of the present invention includes: an execution trace acquisition unit that acquires an execution trace of the process to be analyzed; a block extraction unit that extracts, from the execution trace, blocks, each of which is a processing unit indicating a loop structure; a block information extraction unit that extracts, from each block, block information including input information and output information; and a block information analysis unit that uses the input information or the output information of the block information to generate feature determination information for determining the features of the block's input/output relationship, analyzes the input/output relationship of the block using this feature determination information, and determines a block that exhibits the input/output features of an encryption function or a decryption function to be cryptographic logic.

According to the present invention, feature determination information for determining the features of the input/output relationship of a block extracted from the execution trace is generated, the input/output relationship of the block is analyzed using this feature determination information, and a block that exhibits the input/output features of an encryption function or a decryption function is determined to be cryptographic logic. This has the effect that the cryptographic logic used by malware can be identified with high accuracy.

FIG. 1 is a configuration diagram showing a configuration example of the process analysis apparatus according to Embodiment 1.
FIG. 2 is a flowchart showing the flow of the process analysis processing of the process analysis apparatus according to Embodiment 1.
FIG. 3 is a diagram showing an example of the definition list 105.
FIG. 4 is a diagram explaining an example of the data format that stores input information and output information.
FIG. 5 is a diagram showing the features of the printable character strings contained in the input/output information of a cryptographic function.
FIG. 6 is a configuration diagram showing a configuration example of the process analysis apparatus according to Example 1.
FIG. 7 is a flowchart showing the flow of the character string ratio determination processing of the character string ratio determination unit 160 according to Example 1.
FIG. 8 is a diagram showing an example of the character code table stored in the character code table DB 162.
FIG. 9 is a diagram showing an example (part 1) of how malware uses a cryptographic function.
FIG. 10 is a configuration diagram showing a configuration example of the process analysis apparatus according to Example 2.
FIG. 11 is a flowchart showing the flow of the decode determination processing of the data decoding unit 170 according to Example 2.
FIG. 12 is a diagram showing an example (part 2) of how malware uses a cryptographic function.
FIG. 13 is a configuration diagram showing a configuration example of the process analysis apparatus according to Example 3.
FIG. 14 is a flowchart showing the flow of the data decompression determination processing of the data decompression unit 180 according to Example 3.
FIG. 15 is a diagram explaining the basic definition of encryption.
FIG. 16 is a configuration diagram showing a configuration example of the process analysis apparatus according to Example 4.
FIG. 17 is a flowchart showing the flow of the virtual execution determination processing of the virtual execution unit 190 according to Example 4.
FIG. 18 is a flowchart showing the flow (first half) of the virtual execution analysis processing of the virtual execution unit 190 according to Example 4.
FIG. 19 is a flowchart showing the flow (second half) of the virtual execution analysis processing of the virtual execution unit 190 according to Example 4.

Embodiment 1.
FIG. 1 is a configuration diagram showing a configuration example of the process analysis apparatus according to Embodiment 1.
In FIG. 1, the process analysis apparatus 100 includes an execution trace acquisition unit 110, a block extraction unit 120, a block information extraction unit 130, a block information analysis unit 140, and an analysis result output unit 150.

The process analysis apparatus 100 is a device for analyzing the binary of a malware program. It is configured, for example, as a computer in which a CPU (Central Processing Unit) is connected via a bus to hardware devices such as a ROM, a RAM, a communication board, a display, a keyboard, a mouse, and a magnetic disk device. The process analysis apparatus 100 also has a virtual machine running on the CPU, which provides an execution environment in which the malware program can be run.

The execution trace acquisition unit 110 runs the executable file 101 to be analyzed in the execution environment of the virtual machine and acquires an execution trace 102, which is log information on the processing executed, and process information 103, which is various information about the executed process.

The block extraction unit 120 extracts blocks, which are the basic building blocks of a program, from the execution trace 102 acquired by the execution trace acquisition unit 110, and outputs a block list 104, which is a list of the extracted blocks. The block extraction unit 120 also extracts, from each line of the execution trace 102, the information needed for the block information analysis processing described later, and outputs a definition list 105.

The block information extraction unit 130 extracts block information, including the input and output information of the processing executed in each block, from the execution trace 102, the block list 104, and the definition list 105, and outputs a block information list 106.

The block information analysis unit 140 uses the block information list 106 output by the block information extraction unit 130 to analyze whether each block under analysis is cryptographic logic, and outputs an analysis result list 107.

The analysis result output unit 150 outputs the contents of the analysis result list 107 produced by the block information analysis unit 140, for example to a display, so that they are shown to the analyst.

Next, the process analysis processing of the process analysis apparatus 100 is described with reference to the flowchart of FIG. 2.
FIG. 2 is a flowchart showing the flow of the process analysis processing of the process analysis apparatus according to Embodiment 1.

First, in step S110, the execution trace acquisition unit 110 runs the program of the executable file 101 to be analyzed in an analysis environment such as a virtual machine. The execution trace acquisition unit 110 monitors the process of the running program and records the execution trace 102 of that process. The execution trace 102 records, for example, the following information.
- The address of each executed instruction
- The executed instruction itself (opcode and operands)
- The registers accessed and their values
- The addresses of the memory accessed, their values, and the access mode (READ/WRITE)
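
As an illustration, one line of such an execution trace could be held in a record like the one below. This is only a minimal sketch: the class names `MemAccess` and `TraceRecord`, the field names, and the sample values are all assumptions made here and do not come from the patent text.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class MemAccess:
    address: int   # address of the accessed memory
    value: bytes   # value that was read or written
    mode: str      # "READ" or "WRITE"


@dataclass
class TraceRecord:
    line_no: int                     # position of this line in the trace
    insn_addr: int                   # address of the executed instruction
    opcode: str                      # e.g. "mov"
    operands: List[str]              # e.g. ["eax", "[ebx]"]
    registers: Dict[str, int] = field(default_factory=dict)  # accessed registers and values
    mem: List[MemAccess] = field(default_factory=list)       # accessed memory


# Hypothetical sample line: mov eax, [ebx]
rec = TraceRecord(
    line_no=1,
    insn_addr=0x401000,
    opcode="mov",
    operands=["eax", "[ebx]"],
    registers={"ebx": 0x500000, "eax": 0x41},
    mem=[MemAccess(0x500000, b"\x41", "READ")],
)
```

Keeping the access mode on every memory access matters later, because the input and output information of a block is derived from the order of READ and WRITE accesses.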

Methods for acquiring an execution trace include, for example, using a dynamic instrumentation tool such as Pin, described in Non-Patent Document 7, or using an emulator such as TEMU, described in Non-Patent Document 8. The execution trace acquisition unit 110 acquires the execution trace on the basis of these existing methods.

At the same time as it acquires the execution trace 102, the execution trace acquisition unit 110 extracts, as process information 103, information about the DLLs and functions loaded by the process from which the execution trace 102 was acquired. For example, the following information is recorded in the process information 103.
- The base address of the process
- The name, address, and size of each DLL loaded by the process
- The name and address of each API exported by those DLLs
A readily understood example of the process information 103 is the PE header of the process loaded in memory.
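
The process information listed above can be pictured as a small lookup structure that maps an address seen in the trace back to a DLL and API name. The sketch below is an assumption-laden illustration: `DllInfo`, `ProcessInfo`, and `resolve` are hypothetical names, and the DLL base, size, and export address are made-up values.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DllInfo:
    name: str
    base: int
    size: int
    exports: Dict[str, int] = field(default_factory=dict)  # API name -> address


@dataclass
class ProcessInfo:
    base_address: int
    dlls: List[DllInfo] = field(default_factory=list)

    def resolve(self, addr: int) -> Optional[str]:
        """Return 'dll!api' if addr is the entry point of an exported API."""
        for dll in self.dlls:
            if dll.base <= addr < dll.base + dll.size:
                for api, api_addr in dll.exports.items():
                    if api_addr == addr:
                        return f"{dll.name}!{api}"
        return None


# Made-up addresses, for illustration only.
info = ProcessInfo(
    base_address=0x400000,
    dlls=[DllInfo("kernel32.dll", 0x76000000, 0x100000,
                  {"CreateFileW": 0x76012340})],
)
```

With such a structure, a call target found in the execution trace can be labeled with the API it corresponds to, if any.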

Next, in step S120, the block extraction unit 120 extracts blocks, which are the basic building blocks of a program, from the execution trace 102. A block here is a function, a loop, a set of connected loops, or the like, and carries the following information needed to represent it.
- Block ID
- Block type
- Start address of the block
- End address of the block
- Instruction sequence in the block (obtained from the memory image of the process)
The block extraction unit 120 manages this information for the extracted blocks as the block list 104.

Each item of information representing a block is explained below.
The block ID is set to a value that is unique within the block list. The block type is set to the outermost logic that makes up the block (function, loop, or connected loop). The start address of the block indicates the address in the memory used by the process at which the block starts. The end address of the block indicates the address in the memory used by the process at which the block ends. The instruction sequence in the block is the sequence of instructions that lies between the start address and the end address in the memory used by the process.
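
A block entry with the fields described above might look like the following sketch. The names `Block`, `contains`, and `ids_are_unique` are hypothetical, and treating the end address as inclusive is an assumption made here, since the text does not say which convention is used.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Block:
    block_id: int            # unique within the block list
    kind: str                # "function", "loop", or "connected_loop"
    start: int               # start address in process memory
    end: int                 # end address (inclusive, by assumption)
    insns: Tuple[str, ...]   # instruction sequence between start and end

    def contains(self, addr: int) -> bool:
        """True if addr lies within the block's address range."""
        return self.start <= addr <= self.end


def ids_are_unique(blocks: List[Block]) -> bool:
    """Check the uniqueness requirement on block IDs."""
    ids = [b.block_id for b in blocks]
    return len(ids) == len(set(ids))


# Hypothetical block list with one function and one loop inside it.
block_list = [
    Block(1, "function", 0x401000, 0x40107F, ("push ebp", "mov ebp, esp")),
    Block(2, "loop", 0x401020, 0x401040, ("xor al, bl", "inc ecx")),
]
```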

The block extraction unit 120 identifies functions by tracking, on the execution trace 102, the correspondence between function call instructions such as call and return instructions such as ret. It identifies loops by tracking repeated instruction patterns and backward jumps on the execution trace, and identifies connected loops by tracking the input/output relationships between loops on the execution trace. For extracting the block list 104, the techniques disclosed in Non-Patent Documents 5, 6, and 9, for example, can be used.
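
The call/ret pairing and backward-jump tracking described here can be sketched as follows. This is a simplified illustration, not the method of the cited documents: the trace is assumed to be a list of (instruction address, mnemonic, address of the next executed instruction) tuples, and the function names are hypothetical. Detection of repeated instruction patterns and of connected loops is not covered.

```python
from typing import List, Tuple

# (instruction address, mnemonic, address of the next executed instruction)
Step = Tuple[int, str, int]


def find_functions(trace: List[Step]) -> List[Tuple[int, int]]:
    """Pair each call with its matching ret; return (entry address, ret address)."""
    stack: List[int] = []
    functions = []
    for addr, mnemonic, next_addr in trace:
        if mnemonic == "call":
            stack.append(next_addr)               # callee entry point
        elif mnemonic == "ret" and stack:
            functions.append((stack.pop(), addr))
    return functions


def find_backward_jumps(trace: List[Step]) -> List[Tuple[int, int]]:
    """Return (jump target, jump address) pairs: candidate loop heads and tails."""
    loops = set()
    for addr, mnemonic, next_addr in trace:
        if mnemonic.startswith("j") and next_addr < addr:
            loops.add((next_addr, addr))
    return sorted(loops)


trace = [
    (0x100, "call", 0x200),
    (0x200, "mov", 0x204),
    (0x204, "dec", 0x206),
    (0x206, "jne", 0x200),   # taken: backward jump to 0x200
    (0x200, "mov", 0x204),
    (0x204, "dec", 0x206),
    (0x206, "jne", 0x208),   # not taken: falls through
    (0x208, "ret", 0x105),
]
```

Here `find_functions(trace)` yields one function spanning 0x200 to 0x208, and `find_backward_jumps(trace)` yields one loop candidate with head 0x200 and tail 0x206.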

In step S120, the definition list 105 is also created as information needed in later steps.
FIG. 3 is a diagram showing an example of the definition list 105.
As shown in FIG. 3, the definition list 105 is a table in which the block extraction unit 120 records the following information as it reads the execution trace 102 line by line.
- The line number in the execution trace
- The address of the instruction executed on that line
- The storage area (register or memory) changed on that line
- The new value
- The size of the value
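
Building the definition list amounts to one pass over the trace that records every write. The sketch below assumes a simplified trace in which each line is given as (instruction address, list of writes); the `Definition` tuple, the `"mem:0x..."` naming of memory locations, and the function name are assumptions made for illustration.

```python
from typing import List, NamedTuple, Sequence, Tuple


class Definition(NamedTuple):
    line_no: int     # line number in the execution trace
    insn_addr: int   # address of the instruction executed on that line
    location: str    # changed storage area: register name or "mem:0x..."
    value: bytes     # new value
    size: int        # size of the value in bytes


TraceLine = Tuple[int, Sequence[Tuple[str, bytes]]]


def build_definition_list(trace: Sequence[TraceLine]) -> List[Definition]:
    """Record one entry per storage area written on each trace line."""
    definitions = []
    for line_no, (insn_addr, writes) in enumerate(trace, start=1):
        for location, value in writes:
            definitions.append(
                Definition(line_no, insn_addr, location, value, len(value)))
    return definitions


sample_trace = [
    (0x401000, [("eax", b"\x01\x00\x00\x00")]),
    (0x401002, []),                              # no storage area changed
    (0x401004, [("mem:0x500000", b"\x41")]),
]
defs = build_definition_list(sample_trace)
```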

Next, in step S130, the block information extraction unit 130 extracts block information from the execution trace 102, the block list 104, and the definition list 105, and outputs the block information list 106. Block information carrying the following items is registered as an element of the block information list 106.
- Block ID
- Input information
- Output information
- Context

Each item of information representing block information is explained below.
The block ID is information for associating the block information with a block registered in the block list 104.
Input information is information in the execution trace 102 that satisfies the following conditions.
- It is defined before the block is executed.
- During execution of the block, it is read before being overwritten.
Output information is information in the execution trace 102 that satisfies the following condition.
- For a given storage area (register or memory), it is the information last written to that storage area during execution of the block.
The context is information indicating at what point in the execution trace 102 the block is executed.

The extraction of the input information, the output information, and the context is described in detail below.

First, the input information is extracted as follows.
The block information extraction unit 130 analyzes the execution trace 102 line by line. Suppose that the instruction address on the trace line of interest falls within the range between the start address and the end address of some block B1 registered in the block list 104. Suppose further that, as the block information extraction unit 130 continues analyzing the execution trace 102, an instruction is executed at an address X within the range of block B1 and that instruction READs a certain storage area. The block information extraction unit 130 then analyzes the definition list 105 and checks whether that storage area has been WRITTEN by an instruction executed within the range of block B1 before the instruction at address X. If it has not, the storage area is taken as input information.
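
The rule above is a read-before-write check. A minimal sketch follows, assuming the accesses of one block execution are already available in trace order as (mode, location, value) tuples; the text consults the definition list 105 for the WRITE check, whereas this sketch simply tracks writes as it goes. The function name and tuple layout are hypothetical.

```python
from typing import Dict, Iterable, Tuple

Access = Tuple[str, str, int]   # (mode, storage area, value)


def extract_inputs(block_accesses: Iterable[Access]) -> Dict[str, int]:
    """Return {storage area: value} for areas READ before any WRITE in the block."""
    written = set()
    inputs: Dict[str, int] = {}
    for mode, location, value in block_accesses:
        if mode == "WRITE":
            written.add(location)
        elif mode == "READ" and location not in written and location not in inputs:
            inputs[location] = value     # value defined before the block ran
    return inputs


accesses = [
    ("READ", "mem:0x10", 0x41),    # not yet written in the block: input
    ("WRITE", "eax", 0x41),
    ("READ", "eax", 0x41),         # written earlier in the block: not input
    ("WRITE", "mem:0x10", 0x00),
    ("READ", "mem:0x10", 0x00),    # already recorded; the first value is kept
]
```

For the sample accesses, only `mem:0x10` with its original value 0x41 is reported as input information.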

Note that when adjacent memory areas are READ on the execution trace 102 by the same instruction at the same address, they are very likely being accessed as a buffer, so they are combined into a single piece of input information. That is, the input information consists of the start address of the memory area, its size, and the byte string stored in it. The type of the input information is also recorded as "buffer". For input information that is READ by an instruction at the same address but whose value counts up or down, the type of the input information is set to "counter". For input information used in a loop's termination condition, or used as the initial value of a counter, the type is set to "end condition".
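
The grouping of adjacent memory areas into a buffer can be sketched as below. The sketch assumes single-byte READ accesses given as (instruction address, memory address, byte value); the function name, the dictionary layout, and the `"scalar"` label for an isolated byte are assumptions made here, not terms from the patent.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def group_buffers(reads: List[Tuple[int, int, int]]) -> List[dict]:
    """Merge contiguous bytes read by the same instruction into one region."""
    by_insn: Dict[int, Dict[int, int]] = defaultdict(dict)
    for insn_addr, mem_addr, value in reads:
        by_insn[insn_addr].setdefault(mem_addr, value)   # keep first value seen
    buffers = []
    for cells in by_insn.values():
        addrs = sorted(cells)
        start = prev = addrs[0]
        for a in addrs[1:] + [None]:                     # None flushes the last run
            if a is not None and a == prev + 1:
                prev = a
                continue
            size = prev - start + 1
            buffers.append({
                "start": start,
                "size": size,
                "value": bytes(cells[x] for x in range(start, prev + 1)),
                "kind": "buffer" if size > 1 else "scalar",
            })
            if a is not None:
                start = prev = a
    return sorted(buffers, key=lambda b: b["start"])


reads = [
    (0x401000, 0x10, 0x48), (0x401000, 0x11, 0x69), (0x401000, 0x12, 0x21),
    (0x401008, 0x20, 0x05),
]
bufs = group_buffers(reads)
```

The three bytes read by the instruction at 0x401000 are merged into one 3-byte buffer starting at 0x10, while the single byte read by the other instruction stays on its own.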

Next, the output information is extracted as follows.
Suppose that the block information extraction unit 130 is analyzing the execution trace 102 of some block B1, and that, as it continues analyzing the execution trace 102, the trace leaves the range of block B1. At this point, the block information extraction unit 130 analyzes the definition list 105 and takes the information WRITTEN by instructions executed within the range of the block as output information. If the same storage area has been WRITTEN more than once, the entry with the largest line number in the execution trace 102, that is, the most recent one, is taken as the output information.
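
The "latest write wins" rule can be sketched as follows. The sketch assumes the definition list is reduced to (line number, storage area, value) tuples and that one block execution is delimited by a first and last trace line number; these simplifications and the function name are assumptions.

```python
from typing import Dict, Iterable, Tuple


def extract_outputs(definitions: Iterable[Tuple[int, str, int]],
                    first_line: int, last_line: int) -> Dict[str, int]:
    """Return the last value written to each storage area within the block."""
    latest: Dict[str, Tuple[int, int]] = {}
    for line_no, location, value in definitions:
        if first_line <= line_no <= last_line:
            prev = latest.get(location)
            if prev is None or line_no >= prev[0]:   # larger line number is newer
                latest[location] = (line_no, value)
    return {loc: val for loc, (_, val) in latest.items()}


definitions = [
    (1, "eax", 1),           # before the block
    (5, "eax", 2),
    (6, "mem:0x10", 0x41),
    (9, "eax", 3),           # overwrites the value from line 5
    (12, "eax", 99),         # after the block
]
```

For a block spanning trace lines 5 to 9, the result is `eax = 3` and `mem:0x10 = 0x41`.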

As with input information, when adjacent memory areas are WRITTEN on the execution trace 102 by the same instruction at the same address, they are very likely being accessed as a buffer, so they are combined into a single piece of output information. That is, the output information consists of the start address of the memory area, its size, and the byte string stored in it. The type of the output information is also recorded as "buffer". For output information that is WRITTEN by an instruction at the same address but whose value counts up or down, the type of the output information is set to "counter". For output information used in a loop's termination condition, or used as the initial value of a counter, the type is set to "end condition".
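
The "counter" and "end condition" types could be assigned along the lines of the sketch below. It rests on several assumptions that the text does not settle: a counter is taken to be any sequence of values with a constant non-zero step, the caller is assumed to know already whether the value feeds a loop's termination condition, and that flag takes priority over the counter check. The function name and the `"unclassified"` label are hypothetical.

```python
from typing import Sequence


def classify_io_kind(values: Sequence[int], used_in_loop_exit: bool = False) -> str:
    """values: successive values seen at one storage area by the same instruction."""
    if used_in_loop_exit:
        return "end condition"
    steps = {b - a for a, b in zip(values, values[1:])}
    if len(steps) == 1 and 0 not in steps:
        return "counter"          # counting up or down by a constant step
    return "unclassified"
```

For example, `classify_io_kind([0, 1, 2, 3])` and `classify_io_kind([10, 9, 8])` both return `"counter"`, while a constant or irregular sequence does not.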

Next, the data format that stores the input information and the output information is described.
FIG. 4 is a diagram explaining an example of the data format that stores input information and output information.
In FIG. 4, each entry of input information or output information stores the storage area (start address), the value (byte string), the size (in bytes), and the type of the information.

Next, the context is extracted as follows.
The context expresses the calling relationship (nesting relationship) of blocks. For example, let B1, B2, B3, B4, B5, B6, B7, and B8 be blocks. Suppose that B2 is executed after B1 finishes, that B3 and B4 are executed within B2, and that B5 is executed after B2 finishes. Suppose further that B6 is executed within B5 and B7 is executed within B6, and that B8 is executed after B7 and B6 finish.

For the calling relationship described above, the contexts of B1, B2, B5, and B8 are expressed as 1, 2, 3, and 4, respectively. The contexts of B3 and B4, which are executed within B2, are expressed as 2.1 and 2.2, respectively. Similarly, the context of B6, executed within B5, is expressed as 3.1, and the context of B7, executed within B6, is expressed as 3.1.1. Expressing the context in this way makes it possible to distinguish calls of the same block (the same block ID) according to where they are called from.
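
The dotted numbering can be generated with a stack of per-depth counters. The sketch below assumes that block boundaries are already known and supplied as ("enter", block ID) and ("exit", block ID) events in trace order; the event format and the function name are assumptions for illustration.

```python
from typing import List, Tuple


def label_contexts(events: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Return (block ID, context) pairs such as ("B3", "2.1")."""
    counters = [0]    # counters[d]: blocks started so far at nesting depth d
    path: List[int] = []   # context components of the currently open blocks
    labels = []
    for kind, block_id in events:
        if kind == "enter":
            counters[-1] += 1
            path.append(counters[-1])
            labels.append((block_id, ".".join(map(str, path))))
            counters.append(0)    # fresh counter for blocks nested inside
        else:                     # "exit"
            counters.pop()
            path.pop()
    return labels


events = [
    ("enter", "B1"), ("exit", "B1"),
    ("enter", "B2"), ("enter", "B3"), ("exit", "B3"),
    ("enter", "B4"), ("exit", "B4"), ("exit", "B2"),
    ("enter", "B5"), ("enter", "B6"), ("enter", "B7"),
    ("exit", "B7"), ("exit", "B6"), ("exit", "B5"),
    ("enter", "B8"), ("exit", "B8"),
]
contexts = dict(label_contexts(events))
```

Applied to the B1 to B8 example, this reproduces the contexts given in the text: 1, 2, 2.1, 2.2, 3, 3.1, 3.1.1, and 4.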

Note that any representation may be used for contexts as long as it can express the calling relationship (nesting relationship) of blocks.

While analyzing the execution trace 102, the block information extraction unit 130 determines whether a block has ended or whether a new block has been called from within the block, as follows.
・If, while the execution trace 102 of a block is being analyzed, control jumps outside the range of the block by an instruction other than a function call (for example, call or enter), such as jmp, jne, or ret, the block has ended.
・If, while the execution trace 102 of a block is being analyzed, control jumps outside the range of the block by a function call instruction (for example, call or enter), the block has not ended; a new block is regarded as having been called from within the block.
・If, while the execution trace 102 of a block is being analyzed, the execution trace 102 goes beyond the range of the block in any case other than the above, the block has ended.
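The three rules above can be condensed into a single decision function. This is a minimal sketch: the function name, the half-open address range, and the returned labels are assumptions made for the example, and the set of call mnemonics simply follows the examples given in the text.

```python
# Mnemonics the text treats as function calls.
CALL_MNEMONICS = {"call", "enter"}


def classify_step(mnemonic, next_addr, block_start, block_end):
    """Decide what one traced instruction means for the current block.

    The block is assumed to cover addresses block_start <= a < block_end.
    Returns "continue" (still inside the block), "nested" (a new block
    was called from within the block), or "end" (the block has ended).
    """
    if block_start <= next_addr < block_end:
        return "continue"
    if mnemonic in CALL_MNEMONICS:
        # Leaving the range through a call does not end the block.
        return "nested"
    # jmp, jne, ret, or any other way of leaving the range ends the block.
    return "end"
```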

The description now returns to the flowchart of FIG. 2.
In step S140, the block information analysis unit 140 analyzes the block information in the block information list 106 and identifies the cryptographic logic. In this analysis, the block information analysis unit 140 uses the input information or output information of the block information to generate feature determination information for determining the characteristics of a block's input/output relationship, analyzes the input/output relationship of the block using this feature determination information, and determines that a block exhibiting the input/output characteristics of an encryption function or a decryption function is cryptographic logic. The feature determination information generated by the block information analysis unit 140, and the methods of determining cryptographic logic using it, are described in detail in Examples 1 to 4 below. As the result of analyzing the block information, the block information analysis unit 140 outputs the analysis result list 107, which contains the cryptographic logic determination results.

Finally, in step S150, the analysis result output unit 150 organizes the analysis results based on the block list 104, the block information list 106, and the analysis result list 107, and outputs the following information.
・Start address of the encryption logic
・Input information (storage area, value, size)
・Output information (storage area, value, size)
・Start address of the decryption logic

As described above, the invention of the first embodiment generates feature determination information for determining the characteristics of the input/output relationship of blocks extracted from the execution trace, analyzes the input/output relationship of the blocks using this feature determination information, and determines that a block exhibiting the input/output characteristics of an encryption function or a decryption function is cryptographic logic. This has the effect that the cryptographic logic used by malware can be identified with high accuracy.

Next, Examples 1 to 4, which concretely implement the first embodiment, are described below.

Example 1.
The input to an encryption function is usually plaintext, whereas its output is a random-looking byte string. Consequently, the input of an encryption function tends to contain a high proportion of printable character strings, and the output a low proportion.
FIG. 5 is a diagram showing the characteristics of the printable character strings contained in the input/output information of an encryption function.
In FIG. 5, the input "こんにちは" ("Hello") is a printable character string. In the output, "唖" is a printable character and "・" denotes a non-printable character. Example 1 describes identifying cryptographic logic by using this characteristic of the printable character strings in the input/output information of an encryption function as the feature determination information.

FIG. 6 is a configuration diagram showing a configuration example of the process analysis apparatus according to Example 1.
In FIG. 6, the block information analysis unit 140 includes a character string ratio determination unit 160, a character code determination algorithm database 161 (hereinafter, "database" is abbreviated as DB), and a character code table DB 162. The block information analysis unit 140 receives the block information list 106, performs character string ratio determination processing, and outputs the analysis result list 107. The character string ratio determination unit 160 determines the printable character string ratios for the block information received from the block information analysis unit 140, identifies cryptographic logic, and outputs a cryptographic logic list 1 (163) containing the identified cryptographic logic to the block information analysis unit 140.

Note that the character string ratio determination unit 160, the character code determination algorithm DB 161, and the character code table DB 162 may be provided inside the block information analysis unit 140.

Next, the flow of the character string ratio determination processing of Example 1 is described with reference to FIG. 7.
FIG. 7 is a flowchart showing the flow of the character string ratio determination processing performed by the character string ratio determination unit 160 according to Example 1.

First, in step S1601, the character string ratio determination unit 160 initializes the cryptographic logic list 1 (163). The cryptographic logic list 1 (163) is a block information list 106 that stores the block information that the character string ratio determination unit 160 has determined to be cryptographic logic candidates.

Next, in step S1602, the character string ratio determination unit 160 checks whether the block information list 106 has a next element (block information). If there is no next element, the processing follows the No branch and ends. If there is a next element, the next element Bi is selected in step S1603.

Next, in step S1604, the character string ratio determination unit 160 determines the ratio of printable character strings in the input information of the block information. Here, a printable character string is a string of printable characters consisting of a chain of c characters that ends with a line feed or carriage return character, or with a null character. For input information whose type is "buffer", the character string ratio determination unit 160 performs character code determination on the byte string stored as its value, using the algorithms registered in the character code determination algorithm DB 161. Once the character code has been determined, the corresponding character code table is obtained from the character code table DB 162, and the printable characters can be identified.
FIG. 8 is a diagram showing an example of a character code table stored in the character code table DB 162.
FIG. 8 shows an example in which the characters "あ" through "ぱ" are stored in association with their character codes.

The character string ratio determination unit 160 calculates the printable character string ratio by dividing the sum of the lengths of the printable character strings obtained from the byte string of the input information by the length of that byte string. When computing string length, each multibyte character is counted as 2 bytes.
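The ratio calculation can be sketched for the simplest case. The sketch below makes several assumptions that are not in the specification: it handles single-byte ASCII only (no character code determination and no multibyte characters), it treats c as a minimum run length, and the names `printable_ratio`, `PRINTABLE`, and `TERMINATORS` are hypothetical.

```python
import string

# Printable ASCII bytes, excluding the bytes treated as line terminators.
PRINTABLE = set(string.printable.encode("ascii")) - set(b"\r\n\x0b\x0c")
# A run counts only if it ends with CR, LF, or a null byte.
TERMINATORS = set(b"\r\n\x00")


def printable_ratio(data: bytes, c: int = 4) -> float:
    """Sum of qualifying printable-run lengths divided by len(data)."""
    if not data:
        return 0.0
    total = 0
    run = 0
    for byte in data:
        if byte in PRINTABLE:
            run += 1
            continue
        # The run just ended; count it only if it is properly terminated
        # and at least c characters long (c read as a minimum, by assumption).
        if byte in TERMINATORS and run >= c:
            total += run
        run = 0
    return total / len(data)
```

Under these assumptions, a null-terminated ASCII message scores close to 1, while typical ciphertext scores close to 0, which is the contrast Example 1 relies on.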

Next, in step S1605, the character string ratio determination unit 160 determines the ratio of printable character strings in the output information of the block information. The procedure is the same as in step S1604.

Next, in step S1606, the character string ratio determination unit 160 calculates the difference between the two ratios as the printable character string ratio of the input minus the printable character string ratio of the output.

Next, in step S1607, if the difference calculated in step S1606 is greater than or equal to a threshold θ, the character string ratio determination unit 160 adds the block information Bi to the cryptographic logic list 1 (163). Note that c and θ above are adjustable parameters.
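Steps S1606 and S1607 amount to a simple filter over the blocks. The following sketch assumes the two ratios have already been computed for each block; the tuple layout, the function name, and the default value of θ are illustrative assumptions, not values given in the specification.

```python
def select_candidates(blocks, theta=0.5):
    """Return the IDs of blocks whose ratio difference reaches theta.

    blocks: iterable of (block_id, input_ratio, output_ratio), where the
    ratios are printable character string ratios in the range 0.0 to 1.0.
    theta: adjustable threshold (the default here is an arbitrary choice).
    """
    candidates = []
    for block_id, input_ratio, output_ratio in blocks:
        difference = input_ratio - output_ratio  # step S1606
        if difference >= theta:                  # step S1607
            candidates.append(block_id)
    return candidates
```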

In step S1604, it is also possible to inspect the file type of the input information. If the input is a file in a known format such as Word or PDF, the text is extracted according to that format, and the printable character string ratio is calculated only for the extracted text. In this way, the printable character string ratio of the input information can be calculated appropriately even when the input is a file type, such as Word or PDF, that encodes text in a special format. Existing tools can be used for the file type inspection.

Example 2.
Malicious programs such as malware sometimes encode encrypted data into printable data (for example, by Base64 encoding) before sending it over the Internet.
FIG. 9 is a diagram showing an example (part 1) of how malware uses an encryption function.
FIG. 9 shows an example in which malware encrypts a message with an encryption function, Base64-encodes the resulting ciphertext, and then sends it over the Internet by HTTP. Accordingly, the output information of a given block is decoded with a known decoder (for example, a Base64 decoder), and if decoding succeeds, a block whose output information equals the decoded value can be treated as a cryptographic logic candidate. Example 2 describes identifying cryptographic logic by using this encoding characteristic of the input/output information of an encryption function as the feature determination information.

FIG. 10 is a configuration diagram showing a configuration example of the process analysis apparatus according to Example 2.
In FIG. 10, the block information analysis unit 140 includes a data decoding unit 170 and an encoding/decoding algorithm DB 171. The block information analysis unit 140 receives the block information list 106, performs decoding determination processing, and outputs the analysis result list 107. The data decoding unit 170 decodes the output information of the block information received from the block information analysis unit 140. If decoding succeeds, it treats a block whose output information equals the decoded value as a cryptographic logic candidate, and outputs a cryptographic logic list 2 (172) containing the candidates to the block information analysis unit 140.

Note that the data decoding unit 170 and the encoding/decoding algorithm DB 171 may be provided inside the block information analysis unit 140.

Next, the flow of the decoding determination processing of Example 2 is described with reference to FIG. 11.
FIG. 11 is a flowchart showing the flow of the decoding determination processing performed by the data decoding unit 170 according to Example 2.

First, in step S1701, the data decoding unit 170 initializes the cryptographic logic list 2 (172). The cryptographic logic list 2 (172) is a block information list 106 that stores the block information that the data decoding unit 170 has determined to be cryptographic logic candidates.

Next, in step S1702, the data decoding unit 170 checks whether the block information list 106 has a next element (block information). If there is no next element, the processing follows the No branch and ends. If there is a next element, the next element Bi is selected in step S1703.

Next, in step S1704, the data decoding unit 170 applies known decoding algorithms to decode the output information of the block information. The known decoding algorithms are stored in the encoding/decoding algorithm DB 171. The data decoding unit 170 decodes only output information whose type is "buffer".

Next, in step S1705, the data decoding unit 170 determines whether decoding succeeded. If decoding succeeded with any of the decoding algorithms stored in the encoding/decoding algorithm DB 171, the data decoding unit 170 retains the decoding result and follows the Yes branch to step S1707.

Next, in step S1707, the data decoding unit 170 searches the block information list 106 for a block whose output information matches the decoding result retained in step S1705. To make the processing more efficient, the search may be limited to blocks whose context is older than that of block Bi. If the search of the block information list 106 finds a block Bj (i ≠ j) whose output information matches the decoding result, the data decoding unit 170 adds the block information of block Bj to the cryptographic logic list 2 (172) in step S1708.
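The decode-and-match loop of steps S1704 to S1708 can be sketched with Base64 as the only registered decoder. The block representation (a block ID paired with a list of output buffers) and the function names are assumptions made for the example; strict validation is used so that arbitrary binary data is not accepted as Base64.

```python
import base64
import binascii


def try_base64(data: bytes):
    """Return the decoded bytes, or None if data is not valid Base64."""
    try:
        return base64.b64decode(data, validate=True)
    except (binascii.Error, ValueError):
        return None


def find_encoded_cipher_candidates(blocks):
    """Find blocks whose output reappears Base64-encoded in another block.

    blocks: list of (block_id, [output_buffer_bytes, ...]) in trace order.
    Returns the IDs of blocks Bj whose output equals the decoded output
    of some other block Bi.
    """
    candidates = []
    for i, (_, outputs_i) in enumerate(blocks):
        for out in outputs_i:
            decoded = try_base64(out)
            if not decoded:
                continue  # decoding failed or produced nothing
            for j, (block_j, outputs_j) in enumerate(blocks):
                if j != i and decoded in outputs_j and block_j not in candidates:
                    candidates.append(block_j)
    return candidates
```

For example, if block "enc" outputs raw ciphertext and block "b64" outputs the Base64 form of the same bytes, the function returns `["enc"]`.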

Example 3.
Malicious programs such as malware sometimes compress data before encrypting it.
FIG. 12 is a diagram showing an example (part 2) of how malware uses an encryption function.
FIG. 12 shows an example in which malware compresses a message with a compression function before passing it to the encryption function, encrypts the compressed data, and sends the resulting ciphertext over the Internet by HTTP. Accordingly, decompression of the input information of a given block is attempted with known decompression algorithms (for example, zip or lzh), and if decompression succeeds, that block can be treated as a cryptographic logic candidate. Example 3 describes identifying cryptographic logic by using this compression characteristic of the input/output information of an encryption function as the feature determination information.

FIG. 13 is a configuration diagram showing a configuration example of the process analysis apparatus according to Example 3.
In FIG. 13, the block information analysis unit 140 includes a data decompression unit 180 and a compression/decompression algorithm DB 181. The block information analysis unit 140 receives the block information list 106, performs data decompression determination processing, and outputs the analysis result list 107. The data decompression unit 180 attempts to decompress the input information of the block information received from the block information analysis unit 140. If decompression succeeds, it treats that block as a cryptographic logic candidate, and outputs a cryptographic logic list 3 (182) containing the candidates to the block information analysis unit 140.

Note that the data decompression unit 180 and the compression/decompression algorithm DB 181 may be provided inside the block information analysis unit 140.

Next, the flow of the data decompression determination processing of Example 3 is described with reference to FIG. 14.
FIG. 14 is a flowchart showing the flow of the data decompression determination processing performed by the data decompression unit 180 according to Example 3.

First, in step S1801, the data decompression unit 180 initializes the cryptographic logic list 3 (182). The cryptographic logic list 3 (182) is a block information list 106 that stores the block information that the data decompression unit 180 has determined to be cryptographic logic candidates.

Next, in step S1802, the data decompression unit 180 checks whether the block information list 106 has a next element (block information). If there is no next element, the processing follows the No branch and ends. If there is a next element, the next element Bi is selected in step S1803.

Next, in step S1804, the data decompression unit 180 applies known decompression algorithms to decompress the input information of the block information. The known decompression algorithms are stored in the compression/decompression algorithm DB 181. The data decompression unit 180 decompresses only input information whose type is "buffer".

Next, in step S1805, the data decompression unit 180 determines whether decompression succeeded. If decompression succeeded with any of the decompression algorithms stored in the compression/decompression algorithm DB 181, the processing follows the Yes branch to step S1806.

Next, in step S1806, the data decompression unit 180 adds the block information of block Bi to the cryptographic logic list 3 (182).
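The decompression attempt of steps S1804 and S1805 can be sketched as a loop over registered decompressors. The text names zip and lzh as examples; the sketch below substitutes the zlib, bz2, and lzma modules of the Python standard library purely as stand-ins, and the names `DECOMPRESSORS` and `try_decompress` are hypothetical.

```python
import bz2
import lzma
import zlib

# Stand-in for the compression/decompression algorithm DB (an assumption).
DECOMPRESSORS = {
    "zlib": zlib.decompress,
    "bz2": bz2.decompress,
    "lzma": lzma.decompress,
}


def try_decompress(data: bytes):
    """Try each registered algorithm on an input buffer.

    Returns (algorithm_name, decompressed_bytes) for the first algorithm
    that succeeds, or None if every algorithm rejects the data.
    """
    if not data:
        return None  # nothing to decompress
    for name, decompress in DECOMPRESSORS.items():
        try:
            return name, decompress(data)
        except (zlib.error, lzma.LZMAError, OSError, ValueError, EOFError):
            continue  # not this format; try the next algorithm
    return None
```

A block whose "buffer" input yields a non-None result would be added to the candidate list, as in step S1806.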

Example 4.
It follows directly from the basic definition of encryption that when the ciphertext obtained by encrypting a message (plaintext) with a key is decrypted with the same key, the original message is recovered. That is, letting Enc, Dec, k, and m denote the encryption function, the decryption function, the key, and the plaintext, respectively, m = Dec(k, Enc(k, m)) holds.
FIG. 15 is a diagram explaining the basic definition of encryption.
FIG. 15 shows that when the plaintext "こんにちは" ("Hello") is encrypted with a key to obtain the ciphertext "・・・唖・", decrypting that ciphertext with the same key yields the original plaintext "こんにちは".
According to this basic definition, if part of the output of a block f (assumed to be ciphertext) is supplied as part of the input of another block g, g is executed, and the output of g matches the input of f (assumed to be plaintext), then f is likely an encryption function and g a decryption function. Accordingly, a pair of blocks is selected, and if the processing based on the basic definition above is applied to the input information and output information of those blocks, the pair can be treated as cryptographic logic candidates. Example 4 describes identifying cryptographic logic by using this characteristic of the input/output information as the feature determination information and finding pairs of blocks for which the relationship of the basic definition holds.
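The relationship m = Dec(k, Enc(k, m)) can be checked mechanically once two candidates are available as callable functions. The toy cipher below (adding and subtracting key bytes modulo 256) is not from the specification and is not secure; it only stands in for a pair of blocks f and g, and all names are hypothetical.

```python
def add_key(key: bytes, data: bytes) -> bytes:
    """Toy 'encryption': add the repeating key to each byte, modulo 256."""
    return bytes((b + key[i % len(key)]) % 256 for i, b in enumerate(data))


def sub_key(key: bytes, data: bytes) -> bytes:
    """Toy 'decryption': subtract the repeating key, modulo 256."""
    return bytes((b - key[i % len(key)]) % 256 for i, b in enumerate(data))


def round_trips(enc, dec, key: bytes, message: bytes) -> bool:
    """Check the basic definition: dec(key, enc(key, message)) == message."""
    return dec(key, enc(key, message)) == message
```

Note that the check is symmetric for a toy pair like this one (subtracting and then adding also recovers the message), so in practice the observed input and output roles of the blocks are what indicate which of the two is the encryption side.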

FIG. 16 is a configuration diagram showing a configuration example of the process analysis apparatus according to Example 4.
In FIG. 16, the block information analysis unit 140 includes a virtual execution unit 190.
The block information analysis unit 140 receives the block information list 106, performs virtual execution determination processing, and outputs the analysis result list 107. Using the input information and output information of the block information received from the block information analysis unit 140, the virtual execution unit 190 performs virtual execution on pairs of blocks based on the basic definition of encryption. If the virtual execution succeeds, it treats the pair of blocks as a candidate encryption/decryption function pair, and outputs an encryption/decryption function pair list 191 containing the candidate pairs to the block information analysis unit 140.

Note that the virtual execution unit 190 may be provided inside the block information analysis unit 140.

Next, the flow of the virtual execution determination processing of Example 4 is described with reference to FIG. 17.
FIG. 17 is a flowchart showing the flow of the virtual execution determination processing performed by the virtual execution unit 190 according to Example 4.

First, in step S1901, the virtual execution unit 190 merges the cryptographic logic candidates that have already been extracted to create a cryptographic logic list 4. As the already extracted candidates, for example, the cryptographic logic lists 1 to 3 containing the candidates determined in Examples 1 to 3 are used. When the cryptographic logic list 4 is created, duplicate candidates are merged into one.

Next, in step S1902, the virtual execution unit 190 initializes the analysis result list 107. The analysis result list 107 is a list of pairs of block information that the virtual execution unit 190 has determined to be pairs of encryption logic and decryption logic.

Next, in step S1903, the virtual execution unit 190 checks whether the cryptographic logic list 4 has a next element (block information). If there is no next element, the processing follows the No branch and ends. If there is a next element, the next element Bi is selected in step S1904.

Next, in step S1905, the virtual execution unit 190 uses the output information of the element Bi to perform virtual execution analysis based on the basic definition of encryption. The details of the virtual execution analysis are described later.

Next, in step S1906, the virtual execution unit 190 determines whether the result of the virtual execution analysis is Null. If the result is not Null, the processing follows the No branch to step S1907.

Next, in step S1907, the virtual execution unit 190 registers the result of the virtual execution analysis in the analysis result list 107.

Next, the flow of the virtual execution analysis in step S1905 is described in detail with reference to FIGS. 18 and 19.
FIG. 18 is a flowchart showing the flow (first half) of the virtual execution analysis performed by the virtual execution unit 190 according to Example 4.
FIG. 19 is a flowchart showing the flow (second half) of the virtual execution analysis performed by the virtual execution unit 190 according to Example 4.

In the virtual execution analysis of step S1905, the virtual execution unit 190 is given the block information Bi and the executable file to be analyzed as arguments.

First, in step S201, the virtual execution unit 190 initializes the encryption/decryption function pair list.
Next, in step S202, the virtual execution unit 190 runs the executable file 101 to be analyzed in a virtual environment to start a process, and suspends the process after a fixed period has elapsed.
Next, in step S203, the virtual execution unit 190 creates a snapshot of the process. This is called snapshot 1.

Next, in step S204, the virtual execution unit 190 checks whether the block information list 106 has a next element (block information). If there is no next element, the processing follows the No branch, returns the encryption/decryption function pair list in step S222, and ends. If there is a next element, the next element Bj is selected in step S205. It is also possible to check the contexts of blocks Bi and Bj and select only a block Bj that is not in a nesting relationship with Bi.

Next, in step S206, the virtual execution unit 190 restores snapshot 1 of the process.
Next, in step S207, the virtual execution unit 190 injects the instruction sequence constituting block Bj into the process. Specifically, the element corresponding to the block ID of the block information Bj is retrieved from the block list 104, and the instruction sequence and start address of the block are acquired. The instruction sequence is then injected into the process at that start address.
Next, in step S208, the virtual execution unit 190 creates a snapshot of the process. This is called snapshot 2.
Next, in step S209, the virtual execution unit 190 acquires the input information of Bj.

Next, in step S210, the virtual execution unit 190 generates an input snapshot Iss from the input information of Bj and the output information of Bi. An input snapshot is the information that serves as the input of the block to be executed. The input snapshot Iss is generated as follows. Block Bi is assumed to have n pieces of output information, denoted O = {O1, ..., On}. Block Bj is assumed to have m pieces of input information, denoted I = {I1, ..., Im}. For Oi ∈ O, Iss is the input information obtained by replacing the j-th element of I with Oi. Only pieces of information whose input/output information type is "buffer" are exchanged with each other. It is also possible to preferentially select and exchange pieces of information whose sizes are close. The replacement is applied to both the value and the size, and it is performed so that the same Iss is never generated twice.
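The generation of Iss in step S210 can be illustrated with a minimal sketch. The record type `IOInfo`, its fields, and the function `generate_input_snapshots` are hypothetical names introduced only for this illustration; the sketch assumes that each piece of input/output information carries a type, a storage location, a value, and a size, and it applies the optional preference for buffers of similar size.

```python
from dataclasses import dataclass
from typing import Iterator, Sequence, Tuple


@dataclass(frozen=True)
class IOInfo:
    """Hypothetical record for one piece of input/output information."""
    kind: str      # type of the information, e.g. "buffer" or "register"
    location: str  # where the value lives (address or register name)
    value: bytes
    size: int


def generate_input_snapshots(
    outputs_bi: Sequence[IOInfo], inputs_bj: Sequence[IOInfo]
) -> Iterator[Tuple[IOInfo, ...]]:
    """Yield each distinct Iss: the inputs of Bj with one buffer replaced by an output of Bi."""
    # Only buffer-to-buffer exchanges are considered; close sizes are tried first.
    candidates = [
        (abs(o.size - i.size), oi, ij)
        for oi, o in enumerate(outputs_bi)
        for ij, i in enumerate(inputs_bj)
        if o.kind == "buffer" and i.kind == "buffer"
    ]
    candidates.sort()
    seen = set()
    for _, oi, ij in candidates:
        src, dst = outputs_bi[oi], inputs_bj[ij]
        iss = list(inputs_bj)
        # Replace value and size; the input keeps its own storage location.
        iss[ij] = IOInfo("buffer", dst.location, src.value, src.size)
        key = tuple(iss)
        if key not in seen:  # the same Iss is never produced twice
            seen.add(key)
            yield key
```

When the generator is exhausted, no new Iss remains, which corresponds to the No branch of step S211.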

Next, in step S211, the virtual execution unit 190 determines whether a new Iss has been generated. If no new Iss has been generated, the process proceeds to the No branch and step S204 is executed. If a new Iss has been generated, the process proceeds to step S212, where the virtual execution unit 190 restores snapshot 2 of the process, and in step S213 it reflects Iss in the process. To reflect Iss, the values of all pieces of input information in Iss are set in the appropriate storage areas (registers, memory).

Next, in step S214, the virtual execution unit 190 sets the Instruction Register to the head address of the injected instruction sequence, and in step S215 it resumes the process.

Next, in step S216, the virtual execution unit 190 monitors the execution address of the process and checks whether the execution address goes outside the range of block Bj.
Next, in step S217, the virtual execution unit 190 determines whether execution of the block has finished. When the monitored execution address goes outside the range of block Bj, the virtual execution unit 190 determines that execution of block Bj has finished and, in step S218, suspends the process.
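The end-of-block determination of steps S216 to S218 can be sketched as follows. This is only an illustration under stated assumptions: `step` is a hypothetical callback that executes one instruction and returns the next execution address, the block is assumed to occupy the half-open address range from `block_start` to `block_end`, and the `max_steps` guard is an addition of this sketch that is not described in the embodiment.

```python
from typing import Callable, Tuple


def block_finished(address: int, block_start: int, block_end: int) -> bool:
    """True when the execution address lies outside the block range [start, end)."""
    return not (block_start <= address < block_end)


def run_until_block_exit(
    step: Callable[[int], int],
    address: int,
    block_start: int,
    block_end: int,
    max_steps: int = 1_000_000,
) -> Tuple[int, int]:
    """Single-step from `address` until execution leaves the block.

    Returns the first address outside the block and the number of steps taken.
    """
    steps = 0
    while not block_finished(address, block_start, block_end):
        if steps >= max_steps:  # guard added for this sketch only
            raise RuntimeError("block did not finish within max_steps")
        address = step(address)  # hypothetical: execute one instruction
        steps += 1
    return address, steps
```

For example, with a `step` that simply advances the address by 4 and a block covering 0x100 to 0x110, the loop leaves the block at address 0x110 after four steps, at which point the process would be suspended.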

Next, in step S219, the virtual execution unit 190 compares the output information of the executed block Bj with the input information of block Bi. The output information obtained by executing block Bj is extracted from the memory of the suspended process, based on the start address of the output information of block Bj.

Next, in step S220, the virtual execution unit 190 determines whether the output information of block Bj matches the input information of block Bi. If they match, the process proceeds via the Yes branch to step S221, where block Bi and block Bj are registered in the encryption/decryption function pair list as a pair of encryption logic and decryption logic.
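The pairing check of steps S219 to S221 can be illustrated with a simplified sketch. Here each block is modeled as a plain Python callable standing in for the injected instruction sequence, and snapshots, injection, and suspension are omitted; the names `BlockInfo` and `find_function_pairs` and the toy shift cipher are assumptions made only for this example.

```python
from typing import Callable, List, NamedTuple, Tuple


class BlockInfo(NamedTuple):
    """Hypothetical, simplified block information."""
    block_id: str
    func: Callable[[bytes], bytes]  # stands in for the block's instruction sequence
    recorded_input: bytes           # input observed in the execution trace
    recorded_output: bytes          # output observed in the execution trace


def find_function_pairs(bi: BlockInfo, blocks: List[BlockInfo]) -> List[Tuple[str, str]]:
    """Feed the output of Bi to each Bj; register (Bi, Bj) when Bj reproduces Bi's input."""
    pairs = []
    for bj in blocks:
        result = bj.func(bi.recorded_output)  # execute Bj on Bi's output
        if result == bi.recorded_input:       # comparison of steps S219 and S220
            pairs.append((bi.block_id, bj.block_id))  # registration of step S221
    return pairs


def shift(data: bytes, k: int) -> bytes:
    """Toy cipher for the example: add k to every byte modulo 256."""
    return bytes((b + k) % 256 for b in data)


plain = b"secret"
enc = BlockInfo("enc", lambda d: shift(d, 3), plain, shift(plain, 3))
dec = BlockInfo("dec", lambda d: shift(d, -3), shift(plain, 3), plain)
rev = BlockInfo("rev", lambda d: d[::-1], plain, plain[::-1])
pairs = find_function_pairs(enc, [enc, dec, rev])
```

In this example only the `dec` block restores the recorded input of `enc`, so `pairs` contains the single entry `("enc", "dec")`.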

Note that in step S207, instead of injecting the instruction sequence of the block, the process may be executed up to the start address of the block. In that case, the processing of step S214 is skipped, and it is sufficient to simply resume the process in step S215.

As described above, the inventions of the first to fourth embodiments generate feature determination information for determining the features of the input/output relationship of a block extracted from the execution trace, analyze the input/output relationship of the block using this feature determination information, and determine a block exhibiting the input/output characteristics of an encryption function or a decryption function to be encryption logic. This has the effect that the encryption logic used by malware can be identified with high accuracy.

100 process analysis device, 101 execution file, 102 execution trace, 103 process information, 104 block list, 105 definition list, 106 block information list, 107 analysis result list, 110 execution trace acquisition unit, 120 block extraction unit, 130 block information extraction unit, 140 block information analysis unit, 150 analysis result output unit, 160 character string ratio determination unit, 161 character code determination algorithm DB, 162 character code table DB, 163 encryption logic list 1, 170 data decoding unit, 171 encoding/decoding algorithm DB, 172 encryption logic list 2, 180 data decompression unit, 181 compression/decompression algorithm DB, 182 encryption logic list 3, 190 virtual execution unit, 191 encryption/decryption function pair list.

Claims

A process analysis apparatus comprising:
an execution trace acquisition unit that acquires an execution trace of a process to be analyzed;
a block extraction unit that extracts, from the execution trace, a block which is a processing unit indicating a loop structure;
a block information extraction unit that extracts, from the block, block information including input information and output information; and
a block information analysis unit that generates feature determination information for determining features of the input/output relationship of the block using the input information or the output information of the block information, analyzes the input/output relationship of the block using the feature determination information, and determines a block exhibiting the input/output characteristics of an encryption function or a decryption function to be encryption logic.

The process analysis apparatus according to claim 1, further comprising a character string ratio determining unit that determines a printable character string ratio, which is the ratio of printable character strings included in the input information or the output information of the block information,
wherein the block information analysis unit calculates, as the feature determination information, a difference between a first printable character string ratio of the input information and a second printable character string ratio of the output information, both determined by the character string ratio determining unit, and determines the block to be encryption logic when the difference is equal to or greater than a predetermined threshold.

The process analysis apparatus according to claim 1, further comprising a data decoding unit that decodes the output information of the block information,
wherein the block information analysis unit generates, as the feature determination information, a decoding result obtained by the data decoding unit decoding the output information of the block information, and determines the block having the output information that matches the decoding result to be encryption logic.

The process analysis apparatus according to claim 1, further comprising a data decompression unit that decompresses the output information of the block information,
wherein the block information analysis unit generates, as the feature determination information, a decompression result obtained by the data decompression unit decompressing the output information of the block information, and determines the block having the output information that matches the decompression result to be encryption logic.

The process analysis apparatus according to claim 1, further comprising a virtual execution unit that inputs the output information of the block information to another block and executes the processing of the other block,
wherein the block information analysis unit generates, as the feature determination information, an execution result obtained by the virtual execution unit executing the processing of the other block, and determines the block having the input information that matches the execution result to be encryption logic.

A process analysis method of a process analysis apparatus that analyzes a process to be analyzed and determines encryption logic, the method comprising:
an execution trace acquisition step in which an execution trace acquisition unit acquires an execution trace of the process to be analyzed;
a block extraction step in which a block extraction unit extracts, from the execution trace, a block which is a processing unit indicating a loop structure;
a block information extraction step in which a block information extraction unit extracts, from the block, block information including input information and output information; and
a block information analysis step in which a block information analysis unit generates feature determination information for determining features of the input/output relationship of the block using the input information or the output information of the block information, analyzes the input/output relationship of the block using the feature determination information, and determines a block exhibiting the input/output characteristics of an encryption function or a decryption function to be encryption logic.

A process analysis program for causing a computer to execute:
an execution trace acquisition step of acquiring an execution trace of a process to be analyzed;
a block extraction step of extracting, from the execution trace, a block which is a processing unit indicating a loop structure;
a block information extraction step of extracting, from the block, block information including input information and output information; and
a block information analysis step of generating feature determination information for determining features of the input/output relationship of the block using the input information or the output information of the block information, analyzing the input/output relationship of the block using the feature determination information, and determining a block exhibiting the input/output characteristics of an encryption function or a decryption function to be encryption logic.