WO2022180702A1 - Analysis function addition device, analysis function addition program, and analysis function addition method - Google Patents

Analysis function addition device, analysis function addition program, and analysis function addition method

Info

Publication number
WO2022180702A1
Authority
WO
WIPO (PCT)
Prior art keywords
execution
analysis
instruction
detection unit
unit
Prior art date
Application number
PCT/JP2021/006933
Other languages
French (fr)
Japanese (ja)
Inventor
利宣 碓井
知範 幾世
裕平 川古谷
誠 岩村
潤 三好
Original Assignee
日本電信電話株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 日本電信電話株式会社 filed Critical 日本電信電話株式会社
Priority to PCT/JP2021/006933 priority Critical patent/WO2022180702A1/en
Priority to JP2023501730A priority patent/JPWO2022180702A1/ja
Publication of WO2022180702A1 publication Critical patent/WO2022180702A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55Detecting local intrusion or implementing counter-measures
    • G06F21/56Computer malware detection or handling, e.g. anti-virus arrangements

Definitions

  • the present invention relates to an analysis function imparting device, an analysis function imparting program, and an analysis function imparting method.
  • a malicious script is a script that behaves maliciously, and is a program that exploits the functions provided by the script engine to carry out attacks. In general, attacks are carried out using the default script engine of the operating system (OS), or the script engine of specific applications such as web browsers and document file viewers.
  • script engines may require user permission, but they can also implement actions via the system, such as file operations, network communication, and process startup. Therefore, attacks using malicious scripts pose a threat to users in the same way as attacks using malware in executable files.
  • Code obfuscation is a problem that arises when analyzing malicious scripts. Many malicious scripts are subjected to a process called obfuscation, which hinders analysis. Obfuscation deliberately increases the complexity of the code, making it difficult to analyze the code superficially. That is, it interferes with an analysis method called static analysis, which analyzes information obtained from the code without executing the script.
  • Anti-analysis is an analysis obstruction in which a malicious script acquires information about the environment in which it is executed and does not exhibit malicious behavior unless certain conditions are met. For example, if the script finds a feature that is frequently seen in analysis environments, it determines that it is being analyzed and thwarts the analysis by terminating execution.
  • FIG. 22 is a diagram showing a code fragment showing an example of anti-analysis.
  • This code fragment acquires the number of CPU (Central Processing Unit) cores of the environment in which it is executed. If the number is not between 2 and 8 inclusive, it judges that it is likely running in an analysis environment and terminates execution, which is an anti-analysis behavior. Otherwise, it judges that it is not in an analysis environment and exhibits malicious behavior.
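  • The check described for FIG. 22 can be sketched as a simple predicate. This is only an illustration of the condition stated above, not the code of FIG. 22; the function names and the returned labels are hypothetical, and the core count is passed in as an argument rather than queried from the environment.

```python
def looks_like_analysis_environment(core_count: int) -> bool:
    """Return True when the core count falls outside the 2..8 range (inclusive)."""
    return not (2 <= core_count <= 8)


def simulate_script(core_count: int) -> str:
    # Hypothetical stand-in for the script's control flow: it only reports
    # which of the two paths of the conditional branch would be taken.
    if looks_like_analysis_environment(core_count):
        return "terminated"
    return "continued"
```

  • An analysis that observes only one of the two return paths never sees the other one, which is the situation multipath execution is meant to address.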
  • Non-Patent Document 1 describes a technique for realizing symbolic execution, which is a type of multipath execution, for JavaScript (registered trademark). According to this method, in the conditional branching of a JavaScript script, it is possible to comprehensively follow the executable paths and observe the behavior.
  • Non-Patent Document 2 describes a method for realizing route forced execution, which is a type of multipath execution, for JavaScript. According to this method, in the conditional branching of JavaScript scripts, all paths can be exhaustively traced and the behavior can be observed.
  • Non-Patent Document 3 describes a technique in which the script engine is manually modified in advance and then executed on a symbolic execution infrastructure for binaries, so that symbolic execution of the script running on the script engine is realized through the script engine. According to this technique, as long as a script engine that can be modified manually is available, general-purpose symbolic execution can be achieved for any script language, executable paths can be traced exhaustively, and behavior can be observed.
  • Non-Patent Document 4 describes a method of analyzing a virtual machine (VM) that malware often uses to obfuscate its own programs. According to this technique, by analyzing the VM, it is possible to obtain information on its architecture. Since it is the VM that controls script execution in the script engine, the concept of this method can be partly diverted.
  • In Non-Patent Document 5, multipath execution of scripts is enabled by analyzing the script engine and adding code that realizes the multipath execution function based on the obtained architecture information. According to this method, multipath execution can be realized for various script languages and engines.
  • Non-Patent Document 1 and Non-Patent Document 2 have the problem that it is necessary to design and implement the multipath execution function individually for each script engine.
  • the methods described in Non-Patent Document 1 and Non-Patent Document 2 have the problem that it is necessary to know the architecture information of the VM of the script engine in advance in order to realize the multipath execution function.
  • Non-Patent Document 3 requires modifications to the script engine, so there is also the problem that it is necessary to know the architecture information of the VM of the script engine in advance.
  • the method described in Non-Patent Document 3 does not consider detailed architecture such as the mechanism of conditional branching within the script engine, so there is a problem that it is difficult to perform fine-grained multipath execution for the script.
  • Non-Patent Document 4 only targets VMs owned by malware, not VMs owned by script engines, so there is a problem that it cannot be directly applied to script engines.
  • the method described in Non-Patent Document 4 also has the problem that it does not refer to acquisition of architecture information related to conditional branching, which is important for multipath execution.
  • the technique described in Non-Patent Document 4 focuses only on the analysis of the VM, and has the problem that it does not take into account the addition of functions to the VM, such as the addition of multipath execution.
  • Non-Patent Document 5 has the problem that it can only be applied to the decode/dispatch type script engine and cannot be applied to the threaded code type script engine, which is another major method.
  • The present invention has been made in view of the above. Its object is to provide an analysis function imparting device, an analysis function imparting program, and an analysis function imparting method that can impart a multipath execution function that takes detailed architecture such as conditional branching into account, even for a threaded code type script engine, without requiring individual design and implementation and without requiring architecture information in advance.
  • The analysis function imparting device of the present invention includes: a first analysis unit that analyzes the virtual machine of a script engine; a second analysis unit that analyzes the instruction set architecture, which is the system of instructions of the virtual machine; and an imparting unit that applies to the script engine a hook providing a multipath execution function, based on a virtual program counter, a conditional branch flag, which is an area that holds a flag indicating whether or not a branch is taken at the time of a conditional branch in the execution state, and a branch VM instruction, which is a virtual machine instruction that causes a branch.
  • The first analysis unit analyzes, using differential execution analysis, a plurality of execution traces obtained by changing execution conditions, and acquires the virtual program counter and the conditional branch flag. The first analysis unit has: a first acquisition unit that acquires the plurality of execution traces by changing execution conditions; a first detection unit that clusters the execution traces and detects the boundary of each VM instruction; a second detection unit that analyzes the plurality of execution traces using differential execution analysis focusing on the number of memory reads and the boundary of each VM instruction detected by the first detection unit, and detects the virtual program counter; a third detection unit that analyzes the binary of the script engine based on the boundary of each VM instruction detected by the first detection unit and detects the dispatcher; and a fourth detection unit that analyzes the plurality of execution traces using differential execution analysis focusing on the number of memory reads and detects the conditional branch flag.
  • According to the present invention, a multipath execution function that takes into account detailed architecture such as conditional branching can be imparted without requiring individual design and implementation and without requiring architecture information in advance.
  • FIG. 1 is a diagram for explaining an example of the configuration of a threaded code type script engine.
  • FIG. 2 is a diagram showing pseudocode of a threaded code type VM that the script engine has.
  • FIG. 3 is a diagram illustrating an example of the configuration of the analysis function imparting device according to the embodiment.
  • FIG. 4 is a diagram showing an example of a test script (first test script) used for virtual program counter detection.
  • FIG. 5 is a diagram showing an example of a test script (second test script) used for branch VM instruction detection.
  • FIG. 6 is a diagram showing an example of an execution trace.
  • FIG. 7 is a diagram illustrating an example of a VM execution trace.
  • FIG. 8 is a diagram explaining processing of the VM instruction boundary detection unit.
  • FIG. 9 is a diagram for explaining processing of the virtual program counter detection unit.
  • FIG. 10 is a diagram explaining processing of the dispatcher detection unit.
  • FIG. 11 is a diagram explaining processing of the branch VM instruction detection unit.
  • FIG. 12 is a flow chart showing a processing procedure of analysis function imparting processing according to the embodiment.
  • FIG. 13 is a flow chart showing a processing procedure of execution trace acquisition processing shown in FIG.
  • FIG. 14 is a flowchart showing the procedure of the VM instruction boundary detection process shown in FIG. 12.
  • FIG. 15 is a flow chart showing the procedure of the virtual program counter detection process shown in FIG.
  • FIG. 16 is a flow chart showing a processing procedure of dispatcher detection processing shown in FIG.
  • FIG. 17 is a flow chart showing the procedure of the conditional branch flag detection process shown in FIG.
  • FIG. 18 is a flowchart of a procedure of a VM execution trace acquisition process shown in FIG. 12.
  • FIG. 19 is a flow chart showing a processing procedure of branch VM instruction detection processing shown in FIG. 12.
  • FIG. 20 is a flow chart showing the processing procedure of the analysis function imparting process shown in FIG.
  • FIG. 21 is a diagram showing an example of a computer that implements the analysis function imparting device by executing a program.
  • FIG. 22 is a diagram showing a code fragment showing an example of anti-analysis.
  • The analysis function imparting device according to the embodiment is a device that can be applied to a threaded code type script engine.
  • The analysis function imparting device detects, in order, the boundaries of the VM instructions, the virtual program counter (VPC), which is a variable indicating the VM instruction to be executed next, the dispatcher, the conditional branch flag, and the branch VM instruction, which is a VM instruction that causes branching.
  • These are all components of the script engine or information about its architecture. The structure of a typical script engine and the functions of these components will be described with reference to FIGS. 1 and 2.
  • FIG. 1 is a diagram for explaining an example of the configuration of a threaded code type script engine.
  • script engine 100 has bytecode compiler 102 and virtual machine (VM) 103 .
  • the bytecode compiler 102 also has a syntax analysis unit 104 and a bytecode generation unit 105 .
  • the VM 103 also has a code cache unit 106, a decode unit 107, a pointer cache unit 108, and a plurality of sets 109-1 to 109-3 of VM instruction handler units and dispatcher units.
  • the script engine 100 accepts script input.
  • the syntax analysis unit 104 receives a script as an input, generates an abstract syntax tree (AST) through lexical analysis and syntactic analysis, and outputs it to the bytecode generation unit 105 .
  • the bytecode generation unit 105 receives the AST as an input, converts it into bytecode, and stores it in the code cache unit 106 .
  • the decoding unit 107 collectively reads the codes from the code cache unit 106 and decodes all the read codes.
  • the decoding unit 107 converts all codes into pointers and stores them in the pointer cache unit 108 .
  • Groups 109-1 to 109-3 of distributed VM instruction handler units and dispatcher units execute programs corresponding to VM instructions. The contents described in the script are executed by referring to the pointer in the pointer cache unit 108, executing the VM instruction while checking the pointer one by one, and dispatching to the next VM instruction.
  • FIG. 2 is a diagram showing pseudocode of a threaded code type VM that the script engine has.
  • the pseudocode first initializes the VPC (line 1).
  • the pointer pointed to by the VPC is obtained from the pointer cache as the pointer of the VM instruction handler to be executed next (line 2).
  • a goto statement is used to dispatch to the next VM instruction handler (line 3).
  • the dispatched VM instruction handler is executed (lines 5, 9, and 13).
  • there is a dispatcher behind each VM instruction handler that gets a pointer to the VM instruction handler to be executed next and dispatches to it (lines 6, 7, 10, 11, 14, 15).
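  • The threaded code structure described above can be sketched in Python. This is not the pseudocode of FIG. 2: Python has no goto, so each handler returns the next handler and a small loop performs the jump. The class `ThreadedVM`, the three handlers, and the stack-based instruction set are assumptions made only for illustration.

```python
class ThreadedVM:
    """Toy threaded-code VM: the pointer cache holds handler references."""

    def __init__(self, pointer_cache):
        # Each entry is (handler, operand); the handler reference plays the
        # role of the pointer produced by the decode unit.
        self.pointer_cache = pointer_cache
        self.vpc = 0          # virtual program counter
        self.stack = []
        self.output = []

    def operand(self):
        return self.pointer_cache[self.vpc][1]

    def dispatch(self):
        # Dispatcher: read the VPC, fetch the next handler pointer.
        # In a real threaded-code VM this code is duplicated behind every
        # handler; here it is shared only to keep the sketch short.
        if self.vpc >= len(self.pointer_cache):
            return None
        return self.pointer_cache[self.vpc][0]


def op_push(vm):
    vm.stack.append(vm.operand())
    vm.vpc += 1
    return vm.dispatch()      # dispatcher behind the handler


def op_add(vm):
    right = vm.stack.pop()
    left = vm.stack.pop()
    vm.stack.append(left + right)
    vm.vpc += 1
    return vm.dispatch()


def op_print(vm):
    vm.output.append(vm.stack.pop())
    vm.vpc += 1
    return vm.dispatch()


def run(vm):
    handler = vm.dispatch()   # initial dispatch after the VPC is initialized
    while handler is not None:
        handler = handler(vm) # stands in for "goto next handler"
    return vm.output
```

  • Note that every handler reads the VPC exactly once when it dispatches, which is the property the VPC detection described later relies on.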
  • a branch VM instruction is a VM instruction that causes a branch within a script, and a conditional branch flag is an area that holds a flag indicating whether or not a branch is taken at the time of a conditional branch.
  • the analysis function imparting apparatus 10 acquires an execution trace consisting of a branch trace and a memory access trace by hooking a branch instruction and hooking a memory operation instruction to a script engine binary.
  • a branch trace is a record of executed branches
  • a memory access trace is a record of executed memory reads and writes.
  • the analysis function imparting device 10 detects the boundaries of each VM instruction. That is, when there are a plurality of pairs of distributed VM instruction handler units and dispatcher units, it is detected where each of them starts and ends. At this time, the analysis function imparting apparatus 10 clusters the execution traces and detects clusters whose number of times of execution is equal to or greater than a threshold value as VM instructions. The analysis function imparting device 10 detects the start point and the end point of the continuous instruction string forming the VM instruction as boundaries. The VM instruction boundary detected here is used in VPC detection and dispatcher detection.
  • this analysis function imparting device 10 analyzes the execution trace and detects the VPC.
  • the analysis function imparting device applies differential execution analysis focusing on the number of times of memory reading to detect the VPC.
  • this analysis function imparting device 10 analyzes the binary of the script engine and detects the dispatcher.
  • the dispatcher is implemented by referring to the pointer cache and jumping to the pointer of the next VM instruction handler.
  • Dispatchers are distributed behind each VM instruction handler and generally their code is highly identical. By searching for code that exists behind such VM instruction handlers and has a high degree of identity, the analysis function imparting device 10 detects the dispatcher in a predetermined manner.
  • this analysis function imparting device 10 analyzes the execution trace and detects conditional branch flags.
  • the analysis function imparting device 10 applies differential execution analysis focused on memory reading to detect conditional branch flags.
  • the analysis function imparting device 10 obtains a VM execution trace for the script engine binary by monitoring the VPC and the pointer of the VM instruction handler dispatched by the dispatcher.
  • the VM execution trace records pointers of executed VM instruction handlers and VPCs.
  • This analysis function imparting device 10 analyzes this VM execution trace and detects branch VM instructions. In detecting a branch VM instruction, the analysis function imparting device 10 first executes a large number of test scripts to obtain a VM execution trace. Then, the analysis function imparting device 10 associates the pointer to the VM instruction with the VM instruction, and virtually assigns a VM opcode to each as an identifier. Then, the analysis function imparting device 10 collects the amount of change in the VPC before and after the execution of each VM opcode. If the VM opcode is anything other than a branch VM instruction, the amount of change in VPC is approximately constant.
  • For a branch VM instruction, on the other hand, the amount of change in the VPC varies depending on the branch destination.
  • the analysis function imparting device 10 evaluates variations in the amount of change in the VPC for each VM opcode in terms of variance, and detects those whose variance is equal to or greater than a certain threshold value as branch VM instructions.
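  • The variance-based detection described above can be sketched as follows. The sketch assumes the VM execution trace is available as a list of `(vpc, pointer)` pairs in execution order; the function name, the use of population variance, and the threshold value are illustrative assumptions, not values given in this document.

```python
from collections import defaultdict
from statistics import pvariance


def detect_branch_handlers(vm_trace, threshold):
    """Return handler pointers whose VPC change varies by at least `threshold`.

    vm_trace: list of (vpc, pointer) pairs in execution order (assumed format).
    """
    opcode_of = {}                 # handler pointer -> virtual VM opcode
    deltas = defaultdict(list)     # VM opcode -> observed VPC changes
    for (vpc, pointer), (next_vpc, _) in zip(vm_trace, vm_trace[1:]):
        # Assign a fresh opcode the first time a handler pointer is seen.
        opcode = opcode_of.setdefault(pointer, len(opcode_of))
        deltas[opcode].append(next_vpc - vpc)

    pointer_of = {opcode: pointer for pointer, opcode in opcode_of.items()}
    return sorted(
        pointer_of[opcode]
        for opcode, changes in deltas.items()
        if pvariance(changes) >= threshold
    )
```

  • In practice the threshold would have to be chosen from many test scripts, since a branch VM instruction that happens to jump to the same destination every time would show no variance in a single trace.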
  • The analysis function imparting device 10 hooks the script engine binary based on the VPC, the branch VM instruction, and the conditional branch flag obtained up to this point. With this hook, the analysis function imparting device 10 monitors what the VPC points to and forks the execution state when it is a branch VM instruction. The analysis function imparting device 10 then executes one execution state as it is, and executes the other execution state after rewriting the conditional branch flag. This causes both execution paths of the conditional branch to be executed. In this way, the analysis function imparting device 10 retrofits the script engine with a multipath execution function.
  • FIG. 3 is a diagram illustrating an example of the configuration of the analysis function imparting device according to the embodiment.
  • The analysis function imparting device 10 has an input unit 11, a control unit 12, a storage unit 13, and an output unit 14. The analysis function imparting device 10 receives the test script and the script engine binary as input.
  • The input unit 11 is composed of input devices such as a keyboard and a mouse, receives input of information from the outside, and passes the information to the control unit 12. The input unit 11 also has a communication interface for transmitting and receiving various information to and from another device connected via a wired connection, a network, or the like, and accepts input of information transmitted from the other device.
  • the input unit 11 receives input of test scripts and script engine binaries, and outputs them to the control unit 12 .
  • a test script is a script input when dynamically analyzing a script engine to acquire an execution trace and a VM execution trace. Details of the test script will be described later.
  • Script engine binaries are the executable files that make up the script engine.
  • a script engine binary may consist of multiple executable files.
  • the control unit 12 has an internal memory for storing programs defining various processing procedures and required data, and executes various processing using these.
  • the control unit 12 is an electronic circuit such as a CPU (Central Processing Unit) or MPU (Micro Processing Unit).
  • the control unit 12 has a virtual machine analysis unit 121 (first analysis unit), an instruction set architecture analysis unit 122 (second analysis unit), and an analysis function addition unit 123 (addition unit).
  • the virtual machine analysis unit 121 analyzes the VM of the script engine.
  • the virtual machine analysis unit 121 acquires a plurality of execution traces by changing execution conditions, analyzes the plurality of execution traces using differential execution analysis, and acquires VPCs and conditional branch flags. Also, the virtual machine analysis unit 121 statically analyzes the script engine binary to acquire VM instruction boundaries and dispatchers.
  • The virtual machine analysis unit 121 has an execution trace acquisition unit 1211 (first acquisition unit), a VM instruction boundary detection unit 1212 (first detection unit), a virtual program counter detection unit 1213 (second detection unit), a dispatcher detection unit 1214 (third detection unit), and a conditional branch flag detection unit 1215 (fourth detection unit).
  • the execution trace acquisition unit 1211 accepts the test script and script engine binary as input.
  • the execution trace acquisition unit 1211 acquires an execution trace by executing the test script while monitoring execution of the script engine binary.
  • An execution trace consists of a branch trace and a memory access trace.
  • the branch trace records the type of branch instruction, the branch source address, and the branch destination address at the time of execution.
  • a memory access trace records the type of memory operation and the memory address of the operation target. Branch traces and memory access traces are known to be obtainable by instruction hooks.
  • the execution trace acquired by the execution trace acquisition unit 1211 is stored in the execution trace DB 131 .
  • the VM instruction boundary detection unit 1212 clusters the execution traces and detects the boundary of each VM instruction.
  • the VM instruction boundary detection unit 1212 clusters the execution trace and detects clusters whose number of executions is equal to or greater than a threshold value as VM instructions. Clustering finds contiguous code regions that are executed multiple times. This may be done, for example, by grouping together code distances between executed instructions, by finding common subsequences of executed code blocks, or by other methods.
  • the analysis function imparting device 10 detects the start point and the end point of the continuous instruction string forming the detected VM instruction as boundaries.
  • the VM instruction boundary detected here is used in VPC detection and dispatcher detection.
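  • One possible form of the clustering described above is sketched below, using the address-distance grouping mentioned as an example. The input is assumed to be a flat list of executed instruction addresses in execution order; `max_gap` and `min_executions` are hypothetical tuning parameters, and counting entries into a cluster is only one way to estimate its execution count.

```python
from bisect import bisect_right


def detect_vm_instruction_boundaries(executed_addrs, max_gap, min_executions):
    """Return (start, end) address pairs of code regions executed repeatedly."""
    unique = sorted(set(executed_addrs))
    if not unique:
        return []

    # Step 1: group addresses into contiguous regions by address distance.
    clusters = [[unique[0], unique[0]]]
    for addr in unique[1:]:
        if addr - clusters[-1][1] <= max_gap:
            clusters[-1][1] = addr
        else:
            clusters.append([addr, addr])
    starts = [start for start, _ in clusters]

    # Step 2: count how many times execution enters each region.
    entries = [0] * len(clusters)
    previous = None
    for addr in executed_addrs:
        index = bisect_right(starts, addr) - 1
        # A new entry is a transition from another region, or a return to
        # the region's first address.
        if index != previous or addr == starts[index]:
            entries[index] += 1
        previous = index

    # Step 3: regions executed often enough are treated as VM instructions;
    # their first and last addresses are the detected boundaries.
    return [
        (start, end)
        for (start, end), count in zip(clusters, entries)
        if count >= min_executions
    ]
```

  • Code that runs only once, such as engine start-up, falls below the threshold and is discarded, while handlers that the script exercises repeatedly remain.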
  • the virtual program counter detection unit 1213 extracts and analyzes the execution trace for the first test script stored in the execution trace DB 131 to detect the VPC.
  • the virtual program counter detection unit 1213 analyzes a plurality of execution traces using differential execution analysis focusing on the number of times of memory reading and the boundary of each VM instruction detected by the VM instruction boundary detection unit 1212, and detects the VPC.
  • the virtual program counter detection unit 1213 utilizes the fact that a read from the memory holding the VPC always occurs after execution of each VM instruction, and detects the VPC by finding the memory that is read.
  • the virtual program counter detection unit 1213 uses differential execution analysis focusing on the number of times of memory reading for VPC detection.
  • the virtual program counter detection unit 1213 compares the execution traces acquired using a plurality of test scripts, and finds memory whose read count changes in proportion to both the number of repetitions and the number of statements to be repeated. Then, the virtual program counter detection unit 1213 refers to the boundary of each VM instruction detected by the VM instruction boundary detection unit 1212, and narrows the candidates down to memory whose read values always point to the starting point of a VM instruction. The virtual program counter detection unit 1213 detects this memory as the VPC.
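  • The differential execution analysis for VPC detection can be sketched as follows. The sketch assumes each run is summarized as `(iterations, statements, reads)`, where `reads` is a list of `(address, value)` memory reads, and that the caller supplies a predicate `points_to_handler_start` derived from the detected VM instruction boundaries. These names and the exact proportionality test are assumptions made for illustration.

```python
from collections import Counter
from fractions import Fraction


def detect_vpc_candidates(runs, points_to_handler_start):
    """Return addresses whose read count scales with iterations * statements."""
    counts = [Counter(addr for addr, _ in reads) for _, _, reads in runs]
    work = [iterations * statements for iterations, statements, _ in runs]

    candidates = []
    shared = set.intersection(*(set(counter) for counter in counts))
    for addr in shared:
        # Growth in read count per unit of extra work, relative to run 0.
        slopes = {
            Fraction(counts[i][addr] - counts[0][addr], work[i] - work[0])
            for i in range(1, len(runs))
            if work[i] != work[0]
        }
        # Proportional growth means one positive slope across all runs.
        if len(slopes) != 1 or next(iter(slopes)) <= 0:
            continue
        # Narrow down: every value read must point to a VM instruction start.
        values = [v for _, _, reads in runs for a, v in reads if a == addr]
        if all(points_to_handler_start(v) for v in values):
            candidates.append(addr)
    return sorted(candidates)
```

  • Varying the repetition count and the number of repeated statements independently matters here: a loop counter scales with the repetitions alone and is rejected because its slope is not constant across runs.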
  • the dispatcher detection unit 1214 cuts out each VM instruction part from the script engine binary based on the VM instruction boundary detected by the VM instruction boundary detection unit 1212, and detects a part with a high degree of similarity between each VM instruction as a dispatcher.
  • a sequence alignment algorithm for example, may be used to detect portions with a high degree of similarity, or other methods may be used.
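  • As a simplified stand-in for a sequence alignment algorithm, the sketch below finds the longest instruction sequence shared by the tail of every cut-out VM instruction, which matches the observation that the dispatcher sits behind each handler. It only handles exact matches; the instruction strings in the usage example are made up for illustration.

```python
def common_suffix(sequences):
    """Return the longest instruction sequence that ends every input sequence."""
    if not sequences:
        return []
    suffix = []
    # Walk all sequences from their last instruction backwards in lockstep.
    for column in zip(*(reversed(sequence) for sequence in sequences)):
        if len(set(column)) != 1:
            break
        suffix.append(column[0])
    suffix.reverse()
    return suffix


# Usage with hypothetical disassembly of three VM instruction regions.
handlers = [
    ["pop rax", "add rax, rbx", "push rax", "add r12, 8", "mov rcx, [r12]", "jmp rcx"],
    ["mov rax, [r12+8]", "push rax", "add r12, 8", "mov rcx, [r12]", "jmp rcx"],
    ["pop rax", "call print", "add r12, 8", "mov rcx, [r12]", "jmp rcx"],
]
dispatcher = common_suffix(handlers)
```

  • A real engine binary may use different registers or instruction orderings in each copy of the dispatcher, which is where an alignment algorithm that tolerates gaps and mismatches would be needed.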
  • the conditional branch flag detection unit 1215 extracts and analyzes the execution trace for the second test script stored in the execution trace DB 131 and finds the conditional branch flag.
  • the conditional branch flag detection unit 1215 analyzes a plurality of execution traces and detects conditional branch flags using differential execution analysis focusing on the number of times of memory reading.
  • the conditional branch flag detection unit 1215 executes conditional branches in various patterns, and compares the memory change pattern at that time with the conditional branch pattern in the test script to detect the memory storing the conditional branch flag.
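  • The pattern comparison described above can be sketched as follows. The sketch assumes each run is given as `(pattern, reads)`, where `pattern` is the list of taken (`True`) or not taken (`False`) outcomes written into the test script and `reads` is a list of `(address, value)` memory reads. Requiring exactly one read per conditional branch is a simplifying assumption; a real trace would need noise filtering.

```python
def detect_branch_flag_candidates(runs):
    """Return addresses whose read values track the branch outcome pattern."""
    shared = set.intersection(*({addr for addr, _ in reads} for _, reads in runs))
    candidates = []
    for addr in sorted(shared):
        value_to_outcome = {}   # value read -> branch outcome it stands for
        consistent = True
        for pattern, reads in runs:
            values = [value for a, value in reads if a == addr]
            # The read count must follow the number of conditional branches.
            if len(values) != len(pattern):
                consistent = False
                break
            for value, taken in zip(values, pattern):
                # The same value must always mean the same outcome.
                if value_to_outcome.setdefault(value, taken) != taken:
                    consistent = False
                    break
            if not consistent:
                break
        # Both outcomes must have been observed to call it a flag.
        if consistent and set(value_to_outcome.values()) == {True, False}:
            candidates.append(addr)
    return candidates
```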
  • the instruction set architecture analysis unit 122 analyzes the instruction set architecture, which is the system of VM instructions.
  • the instruction set architecture analysis unit 122 has a VM execution trace acquisition unit 1221 (second acquisition unit) and a branch VM instruction detection unit 1222 (fifth detection unit).
  • the VM execution trace acquisition unit 1221 accepts test scripts and script engine binaries as inputs.
  • the VM execution trace acquisition unit 1221 acquires the VM execution trace, which is the execution trace executed on the VM, by executing the test script while monitoring execution of the script engine binary.
  • a VM execution trace consists of a VPC and a VM opcode for each executed VM instruction.
  • the VPC recording can be realized by monitoring the VPC memory detected by the virtual program counter detection unit 1213 .
  • the VM opcode here is an identifier that is virtually assigned to each VM instruction by associating the pointer to the VM instruction with that VM instruction.
  • the VM execution trace acquired by the VM execution trace acquisition unit 1221 is stored in the VM execution trace DB 133 .
  • the branch VM instruction detection unit 1222 extracts and analyzes the VM execution trace stored in the VM execution trace DB 133 to detect branch VM instructions.
  • the branch VM instruction detection unit 1222 focuses on the fact that the magnitude of variation in the VPC value differs between branch VM instructions and other VM instructions, sets a threshold value, and detects VM instructions with the larger variation in the VPC value as branch VM instructions.
  • the branch VM instruction detection unit 1222 detects a branch VM instruction based on variations in the amount of change in the virtual program counter for each VM opcode in the VM execution trace.
  • the analysis function imparting unit 123 hooks the script engine to impart a multipath execution function.
  • the analysis function imparting unit 123 hooks the script engine using the obtained VPC, branch VM instruction, and conditional branch flag. This hook monitors the VPC to confirm the VM opcode, and forks the execution state if the VM opcode is that of a branch VM instruction. The hook executes one execution state as it is and executes the other execution state after rewriting the conditional branch flag, thereby providing the script engine with a multipath execution function.
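  • The effect of such a hook can be illustrated on a toy interpreter. The sketch below does not hook a real script engine binary: the instruction set (`test`, `jump_if`, `emit`, `halt`), the dictionary-based execution state, and the `forked_at` bookkeeping are all assumptions chosen to show the fork-and-flip idea in a few lines.

```python
import copy

BRANCH_OPCODES = {"jump_if"}   # assumed result of branch VM instruction detection


def run_multipath(program):
    """Execute every path of a toy program; return the output of each path."""
    worklist = [{"vpc": 0, "flag": False, "output": [], "forked_at": set()}]
    finished = []
    while worklist:
        state = worklist.pop()
        while state["vpc"] < len(program):
            opcode, operand = program[state["vpc"]]

            # Hook: at a branch VM instruction, fork the execution state once
            # and rewrite the conditional branch flag in the copy.
            if opcode in BRANCH_OPCODES and state["vpc"] not in state["forked_at"]:
                state["forked_at"].add(state["vpc"])
                other = copy.deepcopy(state)
                other["flag"] = not other["flag"]
                worklist.append(other)

            # Ordinary interpretation of the toy instruction set.
            if opcode == "test":
                state["flag"] = operand
                state["vpc"] += 1
            elif opcode == "jump_if":
                state["vpc"] = operand if state["flag"] else state["vpc"] + 1
            elif opcode == "emit":
                state["output"].append(operand)
                state["vpc"] += 1
            elif opcode == "halt":
                break
        finished.append(state["output"])
    return finished


# Usage: the condition evaluates to False, yet both sides are observed.
program = [
    ("test", False),
    ("jump_if", 4),
    ("emit", "fall-through path"),
    ("halt", None),
    ("emit", "branch-taken path"),
    ("halt", None),
]
paths = run_multipath(program)
```

  • Forking once per branch site keeps the sketch finite; a practical implementation would also need a policy for loops and for the number of states it keeps alive.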
  • The storage unit 13 is implemented by a semiconductor memory device such as a RAM (Random Access Memory) or flash memory, or by a storage device such as a hard disk or an optical disk, and stores the processing program that operates the analysis function imparting device 10 and the data used during execution of the processing program.
  • the storage unit 13 has an execution trace database (DB) 131 , a VM execution trace DB 133 and an architecture information DB 132 .
  • the execution trace DB 131 and VM execution trace DB 133 store execution traces and VM execution traces acquired by the execution trace acquisition unit 1211 and VM execution trace acquisition unit 1221, respectively.
  • the execution trace DB 131 and VM execution trace DB 133 are managed by the analysis function imparting device 10 .
  • The execution trace DB 131 and the VM execution trace DB 133 may instead be managed by another device (a server or the like). In this case, the acquired execution trace and VM execution trace are output via the communication interface to the management server of the execution trace DB 131 and the VM execution trace DB 133, and are stored in the execution trace DB 131 and the VM execution trace DB 133.
  • the output unit 14 is, for example, a liquid crystal display, a printer, etc., and outputs various information including information about the analysis function imparting device 10 . Further, the output unit 14 may be an interface that controls input/output of various data with an external device, and may output various information to the external device.
  • test script is a script that is input when dynamically analyzing the script engine. This test script focuses on the execution of branch instructions and the number of memory read/writes, and is used to capture the difference in behavior of the script engine that occurs when the test script is executed a different number of times. This test script is prepared in advance for analysis and is created manually. This creation requires knowledge of the specifications of the target script language.
  • FIG. 4 is a diagram showing an example of a test script (first test script) used for VPC detection.
  • The first test script uses iteration (line 2). In the first test script, the number of repetitions (line 2) and the number of statements to be repeated (lines 3 to 5) are changed to generate a difference.
  • FIG. 5 is a diagram showing an example of a test script (second test script) used for branch VM instruction detection.
  • The second test script uses multiple conditional branches (lines 4 to 8). In the second test script, the branch conditions are controlled so that these conditional branches are taken or not taken in a particular order (lines 1, 5). In the second test script, the number of conditional branches and the pattern of taken and not-taken branches are changed to generate a difference.
  • FIG. 6 is a diagram showing an example of an execution trace.
  • the execution trace consists of a branch trace and a memory access trace, as described above.
  • the configuration of the execution trace will be shown using FIG.
  • trace indicates whether the log line is a branch trace or a memory access trace.
  • a branch trace log line for example, has the format described in lines 1 to 10 in Figure 6, and consists of three elements: type, src, and dst.
  • type indicates whether the executed branch instruction is a call instruction, a jmp instruction, or a ret instruction. Also, src indicates a branch source address, and dst indicates a branch destination address.
  • the memory access trace log line for example, has the format described in lines 11 to 13 in Figure 6, and consists of three elements: type, target, and value. type indicates whether the memory access is read or write. target indicates a memory address to be accessed. In addition, the value of the result of memory access is stored in value.
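  • The two kinds of log lines described above can be modeled as small record types. The following is a minimal sketch, assuming the field names given in the text (type, src, dst, target, value); the class names, the helper `count_reads`, and the example addresses are hypothetical and are not taken from FIG. 6.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class BranchRecord:
    type: str  # "call", "jmp", or "ret"
    src: int   # branch source address
    dst: int   # branch destination address


@dataclass(frozen=True)
class MemoryAccessRecord:
    type: str    # "read" or "write"
    target: int  # memory address that was accessed
    value: int   # value observed as the result of the access


TraceRecord = Union[BranchRecord, MemoryAccessRecord]


def count_reads(trace: list[TraceRecord]) -> Counter:
    """Count how many times each memory address is read in one execution trace."""
    return Counter(
        rec.target
        for rec in trace
        if isinstance(rec, MemoryAccessRecord) and rec.type == "read"
    )


# Illustrative trace with made-up addresses and values.
example_trace: list[TraceRecord] = [
    BranchRecord("call", 0x401000, 0x402000),
    MemoryAccessRecord("read", 0x7FF0, 0x10),
    MemoryAccessRecord("write", 0x7FF0, 0x14),
    MemoryAccessRecord("read", 0x7FF0, 0x14),
    BranchRecord("ret", 0x402010, 0x401005),
]
read_counts = count_reads(example_trace)
```

  • Per-address read counts of this kind are the quantity that the differential execution analysis compares across test scripts.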
  • FIG. 7 is a diagram illustrating an example of a VM execution trace.
  • a VM execution trace is a record of VM opcodes and VPCs, as described above.
  • FIG. 7 is a cutout of a portion of the VM execution trace.
  • the configuration of the VM execution trace will be shown using FIG.
  • a VM execution trace log line for example, has the format shown in Fig. 7 and consists of two elements: vpc and pointer.
  • vpc indicates the value of VPC.
  • pointer indicates the value of the pointer that points to the beginning of the VM instruction handler to be executed, which is obtained from the pointer cache.
  • FIG. 8 is a diagram for explaining the processing of the VM instruction boundary detection unit 1212.
  • the VM instruction boundary detection unit 1212 detects the boundary of each VM instruction. It detects VM instructions and their boundaries in order to provide the script multipath execution function in a threaded code type VM. Specifically, the VM instruction boundary detection unit 1212 extracts an execution trace from the execution trace DB 131. Then, as shown in FIG. 8, it clusters the execution trace by a predetermined method and detects clusters whose execution count is equal to or greater than a threshold as VM instructions (for example, VM instruction handlers 1 to 3). The VM instruction boundary detection unit 1212 then detects the start point and the end point of the continuous instruction sequence forming each VM instruction as its boundary.
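  • The text leaves the clustering method open. As one possible illustration, the sketch below groups the branch trace into contiguous code runs (from the destination of one branch to the source of the next), counts how often each run executes, and keeps the runs at or above a threshold. Grouping identical runs is an assumption standing in for the unspecified clustering, and the function name and addresses are hypothetical.

```python
from collections import Counter

Branch = tuple[str, int, int]  # (type, src, dst), mirroring the branch trace fields


def detect_vm_instruction_boundaries(
    branches: list[Branch], threshold: int
) -> list[tuple[int, int]]:
    """Return (start, end) address pairs of code runs executed at least `threshold` times."""
    runs: Counter = Counter()
    for (_, _, dst), (_, next_src, _) in zip(branches, branches[1:]):
        # Code executed between landing at `dst` and leaving at `next_src`.
        runs[(dst, next_src)] += 1
    return sorted(run for run, count in runs.items() if count >= threshold)


# Illustrative branch trace: one startup run, then two handlers that repeat.
example_branches: list[Branch] = [
    ("call", 0x10, 0x500),
    ("jmp", 0x520, 0x1000),
    ("jmp", 0x1020, 0x2000),
    ("jmp", 0x2030, 0x1000),
    ("jmp", 0x1020, 0x2000),
    ("jmp", 0x2030, 0x1000),
    ("jmp", 0x1020, 0x600),
]
boundaries = detect_vm_instruction_boundaries(example_branches, threshold=2)
```

  • In this toy trace the startup run executes only once and is filtered out, while the two repeated runs are reported with their start and end addresses.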
  • the virtual program counter detection unit 1213 detects VPCs and pointer caches in order to provide a script multipath execution function in a threaded code VM.
  • the detection of the virtual program counter is realized by analyzing the memory access trace log of the acquired execution trace.
  • the virtual program counter detection unit 1213 uses differential execution analysis focusing on the number of times the memory is read.
  • FIG. 9 is a diagram for explaining the processing of the virtual program counter detection unit 1213.
  • the virtual program counter detection unit 1213 extracts one execution trace by the first test script from the execution trace DB 131.
  • the number of VPC reads is proportional to the number of iterations in the test script and the number of statements in the iteration.
  • When the number of repetitions is N and the number of statements to be repeated is M, approximately MN VPC reads occur. Therefore, the virtual program counter detection unit 1213 extracts memory whose read count increases to approximately 4MN and 9MN in the execution traces for first test scripts in which N and M are increased to 2N and 2M, and to 3N and 3M, respectively.
  • the virtual program counter detection unit 1213 extracts a memory area that has read/write for each execution of one VM instruction and monotonically increases ((1) in FIG. 9).
  • the virtual program counter detection unit 1213 detects, as the VPC, a memory area whose read value always points to the start point of a VM instruction. Specifically, the virtual program counter detection unit 1213 collates the destination of the VPC with the address of the VM instruction handler, and narrows down to a matching memory area ((2) in FIG. 9).
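  • The two narrowing stages for the VPC can be sketched as follows. This is a minimal illustration, assuming per-address read counts are already available for three runs of the first test script (N and M, then 2N and 2M, then 3N and 3M), so that VPC reads scale by roughly (2N)(2M) = 4MN and (3N)(3M) = 9MN. The function names, the `tolerance` parameter, and the `pointed_targets` mapping (the addresses each candidate was observed to point at) are hypothetical.

```python
def proportional_read_candidates(
    base: dict[int, int],
    doubled: dict[int, int],
    tripled: dict[int, int],
    tolerance: float = 0.1,
) -> set[int]:
    """Addresses whose read count scales about 4x and 9x when N and M are scaled 2x and 3x."""
    candidates = set()
    for addr, count in base.items():
        if count == 0:
            continue
        ratio2 = doubled.get(addr, 0) / count
        ratio3 = tripled.get(addr, 0) / count
        # Allow some slack because the read count is only approximately MN.
        if abs(ratio2 - 4) <= 4 * tolerance and abs(ratio3 - 9) <= 9 * tolerance:
            candidates.add(addr)
    return candidates


def filter_by_handler_targets(
    candidates: set[int],
    pointed_targets: dict[int, list[int]],
    handler_starts: set[int],
) -> set[int]:
    """Keep candidates that always point at the start of a detected VM instruction."""
    return {
        addr
        for addr in candidates
        if pointed_targets.get(addr)
        and all(target in handler_starts for target in pointed_targets[addr])
    }


# Illustrative read counts per memory address (made-up numbers).
base = {0xA0: 12, 0xB0: 12, 0xC0: 5}
doubled = {0xA0: 48, 0xB0: 47, 0xC0: 5}
tripled = {0xA0: 108, 0xB0: 110, 0xC0: 5}
stage1 = proportional_read_candidates(base, doubled, tripled)

pointed_targets = {0xA0: [0x1000, 0x2000, 0x1000], 0xB0: [0x1000, 0x1234]}
vpc_candidates = filter_by_handler_targets(stage1, pointed_targets, {0x1000, 0x2000})
```

  • Address 0xC0 is dropped in the first stage because its read count does not change, and 0xB0 is dropped in the second stage because one of its targets is not a VM instruction start.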
  • the dispatcher detection unit 1214 detects the dispatcher by analyzing the binary of the script engine using a predetermined method.
  • FIG. 10 is a diagram for explaining the processing of the dispatcher detection unit 1214.
  • the dispatcher detection unit 1214 detects dispatchers in order to provide script multipath execution functions in threaded code VMs.
  • the dispatcher detection unit 1214 cuts out each VM instruction part from the script engine binary based on the VM instruction boundary detected by the VM instruction boundary detection unit 1212. Then, the dispatcher detection unit 1214 calculates the similarity between the codes of the VM instructions, based on the assumption that dispatcher code is highly similar across VM instructions ((1) in FIG. 10), and detects the portion with a high degree of similarity among all the VM instructions as the dispatcher.
  • the dispatcher detection unit 1214 can detect code that is commonly executed in the second half of a VM instruction as a dispatcher ((1) in FIG. 10).
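  • Since the similarity measure is left unspecified, the sketch below uses the simplest stand-in: the longest byte sequence shared at the end of every extracted VM instruction handler. Exact suffix matching is an assumption made for illustration (a real engine may need a fuzzier comparison), and the byte strings are placeholders rather than real machine code.

```python
def common_suffix(handlers: list[bytes]) -> bytes:
    """Return the longest byte sequence shared at the end of every handler."""
    if not handlers:
        return b""
    shortest = min(len(h) for h in handlers)
    length = 0
    # Walk backwards from the last byte while all handlers agree.
    while length < shortest and len({h[-1 - length] for h in handlers}) == 1:
        length += 1
    first = handlers[0]
    return first[len(first) - length:]


# Placeholder handler bodies: different prefixes, identical trailing dispatch code.
example_handlers = [
    b"\x01\x02\xAA\xBB\xCC",
    b"\x03\xAA\xBB\xCC",
    b"\x04\x05\x06\xAA\xBB\xCC",
]
dispatcher_code = common_suffix(example_handlers)
```

  • The shared tail `AA BB CC` plays the role of the dispatcher that is commonly executed in the second half of each VM instruction.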
  • conditional branch flag detection unit 1215 detects the conditional branch flag by analyzing memory access.
  • the conditional branch flag detection unit 1215 uses the execution trace obtained using the second test script.
  • the conditional branch flag detection unit 1215 detects the conditional branch flag by analyzing the test script and performing two stages of narrowing down.
  • the conditional branch flag has two states: branch taken or not taken. Also, the conditional branch flag is considered to be read a number of times proportional to the number of conditional branches.
  • As the first stage of narrowing down, the conditional branch flag detection unit 1215 extracts memory whose number of reads is proportional to the number of conditional branches. Then, as the second stage of narrowing down, the conditional branch flag detection unit 1215 extracts memory whose read values switch between two values in a way that corresponds to the conditional branches of the test script.
  • the conditional branch flag detection unit 1215 extracts memory addresses whose values switch between two values in the same order as the branch outcomes, for example X, Y, X, X, Y.
  • the conditional branch flag detection unit 1215 detects the conditional branch flag by repeating this while changing the number of times of branching.
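  • The two-stage narrowing for the conditional branch flag can be sketched as below. For brevity the sketch assumes exactly one read per conditional branch, which is a stricter condition than the proportionality described above, and it takes the True/False order pattern of the second test script as given. The function names and example values are hypothetical.

```python
def matches_branch_pattern(values: list[int], pattern: list[bool]) -> bool:
    """True if `values` uses exactly two values that follow the True/False pattern."""
    if len(values) != len(pattern) or len(set(values)) != 2:
        return False
    mapping: dict[bool, int] = {}
    for value, taken in zip(values, pattern):
        # Each branch outcome must always be represented by the same value.
        if mapping.setdefault(taken, value) != value:
            return False
    return True


def detect_conditional_branch_flag(
    reads: dict[int, list[int]], pattern: list[bool]
) -> set[int]:
    """Return memory addresses whose read values track the branch outcomes."""
    return {
        addr for addr, values in reads.items() if matches_branch_pattern(values, pattern)
    }


# Branch outcomes of an illustrative test script: X, Y, X, X, Y.
pattern = [True, False, True, True, False]
reads = {
    0x10: [1, 0, 1, 1, 0],  # follows the pattern
    0x20: [5, 5, 5, 5, 5],  # constant, so only one value
    0x30: [1, 0, 0, 1, 0],  # two values, but in the wrong order
    0x40: [7, 8, 7],        # wrong number of reads
}
flag_candidates = detect_conditional_branch_flag(reads, pattern)
```

  • Repeating this with test scripts that use different branch counts and patterns, as the text describes, would further shrink the candidate set.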
  • the branch VM instruction detection unit 1222 detects the branch VM instruction by analyzing the acquired VM execution trace log. Since the test script here only needs to include a branch VM instruction, any script that includes branch control syntax may be used. For example, prepare a test script by collecting from the Internet or from official documents.
  • the branch VM instruction detection unit 1222 associates a pointer to a VM instruction with a VM instruction for each VM execution trace in the VM execution trace DB 133, and virtually assigns a VM opcode to each as an identifier.
  • FIG. 11 is a diagram for explaining the processing of the branch VM instruction detection unit 1222.
  • the branch VM instruction detector 1222 uses variance to evaluate the dispersion of pointers to this VM instruction.
  • the branch VM instruction detection unit 1222 calculates the variance of the VPC change amount for each VM opcode, and narrows down only the VM opcodes with the calculated variance larger than the threshold.
  • the branch VM instruction detection unit 1222 detects a VM instruction with variations in the progress of the VPC (in the example of FIG. 11, VM instruction handler 3) as a branch VM instruction while associating the pointer with the VM instruction ((1) in FIG. 11).
  • the threshold is set to a value that can divide the resulting two groups by plotting the obtained variance values on a number line, for example.
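  • One way to pick such a threshold automatically is to place it in the middle of the widest gap between the sorted variance values. This widest-gap rule is an assumption chosen for illustration; the text only requires a value that separates the two groups, and the function name is hypothetical.

```python
def gap_threshold(variances: list[float]) -> float:
    """Return the midpoint of the widest gap between sorted variance values."""
    ordered = sorted(set(variances))
    if len(ordered) < 2:
        raise ValueError("need at least two distinct variance values")
    low, high = max(zip(ordered, ordered[1:]), key=lambda pair: pair[1] - pair[0])
    return (low + high) / 2


# Illustrative variances: three near zero (non-branch) and two large (branch).
example_variances = [0.0, 0.0, 0.5, 40.0, 52.0]
threshold = gap_threshold(example_variances)
```

  • With these values the widest gap lies between 0.5 and 40.0, so the threshold falls at 20.25 and cleanly separates the two groups.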
  • analysis function imparting unit 123 receives as inputs the script engine binary and the hook points and tap points detected in the processing up to this point.
  • the analysis function imparting unit 123 hooks the script engine at the hook point.
  • the analysis function imparting unit 123 inserts the code for analysis so that the language element corresponding to the hook is executed at the time of hooking, and the memory of the tap point as its argument is output to the log.
  • Code for this analysis can be easily generated if the hook points and tap points are known. As a result, when the script is executed, its behavior will be output to the log, and the addition of the analysis function is realized.
  • the addition of the analysis function by this hook may be realized by directly rewriting the binary for the script engine binary, or by rewriting the memory image when the binary is executed and expanded on the process memory.
  • FIG. 12 is a flow chart showing a processing procedure of analysis function imparting processing according to the embodiment.
  • the input unit 11 receives a test script and a script engine binary as input (step S1).
  • the execution trace acquisition unit 1211 performs an execution trace acquisition process of executing the test script while monitoring the binary of the script engine and acquiring a branch trace and a memory access trace (step S2). Then, the VM instruction boundary detection unit 1212 detects a VM instruction and performs VM instruction boundary detection processing for detecting the boundary of the VM instruction (step S3).
  • the virtual program counter detection unit 1213 extracts and analyzes the execution trace for the first test script stored in the execution trace DB 131, and performs virtual program counter detection processing for discovering the VPC (step S4).
  • the dispatcher detection unit 1214 extracts each VM instruction part from the script engine binary, and performs dispatcher detection processing for detecting a part having a high degree of similarity between each VM instruction as a dispatcher (step S5).
  • the conditional branch flag detection unit 1215 extracts and analyzes the execution trace for the second test script stored in the execution trace DB 131, and performs conditional branch detection processing for finding the conditional branch flag (step S6).
  • the VM execution trace acquisition unit 1221 receives a test script and a script engine binary as inputs, and performs VM execution trace acquisition processing for acquiring a VM execution trace by executing the test script while monitoring the execution of the script engine binary (step S7).
  • the branch VM instruction detection unit 1222 extracts and analyzes the VM execution trace stored in the VM execution trace DB 133, and performs branch VM instruction detection processing for detecting the branch VM instruction (step S8).
  • the analysis function imparting unit 123 performs the analysis function imparting process of hooking the script engine using the obtained VPC, branch VM instruction and conditional branch flag (step S9). Then, the output unit 14 outputs the script engine binary provided with the multipath execution function (step S10).
  • FIG. 13 is a flow chart showing a processing procedure of execution trace acquisition processing shown in FIG.
  • the execution trace acquisition unit 1211 receives the test script and the script engine binary as inputs (step S11).
  • the execution trace acquisition unit 1211 hooks the received script engine to acquire a branch trace (step S12).
  • the execution trace acquisition unit 1211 also hooks the received script engine to acquire a memory access trace (step S13).
  • the execution trace acquisition unit 1211 inputs the test script received in that state to the script engine to execute it (step S14), and stores the execution trace acquired thereby in the execution trace DB 131 (step S15).
  • the execution trace acquisition unit 1211 determines whether all the input test scripts have been executed (step S16). If the execution trace acquisition unit 1211 has finished executing all of the input test scripts (step S16: Yes), the execution trace acquisition unit 1211 ends the process. On the other hand, if the execution trace acquisition unit 1211 has not executed all of the input test scripts (step S16: No), it returns to execution of the test scripts in step S14 and continues processing.
  • FIG. 14 is a flowchart showing the procedure of the VM instruction boundary detection process shown in FIG. 12;
  • the VM instruction boundary detection unit 1212 extracts an execution trace from the execution trace DB 131 (step S21).
  • the VM instruction boundary detection unit 1212 clusters the execution traces by a predetermined method (step S22). Any method may be used for the clustering.
  • the VM instruction boundary detection unit 1212 detects clusters whose number of executions is equal to or greater than the threshold as VM instructions (step S23). Then, the VM instruction boundary detection unit 1212 sets the start point and the end point of the continuous instruction sequence forming the VM instruction as the boundary (step S24). The VM instruction boundary detection unit 1212 outputs the VM instruction boundary as a return value (step S25), and ends the VM instruction boundary detection process.
  • FIG. 15 is a flow chart showing the procedure of the virtual program counter detection process shown in FIG. 12
  • the virtual program counter detection unit 1213 extracts one execution trace by the first test script from the execution trace DB 131 (step S31). Subsequently, the virtual program counter detection unit 1213 focuses on the memory access trace of the execution trace, and counts the number of times of reading for each memory reading destination (step S32).
  • the virtual program counter detection unit 1213 receives as an input the first test script used to acquire the execution trace (step S33), analyzes the first test script, and acquires the number of repetitions and the number of sentences to be repeated (step S34).
  • the virtual program counter detection unit 1213 extracts from the execution trace DB 131 one more execution trace by the first test script with a different repetition count and number of repeated sentences (step S35). Then, the virtual program counter detection unit 1213 pays attention to the memory access trace and counts the number of readings for each memory reading destination (step S36). In addition, the virtual program counter detection unit 1213 receives as an input the first test script used to acquire the execution trace (step S37), analyzes the test script, and acquires the number of repetitions and the number of repeated sentences. (step S38).
  • the virtual program counter detection unit 1213 narrows down only memory read destinations whose read count changes in proportion to the number of repetitions and the increase or decrease in the number of repeated sentences (step S39). Furthermore, the virtual program counter detection unit 1213 narrows down the memory read destinations narrowed down in step S39 to those in which the read memory value always points to the start point of the VM instruction (step S40).
  • the virtual program counter detection unit 1213 determines whether or not the memory reading destination has been narrowed down to only one (step S41). If the virtual program counter detection unit 1213 cannot narrow down the memory reading destination to only one (step S41: No), the process returns to step S35, extracts the next execution trace, and continues the process. On the other hand, if the virtual program counter detection unit 1213 narrows down the memory reading destination to only one (step S41: Yes), it stores the narrowed down memory reading destination in the architecture information DB 132 as the virtual program counter (step S42) and terminates the process.
  • FIG. 16 is a flow chart showing a processing procedure of dispatcher detection processing shown in FIG.
  • the dispatcher detection unit 1214 receives the script engine binary as an input (step S51).
  • the dispatcher detector 1214 receives the boundary of the VM instruction from the VM instruction boundary detector 1212 (step S52).
  • the dispatcher detection unit 1214 cuts out each VM instruction part from the script engine binary based on the boundary of the VM instruction received from the VM instruction boundary detection unit 1212 (step S53).
  • the dispatcher detection unit 1214 calculates the code similarity between each VM instruction by a predetermined method (step S54). Any similarity calculation method can be used as long as it can calculate the similarity between codes.
  • the dispatcher detection unit 1214 extracts a portion with a high degree of similarity among all VM instructions based on the degree of similarity calculated in step S54 (step S55). Then, the dispatcher detection unit 1214 determines whether it is the end part of the VM instruction (step S56).
  • If the extracted portion is not the end part of the VM instruction (step S56: No), the dispatcher detection unit 1214 returns to step S55 and continues processing. If it is the end part of the VM instruction (step S56: Yes), the dispatcher detection unit 1214 outputs the extracted part as the dispatcher (step S57), and ends the process.
  • FIG. 17 is a flow chart showing the procedure of the conditional branch flag detection process shown in FIG. 12
  • conditional branch flag detection unit 1215 extracts one execution trace by the second test script from the execution trace DB 131 (step S71). Then, the conditional branch flag detection unit 1215 focuses on the memory access trace and counts the number of readings for each memory reading destination (step S72).
  • the conditional branch flag detection unit 1215 receives as an input the second test script used to acquire the execution trace (step S73), analyzes this second test script, and acquires the number of conditional branches and the order pattern of True/False (step S74). Then, the conditional branch flag detection unit 1215 narrows down to only memory read destinations whose number of reads changes in proportion to the number of conditional branches (step S75). Furthermore, the conditional branch flag detection unit 1215 narrows these down to memory read destinations whose read values switch between two values according to the order pattern of True/False (step S76).
  • the conditional branch flag detection unit 1215 determines whether or not the memory reading destination has been narrowed down to only one (step S77). If the conditional branch flag detection unit 1215 cannot narrow down the memory reading destination to only one (step S77: No), the process returns to step S71, extracts the next execution trace, and continues the process. On the other hand, if the memory read destination is narrowed down to only one (step S77: Yes), the conditional branch flag detection unit 1215 stores the narrowed down read destination in the architecture information DB 132 as the conditional branch flag (step S78) and ends the process.
  • FIG. 18 is a flowchart of a procedure of a VM execution trace acquisition process shown in FIG. 12;
  • the VM execution trace acquisition unit 1221 receives the test script and the script engine binary as input (step S81). Then, the VM execution trace acquisition unit 1221 hooks the received script engine to record the VPC and VM operation code (step S82).
  • the VM execution trace acquisition unit 1221 inputs the test script received in that state to the script engine to execute it (step S83), and stores the VM execution trace acquired thereby in the VM execution trace DB 133 (step S84).
  • the VM execution trace acquisition unit 1221 determines whether all the input test scripts have been executed (step S85). If the VM execution trace acquisition unit 1221 has finished executing all the input test scripts (step S85: Yes), the process ends. If the VM execution trace acquisition unit 1221 has not finished executing all of the input test scripts (step S85: No), it returns to execution of the test scripts in step S83 and continues processing.
  • FIG. 19 is a flow chart showing a processing procedure of branch VM instruction detection processing shown in FIG. 12 .
  • the branch VM instruction detection unit 1222 extracts one VM execution trace from the VM execution trace DB 133 (step S91).
  • the branch VM instruction detection unit 1222 associates the pointer to the VM instruction with the VM instruction, and assigns a VM opcode to each as an identifier (step S92).
  • the branch VM instruction detection unit 1222 aggregates the amount of change in VPC before and after execution for each VM opcode (step S93).
  • the branch VM instruction detection unit 1222 determines whether or not all VM execution traces in the VM execution trace DB 133 have been processed (step S94). If all VM execution traces in the VM execution trace DB 133 have not been processed (step S94: No), the branch VM instruction detection unit 1222 returns to step S91 to extract and process the next VM execution trace.
  • If all VM execution traces in the VM execution trace DB 133 have been processed (step S94: Yes), the branch VM instruction detection unit 1222 calculates the variance of the VPC variation for each VM opcode (step S95).
  • the branch VM instruction detection unit 1222 receives the threshold as an input (step S96).
  • the branch VM instruction detection unit 1222 narrows down to only VM opcodes whose variance is larger than the threshold (step S97), stores them as branch VM instructions in the architecture information DB 132 (step S98), and ends the process.
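  • Steps S91 to S98 can be sketched as follows, assuming each VM execution trace is a list of (vpc, pointer) pairs as in the VM execution trace format, and using the handler pointer itself as the VM opcode identifier. The function name, the example trace, and the threshold value are hypothetical.

```python
from collections import defaultdict
from statistics import pvariance

VmTrace = list[tuple[int, int]]  # (vpc, pointer) per executed VM instruction


def detect_branch_opcodes(traces: list[VmTrace], threshold: float) -> set[int]:
    """Return opcode identifiers whose VPC change has variance above `threshold`."""
    deltas: dict[int, list[int]] = defaultdict(list)
    for trace in traces:
        # Aggregate the VPC change caused by each executed VM instruction.
        for (vpc, pointer), (next_vpc, _) in zip(trace, trace[1:]):
            deltas[pointer].append(next_vpc - vpc)
    return {op for op, changes in deltas.items() if pvariance(changes) > threshold}


# Illustrative trace: handlers 0xA and 0xB always advance the VPC by 8,
# while handler 0xC sometimes jumps back to the start of a loop.
example_vm_trace: VmTrace = [
    (0, 0xA), (8, 0xB), (16, 0xC),
    (0, 0xA), (8, 0xB), (16, 0xC),
    (24, 0xA),
]
branch_opcodes = detect_branch_opcodes([example_vm_trace], threshold=10)
```

  • Handler 0xC changes the VPC by -16 once and by +8 once, giving a variance of 144, so it is the only one reported as a branch VM instruction.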
  • FIG. 20 is a flow chart showing the processing procedure of the analysis function imparting process shown in FIG. 12
  • the analysis function imparting unit 123 receives a script engine binary as an input (step S101). Then, the analysis function imparting unit 123 extracts the VPC, the conditional branch flag, and the conditional branch VM instruction from the architecture information DB 132 (step S102). Subsequently, the analysis function imparting unit 123 applies a hook to the hook point of the script engine (step S103). The analysis function imparting unit 123 generates a code and inserts it into the script engine so that the multipath execution code is executed at the time of this hook (step S104). The analysis function imparting unit 123 outputs the hooked script engine thus obtained as a script engine with a multipath execution function (step S105), and terminates the process.
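  • The effect of the inserted multipath execution code can be illustrated with a toy interpreter. This is only a conceptual sketch and not the hooking code itself: the bytecode format, the opcode names, and `run_all_paths` are hypothetical, and forking the interpreter state in a worklist stands in for manipulating the VPC and the conditional branch flag at the hook point.

```python
from typing import Optional

Instruction = tuple[str, Optional[object]]  # (opcode, argument)


def run_all_paths(program: list[Instruction]) -> list[list[str]]:
    """Execute a toy bytecode program along both outcomes of every branch."""
    finished: list[list[str]] = []
    pending: list[tuple[int, list[str]]] = [(0, [])]  # (VPC to resume at, log so far)
    while pending:
        vpc, log = pending.pop()
        while vpc < len(program):
            op, arg = program[vpc]
            if op == "log":
                log.append(str(arg))
                vpc += 1
            elif op == "branch":
                # Hook at the branch VM instruction: queue the taken path with
                # its own copy of the log, then continue on the not-taken path.
                pending.append((int(arg), list(log)))
                vpc += 1
            elif op == "halt":
                break
            else:
                raise ValueError(f"unknown opcode: {op}")
        finished.append(log)
    return finished


# A script whose second behavior only appears when the branch is taken.
example_program: list[Instruction] = [
    ("log", "start"),
    ("branch", 4),
    ("log", "benign"),
    ("halt", None),
    ("log", "download payload"),
    ("halt", None),
]
paths = run_all_paths(example_program)
```

  • Both behaviors are logged regardless of which way the condition would have evaluated. The toy does not bound loops, so a backward branch would be explored indefinitely; a practical implementation would need some limit on path exploration.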
  • the analysis function imparting apparatus 10 executes the test script while monitoring the binary of the script engine, and acquires the branch trace and the memory access trace. Then, the analysis function imparting device 10 analyzes the virtual machine based on the execution trace, and acquires architecture information such as VM instruction boundaries, VPC, dispatchers, and conditional branch flags. Furthermore, the analysis function imparting device 10 executes the test script to acquire the VM execution trace, analyzes the instruction set architecture using the VM execution trace, and acquires the branch VM instruction as architecture information. Then, the analysis function imparting device 10 imparts the multipath execution function to the script engine based on the obtained architecture information.
  • the analysis function imparting device 10 detects various types of architecture information through analysis based on acquisition of execution traces and VM execution traces, even for proprietary script engines for which only binaries are available.
  • a multipath execution function can be added without manual reverse engineering.
  • the analysis function imparting device 10 can automatically impart a multipath execution function to various script engines as long as a test script is prepared.
  • the analysis function imparting device 10 considers detailed architecture such as conditional branching, it is possible to implement an accurate multipath execution function against the conditional branching of the script.
  • the analysis function imparting device 10 focuses on the threaded code type script engine, it is possible to impart the multipath execution function even to the script engine having the threaded code type VM.
  • the analysis function imparting apparatus 10 analyzes the script engine and can thereby automatically retrofit the multipath execution function to script engines of various script languages, including threaded code types.
  • the analysis function imparting apparatus 10 is useful for analyzing the behavior of malicious scripts written in a wide variety of script languages, and is suitable for comprehensively analyzing the behavior of malicious scripts that have paths that are executed only when specific conditions are met, without being affected by those conditions. For this reason, by providing multipath execution functions to various script engines using the analysis function imparting device 10, the analysis function imparting program, and the analysis function imparting method according to the present embodiment, the behavior of malicious scripts can be analyzed and the results can be used for countermeasures such as detection.
  • Each component of the analysis function imparting apparatus 10 shown in FIG. 3 is functionally conceptual, and does not necessarily need to be physically configured as shown. That is, the specific form of distributing and integrating the functions of the analysis function imparting device 10 is not limited to the illustrated one, and all or part of it can be functionally or physically distributed or integrated.
  • each process performed in the analysis function imparting device 10 may be realized by a CPU and a program that is analyzed and executed by the CPU. Further, each process performed in the analysis function imparting device 10 may be realized as hardware by wired logic.
  • FIG. 21 is a diagram showing an example of a computer that implements the analysis function imparting device 10 by executing a program.
  • the computer 1000 has a memory 1010 and a CPU 1020, for example.
  • Computer 1000 also has hard disk drive interface 1030 , disk drive interface 1040 , serial port interface 1050 , video adapter 1060 and network interface 1070 . These units are connected by a bus 1080 .
  • the memory 1010 includes a ROM 1011 and a RAM 1012.
  • the ROM 1011 stores a boot program such as BIOS (Basic Input Output System).
  • Hard disk drive interface 1030 is connected to hard disk drive 1090 .
  • a disk drive interface 1040 is connected to the disk drive 1100 .
  • a removable storage medium such as a magnetic disk or optical disk is inserted into the disk drive 1100 .
  • Serial port interface 1050 is connected to mouse 1110 and keyboard 1120, for example.
  • Video adapter 1060 is connected to display 1130, for example.
  • the hard disk drive 1090 stores, for example, an OS 1091, application programs 1092, program modules 1093, and program data 1094. That is, a program that defines each process of the analysis function imparting apparatus 10 is implemented as a program module 1093 in which code executable by the computer 1000 is described. Program modules 1093 are stored, for example, on hard disk drive 1090 .
  • the hard disk drive 1090 stores a program module 1093 for executing processing similar to the functional configuration of the analysis function imparting apparatus 10 .
  • the hard disk drive 1090 may be replaced by an SSD (Solid State Drive).
  • the setting data used in the processing of the above-described embodiment is stored as program data 1094 in the memory 1010 or the hard disk drive 1090, for example. Then, the CPU 1020 reads out the program module 1093 and the program data 1094 stored in the memory 1010 and the hard disk drive 1090 to the RAM 1012 as necessary and executes them.
  • the program modules 1093 and program data 1094 are not limited to being stored in the hard disk drive 1090, but may be stored in a removable storage medium, for example, and read by the CPU 1020 via the disk drive 1100 or the like. Alternatively, the program modules 1093 and program data 1094 may be stored in another computer connected via a network (LAN (Local Area Network), WAN (Wide Area Network), etc.). Program modules 1093 and program data 1094 may then be read by CPU 1020 through network interface 1070 from other computers.


Abstract

An analysis function addition device (10) that has a virtual machine analysis unit (121) that analyzes a virtual machine of a script engine, a command set architecture analysis unit (122) that analyzes a command set architecture, and an analysis function addition unit (123) that performs hooking to add a multipath execution function to the script engine on the basis of architecture information. The virtual machine analysis unit (121) has a VM command boundary detection unit (1212) that performs clustering on a plurality of execution traces and detects the boundaries of VM commands, a virtual program counter detection unit (1213) that uses the boundaries of the VM commands and differential execution analysis that is focused on a read-in count for memory to detect a virtual program counter, a dispatcher detection unit (1214) that detects a dispatcher on the basis of the boundaries of the VM commands, and a conditional branch flag detection unit (1215) that uses the differential execution analysis that is focused on the read-in count for the memory to detect a conditional branch flag.

Description

解析機能付与装置、解析機能付与プログラム及び解析機能付与方法Analysis function imparting device, analysis function imparting program, and analysis function imparting method
 本発明は、解析機能付与装置、解析機能付与プログラム及び解析機能付与方法に関する。 The present invention relates to an analysis function imparting device, an analysis function imparting program, and an analysis function imparting method.
As attack forms diversify, including malware-bearing spam (malspam) and fileless malware, attacks that use scripts exhibiting malicious behavior (malicious scripts) have become a tangible threat.
A malicious script is a script with malicious behavior: a program that carries out an attack by abusing the functions provided by a script engine. In general, such attacks use a script engine that the operating system (OS) provides by default, or a script engine embedded in a specific application such as a web browser or a document file viewer.
Many of these script engines, although they sometimes require user permission, can also perform system-level actions such as file operations, network communication, and process creation. Attacks using malicious scripts therefore threaten users in the same way as attacks using executable-file malware.
To take countermeasures against attacks by malicious scripts, the behavior of a script must be understood accurately. A technique that reveals a script's behavior by analyzing it is therefore desired.
One problem in analyzing malicious scripts is code obfuscation. Many malicious scripts have undergone obfuscation, a process that hinders analysis. Obfuscation deliberately increases the complexity of the code so that analysis based on the surface-level information of the code becomes difficult. In other words, it hinders static analysis, which analyzes a script from information obtained from its code without executing it.
In particular, when part of the code to be executed is obtained dynamically from an external source, that code cannot be obtained without execution and so cannot be analyzed statically. In that case static analysis is impossible in principle.
In contrast, dynamic analysis, which executes a script and monitors its behavior, is not affected by such obfuscation. For this reason, methods based on dynamic analysis are mainly used for analyzing malicious scripts.
In ordinary dynamic analysis, a malicious script is executed in an analysis environment and monitored, which yields the behavior of only the single execution path that was actually taken. The behavior of paths not executed in the analysis environment cannot be obtained.
In other words, for a malicious script that has paths executed only under specific conditions, even dynamic analysis cannot cover all of its behavior.
Paths executed only under specific conditions arise, for example, when a command from a command server determines the subsequent execution path, or when anti-analysis measures keep the script from exhibiting malicious behavior in an analysis environment.
In the former case, without a command from the command server the subsequent execution path is not determined, and the path with malicious behavior is not executed. By the time a malicious script is detected and analyzed, the attacker has often already withdrawn and the command server no longer exists; in such cases the malicious behavior cannot be observed.
The latter is an anti-analysis technique in which the malicious script obtains information about the environment in which it is running and does not exhibit malicious behavior unless that information satisfies specific conditions. For example, when the script observes characteristics that are frequently seen in analysis environments, it judges that it is being analyzed and aborts execution.
FIG. 22 shows a code fragment illustrating an example of anti-analysis. The fragment obtains the number of CPU (Central Processing Unit) cores in the environment in which it runs. If the number is not between 2 and 8 inclusive, it judges that it is probably in an analysis environment and terminates. Otherwise, it judges that it is not in an analysis environment and exhibits malicious behavior.
To capture the behavior of such conditionally executed paths, multipath execution, which executes multiple execution paths, is required.
In multipath execution, when execution reaches a conditional branch, the execution state is forked so that each forked state follows one of the branch's execution paths. Both execution paths arising from the conditional branch are thereby executed.
Regarding the realization of multipath execution, Non-Patent Document 1, for example, describes a method that implements symbolic execution, a type of multipath execution, for JavaScript (registered trademark). With this method, the feasible paths at the conditional branches of a JavaScript script can be followed exhaustively and their behavior observed.
Non-Patent Document 2 describes a method that implements forced path execution, another type of multipath execution, for JavaScript. With this method, all paths at the conditional branches of a JavaScript script can be followed exhaustively and their behavior observed.
Non-Patent Document 3 describes a method in which a script engine is manually modified in advance and then run on a symbolic execution platform for binaries, so that symbolic execution of the script running on the script engine is achieved through the engine. With this method, as long as a script engine that can be manually modified is available, symbolic execution can be realized generically for any scripting language, and the feasible paths can be followed exhaustively to observe behavior.
Non-Patent Document 4 describes a method for analyzing the virtual machines (VMs) that malware often uses to obfuscate its own program. With this method, information on a VM's architecture can be obtained by analyzing the VM. Because a VM is what executes scripts in a script engine, some of the ideas of this method can be reused.
Non-Patent Document 5 describes a method that analyzes a script engine and, based on the architecture information obtained, adds code implementing a multipath execution function, thereby enabling multipath execution of scripts. With this method, multipath execution can be realized for a variety of scripting languages and engines.
However, the methods described in Non-Patent Documents 1 and 2 require a multipath execution function to be designed and implemented individually for each script engine. They also require the architecture information of the script engine's VM to be known in advance in order to implement the multipath execution function.
The method described in Non-Patent Document 3 requires modification of the script engine and therefore likewise requires the VM's architecture information to be known in advance. In addition, because it does not take detailed architecture into account, such as the conditional branch mechanism inside the script engine, fine-grained multipath execution of scripts is difficult.
Obtaining the architecture information of a script engine requires analysis work. For an open-source script engine this can be done by analyzing the source code, but it is limited to scripting languages whose source code is available and still takes a certain amount of effort. A proprietary script engine requires reverse engineering of the binary; doing this manually demands a skilled reverse engineer and a great deal of effort, so it is not practical. Moreover, no method for automating that reverse engineering has been established.
The method described in Non-Patent Document 4 targets only VMs contained in malware, not VMs contained in script engines, and so cannot be applied directly to script engines. It also does not address obtaining architecture information related to conditional branches, which is essential for multipath execution. Furthermore, it focuses solely on analyzing the VM and does not consider adding functions to the VM, such as multipath execution.
The method described in Non-Patent Document 5 can be applied only to decode-and-dispatch script engines and not to threaded-code script engines, the other major design.
The present invention has been made in view of the above. Its object is to provide an analysis function imparting device, an analysis function imparting program, and an analysis function imparting method that can impart, even to a threaded-code script engine, a multipath execution function that takes detailed architecture such as conditional branches into account, without requiring individual design and implementation and without prior architecture information.
To solve the problems described above and achieve the object, an analysis function imparting device of the present invention includes: a first analysis unit that analyzes a virtual machine of a script engine; a second analysis unit that analyzes an instruction set architecture, which is the instruction system of the virtual machine; and an imparting unit that applies, to the script engine, hooks that impart a multipath execution function, based on the architecture information obtained through the analysis by the first analysis unit and the second analysis unit, namely a virtual program counter, which is a variable indicating the virtual machine instruction (VM instruction) to be executed next, a conditional branch flag, which is an area holding a flag indicating whether a branch is taken at a conditional branch, and a branch VM instruction, which is a VM instruction that causes a branch. The first analysis unit analyzes, using differential execution analysis, a plurality of execution traces acquired under varied execution conditions and obtains the virtual program counter and the conditional branch flag. The first analysis unit includes: a first acquisition unit that acquires the plurality of execution traces under varied execution conditions; a first detection unit that clusters the execution traces and detects the boundaries of each VM instruction; a second detection unit that analyzes the plurality of execution traces using differential execution analysis focused on the number of memory reads together with the VM instruction boundaries detected by the first detection unit, and detects the virtual program counter; a third detection unit that analyzes the binary of the script engine based on the VM instruction boundaries detected by the first detection unit and detects the dispatcher; and a fourth detection unit that analyzes the plurality of execution traces using differential execution analysis focused on the number of memory reads and detects the conditional branch flag.
According to the present invention, a multipath execution function that takes detailed architecture such as conditional branches into account can be imparted even to a threaded-code script engine, without requiring individual design and implementation and without prior architecture information.
FIG. 1 is a diagram for explaining an example of the configuration of a threaded-code script engine.
FIG. 2 is a diagram showing pseudocode of the threaded-code VM of a script engine.
FIG. 3 is a diagram illustrating an example of the configuration of the analysis function imparting device according to the embodiment.
FIG. 4 is a diagram showing an example of a test script (first test script) used for virtual program counter detection.
FIG. 5 is a diagram showing an example of a test script (second test script) used for branch VM instruction detection.
FIG. 6 is a diagram showing an example of an execution trace.
FIG. 7 is a diagram showing an example of a VM execution trace.
FIG. 8 is a diagram explaining the processing of the VM instruction boundary detection unit.
FIG. 9 is a diagram explaining the processing of the virtual program counter detection unit.
FIG. 10 is a diagram explaining the processing of the dispatcher detection unit.
FIG. 11 is a diagram explaining the processing of the branch VM instruction detection unit.
FIG. 12 is a flowchart showing the procedure of the analysis function imparting process according to the embodiment.
FIG. 13 is a flowchart showing the procedure of the execution trace acquisition process shown in FIG. 12.
FIG. 14 is a flowchart showing the procedure of the VM instruction boundary detection process shown in FIG. 12.
FIG. 15 is a flowchart showing the procedure of the virtual program counter detection process shown in FIG. 12.
FIG. 16 is a flowchart showing the procedure of the dispatcher detection process shown in FIG. 12.
FIG. 17 is a flowchart showing the procedure of the conditional branch flag detection process shown in FIG. 12.
FIG. 18 is a flowchart showing the procedure of the VM execution trace acquisition process shown in FIG. 12.
FIG. 19 is a flowchart showing the procedure of the branch VM instruction detection process shown in FIG. 12.
FIG. 20 is a flowchart showing the procedure of the analysis function imparting process shown in FIG. 12.
FIG. 21 is a diagram showing an example of a computer that implements the analysis function imparting device by executing a program.
FIG. 22 is a diagram showing a code fragment illustrating an example of anti-analysis.
Embodiments of the analysis function imparting device, analysis function imparting program, and analysis function imparting method according to the present application are described in detail below with reference to the drawings. The present invention is not limited to the embodiments described below.
[Embodiment]
The analysis function imparting device according to the embodiment is applicable to threaded-code script engines. By analyzing the binary of a script engine using test scripts, the device detects, in order, the boundaries of VM instructions, the virtual program counter (VPC), which is a variable indicating the VM instruction to be executed next, the dispatcher, the conditional branch flag, and the branch VM instructions, which are VM instructions that cause a branch.
All of these are components of the script engine and constitute information about its architecture. The configuration of a typical script engine and the roles of these components are described with reference to FIGS. 1 and 2.
FIG. 1 is a diagram for explaining an example of the configuration of a threaded-code script engine. As shown in FIG. 1, the script engine 100 has a bytecode compiler 102 and a virtual machine (VM) 103. The bytecode compiler 102 has a syntax analysis unit 104 and a bytecode generation unit 105. The VM 103 has a code cache unit 106, a decoding unit 107, a pointer cache unit 108, and a plurality of pairs 109-1 to 109-3, each consisting of a VM instruction handler unit and a dispatcher unit. The script engine 100 accepts a script as input.
The syntax analysis unit 104 receives the script as input, generates an abstract syntax tree (AST) through lexical analysis and syntax analysis, and outputs it to the bytecode generation unit 105. The bytecode generation unit 105 receives the AST as input, converts it into bytecode, and stores the bytecode in the code cache unit 106.
The decoding unit 107 reads the code from the code cache unit 106 all at once and decodes all of it. The decoding unit 107 converts each code into a pointer and stores the pointers in the pointer cache unit 108. The distributed pairs 109-1 to 109-3 of a VM instruction handler unit and a dispatcher unit execute the programs corresponding to the VM instructions. The content written in the script is executed by referring to the pointers in the pointer cache unit 108, executing the VM instruction for each pointer in turn, and dispatching to the next VM instruction.
The roles of the script engine's components are described with reference to FIG. 2. FIG. 2 is a diagram showing pseudocode of the threaded-code VM of a script engine. As shown in FIG. 2, the pseudocode first initializes the VPC (line 1). It then obtains from the pointer cache the pointer that the VPC points to, as the pointer of the VM instruction handler to be executed next (line 2). A goto statement dispatches to that VM instruction handler (line 3), and the handler that was dispatched to is executed (lines 5, 9, and 13). A dispatcher is located at the tail of each VM instruction handler; it obtains the pointer of the VM instruction handler to be executed next and dispatches to it (lines 6, 7, 10, 11, 14, and 15).
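The structure described for FIG. 2 can be sketched roughly as follows. This is a minimal illustration, not the pseudocode of FIG. 2 itself: the opcode names, the `run` function, and the stack-based design are all assumptions made for the example. Python has no goto, so each handler returns the next handler pointer and a small loop performs the jump on its behalf.

```python
# Toy threaded-code VM (hypothetical opcodes PUSH/ADD/PRINT/HALT).
def run(bytecode):
    stack, out = [], []
    state = {"vpc": 0}  # virtual program counter

    def dispatch():
        # Dispatcher at the tail of every handler: fetch the handler
        # pointer that the VPC points to, then advance the VPC.
        handler, operand = pointer_cache[state["vpc"]]
        state["vpc"] += 1
        return handler, operand

    def op_push(operand):
        stack.append(operand)
        return dispatch()

    def op_add(_):
        b, a = stack.pop(), stack.pop()
        stack.append(a + b)
        return dispatch()

    def op_print(_):
        out.append(stack.pop())
        return dispatch()

    def op_halt(_):
        return None  # no further dispatch

    handlers = {"PUSH": op_push, "ADD": op_add,
                "PRINT": op_print, "HALT": op_halt}
    # Decode step: convert all bytecode into handler pointers up front.
    pointer_cache = [(handlers[op], arg) for op, arg in bytecode]

    nxt = dispatch()  # initial dispatch
    while nxt is not None:  # stands in for the goto of FIG. 2
        handler, operand = nxt
        nxt = handler(operand)
    return out
```

Note that there is no central decode loop: the only shared logic is the dispatcher code repeated at the end of each handler, which is what distinguishes this design from a decode-and-dispatch engine.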
A branch VM instruction is a VM instruction that causes a branch within a script, and the conditional branch flag is an area that holds a flag indicating whether a branch is taken at a conditional branch.
[Analysis function imparting device]
First, the analysis function imparting device 10 according to the present embodiment acquires, from the script engine binary, an execution trace consisting of a branch trace and a memory access trace by hooking branch instructions and memory operation instructions. Here, the branch trace records the branches that were executed, and the memory access trace records the memory reads and writes that were executed.
The analysis function imparting device 10 then detects the boundaries of each VM instruction. That is, given that there are a plurality of distributed pairs of a VM instruction handler unit and a dispatcher unit, it detects where each pair starts and ends. To do so, the analysis function imparting device 10 clusters the execution trace and detects clusters whose execution count is equal to or greater than a threshold as VM instructions. It detects the start point and end point of the contiguous instruction sequence constituting each VM instruction as the boundaries. The VM instruction boundaries detected here are used in VPC detection and dispatcher detection.
The analysis function imparting device 10 also analyzes the execution traces and detects the VPC. To detect the VPC, it applies differential execution analysis focused on the number of memory reads.
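One way to picture differential execution analysis on read counts is sketched below. The passage does not state how the execution conditions are varied, so the sketch assumes, hypothetically, that each test script repeats a statement n times and that the VPC's memory location is read roughly once per executed VM instruction, so its read count scales with n. The trace format of `(kind, address)` pairs, the function names, and the tolerance are likewise assumptions.

```python
from collections import Counter

def read_counts(trace):
    """Count reads per address in a memory access trace of (kind, address) pairs."""
    return Counter(addr for kind, addr in trace if kind == "R")

def vpc_candidates(traces_by_n, tolerance=0.2):
    """Return addresses whose read count grows roughly linearly with n.

    traces_by_n maps a workload size n (hypothetical: how many times the
    test script repeats a statement) to the trace recorded for that run.
    """
    sizes = sorted(traces_by_n)
    counts = {n: read_counts(traces_by_n[n]) for n in sizes}
    base = sizes[0]
    candidates = []
    for addr, base_reads in counts[base].items():
        scales = True
        for n in sizes[1:]:
            expected = base_reads * n / base
            if abs(counts[n][addr] - expected) > tolerance * expected:
                scales = False
                break
        if scales:
            candidates.append(addr)
    return sorted(candidates)
```

Addresses read a fixed number of times regardless of n, and addresses that are only written, drop out of the candidate list. In a real engine several locations would likely scale with n, so further filtering (for example with the VM instruction boundaries) would still be needed.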
Furthermore, the analysis function imparting device 10 analyzes the binary of the script engine and detects the dispatcher. As a premise, a dispatcher is implemented as a reference to the pointer cache followed by a jump to the pointer of the next VM instruction handler. Dispatchers are distributed, one at the tail of each VM instruction handler, and their code is generally highly similar. The analysis function imparting device 10 detects the dispatcher by searching, with a predetermined method, for highly similar code located at the tails of the VM instruction handlers.
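The search for similar code at handler tails could look like the following minimal sketch. The "predetermined method" is not specified in this passage, so the exact-match longest common suffix used here is only one possibility, and it assumes each handler has already been disassembled into a list of normalized instruction strings (the `common_tail` name and the instruction text are illustrative).

```python
def common_tail(handlers):
    """Return the longest instruction sequence shared by the tail of every handler.

    handlers: non-empty list of instruction lists, one per VM instruction handler.
    """
    shortest = min(len(h) for h in handlers)
    tail = []
    for i in range(1, shortest + 1):
        column = {h[-i] for h in handlers}  # i-th instruction from the end
        if len(column) != 1:
            break
        tail.append(column.pop())
    return list(reversed(tail))
```

A practical detector would probably tolerate register renaming and small differences between dispatcher copies rather than require identical text.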
The analysis function imparting device 10 then analyzes the execution traces and detects the conditional branch flag. To detect the conditional branch flag, it applies differential execution analysis focused on memory reads.
Next, the analysis function imparting device 10 acquires a VM execution trace from the script engine binary by monitoring the VPC and the pointers of the VM instruction handlers to which the dispatcher dispatches. Here, the VM execution trace records the pointers of the executed VM instruction handlers and the VPC.
The analysis function imparting device 10 analyzes this VM execution trace and detects branch VM instructions. To do so, it first executes a large number of test scripts and acquires VM execution traces. It then associates each pointer to a VM instruction with that VM instruction and virtually assigns a VM opcode to each as an identifier. For each VM opcode, it collects the amount by which the VPC changes across the execution of that opcode. For a VM opcode other than a branch VM instruction, the change in the VPC is nearly constant. For a VM opcode that is a branch VM instruction, the change in the VPC varies with the branch destination. The analysis function imparting device 10 evaluates, by variance, the spread of the VPC change for each VM opcode and detects opcodes whose variance is equal to or greater than a certain threshold as branch VM instructions.
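The variance test described above can be sketched as follows, assuming a VM execution trace given as a list of `(handler_pointer, vpc)` pairs recorded at each dispatch. The function name, the use of population variance, and the threshold value are assumptions for illustration.

```python
from collections import defaultdict
from statistics import pvariance

def detect_branch_opcodes(vm_trace, threshold=1.0):
    """Return (branch_opcodes, opcode_of) for a VM execution trace.

    vm_trace: list of (handler_pointer, vpc) pairs, one per dispatched
    VM instruction, in execution order.
    """
    opcode_of = {}              # handler pointer -> virtual VM opcode
    deltas = defaultdict(list)  # VM opcode -> observed VPC changes
    for (ptr, vpc), (_, next_vpc) in zip(vm_trace, vm_trace[1:]):
        op = opcode_of.setdefault(ptr, len(opcode_of))
        deltas[op].append(next_vpc - vpc)
    branch_opcodes = sorted(
        op for op, ds in deltas.items()
        if len(ds) > 1 and pvariance(ds) >= threshold
    )
    return branch_opcodes, opcode_of
```

An opcode observed only once has no measurable spread and is skipped, which is one reason a large number of test scripts is useful.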
The analysis function imparting device 10 then applies hooks to the binary of the script engine based on the VPC, the branch VM instructions, and the conditional branch flag obtained so far. Through these hooks, the analysis function imparting device 10 monitors what the VPC points to and, when it is a branch VM instruction, forks the execution state. It executes one execution state as is and executes the other after rewriting the conditional branch flag. As a result, both execution paths of the conditional branch are executed. In this way, the analysis function imparting device 10 retrofits a multipath execution function onto the script engine.
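The effect of the hook can be illustrated on a toy interpreter. The sketch below is not the hooking mechanism applied to a real engine binary: the instruction set (`TEST_LT`, `JMP_IF`, `LOG`, `HALT`), the state layout, and the `forced` marker that keeps a forked state from forking again at the same branch are all assumptions made for the example.

```python
import copy

def multipath_run(program, env):
    """Execute both sides of every conditional branch in a toy bytecode program.

    Returns the log produced along each explored path. There is no loop
    bound, so programs with backward branches may not terminate.
    """
    finished = []
    worklist = [{"vpc": 0, "flag": False, "log": [], "forced": False}]
    while worklist:
        st = worklist.pop()
        while True:
            op, arg = program[st["vpc"]]
            if op == "JMP_IF":
                if not st["forced"]:
                    # Hook: fork the state and rewrite the flag in the copy.
                    other = copy.deepcopy(st)
                    other["flag"] = not st["flag"]
                    other["forced"] = True
                    worklist.append(other)
                st["forced"] = False
                st["vpc"] = arg if st["flag"] else st["vpc"] + 1
            elif op == "TEST_LT":
                name, bound = arg
                st["flag"] = env[name] < bound
                st["vpc"] += 1
            elif op == "LOG":
                st["log"].append(arg)
                st["vpc"] += 1
            elif op == "HALT":
                finished.append(st["log"])
                break
    return finished
```

For a program that tests `cores < 2` and exits early when the flag is set, a run with `cores = 1` still reports the log of the fall-through path, because the forked state continues with the flag inverted.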
[Configuration of the analysis function imparting device]
Next, the configuration of the analysis function imparting device 10 according to the embodiment is described in detail with reference to FIG. 3. FIG. 3 is a diagram illustrating an example of the configuration of the analysis function imparting device according to the embodiment.
As shown in FIG. 3, the analysis function imparting device 10 has an input unit 11, a control unit 12, a storage unit 13, and an output unit 14. The analysis function imparting device 10 accepts test scripts and a script engine binary as input.
The input unit 11 is composed of input devices such as a keyboard and a mouse; it accepts information input from outside and passes it to the control unit 12. The input unit 11 also has a communication interface for exchanging various information with other devices connected by wire or via a network, and accepts information transmitted from those devices. The input unit 11 accepts the test scripts and the script engine binary and outputs them to the control unit 12. A test script is a script given as input when the script engine is dynamically analyzed to acquire execution traces and VM execution traces. Details of the test scripts are described later. The script engine binary is the executable file that constitutes the script engine. A script engine binary may consist of multiple executable files.
The control unit 12 has an internal memory that stores programs defining various processing procedures and the required data, and executes various processes using them. For example, the control unit 12 is an electronic circuit such as a CPU (Central Processing Unit) or an MPU (Micro Processing Unit). The control unit 12 has a virtual machine analysis unit 121 (first analysis unit), an instruction set architecture analysis unit 122 (second analysis unit), and an analysis function imparting unit 123 (imparting unit).
 仮想機械解析部121は、スクリプトエンジンのVMを解析する。仮想機械解析部121は、実行時の条件を変えて複数の実行トレースを取得し、差分実行解析を用いて複数の実行トレースを解析し、VPC及び条件分岐フラグを取得する。また、仮想機械解析部121は、スクリプトエンジンバイナリを静的解析して、VM命令境界およびディスパッチャを取得する。仮想機械解析部121は、実行トレース取得部1211(第一の取得部)、VM命令境界検出部1212(第一の検出部)、仮想プログラムカウンタ検出部1213(第二の検出部)、ディスパッチャ検出部1214(第三の検出部)及び条件分岐フラグ検出部1215(第四の検出部)を有する。 The virtual machine analysis unit 121 analyzes the VM of the script engine. The virtual machine analysis unit 121 acquires a plurality of execution traces by changing execution conditions, analyzes the plurality of execution traces using differential execution analysis, and acquires VPCs and conditional branch flags. Also, the virtual machine analysis unit 121 statically analyzes the script engine binary to acquire VM instruction boundaries and dispatchers. The virtual machine analysis unit 121 includes an execution trace acquisition unit 1211 (first acquisition unit), a VM instruction boundary detection unit 1212 (first detection unit), a virtual program counter detection unit 1213 (second detection unit), and a dispatcher detection unit. It has a unit 1214 (third detection unit) and a conditional branch flag detection unit 1215 (fourth detection unit).
The execution trace acquisition unit 1211 accepts a test script and a script engine binary as input. The execution trace acquisition unit 1211 acquires an execution trace by running the test script while monitoring the execution of the script engine binary.
An execution trace consists of a branch trace and a memory access trace. The branch trace records, for each branch executed, the type of branch instruction, the branch source address, and the branch destination address. The memory access trace records the type of memory operation and the memory address operated on. Branch traces and memory access traces are known to be obtainable through instruction hooks. The execution trace acquired by the execution trace acquisition unit 1211 is stored in the execution trace DB 131.
The VM instruction boundary detection unit 1212 clusters the execution trace and detects the boundary of each VM instruction. Specifically, it clusters the execution trace and detects clusters whose execution count is at or above a threshold as VM instructions. The clustering finds contiguous code regions that are executed multiple times. For example, this may be done by grouping executed instructions that lie close together in the code, by searching for common subsequences of executed code blocks, or by other methods. The analysis function imparting device 10 detects, as the boundaries, the start point and end point of the contiguous instruction sequence that makes up each detected VM instruction. The VM instruction boundaries detected here are used in VPC detection and dispatcher detection.
The virtual program counter detection unit 1213 retrieves the execution traces for the first test script stored in the execution trace DB 131, analyzes them, and detects the VPC. The virtual program counter detection unit 1213 analyzes a plurality of execution traces using differential execution analysis focused on the number of memory reads, together with the VM instruction boundaries detected by the VM instruction boundary detection unit 1212. It exploits the fact that a read from the memory holding the VPC always occurs after each VM instruction is executed, and detects the VPC by finding the memory location that is read.
For this reason, the virtual program counter detection unit 1213 uses differential execution analysis focused on the number of memory reads to detect the VPC. It compares the execution traces acquired with a plurality of test scripts and finds memory whose read count changes in proportion to both the number of loop iterations and the number of repeated statements. Then, referring to the VM instruction boundaries detected by the VM instruction boundary detection unit 1212, it narrows the candidates down to memory whose read value always points to the start of a VM instruction. The virtual program counter detection unit 1213 detects this memory as the VPC.
The dispatcher detection unit 1214 cuts out each VM instruction from the script engine binary based on the VM instruction boundaries detected by the VM instruction boundary detection unit 1212, and detects the portion that is highly similar across the VM instructions as the dispatcher. The highly similar portion may be found using, for example, a sequence alignment algorithm, or by other methods.
The conditional branch flag detection unit 1215 retrieves the execution traces for the second test script stored in the execution trace DB 131, analyzes them, and finds the conditional branch flag. It analyzes a plurality of execution traces using differential execution analysis focused on the number of memory reads. The conditional branch flag detection unit 1215 has conditional branches executed in various patterns and compares the resulting pattern of memory changes with the pattern of conditional branches in the test script, thereby detecting the memory that stores the conditional branch flag.
The instruction set architecture analysis unit 122 analyzes the instruction set architecture, that is, the system of instructions of the VM. The instruction set architecture analysis unit 122 has a VM execution trace acquisition unit 1221 (second acquisition unit) and a branch VM instruction detection unit 1222 (fifth detection unit).
Like the execution trace acquisition unit 1211, the VM execution trace acquisition unit 1221 accepts a test script and a script engine binary as input. By running the test script while monitoring the execution of the script engine binary, it acquires a VM execution trace, which is a trace of what was executed on the VM.
A VM execution trace consists of the VPC and the VM opcode for each executed VM instruction. The VPC can be recorded by monitoring the VPC memory detected by the virtual program counter detection unit 1213. The VM opcode here is an identifier virtually assigned to each pairing of a pointer to a VM instruction with that VM instruction. The VM execution trace acquired by the VM execution trace acquisition unit 1221 is stored in the VM execution trace DB 133.
The branch VM instruction detection unit 1222 retrieves the VM execution traces stored in the VM execution trace DB 133, analyzes them, and detects branch VM instructions. It relies on the fact that the spread of VPC values differs between branch VM instructions and other VM instructions: with a chosen threshold, VM instructions whose VPC values vary more widely are detected as branch VM instructions. Concretely, the branch VM instruction detection unit 1222 detects branch VM instructions from the variation in the change of the virtual program counter for each VM opcode in the VM execution trace.
Based on the architecture information obtained through the analyses by the virtual machine analysis unit 121 and the instruction set architecture analysis unit 122, the analysis function imparting unit 123 applies hooks to the script engine that give it a multipath execution function. The analysis function imparting unit 123 applies the hooks using the obtained VPC, branch VM instructions, and conditional branch flag. The hook monitors the VPC to check the VM opcode and, if it is the VM opcode of a branch VM instruction, forks the execution state. One execution state then continues unchanged, while the other continues with the conditional branch flag rewritten. This is how the hook gives the script engine a multipath execution function.
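The forking behavior of this hook can be illustrated on a toy interpreter. The following is only a minimal sketch under stated assumptions: the instruction names (`test`, `branch_if`, `log`, `jump`), the state layout, and `run_multipath` are hypothetical, the toy program contains no loops, and a real implementation would hook a native script engine binary rather than a Python interpreter.

```python
def run_multipath(program):
    """Explore both outcomes of every conditional branch of a toy VM program.

    Each state carries its own VPC, condition flag, and log. "skip_fork"
    marks a state created by the hook so that it does not fork again at
    the branch where it was created.
    """
    finished = []
    worklist = [{"vpc": 0, "flag": False, "log": [], "skip_fork": False}]
    while worklist:
        st = worklist.pop()
        while st["vpc"] < len(program):
            op, arg = program[st["vpc"]]
            if op == "test":              # sets the conditional branch flag
                st["flag"] = arg
            elif op == "log":             # observable behavior
                st["log"].append(arg)
            elif op == "branch_if":       # branch VM instruction: hook point
                if not st["skip_fork"]:
                    # Fork: the copy re-runs this branch with the flag rewritten.
                    worklist.append({"vpc": st["vpc"], "flag": not st["flag"],
                                     "log": list(st["log"]), "skip_fork": True})
                st["skip_fork"] = False
                if st["flag"]:
                    st["vpc"] = arg
                    continue
            elif op == "jump":
                st["vpc"] = arg
                continue
            st["vpc"] += 1
        finished.append(st["log"])
    return finished


program = [
    ("test", False),      # 0: condition evaluates to false in a normal run
    ("branch_if", 4),     # 1: conditional branch
    ("log", "benign"),    # 2
    ("jump", 5),          # 3
    ("log", "hidden"),    # 4: only reachable when the flag is rewritten
    ("log", "end"),       # 5
]
paths = run_multipath(program)
```

In this sketch the original state follows the path the script would normally take, and the forked state reveals the path guarded by the condition.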
The storage unit 13 is implemented by a semiconductor memory device such as a RAM (Random Access Memory) or a flash memory, or by a storage device such as a hard disk or an optical disk. It stores the processing program that operates the analysis function imparting device 10 and the data used while the processing program runs. The storage unit 13 has an execution trace database (DB) 131, a VM execution trace DB 133, and an architecture information DB 132.
The execution trace DB 131 and the VM execution trace DB 133 store the execution traces and VM execution traces acquired by the execution trace acquisition unit 1211 and the VM execution trace acquisition unit 1221, respectively. The execution trace DB 131 and the VM execution trace DB 133 are managed by the analysis function imparting device 10. They may of course be managed by another device (such as a server). In that case, the execution trace acquisition unit 1211 and the VM execution trace acquisition unit 1221 output the acquired execution traces and VM execution traces, via the communication interface of the output unit 14, to the server or other device that manages the execution trace DB 131 and the VM execution trace DB 133, and have them stored there.
The output unit 14 is, for example, a liquid crystal display or a printer, and outputs various information including information about the analysis function imparting device 10. The output unit 14 may also be an interface that handles the input and output of various data to and from an external device, and may output various information to the external device.
[Configuration of test scripts]
Test scripts are described next. A test script is a script supplied as input when the script engine is dynamically analyzed. It is used to capture differences in script engine behavior, in terms of the number of branch instructions executed and the number of memory reads and writes, that arise when test scripts with different repetition counts are run. Test scripts are prepared manually before the analysis. Writing them requires knowledge of the specification of the target scripting language.
FIG. 4 is a diagram showing an example of a test script used for VPC detection (first test script). The first test script uses a loop (line 2). In the first test script, the execution conditions are changed by increasing or decreasing the number of iterations (line 2) and the number of repeated statements (lines 3 to 5), which produces the differences to be analyzed.
FIG. 5 is a diagram showing an example of a test script used for branch VM instruction detection (second test script). The second test script uses multiple conditional branches (lines 4 to 8). The branch conditions are controlled so that the branches are taken or not taken in a specific ordered pattern (lines 1 and 5). In the second test script, the number of conditional branches and the ordered pattern of taken and not-taken branches are varied to produce differences.
[Configuration of execution trace]
Next, the execution trace is described. As stated above, an execution trace consists of a branch trace and a memory access trace. FIG. 6 is a diagram showing an example of an execution trace. The structure of the execution trace is described below with reference to FIG. 6.
An execution trace has an element called trace. The trace element indicates whether the log line belongs to the branch trace or to the memory access trace.
A branch trace log line has, for example, the format shown in lines 1 to 10 of FIG. 6 and consists of three elements: type, src, and dst. The type element indicates whether the executed branch was caused by a call instruction, a jmp instruction, or a ret instruction. The src element indicates the branch source address, and dst indicates the branch destination address.
A memory access trace log line has, for example, the format shown in lines 11 to 13 of FIG. 6 and consists of three elements: type, target, and value. The type element indicates whether the memory access is a read or a write. The target element indicates the memory address being accessed. The value element holds the value resulting from the memory access.
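As an illustration of these two record types, the following minimal sketch parses log lines into typed records. The concrete textual format of FIG. 6 is not reproduced here, so the `key=value` layout, the `branch`/`memory` tags, and the names `BranchRecord`, `MemoryRecord`, and `parse_line` are assumptions made only for this example.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class BranchRecord:
    type: str    # "call", "jmp", or "ret"
    src: int     # branch source address
    dst: int     # branch destination address


@dataclass(frozen=True)
class MemoryRecord:
    type: str    # "read" or "write"
    target: int  # accessed memory address
    value: int   # value resulting from the access


def parse_line(line: str) -> Union[BranchRecord, MemoryRecord]:
    """Parse one hypothetical 'key=value ...' log line into a record."""
    fields = dict(item.split("=", 1) for item in line.split())
    if fields["trace"] == "branch":
        return BranchRecord(fields["type"],
                            int(fields["src"], 16), int(fields["dst"], 16))
    if fields["trace"] == "memory":
        return MemoryRecord(fields["type"],
                            int(fields["target"], 16), int(fields["value"], 16))
    raise ValueError(f"unknown trace kind: {fields['trace']}")


branch = parse_line("trace=branch type=call src=0x401000 dst=0x402000")
access = parse_line("trace=memory type=read target=0x7ffe0010 value=0x402000")
```

Keeping the two record kinds as separate types lets later analyses select only the branch records or only the memory reads they need.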
[Configuration of VM execution trace]
Next, the VM execution trace is described. As stated above, a VM execution trace records VM opcodes and VPCs. FIG. 7 is a diagram showing an example of a VM execution trace, specifically an excerpt of one. The structure of the VM execution trace is described below with reference to FIG. 7.
A VM execution trace log line has, for example, the format shown in FIG. 7 and consists of two elements: vpc and pointer. The vpc element indicates the value of the VPC. The pointer element indicates the value of the pointer, obtained from the pointer cache, that points to the head of the VM instruction handler to be executed.
[Processing of VM instruction boundary detection unit]
Next, the processing of the VM instruction boundary detection unit 1212 is described. FIG. 8 is a diagram explaining the processing of the VM instruction boundary detection unit 1212.
The VM instruction boundary detection unit 1212 detects the boundary of each VM instruction. It detects VM instructions and their boundaries so that a script multipath execution function can be imparted to a threaded-code VM. Specifically, the VM instruction boundary detection unit 1212 retrieves an execution trace from the execution trace DB 131. Then, as shown in FIG. 8, it clusters the execution trace by a predetermined method and detects clusters whose execution count is at or above a threshold as VM instructions (for example, VM instruction handlers 1 to 3). The VM instruction boundary detection unit 1212 detects, as the boundaries, the start point and end point of the contiguous instruction sequence that makes up each VM instruction.
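One of the clustering options mentioned earlier, grouping executed instructions that lie close together in the code, can be sketched as follows. This is an illustrative simplification, not the method of the specification: the input is assumed to be a flat list of executed instruction addresses, and `min_count` and `max_gap` are hypothetical parameters.

```python
from collections import Counter


def detect_vm_instruction_boundaries(executed, min_count=3, max_gap=16):
    """Return (start, end) address pairs of frequently executed code regions.

    executed:  sequence of executed instruction addresses
    min_count: execution-count threshold for an address to be kept
    max_gap:   largest address distance still merged into one region
    """
    counts = Counter(executed)
    hot = sorted(addr for addr, n in counts.items() if n >= min_count)
    regions = []
    for addr in hot:
        if regions and addr - regions[-1][1] <= max_gap:
            regions[-1][1] = addr          # extend the current region
        else:
            regions.append([addr, addr])   # open a new region
    return [(start, end) for start, end in regions]


# Two handlers executed repeatedly, plus start-up code executed once.
executed = [0x100, 0x104, 0x108] * 3 + [0x200, 0x204] * 4 + [0x300]
boundaries = detect_vm_instruction_boundaries(executed)
```

Each returned pair gives the start point and end point of one candidate VM instruction handler; code executed only once, such as initialization, falls below the threshold.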
[Processing of virtual program counter detection unit]
Next, the processing of the virtual program counter detection unit 1213 is described. The virtual program counter detection unit 1213 detects the VPC and the pointer cache so that a script multipath execution function can be imparted to a threaded-code VM. The virtual program counter is detected by analyzing the memory access trace log of the acquired execution traces. The virtual program counter detection unit 1213 uses differential execution analysis focused on the number of memory reads. FIG. 9 is a diagram explaining the processing of the virtual program counter detection unit 1213.
The virtual program counter detection unit 1213 retrieves one execution trace of the first test script from the execution trace DB 131. The number of VPC reads is proportional to the number of loop iterations in the test script and to the number of statements inside the loop. With N iterations and M repeated statements, roughly MN reads of the VPC occur. The virtual program counter detection unit 1213 therefore looks at the execution traces of first test scripts in which N and M have been increased to 2N and 2M, and to 3N and 3M, and extracts the memory whose read count grows to 4MN and 9MN, respectively. Specifically, as shown in FIG. 9, it extracts memory areas that are read or written on every VM instruction execution and whose value increases monotonically ((1) in FIG. 9).
The virtual program counter detection unit 1213 then detects as the VPC the memory whose read value always points to the start of a VM instruction. Specifically, it checks what each VPC candidate points to against the addresses of the VM instruction handlers and narrows the candidates down to the memory areas that match ((2) in FIG. 9).
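The two-stage narrowing can be sketched as below. This is a simplified illustration under assumptions: each trace is reduced to a mapping from memory address to the list of values read there, scale k denotes the run with kN iterations and kM statements, a relative `tolerance` absorbs the "roughly MN" overhead, and the values read are compared directly with handler start addresses. The function name and data layout are hypothetical.

```python
def detect_vpc(traces, handler_starts, tolerance=0.2):
    """Return addresses whose read counts scale as k*k and point at handlers.

    traces:         {k: {address: [values read at that address]}}
    handler_starts: set of VM instruction handler start addresses
    """
    base = traces[1]
    candidates = []
    for addr, base_values in base.items():
        n1 = len(base_values)
        # Stage 1: read count must grow roughly as k*k (MN, 4MN, 9MN, ...).
        scales = all(
            abs(len(reads.get(addr, [])) - k * k * n1) <= tolerance * k * k * n1
            for k, reads in traces.items()
        )
        if not scales:
            continue
        # Stage 2: every value read must be the start of a VM instruction.
        values = [v for reads in traces.values() for v in reads.get(addr, [])]
        if all(v in handler_starts for v in values):
            candidates.append(addr)
    return candidates


handlers = {0x1000, 0x1040, 0x1080}
cycle = [0x1000, 0x1040, 0x1080]
traces = {
    k: {
        0x7000: cycle * (2 * k * k),   # 6, 24, 54 reads of handler addresses
        0x8000: cycle * (2 * k),       # grows only linearly
        0x9000: [0x42] * (6 * k * k),  # right growth, wrong values
    }
    for k in (1, 2, 3)
}
vpc_candidates = detect_vpc(traces, handlers)
```

Only the address that passes both the growth test and the handler-address test remains as a VPC candidate.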
[Processing of dispatcher detection unit]
Next, the processing of the dispatcher detection unit 1214 is described. The dispatcher detection unit 1214 detects the dispatcher by analyzing the script engine binary with a predetermined method. FIG. 10 is a diagram explaining the processing of the dispatcher detection unit 1214.
The dispatcher detection unit 1214 detects the dispatcher so that a script multipath execution function can be imparted to a threaded-code VM. It cuts out each VM instruction from the script engine binary based on the VM instruction boundaries detected by the VM instruction boundary detection unit 1212. Then, on the assumption that dispatcher code is highly similar from one VM instruction to another ((1) in FIG. 10), it computes the code similarity between the VM instructions and detects the portion that is highly similar across all VM instructions as the dispatcher. In this way, the dispatcher detection unit 1214 can detect as the dispatcher the code that is commonly executed in the latter part of the VM instructions ((1) in FIG. 10).
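As a rough illustration of finding the code shared at the end of every handler, the sketch below takes the longest common suffix of the handlers' instruction sequences. This exact-match suffix is a deliberately simpler stand-in for the sequence alignment mentioned earlier, and the normalized instruction strings and the name `common_suffix` are assumptions made for the example.

```python
def common_suffix(handlers):
    """Return the longest instruction sequence that ends every handler."""
    suffix = []
    # Walk all handlers backwards in lockstep until they disagree.
    for column in zip(*(reversed(h) for h in handlers)):
        if len(set(column)) != 1:
            break
        suffix.append(column[0])
    return suffix[::-1]


handlers = [
    ["add rcx, rdx", "mov rax, [rbx]", "add rbx, 8", "jmp rax"],
    ["sub rcx, rdx", "push rcx", "mov rax, [rbx]", "add rbx, 8", "jmp rax"],
    ["cmp rcx, rdx", "mov rax, [rbx]", "add rbx, 8", "jmp rax"],
]
dispatcher = common_suffix(handlers)
```

An alignment-based similarity would additionally tolerate small differences, such as register allocation, that an exact comparison treats as a mismatch.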
[Processing of conditional branch flag detection unit]
Next, the processing of the conditional branch flag detection unit 1215 is described. The conditional branch flag detection unit 1215 detects the conditional branch flag by analyzing memory accesses.
The conditional branch flag detection unit 1215 uses the execution traces obtained with the second test script. It detects the conditional branch flag by analyzing them against the test script in two stages of narrowing. The conditional branch flag has two states: branch taken and branch not taken. The conditional branch flag is also expected to be read a number of times proportional to the number of conditional branches.
Accordingly, as the first stage of narrowing, the conditional branch flag detection unit 1215 extracts memory that is read a number of times proportional to the number of conditional branches. As the second stage, it extracts memory whose value at each read alternates between two values in a way that corresponds to the conditional branches of the test script.
For example, suppose the conditional branch flag holds X when the branch is taken and Y when it is not. In the second test script of FIG. 5, the ordered pattern of conditional branches is: taken, not taken, taken, taken, not taken. The conditional branch flag detection unit 1215 therefore extracts memory addresses whose values alternate between two values as X, Y, X, X, Y. It detects the conditional branch flag by repeating this while varying the number of branches.
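The second stage of this narrowing can be sketched as a pattern match between the values read at a memory address and the taken/not-taken sequence of the test script. This is a minimal sketch under assumptions: the first stage is simplified to requiring exactly one read per conditional branch, and the function names and data layout are hypothetical.

```python
def matches_branch_pattern(values, pattern):
    """True if values use exactly two symbols that follow the taken pattern."""
    if len(values) != len(pattern):
        return False
    mapping = {}
    for value, taken in zip(values, pattern):
        # The same branch outcome must always be encoded by the same value.
        if mapping.setdefault(taken, value) != value:
            return False
    return len(mapping) == 2 and mapping[True] != mapping[False]


def detect_branch_flag(reads, pattern):
    """reads: {address: [values read]}; pattern: list of bools (True = taken)."""
    return [addr for addr, values in reads.items()
            if matches_branch_pattern(values, pattern)]


# taken, not taken, taken, taken, not taken
pattern = [True, False, True, True, False]
reads = {
    0x10: [1, 0, 1, 1, 0],   # follows X, Y, X, X, Y
    0x20: [5, 5, 5, 5, 5],   # constant
    0x30: [1, 0, 0, 1, 0],   # two values, wrong order
    0x40: [1, 0, 1],         # wrong number of reads
}
flag_candidates = detect_branch_flag(reads, pattern)
```

Repeating the check with test scripts that use different branch counts and patterns, as the text describes, would shrink the candidate set further.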
[Processing of branch VM instruction detection unit]
Next, the processing of the branch VM instruction detection unit 1222 is described. The branch VM instruction detection unit 1222 detects branch VM instructions by analyzing the log of the acquired VM execution traces. The test scripts used here only need to contain branch VM instructions, so any script that includes branching control syntax will do. For example, test scripts may be prepared by collecting them from the Internet or taking them from official documentation.
First, for each VM execution trace in the VM execution trace DB 133, the branch VM instruction detection unit 1222 associates each pointer to a VM instruction with that VM instruction and virtually assigns a VM opcode to each pairing as its identifier. FIG. 11 is a diagram explaining the processing of the branch VM instruction detection unit 1222.
When a VM instruction is a branch instruction, the amount by which the VPC advances depends on the branch destination. For any other instruction, the amount by which the VPC advances depends on the size of the VM instruction. Consequently, if pairs of VM opcode and pointer to the VM instruction are collected and the VPC advance is examined per opcode, a branch instruction shows variation in the VPC advance that depends on the branch destination.
The branch VM instruction detection unit 1222 therefore uses the variance to evaluate this variation. It computes the variance of the VPC change for each VM opcode and keeps only the VM opcodes whose variance is larger than a threshold. In this way, while associating pointers with VM instructions, the branch VM instruction detection unit 1222 detects a VM instruction whose VPC advance varies (VM instruction handler 3 in the example of FIG. 11) as a branch VM instruction ((1) in FIG. 11).
Let $O = \{o_0, o_1, \dots, o_n\}$ be the set of VPC advances for a given opcode, let $\bar{o}$ be the mean of the VPC advances (see equation (1)), and let $t$ be the threshold. Whether the opcode is a branch instruction is then determined from the variance $s$ (see equation (2)) as in equation (3). In this way, the branch VM instruction detection unit 1222 detects branch VM instructions.

$$\bar{o} = \frac{1}{n+1} \sum_{i=0}^{n} o_i \tag{1}$$

$$s = \frac{1}{n+1} \sum_{i=0}^{n} \left(o_i - \bar{o}\right)^2 \tag{2}$$

$$\begin{cases} s > t & \Rightarrow \text{branch VM instruction} \\ s \le t & \Rightarrow \text{not a branch VM instruction} \end{cases} \tag{3}$$
VM instructions other than branches show almost no variation, so the separation between branch VM instructions and other VM instructions is usually clear. The threshold can therefore be set, for example, by plotting the obtained variance values on a number line and choosing a value that separates the two groups that appear.
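The variance test can be sketched as follows. This is only an illustration under assumptions: the VM execution trace is taken to be a list of `(vpc, pointer)` pairs, a VM opcode is assigned to each distinct handler pointer in order of first appearance, the VPC advance of a record is the difference to the next record's VPC, and the population variance is used. The function name and the threshold value are hypothetical.

```python
from collections import defaultdict
from statistics import pvariance


def detect_branch_opcodes(vm_trace, threshold):
    """Return (set of branch opcodes, {handler pointer: opcode}).

    vm_trace: list of (vpc, pointer) pairs in execution order
    """
    opcode_of = {}                 # virtually assigned VM opcodes
    advances = defaultdict(list)   # opcode -> observed VPC advances
    for (vpc, pointer), (next_vpc, _) in zip(vm_trace, vm_trace[1:]):
        opcode = opcode_of.setdefault(pointer, len(opcode_of))
        advances[opcode].append(next_vpc - vpc)
    variances = {op: pvariance(values) for op, values in advances.items()}
    branches = {op for op, s in variances.items() if s > threshold}
    return branches, opcode_of


A, B, J = 0xA0, 0xB0, 0xC0   # handler pointers; J behaves like a jump
vm_trace = [
    (0, A), (8, B), (24, J),
    (0, A), (8, B), (24, J),
    (100, A), (108, J),
    (8, B), (24, A),
]
branches, opcode_of = detect_branch_opcodes(vm_trace, threshold=1.0)
```

Here the handlers at `A` and `B` always advance the VPC by a fixed amount and have zero variance, while the advance after `J` depends on the destination and exceeds the threshold.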
[Processing of analysis function imparting unit]
Next, the processing of the analysis function imparting unit 123 is described. The analysis function imparting unit 123 accepts as input the script engine binary and the hook points and tap points detected by the processing so far. The analysis function imparting unit 123 applies hooks to the script engine at the hook points.
When applying a hook, the analysis function imparting unit 123 inserts analysis code so that, when the hook fires, the log records the execution of the language element corresponding to the hook together with the tap-point memory as its argument. This analysis code can be generated easily once the hook points and tap points are known. As a result, the behavior of a script is logged when the script is executed, and the analysis function is thereby imparted.
The analysis function may be imparted through these hooks either by directly rewriting the script engine binary, or by rewriting the memory image once the binary has been executed and loaded into process memory.
[Processing procedure of analysis function imparting device]
Next, the procedure of the analysis function imparting process performed by the analysis function imparting device 10 is described. FIG. 12 is a flowchart showing the procedure of the analysis function imparting process according to the embodiment.
First, the input unit 11 receives a test script and a script engine binary as input (step S1).
The execution trace acquisition unit 1211 then performs the execution trace acquisition process, in which it runs the test script while monitoring the script engine binary and acquires a branch trace and a memory access trace (step S2). Next, the VM instruction boundary detection unit 1212 performs the VM instruction boundary detection process, in which it detects the VM instructions and their boundaries (step S3).
The virtual program counter detection unit 1213 performs the virtual program counter detection process, in which it retrieves and analyzes the execution traces for the first test script stored in the execution trace DB 131 and finds the VPC (step S4). The dispatcher detection unit 1214 performs the dispatcher detection process, in which it cuts out each VM instruction from the script engine binary and detects the portion that is highly similar across the VM instructions as the dispatcher (step S5). The conditional branch flag detection unit 1215 performs the conditional branch flag detection process, in which it retrieves and analyzes the execution traces for the second test script stored in the execution trace DB 131 and finds the conditional branch flag (step S6).
 VM実行トレース取得部1221は、テストスクリプト及びスクリプトエンジンバイナリを入力として受け付け、スクリプトエンジンバイナリの実行を監視しながら、テストスクリプトを実行することで、VM実行トレースを取得するVM実行トレース取得処理を行う(ステップS7)。分岐VM命令検出部1222は、VM実行トレースDB133に格納されたVM実行トレースを取り出して解析し、分岐VM命令を検出する分岐VM命令検出処理を行う(ステップS8)。 The VM execution trace acquisition unit 1221 receives a test script and a script engine binary as inputs, and performs VM execution trace acquisition processing for acquiring a VM execution trace by executing the test script while monitoring the execution of the script engine binary. (Step S7). The branch VM instruction detection unit 1222 extracts and analyzes the VM execution trace stored in the VM execution trace DB 133, and performs branch VM instruction detection processing for detecting the branch VM instruction (step S8).
 解析機能付与部123は、得られたVPC、分岐VM命令及び条件分岐フラグを用いてスクリプトエンジンにフックを施す解析機能付与処理を行う(ステップS9)。そして、出力部14は、マルチパス実行機能が付与されたスクリプトエンジンバイナリを出力する(ステップS10)。 The analysis function imparting unit 123 performs the analysis function imparting process of hooking the script engine using the obtained VPC, branch VM instruction and conditional branch flag (step S9). Then, the output unit 14 outputs the script engine binary provided with the multipath execution function (step S10).
[Procedure of execution trace acquisition processing]
Next, the flow of the execution trace acquisition processing shown in FIG. 13 will be described. FIG. 13 is a flowchart showing the procedure of the execution trace acquisition processing shown in FIG. 12.
First, the execution trace acquisition unit 1211 receives the test scripts and the script engine binary as input (step S11). The execution trace acquisition unit 1211 then applies hooks for acquiring a branch trace to the received script engine (step S12). The execution trace acquisition unit 1211 also applies hooks for acquiring a memory access trace to the received script engine (step S13).
In that state, the execution trace acquisition unit 1211 inputs a received test script to the script engine and has it executed (step S14), and stores the resulting execution trace in the execution trace DB 131 (step S15).
The execution trace acquisition unit 1211 determines whether all the input test scripts have been executed (step S16). If all the input test scripts have been executed (step S16: Yes), the execution trace acquisition unit 1211 ends the processing. If not all the input test scripts have been executed yet (step S16: No), the execution trace acquisition unit 1211 returns to the test script execution in step S14 and continues the processing.
[Procedure of VM instruction boundary detection processing]
Next, the flow of the VM instruction boundary detection processing shown in FIG. 12 will be described. FIG. 14 is a flowchart showing the procedure of the VM instruction boundary detection processing shown in FIG. 12.
First, the VM instruction boundary detection unit 1212 retrieves the execution traces from the execution trace DB 131 (step S21). The VM instruction boundary detection unit 1212 clusters the execution traces by a predetermined method (step S22). Any clustering method may be used.
The VM instruction boundary detection unit 1212 detects the clusters whose execution count is equal to or greater than a threshold as VM instructions (step S23). The VM instruction boundary detection unit 1212 then takes the start point and the end point of the continuous instruction sequence constituting each VM instruction as its boundaries (step S24). The VM instruction boundary detection unit 1212 outputs the VM instruction boundaries as a return value (step S25) and ends the VM instruction boundary detection processing.
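As a non-limiting illustration of steps S22 to S25, the following Python sketch counts how often each native instruction address appears in a branch trace, keeps the addresses executed at least `threshold` times, and merges neighboring addresses into regions whose first and last addresses serve as boundaries. The function name, the `max_gap` parameter, and the representation of the trace as a flat list of addresses are assumptions made for this illustration; the simple grouping of adjacent addresses merely stands in for the clustering method, which the embodiment leaves open.

```python
from collections import Counter


def detect_vm_instruction_boundaries(trace, threshold, max_gap=4):
    """Return (start, end) address pairs of frequently executed code regions.

    trace     -- list of executed native instruction addresses (assumed format)
    threshold -- minimum execution count for an address to be kept
    max_gap   -- largest address distance still merged into one region
    """
    counts = Counter(trace)
    # Keep only addresses executed at least `threshold` times.
    hot = sorted(addr for addr, n in counts.items() if n >= threshold)

    regions = []
    for addr in hot:
        if regions and addr - regions[-1][1] <= max_gap:
            regions[-1][1] = addr          # extend the current region
        else:
            regions.append([addr, addr])   # open a new region
    return [tuple(region) for region in regions]
```

A grouping this simple would presumably be replaced by a proper clustering algorithm in practice; it is used here only so that the thresholding and the derivation of start and end points are easy to follow.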
[Procedure of virtual program counter detection processing]
Next, the flow of the virtual program counter detection processing shown in FIG. 12 will be described. FIG. 15 is a flowchart showing the procedure of the virtual program counter detection processing shown in FIG. 12.
First, the virtual program counter detection unit 1213 retrieves one execution trace for a first test script from the execution trace DB 131 (step S31). Next, the virtual program counter detection unit 1213 focuses on the memory access trace within the execution trace and counts the number of reads for each memory read location (step S32).
The virtual program counter detection unit 1213 receives, as input, the first test script used to acquire the execution trace (step S33), and analyzes that first test script to obtain the number of repetitions and the number of repeated statements (step S34).
Next, the virtual program counter detection unit 1213 retrieves from the execution trace DB 131 one more execution trace, for a first test script that differs in the number of repetitions or the number of repeated statements (step S35). The virtual program counter detection unit 1213 then focuses on the memory access trace and counts the number of reads for each memory read location (step S36). The virtual program counter detection unit 1213 also receives, as input, the first test script used to acquire this execution trace (step S37), and analyzes the test script to obtain the number of repetitions and the number of repeated statements (step S38).
Here, the virtual program counter detection unit 1213 narrows the candidates down to the memory read locations whose read count changes in proportion to the increase or decrease in the number of repetitions and the number of repeated statements (step S39). The virtual program counter detection unit 1213 further narrows the memory read locations selected in step S39 down to those whose read value always points to the start point of a VM instruction (step S40).
The virtual program counter detection unit 1213 then determines whether the memory read locations have been narrowed down to exactly one (step S41). If they have not been narrowed down to one (step S41: No), the virtual program counter detection unit 1213 returns to step S35, retrieves the next execution trace, and continues the processing. If they have been narrowed down to one (step S41: Yes), the virtual program counter detection unit 1213 stores that memory read location in the architecture information DB 132 as the virtual program counter (step S42) and ends the processing.
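The narrowing in steps S32 to S40 may be sketched, purely as an illustration, as follows. The sketch assumes that each memory access trace is a list of `(address, value)` read records and that the amount of interpreted work of a test script is summarized as one number (for example, the number of repetitions multiplied by the number of repeated statements). These representations, the function names, and the tolerance `tol` are hypothetical and are not taken from the embodiment.

```python
from collections import Counter


def read_counts(trace):
    """Count reads per memory address; trace is a list of (address, value)."""
    return Counter(addr for addr, _ in trace)


def narrow_vpc_candidates(trace_a, work_a, trace_b, work_b, vm_starts, tol=0.1):
    """Return addresses that behave like a virtual program counter.

    work_a, work_b -- expected amount of interpreted work per script
                      (assumed: repetitions * repeated statements)
    vm_starts      -- set of start addresses of the detected VM instructions
    """
    counts_a, counts_b = read_counts(trace_a), read_counts(trace_b)
    expected_ratio = work_b / work_a

    # Step S39: keep addresses whose read count scales with the work.
    proportional = {
        addr
        for addr in counts_a.keys() & counts_b.keys()
        if abs(counts_b[addr] / counts_a[addr] - expected_ratio) <= tol
    }

    # Step S40: keep addresses whose read values always point to the
    # start of a VM instruction.
    all_reads = list(trace_a) + list(trace_b)
    return {
        addr
        for addr in proportional
        if all(value in vm_starts for a, value in all_reads if a == addr)
    }
```

If more than one address remains, the function could be applied again to a further pair of traces and the results intersected, which corresponds to the loop back to step S35.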
[Procedure of dispatcher detection processing]
Next, the flow of the dispatcher detection processing shown in FIG. 12 will be described. FIG. 16 is a flowchart showing the procedure of the dispatcher detection processing shown in FIG. 12.
First, the dispatcher detection unit 1214 receives the script engine binary as input (step S51). The dispatcher detection unit 1214 receives the VM instruction boundaries from the VM instruction boundary detection unit 1212 (step S52).
Based on the VM instruction boundaries received from the VM instruction boundary detection unit 1212, the dispatcher detection unit 1214 cuts out each VM instruction part from the script engine binary (step S53). The dispatcher detection unit 1214 calculates the code similarity between the VM instructions by a predetermined method (step S54). Any method may be used as long as it can calculate the similarity between pieces of code.
Based on the similarity calculated in step S54, the dispatcher detection unit 1214 extracts a part with a high degree of similarity across all the VM instructions (step S55). The dispatcher detection unit 1214 then determines whether the extracted part is the end part of the VM instructions (step S56).
If it is not the end part of the VM instructions (step S56: No), the dispatcher detection unit 1214 returns to step S55 and continues the processing. If it is the end part of the VM instructions (step S56: Yes), the dispatcher detection unit 1214 outputs the extracted part as the dispatcher (step S57) and ends the processing.
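One simple way to picture steps S53 to S57 is to look for the instruction sequence that every VM instruction part shares at its end. The Python sketch below does exactly that on already disassembled handlers, each given as a list of instruction strings. Using an exact longest common suffix in place of a general code similarity measure, as well as the function name and the input format, are assumptions made for this illustration.

```python
def common_dispatcher_suffix(handlers):
    """Return the instruction sequence shared by the end of every handler.

    handlers -- list of VM instruction parts, each a list of disassembled
                instruction strings (assumed format)
    """
    if not handlers:
        return []

    suffix = []
    # Walk all handlers backwards in lockstep, one instruction at a time.
    for column in zip(*(reversed(handler) for handler in handlers)):
        if len(set(column)) != 1:
            break                    # handlers diverge here
        suffix.append(column[0])
    return suffix[::-1]              # restore forward order
```

For a threaded-code interpreter whose handlers all end by loading the next handler address and jumping to it, the returned sequence would be that load-and-jump tail. A real implementation would likely need a similarity measure that tolerates differences in registers or offsets.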
[Procedure of conditional branch flag detection processing]
Next, the flow of the conditional branch flag detection processing shown in FIG. 12 will be described. FIG. 17 is a flowchart showing the procedure of the conditional branch flag detection processing shown in FIG. 12.
First, the conditional branch flag detection unit 1215 retrieves one execution trace for a second test script from the execution trace DB 131 (step S71). The conditional branch flag detection unit 1215 then focuses on the memory access trace and counts the number of reads for each memory read location (step S72).
The conditional branch flag detection unit 1215 also receives, as input, the second test script used to acquire the execution trace (step S73), and analyzes this second test script to obtain the number of conditional branches and the ordered pattern of True/False outcomes (step S74). The conditional branch flag detection unit 1215 then narrows the candidates down to the memory read locations whose read count changes in proportion to the number of conditional branches (step S75). The conditional branch flag detection unit 1215 further narrows the candidates down to the memory read locations whose read value alternates between two values in accordance with the ordered True/False pattern (step S76).
The conditional branch flag detection unit 1215 determines whether the memory read locations have been narrowed down to exactly one (step S77). If they have not been narrowed down to one (step S77: No), the conditional branch flag detection unit 1215 returns to step S71, retrieves the next execution trace, and continues the processing. If they have been narrowed down to one (step S77: Yes), the conditional branch flag detection unit 1215 stores that read location in the architecture information DB 132 as the conditional branch flag (step S78) and ends the processing.
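The two filters of steps S75 and S76 can be illustrated with the following Python sketch. It assumes, for simplicity, that the memory access trace is a list of `(address, value)` read records and that a flag location is read exactly once per conditional branch. Both assumptions, like the function name, are introduced only for this example.

```python
def find_branch_flag_candidates(trace, pattern):
    """Return addresses whose read values follow the True/False pattern.

    trace   -- list of (address, value) memory reads (assumed format)
    pattern -- list of booleans: the outcome of each conditional branch
               in the order the test script evaluates them
    """
    reads = {}
    for addr, value in trace:
        reads.setdefault(addr, []).append(value)

    candidates = set()
    for addr, values in reads.items():
        # Step S75 (simplified): one read per conditional branch.
        if len(values) != len(pattern):
            continue
        # Step S76: exactly one value for True, another one for False.
        true_values = {v for v, taken in zip(values, pattern) if taken}
        false_values = {v for v, taken in zip(values, pattern) if not taken}
        if (len(true_values) == 1 and len(false_values) == 1
                and true_values != false_values):
            candidates.add(addr)
    return candidates
```

A test script whose pattern contains both True and False outcomes in an irregular order tends to leave fewer coincidental candidates than a strictly alternating one.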
[Procedure of VM execution trace acquisition processing]
Next, the flow of the VM execution trace acquisition processing shown in FIG. 12 will be described. FIG. 18 is a flowchart showing the procedure of the VM execution trace acquisition processing shown in FIG. 12.
First, the VM execution trace acquisition unit 1221 receives the test scripts and the script engine binary as input (step S81). The VM execution trace acquisition unit 1221 then applies hooks for recording the VPC and the VM opcode to the received script engine (step S82).
In that state, the VM execution trace acquisition unit 1221 inputs a received test script to the script engine and has it executed (step S83), and stores the resulting VM execution trace in the VM execution trace DB 133 (step S84).
The VM execution trace acquisition unit 1221 determines whether all the input test scripts have been executed (step S85). If all the input test scripts have been executed (step S85: Yes), the VM execution trace acquisition unit 1221 ends the processing. If not all the input test scripts have been executed yet (step S85: No), the VM execution trace acquisition unit 1221 returns to the test script execution in step S83 and continues the processing.
[Procedure of branch VM instruction detection processing]
Next, the flow of the branch VM instruction detection processing shown in FIG. 12 will be described. FIG. 19 is a flowchart showing the procedure of the branch VM instruction detection processing shown in FIG. 12.
First, the branch VM instruction detection unit 1222 retrieves one VM execution trace from the VM execution trace DB 133 (step S91). The branch VM instruction detection unit 1222 associates each pointer to a VM instruction with that VM instruction and assigns a VM opcode to each as an identifier (step S92). The branch VM instruction detection unit 1222 then tallies, for each VM opcode, the amount of change in the VPC before and after execution (step S93).
The branch VM instruction detection unit 1222 determines whether all the VM execution traces in the VM execution trace DB 133 have been processed (step S94). If not all the VM execution traces in the VM execution trace DB 133 have been processed (step S94: No), the branch VM instruction detection unit 1222 returns to step S91 and retrieves and processes the next VM execution trace.
If all the VM execution traces in the VM execution trace DB 133 have been processed (step S94: Yes), the branch VM instruction detection unit 1222 calculates the variance of the VPC change for each VM opcode (step S95). The branch VM instruction detection unit 1222 then receives a threshold as input (step S96). The branch VM instruction detection unit 1222 narrows the VM opcodes down to those whose variance is greater than the threshold (step S97), stores them in the architecture information DB 132 as branch VM instructions (step S98), and ends the processing.
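The reasoning behind steps S93 to S97 is that a non-branch VM instruction advances the VPC by a fixed amount, whereas a branch VM instruction moves it by varying amounts. The following Python sketch illustrates this under the assumption that each VM execution trace is a list of `(vpc, opcode)` records in execution order. The record format and the function name are hypothetical, and the population variance from the standard `statistics` module is used as one possible choice of variance.

```python
from collections import defaultdict
from statistics import pvariance


def detect_branch_opcodes(vm_traces, threshold):
    """Return the VM opcodes whose VPC change varies more than `threshold`.

    vm_traces -- list of traces, each a list of (vpc, opcode) records
                 in execution order (assumed format)
    """
    deltas = defaultdict(list)
    for trace in vm_traces:
        # Pair each record with its successor to get the VPC change
        # caused by executing `opcode` (step S93).
        for (vpc, opcode), (next_vpc, _) in zip(trace, trace[1:]):
            deltas[opcode].append(next_vpc - vpc)

    # Steps S95 to S97: variance per opcode, then threshold filter.
    return {
        opcode
        for opcode, changes in deltas.items()
        if pvariance(changes) > threshold
    }
```

An unconditional jump that always targets the same offset would show zero variance in a single script, which suggests why traces from several test scripts are aggregated before the variance is computed.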
[Procedure of analysis function imparting processing]
Next, the flow of the analysis function imparting processing shown in FIG. 12 will be described. FIG. 20 is a flowchart showing the procedure of the analysis function imparting processing shown in FIG. 12.
First, the analysis function imparting unit 123 receives the script engine binary as input (step S101). The analysis function imparting unit 123 then retrieves the VPC, the conditional branch flag, and the conditional branch VM instructions from the architecture information DB 132 (step S102). Next, the analysis function imparting unit 123 applies hooks to the hook points of the script engine (step S103). The analysis function imparting unit 123 generates code and inserts it into the script engine so that the multipath execution code is executed when a hook is triggered (step S104). The analysis function imparting unit 123 outputs the hooked script engine obtained in this way as a script engine with a multipath execution function (step S105) and ends the processing.
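This passage does not spell out the internals of the inserted multipath execution code. Only to convey the general idea of multipath execution, the following Python sketch runs a toy bytecode program and, at every conditional branch VM instruction, saves the state for one outcome while continuing with the other, so that both paths are eventually executed. The toy instruction set (`PRINT`, `JMPF`, `JMP`, `HALT`), the state representation, and the function name are all assumptions made for this illustration and are unrelated to any real script engine.

```python
def multipath_run(program):
    """Execute every path of a loop-free toy bytecode program.

    program -- list of (opcode, operand) pairs; JMPF jumps to `operand`
               when the (here ignored) condition is false
    Returns the list of outputs produced along each explored path.
    """
    finished = []
    pending = [(0, [])]                  # saved states: (vpc, output so far)
    while pending:
        vpc, output = pending.pop()
        while vpc < len(program):
            opcode, operand = program[vpc]
            if opcode == "PRINT":
                output = output + [operand]   # copy, so saved states stay intact
                vpc += 1
            elif opcode == "JMPF":
                # Conditional branch: save the "branch taken" state and
                # continue with the fall-through path.
                pending.append((operand, output))
                vpc += 1
            elif opcode == "JMP":
                vpc = operand
            else:                        # "HALT" or unknown opcode
                break
        finished.append(output)
    return finished
```

The sketch terminates only for programs without backward jumps. Handling loops would require some bound on exploration, which is outside the scope of this illustration.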
[Effects of the embodiment]
As described above, the analysis function imparting device 10 according to the present embodiment executes test scripts while monitoring the script engine binary, and acquires branch traces and memory access traces. The analysis function imparting device 10 then analyzes the virtual machine based on these execution traces and acquires, as architecture information, the VM instruction boundaries, the VPC, the dispatcher, and the conditional branch flag. Furthermore, the analysis function imparting device 10 executes test scripts to acquire VM execution traces, analyzes the instruction set architecture using those VM execution traces, and acquires the branch VM instructions as architecture information. The analysis function imparting device 10 then imparts a multipath execution function to the script engine based on the obtained architecture information.
As a result, even for a proprietary script engine for which only the binary is available, the analysis function imparting device 10 according to the embodiment can detect the various items of architecture information through analysis based on the acquired execution traces and VM execution traces, and can impart a multipath execution function without requiring manual reverse engineering.
In addition, because the analysis function imparting device 10 can automatically impart a multipath execution function to a wide variety of script engines as long as test scripts are prepared, it can impart the multipath execution function without requiring individual design or execution for each engine.
Furthermore, because the analysis function imparting device 10 takes detailed architecture such as conditional branching into account, it can impart a multipath execution function that handles the conditional branches of a script accurately.
In addition, because the analysis function imparting device 10 focuses on threaded-code script engines, it can impart a multipath execution function even to script engines that have a threaded-code VM.
In this way, by analyzing a script engine and retrofitting it with a multipath execution function, the analysis function imparting device 10 according to the present embodiment can automatically impart a multipath execution function to the script engines of a wide variety of script languages, including threaded-code engines.
As described above, the analysis function imparting device 10 according to the present embodiment is useful for analyzing the behavior of malicious scripts written in a wide variety of script languages, and is suited to exhaustively analyzing the behavior of a malicious script that has paths executed only when specific conditions are satisfied, without being affected by those conditions. Therefore, by imparting a multipath execution function to various script engines using the analysis function imparting device 10, the analysis function imparting program, and the analysis function imparting method according to the present embodiment, the behavior of malicious scripts can be analyzed and the results used for countermeasures such as detection.
[System configuration of the embodiment]
Each component of the analysis function imparting device 10 shown in FIG. 3 is functionally conceptual and does not necessarily need to be physically configured as illustrated. That is, the specific form of distribution and integration of the functions of the analysis function imparting device 10 is not limited to the illustrated one, and all or part of them can be functionally or physically distributed or integrated in arbitrary units according to various loads, usage conditions, and the like.
All or any part of the processing performed in the analysis function imparting device 10 may be realized by a CPU and a program analyzed and executed by the CPU. The processing performed in the analysis function imparting device 10 may also be realized as hardware using wired logic.
Of the processing described in the embodiment, all or part of the processing described as being performed automatically can also be performed manually. Conversely, all or part of the processing described as being performed manually can be performed automatically by known methods. In addition, the processing procedures, control procedures, specific names, and information including various data and parameters described above and shown in the drawings can be changed as appropriate unless otherwise specified.
[Program]
FIG. 21 is a diagram showing an example of a computer that realizes the analysis function imparting device 10 by executing a program. The computer 1000 has, for example, a memory 1010 and a CPU 1020. The computer 1000 also has a hard disk drive interface 1030, a disk drive interface 1040, a serial port interface 1050, a video adapter 1060, and a network interface 1070. These units are connected by a bus 1080.
The memory 1010 includes a ROM 1011 and a RAM 1012. The ROM 1011 stores, for example, a boot program such as a BIOS (Basic Input Output System). The hard disk drive interface 1030 is connected to a hard disk drive 1090. The disk drive interface 1040 is connected to a disk drive 1100. A removable storage medium such as a magnetic disk or an optical disk is inserted into the disk drive 1100. The serial port interface 1050 is connected to, for example, a mouse 1110 and a keyboard 1120. The video adapter 1060 is connected to, for example, a display 1130.
The hard disk drive 1090 stores, for example, an OS 1091, an application program 1092, a program module 1093, and program data 1094. That is, the program that defines each process of the analysis function imparting device 10 is implemented as the program module 1093, in which code executable by the computer 1000 is written. The program module 1093 is stored, for example, in the hard disk drive 1090. For example, the program module 1093 for executing processing similar to the functional configuration of the analysis function imparting device 10 is stored in the hard disk drive 1090. The hard disk drive 1090 may be replaced by an SSD (Solid State Drive).
The setting data used in the processing of the embodiment described above is stored as the program data 1094 in, for example, the memory 1010 or the hard disk drive 1090. The CPU 1020 reads the program module 1093 and the program data 1094 stored in the memory 1010 or the hard disk drive 1090 into the RAM 1012 as necessary and executes them.
The program module 1093 and the program data 1094 are not limited to being stored in the hard disk drive 1090. They may be stored, for example, in a removable storage medium and read by the CPU 1020 via the disk drive 1100 or the like. Alternatively, the program module 1093 and the program data 1094 may be stored in another computer connected via a network (a LAN (Local Area Network), a WAN (Wide Area Network), or the like). The program module 1093 and the program data 1094 may then be read from the other computer by the CPU 1020 via the network interface 1070.
An embodiment to which the invention made by the present inventors is applied has been described above, but the present invention is not limited by the description and drawings that form part of the disclosure of the present invention according to this embodiment. That is, other embodiments, examples, operational techniques, and the like made by those skilled in the art based on this embodiment are all included in the scope of the present invention.
REFERENCE SIGNS LIST
10 analysis function imparting device
11 input unit
12 control unit
13 storage unit
14 output unit
100 script engine
102 bytecode compiler
103 virtual machine (VM)
104 syntax analysis unit
105 bytecode generation unit
106 code cache unit
107 decode unit
108 pointer cache unit
121 virtual machine analysis unit
122 instruction set architecture analysis unit
123 analysis function imparting unit
131 execution trace database (DB)
132 architecture information DB
133 VM execution trace DB
1211 execution trace acquisition unit
1212 VM instruction boundary detection unit
1213 virtual program counter detection unit
1214 dispatcher detection unit
1215 conditional branch flag detection unit
1221 VM execution trace acquisition unit
1222 branch VM instruction detection unit

Claims (8)

  1.  An analysis function imparting device comprising:
     a first analysis unit that analyzes a virtual machine of a script engine;
     a second analysis unit that analyzes an instruction set architecture, which is the instruction system of the virtual machine; and
     an imparting unit that applies, to the script engine, a hook that imparts a multipath execution function, based on architecture information obtained through the analyses by the first analysis unit and the second analysis unit, the architecture information being a virtual program counter, which is a variable that points to the instruction of the virtual machine to be executed next, a conditional branch flag, which is an area that holds a flag indicating whether or not a branch is taken at a conditional branch in the execution state, and a branch virtual machine instruction, which is a virtual machine instruction that causes a branch,
     wherein the first analysis unit
     analyzes, using differential execution analysis, a plurality of execution traces acquired under different execution conditions to acquire the virtual program counter and the conditional branch flag, and includes:
     a first acquisition unit that acquires a plurality of execution traces under different execution conditions;
     a first detection unit that clusters the execution traces and detects the boundary of each VM instruction;
     a second detection unit that analyzes the plurality of execution traces using differential execution analysis focused on the number of memory reads together with the boundary of each VM instruction detected by the first detection unit, and detects the virtual program counter;
     a third detection unit that analyzes the binary of the script engine based on the boundary of each VM instruction detected by the first detection unit, and detects a dispatcher; and
     a fourth detection unit that analyzes the plurality of execution traces using differential execution analysis focused on the number of memory reads, and detects the conditional branch flag.
  2.  The analysis function imparting device according to claim 1, wherein the first analysis unit and the second analysis unit perform the analysis using test scripts.
  3.  The analysis function imparting device according to claim 1 or 2, wherein the first detection unit clusters the execution traces, detects as a VM instruction each cluster whose execution count is equal to or greater than a threshold, and detects as the boundaries the start point and the end point of the contiguous instruction sequence that constitutes each detected VM instruction.
  4.  The analysis function imparting device according to claim 3, wherein the second detection unit analyzes the test scripts used to acquire the execution traces, narrows the candidates down to memory whose read count changes in proportion to increases and decreases in both the number of repetitions and the number of repeated statements, further narrows them down to memory whose read value always points to the start point of a VM instruction, and detects the remaining memory as the virtual program counter.
  5.  The analysis function imparting device according to any one of claims 1 to 4, wherein the third detection unit cuts out each VM instruction portion from the binary of the script engine based on the VM instruction boundaries detected by the first detection unit, and detects as the dispatcher a portion that has a high degree of similarity across the VM instructions.
  6.  The analysis function imparting device according to any one of claims 1 to 5, wherein the second analysis unit includes:
      a second acquisition unit that acquires a virtual machine execution trace, which is a trace of execution on the virtual machine; and
      a fifth detection unit that detects the branch virtual machine instruction based on the variation, for each virtual machine opcode in the virtual machine execution trace, in the amount of change of the virtual program counter.
  7.  An analysis function imparting program that causes a computer to execute:
      a first analysis step of analyzing a virtual machine of a script engine;
      a second analysis step of analyzing an instruction set architecture, which is the instruction system of the virtual machine; and
      an imparting step of applying, to the script engine, hooks that impart a multipath execution function, based on architecture information obtained through the analysis in the first analysis step and the second analysis step, the architecture information being a virtual program counter, which is a variable pointing to the instruction of the virtual machine to be executed next, a conditional branch flag, which is an area holding a flag indicating whether a branch is taken at a conditional branch in the execution state, and a branch virtual machine instruction, which is a virtual machine instruction that causes a branch,
      wherein the first analysis step analyzes, using differential execution analysis, a plurality of execution traces acquired under different execution conditions to obtain the virtual program counter and the conditional branch flag, and
      the first analysis step includes:
      a first acquisition step of acquiring a plurality of execution traces under different execution conditions;
      a first detection step of clustering the execution traces and detecting the boundaries of each VM instruction;
      a second detection step of analyzing the plurality of execution traces, using differential execution analysis focused on memory read counts together with the boundaries of each VM instruction detected in the first detection step, and detecting the virtual program counter;
      a third detection step of analyzing the binary of the script engine based on the boundaries of each VM instruction detected in the first detection step and detecting a dispatcher; and
      a fourth detection step of analyzing the plurality of execution traces using differential execution analysis focused on memory read counts and detecting the conditional branch flag.
  8.  An analysis function imparting method executed by an analysis function imparting device, the method comprising:
      a first analysis step of analyzing a virtual machine of a script engine;
      a second analysis step of analyzing an instruction set architecture, which is the instruction system of the virtual machine; and
      an imparting step of applying, to the script engine, hooks that impart a multipath execution function, based on architecture information obtained through the analysis in the first analysis step and the second analysis step, the architecture information being a virtual program counter, which is a variable pointing to the instruction of the virtual machine to be executed next, a conditional branch flag, which is an area holding a flag indicating whether a branch is taken at a conditional branch in the execution state, and a branch virtual machine instruction, which is a virtual machine instruction that causes a branch,
      wherein the first analysis step analyzes, using differential execution analysis, a plurality of execution traces acquired under different execution conditions to obtain the virtual program counter and the conditional branch flag, and
      the first analysis step includes:
      a first acquisition step of acquiring a plurality of execution traces under different execution conditions;
      a first detection step of clustering the execution traces and detecting the boundaries of each VM instruction;
      a second detection step of analyzing the plurality of execution traces, using differential execution analysis focused on memory read counts together with the boundaries of each VM instruction detected in the first detection step, and detecting the virtual program counter;
      a third detection step of analyzing the binary of the script engine based on the boundaries of each VM instruction detected in the first detection step and detecting a dispatcher; and
      a fourth detection step of analyzing the plurality of execution traces using differential execution analysis focused on memory read counts and detecting the conditional branch flag.
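
The detection logic recited in claims 4 and 6 can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the function names, the trace formats, the `tolerance` parameter, the memory addresses, and the opcode names (`PUSH`, `ADD`, `JNZ`, `HALT`) are all hypothetical and chosen only for illustration. The sketch also assumes, as a simplification, that the read count of the virtual program counter is proportional to the product of the loop count and the number of statements in the loop body, and that both are positive.

```python
from collections import defaultdict
from statistics import pvariance


def detect_vpc_candidates(runs, vm_instruction_starts, tolerance=0.05):
    """Narrow memory addresses down to virtual program counter candidates (claim 4).

    runs: one dict per test script (hypothetical format) with keys
      "iterations": loop count, "statements": statements per loop body,
      "reads": {address: [values read from that address, in order]}.
    vm_instruction_starts: set of start points found by boundary detection.
    """
    ratios = defaultdict(list)
    candidates = None
    for run in runs:
        work = run["iterations"] * run["statements"]
        for addr, values in run["reads"].items():
            ratios[addr].append(len(values) / work)
        # Keep addresses whose every read value points at a VM instruction start.
        pointing = {a for a, vs in run["reads"].items()
                    if vs and all(v in vm_instruction_starts for v in vs)}
        candidates = pointing if candidates is None else candidates & pointing
    result = set()
    for addr in candidates or set():
        rs = ratios[addr]
        # Proportional read count: reads / (iterations * statements) stays nearly constant.
        if max(rs) - min(rs) <= tolerance * max(rs):
            result.add(addr)
    return result


def detect_branch_opcodes(vm_trace, threshold=0.0):
    """Detect branch VM instructions from the spread of VPC changes (claim 6).

    vm_trace: list of (vpc, opcode) pairs in execution order (hypothetical format).
    """
    deltas = defaultdict(list)
    for (vpc, opcode), (next_vpc, _) in zip(vm_trace, vm_trace[1:]):
        deltas[opcode].append(next_vpc - vpc)
    # Non-branch opcodes advance the VPC by a fixed length, so their variance is zero.
    return {op for op, ds in deltas.items()
            if len(ds) > 1 and pvariance(ds) > threshold}


# A two-iteration loop: JNZ jumps back once (-5) and falls through once (+3).
trace = [(0, "PUSH"), (2, "PUSH"), (4, "ADD"), (5, "JNZ"),
         (0, "PUSH"), (2, "PUSH"), (4, "ADD"), (5, "JNZ"), (8, "HALT")]
print(detect_branch_opcodes(trace))  # prints {'JNZ'}
```

One limitation of this simplified variance test is that a branch opcode observed only with a single, constant jump distance is not flagged, so the test scripts would need to exercise both outcomes of a branch.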
PCT/JP2021/006933 2021-02-24 2021-02-24 Analysis function addition device, analysis function addition program, and analysis function addition method WO2022180702A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/JP2021/006933 WO2022180702A1 (en) 2021-02-24 2021-02-24 Analysis function addition device, analysis function addition program, and analysis function addition method
JP2023501730A JPWO2022180702A1 (en) 2021-02-24 2021-02-24

Publications (1)

Publication Number Publication Date
WO2022180702A1 true WO2022180702A1 (en) 2022-09-01

Family

ID=83047874

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2021/006933 WO2022180702A1 (en) 2021-02-24 2021-02-24 Analysis function addition device, analysis function addition program, and analysis function addition method

Country Status (2)

Country Link
JP (1) JPWO2022180702A1 (en)
WO (1) WO2022180702A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024079794A1 (en) * 2022-10-11 2024-04-18 日本電信電話株式会社 Analysis function addition device, analysis function addition method, and analysis function addition program

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020075335A1 (en) * 2018-10-11 2020-04-16 日本電信電話株式会社 Analysis function imparting device, analysis function imparting method, and analysis function imparting program

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
ERTL, M. ANTON ET AL.: "Vmgen - a generator of efficient virtual machine interpreters", SOFTWARE - PRACTICE AND EXPERIENCE, vol. 32, no. 3, 28 January 2002 (2002-01-28), pages 265 - 294, XP001087296, DOI: https://doi.org/10.1002/spe.434 *
SHARIF, M. ET AL.: "Automatic Reverse Engineering of Malware Emulators", 2009 30TH IEEE SYMPOSIUM ON SECURITY AND PRIVACY, 2009, pages 94 - 109, XP031515099 *
USUI TOSHINOBU, FURUKAWA WAKI, OTSUKI HAYATO, KAWAFURUYA YUHEI, IWAMURA SEI, MIYOSHI JUN, MATSUURA KANTA: "Automatically appending multi-path execution functionality to vanilla script engines", COMPUTER SECURITY SYMPOSIUM 2019 (21/10/2019 - 24/10/2019), 14 October 2019 (2019-10-14) - 24 October 2019 (2019-10-24), Japan, pages 961 - 968, XP009539387, ISSN: 1882-0840 *
USUI, T. ET AL.: "Automatic Reverse Engineering of Script Engine Binaries for Building Script API Tracers", DIGITAL THREATS: RESEARCH AND PRACTICE, vol. 2, no. 1, January 2021 (2021-01-01), pages 1 - 31, XP058674428, DOI: https://doi.org/10.1145/3416126 *

Also Published As

Publication number Publication date
JPWO2022180702A1 (en) 2022-09-01

Similar Documents

Publication Publication Date Title
JP7287480B2 (en) Analysis function imparting device, analysis function imparting method and analysis function imparting program
Carmony et al. Extract Me If You Can: Abusing PDF Parsers in Malware Detectors.
US20140130158A1 (en) Identification of malware detection signature candidate code
JP7115552B2 (en) Analysis function imparting device, analysis function imparting method and analysis function imparting program
US9507933B2 (en) Program execution apparatus and program analysis apparatus
US9176849B2 (en) Partitioning of program analyses into sub-analyses using dynamic hints
EP3547121B1 (en) Combining device, combining method and combining program
KR101796369B1 (en) Apparatus, method and system of reverse engineering collaboration for software analsis
US9495542B2 (en) Software inspection system
US20160011951A1 (en) Techniques for web service black box testing
KR20210045122A (en) Apparatus and method for generating test input a software using symbolic execution
WO2022180702A1 (en) Analysis function addition device, analysis function addition program, and analysis function addition method
WO2023067668A1 (en) Analysis function addition method, analysis function addition device, and analysis function addition program
Heelan et al. Augmenting vulnerability analysis of binary code
US20230141948A1 (en) Analysis and Testing of Embedded Code
US9910889B2 (en) Rapid searching and matching of data to a dynamic set of signatures facilitating parallel processing and hardware acceleration
WO2020111482A1 (en) Reverse engineering method and system utilizing big data based on program execution context
Li et al. Characterizing erasable accounts in ethereum
WO2023067667A1 (en) Analysis function imparting method, analysis function imparting device, and analysis function imparting program
WO2023067663A1 (en) Analysis function addition method, analysis function addition device, and analysis function addition program
WO2023067665A1 (en) Analysis function addition method, analysis function addition device, and analysis function addition program
JP6984760B2 (en) Converter and conversion program
Bhardwaj et al. Fuzz testing in stack-based buffer overflow
Nep et al. A research on countering virtual machine evasion techniques of malware in dynamic analysis
Chen et al. Dynamic Taint Analysis with Control Flow Graph for Vulnerability Analysis

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 21927805; Country of ref document: EP; Kind code of ref document: A1)
ENP Entry into the national phase (Ref document number: 2023501730; Country of ref document: JP; Kind code of ref document: A)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 21927805; Country of ref document: EP; Kind code of ref document: A1)