CN115687111B - Direct comparison dependency identification method and system for computer binary program - Google Patents

Info

Publication number
CN115687111B
Authority
CN
China
Prior art keywords
subsequence
input
index structure
instruction operand
cmp instruction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211329616.4A
Other languages
Chinese (zh)
Other versions
CN115687111A (en)
Inventor
杨国正
陈泽翰
陆余良
朱凯龙
于璐
黄晖
戚兰兰
张天翔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National University of Defense Technology
Original Assignee
National University of Defense Technology

Landscapes

  • Debugging And Monitoring (AREA)

Abstract

The disclosure provides a direct comparison dependency identification method and system for a computer binary program, and relates to the technical field of data processing. The method takes an input use case constructed from a de Bruijn sequence as input, analyzes the values of all cmp instruction operands while the target program runs, uses an index structure matched to the input to quickly locate the fields directly related to those operands, and records the related information to form data describing I2S dependencies. This improves the coverage of deep code achieved by fuzz testing and thereby improves the effectiveness of vulnerability discovery in deep code.

Description

Direct comparison dependency identification method and system for computer binary program
Technical Field
The invention belongs to the technical field of data processing, and particularly relates to a direct comparison dependency identification method and system for a computer binary program.
Background
With the continuous development of internet technology, human society has become inseparable from the internet. Software scale and complexity grow day by day, software vulnerabilities inevitably grow in number, and security problems are increasingly prominent. In recent years, incidents of malicious attacks exploiting software vulnerabilities have emerged one after another. The ransomware "WannaCry", which came to light on May 12, 2017, was built on a Windows system vulnerability exploit known as "EternalBlue". The ransomware swept through more than 100 countries worldwide, attacked important institutions in fields such as medical treatment, education and banking, and caused huge economic losses. As a software vulnerability discovery technique, fuzz testing has attracted attention and acceptance in industry and academia because of its high degree of automation and low false-positive rate. To better handle the "magic number" and "checksum" problems related to I2S dependency identification, and so further improve vulnerability discovery, researchers must spend a great deal of effort and computation on code analysis in order to construct suitable input use cases and test deep code.
In the prior art, Redqueen and Grimoire provide lightweight I2S dependency identification schemes. Such schemes identify I2S dependencies by filling fields of the input use case with random values and observing changes in the execution state of the target program. They are close to this patent in that both identify I2S associations by analyzing the execution state of the target program. However, the random values that are filled in do not necessarily satisfy the constraints required to change the execution state of the target program, so such schemes suffer from false negatives in I2S dependency identification.
The main flow of I2S dependency identification in the prior art is as follows: the colorized input use case (one whose fields have been filled with random values) is taken as input, the target program is executed, and it is observed whether the execution state of the target program changes; a field that successfully triggers a state change is selected, its value is compared with the recorded values of all cmp instruction operands, and if the two values are consistent, an I2S dependency is considered to exist between the field and the cmp instruction operand. The selection of fields for I2S dependency identification therefore depends on whether the filled random value triggers a change in the execution state of the target program, yet the filled random value does not necessarily satisfy the constraints for changing that state. This scheme is consequently prone to false negatives in I2S dependency identification. In other words, the criterion for deciding whether a field carries a dependency is whether execution of the target program differs before and after the field is colorized: if it does, the field is considered dependent; otherwise it is not. Such methods are simple and robust, but colorizing a field that does carry an I2S dependency does not necessarily change the execution state of the target program. I2S dependencies are thus missed, which limits the gains in code coverage and crash triggering when the approach is applied to fuzz testing.
Disclosure of Invention
In order to solve the technical problems, the invention provides a direct comparison dependency identification scheme for a computer binary program.
The first aspect of the invention discloses a direct comparison dependency identification method for a computer binary program. The method comprises the following steps: S1, calling a generation module to construct an input use case and an index structure based on a de Bruijn sequence with given parameters, sending the input use case to a monitoring engine, and sending the index structure to an identification module; S2, calling the monitoring engine to take the input use case as the input of a target program, monitor the real-time values of the cmp instruction operands of the target program during execution, and send those real-time values to the identification module; and S3, calling the identification module to determine, based on the index structure and the real-time values of the cmp instruction operands, the fields that have an I2S association, and extracting each such field together with its corresponding cmp instruction operand as I2S dependency data.
According to the method of the first aspect of the present invention, in step S1, constructing the input use case specifically includes: obtaining the given parameters of the de Bruijn sequence B(k, n), namely the number of elements k and the order n; and constructing the input use case based on a character set A of size k, wherein the character set A comprises k different characters and the character length of the input use case is M.
According to the method of the first aspect of the present invention, in step S1, constructing the index structure specifically includes: sliding a window of length n over the input use case to cut out M-n+1 subsequences, wherein the offsets of the subsequences relative to the initial subsequence are {0, 1, ..., M-n}, and the index structure is formed by each subsequence and its corresponding offset.
According to the method of the first aspect of the present invention, step S3 specifically includes: matching each subsequence in the index structure against the real-time value of each cmp instruction operand; when the real-time value of a cmp instruction operand contains data that includes a subsequence, extracting the first n characters of that data, looking up those n characters in the index structure, and extracting the corresponding offset; and obtaining the cmp instruction operand and the I2S field corresponding to the real-time value to form the I2S dependency data, wherein the I2S dependency data further comprises the corresponding offset and the character length of the data that includes the subsequence.
The second aspect of the invention discloses a direct comparison dependency identification system for a computer binary program. The system comprises: a first processing unit configured to call a generation module to construct an input use case and an index structure based on a de Bruijn sequence with given parameters, send the input use case to a monitoring engine, and send the index structure to an identification module; a second processing unit configured to call the monitoring engine to take the input use case as the input of a target program, monitor the real-time values of the cmp instruction operands of the target program during execution, and send those real-time values to the identification module; and a third processing unit configured to call the identification module to determine, based on the index structure and the real-time values of the cmp instruction operands, the fields that have an I2S association, and extract each such field together with its corresponding cmp instruction operand as I2S dependency data.
According to the system of the second aspect of the present invention, the first processing unit is specifically configured to construct the input use case in the following manner: obtaining the given parameters of the de Bruijn sequence B(k, n), namely the number of elements k and the order n; and constructing the input use case based on a character set A of size k, wherein the character set A comprises k different characters and the character length of the input use case is M.
According to the system of the second aspect of the present invention, the first processing unit is specifically configured to construct the index structure in the following manner: sliding a window of length n over the input use case to cut out M-n+1 subsequences, wherein the offsets of the subsequences relative to the initial subsequence are {0, 1, ..., M-n}, and the index structure is formed by each subsequence and its corresponding offset.
According to the system of the second aspect of the present invention, the third processing unit is specifically configured to: match each subsequence in the index structure against the real-time value of each cmp instruction operand; when the real-time value of a cmp instruction operand contains data that includes a subsequence, extract the first n characters of that data, look up those n characters in the index structure, and extract the corresponding offset; and obtain the cmp instruction operand and the I2S field corresponding to the real-time value to form the I2S dependency data, wherein the I2S dependency data further comprises the corresponding offset and the character length of the data that includes the subsequence.
A third aspect of the invention discloses an electronic device. The electronic device comprises a memory and a processor, wherein the memory stores a computer program, and the processor realizes the steps in the direct comparison dependency identification method for the computer binary program according to the first aspect of the invention when executing the computer program.
A fourth aspect of the invention discloses a computer-readable storage medium. The computer readable storage medium stores a computer program, which when executed by a processor, implements the steps in a direct comparison dependency identification method for a computer binary program according to the first aspect of the present invention.
In summary, the technical scheme provided by the application takes an input use case constructed from a de Bruijn sequence as input, analyzes the values of all cmp instruction operands while the target program runs, uses an index structure matched to the input to quickly locate the fields directly related to those operands, and records the related information to form data describing I2S dependencies. This improves the coverage of deep code achieved by fuzz testing and thereby improves the effectiveness of vulnerability discovery. The application mainly aims to solve the problem of missed I2S dependencies (false negatives) in the prior art.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings which are required in the description of the embodiments or the prior art will be briefly described, it being obvious that the drawings in the description below are some embodiments of the invention and that other drawings may be obtained from these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of a direct comparison dependency identification process for a computer binary program according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a scheme for constructing input use cases and index structures based on de Bruijn sequences according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a scheme for identifying I2S fields based on de Bruijn sequences according to an embodiment of the present invention;
FIG. 4 is a block diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
FA: a finite state automaton (FA) is a computational model of a class of discrete digital systems, commonly used to construct strings with certain regularities. A finite state automaton can be defined symbolically as a five-tuple (Q, Σ, δ, q0, F), where Q is the state set, containing a finite number of internal states; Σ is the input symbol set, which is also finite; δ: Q × Σ → Q is the transfer function, which defines which state is entered when a given input symbol is received in the current state; q0 is the initial state of the automaton; and F is the set of termination states: when the FA enters a state f with f ∈ F, the FA finishes executing.
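The five-tuple can be illustrated with a small simulator. This is a minimal sketch in Python, not the automaton used by the invention: the function name `run_fa`, the two-state example, and the dictionary encoding of δ are all assumptions made for illustration.

```python
def run_fa(states, alphabet, delta, q0, finals, word):
    """Run a deterministic FA (Q, Sigma, delta, q0, F) over `word`.

    Returns True if the automaton ends in a termination state.
    """
    if q0 not in states or not finals <= states:
        raise ValueError("q0 and F must be drawn from Q")
    state = q0
    for symbol in word:
        if symbol not in alphabet:
            raise ValueError(f"symbol {symbol!r} is not in the input symbol set")
        # delta: Q x Sigma -> Q, encoded here as a dict keyed by (state, symbol)
        state = delta[(state, symbol)]
    return state in finals


# Hypothetical example: accept words over {a, b} containing an even number of 'b'.
Q = {"even", "odd"}
SIGMA = {"a", "b"}
DELTA = {
    ("even", "a"): "even", ("even", "b"): "odd",
    ("odd", "a"): "odd", ("odd", "b"): "even",
}

print(run_fa(Q, SIGMA, DELTA, "even", {"even"}, "abba"))  # True
print(run_fa(Q, SIGMA, DELTA, "even", {"even"}, "ab"))    # False
```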
B(k, n): the k-ary de Bruijn sequence of order n. Such a sequence consists of the k mutually different characters a1, a2, …, ak of the character set A = {a1, a2, …, ak}, and every subsequence of length not less than n occupies a unique position within the sequence; that is, no such subsequence is repeated anywhere in the whole sequence.
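The uniqueness property can be checked on a concrete sequence. The sketch below uses the well-known Lyndon-word concatenation algorithm, which is only one of several valid constructions and is not necessarily the one used by the generation module; different constructions may yield different, equally valid sequences. The function names `de_bruijn` and `linearize` are assumptions for illustration.

```python
def de_bruijn(alphabet: str, n: int) -> str:
    """Return one cyclic de Bruijn sequence B(k, n), with k = len(alphabet)."""
    k = len(alphabet)
    a = [0] * (k * n)
    out = []

    def db(t: int, p: int) -> None:
        if t > n:
            if n % p == 0:
                out.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return "".join(alphabet[i] for i in out)


def linearize(cyclic: str, n: int) -> str:
    """Append the first n-1 characters so every cyclic window also appears linearly."""
    return cyclic + cyclic[:n - 1]


seq = linearize(de_bruijn("ab", 4), 4)
windows = [seq[i:i + 4] for i in range(len(seq) - 4 + 1)]

# Every window of length n occurs exactly once, so its position is unique.
print(len(seq), len(windows), len(set(windows)))  # 19 16 16
```

With k = 2 and n = 4 the cyclic sequence has k^n = 16 characters; the linear form has M = 16 + 3 = 19 characters and M - n + 1 = 16 distinct windows.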
Vulnerability: a flaw in a computer program or system, typically originating from a defect in the design of the program's source code.
Fuzz testing: a vulnerability discovery technique for computer programs that feeds a large number of randomly generated input use cases to the target program, monitors the abnormal conditions arising while the program processes them, and thereby discovers possible design defects and code vulnerabilities in the target program.
cmp instruction: an assembly instruction of a computer binary program that compares the data of two operands and sets condition register values according to the comparison result, thereby influencing the program's execution flow.
I2S direct comparison dependency (Input-to-State, I2S): when the actual runtime value of a cmp instruction operand in the program is a copy of the value of a field in the input use case, the field and the cmp instruction operand are considered to have an I2S dependency.
Redqueen: a fuzz testing work that fills input use cases with random values and observes changes in the execution state of the target program, compares the field values that trigger an execution state change with the operands of all cmp instructions, and identifies and exploits I2S dependencies to improve code coverage and thereby vulnerability discovery effectiveness.
Grimoire: a fuzz testing work that uses the same I2S identification and exploitation mechanism as Redqueen and adds a series of other improvements on top of Redqueen to further improve vulnerability discovery effectiveness.
The first aspect of the invention discloses a direct comparison dependency identification method for a computer binary program. FIG. 1 is a schematic diagram of a direct comparison dependency identification process for a computer binary program according to an embodiment of the present invention. As shown in FIG. 1, the method includes: S1, calling a generation module to construct an input use case and an index structure based on a de Bruijn sequence with given parameters, sending the input use case to a monitoring engine, and sending the index structure to an identification module; S2, calling the monitoring engine to take the input use case as the input of a target program, monitor the real-time values of the cmp instruction operands of the target program during execution, and send those real-time values to the identification module; and S3, calling the identification module to determine, based on the index structure and the real-time values of the cmp instruction operands, the fields that have an I2S association, and extracting each such field together with its corresponding cmp instruction operand as I2S dependency data.
In some embodiments, the user gives the parameters for generating the sequence B(k, n): the number of elements k, the order n, and the character set A. The generation module (in FIG. 1: the de Bruijn sequence input use case and index structure generation module) constructs an input use case and the corresponding index structure, and outputs them to the real-time monitoring engine and the identification module (in FIG. 1: the I2S field identification module based on de Bruijn matching), respectively; the real-time monitoring engine starts the target program with the input use case, monitors program execution state data such as branch coverage in real time, records the real-time values of the cmp instruction operands of the target program, and outputs the records to the I2S field identification module based on de Bruijn matching; the de Bruijn matching module matches the fields with I2S associations according to the index structure and the cmp instruction operand records, forming I2S dependency data tuples that record the mapping from field to cmp instruction operand; finally, all I2S dependency data tuples are collated and output, forming the I2S dependency data.
In some embodiments, in step S1, constructing the input use case specifically includes: obtaining the given parameters of the de Bruijn sequence B(k, n), namely the number of elements k and the order n; and constructing the input use case based on a character set A of size k, wherein the character set A comprises k different characters and the character length of the input use case is M.
In some embodiments, in step S1, constructing the index structure specifically includes: sliding a window of length n over the input use case to cut out M-n+1 subsequences, wherein the offsets of the subsequences relative to the initial subsequence are {0, 1, ..., M-n}, and the index structure is formed by each subsequence and its corresponding offset.
In some embodiments, step S3 specifically includes: matching each subsequence in the index structure against the real-time value of each cmp instruction operand; when the real-time value of a cmp instruction operand contains data that includes a subsequence, extracting the first n characters of that data, looking up those n characters in the index structure, and extracting the corresponding offset; and obtaining the cmp instruction operand and the I2S field corresponding to the real-time value to form the I2S dependency data, wherein the I2S dependency data further comprises the corresponding offset and the character length of the data that includes the subsequence.
Specifically, a target computer binary program has the following header check code:
The code checks whether the 8 bytes of the input use case starting at offset 4 are 'MAGICHDR', that is, whether the hexadecimal value of those 8 bytes is '4d 41 47 49 43 48 44 52'. If so, the test case triggers the program crash point bug1() and successfully exposes the software vulnerability. After compilation, the comparison exists in the computer binary program as a cmp instruction at address 0x100FC004. To support the construction of such test cases, the I2S dependency identification scheme in the method automatically identifies such direct comparison dependencies. The tester only needs to provide suitable sequence generation parameters and can then construct the corresponding input use case from the identified I2S dependency; alternatively, the scheme can be applied to fuzz testing to improve code coverage and the efficiency of discovering software vulnerabilities.
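The behaviour described in this paragraph can be modelled with a short Python stand-in. This is an illustrative sketch under stated assumptions, not the program's actual code: the names `target`, `cmp_log` and `Crash`, the byte-slice comparison, and the idea of logging both operands in a list are hypothetical; only the offset 4, the length 8, the constant 'MAGICHDR', the address 0x100FC004 and the name bug1 come from the text.

```python
MAGIC = b"MAGICHDR"  # constant compared by the header check (from the text)


class Crash(Exception):
    """Stands in for reaching the crash point bug1()."""


def bug1() -> None:
    raise Crash("bug1 reached")


def target(data: bytes, cmp_log: list) -> None:
    """Hypothetical model of the header check.

    `cmp_log` plays the role of the monitoring engine: it receives the
    runtime values of both operands of the comparison.
    """
    field = data[4:12]  # the 8 bytes starting at offset 4
    cmp_log.append(("0x100FC004", field, MAGIC))
    if field == MAGIC:
        bug1()


log = []
target(b"aaaababaabbbbabbaaa", log)  # de Bruijn input: the check fails, no crash
print(MAGIC.hex(" "))                # 4d 41 47 49 43 48 44 52
print(log[0][1])                     # b'babaabbb'
```

The logged operand value is a verbatim copy of input bytes 4 to 11, which is exactly the situation an I2S dependency describes.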
The tester gives the parameters for generating the sequence B(k, n): number of elements k = 2, order n = 4, character set A = {a, b}. The de Bruijn sequence input use case and index structure generation module constructs the input use case 'aaaababaabbbbabbaaa' and the corresponding index structure (see Table 1), and outputs them to the real-time monitoring engine and the I2S field identification module based on de Bruijn matching, respectively.
Subsequence    Initial offset
aaaa           0
aaab           1
aaba           2
abab           3
baba           4
abaa           5
baab           6
aabb           7
abbb           8
bbbb           9
bbba           10
bbab           11
babb           12
abba           13
bbaa           14
baaa           15
TABLE 1
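Table 1 can be reproduced by sliding a window of length n over the input use case. The following is a minimal sketch; the function name `build_index` and the choice of a Python dictionary as the index structure are assumptions, since the text only requires that each subsequence be stored with its offset.

```python
from typing import Dict


def build_index(input_case: str, n: int) -> Dict[str, int]:
    """Map each length-n subsequence of `input_case` to its starting offset."""
    index: Dict[str, int] = {}
    for offset in range(len(input_case) - n + 1):  # offsets 0 .. M-n
        sub = input_case[offset:offset + n]
        if sub in index:
            # A repeated window would make the offset ambiguous.
            raise ValueError(f"subsequence {sub!r} repeats; not a de Bruijn input")
        index[sub] = offset
    return index


INPUT_CASE = "aaaababaabbbbabbaaa"  # M = 19
index = build_index(INPUT_CASE, 4)  # M - n + 1 = 16 entries

for sub, offset in index.items():
    print(sub, offset)
```

A hash map gives constant-time lookup of the first n characters of a candidate match, which is what makes the later identification step fast.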
The real-time monitoring engine starts the target program with the input use case 'aaaababaabbbbabbaaa', monitors program execution state data such as branch coverage in real time, records the real-time values of the cmp instruction operands of the target program, and outputs the records to the I2S field identification module based on de Bruijn matching.
The de Bruijn matching module, using the index structure and the cmp instruction operand records, identifies the data 'babaabbb' in operand v0 of the cmp instruction at address 0x100FC004, and preliminarily judges it to be a legal subsequence according to the character set. Then, using the first four bytes of the data, 'baba', as the key, it looks up the index structure (see Table 1) and obtains the corresponding offset 4, and from this forms the I2S dependency data tuple <4,8,0x100FC004.v0>.
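The matching step can be sketched as follows. This is one possible reading of the description, not the module's actual implementation: the function name `identify_i2s`, the record layout (address, operand name, value), the use of strings rather than raw bytes, and the rule for extending a match (successive windows must map to consecutive offsets) are all assumptions.

```python
from typing import Dict, Iterable, List, Tuple

INPUT_CASE = "aaaababaabbbbabbaaa"
N = 4
# Index structure: each length-N subsequence mapped to its offset (as in Table 1).
INDEX: Dict[str, int] = {
    INPUT_CASE[i:i + N]: i for i in range(len(INPUT_CASE) - N + 1)
}


def identify_i2s(
    cmp_records: Iterable[Tuple[str, str, str]],
    index: Dict[str, int] = INDEX,
    n: int = N,
) -> List[Tuple[int, int, str]]:
    """Return <offset, length, address.operand> tuples for operand values
    that contain a copy of part of the input use case."""
    tuples = []
    for address, operand, value in cmp_records:
        i = 0
        while i + n <= len(value):
            offset = index.get(value[i:i + n])  # look up the first n characters
            if offset is None:
                i += 1
                continue
            length = n
            # Extend the match while each next window sits at the next offset.
            while i + length < len(value):
                window = value[i + length - n + 1:i + length + 1]
                if index.get(window) != offset + length - n + 1:
                    break
                length += 1
            tuples.append((offset, length, f"{address}.{operand}"))
            i += length
    return tuples


records = [
    ("0x100FC004", "v0", "babaabbb"),  # copy of input bytes 4..11
    ("0x100FC004", "v1", "MAGICHDR"),  # constant operand, not input-derived
]
print(identify_i2s(records))  # [(4, 8, '0x100FC004.v0')]
```

Because the input contains no repeated window of length n, a single lookup of the first n characters is enough to pin down the field's offset without re-running the target.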
The tester can locate the cmp instruction at 0x100FC004 from the I2S dependency data tuple <4,8,0x100FC004.v0>, and, using the value of the other operand, revise the input use case to 'aaaaMAGICHDRbabbaaa', thereby constructing an input use case that triggers the program crash point bug1(). Alternatively, when applied to fuzz testing, the scheme automatically constructs an input use case that enters the new branch and triggers the program crash point bug1().
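Using the tuple to revise the input amounts to overwriting `length` characters at `offset` with the value of the other operand. A minimal sketch, in which the helper name `patch_input` is an assumption:

```python
def patch_input(input_case: str, offset: int, length: int, wanted: str) -> str:
    """Overwrite the I2S field [offset, offset + length) with `wanted`."""
    if len(wanted) != length:
        raise ValueError("replacement must have the same length as the field")
    return input_case[:offset] + wanted + input_case[offset + length:]


# Tuple <4, 8, 0x100FC004.v0>; the other operand holds 'MAGICHDR'.
patched = patch_input("aaaababaabbbbabbaaa", 4, 8, "MAGICHDR")
print(patched)  # aaaaMAGICHDRbabbaaa
```

A fuzzer could apply the same substitution automatically for every identified tuple instead of relying on a tester.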
In some embodiments, the de Bruijn sequence generation FA (FIG. 2) in the de Bruijn sequence input use case and index structure generation module may be replaced by a module that applies another de Bruijn sequence construction algorithm; the step labeled de Bruijn matching in the main flow of I2S field identification based on de Bruijn matching (see FIG. 3) may be replaced by any substring matching algorithm.
The second aspect of the invention discloses a direct comparison dependency identification system for a computer binary program. The system comprises: a first processing unit configured to call a generation module to construct an input use case and an index structure based on a de Bruijn sequence with given parameters, send the input use case to a monitoring engine, and send the index structure to an identification module; a second processing unit configured to call the monitoring engine to take the input use case as the input of a target program, monitor the real-time values of the cmp instruction operands of the target program during execution, and send those real-time values to the identification module; and a third processing unit configured to call the identification module to determine, based on the index structure and the real-time values of the cmp instruction operands, the fields that have an I2S association, and extract each such field together with its corresponding cmp instruction operand as I2S dependency data.
According to the system of the second aspect of the present invention, the first processing unit is specifically configured to construct the input use case in the following manner: obtaining the given parameters of the de Bruijn sequence B(k, n), namely the number of elements k and the order n; and constructing the input use case based on a character set A of size k, wherein the character set A comprises k different characters and the character length of the input use case is M.
According to the system of the second aspect of the present invention, the first processing unit is specifically configured to construct the index structure in the following manner: sliding a window of length n over the input use case to cut out M-n+1 subsequences, wherein the offsets of the subsequences relative to the initial subsequence are {0, 1, ..., M-n}, and the index structure is formed by each subsequence and its corresponding offset.
According to the system of the second aspect of the present invention, the third processing unit is specifically configured to: match each subsequence in the index structure against the real-time value of each cmp instruction operand; when the real-time value of a cmp instruction operand contains data that includes a subsequence, extract the first n characters of that data, look up those n characters in the index structure, and extract the corresponding offset; and obtain the cmp instruction operand and the I2S field corresponding to the real-time value to form the I2S dependency data, wherein the I2S dependency data further comprises the corresponding offset and the character length of the data that includes the subsequence.
A third aspect of the invention discloses an electronic device. The electronic device comprises a memory and a processor, wherein the memory stores a computer program, and the processor realizes the steps in the direct comparison dependency identification method for the computer binary program according to the first aspect of the invention when executing the computer program.
FIG. 4 is a block diagram of an electronic device according to an embodiment of the present invention. As shown in FIG. 4, the electronic device includes a processor, a memory, a communication interface, a display screen, and an input device connected through a system bus. The processor of the electronic device provides computing and control capabilities. The memory of the electronic device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The communication interface of the electronic device is used for wired or wireless communication with an external terminal, and the wireless communication can be achieved through Wi-Fi, an operator network, Near Field Communication (NFC) or other technologies. The display screen of the electronic device can be a liquid crystal display screen or an electronic ink display screen. The input device of the electronic device can be a touch layer covering the display screen, a key, trackball or touch pad arranged on the housing of the electronic device, or an external keyboard, touch pad or mouse.
It will be appreciated by those skilled in the art that the structure shown in fig. 4 is merely a block diagram of a portion related to the technical solution of the present disclosure, and does not constitute a limitation of the electronic device to which the technical solution of the present disclosure is applied, and a specific electronic device may include more or less components than those shown in the drawings, or may combine some components, or have different component arrangements.
A fourth aspect of the invention discloses a computer-readable storage medium. The computer readable storage medium stores a computer program, which when executed by a processor, implements the steps in a direct comparison dependency identification method for a computer binary program according to the first aspect of the present invention.
In summary, without requiring the source code of the target computer binary program, the technical scheme provided by the invention automatically identifies the fields with I2S dependencies across the whole input use case, identifies such fields more completely, avoids missed I2S dependencies, helps testers or fuzzers construct test cases that enter new branches and cover deeper code, reduces the time cost of I2S identification, and improves the efficiency of discovering software vulnerabilities.
Note that the technical features of the above embodiments may be arbitrarily combined, and all possible combinations of the technical features in the above embodiments are not described for brevity of description, however, as long as there is no contradiction between the combinations of the technical features, they should be regarded as the scope of the description. The above examples illustrate only a few embodiments of the application, which are described in detail and are not to be construed as limiting the scope of the application. It should be noted that it will be apparent to those skilled in the art that several variations and modifications can be made without departing from the spirit of the application, which are all within the scope of the application. Accordingly, the scope of protection of the present application is to be determined by the appended claims.

Claims (8)

1. A direct comparison dependency identification method for a computer binary program, the method comprising:
S1, calling a generation module to construct an input use case and an index structure based on a de Bruijn sequence with given parameters, sending the input use case to a monitoring engine, and sending the index structure to an identification module;
S2, calling the monitoring engine to take the input use case as the input of a target program, monitoring cmp instruction operand real-time values of the target program in an execution state, and sending the cmp instruction operand real-time values to the identification module;
S3, calling the identification module to determine a field with an I2S association relation based on the index structure and the cmp instruction operand real-time values, and extracting the field with the I2S association relation and the cmp instruction operand corresponding to the field as I2S dependent data;
wherein the step S3 specifically includes:
matching each subsequence in the index structure against each cmp instruction operand real-time value;
when the real-time value of a cmp instruction operand contains data that includes a subsequence, extracting the first n characters of that data, searching for the first n characters in the index structure, and extracting the offset corresponding to the first n characters;
and acquiring the cmp instruction operand and the I2S field corresponding to the cmp instruction operand real-time value to form the I2S dependent data, wherein the I2S dependent data further comprises the corresponding offset and the character length of the data that includes the subsequence.
2. The direct comparison dependency identification method for a computer binary program according to claim 1, wherein in the step S1, constructing the input use case specifically includes:
obtaining the given parameters of the de Bruijn sequence B(k, n): the number of elements k and the order n;
and constructing the input use case based on a character set A of size k, wherein the character set A comprises k different characters, and the character length of the input use case is M.
3. The direct comparison dependency identification method for a computer binary program according to claim 2, wherein in the step S1, constructing the index structure specifically comprises: shifting over the input use case based on the order n and segmenting it into M-n+1 subsequences, wherein the offsets of the subsequences relative to the initial subsequence are {0, 1, ..., M-n}, and the index structure is formed by the subsequences and the offset corresponding to each subsequence.
4. A direct comparison dependency identification system for a computer binary program, the system comprising:
a first processing unit configured to: call a generation module to construct an input use case and an index structure based on a de Bruijn sequence with given parameters, send the input use case to a monitoring engine, and send the index structure to an identification module;
a second processing unit configured to: call the monitoring engine to take the input use case as the input of a target program, monitor cmp instruction operand real-time values of the target program in an execution state, and send the cmp instruction operand real-time values to the identification module;
a third processing unit configured to: call the identification module to determine a field with an I2S association relation based on the index structure and the cmp instruction operand real-time values, and extract the field with the I2S association relation and the cmp instruction operand corresponding to the field as I2S dependent data;
wherein the third processing unit is specifically configured to:
match each subsequence in the index structure against each cmp instruction operand real-time value;
when the real-time value of a cmp instruction operand contains data that includes a subsequence, extract the first n characters of that data, search for the first n characters in the index structure, and extract the offset corresponding to the first n characters;
and acquire the cmp instruction operand and the I2S field corresponding to the cmp instruction operand real-time value to form the I2S dependent data, wherein the I2S dependent data further comprises the corresponding offset and the character length of the data that includes the subsequence.
5. The direct comparison dependency identification system for a computer binary program according to claim 4, wherein the first processing unit is specifically configured to construct the input use case in the following manner:
obtaining the given parameters of the de Bruijn sequence B(k, n): the number of elements k and the order n;
and constructing the input use case based on a character set A of size k, wherein the character set A comprises k different characters, and the character length of the input use case is M.
6. The direct comparison dependency identification system for a computer binary program according to claim 5, wherein the first processing unit is specifically configured to construct the index structure in the following manner: shifting over the input use case based on the order n and segmenting it into M-n+1 subsequences, wherein the offsets of the subsequences relative to the initial subsequence are {0, 1, ..., M-n}, and the index structure is formed by the subsequences and the offset corresponding to each subsequence.
7. An electronic device comprising a memory and a processor, the memory storing a computer program, the processor implementing the steps in a direct comparison dependency identification method for a computer binary program according to any one of claims 1 to 3 when the computer program is executed.
8. A computer readable storage medium, characterized in that the computer readable storage medium has stored thereon a computer program which, when executed by a processor, implements the steps of a direct comparison dependency identification method for a computer binary program according to any one of claims 1 to 3.
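The matching step recited in step S3 of claim 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names I2SRecord and find_i2s, the operand labels, and the representation of cmp operand values as already-decoded strings are all hypothetical. The sketch also assumes that a match is extended character by character against the input use case to obtain the field length. The toy input "aabba" is B(2, 2) over {a, b} with its first character appended.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class I2SRecord:
    operand: str  # hypothetical label of the cmp operand, e.g. "cmp1:op0"
    offset: int   # offset of the matched field in the input use case
    length: int   # character length of the matched field


def build_index(input_case: str, n: int) -> Dict[str, int]:
    """Map every length-n subsequence of the input use case to its offset."""
    return {input_case[i:i + n]: i for i in range(len(input_case) - n + 1)}


def find_i2s(input_case: str, n: int,
             operand_values: Dict[str, str]) -> List[I2SRecord]:
    """Scan observed cmp operand values for fields copied from the input."""
    index = build_index(input_case, n)
    records: List[I2SRecord] = []
    for operand, value in operand_values.items():
        i = 0
        while i + n <= len(value):
            # Look up the first n characters of the candidate field.
            offset = index.get(value[i:i + n])
            if offset is None:
                i += 1
                continue
            # Extend the match while the operand keeps following the input.
            length = n
            while (i + length < len(value)
                   and offset + length < len(input_case)
                   and value[i + length] == input_case[offset + length]):
                length += 1
            records.append(I2SRecord(operand, offset, length))
            i += length
    return records


if __name__ == "__main__":
    observed = {"cmp1:op0": "xabbz", "cmp2:op1": "zz"}
    for rec in find_i2s("aabba", 2, observed):
        print(rec)  # I2SRecord(operand='cmp1:op0', offset=1, length=3)
```

In a real monitoring engine the operand values would be raw register or memory contents, so decoding (including byte order) would have to happen before a lookup of this kind; that part is outside the sketch.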
CN202211329616.4A 2022-10-27 2022-10-27 Direct comparison dependency identification method and system for computer binary program Active CN115687111B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211329616.4A CN115687111B (en) 2022-10-27 2022-10-27 Direct comparison dependency identification method and system for computer binary program

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211329616.4A CN115687111B (en) 2022-10-27 2022-10-27 Direct comparison dependency identification method and system for computer binary program

Publications (2)

Publication Number Publication Date
CN115687111A CN115687111A (en) 2023-02-03
CN115687111B true CN115687111B (en) 2024-05-14

Family

ID=85099940

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211329616.4A Active CN115687111B (en) 2022-10-27 2022-10-27 Direct comparison dependency identification method and system for computer binary program

Country Status (1)

Country Link
CN (1) CN115687111B (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101847121A (en) * 2010-05-07 2010-09-29 北京大学 Method for discovering software vulnerabilities
TW201633747A (en) * 2014-11-26 2016-09-16 惠普發展公司有限責任合夥企業 Determine vulnerability using runtime agent and network sniffer
CN108845944A (en) * 2018-06-28 2018-11-20 中国人民解放军国防科技大学 Method for improving software fuzz testing efficiency by combining symbolic execution
CN109308415A (en) * 2018-09-21 2019-02-05 四川大学 One kind is towards binary guiding performance fuzz testing method and system
CN111581106A (en) * 2020-05-12 2020-08-25 全球能源互联网研究院有限公司 Binary program vulnerability testing method and device and readable storage medium
CN111859388A (en) * 2020-06-30 2020-10-30 广州大学 Multi-level mixed vulnerability automatic mining method
KR20210045122A (en) * 2019-10-16 2021-04-26 연세대학교 산학협력단 Apparatus and method for generating test input a software using symbolic execution
CN112800423A (en) * 2021-01-26 2021-05-14 北京航空航天大学 Binary code authorization vulnerability detection method
CN113963749A (en) * 2021-09-10 2022-01-21 华南农业大学 High-throughput sequencing data automatic assembly method, system, equipment and storage medium
CN114546836A (en) * 2022-01-26 2022-05-27 中国人民解放军战略支援部队信息工程大学 Public component library automatic testing method and device based on push-down automaton guidance
CN115017516A (en) * 2022-06-02 2022-09-06 电子科技大学 Fuzzy test method based on symbolic execution

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090182988A1 (en) * 2008-01-11 2009-07-16 International Business Machines Corporation Compare Relative Long Facility and Instructions Therefore

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
牛伟纳; 丁雪峰; 刘智; 张小松. Binary code vulnerability discovery based on symbolic execution. Computer Science. 2013, (No. 10), full text. *
黎瑶; 钟诚. Aligning highly repetitive short sequences through region filtering and a succinct de Bruijn graph. Journal of Chinese Computer Systems. 2020, (No. 09), full text. *

Also Published As

Publication number Publication date
CN115687111A (en) 2023-02-03

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant