CN111914260A - Binary program vulnerability detection method based on function difference - Google Patents

Binary program vulnerability detection method based on function difference

Info

Publication number
CN111914260A
Authority
CN
China
Prior art keywords
function
vulnerability
patch
target
path
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010574987.3A
Other languages
Chinese (zh)
Other versions
CN111914260B (en)
Inventor
晋武侠
徐一飞
刘烃
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian Jiaotong University
Original Assignee
Xian Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian Jiaotong University
Priority to CN202010574987.3A
Publication of CN111914260A
Application granted
Publication of CN111914260B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/57 Certifying or maintaining trusted computer platforms, e.g. secure boots or power-downs, version controls, system software checks, secure updates or assessing vulnerabilities
    • G06F21/577 Assessing vulnerabilities and evaluating computer system security
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Preventing errors by testing or debugging software
    • G06F11/3604 Software analysis for verifying properties of programs
    • G06F11/3608 Software analysis for verifying properties of programs using formal methods, e.g. model checking, abstract interpretation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2221/00 Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/03 Indexing scheme relating to G06F21/50, monitoring users, programs or devices to maintain the integrity of platforms
    • G06F2221/033 Test or assess software

Abstract

The invention discloses a binary program vulnerability detection method based on function difference. For a known vulnerability function, the method extracts patch features, matches those features in a suspected target function, identifies whether the corresponding patch has been applied, and thereby determines whether the known vulnerability is present. First, the vulnerability-related function is determined, and the binary code containing the vulnerable function and the repaired function is collected and disassembled. Second, the changes between the two versions of the same function are determined by differential analysis, and patch features are generated. Finally, a suspected target function is screened out of the target program, the local key region in the target function is located and represented, feature matching is performed through similarity calculation to detect whether the target function contains the vulnerability, and vulnerability detection of the target program is completed according to the function-level results. Given a known vulnerability to search for, the method aims to detect quickly and accurately whether the target program contains it, and it addresses the high false alarm rate of existing vulnerability detection methods based on function matching.

Description

Binary program vulnerability detection method based on function difference
Technical Field
The invention belongs to the technical field of binary program analysis and vulnerability detection, and particularly relates to a binary program vulnerability detection method based on function difference.
Background
Known vulnerabilities are those for which patches have been issued. As component-based development matures, the availability of third-party class libraries greatly improves development efficiency. However, developers may not know every component of their software in detail; they may focus on implementing the program's functional logic, use old versions of class libraries, or fail to update some components in time. If vulnerabilities that have already been discovered and reported exist in those class libraries or components, they continue to affect the developed program and form a potential safety hazard. As the software industry develops, commercial software has grown considerably and closed-source software has become more common; binary code has become one of the main forms in which software exists, and using or depending on such programs is not unusual. Therefore, vulnerability detection in a program must handle the case where information such as the source code and its version cannot be obtained.
Most existing binary vulnerability detection methods are binary program similarity detection techniques at function granularity: with a function containing a known vulnerability as the search target, similar functions are searched for in the file under test, and a program containing a similar function is judged to contain the vulnerability. This approach has two problems. First, when a vulnerable function is highly similar to a non-vulnerable function, false alarms arise easily and the non-vulnerable function is judged to be vulnerable. Second, as versions change, the same function may undergo multiple changes (such as functional updates) across versions; these vulnerability-irrelevant changes (which can be regarded as noise) affect the overall similarity judgment of the function, interfere with the vulnerability judgment, and lead to erroneous results.
Disclosure of Invention
In order to solve the problem of high false alarm rate of the conventional method, the invention provides a binary program vulnerability detection method based on function difference, which is used for accurately judging whether a known vulnerability exists in a binary program or whether a corresponding patch exists in the binary program by constructing patch characteristics.
In order to achieve the purpose, the invention adopts the technical scheme that:
a binary program vulnerability detection method based on function difference comprises the following steps:
S1: constructing the feature extraction objects, namely the vulnerability function VF and the repaired function PF;
S2: generating patch features for patch identification by using a binary function differential analysis technique;
S3: taking the vulnerability function as the object, and screening out of the target program a function similar to the vulnerability function, based on a binary function similarity detection technique, to serve as the suspected target function TF;
S4: performing patch identification on the target function TF according to the similarity relation between the effective path sets generated for the target function TF and the patch features; if the patch can be identified in the target function, determining that the vulnerability does not exist in the function; otherwise, determining that the function still contains the vulnerability; this step comprises determining the local key region related to the patch in the target function TF by using a binary function differential analysis technique, generating and reducing effective paths, and performing patch identification in the target function TF;
S5: judging whether any vulnerability-related function remains to be analyzed: if the vulnerability affects only one function, the method proceeds to the next step; if the vulnerability affects multiple functions and a related function remains to be analyzed, returning to step S1 to continue iterating;
S6: judging whether the target binary program contains the vulnerability according to the determinations made for all functions in the target binary program that are related to the vulnerability to be searched: if the vulnerability affects only one function, the determination for that function is the determination for the vulnerability; if the vulnerability affects multiple functions, the program is considered to still contain the vulnerability if more than one function is determined to contain it, or if the number of functions determined to contain it is greater than or equal to the number of functions determined to be repaired; otherwise, the vulnerability is considered to be repaired.
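The program-level rule in S6 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the function name `program_contains_vulnerability` and the list-of-booleans input (one entry per vulnerability-related function, `True` meaning "still vulnerable") are hypothetical.

```python
def program_contains_vulnerability(verdicts):
    """Aggregate per-function verdicts into a program-level verdict (step S6).

    verdicts: list of bool, True if a related function was judged to still
    contain the vulnerability, False if it was judged repaired.
    """
    if not verdicts:
        raise ValueError("at least one related function must be analyzed")
    if len(verdicts) == 1:
        # One affected function: its verdict is the program's verdict.
        return verdicts[0]
    vulnerable = sum(1 for v in verdicts if v)
    repaired = len(verdicts) - vulnerable
    # Several affected functions: vulnerable if more than one function is
    # vulnerable, or if vulnerable functions are at least as many as repaired.
    return vulnerable > 1 or vulnerable >= repaired


single = program_contains_vulnerability([True])
tie = program_contains_vulnerability([True, False])
mostly_repaired = program_contains_vulnerability([True, False, False])
```

With two affected functions of which one is vulnerable and one repaired, the counts tie and the rule reports the program as vulnerable.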
As a further improvement of the invention: in step S1, the functions related to the vulnerability are determined according to the related information of the known vulnerability to be searched; the binary code of the last vulnerable version VF and of the first repaired version PF of each related function is collected and disassembled with a disassembly tool; and the control flow graphs represented by assembly code are constructed for VF and PF, respectively, as the feature extraction objects.
In step S2, differential analysis is performed on the input vulnerability function and its repaired function by using a binary differential analysis technique; based on the differential analysis result, the boundary basic blocks BBB of all changed basic blocks CBB in the two functions are located, and the patch features are constructed accordingly. A changed basic block is a basic block that is added, deleted, or modified between the functions; a boundary basic block is a neighbor node of a changed basic block, and may itself be a changed basic block.
In step S2, connecting the changed basic blocks and the boundary basic blocks forms a plurality of local control flow graphs in the function, and traversing these local control flow graphs yields the set of all effective paths in the function. An effective path VT is a continuous basic block sequence that starts and ends with a boundary basic block, contains at least one changed basic block, and contains no loop; if a loop exists, it is flattened. The effective path set generated from the vulnerability function is T1, and the effective path set of the repaired function is T2. The boundary basic blocks and the effective path sets together serve as the patch features.
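The effective path definition above can be illustrated with a small traversal. This is a sketch under stated assumptions, not the patent's algorithm: the function name `effective_paths` is hypothetical, revisited blocks are simply skipped instead of flattening loops, and the edge list is inferred here from the path set T1 listed in the detailed description (the figures are not available). Block A is placed in both sets because the listed paths start at A, which is a changed block.

```python
def effective_paths(edges, boundary, changed):
    """Enumerate loop-free paths that start and end at a boundary block,
    pass only through changed blocks in between, and contain at least one
    changed block."""
    successors = {}
    for src, dst in edges:
        successors.setdefault(src, []).append(dst)
    paths = []

    def walk(path):
        for nxt in successors.get(path[-1], []):
            if nxt in path:
                continue  # simplification: skip loops rather than flatten them
            candidate = path + [nxt]
            if nxt in boundary:
                # A boundary block closes the path.
                if any(block in changed for block in candidate):
                    paths.append(tuple(candidate))
            elif nxt in changed:
                walk(candidate)

    for start in sorted(boundary):
        walk([start])
    return paths


# Assumed local control flow edges for the vulnerable function (inferred).
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("C", "K"),
         ("D", "E"), ("D", "G"), ("E", "F"), ("F", "G")]
changed = {"A", "C", "D", "F"}
boundary = {"A", "B", "E", "K", "G"}
t1 = effective_paths(edges, boundary, changed)
```

Under these assumptions the traversal yields the eight paths of T1 given in the example, such as A->B and E->F->G.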
In step S2, for each effective path, first removing the jump instruction between the basic blocks, connecting the basic blocks in the path into an instruction sequence, and then normalizing the instructions in the effective path, including:
1) address standardization: replacing the specific address with "address";
2) memory standardization: memory addressing is replaced with "mem";
3) register normalization: the specific register is replaced with "reg".
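The three normalization rules and the removal of jump instructions can be sketched as below. This is a minimal illustration that assumes Intel-syntax x86 text instructions; the register list, the regular expressions, and the names `normalize` and `path_to_sequence` are assumptions introduced here, and only hex call targets are treated as concrete addresses.

```python
import re

# Assumed subset of x86 register names; extend for other architectures.
REGISTERS = (
    "rax rbx rcx rdx rsi rdi rbp rsp "
    "eax ebx ecx edx esi edi ebp esp "
    "ax bx cx dx si di bp sp al bl cl dl ah bh ch dh"
).split()
REG_RE = re.compile(r"\b(?:" + "|".join(REGISTERS) + r")\b")
# A bracketed operand, optionally preceded by a size prefix like "dword ptr".
MEM_RE = re.compile(r"(?:\b(?:byte|word|dword|qword)\s+ptr\s+)?\[[^\]]*\]")
# A call whose target is a hex literal is treated as a concrete address.
CALL_RE = re.compile(r"^(call\s+)0x[0-9a-fA-F]+$")


def normalize(instr):
    """Apply memory, address, and register normalization to one instruction."""
    instr = MEM_RE.sub("mem", instr)              # rule 2: memory addressing
    instr = CALL_RE.sub(r"\g<1>address", instr)   # rule 1: concrete address
    return REG_RE.sub("reg", instr)               # rule 3: concrete register


def path_to_sequence(blocks):
    """Join the basic blocks of a path into one normalized instruction list,
    dropping jumps (on x86, every mnemonic starting with 'j' is a jump)."""
    sequence = []
    for block in blocks:
        for instr in block:
            if instr.split()[0].startswith("j"):
                continue
            sequence.append(normalize(instr))
    return sequence


example_path = [
    ["mov eax, dword ptr [ebp+0x8]", "cmp eax, 0x0", "jz 0x8048abc"],
    ["call 0x8048f00", "ret"],
]
normalized = path_to_sequence(example_path)
```

Memory operands are replaced first so that registers and constants inside the brackets do not leak into the later rules.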
In step S4, it is determined whether a target function has been patched, and if the target function has been patched, it is determined that a corresponding bug in the target function has been fixed, otherwise, it is determined that the bug still exists in the target function, and step S4 specifically includes:
S401: Generating the effective paths of the target function: first, the target function TF is differentially analyzed against the vulnerability function VF and the repaired function PF, respectively, using a binary differential analysis method. For the differential analysis result of TF and VF, a basic block matching algorithm is used to match the boundary basic blocks BBB of the PF in the patch features within the target function TF; one or more local control flow graphs can be constructed by connecting the CBBs in TF with the matched boundary basic blocks BBB in TF that are adjacent to those CBBs. These local control flow graphs are regarded as the embodiment of patch or vulnerability behavior and are called local key regions, and the effective path set T3 is generated by traversing them. Similarly, the effective path set T4 is generated by connecting the CBBs in VF with the corresponding BBBs in VF. For the differential analysis result of TF and PF, a basic block matching algorithm is used to match the boundary basic blocks BBB of the VF in the patch features within the target function TF; one or more local control flow graphs can be constructed by connecting the CBBs in TF with the matched boundary basic blocks BBB in TF that are adjacent to those CBBs, and the effective path set T5 is generated by traversing them. Similarly, the effective path set T6 is generated by connecting the CBBs in PF with the corresponding BBBs in PF. After the effective path sets are generated, effective paths connected end to end are merged; then, for each effective path, the jump instructions between basic blocks are removed, the basic blocks in the path are connected into an instruction sequence, and the instructions in the effective path are normalized, including:
1) address standardization: replacing the specific address with "address";
2) memory standardization: memory addressing is replaced with "mem";
3) register normalization: the specific register is replaced with "reg".
S402: Reduction of the target function paths: for each effective path t in T4, if T1 contains no path that shares a common CBB with t, then t is removed from T4, forming the reduced effective path set T41. Similarly, T6 is compared with T2, and the irrelevant paths are removed from T6, forming the reduced effective path set T62.
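The reduction in S402 can be sketched as a filter. This is an illustration under stated assumptions: paths are tuples of basic block identifiers, both path sets refer to blocks of the same function so identifiers are comparable, and the name `reduce_paths` and the toy block labels are hypothetical.

```python
def reduce_paths(candidates, reference, changed):
    """Keep only candidate paths that share at least one changed basic block
    with some path in the reference set (e.g. T4 against T1, T6 against T2)."""
    kept = []
    for path in candidates:
        path_changed = set(path) & changed
        if any(path_changed & set(ref) for ref in reference):
            kept.append(path)
    return kept


# Toy data: block "C" is changed in both sets, block "Y" only in the candidates.
t4 = [("B", "C", "K"), ("X", "Y", "Z")]
t1 = [("A", "C", "K")]
t41 = reduce_paths(t4, t1, changed={"C", "Y"})
```

Here the path through Y is dropped because no path in the reference set contains a changed block in common with it.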
S403: Decision algorithm: the relationship among the functions is deduced from the similarity relations of the differences among them, to determine whether the target function has been patched and so complete vulnerability detection, where Sim(T, T') represents the similarity between path sets. There are three cases:
Case 1: T1 and T2 are both non-empty: if the target function has been repaired, the difference T3 between TF and VF should be more pronounced than the difference T5 between TF and PF, and T3 should be more similar to T2 in the patch features; if the target function still contains the vulnerability, the difference T5 between TF and PF should be more pronounced than the difference T3 between TF and VF, and T5 should be more similar to T1 in the patch features. Therefore, in this case, if Sim(T3, T2) > Sim(T5, T1), the target function is considered repaired; otherwise, it is considered to still contain the vulnerability;
Case 2: T1 is empty and T2 is non-empty: T1 being empty means that the patch only adds new code. If the target function has been patched, T3 will be more similar to T2, and T62 should be empty; if the target function still contains the vulnerability, T62 and T2 should be more similar, and T3 should be empty. Therefore, in this case, if Sim(T2, T3) > Sim(T2, T62), the target function is considered repaired; otherwise, it is considered to still contain the vulnerability;
Case 3: T2 is empty and T1 is non-empty: T2 being empty means that the patch only deletes code. Similar to Case 2, if Sim(T1, T41) > Sim(T1, T5), the target function is considered repaired; otherwise, it is considered to still contain the vulnerability.
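The three cases can be written as a single decision function. This is a sketch, not the patent's implementation: the name `is_patched` is hypothetical, the similarity function is passed in as a parameter, and `overlap` is only a stand-in set similarity (Jaccard overlap) used to make the example runnable. The behavior when both T1 and T2 are empty is not described in the source, so the sketch raises an error.

```python
def is_patched(T1, T2, T3, T5, T41, T62, sim):
    """Return True if the target function is judged repaired (step S403)."""
    if T1 and T2:   # Case 1: patch both removes and adds paths
        return sim(T3, T2) > sim(T5, T1)
    if T2:          # Case 2: T1 empty, patch only adds code
        return sim(T2, T3) > sim(T2, T62)
    if T1:          # Case 3: T2 empty, patch only deletes code
        return sim(T1, T41) > sim(T1, T5)
    raise ValueError("T1 and T2 are both empty: no patch features to compare")


def overlap(a, b):
    """Stand-in similarity: Jaccard overlap of two path collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Mirrors the shape of the worked example: T3 is empty and T5 equals T1.
T1 = [("A", "B"), ("E", "F", "G")]
T2 = [("A'", "B'"), ("H'", "I'", "J'")]
still_vulnerable = not is_patched(T1, T2, T3=[], T5=T1, T41=[], T62=T2, sim=overlap)
```

Passing the similarity in as a parameter keeps the decision logic independent of how path-set similarity is actually computed.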
As a further improvement of the invention: in step S403, to calculate the similarity score between each pair of path sets, the similarity between individual paths is calculated first, by comparing the instruction sequences of the two paths; the specific calculation formula is as follows:
Figure BDA0002551006640000041
where t1 and t2 are the two paths for which a similarity score is to be calculated, edge(t1, t2) is the edit distance between the two paths, and len(t1) and len(t2) are the lengths of paths t1 and t2, respectively;
after the similarity of each pair of paths is calculated, the final similarity between the two sets is calculated according to the following formula:
Figure BDA0002551006640000042
where T1 and T2 are the two path sets for which a similarity score is to be calculated, t1 and t2 are paths from sets T1 and T2, respectively, Sim(t1, t2) is the similarity between the two paths, |T1| and |T2| are the numbers of paths in sets T1 and T2, respectively, and len(T1) and len(T2) are the total path lengths in sets T1 and T2, respectively.
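A path-level similarity based on edit distance can be sketched as follows. The two formulas are given only as figures in the source and are not reproduced here, so the normalization by the longer path length is an assumption, not the patent's formula; the names `edit_distance` and `path_similarity` are hypothetical, and the set-level aggregation is not attempted.

```python
def edit_distance(a, b):
    """Levenshtein distance between two instruction sequences."""
    previous = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        current = [i]
        for j, y in enumerate(b, 1):
            current.append(min(
                previous[j] + 1,             # delete x
                current[j - 1] + 1,          # insert y
                previous[j - 1] + (x != y),  # substitute x by y (free if equal)
            ))
        previous = current
    return previous[-1]


def path_similarity(t1, t2):
    """Assumed normalization: 1 - edit distance / length of the longer path."""
    longest = max(len(t1), len(t2))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(t1, t2) / longest


p1 = ["mov reg, mem", "cmp reg, 0x0", "ret"]
p2 = ["mov reg, mem", "ret"]
score = path_similarity(p1, p2)
```

Each normalized instruction is treated as one symbol, so a single inserted or deleted instruction costs one edit.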
Compared with the prior art, the invention has the beneficial effects that:
(1) a binary program vulnerability detection method based on function difference is provided, which constructs patch features to determine accurately whether the corresponding patch exists in the binary program and hence whether the known vulnerability exists; this addresses the high false alarm rate of existing vulnerability detection methods based on function matching and improves detection accuracy;
(2) the binary program can be analyzed directly, without depending on program source code;
(3) only a small number of basic blocks are used for feature generation and patch identification, which improves the speed and accuracy of patch identification and makes the method capable of analyzing large programs in real scenarios;
(4) boundary basic blocks are used to locate the local key region in the target function and form local control flow graphs, which reduces the influence of vulnerability-irrelevant modifications in the target function and improves robustness to such noise;
(5) whether a target object contains a known vulnerability can be verified at both function granularity and binary program granularity.
Drawings
FIG. 1 is an overall flow chart of the method of the present invention;
FIG. 2 is a schematic diagram of the difference result between VF and PF; FIG. 2(a) and FIG. 2(b) show the control flow graphs of the function dtls1_process_heartbeat() in OpenSSL 1.0.1f and OpenSSL 1.0.1g, respectively;
FIG. 3 is a schematic diagram of the difference result between TF and VF; FIG. 3(a) and FIG. 3(b) show the control flow graphs of the function dtls1_process_heartbeat() in OpenSSL 1.0.1e and OpenSSL 1.0.1f, respectively;
FIG. 4 is a schematic diagram of the difference result between TF and PF; FIG. 4(a) and FIG. 4(b) show the control flow graphs of the function dtls1_process_heartbeat() in OpenSSL 1.0.1e and OpenSSL 1.0.1g, respectively.
Detailed Description
In order to make the objects, features and advantages of the present invention more apparent and understandable, embodiments of the present invention are described in detail below with reference to the accompanying drawings and examples.
Taking the known vulnerability CVE-2014-0160 as an example, the binary program of the class library OpenSSL 1.0.1e is used as the target binary program in which the vulnerability is to be detected.
As shown in fig. 1, a binary program vulnerability detection method based on function difference includes the following steps:
Step S1: according to related information such as patches, the functions related to the known vulnerability CVE-2014-0160 are determined to be dtls1_process_heartbeat() and tls1_process_heartbeat(), and dtls1_process_heartbeat() is selected for analysis first; the binary code of the last vulnerable version of the function, namely dtls1_process_heartbeat() in OpenSSL 1.0.1f, and of the first repaired version, namely dtls1_process_heartbeat() in OpenSSL 1.0.1g, is collected; the two functions are disassembled with a disassembly tool to obtain the control flow graphs VF and PF represented by assembly code.
Step S2: differential analysis is performed on the input vulnerability function VF and the repaired function PF; the binary differential analysis technique is not specifically limited. Based on the differential analysis result, the boundary basic blocks BBB of all changed basic blocks CBB in the two functions can be located. FIG. 2(a) and FIG. 2(b) show the control flow graphs of dtls1_process_heartbeat() in OpenSSL 1.0.1f and OpenSSL 1.0.1g, respectively. FIG. 2 shows the result of the differential analysis of the two functions, where basic blocks A, C, D, F and basic blocks A′, C′, D′, E′, F′, G′, I′, L′, M′, N′ are CBBs, and basic blocks B, E, K, G and basic blocks B′, H′, P′, J′, U′ are BBBs;
connecting the changed basic blocks and the boundary basic blocks forms a plurality of local control flow graphs in the function; for example, in FIG. 2(a), basic blocks E, F, and G form a local control flow graph, and traversing it yields the effective path 'E->F->G'; constructing and traversing all local control flow graphs yields all effective paths in the function, which form the effective path set T1 as follows:
A->B; B->C->D->E;
A->C->D->E; B->C->D->G;
A->C->D->G; B->C->K;
A->C->K; E->F->G;
similarly, as shown in FIG. 2(b), the effective path set T2 can be constructed as follows:
A′->B′; A′->C′->L′->U′;
A′->C′->D′->M′->U′; A′->C′->D′->E′->F′->N′->U′;
A′->C′->D′->E′->F′->G′->H′; A′->C′->D′->E′->F′->G′->J′;
A′->C′->D′->E′->P′; B′->C′->L′->U′;
B′->C′->D′->M′->U′; B′->C′->D′->E′->F′->N′->U′;
B′->C′->D′->E′->F′->G′->H′; B′->C′->D′->E′->F′->G′->J′;
B′->C′->D′->E′->P′; H′->I′->J′;
then, for each effective path, the jump instructions between basic blocks are removed, the basic blocks in the path are connected into an instruction sequence, and the instructions in all effective paths are normalized according to the normalization rules; for example, an instruction sequence segment of the effective path A->B changes before and after normalization as follows:
Figure BDA0002551006640000061
after the effective path sets are generated, effective paths connected end to end are merged and optimized; for example, the effective paths 'A->B' and 'B->C->D->E' can be merged into 'A->B->C->D->E';
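The pairwise merge in this example can be sketched as below. This is only an illustration of joining two paths that meet end to end; the name `merge_end_to_end` is hypothetical, and how the method chooses among several possible continuations of the same path is not specified in the source and is not modeled here.

```python
def merge_end_to_end(first, second):
    """Join two paths when the last block of the first path is the first
    block of the second path; return None if they do not connect."""
    if first and second and first[-1] == second[0]:
        return first + second[1:]
    return None


merged = merge_end_to_end(("A", "B"), ("B", "C", "D", "E"))
```

The shared block B appears once in the merged path, matching 'A->B->C->D->E' from the text.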
and finally, taking the boundary basic block and the effective path set as patch features.
Step S3: taking the vulnerability function as the object, a function similar to it is searched for in the target program OpenSSL 1.0.1e based on a binary function similarity detection technique, and the function dtls1_process_heartbeat() in OpenSSL 1.0.1e is obtained as the suspected target function, which reduces the search space; the binary function similarity detection technique is not specifically limited.
Step S4: in this step, vulnerability detection is performed on the suspected target function, with the following specific steps:
Step S401: first, the target function TF and the vulnerability function VF are differentially analyzed using a binary differential analysis method. FIG. 3 shows the analysis result of TF and VF: the function dtls1_process_heartbeat() in OpenSSL 1.0.1e is identical to that in OpenSSL 1.0.1f, so no changed basic block exists. Then, a basic block matching algorithm is used to match the boundary basic blocks BBB of the PF in the features within the target function TF, which matches basic blocks B, E, K, G, and J in FIG. 3(a). At this point, the CBBs and BBBs cannot form a local control flow graph, so no effective path is generated and T3 is an empty set; similarly, there is no changed basic block in VF, the CBBs and BBBs cannot form a local control flow graph, and T4 is an empty set;
then, the target function TF and the repaired function PF are differentially analyzed using a binary differential analysis method. FIG. 4 shows the analysis result of TF and PF: basic blocks A, C, D, F and basic blocks A′, C′, D′, E′, F′, G′, I′, L′, M′, N′ are changed basic blocks. Then, a basic block matching algorithm is used to match the boundary basic blocks BBB of the VF in the features within the target function TF, which matches basic blocks B, E, K, and G in FIG. 4(a). Local control flow graphs can be formed by connecting the CBBs and BBBs, and traversing them generates the effective path set T5 as follows:
A->B; B->C->D->E;
A->C->D->E; B->C->D->G;
A->C->D->G; B->C->K;
A->C->K; E->F->G;
similarly, the effective path set T6 is generated by connecting the CBBs in PF with the corresponding BBBs in PF, as follows:
A′->B′; A′->C′->L′->U′;
A′->C′->D′->M′->U′; A′->C′->D′->E′->F′->N′->U′;
A′->C′->D′->E′->F′->G′->H′; A′->C′->D′->E′->F′->G′->J′;
A′->C′->D′->E′->P′; B′->C′->L′->U′;
B′->C′->D′->M′->U′; B′->C′->D′->E′->F′->N′->U′;
B′->C′->D′->E′->F′->G′->H′; B′->C′->D′->E′->F′->G′->J′;
B′->C′->D′->E′->P′; H′->I′->J′;
after generating an effective path set, merging and optimizing the effective paths connected end to end;
finally, performing instruction connection on the effective paths in all the path sets, and performing standardization processing according to a standardization rule;
Step S402: in this example, T4 is empty and needs no reduction; for each effective path t in T6, T2 contains a path that shares the same CBB, so no irrelevant path is removed, and the effective path set T62 is formed.
Step S403: in this example, the conditions of Case 1 are met, so the determination follows Case 1: the similarity between paths and the similarity between path sets are calculated according to the formulas, finally giving Sim(T3, T2) < Sim(T5, T1); therefore, the target function TF is determined to still contain the vulnerability.
Step S5: since an unanalyzed related function tls1_process_heartbeat() still exists, the method returns to step S1 and performs vulnerability detection in the target program with tls1_process_heartbeat() as the object; when step S5 is reached again after tls1_process_heartbeat() has been analyzed, no unanalyzed related function remains, and the method proceeds to step S6.
Step S6: through the determinations of the above steps, functions highly similar to dtls1_process_heartbeat() and tls1_process_heartbeat() are found in the binary program of OpenSSL 1.0.1e, and both are determined to contain the vulnerability; it is therefore determined that the binary program OpenSSL 1.0.1e contains the vulnerability CVE-2014-0160, and the algorithm ends.
To summarize: compared with existing methods, the method of the invention, for a given input, constructs features based on function difference analysis and can determine whether the related functions in the target binary program have been patched, thereby detecting known vulnerabilities while reducing the false alarm rate and improving accuracy.

Claims (7)

1. A binary program vulnerability detection method based on function difference is characterized by comprising the following steps:
S1: constructing the feature extraction objects, namely the vulnerability function VF and the repaired function PF;
S2: generating patch features for patch identification by using a binary function differential analysis technique;
S3: taking the vulnerability function as the object, and screening out of the target program a function similar to the vulnerability function, based on a binary function similarity detection technique, to serve as the suspected target function TF;
S4: performing patch identification on the target function TF according to the similarity relation between the effective path sets generated for the target function TF and the patch features; if the patch can be identified in the target function, determining that the vulnerability does not exist in the function; otherwise, determining that the function still contains the vulnerability; this step comprises determining the local key region related to the patch in the target function TF by using a binary function differential analysis technique, generating and reducing effective paths, and performing patch identification in the target function TF;
S5: judging whether any vulnerability-related function remains to be analyzed: if the vulnerability affects only one function, the method proceeds to the next step; if the vulnerability affects multiple functions and a related function remains to be analyzed, returning to step S1 to continue iterating;
S6: judging whether the target binary program contains the vulnerability according to the determinations made for all functions in the target binary program that are related to the vulnerability to be searched: if the vulnerability affects only one function, the determination for that function is the determination for the vulnerability; if the vulnerability affects multiple functions, the program is considered to still contain the vulnerability if more than one function is determined to contain it, or if the number of functions determined to contain it is greater than or equal to the number of functions determined to be repaired; otherwise, the vulnerability is considered to be repaired.
2. The binary program vulnerability detection method based on function difference according to claim 1, wherein in step S1, the functions related to the vulnerability are determined according to the related information of the known vulnerability to be searched, the binary code of the last vulnerable version VF and of the first repaired version PF of each related function is collected and disassembled with a disassembly tool, and the control flow graphs represented by assembly code are constructed for VF and PF, respectively, as the feature extraction objects.
3. The method for binary program vulnerability detection based on function difference as claimed in claim 1, wherein in step S2, the binary difference analysis technique is used to perform difference analysis on the input vulnerability function and its repaired function, based on the obtained difference analysis result of the two functions, the boundary basic block BBB of all changed basic blocks CBB in the two functions is located, and patch features are constructed accordingly, wherein the changed basic blocks refer to basic blocks added, deleted or modified between functions; the boundary basic block refers to a neighbor node of the change basic block, but may be a change basic block itself.
4. The method according to claim 3, wherein in step S2, a plurality of local control flow graphs are formed in the function by connecting the changed basic blocks and the boundary basic blocks, and the set of all effective paths in the function is obtained by traversing these local control flow graphs, wherein an effective path VT is a continuous basic block sequence that starts and ends with a boundary basic block, contains at least one changed basic block, and contains no loop; if a loop exists, the loop is flattened; the effective path set generated from the vulnerability function is T1, the effective path set of the repaired function is T2, and the boundary basic blocks and the effective path sets are taken as the patch features.
5. The method according to claim 4, wherein in step S2, for each valid path, the jump instructions between basic blocks are first removed, the basic blocks in the path are connected into an instruction sequence, and the instructions in the valid path are then normalized, including:
1) address normalization: each specific address is replaced with "address";
2) memory normalization: each memory addressing operand is replaced with "mem";
3) register normalization: each specific register is replaced with "reg".
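The normalization rules above can be sketched with a few regular expressions. The sketch assumes x86 assembly in Intel syntax given as plain strings, treats every bracketed operand as a memory reference and every hexadecimal literal as an address (which over-approximates, since immediates are rewritten too), and recognizes only a common subset of register names. All names and patterns here are illustrative assumptions, not part of the patent.

```python
import re

MEM = re.compile(r"\[[^\]]*\]")          # e.g. [ebp+0x8]
ADDR = re.compile(r"\b0x[0-9a-fA-F]+\b")  # e.g. 0x401000
REG = re.compile(
    r"\b(?:[er]?(?:ax|bx|cx|dx|si|di|sp|bp)|r(?:8|9|1[0-5])[dwb]?|[abcd][lh])\b"
)


def normalize(instr):
    """Apply memory, address and register normalization to one instruction."""
    instr = MEM.sub("mem", instr)       # memory first, so inner registers vanish
    instr = ADDR.sub("address", instr)
    return REG.sub("reg", instr)


def path_to_sequence(path_blocks):
    """Drop each block's trailing jump, then concatenate normalized instructions."""
    sequence = []
    for block in path_blocks:
        body = list(block)
        # Assumption: x86 jump mnemonics (jmp, je, jne, ...) all start with "j".
        if body and body[-1].split()[0].startswith("j"):
            body.pop()
        sequence.extend(normalize(i) for i in body)
    return sequence


blocks = [
    ["mov eax, [ebp+0x8]", "cmp eax, ebx", "jne 0x401020"],
    ["call 0x401000"],
]
print(path_to_sequence(blocks))
# ['mov reg, mem', 'cmp reg, reg', 'call address']
```

Normalizing memory operands before registers keeps the registers inside an addressing expression from being rewritten separately.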
6. The binary program vulnerability detection method based on function difference according to claim 1, wherein in step S4, it is determined whether the target function has been patched; if the target function has been patched, the corresponding vulnerability in the target function is determined to have been fixed; otherwise, the vulnerability is determined to still exist in the target function; and step S4 specifically includes:
S401: generating the valid paths of the target function: first, the target function TF is differentially analyzed against the vulnerability function VF and against the repaired function PF, respectively, using a binary differential analysis method; for the differential analysis result of TF and VF, a basic block matching algorithm is used to match, in the target function TF, the boundary basic blocks BBB of PF in the patch features; one or more local control flow graphs are constructed by connecting the CBBs in TF with the matched boundary basic blocks BBB in TF that are adjacent to those CBBs; these local control flow graphs are considered to embody the patch or the vulnerability behavior and are called local key regions; a valid path set T3 is generated by traversing these local control flow graphs; similarly, a valid path set T4 is generated by connecting the CBBs in VF with the corresponding BBBs in VF; for the differential analysis result of TF and PF, a basic block matching algorithm is used to match, in the target function TF, the boundary basic blocks BBB of VF in the patch features; one or more local control flow graphs are constructed by connecting the CBBs in TF with the matched boundary basic blocks BBB in TF that are adjacent to those CBBs, and a valid path set T5 is generated by traversing these local control flow graphs; similarly, a valid path set T6 is generated by connecting the CBBs in PF with the corresponding BBBs in PF; after the valid path sets are generated, valid paths that connect end to end are merged; then, for each valid path, the jump instructions between basic blocks are first removed, the basic blocks in the path are connected into an instruction sequence, and the instructions in the valid path are normalized, including:
1) address normalization: each specific address is replaced with "address";
2) memory normalization: each memory addressing operand is replaced with "mem";
3) register normalization: each specific register is replaced with "reg".
S402: reducing the target function paths: for each valid path t in T4, if T1 contains no path that shares a common CBB with t, the valid path t is removed from T4, forming the reduced valid path set T41; similarly, T6 is compared with T2, and the irrelevant paths are removed from T6 to form the reduced valid path set T62.
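The reduction in S402 amounts to a filter over path sets. In the minimal sketch below, each path carries the set of changed basic blocks it passes through, and block identifiers are assumed to be comparable across the two sets (for example, both T4 and T1 refer to blocks of VF). The `Path` type and `reduce_paths` function are hypothetical names introduced for illustration.

```python
from typing import FrozenSet, List, NamedTuple, Tuple


class Path(NamedTuple):
    blocks: Tuple[str, ...]   # basic block ids along the path
    changed: FrozenSet[str]   # the changed basic blocks (CBBs) on the path


def reduce_paths(candidates: List[Path], reference: List[Path]) -> List[Path]:
    """Keep only candidates that share at least one CBB with a reference path."""
    return [
        t for t in candidates
        if any(t.changed & r.changed for r in reference)
    ]


T1 = [Path(("b1", "c2", "b3"), frozenset({"c2"}))]
T4 = [
    Path(("b1", "c2", "b3"), frozenset({"c2"})),  # overlaps T1 on c2: kept
    Path(("b7", "c8", "b9"), frozenset({"c8"})),  # unrelated change: dropped
]
T41 = reduce_paths(T4, T1)
print([p.blocks for p in T41])  # [('b1', 'c2', 'b3')]
```

The same call with T6 and T2 would produce T62. When the reference set is empty, every candidate is removed.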
S403: decision algorithm: the relationship among the functions is inferred from the similarity of the differences between them, so as to determine whether the target function has been patched and thereby complete the vulnerability detection, wherein Sim(T, T') denotes the similarity between path sets, and the following three cases are distinguished:
case 1: neither T1 nor T2 is empty: if the target function has been repaired, the difference T3 between TF and VF should be more pronounced than the difference T5 between TF and PF, and T3 should be more similar to T2 in the patch features; if the target function still contains the vulnerability, the difference T5 between TF and PF should be more pronounced than the difference T3 between TF and VF, and T5 should be more similar to T1 in the patch features; therefore, in this case, if Sim(T3, T2) > Sim(T5, T1), the target function is considered to be repaired; otherwise, it is considered to still contain the vulnerability;
case 2: T1 is empty and T2 is not empty: T1 being empty means that the patch adds some new code; if the target function has been patched, T3 will be more similar to T2, and T62 should be empty; if the target function still contains the vulnerability, T62 and T2 should be more similar, and T3 should be empty; therefore, in this case, if Sim(T2, T3) > Sim(T2, T62), the target function is considered to be repaired; otherwise, it is considered to still contain the vulnerability;
case 3: T2 is empty and T1 is not empty: T2 being empty means that the patch deletes some code; similarly to case 2, if Sim(T1, T41) > Sim(T1, T5), the target function is considered to be repaired; otherwise, it is considered to still contain the vulnerability.
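The three cases can be condensed into a single decision function. The sketch below takes the similarity measure as a parameter and uses a toy Jaccard-style overlap purely for demonstration; `is_patched` and `toy_sim` are hypothetical names. The claim does not say what happens when both T1 and T2 are empty, so raising an error in that situation is an assumption.

```python
def is_patched(T1, T2, T3, T5, T41, T62, sim):
    """Return True if the target function is judged to be repaired."""
    if T1 and T2:           # case 1: the patch modifies existing code
        return sim(T3, T2) > sim(T5, T1)
    if not T1 and T2:       # case 2: the patch only adds code
        return sim(T2, T3) > sim(T2, T62)
    if T1 and not T2:       # case 3: the patch only deletes code
        return sim(T1, T41) > sim(T1, T5)
    # Assumption: with no patch feature at all, no decision can be made.
    raise ValueError("T1 and T2 are both empty")


def toy_sim(a, b):
    """Stand-in similarity (set overlap); not the measure defined in claim 7."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


# Case 1: TF differs from VF exactly as PF does, and matches PF otherwise.
print(is_patched(T1=["v"], T2=["p"], T3=["p"], T5=[], T41=[], T62=[], sim=toy_sim))
# True
```

Passing the similarity measure as an argument keeps the decision logic independent of how path sets are compared.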
7. The method according to claim 6, wherein in step S403, to calculate the similarity score between each pair of path sets, the similarity between individual paths is first calculated by comparing the instruction sequences of the two paths, according to the following formula:
Figure FDA0002551006630000031
wherein t1 and t2 are the two paths for which the similarity score is to be calculated, edit(t1, t2) is the edit distance between the two paths, and len(t1) and len(t2) are the lengths of paths t1 and t2, respectively;
after the similarity of each pair of paths is calculated, the final similarity between the two sets is calculated according to the following formula:
Figure FDA0002551006630000032
wherein T1 and T2 are the two path sets for which the similarity score is to be calculated, t1 and t2 are paths from the sets T1 and T2, respectively, Sim(t1, t2) is the similarity between the two paths, |T1| and |T2| are the numbers of paths in the sets T1 and T2, and len(T1) and len(T2) are the total lengths of the paths in the sets T1 and T2, respectively.
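Both formulas in this claim are published as images and are not reproduced in the text, so the following is only a sketch under explicit assumptions. For two paths it assumes the common normalization Sim(t1, t2) = 1 - edit(t1, t2) / max(len(t1), len(t2)); for two sets it substitutes a length-weighted best-match average, which is a stand-in and may differ from the patented formula. The names `edit_distance`, `path_sim` and `set_sim` are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (instruction lists or strings)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # substitute or match
        prev = cur
    return prev[-1]


def path_sim(t1, t2):
    """Assumed form: 1 - edit(t1, t2) / max(len(t1), len(t2))."""
    longest = max(len(t1), len(t2))
    return 1.0 if longest == 0 else 1.0 - edit_distance(t1, t2) / longest


def set_sim(T1, T2):
    """Stand-in set similarity: length-weighted best match, averaged both ways."""
    if not T1 or not T2:
        return 0.0  # assumption: an empty set is similar to nothing

    def directed(A, B):
        total = sum(len(t) for t in A)
        if total == 0:
            return 0.0
        return sum(len(t) * max(path_sim(t, u) for u in B) for t in A) / total

    return (directed(T1, T2) + directed(T2, T1)) / 2


p = ["mov reg, mem", "cmp reg, reg", "call address"]
q = ["mov reg, mem", "test reg, reg", "call address"]
print(edit_distance(p, q))  # 1
print(set_sim([p], [p]))    # 1.0
```

Defining the similarity of an empty set as zero lets the comparisons in cases 2 and 3 of claim 6 be evaluated even when one of the path sets is empty.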
CN202010574987.3A 2020-06-22 2020-06-22 Binary program vulnerability detection method based on function difference Active CN111914260B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010574987.3A CN111914260B (en) 2020-06-22 2020-06-22 Binary program vulnerability detection method based on function difference

Publications (2)

Publication Number Publication Date
CN111914260A true CN111914260A (en) 2020-11-10
CN111914260B CN111914260B (en) 2023-03-31

Family

ID=73226949

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010574987.3A Active CN111914260B (en) 2020-06-22 2020-06-22 Binary program vulnerability detection method based on function difference

Country Status (1)

Country Link
CN (1) CN111914260B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113626820A (en) * 2021-06-25 2021-11-09 中国科学院信息工程研究所 Known vulnerability positioning method and device for network equipment
CN114021146A (en) * 2021-11-15 2022-02-08 杭州戎戍网络安全技术有限公司 Unstructured difference patch analysis method based on value set analysis
CN114065227A (en) * 2022-01-18 2022-02-18 思探明信息科技(南京)有限公司 Vulnerability positioning analysis system
CN115510451A (en) * 2022-09-20 2022-12-23 中国人民解放军国防科技大学 Method and system for judging existence of firmware patch based on random walk

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101154257A (en) * 2007-08-14 2008-04-02 电子科技大学 Dynamic mend performing method based on characteristics of loopholes
CN103198260A (en) * 2013-03-28 2013-07-10 中国科学院信息工程研究所 Automation positioning method for binary system program vulnerabilities
US20160188885A1 (en) * 2014-12-26 2016-06-30 Korea University Research And Business Foundation Software vulnerability analysis method and device
CN107229563A (en) * 2016-03-25 2017-10-03 中国科学院信息工程研究所 A kind of binary program leak function correlating method across framework
CN109241737A (en) * 2018-07-03 2019-01-18 中国科学院信息工程研究所 A kind of difference linear-elsatic buckling method and system towards a variety of patch modes
CN109359468A (en) * 2018-08-23 2019-02-19 阿里巴巴集团控股有限公司 Leak detection method, device and equipment
US20200134172A1 (en) * 2018-10-31 2020-04-30 Korea Internet & Security Agency Method and apparatus for patching binary having vulnerability

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Danjun Liu et al.: "VMPBL: Identifying Vulnerable Functions Based on Machine Learning Combining Patched Information and Binary Comparison Technique by LCS", 《2018 17TH IEEE INTERNATIONAL CONFERENCE ON TRUST, SECURITY AND PRIVACY IN COMPUTING AND COMMUNICATIONS/ 12TH IEEE INTERNATIONAL CONFERENCE ON BIG DATA SCIENCE AND ENGINEERING (TRUSTCOM/BIGDATASE)》 *
李登 et al.: "Firmware vulnerability detection for embedded devices based on homology analysis", 《计算机工程》 *
李赞 et al.: "A method for discovering unknown vulnerabilities using patches", 《软件学报》 *
达小文 et al.: "Research on a vulnerability locating technique based on patch comparison and static taint analysis", 《信息网络安全》 *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113626820A (en) * 2021-06-25 2021-11-09 中国科学院信息工程研究所 Known vulnerability positioning method and device for network equipment
CN113626820B (en) * 2021-06-25 2023-06-27 中国科学院信息工程研究所 Known vulnerability positioning method and device for network equipment
CN114021146A (en) * 2021-11-15 2022-02-08 杭州戎戍网络安全技术有限公司 Unstructured difference patch analysis method based on value set analysis
CN114021146B (en) * 2021-11-15 2022-07-05 杭州戎戍网络安全技术有限公司 Unstructured difference patch analysis method based on value set analysis
CN114065227A (en) * 2022-01-18 2022-02-18 思探明信息科技(南京)有限公司 Vulnerability positioning analysis system
CN115510451A (en) * 2022-09-20 2022-12-23 中国人民解放军国防科技大学 Method and system for judging existence of firmware patch based on random walk
CN115510451B (en) * 2022-09-20 2023-09-19 中国人民解放军国防科技大学 Random walk-based firmware patch existence judging method and system

Also Published As

Publication number Publication date
CN111914260B (en) 2023-03-31

Similar Documents

Publication Publication Date Title
CN111914260B (en) Binary program vulnerability detection method based on function difference
US10558805B2 (en) Method for detecting malware within a linux platform
CN109308415B (en) Binary-oriented guidance quality fuzzy test method and system
WO2017181286A1 (en) Method for determining defects and vulnerabilities in software code
Alrubaye et al. On the use of information retrieval to automate the detection of third-party java library migration at the method level
CN109670318B (en) Vulnerability detection method based on cyclic verification of nuclear control flow graph
CN112214399B (en) API misuse defect detection system based on sequence pattern matching
CN113326187A (en) Data-driven intelligent detection method and system for memory leakage
CN115129591A (en) Binary code-oriented reproduction vulnerability detection method and system
CN110851830B (en) CPU (Central processing Unit) -oriented undisclosed instruction discovery method based on instruction format identification
CN116578980A (en) Code analysis method and device based on neural network and electronic equipment
Liu et al. Vmpbl: Identifying vulnerable functions based on machine learning combining patched information and binary comparison technique by lcs
CN112115053A (en) API misuse defect detection method based on sequence pattern matching
Zhao et al. Fault centrality: boosting spectrum-based fault localization via local influence calculation
CN112905370A (en) Topological graph generation method, anomaly detection method, device, equipment and storage medium
CN108804308B (en) Defect detection method and device for new version program
CN114065227B (en) Vulnerability positioning analysis system
CN115408700A (en) Open source component detection method based on binary program modularization
CN115168855A (en) Patch existence detection method based on key basic block
CN109002716A (en) A kind of malicious code intrusion detection of mobile application and prevention method
Alrabaee et al. Compiler provenance attribution
JP6911928B2 (en) Hypothesis verification device, hypothesis verification method, and program
CN111124922A (en) Rule-based automatic program repair method, storage medium, and computing device
KR101822062B1 (en) Method for recommending component of software based on identifying component co-usability, and recording medium thereof
CN112199684A (en) Java patch existence detection method based on cross-language code association

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant