CN113392016A - Protocol generation method, device, equipment and medium for processing program abnormal condition - Google Patents

Info

Publication number
CN113392016A
Authority
CN
China
Prior art keywords
code segment
condition
detection condition
detection
handling
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110713929.9A
Other languages
Chinese (zh)
Inventor
范皓
邢哲源
李池
周旻
赵曦斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Bond Jinke Information Technology Co ltd
Tsinghua University
Original Assignee
China Bond Jinke Information Technology Co ltd
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Bond Jinke Information Technology Co ltd, Tsinghua University
Priority to CN202110713929.9A
Publication of CN113392016A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Preventing errors by testing or debugging software
    • G06F11/3604 Software analysis for verifying properties of programs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Preventing errors by testing or debugging software
    • G06F11/3668 Software testing
    • G06F11/3672 Test management
    • G06F11/3688 Test management for test execution, e.g. scheduling of test suites
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/57 Certifying or maintaining trusted computer platforms, e.g. secure boots or power-downs, version controls, system software checks, secure updates or assessing vulnerabilities
    • G06F21/577 Assessing vulnerabilities and evaluating computer system security
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2221/00 Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/03 Indexing scheme relating to G06F21/50, monitoring users, programs or devices to maintain the integrity of platforms
    • G06F2221/033 Test or assess software

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Quality & Reliability (AREA)
  • Computing Systems (AREA)
  • Stored Programmes (AREA)

Abstract

The embodiment of the invention discloses a method, a device, equipment and a medium for generating a protocol for processing program abnormal conditions. The method comprises the following steps: determining a called function which conforms to a preset exception handling code structure as a candidate code segment for handling exception conditions; among the candidate code segments, if a first candidate code segment containing an error semantic feature is detected, extracting a first detection condition corresponding to each function from the first candidate code segment, and if a second candidate code segment not containing the error semantic feature is detected, extracting a second detection condition corresponding to each function from the second candidate code segment; and, for any called function, screening out a target detection condition which conforms to a preset voting strategy from the first detection conditions and/or the second detection conditions, and taking the target detection condition as an exception handling protocol for handling the exception condition of the program. This technical scheme improves the accuracy and the detection efficiency of inferring interface exception handling protocols.

Description

Protocol generation method, device, equipment and medium for processing program abnormal condition
Technical Field
The embodiment of the invention relates to the technical field of computer program vulnerability detection, in particular to a protocol generation method, a device, equipment and a medium for processing program abnormal conditions.
Background
With the rapid development of software engineering, the application programming interface (API) is widely used by modern software systems as an important medium for communication between and within programs. At the same time, interface misuse (API misuse) occurs frequently; for example, 40% of the 25 most dangerous software errors published by CWE in 2011 are related to interface misuse. One common type of misuse concerns the error handling of an interface, which refers to how an exception condition is handled after an interface fails to execute so that the program flow continues to execute normally. Irregular interface error handling was listed by the OWASP project in 2017 as one of the 10 most dangerous types of vulnerabilities affecting system security; it often causes software systems to crash and can even lead to catastrophic results that threaten people's property and lives. In the case of the Ariane 5 rocket, a buffer overflow defect in the code directly caused the computer system to crash, which finally destroyed the rocket and the satellites it carried and caused a loss of more than 370 million dollars.
To reduce the catastrophic consequences of irregular error handling on a software system and to detect irregular error handling effectively, an automatic method and system for generating interface protocols for exception handling is needed. However, automatically generating error handling protocols for the C language is difficult. On the one hand, the code segments in which exception handling may occur must first be located. Because C lacks a dedicated exception handling mechanism, locating and verifying exception handling code segments in a system is a time-consuming and complex task; only an analysis with high accuracy and a low false alarm rate can extract the existing detection patterns effectively and lay a solid foundation for the subsequent protocol inference task. On the other hand, compared with the code executed on the normal path, exception handling code lies on the error path of the program and carries special program semantics, so the latent semantic features of exception handling code segments must be identified by studying real projects before those segments can be screened.
To address these challenges, researchers have proposed methods for generating protocols for exception handling. In addition to summarizing protocols manually by reading official documents, developers also infer protocols by summarizing the usage patterns of exception handling code. For example, Acharya et al. consider that the occurrence of the exit function indicates that the normal execution flow of the program has been disrupted, and use the exit marker to identify exception handling code, from which the detection conditions of the control flow are extracted for protocol inference. Kang et al. consider that an error path has fewer branches and statements than a normal path, locate exception handling code through these characteristics, and extract the detection conditions for subsequent protocol inference.
These methods help with protocol inference for interface exception handling, but because their analysis strategies are not comprehensive enough, the features they rely on cannot cover exception handling code segments at large scale. As a result, the accuracy and recall of protocol inference are low and the methods adapt poorly to large programs. Because of these disadvantages, the existing analysis mechanisms cannot be applied efficiently in complex industrial environments.
Therefore, conventional methods for inferring interface exception handling protocols typically suffer from a high false alarm rate, a high missed report rate and low detection efficiency.
Disclosure of Invention
The embodiment of the invention provides a method, a device, equipment and a medium for generating a protocol for processing program abnormal conditions, and improves the accuracy and the detection efficiency of interface abnormal processing protocols.
In a first aspect, an embodiment of the present invention provides a method for generating a specification for processing an abnormal condition of a program, where the method is applied to a process of processing an abnormal condition after an interface executes an error, and includes:
for a called function and a caller thereof with a mapping relation in a source code, determining the called function which accords with a preset exception handling code structure as a candidate code segment for handling exception conditions;
in the candidate code segments, a code segment containing the error semantic features is taken as a first candidate code segment, and a code segment not containing the error semantic features is taken as a second candidate code segment;
if the first candidate code segment is detected, extracting a first detection condition corresponding to each function from the first candidate code segment, and if the second candidate code segment is detected, extracting a second detection condition corresponding to each function from the second candidate code segment; the first detection condition and the second detection condition are both used in the corresponding candidate code segment for: controlling whether the code for processing the abnormal condition is executed;
for any called function, screening out target detection conditions which accord with a preset voting strategy from the first detection conditions and/or the second detection conditions; and taking the target detection condition as an exception handling protocol for handling the exception condition of the program.
Further, determining the called function conforming to the preset exception handling code structure as a candidate code segment for handling the exception condition, including:
searching whether the called function conforms to a preset exception handling code structure in a control flow automata CFA corresponding to the caller based on a mapping relation between the called function and the caller;
if so, determining the called function which accords with the preset exception handling code structure as a candidate code segment for handling the exception condition.
Further, the method further comprises:
compiling and capturing a source code to obtain an intermediate structure, and converting the intermediate structure into an intermediate language LLVM IR;
modeling the LLVM IR as a control flow automaton CFA and a program call graph CG;
and acquiring the mapping relation between each called function and the caller thereof through the CG.
Further, screening out target detection conditions meeting a preset voting strategy from the first detection conditions and the second detection conditions, wherein the target detection conditions comprise:
voting the second detection condition, and taking the detection condition with the minimum support degree greater than or equal to the first set threshold and the minimum confidence degree greater than or equal to the second set threshold as a candidate detection condition;
merging the candidate detection condition and the first detection condition based on the same function name;
and extracting the detection condition with the largest frequency of occurrence from the combined detection conditions as a target detection condition.
Further, screening out target detection conditions meeting a preset voting strategy from the first detection conditions, wherein the target detection conditions comprise:
and extracting the detection condition with the largest frequency of occurrence from the first detection conditions as a target detection condition.
Further, screening out target detection conditions meeting a preset voting strategy from the second detection conditions, wherein the target detection conditions comprise:
voting is carried out on the second detection conditions, the detection conditions with the minimum support degree being greater than or equal to the first set threshold and the minimum confidence degree being greater than or equal to the second set threshold are used as candidate detection conditions, and the detection conditions with the largest frequency of occurrence are extracted from the candidate detection conditions and used as target detection conditions.
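The voting variants above can be illustrated with a minimal Python sketch. It is not the patented implementation: the definitions of support (the raw occurrence count of a function/condition pair) and confidence (that count divided by all conditions seen for the same function), the default thresholds, and the names vote, min_support and min_confidence are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

def vote(first_conds, second_conds, min_support=2, min_confidence=0.5):
    """Pick one target detection condition per called function.

    first_conds / second_conds are lists of (function_name, condition)
    pairs taken from first and second candidate code segments.
    Thresholds and metric definitions are assumptions, not from the source.
    """
    # Round 1: keep only second conditions that pass both thresholds.
    pair_counts = Counter(second_conds)
    func_counts = Counter(func for func, _ in second_conds)
    candidates = []
    for (func, cond), n in pair_counts.items():
        support = n                         # assumed: occurrence count
        confidence = n / func_counts[func]  # assumed: share for this function
        if support >= min_support and confidence >= min_confidence:
            candidates.extend([(func, cond)] * n)

    # Merge candidates with the first conditions, grouped by function name.
    merged = defaultdict(Counter)
    for func, cond in list(first_conds) + candidates:
        merged[func][cond] += 1

    # Round 2: the most frequent condition per function is the target.
    return {func: counts.most_common(1)[0][0] for func, counts in merged.items()}

first = [("SSL_new", "== NULL"), ("SSL_new", "== NULL"), ("SSL_connect", "<= 0")]
second = [("SSL_connect", "< 0"), ("SSL_connect", "<= 0"),
          ("SSL_connect", "<= 0"), ("SSL_connect", "<= 0"),
          ("malloc", "== NULL")]
targets = vote(first, second)
```

Passing an empty list for either argument covers the two single-source variants: with no second conditions the result is simply the most frequent first condition, and with no first conditions only the filtered candidates are counted.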
In a second aspect, an embodiment of the present invention further provides a protocol generation device for processing an abnormal condition of a program, where the device includes:
the candidate code segment determining module is configured to determine a called function which accords with a preset exception handling code structure as a candidate code segment for handling exception conditions for a called function and a caller of the called function which have a mapping relation in a source code;
a candidate code segment classification module configured to, among the candidate code segments, take a code segment containing an error semantic feature as a first candidate code segment and take a code segment not containing the error semantic feature as a second candidate code segment;
the detection condition extraction module is configured to extract a first detection condition corresponding to each function from the first candidate code segment if the first candidate code segment is detected, and extract a second detection condition corresponding to each function from the second candidate code segment if the second candidate code segment is detected; the first detection condition and the second detection condition are both used in the corresponding candidate code segment for: controlling whether the code for processing the abnormal condition is executed;
the target detection condition determining module is configured to screen out a target detection condition which accords with a preset voting strategy from the first detection condition and/or the second detection condition for any one called function;
and the exception handling protocol generation module is configured to take the target detection condition as an exception handling protocol for handling the exception condition of the program.
Further, the candidate code segment determining module is specifically configured to:
searching, in the control flow automaton (CFA) corresponding to the caller and based on the mapping relation between the called function and the caller, whether the called function conforms to a preset exception handling code structure;
if so, determining the called function which accords with the preset exception handling code structure as a candidate code segment for handling the exception condition.
Further, the apparatus further comprises:
the middle structure conversion module is configured to compile and capture a source code to obtain a middle structure, and convert the middle structure into an intermediate language LLVM IR;
a modeling module configured to model the LLVM IR as a control flow automaton CFA and a program call graph CG;
and the mapping relation acquisition module is configured to acquire the mapping relation between each called function and the caller thereof through the CG.
Further, the target detection condition determining module is specifically configured to:
voting the second detection condition, and taking the detection condition with the minimum support degree greater than or equal to the first set threshold and the minimum confidence degree greater than or equal to the second set threshold as a candidate detection condition;
merging the candidate detection condition and the first detection condition based on the same function name;
and extracting the detection condition with the largest frequency of occurrence from the combined detection conditions as a target detection condition.
Further, the target detection condition determining module is specifically configured to:
and extracting the detection condition with the largest frequency of occurrence from the first detection conditions as a target detection condition.
Further, the target detection condition determining module is specifically configured to:
voting is carried out on the second detection conditions, the detection conditions with the minimum support degree being greater than or equal to the first set threshold and the minimum confidence degree being greater than or equal to the second set threshold are used as candidate detection conditions, and the detection conditions with the largest frequency of occurrence are extracted from the candidate detection conditions and used as target detection conditions.
In a third aspect, an embodiment of the present invention further provides a computing device, including:
a memory storing executable program code;
a processor coupled with the memory;
the processor calls the executable program code stored in the memory to execute the protocol generation method for processing the program abnormal condition provided by any embodiment of the invention.
In a fourth aspect, an embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the protocol generation method for processing the abnormal condition of the program according to any embodiment of the present invention.
The technical scheme provided by the embodiment of the invention overcomes the limitations of existing methods. It provides an efficient protocol generation technique and also helps detect interface misuse in C programs. By identifying effective features in exception handling code and combining the two structures CFA and CG, it makes full use of the voting strategy to infer protocols for interface defects automatically, and the automatically generated protocols in turn assist interface misuse detection, which improves the accuracy and recall of exception handling protocol inference in real projects.
The innovation points of the embodiment of the invention comprise:
1. By modeling the LLVM IR as a control flow automaton CFA and a program call graph CG, the mapping relation between each called function and its caller can be obtained from the CG, and whether the called function conforms to the exception handling code structure can be checked in the CFA corresponding to the caller, which improves the identification rate of the exception handling code that handles program exceptions.
2. By identifying effective features in the exception handling code, combining the two structures CFA and CG, and making full use of the two-round voting strategy, the protocol for interface defects is inferred automatically, and the automatically generated protocol in turn assists interface misuse detection. This overcomes the limitations of existing methods and infers exception handling protocols with higher accuracy and recall.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
Fig. 1 is an overall work flow diagram of a protocol generation method for processing a program exception according to an embodiment of the present invention;
Fig. 2a is a schematic flowchart of a protocol generation method for processing a program exception according to an embodiment of the present invention;
Fig. 2b is a screenshot of a code segment for handling an exception according to an embodiment of the present invention;
Fig. 3 is a flowchart of a CFA process according to an embodiment of the present invention;
Fig. 4 is a block diagram illustrating a protocol generation apparatus for processing an exception condition of a program according to a second embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a computing device according to a third embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without inventive effort based on the embodiments of the present invention, are within the scope of the present invention.
It is to be noted that the terms "comprises" and "comprising" and any variations thereof in the embodiments and drawings of the present invention are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.
To explain the content of each embodiment of the present invention clearly, the implementation principle of the technical solution provided by the embodiments is first briefly introduced as follows:
The protocol generation method for processing program exception conditions provided by the embodiment of the present invention mainly includes the following three parts (see Fig. 1).
1. CFA and CG-oriented context information acquisition
The main function of this part is to construct a control flow automaton (CFA) and a program call graph (CG) from the LLVM IR, and to identify interface exception handling code through a common error handling code structure, which reduces the complexity of the identification process.
2. Error handling code definition based on error handling characteristics
The common characteristics of error handling code are found by investigating existing error handling code, and these characteristics are then used to screen the potential error handling code obtained in the previous part. A code segment with an error semantic feature is defined as a P-EHB (basic Error Handling Block); a code segment without error semantics is defined as an M-EHB (Maybe Error Handling Block).
3. Error convention mining based on voting strategy
Finally, error handling protocols are inferred by voting on the detection conditions from the P-EHBs and the M-EHBs respectively.
The technical scheme of the embodiment of the invention integrates a static analysis method: it uses a control flow automaton (CFA) and a call graph (CG) to model the code structure of a program efficiently, performs multi-entry analysis on that structure, locates the features of the exception handling module, and improves the efficiency of automatic protocol mining. By classifying the detection conditions according to the features they belong to, the voting strategy guides the mining of protocols for the exception handling module and helps the interface misuse detector find more vulnerabilities.
The following describes each of the above-described parts in detail.
Example one
Fig. 2a is a schematic flowchart of a protocol generation method for processing an abnormal condition of a program according to an embodiment of the present invention, where the method is applicable to a process of processing an abnormal condition after an interface execution error, and can be executed by a protocol generation device for processing an abnormal condition of a program, where the protocol generation device can be implemented by software and/or hardware. As shown in fig. 2a, the method provided in this embodiment specifically includes:
110. For a called function and its caller that have a mapping relation in the source code, determine the called function that conforms to the preset exception handling code structure as a candidate code segment for handling the exception condition.
Here, the abnormal condition mainly refers to the following. (1) An interface error is not handled. For example, the SSL_new function in the OpenSSL library allocates memory, and after the function is called its return value should be checked to determine whether it executed successfully. However, some developers do not perform this error check when using the library function SSL_new, e.g. line 5 of the code segment shown in Fig. 2b, which results in a security problem that the developer subsequently fixed, e.g. lines 6-10 of the code segment shown in Fig. 2b. (2) An interface error is handled incorrectly. For example, the SSL_connect function in line 13 of the code segment shown in Fig. 2b returns a value less than or equal to 0 on error, but the developer initially checked only for values less than 0, so the error condition equal to 0 was ignored. Incorrect handling of an interface error is not limited to a wrong check condition; the error handling behavior itself (the code segment that performs the error handling, for example lines 15 and 16 of the code segment shown in Fig. 2b) may also be faulty.
Illustratively, the called function with the mapping relationship and the caller thereof can be obtained through a program Call Graph (CG). The program call graph can be obtained by the following steps:
compiling the source code with compile capture to obtain .i intermediate files, and converting the .i intermediate files into the intermediate language LLVM IR; the LLVM IR is then modeled as a control flow automaton (CFA) and a program call graph (CG).
The CFA represents the program to be analyzed; this structure makes it convenient to analyze program statements one by one and in order, with each analysis step building on the previous one. As shown in Fig. 3, a CFA is a quadruple CFA = (V, E, Entry, Exit), where N0-N6 are the nodes of the CFA; V is the set of nodes, representing program locations; $E \subseteq V \times Ops \times V$ is the set of edges, representing the execution of the program from one location to another, where Ops represents the operations performed along an edge; Entry represents the program start node; and Exit represents the program stop node.
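As an illustration, the quadruple can be held in a small data structure. The following Python sketch is hypothetical: the class names, the node labels and the example edges are assumptions chosen for illustration and do not reproduce Fig. 3.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Edge:
    src: str  # source location, an element of V
    op: str   # operation from Ops performed along the edge
    dst: str  # destination location, an element of V

@dataclass
class CFA:
    nodes: frozenset  # V
    edges: frozenset  # E, a subset of V x Ops x V
    entry: str        # Entry
    exit: str         # Exit

    def successors(self, node):
        """Return the (op, dst) pairs one step from node, sorted for stable output."""
        return sorted((e.op, e.dst) for e in self.edges if e.src == node)

# Hypothetical caller: call SSL_connect, test the return value, handle the error.
edges = frozenset({
    Edge("N0", "ret = SSL_connect(ssl)", "N1"),
    Edge("N1", "[ret <= 0]", "N2"),
    Edge("N1", "[!(ret <= 0)]", "N3"),
    Edge("N2", "goto err", "N3"),
})
cfa = CFA(frozenset({"N0", "N1", "N2", "N3"}), edges, entry="N0", exit="N3")
```

Walking successors from the entry node visits the statements in order, which is the statement-by-statement traversal the paragraph above describes; the two edges leaving N1 carry the detection condition and its negation.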
Illustratively, the called function conforming to the preset exception handling code structure is determined as a candidate code segment for handling the exception condition, and the method can be realized by the following steps:
searching whether the called function conforms to a preset exception handling code structure in a control flow automaton corresponding to a caller based on a mapping relation between the called function and the caller; if so, determining the called function which accords with the preset exception handling code structure as a candidate code segment for handling the exception condition.
The exception handling code structure is preset as { function call, return value detection, handling behavior }. The code which is used for processing the potential abnormal condition can be searched through the preset abnormal processing code structure. Specifically, call dependencies on each function are collected through the CG (func: caller1, caller2, …). And then checking whether the target function func meets a preset exception handling code structure in the caller, namely searching code segments which accord with the preset exception handling code structure in the CFA, and taking the code segments as potential candidate code segments for handling the exception condition for subsequent analysis.
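A simplified Python sketch of these two steps follows. It assumes the call graph is available as a mapping from caller to callees and that one path through a caller's CFA has already been reduced to a list of (kind, detail) operations; the function names callers_by_function and find_candidates and the operation kinds "call", "check", "handle" and "other" are hypothetical.

```python
from collections import defaultdict

def callers_by_function(call_graph):
    """Invert a caller -> callees call graph into func -> [caller1, caller2, ...]."""
    deps = defaultdict(list)
    for caller, callees in call_graph.items():
        for callee in callees:
            if caller not in deps[callee]:
                deps[callee].append(caller)
    return dict(deps)

def find_candidates(func, path):
    """Find {function call, return value detection, handling behavior} triples.

    path is one caller's CFA path, simplified to (kind, detail) tuples.
    Returns (func, condition) for every call to func that is directly
    followed by a check and then by handling code.
    """
    found = []
    for i in range(len(path) - 2):
        (k1, d1), (k2, d2), (k3, _) = path[i:i + 3]
        if k1 == "call" and d1 == func and k2 == "check" and k3 == "handle":
            found.append((func, d2))
    return found

call_graph = {"main": ["SSL_new", "SSL_connect"], "client_init": ["SSL_new"]}
deps = callers_by_function(call_graph)

main_path = [
    ("call", "SSL_new"), ("check", "== NULL"), ("handle", "goto err"),
    ("call", "SSL_connect"), ("check", "<= 0"), ("handle", "return -1"),
    ("other", "SSL_write(ssl, buf, len)"),
]
candidates = find_candidates("SSL_connect", main_path)
```

A real analysis would match the pattern on CFA edges and follow data flow from the return value to the check; the adjacency test here is only meant to make the three-part structure concrete.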
120. And in the candidate code segments, the code segment containing the error semantic features is taken as a first candidate code segment, and the code segment not containing the error semantic features is taken as a second candidate code segment.
In this embodiment, a code segment containing error semantic features is a code segment with obvious error semantic information, for example, F1: a function used for error handling; F2: a log statement that records an error; F3: a goto statement used for error handling. In this embodiment, the first candidate code segment, which contains error semantic features, may be marked as most likely performing exception handling (P-EHB).
In this embodiment, a code segment that does not contain error semantic features is a code segment without obvious error semantics, for example, a code segment that merely uses the keyword return. In this embodiment, the second candidate code segment, which does not contain error semantic features, may be marked as possibly performing exception handling (M-EHB).
It should be noted that "first" and "second" in this embodiment are only used to distinguish different code segments and different detection conditions, and do not have any limiting effect.
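The classification in step 120 can be sketched in Python as a keyword check over the text of the handling code. The regular expressions below are assumptions standing in for features F1 to F3; the source does not list the concrete keywords, and a real implementation would inspect the CFA rather than raw text.

```python
import re

# Assumed stand-ins for the error semantic features (not from the source).
F1_ERROR_CALL = re.compile(r"\b\w*(?:err|fail|abort|exit)\w*\s*\(", re.IGNORECASE)
F2_ERROR_LOG = re.compile(r"\b\w*(?:log|printf|perror)\w*\s*\(", re.IGNORECASE)
F3_ERROR_GOTO = re.compile(r"\bgoto\s+\w*(?:err|fail|out)\w*", re.IGNORECASE)

def classify(handling_code):
    """Label handling code as "P-EHB" if any error feature matches, else "M-EHB"."""
    features = (F1_ERROR_CALL, F2_ERROR_LOG, F3_ERROR_GOTO)
    if any(pattern.search(handling_code) for pattern in features):
        return "P-EHB"
    return "M-EHB"

labels = [
    classify('fprintf(stderr, "SSL_new failed\\n"); goto err;'),
    classify("ERR_print_errors_fp(stderr);"),
    classify("return -1;"),
]
```

A bare return carries no error semantics on its own, which is why it falls into the weaker M-EHB class and is later subjected to the support and confidence thresholds.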
130. If the first candidate code segment is detected, the first detection condition corresponding to each function is extracted from the first candidate code segment, and if the second candidate code segment is detected, the second detection condition corresponding to each function is extracted from the second candidate code segment.
Wherein the detection condition controls whether code that handles the abnormal situation is executed. In this embodiment, the detection conditions c used at different exception handling positions for the same function f may all be collected to form a multiple map (multimap) based on the function f and the detection conditions c.
Illustratively, if the first candidate code segment is detected, the first detection condition corresponding to each function is extracted from it; that is, for each P-EHB, the target function and its detection condition are extracted and added to the multimap condPEH, whose entries have the form <target function of the P-EHB, detection condition>.
Illustratively, if the second candidate code segment is detected, the second detection condition corresponding to each function is extracted from it; that is, for each M-EHB, the target function and its detection condition are extracted and added to the multimap condMEH, whose entries have the form <target function of the M-EHB, detection condition>.
Further, multiple maps formed based on the function f and the detection condition c in the P-EHB and the M-EHB may be stored in the tables Tp and Tm, respectively.
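Illustratively, the two multimaps can be sketched as dictionaries that map a function name to the list of every detection condition observed for it. The names `cond_peh` and `cond_meh` mirror condPEH and condMEH; the input format (a list of labelled triples) and `build_multimaps` are assumptions made for illustration.

```python
from collections import defaultdict


def build_multimaps(blocks):
    """Group detection conditions by target function, separately for P-EHB and M-EHB.

    `blocks` is a hypothetical list of (label, function name, detection condition).
    """
    cond_peh = defaultdict(list)   # <target function of the P-EHB, detection condition>
    cond_meh = defaultdict(list)   # <target function of the M-EHB, detection condition>
    for label, func, cond in blocks:
        target = cond_peh if label == "P-EHB" else cond_meh
        target[func].append(cond)  # duplicates are kept: they are the votes
    return cond_peh, cond_meh


cond_peh, cond_meh = build_multimaps([
    ("P-EHB", "BIO_read", "slt 0"),
    ("P-EHB", "BIO_read", "slt 0"),
    ("M-EHB", "BIO_read", "sle 0"),
    ("M-EHB", "SSL_new", "eq null"),
])
```

Keeping repeated conditions in the lists matters, because the later voting steps count how often each condition occurs.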
140. And for any called function, screening out target detection conditions which meet a preset voting strategy from the first detection conditions and/or the second detection conditions.
As an embodiment, for any called function, if only the first detection condition is extracted, the screening of the target detection condition that meets the preset voting policy from the first detection condition may include:
voting on the first detection conditions once, so as to extract the detection condition with the highest frequency of occurrence as the target detection condition. For example, if the first detection conditions include conditions whose value ranges have a containment relationship, the condition with the larger value range may be split into detection sub-conditions, and the occurrence counts are then determined over the split sub-conditions; that is, when such conditions exist, the condition with the smaller value range is used as the target detection condition.
Specifically, the detection conditions may be split and then voted on again. For example, for the function setsocket, if the detection conditions are "less than 0" and "not equal to 0", the condition "not equal to 0" may be split into two parts, namely "less than 0" and "greater than 0". The condition "less than 0" is then recorded as occurring 2 times and the condition "greater than 0" as occurring 1 time, so for the function setsocket the exception handling protocol for handling the program abnormal condition is "less than 0".
Further, if there are logically conflicting detection conditions among the target detection conditions, the logically conflicting conditions are filtered out of the target detection conditions. For example, if the detection conditions of the function OBJ_nid2sn include the two contradictory conditions eq null and ne null, the exception handling specification of the function OBJ_nid2sn for handling the program exception is marked as illegal.
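Illustratively, the single vote with condition splitting and conflict filtering can be sketched as follows. The `SPLIT` and `CONFLICTS` tables contain only the cases mentioned in the text; how other conditions would be split or judged contradictory is not specified in the embodiment, so the tables and the name `vote_first` are assumptions.

```python
from collections import Counter

# "not equal to 0" covers "less than 0" and "greater than 0".
SPLIT = {"ne 0": ["slt 0", "sgt 0"]}
# Pairs of conditions that contradict each other.
CONFLICTS = [{"eq null", "ne null"}]


def vote_first(conditions):
    """Return the most frequent condition, or None if the function is marked illegal."""
    observed = set(conditions)
    if not conditions or any(pair <= observed for pair in CONFLICTS):
        return None
    counts = Counter()
    for cond in conditions:
        parts = SPLIT.get(cond)
        # Split only when a narrower condition was also observed for this function.
        if parts and observed & set(parts):
            counts.update(parts)
        else:
            counts[cond] += 1
    return counts.most_common(1)[0][0]


setsocket_spec = vote_first(["slt 0", "ne 0"])       # slt 0 counted twice, sgt 0 once
nid2sn_spec = vote_first(["eq null", "ne null"])     # contradictory checks
```

With these inputs the sketch reproduces the two examples in the text: the condition "less than 0" wins for setsocket, and OBJ_nid2sn yields no specification.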
As another embodiment, for any called function, if only the second detection condition is extracted, the screening of the target detection condition that meets the preset voting policy from the second detection condition may include:
voting on the second detection conditions twice, that is, first taking the detection conditions whose support is greater than or equal to a first set threshold (the minimum support) and whose confidence is greater than or equal to a second set threshold (the minimum confidence) as candidate detection conditions, and then extracting the detection condition with the highest frequency of occurrence from the candidate detection conditions as the target detection condition.
Because an M-EHB contains less error semantics, a stricter voting strategy is required for the second detection conditions. In this voting strategy, the minimum support is set to 3 and the minimum confidence is set to 75%, which means that a detection condition is eliminated if it occurs fewer than 3 times or if its frequency of occurrence is below 75%. The detection conditions in condMEH are filtered according to this principle, and the result, filterCondMEH, is used as the candidate detection conditions.
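Illustratively, the support and confidence filter can be sketched as follows. The text gives the thresholds (3 and 75%) but not the exact definition of confidence; the sketch assumes that confidence is the share of a function's M-EHB checks that use the given condition, and `filter_cond_meh` is a hypothetical name.

```python
from collections import Counter


def filter_cond_meh(cond_meh, min_support=3, min_confidence=0.75):
    """Keep only conditions that occur often enough, per function.

    support    = number of occurrences of the condition for the function
    confidence = support / total number of conditions seen for the function (assumed)
    """
    filtered = {}
    for func, conds in cond_meh.items():
        counts = Counter(conds)
        total = len(conds)
        kept = [
            c for c in conds
            if counts[c] >= min_support and counts[c] / total >= min_confidence
        ]
        if kept:
            filtered[func] = kept
    return filtered


cond_meh = {
    "SSL_new": ["eq null", "eq null", "eq null", "ne null"],  # eq null: support 3, 75%
    "BIO_read": ["slt 0", "slt 0"],                           # support 2: eliminated
}
filter_result = filter_cond_meh(cond_meh)
```

Here only the condition eq null of SSL_new survives, since it reaches both thresholds exactly.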
As another embodiment, for any called function, if the first detection condition and the second detection condition are extracted, the screening of the target detection condition meeting the preset voting policy from the first detection condition and the second detection condition may include:
voting on the second detection conditions, and taking the detection conditions whose support is greater than or equal to the first set threshold (the minimum support) and whose confidence is greater than or equal to the second set threshold (the minimum confidence) as candidate detection conditions; merging the candidate detection conditions and the first detection conditions based on the same function name, for example, merging filterCondMEH and condPEH based on the same function name to obtain condMerge; and extracting the detection condition with the highest frequency of occurrence from the merged detection conditions as the target detection condition. The minimum support may be set to 3 and the minimum confidence may be set to 75%. The frequency of occurrence may also be determined after splitting detection conditions whose value ranges have a containment relationship; the method is the same as that described for the first detection conditions above and is not repeated here.
Further, if there is a detection condition with conflicting logic in the target detection condition, the condition with conflicting logic is filtered from the target detection condition, and the specific filtering manner may refer to the filtering manner provided by the above implementation manner, which is not described herein again.
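Illustratively, merging the filtered M-EHB conditions with the P-EHB conditions and voting on the result can be sketched as follows. The parameter names mirror filterCondMEH, condPEH and condMerge from the text; the function `merge_and_vote` is hypothetical, and the splitting and conflict filtering steps are omitted here for brevity.

```python
from collections import Counter


def merge_and_vote(filter_cond_meh, cond_peh):
    """Merge both multimaps by function name and pick the most frequent condition."""
    cond_merge = {}
    for source in (filter_cond_meh, cond_peh):
        for func, conds in source.items():
            cond_merge.setdefault(func, []).extend(conds)
    return {
        func: Counter(conds).most_common(1)[0][0]
        for func, conds in cond_merge.items()
        if conds
    }


specs = merge_and_vote(
    {"SSL_new": ["eq null", "eq null", "eq null"]},
    {"SSL_new": ["eq null"], "BIO_read": ["slt 0", "slt 0", "sle 0"]},
)
```

Ties are resolved by first occurrence in this sketch; the embodiment does not state a tie-breaking rule.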
150. And taking the target detection condition as an exception handling protocol for handling the exception condition of the program.
In the present application, an exception handling specification refers to the detection condition of a function that handles a program exception. The specification takes the form <function name, specification condition>, where the inferred specification condition describes the return values of the function that indicate an error.
For example, <SSL_new, eq null> indicates that the function executed with an error when the return value of SSL_new is null. <BIO_read, slt 0> indicates that a return value of BIO_read less than 0 represents a function execution error. <SSL_connect, sle 0> indicates that the function failed when the return value of SSL_connect is less than or equal to 0.
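Illustratively, a specification of the form <function name, specification condition> can be interpreted as a predicate over return values. The sketch below is an assumption made for illustration: the `PREDICATES` table, the use of Python's None to model a null pointer, and the name `indicates_error` are not part of the embodiment.

```python
# Hypothetical mapping from condition strings to predicates over a return value.
PREDICATES = {
    "eq null": lambda v: v is None,
    "ne null": lambda v: v is not None,
    "eq 0": lambda v: v == 0,
    "ne 0": lambda v: v != 0,
    "slt 0": lambda v: v < 0,
    "sle 0": lambda v: v <= 0,
    "sgt 0": lambda v: v > 0,
    "sge 0": lambda v: v >= 0,
}

# The three specifications given as examples in the text.
SPECS = {"SSL_new": "eq null", "BIO_read": "slt 0", "SSL_connect": "sle 0"}


def indicates_error(specs, func, return_value):
    """Return True if the return value falls inside the function's error condition."""
    return PREDICATES[specs[func]](return_value)


results = [
    indicates_error(SPECS, "SSL_new", None),      # null return: error
    indicates_error(SPECS, "BIO_read", 0),        # 0 is not less than 0: no error
    indicates_error(SPECS, "SSL_connect", 0),     # 0 is less than or equal to 0: error
]
```

A defect detection tool could use such predicates to check whether each call site tests at least the error range given by the specification.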
The technical scheme provided by the embodiment of the invention integrates the exception handling attribute, and improves the accuracy and recall rate of the exception handling protocol inference in the actual project. Based on the structure and characteristics of the code for processing the program abnormal condition, the abnormal processing code is effectively defined, and the protocol is deduced by setting different voting strategies. Meanwhile, the exception handling protocol can be directly used for the existing interface defect detection tool and effectively helps to detect defects. Therefore, the embodiment of the invention can overcome the limitation of the existing method, deduces the exception handling protocol with higher accuracy and recall rate, and has important significance for promoting the automatic detection of the interface misuse defect.
Example two
Fig. 4 is a block diagram of a protocol generation apparatus for processing program abnormal conditions according to a second embodiment of the present invention. As shown in Fig. 4, the apparatus includes: a candidate code segment determining module 210, a candidate code segment classification module 220, a detection condition extracting module 230, a target detection condition determining module 240, and an exception handling specification generating module 250, wherein:
a candidate code segment determining module 210 configured to determine, as to a called function having a mapping relationship in a source code and a caller thereof, a called function conforming to a preset exception handling code structure as a candidate code segment for handling an exception condition;
a candidate code segment classification module 220 configured to, among the candidate code segments, take a code segment containing an error semantic feature as a first candidate code segment and take a code segment not containing the error semantic feature as a second candidate code segment;
a detection condition extracting module 230 configured to extract a first detection condition corresponding to each function from the first candidate code segment if the first candidate code segment is detected, and extract a second detection condition corresponding to each function from the second candidate code segment if the second candidate code segment is detected; the first detection condition and the second detection condition are both used in the corresponding candidate code segment for: controlling whether the code for processing the abnormal condition is executed;
a target detection condition determining module 240 configured to, for any one of the called functions, screen out a target detection condition that meets a preset voting policy from the first detection condition and/or the second detection condition; the preset voting strategy is that the support degree and the confidence degree respectively meet preset requirements;
and an exception handling specification generating module 250 configured to use the target detection condition as an exception handling specification for handling the program exception.
Further, the candidate code segment determining module is specifically configured to:
searching whether the called function conforms to a preset exception handling code structure in a control flow automata CFA corresponding to the caller based on a mapping relation between the called function and the caller;
if so, determining the called function which accords with the preset exception handling code structure as a candidate code segment for handling the exception condition.
Further, the apparatus further comprises:
an intermediate structure conversion module configured to capture the compilation of the source code to obtain an intermediate structure, and convert the intermediate structure into the intermediate language LLVM IR;
a modeling module configured to model the LLVM IR as a control flow automaton CFA and a program call graph CG;
and the mapping relation acquisition module is configured to acquire the mapping relation between each called function and the caller thereof through the CG.
Further, the target detection condition determining module is specifically configured to:
voting the second detection condition, and taking the detection condition with the minimum support degree greater than or equal to the first set threshold and the minimum confidence degree greater than or equal to the second set threshold as a candidate detection condition;
merging the candidate detection condition and the first detection condition based on the same function name;
and extracting the detection condition with the largest frequency of occurrence from the combined detection conditions as a target detection condition.
Further, the target detection condition determining module is specifically configured to:
and extracting the detection condition with the largest frequency of occurrence from the first detection conditions as a target detection condition.
Further, the target detection condition determining module is specifically configured to:
voting is carried out on the second detection conditions, the detection conditions with the minimum support degree being greater than or equal to the first set threshold and the minimum confidence degree being greater than or equal to the second set threshold are used as candidate detection conditions, and the detection conditions with the largest frequency of occurrence are extracted from the candidate detection conditions and used as target detection conditions.
The protocol generation device for processing the program abnormal condition, provided by the embodiment of the invention, can execute the protocol generation method for processing the program abnormal condition, provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method. For details of the technique not described in detail in the above embodiments, reference may be made to a protocol generation method for processing a program exception condition provided in any embodiment of the present invention.
Example three
Referring to fig. 5, fig. 5 is a schematic structural diagram of a computing device according to a third embodiment of the present invention. As shown in fig. 5, the computing device may include:
a memory 701 in which executable program code is stored;
a processor 702 coupled to the memory 701;
the processor 702 calls the executable program code stored in the memory 701 to execute the protocol generation method for processing the program exception condition according to any embodiment of the present invention.
The embodiment of the invention discloses a computer-readable storage medium which stores a computer program, wherein the computer program enables a computer to execute a protocol generation method for processing program abnormal conditions provided by any embodiment of the invention.
In various embodiments of the present invention, it should be understood that the sequence numbers of the above-mentioned processes do not imply an inevitable order of execution, and the execution order of the processes should be determined by their functions and inherent logic, and should not constitute any limitation on the implementation process of the embodiments of the present invention.
In the embodiments provided herein, it should be understood that "B corresponding to A" means that B is associated with A and that B can be determined from A. It should also be understood, however, that determining B from A does not mean determining B from A alone; B may also be determined from A and/or other information.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented as a software functional unit and sold or used as a stand-alone product, may be stored in a computer-accessible memory. Based on this understanding, the technical solution of the present invention in essence, or the part of it that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product, which is stored in a memory and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like, and may specifically be a processor in the computer device) to execute part or all of the steps of the methods of the embodiments of the present invention.
It will be understood by those skilled in the art that all or part of the steps in the methods of the embodiments described above may be implemented by a program instructing the relevant hardware, and the program may be stored in a computer-readable storage medium, where the storage medium includes Read-Only Memory (ROM), Random Access Memory (RAM), Programmable Read-Only Memory (PROM), Erasable Programmable Read-Only Memory (EPROM), One-time Programmable Read-Only Memory (OTPROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Compact Disc Read-Only Memory (CD-ROM) or other disc memory, magnetic disk memory, tape memory, or any other computer-readable medium that can be used to carry or store data.
Those of ordinary skill in the art will understand that: the figures are merely schematic representations of one embodiment, and the blocks or flow diagrams in the figures are not necessarily required to practice the present invention.
Those of ordinary skill in the art will understand that: modules in the devices in the embodiments may be distributed in the devices in the embodiments according to the description of the embodiments, or may be located in one or more devices different from the embodiments with corresponding changes. The modules of the above embodiments may be combined into one module, or further split into multiple sub-modules.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A method for generating a protocol for processing program abnormal conditions is applied to the processing process of the abnormal conditions after interface execution errors, and is characterized by comprising the following steps:
for a called function and a caller thereof with a mapping relation in a source code, determining the called function which accords with a preset exception handling code structure as a candidate code segment for handling exception conditions;
in the candidate code segments, a code segment containing the error semantic features is taken as a first candidate code segment, and a code segment not containing the error semantic features is taken as a second candidate code segment;
if the first candidate code segment is detected, extracting a first detection condition corresponding to each function from the first candidate code segment, and if the second candidate code segment is detected, extracting a second detection condition corresponding to each function from the second candidate code segment; the first detection condition and the second detection condition are both used in the corresponding candidate code segment for: controlling whether the code for processing the abnormal condition is executed;
for any called function, screening out target detection conditions which accord with a preset voting strategy from the first detection conditions and/or the second detection conditions;
and taking the target detection condition as an exception handling protocol for handling the exception condition of the program.
2. The method of claim 1, wherein determining a called function conforming to a preset exception handling code structure as a candidate code segment for handling an exception condition comprises:
searching whether the called function conforms to a preset exception handling code structure in a control flow automata CFA corresponding to the caller based on a mapping relation between the called function and the caller;
if so, determining the called function which accords with the preset exception handling code structure as a candidate code segment for handling the exception condition.
3. The method of claim 2, further comprising:
compiling and capturing a source code to obtain an intermediate structure, and converting the intermediate structure into an intermediate language LLVM IR;
modeling the LLVM IR as a control flow automaton CFA and a program call graph CG;
and acquiring the mapping relation between each called function and the caller thereof through the CG.
4. The method of claim 1, wherein the step of screening the first detection condition and the second detection condition for the target detection condition meeting the predetermined voting strategy comprises:
voting the second detection condition, and taking the detection condition with the minimum support degree greater than or equal to the first set threshold and the minimum confidence degree greater than or equal to the second set threshold as a candidate detection condition;
merging the candidate detection condition and the first detection condition based on the same function name;
and extracting the detection condition with the largest frequency of occurrence from the combined detection conditions as a target detection condition.
5. The method of claim 1, wherein the step of screening the first detection conditions for the target detection conditions meeting the predetermined voting strategy comprises:
and extracting the detection condition with the largest frequency of occurrence from the first detection conditions as a target detection condition.
6. The method of claim 1, wherein the step of screening the second detection conditions for the target detection conditions meeting the predetermined voting strategy comprises:
voting is carried out on the second detection conditions, the detection conditions with the minimum support degree being greater than or equal to the first set threshold and the minimum confidence degree being greater than or equal to the second set threshold are used as candidate detection conditions, and the detection conditions with the largest frequency of occurrence are extracted from the candidate detection conditions and used as target detection conditions.
7. A specification generating apparatus for processing an abnormal condition of a program, comprising:
the candidate code segment determining module is configured to determine a called function which accords with a preset exception handling code structure as a candidate code segment for handling exception conditions for a called function and a caller of the called function which have a mapping relation in a source code;
a candidate code segment classification module configured to, among the candidate code segments, take a code segment containing an error semantic feature as a first candidate code segment and take a code segment not containing the error semantic feature as a second candidate code segment;
a detection condition extracting module configured to extract a first detection condition and a second detection condition corresponding to each function from the first candidate code segment and the second candidate code segment, respectively, where the first detection condition and the second detection condition are both used in the corresponding candidate code segments: controlling whether the code for processing the abnormal condition is executed;
the target detection condition determining module is configured to screen out a target detection condition which accords with a preset voting strategy from the first detection condition and/or the second detection condition for any one called function;
and the exception handling protocol generation module is configured to take the target detection condition as an exception handling protocol for handling the exception condition of the program.
8. The apparatus according to claim 7, wherein the candidate code segment determining module is specifically configured to:
searching whether the called function conforms to a preset exception handling code structure in a control flow automata CFA corresponding to the caller based on a mapping relation between the called function and the caller;
if so, determining the called function which accords with the preset exception handling code structure as a candidate code segment for handling the exception condition.
9. A computing device, comprising:
a memory storing executable program code;
a processor coupled with the memory;
the processor calls the executable program code stored in the memory to perform the protocol generation method for processing program abnormal conditions as claimed in any one of claims 1 to 6.
10. A computer-readable storage medium, on which a computer program is stored, which, when executed by a processor, carries out the protocol generation method for processing program abnormal conditions according to any one of claims 1 to 6.
CN202110713929.9A 2021-06-25 2021-06-25 Protocol generation method, device, equipment and medium for processing program abnormal condition Pending CN113392016A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110713929.9A CN113392016A (en) 2021-06-25 2021-06-25 Protocol generation method, device, equipment and medium for processing program abnormal condition

Publications (1)

Publication Number Publication Date
CN113392016A true CN113392016A (en) 2021-09-14

Family

ID=77624122

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110713929.9A Pending CN113392016A (en) 2021-06-25 2021-06-25 Protocol generation method, device, equipment and medium for processing program abnormal condition

Country Status (1)

Country Link
CN (1) CN113392016A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2005158030A (en) * 2003-08-28 2005-06-16 Ricoh Co Ltd Program code automatic generation method, apparatus, and computer readable medium
US20140007116A1 (en) * 2012-06-30 2014-01-02 Microsoft Corporation Implementing functional kernels using compiled code modules
CN103955426A (en) * 2014-04-21 2014-07-30 中国科学院计算技术研究所 Method and device for detecting code C null-pointer reference
CN106708742A (en) * 2017-02-20 2017-05-24 许继集团有限公司 Method and device for automated test of communication protocol module test architecture
CN107885999A (en) * 2017-11-08 2018-04-06 华中科技大学 A kind of leak detection method and system based on deep learning
CN110245496A (en) * 2019-05-27 2019-09-17 华中科技大学 A kind of source code leak detection method and detector and its training method and system
CN110442527A (en) * 2019-08-16 2019-11-12 扬州大学 Automation restorative procedure towards bug report
CN112214399A (en) * 2020-09-16 2021-01-12 北京京航计算通讯研究所 API misuse defect detection system based on sequence pattern matching

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113961475A (en) * 2021-12-22 2022-01-21 清华大学 Protocol-oriented error processing defect detection method and system
CN113961475B (en) * 2021-12-22 2022-04-15 清华大学 Protocol-oriented error processing defect detection method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination