CN111382067A - Method and system for generating high-quality seeds in fuzzy test - Google Patents

Info

Publication number
CN111382067A
Authority
CN
China
Prior art keywords
character string
analysis
taint
test
binary
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010124736.5A
Other languages
Chinese (zh)
Inventor
孙利民
郑尧文
宋站威
刘明东
朱红松
石志强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Information Engineering of CAS
Original Assignee
Institute of Information Engineering of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Information Engineering of CAS filed Critical Institute of Information Engineering of CAS
Priority to CN202010124736.5A priority Critical patent/CN111382067A/en
Publication of CN111382067A publication Critical patent/CN111382067A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/36Preventing errors by testing or debugging software
    • G06F11/3668Software testing
    • G06F11/3672Test management
    • G06F11/3684Test management for test design, e.g. generating new test cases
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/36Preventing errors by testing or debugging software
    • G06F11/3668Software testing
    • G06F11/3672Test management
    • G06F11/3676Test management for coverage analysis

Abstract

The invention provides a method and a system for generating high-quality seeds in fuzz testing. The method comprises the following steps: first, binary strings are extracted from a target program and preprocessed; then, static taint analysis is performed on the extracted binary strings to obtain the key strings for gray-box testing; finally, based on the key strings, a gray-box testing tool is used to generate high-quality seeds. In the method and system provided by the embodiments of the invention, a string set is obtained by reverse-analyzing and parsing the binary program under test; static taint analysis identifies the key string information in the program under test, which then takes part in generating the inputs of the gray-box test; finally, dynamic gray-box testing is used to discover overflow vulnerabilities in the program. This provides high-quality seeds for efficient fuzz testing and vulnerability mining of software whose input has a specific syntactic structure, and effectively improves the code coverage of gray-box testing.

Description

Method and system for generating high-quality seeds in fuzzy test
Technical Field
Embodiments of the invention belong to the field of computer technology and relate in particular to a method and a system for generating high-quality seeds in fuzz testing.
Background
Fuzz testing (fuzzing) is one of the most efficient techniques for mining vulnerabilities in software and systems. Gray-box testing is one fuzzing technique: it obtains dynamic execution information about a program through lightweight instrumentation and uses that information to guide the test toward more paths, so that more bugs in the program are found.
For example, AFL, currently the most widely used gray-box testing tool, takes any input that discovers a new program path as a seed and mutates it in an attempt to exercise more paths. However, the tool still relies on random mutation, inserting and flipping data at byte granularity. The generated inputs work well against software that parses its input byte by byte. But because the inputs produced by such a general-purpose gray-box testing system are all random mutations, most of them carry no semantic information about the software, and the software discards them at an early stage of input processing. The tool is therefore poorly suited to vulnerability detection in software that parses its input according to a specific syntactic structure, and cannot test such software in depth.
Therefore, for software whose input has a specific syntactic structure, how to generate high-quality seeds for efficient fuzz testing, and in turn efficient vulnerability mining, is a technical problem that urgently needs to be solved.
Disclosure of Invention
Embodiments of the invention provide a method and a system for generating high-quality seeds in fuzz testing, to overcome the following defect of existing general-purpose gray-box testing systems: most of the inputs they generate are random mutations that carry no semantic information about the software, so the software discards them at an early stage of input processing and cannot be tested in depth.
In a first aspect, an embodiment of the invention provides a method for generating high-quality seeds in fuzz testing, which mainly includes: extracting binary strings from a target program and preprocessing them; performing static taint analysis on the extracted binary strings to obtain the key strings for gray-box testing; and, based on the key strings, generating high-quality seeds with a gray-box testing tool.
Preferably, extracting and preprocessing the binary strings from the target program may include: taking a binary service program under Linux as the target program under test; reading the header information of the target program with the readelf command and obtaining the start and end addresses of the data sections; and, within those start and end addresses, extracting and storing the binary strings, using the control characters '\0', '\r', '\n', and '\t' as separators.
Preferably, performing static taint analysis on the extracted binary strings to obtain the key strings for gray-box testing may include: defining the taint sources and taint targets, and obtaining the code implementations of the library functions called by the target program; performing data flow analysis on the library functions called by the target program to determine their taint propagation summaries; performing taint propagation analysis at code-block granularity, based on the taint sources and the taint propagation summaries; and, if a taint source propagates to a register of a taint target, determining that the binary string corresponding to that taint source is a key string for gray-box testing.
Preferably, defining the taint sources and taint targets and obtaining the code implementations of the library functions called by the target program specifically includes: setting the register that stores the address of a binary string as a taint source; setting the string comparison library functions and the memory comparison library functions as taint targets; and obtaining the dynamic link libraries called by the target program, and from them the code implementations of all library functions used by the target program.
Preferably, performing data flow analysis on the library functions called by the target program to determine the taint propagation summaries includes: skipping the printing functions, file operation functions, environment variable acquisition functions, and parameter acquisition functions among the library functions, for which no taint propagation summary is generated; and performing data flow analysis on the string copy, concatenation, substring, and length measurement functions among the library functions to determine the corresponding taint propagation summaries.
Preferably, performing taint propagation analysis at code-block granularity based on the taint sources and the taint propagation summaries specifically includes: starting from a taint source, performing taint propagation analysis on each code block at code-block granularity; after the taint propagation analysis of a code block is complete, performing code transfer analysis; and, during code transfer analysis, if a library function is called, using its taint propagation summary to carry out the taint propagation analysis.
Preferably, generating high-quality seeds with a gray-box testing tool based on the key strings specifically includes: storing each key string to form a dictionary file; and starting the gray-box testing tool AFL, which in each test takes a key string from the dictionary file and inserts it into an original seed, generating a high-quality seed.
In a second aspect, an embodiment of the invention provides a system for generating high-quality seeds in fuzz testing, which mainly includes a binary string extraction unit, a key string extraction unit, and a seed generation unit, wherein: the binary string extraction unit is mainly used to extract binary strings from a target program and preprocess them; the key string extraction unit is mainly used to perform static taint analysis on the extracted binary strings to obtain the key strings for gray-box testing; and the seed generation unit is mainly used to generate high-quality seeds with a gray-box testing tool, based on the key strings.
In a third aspect, an embodiment of the present invention provides an electronic device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor implements the steps of the method for generating high-quality seeds in a fuzz test according to any one of the first aspect when executing the program.
In a fourth aspect, an embodiment of the present invention provides a non-transitory computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of the method for generating high-quality seeds in a fuzz test according to any one of the first aspect.
In the method and system for generating high-quality seeds in fuzz testing provided by the embodiments of the invention, a string set is obtained by reverse-analyzing and parsing the binary program under test; static taint analysis identifies the key string information in the program under test, which then takes part in generating the inputs of the gray-box test; finally, a dynamic gray-box testing tool is applied. This provides high-quality seeds for efficient fuzz testing and vulnerability mining of software whose input has a specific syntactic structure, effectively improves the code coverage of gray-box testing, and in turn discovers overflow vulnerabilities in programs more efficiently.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
Fig. 1 is a schematic flow chart of a method for generating high-quality seeds in fuzz testing according to an embodiment of the present invention;
Fig. 2 is a flow chart of another method for generating high-quality seeds according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a high-quality seed generation system in fuzz testing according to an embodiment of the present invention;
Fig. 4 is a diagram of the implementation steps of taint propagation analysis on software assembly code blocks and a control flow graph according to an embodiment of the present invention;
Fig. 5 is a physical structure diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Because computer software is written by people, it contains more or fewer security vulnerabilities that arise from problems the programmers did not consider when writing it; these generally include abnormal behavior of the software after the computer is infected by a virus, protocol vulnerabilities, and the like. Software testing is the process of executing a program in order to discover such vulnerabilities. Conventional vulnerability detection techniques generally include static detection (white-box detection), dynamic detection (black-box detection), and hybrid detection (gray-box detection). Gray-box detection lies between white-box and black-box testing: it attends both to the correctness of outputs for given inputs and to the internal state of the program. It does not require the detailed and complete view of internals that white-box testing does, but it pays more attention to the program's internal logic than black-box testing, judging the internal running state mainly from characteristic phenomena, events, and signs. It can therefore be effective in the integration testing stage of a program.
Fuzzing is a method of discovering software vulnerabilities by supplying constructed test inputs to a target program and monitoring for abnormal results. On the one hand, fuzzing requires a large number of test cases, the so-called seeds. On the other hand, in practice, running the program under test against many test cases does not guarantee that vulnerabilities will be found, so the seeds need to be screened; that is, rules must be established to turn ordinary seeds into high-quality seeds.
In view of this, an embodiment of the invention provides a method for generating high-quality seeds in fuzz testing which, as shown in Fig. 1, includes, but is not limited to, the following steps:
S1: extracting binary strings from a target program and preprocessing them;
S2: performing static taint analysis on the extracted binary strings to obtain the key strings for gray-box testing;
S3: based on the key strings, generating high-quality seeds with a gray-box testing tool.
Extracting and preprocessing the binary strings in step S1 may include, but is not limited to, the following steps:
first, taking a binary service program under Linux as the target program under test; then, reading the header information of the target program with the readelf command and obtaining the start and end addresses of the data sections; finally, within the start and end addresses of each data section, extracting and storing the binary strings, using the control characters '\0', '\r', '\n', and '\t' as separators.
Specifically, in the binary string extraction stage, readable strings are stored as constants in specific regions of the binary. At the beginning of the test, those regions are located first, and the strings in them are then extracted using '\0' as the separator. The extracted strings are further preprocessed as follows:
if a string contains control characters such as '\t', '\r', or '\n', it is further split at those control characters into substrings, and all of the substrings are used as binary strings of the target program in step S1.
If a string contains no control characters such as '\t', '\r', or '\n', it is used directly as a binary string of the target program in step S1.
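The splitting rule described above can be illustrated with a minimal Python sketch. It assumes the raw bytes of a data section are already in memory; the function name `extract_strings`, the minimum length `MIN_LEN`, and the printable-ASCII filter are illustrative assumptions and are not specified in the source.

```python
import re

# Assumption: pieces shorter than MIN_LEN are treated as noise; the source sets no such limit.
MIN_LEN = 2

def extract_strings(section: bytes, min_len: int = MIN_LEN) -> list:
    """Split raw section bytes on '\\0', then split each piece on '\\t', '\\r', '\\n'."""
    result = []
    for chunk in section.split(b"\x00"):
        # Preprocessing step: break each NUL-terminated string at control characters.
        for piece in re.split(rb"[\t\r\n]", chunk):
            # Assumption: only printable ASCII pieces count as readable strings.
            if len(piece) >= min_len and all(0x20 <= b < 0x7F for b in piece):
                result.append(piece.decode("ascii"))
    return result
```

A string without control characters passes through unchanged, while one such as `Content-Length\r\nHost` yields the two substrings `Content-Length` and `Host`.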
Taint analysis is an important technique for protecting the security of private data and for detecting vulnerabilities. Depending on whether the target program must be run during the analysis, it is divided into static taint analysis and dynamic taint analysis. The static taint analysis used in the embodiments of the invention analyzes the data and control dependencies among variables offline, by means such as lexical and syntactic analysis, to detect whether tainted data can propagate from a taint source to a taint target (i.e., a taint sink). During vulnerability detection, the target program therefore does not need to be run and its code does not need to be modified.
Specifically, as an optional embodiment, performing static taint analysis on the extracted binary strings to obtain the key strings for gray-box testing may include the following steps:
defining the taint sources and taint targets, and obtaining the code implementations of the library functions called by the target program; performing data flow analysis on the library functions called by the target program to determine their taint propagation summaries; performing taint propagation analysis at code-block granularity, based on the taint sources and the taint propagation summaries; and, if a taint source propagates to a register of a taint target, determining that the binary string corresponding to that taint source is a key string for gray-box testing.
As shown in Fig. 2, the method for generating high-quality seeds in fuzz testing provided in this embodiment mainly includes:
First, binary strings are extracted and preprocessed. Specifically, the target program under test is mainly a binary service program under Linux, whose file format is ELF. The header information of the target program is read with the readelf command to obtain the start and end addresses of the data sections. Strings are then extracted from each data section, beginning at its start address, and stored, using control characters such as '\0', '\r', '\n', and '\t' as separators.
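As a rough illustration of the readelf step, the following Python sketch parses the section table printed by `readelf -S -W` and returns the address range of selected sections. The choice of `.rodata` and `.data` as the sections of interest is an assumption (the text only speaks of data sections), and the helper names `parse_sections` and `data_section_ranges` are hypothetical.

```python
import re
import subprocess

# One row of `readelf -S -W` output: [Nr] Name Type Address Off Size ...
_ROW = re.compile(
    r"^\s*\[\s*\d+\]\s+(\S+)\s+\S+\s+([0-9a-fA-F]+)\s+([0-9a-fA-F]+)\s+([0-9a-fA-F]+)"
)

def parse_sections(readelf_output: str) -> dict:
    """Map section name -> (start address, end address, file offset)."""
    sections = {}
    for line in readelf_output.splitlines():
        match = _ROW.match(line)
        if match:
            name = match.group(1)
            addr, off, size = (int(match.group(i), 16) for i in (2, 3, 4))
            sections[name] = (addr, addr + size, off)
    return sections

def data_section_ranges(path: str, wanted=(".rodata", ".data")) -> dict:
    """Run readelf on an ELF file and keep only the wanted sections (assumed names)."""
    output = subprocess.run(
        ["readelf", "-S", "-W", path],
        capture_output=True, text=True, check=True,
    ).stdout
    table = parse_sections(output)
    return {name: table[name] for name in wanted if name in table}
```

The file offset is kept alongside the addresses because the string bytes are read from the file, while cross-references in a disassembler are expressed in virtual addresses.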
Next, key strings are extracted on the basis of static taint analysis. This includes defining the taint source: the register that stores the address of a binary string is used as the taint source. Specifically, in IDA Pro, assuming the storage address of the string is addr, the IDAPython interface XrefsTo(addr) is called to obtain all references that access the string. If xref is one of these references, xref.frm is the address of the instruction that references the string. The source register of that instruction is then resolved and used as the taint source.
Defining the taint target (taint sink): the string comparison library functions and the memory comparison library functions are set as taint targets. That is, specific registers of these comparison functions serve as taint target registers. The string comparison library functions may include strcmp, strncmp, strcasecmp, and strncasecmp; both the first and the second parameter of each function can serve as a taint target register, indicating that the memory region pointed to by the parameter is compared with external input. The memory comparison library functions mainly comprise memcmp, whose first and second parameters can likewise serve as taint target registers.
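The taint target rule above can be written down as a small lookup table. In this sketch the register names `a0` and `a1` stand for the first and second parameters; using them here, as well as the names `TAINT_SINKS` and `reaches_sink`, is an illustrative assumption.

```python
# Taint targets (sinks): the comparison library functions named in the text.
# Assumption: a0 and a1 are the registers holding the first and second parameters.
TAINT_SINKS = {
    "strcmp": ("a0", "a1"),
    "strncmp": ("a0", "a1"),
    "strcasecmp": ("a0", "a1"),
    "strncasecmp": ("a0", "a1"),
    "memcmp": ("a0", "a1"),
}

def reaches_sink(callee: str, tainted_regs: set) -> bool:
    """Return True if a tainted register is a taint target register of the callee."""
    return any(reg in tainted_regs for reg in TAINT_SINKS.get(callee, ()))
```

A call to `strcmp` with a tainted second parameter is reported as a hit, whereas a tainted third parameter of `memcmp` (the length) is not, since only the two pointer parameters are taint target registers.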
Further, taint summaries must be generated for all of the library functions obtained above. Taint analysis can be abstracted as a triple of taint source, taint target, and sanitization. A taint source represents the direct introduction of untrusted or confidential data into the system; a taint target represents an operation that directly performs a security-sensitive action (violating data integrity) or leaks private data to the outside (violating data confidentiality); sanitization means that, through data encryption, hazard-removal operations, or similar means, the propagation of the data no longer endangers the information security of the software system. Taint analysis determines whether data introduced by a taint source can propagate to a taint target without being sanitized. If it cannot, the system is information-flow safe; otherwise, the system has security problems such as private data leakage or dangerous data operations.
As noted above, taint analysis is divided into static taint analysis and dynamic taint analysis according to whether the program must be run during the analysis. The high-quality seed generation method provided by the embodiments of the invention is a seed generation method for fuzz testing built on static taint analysis. The object of static taint analysis is typically the source code or an intermediate representation of the program. The static analysis of explicit flows in taint propagation can be reduced to the analysis of static data dependencies in the program.
In the embodiments of the invention, data flow analysis is performed on the library functions to generate the taint summaries. The main steps include: skipping the printing functions, file operation functions, environment variable acquisition functions, and parameter acquisition functions among the library functions, for which no taint propagation summary is generated; and performing data flow analysis on the string copy, concatenation, substring, and length measurement functions among the library functions to determine the corresponding taint propagation summaries.
In particular, functions unrelated to binary strings do not propagate taint on string variables, so only the functions related to the binary strings obtained in step S1 are analyzed. For convenience of explanation, the embodiments of the invention define a0, a1, a2, and a3 as the first four parameters of a function and v0 as its return value. X <- Y denotes a propagation rule from Y to X: if the memory region pointed to by Y is tainted, the memory region pointed to by X is also tainted; if the memory region pointed to by Y is not tainted, the memory region pointed to by X is not tainted either. The specific analysis steps may include:
1) For string printing functions, such as fprintf, sprintf, printf, vsnprintf, snprintf, fputs, and puts, no taint summary is generated; when taint propagation analysis reaches such a function, the function is skipped and the analysis continues.
2) For file operation functions, such as open, fopen, stat, fstat, lstat, rewind, unlink, remove, rename, and system, the strings are usually used as file names or commands to be opened or executed, so there is no taint propagation and no taint summary is generated; when taint propagation analysis reaches such a function, the function is skipped and the analysis continues.
3) For the environment variable acquisition function getenv and the parameter acquisition function getopt, the strings obtained are environment variable values or parameter values, which are unrelated to the environment variable names and parameter names; as with the functions above, these functions are skipped and no taint summary is generated.
4) Among the string and memory operation functions, the copy functions, such as memcpy, memncpy, memmove, strcpy, and strncpy, propagate taint with the rule a0 <- a1; the string duplication functions strdup and strdupa have the rules v0 <- a1 and v0 <- a0; the string concatenation function strcat has the rule a0 <- a1; the string length measurement function strlen has no taint propagation rule; and the substring functions, such as strchr, strrchr, strstr, and strrstr, have the rules v0 <- a1 and v0 <- a0. The taint propagation rules of the different library functions are obtained in this way, and the corresponding taint propagation summaries are defined.
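The propagation rules listed above can be encoded as a table of (destination, source) pairs. The sketch below is a simplified model under stated assumptions: taint is tracked per register name rather than per memory region, only a subset of the listed functions is included, and the names `TAINT_SUMMARIES` and `apply_summary` are hypothetical.

```python
# Each rule (dst, src) encodes "dst <- src" in the notation of the text.
_COPY = [("a0", "a1")]
_RETURNS_ARG = [("v0", "a1"), ("v0", "a0")]

TAINT_SUMMARIES = {
    "memcpy": _COPY, "memmove": _COPY, "strcpy": _COPY, "strncpy": _COPY,
    "strcat": _COPY,
    "strdup": _RETURNS_ARG,
    "strchr": _RETURNS_ARG, "strrchr": _RETURNS_ARG, "strstr": _RETURNS_ARG,
    "strlen": [],  # no propagation rule
}

def apply_summary(callee: str, tainted: set) -> set:
    """Return the tainted register set after a call to callee.

    Functions without a summary (printing, file, getenv/getopt) leave the set unchanged.
    """
    rules = TAINT_SUMMARIES.get(callee, [])
    result = set(tainted)
    for dst in {d for d, _ in rules}:
        # dst becomes tainted if any of its sources is tainted, otherwise it is cleared.
        if any(src in tainted for d, src in rules if d == dst):
            result.add(dst)
        else:
            result.discard(dst)
    return result
```

For instance, calling `strcpy` with a tainted `a1` taints `a0`, while calling it with only `a0` tainted clears `a0`, matching the statement that an untainted Y leaves X untainted.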
As an optional embodiment, performing taint propagation analysis at code-block granularity, based on the taint sources and the taint propagation summaries obtained for the library functions, may include: starting from a taint source, performing taint propagation analysis on each code block at code-block granularity; after the taint propagation analysis of a code block is complete, performing code transfer analysis; and, during code transfer analysis, if a library function is called, using its taint propagation summary to carry out the taint propagation analysis.
Specifically, taint propagation analysis starts from the taint source and proceeds at code-block granularity. After the taint propagation analysis of the current code block is complete, code block transfer analysis is performed. The analysis traverses code blocks breadth-first: if the current code block has several successor code blocks, they are stored in a queue in order for later analysis. Successor code blocks are handled according to the following cases: 1) if the successor code block is still in the current function, it is simply stored in the analysis queue for later analysis; 2) if the successor code block is a sub-function that is a library function, library function analysis is performed, that is, the taint summary generated as described in the above embodiment is applied and the analysis continues with the code block that follows the library function call; 3) if the successor code block is a sub-function that belongs to the target program itself, the code block at the entry address of the sub-function is added to the analysis queue.
Finally, if the analysis reaches a taint target, target-point library function analysis is performed according to the taint target rules described in the above embodiment: if the taint has propagated to a taint target register of the target-point library function, the corresponding string is determined to be a key string for gray-box testing; if not, the analysis continues. If the taint source ultimately does not propagate to any taint target, the corresponding string is determined not to be a key string for gray-box testing.
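The breadth-first, block-level analysis described above can be sketched on a toy control flow graph. Everything about the data model here is an assumption made for illustration: each block is a dictionary with optional register moves (`moves`), an optional library call (`call`), and successor names (`succ`); calls into functions of the target program itself are modeled simply as successor edges; and only a small subset of summaries is included.

```python
from collections import deque

SINKS = {"strcmp", "strncmp", "strcasecmp", "strncasecmp", "memcmp"}
SUMMARIES = {"strcpy": [("a0", "a1")], "strcat": [("a0", "a1")]}  # dst <- src

def is_key_string(blocks: dict, entry: str, source_reg: str) -> bool:
    """Breadth-first taint propagation from source_reg, starting at block `entry`."""
    queue = deque([(entry, frozenset([source_reg]))])
    seen = set()
    while queue:
        name, tainted = queue.popleft()
        if (name, tainted) in seen:  # avoid re-analysing loops with the same state
            continue
        seen.add((name, tainted))
        block = blocks[name]
        regs = set(tainted)
        # Intra-block propagation over register moves (dst <- src).
        for dst, src in block.get("moves", []):
            if src in regs:
                regs.add(dst)
            else:
                regs.discard(dst)
        callee = block.get("call")
        # Taint target check: tainted first or second parameter of a comparison function.
        if callee in SINKS and regs & {"a0", "a1"}:
            return True
        # Library call: apply its summary instead of analysing the library code.
        for dst, src in SUMMARIES.get(callee, []):
            if src in regs:
                regs.add(dst)
        # Transfer analysis: enqueue all successors for later analysis.
        for succ in block.get("succ", []):
            queue.append((succ, frozenset(regs)))
    return False
```

The `seen` set is a design choice of this sketch, not something stated in the source; it guarantees termination on loops by never revisiting a block with an identical taint state.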
Based on the above description of the embodiments, as an alternative embodiment, generating high-quality seeds by using a gray box testing tool based on the key character string may include:
storing each key character string respectively to form a dictionary file; and starting a gray box testing tool AFL, calling a key character string from the dictionary file to insert into the original seed in each test, and generating a high-quality seed.
Specifically, after the key strings for gray-box testing are obtained, they are saved as a dictionary file keywords_file. For example, each key string fuzz_keywords may be stored as an entry of the form string_x="fuzz_keywords", where x is the sequence number of the current key string.
Further, the AFL gray-box testing tool is started with the additional parameter -x keywords_file. In each test, AFL inserts key strings into the original seeds, thereby generating high-quality seeds.
In summary, the embodiments of the invention provide a method for generating high-quality seeds in fuzz testing, in which a string set is obtained by reverse-analyzing and parsing the binary program under test, and static taint analysis identifies the key string information in the program under test, which then takes part in generating the inputs of the gray-box test. This provides high-quality seeds for efficient fuzz testing and vulnerability mining of software whose input has a specific syntactic structure, effectively improves the code coverage of gray-box testing, and discovers more overflow vulnerabilities in programs.
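Writing the dictionary file can be sketched as follows. The sketch assumes the AFL dictionary convention of one `name="value"` entry per line, with quotes and backslashes escaped and non-printable bytes written as `\xNN`; the entry names `string_N` and the helper names `format_entry` and `write_dictionary` are illustrative assumptions.

```python
def format_entry(index: int, keyword: str) -> str:
    """Format one key string as a dictionary entry of the form name="value"."""
    escaped = []
    for byte in keyword.encode("utf-8"):
        char = chr(byte)
        if char in ('"', "\\"):
            escaped.append("\\" + char)          # escape quotes and backslashes
        elif 0x20 <= byte < 0x7F:
            escaped.append(char)                 # printable ASCII is kept as-is
        else:
            escaped.append("\\x%02x" % byte)     # everything else as \xNN
    return 'string_%d="%s"' % (index, "".join(escaped))

def write_dictionary(path: str, keywords: list) -> None:
    """Write all key strings, numbered from 1, to the dictionary file at path."""
    with open(path, "w", encoding="ascii") as handle:
        for index, keyword in enumerate(keywords, start=1):
            handle.write(format_entry(index, keyword) + "\n")
```

With the file in place, a typical invocation would look like `afl-fuzz -i seeds_dir -o findings_dir -x keywords_file -- ./target @@`, where the directory names and the target path are placeholders.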
As shown in Fig. 3, an embodiment of the invention provides a system for generating high-quality seeds in fuzz testing, which mainly includes a binary string extraction unit 1, a key string extraction unit 2, and a seed generation unit 3, where:
the binary string extraction unit 1 is mainly used to extract binary strings from a target program and preprocess them;
the key string extraction unit 2 is mainly used to perform static taint analysis on the extracted binary strings to obtain the key strings for gray-box testing;
the seed generation unit 3 is mainly used to generate high-quality seeds with a gray-box testing tool, based on the key strings.
Fig. 4 is a diagram of implementation steps of taint propagation analysis on a software assembly code block and a control flow diagram according to an embodiment of the present invention, as shown in fig. 4, an analysis module is mainly arranged in the key character string extraction unit 2, and is used for performing static taint analysis on the obtained binary character string to obtain a key character string of a gray box test, where the whole analysis process may be: firstly, carrying out taint analysis on the granularity of code blocks on the obtained binary character string; and if the taint propagation analysis of the current code block is completed, performing transfer analysis on the code block to obtain the analysis target of the next step.
The following cases can be classified into the following cases. 1) If the subsequent code block is still in the current function, only storing the subsequent code block in a queue for subsequent code block analysis; 2) and if the subsequent code block is a subfunction which is a library function, performing library function analysis. The concrete implementation is that the taint abstract generated above is analyzed, and the code block behind the library function is continuously analyzed; 3) and if the subsequent code block is a subfunction and the subfunction is still in the program, adding the code block with the first address of the subfunction into the analysis queue.
Further, library function analysis includes performing taint propagation analysis with the taint summary of a library function when propagation reaches that library function; if the taint source propagates to a taint target, target-point library function analysis is performed.
Finally, target-point library function analysis is performed, which includes: if the taint propagates to a taint target register of the target-point library function, the corresponding character string is judged to be a key character string of the grey box test; otherwise, the analysis continues. If the taint source ultimately does not propagate to any taint target, the corresponding character string is judged not to be a key character string of the grey box test.
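The queue-driven analysis described above can be illustrated with a small Python sketch. It is a toy model under loud assumptions, not the analysis of the embodiment: code blocks are given as a hand-written dictionary rather than lifted from assembly, the register names, the `SUMMARIES`, `SINKS` and `is_key_string` names are hypothetical, calls are assumed not to clobber registers, and a call into a subfunction simply enqueues the callee's first block without modelling its return.

```python
from collections import deque

# Hypothetical taint summaries: library function -> [(source reg, destination reg)].
SUMMARIES = {"strcpy": [("r1", "r0")], "strcat": [("r1", "r0")]}
# Hypothetical taint targets: comparison functions and their argument registers.
SINKS = {"strcmp": ("r0", "r1"), "memcmp": ("r0", "r1")}


def is_key_string(blocks, entry, source_reg):
    """Return True if taint starting in source_reg reaches a sink argument register.

    blocks maps a block name to (instructions, successor names); an instruction
    is ("mov", dst, src) or ("call", function_name, None).
    """
    queue = deque([(entry, frozenset([source_reg]))])
    seen = set()
    while queue:
        name, state = queue.popleft()
        if (name, state) in seen:
            continue  # this block was already analysed with this taint state
        seen.add((name, state))
        tainted = set(state)
        instructions, successors = blocks[name]
        for op, a, b in instructions:
            if op == "mov":
                # mov dst, src: dst becomes tainted exactly when src is tainted.
                if b in tainted:
                    tainted.add(a)
                else:
                    tainted.discard(a)
            elif op == "call":
                if a in SINKS:
                    if tainted & set(SINKS[a]):
                        return True  # taint reached a taint target register
                elif a in SUMMARIES:
                    for src, dst in SUMMARIES[a]:
                        if src in tainted:
                            tainted.add(dst)
                elif a in blocks:
                    # Subfunction inside the program: enqueue its first block.
                    queue.append((a, frozenset(tainted)))
                # Any other call (printing, file operations, ...) is skipped.
        for succ in successors:
            queue.append((succ, frozenset(tainted)))
    return False


if __name__ == "__main__":
    program = {
        "entry": ([("mov", "r5", "r1"), ("call", "printf", None)], ["check"]),
        "check": ([("mov", "r0", "r5"), ("call", "strcmp", None)], []),
    }
    print(is_key_string(program, "entry", "r1"))  # True
```

Tracking the pair (block, taint state) in `seen` is what keeps the loop finite on cyclic control flow; a real analysis would also need memory modelling and calling-convention details that this sketch leaves out.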
It should be noted that, when running, the system for generating high-quality seeds in a fuzz test provided in the embodiment of the present invention may be used to execute the method for generating high-quality seeds in a fuzz test of the foregoing embodiments; details are not repeated here.
With the system for generating high-quality seeds in a fuzz test provided by the embodiment of the present invention, a set of character strings is extracted by reverse analysis of the binary program under test; static taint analysis is then used to extract the key character string information in the program under test, which takes part in generating the inputs of the grey box test; finally, dynamic grey box testing is used to discover overflow vulnerabilities in the program. This provides high-quality seeds for efficient fuzz testing and vulnerability mining of software whose input has a specific syntactic structure, and effectively improves the code coverage of the grey box test.
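How the key character strings take part in input generation can be sketched by writing them to a dictionary file that the grey box testing tool AFL accepts through its `-x` option. The Python below is an illustrative sketch only: the entry names `key_0`, `key_1`, ..., the file name `keys.dict`, the directories `seeds` and `findings`, and the target path `./target` are all hypothetical, and the command line is built but not executed.

```python
def escape_token(token: bytes) -> str:
    """Escape a token for an AFL dictionary entry using \\xNN hex escapes."""
    out = []
    for b in token:
        if b in (0x22, 0x5C) or not 0x20 <= b < 0x7F:
            # Hex-escape double quotes, backslashes and non-printable bytes.
            out.append(f"\\x{b:02X}")
        else:
            out.append(chr(b))
    return "".join(out)


def format_dictionary(key_strings) -> str:
    """Render key strings as AFL dictionary lines of the form name="value"."""
    lines = []
    for i, s in enumerate(key_strings):
        lines.append(f'key_{i}="{escape_token(s.encode("utf-8"))}"\n')
    return "".join(lines)


def write_dictionary(key_strings, path: str) -> None:
    """Write the rendered dictionary to a file for later use with afl-fuzz -x."""
    with open(path, "w", encoding="ascii") as fh:
        fh.write(format_dictionary(key_strings))


if __name__ == "__main__":
    print(format_dictionary(["username=", "Content-Length"]), end="")
    # Hypothetical invocation; the paths are placeholders, not from the source.
    afl_cmd = ["afl-fuzz", "-i", "seeds", "-o", "findings",
               "-x", "keys.dict", "--", "./target", "@@"]
    print(" ".join(afl_cmd))
```

For an uninstrumented binary, AFL would typically also need its QEMU mode (`-Q`); whether the embodiment runs AFL that way is not stated in the text.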
Fig. 5 illustrates the physical structure of an electronic device. As shown in Fig. 5, the electronic device may include: a processor 310, a communication interface 320, a memory 330 and a communication bus 340, wherein the processor 310, the communication interface 320 and the memory 330 communicate with each other via the communication bus 340. The processor 310 may call logic instructions in the memory 330 to perform the following method: extracting and preprocessing a binary character string from a target program; performing static taint analysis on the obtained binary character string to obtain a key character string of the grey box test; and generating high-quality seeds with a grey box testing tool based on the key character string.
In addition, the logic instructions in the memory 330 may be implemented in the form of software functional units and, when sold or used as independent products, stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and other media capable of storing program code.
In another aspect, an embodiment of the present invention further provides a non-transitory computer-readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, performs the method provided in the foregoing embodiments, for example including: extracting and preprocessing a binary character string from a target program; performing static taint analysis on the obtained binary character string to obtain a key character string of the grey box test; and generating high-quality seeds with a grey box testing tool based on the key character string.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A method for generating high-quality seeds in a fuzz test, characterized by comprising the following steps:
extracting and preprocessing a binary character string from a target program;
performing static taint analysis on the obtained binary character string to obtain a key character string of a grey box test;
and generating high-quality seeds with a grey box testing tool based on the key character string.
2. The method for generating high-quality seeds in a fuzz test according to claim 1, wherein the extracting and preprocessing of the binary character string from the target program comprises:
taking a binary service program under Linux as the target program to be tested;
reading the header information of the target program with the readelf command, and acquiring the start and stop addresses of its data sections;
and extracting the binary character strings between the start and stop addresses, using the control characters '\0', '\r', '\n' and '\t' as delimiters, and saving them.
3. The method for generating high-quality seeds in a fuzz test according to claim 1, wherein the performing static taint analysis on the obtained binary character string to obtain a key character string of a grey box test comprises:
defining a taint source and a taint target, and acquiring the code implementations of the library functions called by the target program;
performing data flow analysis on the library functions called by the target program, and determining a taint propagation summary;
performing taint propagation analysis at code-block granularity on the basis of the taint source and the taint propagation summary;
and if the taint source propagates to a register of the taint target, judging that the binary character string corresponding to the taint source is a key character string of the grey box test.
4. The method for generating high-quality seeds in a fuzz test according to claim 3, wherein the defining a taint source and a taint target, and acquiring the code implementations of the library functions called by the target program comprises:
setting the register that stores the address of the binary character string as the taint source;
setting the string comparison library functions and the memory comparison library functions as the taint target;
and acquiring the dynamic link libraries called by the target program, and further acquiring from them the code implementations of all library functions used by the target program.
5. The method for generating high-quality seeds in a fuzz test according to claim 3, wherein the performing data flow analysis on the library functions called by the target program and determining a taint propagation summary comprises:
skipping the print functions, file operation functions, environment variable acquisition functions and parameter acquisition functions among the library functions, without generating a taint propagation summary for them;
and performing data flow analysis on the string copying, concatenation, truncation and measurement functions among the library functions, and determining the taint propagation summary.
6. The method for generating high-quality seeds in a fuzz test according to claim 3, wherein the performing taint propagation analysis at code-block granularity on the basis of the taint source and the taint propagation summary comprises:
in the taint propagation analysis process, performing code block taint propagation analysis starting from the taint source at code-block granularity;
after the code block taint propagation analysis is completed, performing code transfer analysis;
and in the code transfer analysis, if a library function is called, performing taint propagation analysis with the taint propagation summary.
7. The method for generating high-quality seeds in a fuzz test according to claim 1, wherein the generating high-quality seeds with a grey box testing tool based on the key character string comprises:
storing each key character string to form a dictionary file;
and starting the grey box testing tool AFL, and in each test taking a key character string from the dictionary file and inserting it into the original seed to generate the high-quality seed.
8. A system for generating high-quality seeds in a fuzz test, comprising:
a binary string extraction unit, used for extracting and preprocessing a binary character string from a target program;
a key string extraction unit, used for performing static taint analysis on the obtained binary character string to obtain a key character string of the grey box test;
and a seed generation unit, used for generating high-quality seeds with a grey box testing tool based on the key character string.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor when executing the program performs the steps of the method for generating high quality seeds in a fuzz test according to any of the claims 1 to 7.
10. A non-transitory computer readable storage medium having stored thereon a computer program, which when executed by a processor, performs the steps of the method for generating high quality seeds in a fuzz test according to any of the claims 1 to 7.
CN202010124736.5A 2020-02-27 2020-02-27 Method and system for generating high-quality seeds in fuzzy test Pending CN111382067A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010124736.5A CN111382067A (en) 2020-02-27 2020-02-27 Method and system for generating high-quality seeds in fuzzy test

Publications (1)

Publication Number Publication Date
CN111382067A (en) 2020-07-07

Family

ID=71218617

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010124736.5A Pending CN111382067A (en) 2020-02-27 2020-02-27 Method and system for generating high-quality seeds in fuzzy test

Country Status (1)

Country Link
CN (1) CN111382067A (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105022958A (en) * 2015-07-11 2015-11-04 复旦大学 Android application used application program vulnerability detection and analysis method based on code library security specifications

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
YAOWEN ZHENG ET AL: "An Efficient Greybox Fuzzing Scheme for Linux-based IoT Programs Through Binary Static Analysis", 《2019 IEEE 38TH INTERNATIONAL PERFORMANCE COMPUTING AND COMMUNICATIONS CONFERENCE (IPCCC)》 *
Ren Yuzhu et al.: "A Survey of Taint Analysis Techniques", 《Journal of Computer Applications》 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111818071A (en) * 2020-07-15 2020-10-23 国家计算机网络与信息安全管理中心 Vehicle stain analysis method and device
CN111881039A (en) * 2020-07-24 2020-11-03 广州大学 Seed processing method and system for fuzz test, fuzz test method and system and storage medium
CN112632557A (en) * 2020-12-22 2021-04-09 厦门大学 Kernel vulnerability mining method, medium, equipment and device based on fuzzy test
CN113419960A (en) * 2021-07-01 2021-09-21 中国人民解放军国防科技大学 Seed generation method and system for kernel fuzzy test of trusted operating system
CN113419960B (en) * 2021-07-01 2022-06-14 中国人民解放军国防科技大学 Seed generation method and system for kernel fuzzy test of trusted operating system
CN113746819A (en) * 2021-08-24 2021-12-03 中国科学院信息工程研究所 Binary software protocol detection load mining method and device
CN114944997A (en) * 2022-03-24 2022-08-26 浙江大华技术股份有限公司 Protocol detection method, protocol detection device and computer readable storage medium
CN114944997B (en) * 2022-03-24 2024-02-20 浙江大华技术股份有限公司 Protocol detection method, protocol detection device and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20200707