Disclosure of Invention
In view of the above, one or more embodiments of the present disclosure provide an application software vulnerability scanning method and related apparatus, so as to solve the problem of low efficiency in detecting application software vulnerabilities.
Based on the above purpose, one or more embodiments of the present specification provide a method for scanning vulnerabilities of application software, including:
acquiring a program source code and program basic information of the application software;
performing instrumentation on the program source code;
matching the program basic information with a program information database to obtain a test seed;
mutating the test seed according to a preset mutation strategy to obtain a first test case;
performing a preset number of rounds of testing on the program source code using the first test case to obtain a crash result of the application software;
and determining the vulnerability of the application software according to the crash result of the application software.
Further, the performing a preset number of rounds of tests on the program source code through the first test case includes:
performing a first round of test on the program source code by using the first test case;
acquiring code coverage rate based on edges and blocks;
obtaining a mutation strategy for generating a new path according to the code coverage rate;
increasing, for the next round of testing, the number of times the mutation strategy that generates the new path is scheduled;
carrying out mutation on the test seed based on a mutation strategy for generating a new path to obtain a second test case;
and carrying out a new round of test on the program source code by using the second test case, and returning to the step of obtaining the code coverage rate based on the edges and the blocks to continue executing until the preset round number is reached.
Further, the determining the vulnerability of the application software according to the crash result of the application software further includes:
acquiring a first crash signal generated by an operating system kernel function when the application software crashes;
determining a second crash signal which is generated when the program source code crashes based on the program basic information;
in response to determining that the first crash signal is the same as the second crash signal, the crash result is a program source code vulnerability.
Further, the program basic information comprises one or more of a programming language type, a programming framework and a work-oriented type.
Further, matching the program basic information with a program information database to obtain a test seed, including:
the program information database selects a corresponding seed as the test seed based on one of the programming language type, the programming framework and the work-oriented type of the application software; or
the program information database selects corresponding seeds respectively based on a plurality of items among the programming language type, the programming framework and the work-oriented type of the application software, and takes their intersection as the test seed.
Further, the mutation strategy includes one or more of addition, multiplication, byte flipping, bit flipping, and byte setting.
Further, the instrumentation of the program source code includes:
performing lexical analysis and syntactic analysis on the program source code to obtain the position of a key code;
and inserting a probe at the position of the key code.
Based on the same inventive concept, one or more embodiments of the present specification further provide an electronic device, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, and the processor implements the method as described in any one of the above items when executing the program.
Based on the same inventive concept, one or more embodiments of the present specification also provide a non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the method as described in any one of the above items.
As can be seen from the above, in the application software vulnerability scanning method and related apparatus provided in one or more embodiments of the present specification, the test seed is obtained by matching the program basic information with the program information database, so that the test seed is more targeted to the application software under test. In addition, the code coverage rate can be improved continuously through a cyclic code coverage feedback mechanism, so that vulnerability detection efficiency is significantly improved.
Detailed Description
For the purpose of promoting a better understanding of the objects, aspects and advantages of the present disclosure, reference is made to the following detailed description taken in conjunction with the accompanying drawings.
It is to be noted that unless otherwise defined, technical or scientific terms used in one or more embodiments of the present specification should have the ordinary meaning as understood by those of ordinary skill in the art to which this disclosure belongs. The use of "first," "second," and similar terms in one or more embodiments of the specification is not intended to indicate any order, quantity, or importance, but rather is used to distinguish one element from another. The word "comprising" or "comprises", and the like, means that the element or item listed before the word covers the element or item listed after the word and its equivalents, but does not exclude other elements or items. The terms "connected" or "coupled" and the like are not restricted to physical or mechanical connections, but may include electrical connections, whether direct or indirect. "upper", "lower", "left", "right", and the like are used only to indicate relative positional relationships, and when the absolute position of the object being described is changed, the relative positional relationships may also be changed accordingly.
As described in the background section, existing application software vulnerability scanning methods still struggle to meet requirements in terms of detection efficiency. In the course of arriving at the present disclosure, the applicant found that existing application software vulnerability scanning methods mainly have the following problems: there is no good method for selecting test seeds, which makes fuzz testing of the application software inefficient; and there is no good way of distinguishing mutation methods that generate more new paths from those that generate fewer, which further reduces the efficiency of fuzz testing the application software.
To solve the above technical problem, one or more embodiments of the present specification provide, with reference to Fig. 1, an application software vulnerability scanning method, which may be performed by a fuzz tester and includes the following steps:
Step S101: acquiring the program source code and program basic information of the application software.
In this step, a git command for a code hosting platform (for example, GitHub) may be used to pull the project files of the application software directly into the input folder of the fuzz tester, thereby obtaining the program source code; alternatively, the program source code of the application software may be downloaded over the Internet and the path where it is stored may be provided to the fuzz tester.
The program basic information comprises one or more of a programming language type, a programming framework and a work-oriented type.
Step S102: instrumenting the program source code.
In this step, the program source code may be compiled with the compiler provided by the fuzz tester. During compilation, lexical analysis and syntactic analysis are performed on the program source code to locate the key code. Probes are then inserted at the key code positions to collect information, without altering the original logic of the program source code under test. Instrumenting the program source code facilitates the subsequent code coverage statistics.
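The idea of locating key code by syntactic analysis and inserting probes there can be illustrated with a minimal sketch. It assumes, purely for illustration, that the program under test is Python source, that the "key code" is every function entry and every branch of an if statement, and that the probe is a hypothetical function named probe_hit; the disclosure itself does not fix any of these choices.

```python
import ast


class ProbeInserter(ast.NodeTransformer):
    """Insert probe_hit(<id>) calls at function entries and if/else branches."""

    def __init__(self):
        self.next_id = 0

    def _probe(self):
        # Build the statement `probe_hit(<id>)` with a fresh probe id.
        call = ast.Call(func=ast.Name(id="probe_hit", ctx=ast.Load()),
                        args=[ast.Constant(self.next_id)], keywords=[])
        self.next_id += 1
        return ast.Expr(call)

    def visit_FunctionDef(self, node):
        entry_probe = self._probe()
        self.generic_visit(node)          # instrument nested statements
        node.body.insert(0, entry_probe)  # probe at function entry
        return node

    def visit_If(self, node):
        then_probe = self._probe()
        else_probe = self._probe() if node.orelse else None
        self.generic_visit(node)
        node.body.insert(0, then_probe)
        if else_probe is not None:
            node.orelse.insert(0, else_probe)
        return node


def run_instrumented(source, func_name, *args):
    """Parse, instrument and run `func_name`; return (result, probe ids hit)."""
    tree = ProbeInserter().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    hits = []
    env = {"probe_hit": hits.append}  # the probe only records its id
    exec(compile(tree, "<instrumented>", "exec"), env)
    return env[func_name](*args), hits


SAMPLE = '''
def classify(x):
    if x > 0:
        return "pos"
    else:
        return "non-pos"
'''

result, hits = run_instrumented(SAMPLE, "classify", 5)
```

The probes only append an identifier to a list, so the return value of the instrumented function is the same as that of the original; the recorded identifiers are what a later coverage calculation would consume.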
Step S103: matching the program basic information with a program information database to obtain a test seed.
In this step, the program basic information obtained in step S101 is used to select a corresponding test seed in the program information database according to one or more of the programming language type, the programming framework, and the work-oriented type of the application software.
Specifically, when the program basic information of the application software comprises one of the programming language type, the programming framework and the work-oriented type, the correspondingly selected seed is taken directly as the test seed; when the program basic information of the application software comprises multiple items among the programming language type, the programming framework and the work-oriented type, corresponding seeds are selected for each item respectively, and their intersection is taken as the test seed required by the fuzz test.
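The single-item and multi-item cases described above can be sketched as a set intersection. The database layout, the key names ("language", "framework", "domain") and the seed file names below are hypothetical and serve only to illustrate the selection rule.

```python
# Hypothetical program information database: (item kind, value) -> seed names.
SEED_DB = {
    ("language", "C"): {"long_string.bin", "small.tiff", "nested_ifd.tiff"},
    ("framework", "libjpeg"): {"small.jpg", "small.tiff"},
    ("domain", "graphics and image processing"):
        {"small.jpg", "small.tiff", "nested_ifd.tiff"},
}


def select_test_seeds(basic_info, seed_db):
    """Return the seeds matching every known item of the program basic information.

    With one matching item the seeds for that item are returned directly;
    with several matching items the intersection of their seed sets is returned.
    """
    matched = [seed_db[item] for item in basic_info.items() if item in seed_db]
    if not matched:
        return set()
    return set.intersection(*matched)


info = {"language": "C", "domain": "graphics and image processing"}
test_seeds = select_test_seeds(info, SEED_DB)
```

An empty intersection is possible with this rule; how such a case should be handled (for example, falling back to the seeds of a single item) is not addressed by the description and is left open here.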
Step S104: mutating the test seed according to a preset mutation strategy to obtain a first test case.
In this step, the mutation strategy includes one or more of addition, multiplication, byte flipping, bit flipping, byte setting, and the like. Presetting the mutation strategy prevents the fuzz tester from selecting mutation strategies at random, so that the test cases generated by mutation are more targeted.
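The five mutation strategies named above can be sketched as byte-level operations on a seed. The operand ranges, the "interesting" byte values and the choice to mutate a single randomly chosen position are assumptions made for this illustration; the description does not specify them.

```python
import random


def mutate_add(data, pos, rng):
    # Addition: add a small value to one byte, wrapping at 256.
    data[pos] = (data[pos] + rng.randint(1, 35)) % 256


def mutate_multiply(data, pos, rng):
    # Multiplication: multiply one byte by a small factor, wrapping at 256.
    data[pos] = (data[pos] * rng.randint(2, 4)) % 256


def mutate_byte_flip(data, pos, rng):
    # Byte flipping: invert all eight bits of one byte.
    data[pos] ^= 0xFF


def mutate_bit_flip(data, pos, rng):
    # Bit flipping: invert a single randomly chosen bit of one byte.
    data[pos] ^= 1 << rng.randrange(8)


def mutate_byte_set(data, pos, rng):
    # Byte setting: overwrite one byte with a boundary value.
    data[pos] = rng.choice((0x00, 0x7F, 0x80, 0xFF))


STRATEGIES = {
    "add": mutate_add,
    "multiply": mutate_multiply,
    "byte_flip": mutate_byte_flip,
    "bit_flip": mutate_bit_flip,
    "byte_set": mutate_byte_set,
}


def mutate(seed, strategy, rng):
    """Apply the named strategy at one random position and return a new test case."""
    data = bytearray(seed)
    if data:  # an empty seed has no position to mutate
        STRATEGIES[strategy](data, rng.randrange(len(data)), rng)
    return bytes(data)


rng = random.Random(0)
first_test_case = [mutate(b"II*\x00", name, rng) for name in STRATEGIES]
```

Here the first test case is modeled as a list with one mutated input per strategy, which matches the later description of feeding each element of the test case to the instrumented program.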
Step S105: performing a preset number of rounds of testing on the program source code using the first test case to obtain the crash result of the application software.
Further, with reference to Fig. 2, performing the preset number of rounds of testing on the program source code using the first test case includes the following steps:
Step S201: performing a first round of testing on the program source code using the first test case.
In this step, each element of the first test case is input, in a loop, into the instrumented program source code.
Step S202: acquiring the code coverage rate based on edges and blocks.
In this step, the code coverage rate, evaluated comprehensively over edges and blocks, is obtained from the instrumented code segments.
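One way such a combined figure could be computed from probe traces is sketched below. It assumes that each run yields an ordered list of block identifiers, that the total numbers of blocks and edges are known from instrumentation, and that the comprehensive evaluation is a weighted average of the two rates; the weighting is an assumption of this sketch, since the description does not give a formula.

```python
def coverage(traces, total_blocks, total_edges, block_weight=0.5):
    """Return (block coverage, edge coverage, combined coverage).

    traces: one list of block ids per executed test input, in execution order.
    An edge is a pair of consecutively executed blocks.
    """
    blocks, edges = set(), set()
    for trace in traces:
        blocks.update(trace)
        edges.update(zip(trace, trace[1:]))
    block_cov = len(blocks) / total_blocks
    edge_cov = len(edges) / total_edges
    combined = block_weight * block_cov + (1 - block_weight) * edge_cov
    return block_cov, edge_cov, combined


# Two runs through a small control-flow graph with 4 blocks and 5 edges.
block_cov, edge_cov, combined = coverage([[0, 1, 3], [0, 2, 3]],
                                         total_blocks=4, total_edges=5)
```

In this example every block is reached but one edge is not, which shows why edge coverage can distinguish test cases that block coverage alone treats as equivalent.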
Step S203: obtaining, according to the code coverage rate, the mutation strategy that generates a new path.
In this step, the mutation strategy that generates a new path, that is, a mutation strategy with a high code coverage rate, may be selected according to the key-value pairs formed by the code coverage rate and the corresponding mutation strategy.
Step S204: increasing, for the next round of testing, the number of times the mutation strategy that generates the new path is scheduled.
In this step, after a round of testing of the program source code is completed, the test cases obtained by different mutation strategies yield different code coverage rates. A mutation strategy that generates a new path is a mutation strategy that improves the code coverage rate. Therefore, in the next round of testing, the number of times such a mutation strategy is scheduled needs to be increased so as to improve the code coverage rate. The amount by which the scheduling count is increased may be chosen according to the actual situation and is not specifically limited herein.
Step S205: mutating the test seed based on the mutation strategy that generates the new path and its corresponding scheduling count to obtain a second test case.
In this step, among byte flipping, addition, multiplication, bit flipping and byte setting, the scheduling count of each mutation strategy that generated a new path is increased. The test seed is then mutated using the mutation strategies with their currently allocated scheduling counts, yielding a second test case that can further improve the code coverage rate.
Step S206: performing a new round of testing on the program source code using the second test case, and returning to step S202 to continue execution until the preset number of rounds is reached.
In this step, over the preset number of rounds of testing, the code coverage rate is adjusted and optimized continuously and finally converges to the highest code coverage rate achieved within the preset number of rounds, so that more crash results are obtained.
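The loop of steps S201 to S206 can be sketched as follows. The toy target, the two mutation functions, the initial scheduling count and the fixed increment are all hypothetical; the sketch only illustrates how a strategy that produces new paths in a round is scheduled more often in the next round.

```python
import random


def byte_set_zero(seed, rng):
    # Byte setting with a fixed value: always yields the same test input.
    return b"\x00" + seed[1:]


def random_byte(seed, rng):
    # Replace the first byte with a random value.
    return bytes([rng.randrange(256)]) + seed[1:]


def toy_target(case):
    # Hypothetical stand-in for the instrumented program: returns the
    # executed path and whether the run crashed. Expects a non-empty input.
    first = case[0]
    return (first > 0x7F, first % 2 == 0), first == 0xFF


def fuzz_rounds(seed, strategies, run_target, rounds, rng, base=4, bonus=2):
    schedule = {name: base for name in strategies}  # scheduling counts
    seen_paths, crashes = set(), []
    for _round in range(rounds):
        new_paths = {name: 0 for name in strategies}
        for name, mutate in strategies.items():
            for _ in range(schedule[name]):
                case = mutate(seed, rng)
                path, crashed = run_target(case)
                if path not in seen_paths:  # coverage feedback
                    seen_paths.add(path)
                    new_paths[name] += 1
                if crashed:
                    crashes.append(case)
        # Strategies that generated a new path are scheduled more next round.
        for name, found in new_paths.items():
            if found:
                schedule[name] += bonus
    return schedule, seen_paths, crashes


strategies = {"byte_set_zero": byte_set_zero, "random_byte": random_byte}
schedule, seen_paths, crashes = fuzz_rounds(
    b"\x01AB", strategies, toy_target, rounds=3, rng=random.Random(0))
```

In this toy setup the fixed-value strategy can discover a new path only in the first round, so its scheduling count is raised once and then stays constant, whereas the random strategy keeps being rewarded for as long as it reaches paths not seen before.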
Step S106: determining the vulnerability of the application software according to the crash result of the application software.
Next, a specific application scenario of the application software vulnerability scanning method of this embodiment is given. The application software is LibTIFF, a library for reading and writing the Tagged Image File Format (TIFF). The program source code of LibTIFF is downloaded over the Internet. The programming language type of LibTIFF is C, it further uses a jpeg library and a tiff library, and its work-oriented type is graphics and image processing. The LibTIFF program source code is compiled with the compiler provided by the fuzz tester, and each key tuple in the program source code is instrumented. The program information database then matches seeds using the keywords "C language" and "graphics and image processing". Because C programming is closer to the underlying system, seeds that readily trigger low-level operating system mechanisms are selected for this characteristic; because the jpeg library and the tiff library operate on pictures, the program information database selects corresponding picture seeds for this characteristic. The intersection of the seeds retrieved for the two characteristics is taken as the test seed. The test seed is mutated using addition, multiplication, byte flipping, bit flipping and byte setting to obtain a first test case.
Next, the program source code is tested for 20 rounds. A first round of testing is performed on the LibTIFF program source code using the first test case; the code coverage rate, evaluated comprehensively over edges and blocks, is obtained from the instrumented code segments; and the mutation strategy that generates a new path is selected according to the key-value pairs formed by the code coverage rate and the corresponding mutation strategy. The scheduling count of the mutation strategy that generates the new path is increased for the next round of testing. The test seed is then mutated again based on the mutation strategy that generates the new path and its corresponding scheduling count to obtain a second test case, and a new round of testing is performed on the program source code using the second test case. The process then returns to acquiring the code coverage rate based on edges and blocks and repeats until the 20 rounds of testing are completed. The crash result of LibTIFF is thereby obtained, and the vulnerability of LibTIFF is determined according to that crash result.
In one or more embodiments of the present disclosure, the determining the vulnerability of the application software according to the crash result of the application software in the foregoing steps may further include:
acquiring a first crash signal generated by an operating system kernel function when application software crashes;
determining a second crash signal which is correspondingly generated when the program source code crashes based on the program basic information;
in response to determining that the first crash signal is the same as the second crash signal, the crash result is a program source code vulnerability.
It can be understood that whether the current crash result is a program source code vulnerability is determined by checking whether the first crash signal, generated by the operating system when the application software crashes, is consistent with the second crash signal, generated when the program source code crashes. If they are consistent, the current crash result is a program source code vulnerability; otherwise, it is attributed to a vulnerability other than a program source code vulnerability. This further improves vulnerability detection efficiency. Correspondingly, the information security vulnerability present in the program source code can be inferred backwards from the information related to the second crash signal.
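The comparison of the two crash signals can be sketched as follows. The sketch assumes that the test process is launched with Python's subprocess module, which on POSIX systems reports termination by signal N as a return code of -N, and that the second crash signal is looked up from a hypothetical table keyed by programming language type; the actual mapping from program basic information to expected signals is not specified in the description.

```python
import signal

# Hypothetical table: signals that a crash in source code of the given
# programming language type is expected to raise.
EXPECTED_SIGNALS = {
    "C": {signal.SIGSEGV, signal.SIGABRT, signal.SIGFPE},
}


def first_crash_signal(returncode):
    """Translate a subprocess return code into the signal that ended the process."""
    if returncode is None or returncode >= 0:
        return None  # still running, or exited normally
    try:
        return signal.Signals(-returncode)
    except ValueError:
        return None  # not a signal number known on this platform


def classify_crash(returncode, language):
    """Compare the observed (first) signal with the expected (second) signals."""
    observed = first_crash_signal(returncode)
    if observed is None:
        return "no crash signal"
    if observed in EXPECTED_SIGNALS.get(language, set()):
        return "program source code vulnerability"
    return "other vulnerability"


# A return code as subprocess.run(...).returncode would report it on POSIX
# for a process terminated by SIGSEGV.
verdict = classify_crash(-int(signal.SIGSEGV), "C")
```

A signal outside the expected set, such as a termination request sent by another process, is classified separately, which mirrors the distinction drawn above between program source code vulnerabilities and other vulnerabilities.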
Therefore, in the application software vulnerability scanning method provided by one or more embodiments of the present specification, the program basic information of each application software is matched with the program information database to obtain a more targeted test seed, which reduces the impact of invalid test seeds on detection efficiency. The cyclic code coverage feedback mechanism further optimizes the code coverage rate over the preset number of test rounds, so that vulnerability detection efficiency is greatly improved. In addition, each crash is further analyzed to determine whether it is a program source code vulnerability or another vulnerability, which improves the accuracy of vulnerability detection.
It should be noted that the method of one or more embodiments of the present disclosure may be performed by a single device, such as a computer or server. The method of the embodiment can also be applied to a distributed scene and is completed by the mutual cooperation of a plurality of devices. In such a distributed scenario, one of the devices may perform only one or more steps of the method of one or more embodiments of the present disclosure, and the devices may interact with each other to complete the method.
It should be noted that the above description describes certain embodiments of the present disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
Based on the same inventive concept, corresponding to any of the above-mentioned embodiments, one or more embodiments of the present specification further provide an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor executes the computer program to implement the method for scanning vulnerabilities of application software according to any of the above-mentioned embodiments.
Fig. 3 is a schematic diagram illustrating a more specific hardware structure of an electronic device according to this embodiment, where the electronic device may include: a processor 1010, a memory 1020, an input/output interface 1030, a communication interface 1040, and a bus 1050. Wherein the processor 1010, memory 1020, input/output interface 1030, and communication interface 1040 are communicatively coupled to each other within the device via a bus 1050.
The processor 1010 may be implemented by a general-purpose CPU (Central Processing Unit), a microprocessor, an Application Specific Integrated Circuit (ASIC), or one or more integrated circuits, and is configured to execute related programs to implement the technical solutions provided in the embodiments of the present disclosure.
The Memory 1020 may be implemented in the form of a ROM (Read Only Memory), a RAM (Random Access Memory), a static storage device, a dynamic storage device, or the like. The memory 1020 may store an operating system and other application programs, and when the technical solution provided by the embodiments of the present specification is implemented by software or firmware, the relevant program codes are stored in the memory 1020 and called to be executed by the processor 1010.
The input/output interface 1030 is used for connecting an input/output module to input and output information. The input/output module may be configured as a component in the device (not shown in the drawings) or may be external to the device to provide a corresponding function. The input devices may include a keyboard, mouse, touch screen, microphone, various sensors, etc., and the output devices may include a display, speaker, vibrator, indicator light, etc.
The communication interface 1040 is used for connecting a communication module (not shown in the drawings) to implement communication interaction between the present apparatus and other apparatuses. The communication module can realize communication in a wired mode (for example, USB, network cable, etc.), and can also realize communication in a wireless mode (for example, mobile network, Wi-Fi, Bluetooth, etc.).
Bus 1050 includes a path that transfers information between various components of the device, such as processor 1010, memory 1020, input/output interface 1030, and communication interface 1040.
It should be noted that although the above-mentioned device only shows the processor 1010, the memory 1020, the input/output interface 1030, the communication interface 1040 and the bus 1050, in a specific implementation, the device may also include other components necessary for normal operation. In addition, those skilled in the art will appreciate that the above-described apparatus may also include only those components necessary to implement the embodiments of the present description, and not necessarily all of the components shown in the figures.
The electronic device of the foregoing embodiment is used to implement the corresponding method for scanning application software vulnerabilities in any of the foregoing embodiments, and has the beneficial effects of the corresponding method embodiment, which are not described herein again.
Based on the same inventive concept, corresponding to any of the above-mentioned embodiment methods, one or more embodiments of the present specification further provide a non-transitory computer-readable storage medium storing computer instructions for causing the computer to execute the application software vulnerability scanning method according to any of the above-mentioned embodiments.
Computer-readable media of the present embodiments, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read only memory (ROM), electrically erasable programmable read only memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device.
The computer instructions stored in the storage medium of the foregoing embodiment are used to enable the computer to execute the application software vulnerability scanning method according to any of the foregoing embodiments, and have the beneficial effects of the corresponding method embodiments, which are not described herein again.
Those of ordinary skill in the art will understand that the discussion of any embodiment above is meant to be exemplary only, and is not intended to imply that the scope of the disclosure, including the claims, is limited to these examples; within the spirit of the present disclosure, features from the above embodiments or from different embodiments may also be combined, steps may be implemented in any order, and there are many other variations of different aspects of one or more embodiments of the present description as described above, which are not provided in detail for the sake of brevity.
In addition, well-known power/ground connections to Integrated Circuit (IC) chips and other components may or may not be shown in the provided figures, for simplicity of illustration and discussion, and so as not to obscure one or more embodiments of the disclosure. Furthermore, devices may be shown in block diagram form in order to avoid obscuring the understanding of one or more embodiments of the present description, and this also takes into account the fact that specifics with respect to implementation of such block diagram devices are highly dependent upon the platform within which the one or more embodiments of the present description are to be implemented (i.e., specifics should be well within purview of one skilled in the art). Where specific details (e.g., circuits) are set forth in order to describe example embodiments of the disclosure, it should be apparent to one skilled in the art that one or more embodiments of the disclosure can be practiced without, or with variation of, these specific details. Accordingly, the description is to be regarded as illustrative instead of restrictive.
While the present disclosure has been described in conjunction with specific embodiments thereof, many alternatives, modifications, and variations thereof will be apparent to those skilled in the art in light of the foregoing description. For example, other memory architectures, such as Dynamic RAM (DRAM), may use the discussed embodiments.
It is intended that the one or more embodiments of the present specification embrace all such alternatives, modifications and variations as fall within the broad scope of the appended claims. Therefore, any omissions, modifications, substitutions, improvements, and the like that may be made without departing from the spirit or scope of the disclosure are intended to be included within the scope of the disclosure.