WO2022267343A1

WO2022267343A1 - Vulnerability detection method and device, and readable storage medium

Info

Publication number: WO2022267343A1
Application number: PCT/CN2021/134316
Authority: WO
Inventors: 刘宇滨
Original assignee: 深圳前海微众银行股份有限公司
Priority date: 2021-06-25
Filing date: 2021-11-30
Publication date: 2022-12-29
Also published as: CN113342673A

Abstract

The present application discloses a vulnerability detection method and device, and a readable storage medium. The method comprises the steps of: obtaining original taint data corresponding to a preset user request; deduplicating the original taint data on the basis of a preset array and a preset hash algorithm to obtain target taint data; obtaining a function call stack corresponding to the target taint data, the function call stack being a record of the functions called when a preset application program responds to the preset user request; obtaining a function to be detected in the function call stack; and comparing said function to a preset dangerous function to obtain a program internal vulnerability detection result, wherein the preset dangerous function is used for performing vulnerability detection on the preset application program so as to determine whether the preset application program has a vulnerability.

Description

Vulnerability detection method, device and readable storage medium

This application claims priority to a Chinese patent application with application number 202110716702.X filed on June 25, 2021, the entire contents of which are incorporated herein by reference.

Technical field

The present application relates to the field of information security technology of financial technology (Fintech), in particular to a vulnerability detection method, device and readable storage medium.

Background technique

With the continuous development of financial technology, especially Internet-based finance, more and more technologies (such as distributed computing and artificial intelligence) are being applied in the financial field. The financial industry in turn places higher requirements on these technologies, including on information security.

Currently, black-box testing is used to test whether an application program has vulnerabilities. Specifically, during testing the application program is treated as a black box that cannot be opened, and its internal structure is not considered.

That is, black-box testing focuses on the external structure of the application and tests only from the user's perspective, starting from the correspondence between input data and output data, without considering the internal structure of the application. As a result, the accuracy of vulnerability detection for the application is low.

Technical problem

The main purpose of this application is to provide a vulnerability detection method, device and readable storage medium, aiming at solving the existing technical problem of how to improve the accuracy of vulnerability detection for application programs.

Technical solution

In order to achieve the above purpose, the application provides a vulnerability detection method, the vulnerability detection method includes the steps:

Obtain the original tainted data corresponding to the preset user request;

Deduplicating the original tainted data based on a preset array and a preset hash algorithm to obtain target tainted data;

Obtaining a function call stack corresponding to the target taint data, wherein the function call stack is a record of calling a function when a preset application program responds to the preset user request;

Obtain the function to be detected in the function call stack;

Comparing the function to be detected with a preset dangerous function to obtain a program internal vulnerability detection result, wherein the preset dangerous function is used to perform vulnerability detection on the preset application program to determine whether the preset application program has a vulnerability.

In an embodiment, the acquiring the original tainted data corresponding to the preset user request includes:

Instrumenting the bytecode of a preset sensitive function to obtain taint source data;

Eliminating non-user-input taint data from the taint source data to obtain the original taint data.

In one embodiment, the preset hash algorithm is composed of a preset number of mutually independent hash algorithms, the original taint data is a set of original taint data, and the deduplicating of the original taint data based on the preset array and the preset hash algorithm to obtain the target taint data includes:

traverse the original taint data set;

When traversing to an original taint data each time, calculate the original taint data based on each of the hash algorithms to obtain the preset number of hash values;

Obtain the array element whose index is the same as the hash value in the preset array, and calculate the total product of the array elements;

judging whether the total product is zero;

If the total product is zero, set each of the array elements that is not one to one, use the one original taint data as the target taint data, and return to the step of traversing the original taint data set;

If the total product is one, it is determined that the one original taint data is non-target taint data, and return to the step of traversing the original taint data set.

In one embodiment, the function to be detected is a set of functions to be detected, and the comparing of the function to be detected with the preset dangerous function to obtain the program internal vulnerability detection result includes:

Traversing the set of functions to be tested;

When traversing to a function to be detected each time, comparing the function to be detected with the preset dangerous function;

When the function to be detected hits the preset dangerous function, obtaining the weight of the hit preset dangerous function from a preset weight list, obtaining a detection intermediate result whose initial value is zero, accumulating the weight into the detection intermediate result to obtain an updated detection intermediate result, and returning to the step of traversing the set of functions to be detected until the traversal is completed, whereupon the updated detection intermediate result is used as the program internal vulnerability detection result;

When the function to be detected does not hit the preset dangerous function, return to the step of traversing the set of functions to be detected.

In one embodiment, the vulnerability detection method further includes:

Obtaining the Uniform Resource Locator URL corresponding to the preset user request;

Preprocessing the URL based on a preset regular expression;

After completing the preprocessing, determine whether a web application firewall (WAF) exists in the server corresponding to the URL;

If the WAF does not exist in the server, send a data test request to the server after Domain Name System (DNS) resolution succeeds;

If the preset return value fed back by the server is received, filtering the original request parameters in the preset user request to obtain the filtered request parameters;

A black-box vulnerability detection result is determined based on the filtered request parameters.

In an embodiment, the determining whether there is a WAF in the server corresponding to the URL includes:

constructing a normal request, and sending the normal request to the server corresponding to the URL to obtain the original page;

Constructing an abnormal request, sending the abnormal request to the server, and determining a response status corresponding to the abnormal request;

If the response status is response timeout, it is determined that there is a WAF in the server;

If the response status is that the response has not timed out, then obtain the abnormal page corresponding to the abnormal request;

Comparing the original page and the abnormal page, if the original page is the same as the abnormal page, there is a WAF in the server.
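
The WAF check described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the `send` callable (which returns the page body for a request and raises `TimeoutError` on a response timeout) and the name `waf_present` are hypothetical and introduced only for this example.

```python
from typing import Callable


def waf_present(send: Callable[[str], str],
                normal_request: str,
                abnormal_request: str) -> bool:
    """Return True if the server behind `send` appears to have a WAF.

    `send` is a hypothetical transport function (an assumption of this
    sketch): it returns the page body and raises TimeoutError on timeout.
    """
    # Normal request: its response is the original page.
    original_page = send(normal_request)
    try:
        # Abnormal request: a response timeout is treated as "WAF present".
        abnormal_page = send(abnormal_request)
    except TimeoutError:
        return True
    # No timeout: identical original and abnormal pages also indicate a WAF.
    return original_page == abnormal_page
```

Passing the transport in as a parameter keeps the decision logic separate from any particular HTTP client, so the three outcomes (timeout, identical page, different page) can be exercised without a real server.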

In an embodiment, the original request parameters are a set of original request parameters, and the filtering of the original request parameters in the preset user request includes:

traverse the original request parameter set;

Each time an original request parameter is traversed, send the preset user request to the server and obtain the original page fed back by the server; replace the original request parameter in the preset user request with a first random number to obtain a first post-replacement request, send the first post-replacement request to the server, and obtain a first result page fed back by the server;

If the first similarity between the original page and the first result page is greater than or equal to a first preset similarity threshold, replace the original request parameter in the preset user request with a second random number to obtain a second post-replacement request, and send the second post-replacement request to the server to obtain a second result page fed back by the server, wherein the first random number is different from the second random number;

If the second similarity between the first result page and the second result page is greater than or equal to a second preset similarity threshold, filter out the original request parameter and return to the step of traversing the original request parameter set.
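
The parameter filtering steps above can be sketched as follows. Several details are assumptions of this sketch rather than statements of the application: page similarity is approximated with `difflib.SequenceMatcher`, the `fetch` callable standing in for "send the request and read the page" is hypothetical, the default thresholds of 0.95 are arbitrary, and "filtering" a parameter is read as dropping it from the set that is tested further.

```python
import difflib
import random
from typing import Callable, Dict, Optional


def similarity(a: str, b: str) -> float:
    # Assumption: difflib's ratio stands in for the unspecified page similarity.
    return difflib.SequenceMatcher(None, a, b).ratio()


def filter_parameters(params: Dict[str, str],
                      fetch: Callable[[Dict[str, str]], str],
                      t1: float = 0.95,
                      t2: float = 0.95,
                      rng: Optional[random.Random] = None) -> Dict[str, str]:
    """Return the parameters that remain after filtering."""
    rng = rng or random.Random()
    kept = {}
    for name, value in params.items():
        # Original page for the unmodified preset user request.
        original_page = fetch(params)
        # Two different random replacement values (nine digits each).
        r1 = str(rng.randint(10**8, 10**9 - 1))
        r2 = r1
        while r2 == r1:
            r2 = str(rng.randint(10**8, 10**9 - 1))
        first_page = fetch({**params, name: r1})
        if similarity(original_page, first_page) >= t1:
            second_page = fetch({**params, name: r2})
            if similarity(first_page, second_page) >= t2:
                # Changing the value twice did not change the page: filter it out.
                continue
        kept[name] = value
    return kept
```

With a `fetch` whose output depends only on an `id` parameter, a call such as `filter_parameters({"id": "7", "utm": "abc"}, fetch)` would keep `id` and drop `utm`.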

In one embodiment, after determining the black-box vulnerability detection result based on the filtered request parameters, it further includes:

Acquiring the first score corresponding to the program internal vulnerability detection result;

Obtaining a second score corresponding to the black-box vulnerability detection result;

calculating the sum of the first score and the second score to obtain a total score;

If the total score is greater than the preset score threshold, it is determined that the preset application program has a vulnerability.
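
As a small illustration of the combined decision above (the function and parameter names are hypothetical, and the scale of the scores is not specified in the application):

```python
def application_has_vulnerability(first_score: float,
                                  second_score: float,
                                  score_threshold: float) -> bool:
    """Combine the program-internal and black-box scores.

    first_score:  score for the program internal vulnerability detection result
    second_score: score for the black-box vulnerability detection result
    """
    total_score = first_score + second_score
    # A vulnerability is reported only when the total exceeds the threshold.
    return total_score > score_threshold
```

Note that the comparison is strict: a total score equal to the threshold does not count as a vulnerability.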

In addition, in order to achieve the above object, the present application also provides a vulnerability detection device, which includes a memory, a processor, and a vulnerability detection program stored in the memory and executable on the processor. When the vulnerability detection program is executed by the processor, the steps of the above vulnerability detection method are implemented.

In addition, in order to achieve the above object, the present application also provides a computer-readable storage medium on which a vulnerability detection program is stored. When the vulnerability detection program is executed by a processor, the steps of the above vulnerability detection method are implemented.

Beneficial effect

In the prior art, black-box testing is used to detect vulnerabilities in applications, resulting in low accuracy of vulnerability detection. In contrast, this application: obtains the original taint data corresponding to a preset user request; performs deduplication processing on the original taint data based on a preset array and a preset hash algorithm to obtain target taint data; obtains the function call stack corresponding to the target taint data, wherein the function call stack is a record of the functions called when the preset application program responds to the preset user request; obtains the function to be detected in the function call stack; and compares the function to be detected with the preset dangerous function to obtain a program internal vulnerability detection result, wherein the preset dangerous function is used for performing vulnerability detection on the preset application program so as to determine whether the preset application program has a vulnerability. It can be understood that the process by which the preset application program responds to the preset user request is the process of handling that request according to the program's own internal structure. Therefore, by comparing the record of called functions with the preset dangerous function, this application goes deep into the internal structure of the preset application program to obtain the program internal vulnerability detection result, thereby improving the accuracy of vulnerability detection for the preset application program.

Description of drawings

Fig. 1 is a schematic flow chart of the first embodiment of the vulnerability detection method of the present application;

Fig. 2 is a schematic diagram illustrating an example of an array in the embodiment of the present application;

Fig. 3 is a schematic diagram illustrating the corresponding relationship between detected taint data and array elements in the embodiment of the present application;

Fig. 4 is a schematic diagram illustrating an example of marking the target taint data x3 as detected taint data in the embodiment of the present application;

FIG. 5 is a schematic structural diagram of the hardware operating environment involved in the solution of the embodiment of the present application.

The realization, functional features and advantages of the present application will be further described in conjunction with the embodiments and with reference to the accompanying drawings.

Embodiments of the present invention

It should be understood that the specific embodiments described here are only used to explain the present application, and are not intended to limit the present application.

The present application provides a vulnerability detection method. Referring to FIG. 1 , FIG. 1 is a schematic flowchart of a first embodiment of the vulnerability detection method of the present application.

The embodiment of the present application provides an embodiment of the vulnerability detection method. It should be noted that although a logical sequence is shown in the flowchart, in some cases the steps shown or described may be executed in an order different from the one shown here. The vulnerability detection method can be applied to a program module of the server that inspects traffic. For convenience of description, the executing entity is omitted when describing the steps of the vulnerability detection method below. The vulnerability detection method includes:

Step S10, acquiring the original taint data corresponding to the preset user request.

In this embodiment, the original taint data corresponding to the preset user request is obtained. The original taint data contains non-target taint data. It can be understood that non-target taint data corresponds to repeated detection results; therefore, only one copy of repeated taint data needs to be kept.

Further, before obtaining the original tainted data corresponding to the preset user request, it includes:

Step a, instrumenting the bytecode of the preset sensitive function to obtain the taint source data.

In this embodiment, instrumentation is performed on the bytecode of preset sensitive functions (functions in the preset application that have security vulnerabilities, for example, a function that does not intercept the dangerous system command rm -rf). Depending on when the classes of the preset application are loaded, there are two ways of performing the instrumentation; the difference in loading time is whether the class has already been loaded by the classloader at the time of instrumentation. If the class has not yet been loaded by the classloader, the instrumentation is performed before the class is loaded: specifically, before the bytecode of the class is loaded into the JVM (Java Virtual Machine), it is converted by the transform method of the transformer to add a hook point. The functions hooked by the hook point are those in the hook function list L1, and the hook function list L1 is the basis for judging whether a command execution vulnerability exists in the class. If the class has already been loaded by the classloader, the loaded class is instrumented through the transform method of the transformer: specifically, a hook point is added to the loaded class, and the functions hooked by the hook point are likewise those in the hook function list L1.

All data passing through such a class is regarded as harmful input, that is, taint source data.

Step b, removing non-user input taint data from the taint source data to obtain the original taint data.

In this embodiment, the taint source data includes user-controllable variables (user-input variables, such as the variables (parameters) in the preset user request). User-controllable variables represent the direct introduction of untrusted data or confidential data into the system. The taint source data also includes data not input by users; this part of the data does not affect the safe operation of the preset application, so it is not detected and is eliminated to improve detection efficiency.

Specifically, after the above instrumentation is completed, L1 is used to track the data flow of the variables in the preset user request so as to obtain the original taint data, which includes the parameters in the preset user request, the data generated by the function calls involved in the data flow, and so on.
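
Step b can be illustrated with a minimal sketch. The `TaintRecord` type, its `origin` field, and the origin labels used below are hypothetical; the application does not specify how taint source data is represented or how user input is recognized.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class TaintRecord:
    value: str
    origin: str  # hypothetical label describing where the value came from


def original_taint_data(taint_source: Iterable[TaintRecord],
                        user_origins: Set[str]) -> List[TaintRecord]:
    """Keep only records that originate from user input (step b)."""
    return [r for r in taint_source if r.origin in user_origins]


# Example: only the request parameter survives; the config value is dropped.
records = [
    TaintRecord("id=7", "request_parameter"),
    TaintRecord("/etc/app.conf", "server_config"),
]
user_only = original_taint_data(records, {"request_parameter"})
```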

Step S20, performing deduplication processing on the original tainted data based on a preset array and a preset hash algorithm to obtain target tainted data.

In this embodiment, deduplication processing is performed on the original tainted data, that is, non-target tainted data is not taken as a part of the target tainted data, so as to obtain target tainted data with no repeated tainted data.

Before performing deduplication processing on the original taint data, on the one hand, the original taint data is mapped to a hash value based on the preset hash algorithm, wherein the preset hash algorithm includes MD5 (Message-Digest Algorithm 5), SHA-1 (Secure Hash Algorithm 1), etc. Specifically, the original taint data is mapped by the preset hash algorithm to a hash value with a smaller data volume, and the hash values of different original taint data are unique.

It can be understood that mapping the original taint data to a hash value through the preset hash algorithm reduces the amount of data processed during deduplication, thereby simplifying the deduplication process and improving detection efficiency.

On the other hand, the array elements of the preset array are obtained, wherein the preset array stores hash value information, and the value range of the hash value of each hash algorithm corresponds to the number of array elements. For example, if the number of array elements in the preset array is 8, the value range of the hash value is 1-8. This ensures that each hash value corresponds to an index of the preset array; for example, if the hash value is 3, the array element at the third position of the preset array is obtained.
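
One way to obtain a hash value in the range 1 to m from a standard digest such as MD5 or SHA-1 is sketched below. Reducing the digest modulo the array size is an assumption of this sketch; the application only states that the value range of the hash matches the number of array elements. The function name `hash_to_index` is hypothetical.

```python
import hashlib


def hash_to_index(data: str, algorithm: str, m: int) -> int:
    """Map `data` to a 1-based index in the range 1..m.

    Assumption: the digest is interpreted as a big integer and reduced
    modulo m; the source does not say how the range is enforced.
    """
    digest = hashlib.new(algorithm, data.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % m + 1


# Example with the 8-element array mentioned above.
index_md5 = hash_to_index("id=7", "md5", 8)
index_sha1 = hash_to_index("id=7", "sha1", 8)
```

Both indexes fall between 1 and 8, and the same input always yields the same index for a given algorithm.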

Specifically, the original tainted data is deduplicated based on a preset array and a preset hash algorithm to obtain target tainted data.

Among them, the deduplication process is:

The preset hash algorithm is composed of a preset number of mutually independent hash algorithms, the original taint data is a set of original taint data, and the deduplication processing of the original taint data based on the preset array and the preset hash algorithm to obtain the target taint data includes:

Step c, traversing the original tainted data set.

In this embodiment, the original taint data set is traversed, that is, one original taint data is taken from the original taint data set at a time, and subsequent steps d-h are performed.

Step d, each time a piece of original tainted data is traversed, the original tainted data is calculated based on each of the hash algorithms to obtain the preset number of hash values.

In this embodiment, the preset hash algorithm used in the above process of mapping the original taint data includes a preset number of mutually independent hash algorithms. It can be understood that the greater the number of mutually independent hash algorithms, the more hash values describe each piece of original taint data and the more precise that description is, which improves the accuracy of judging whether taint data is repeated.

It should be noted that the execution of mutually independent hash algorithms requires the use of hardware resources of the machine, and the hardware resources are limited. Therefore, the hardware resources need to be considered when determining the specific value of the preset number.

Specifically, each time a piece of original tainted data is traversed, the original tainted data is calculated based on the preset number of hash algorithms to obtain a preset number of hash values.

Step e, obtaining the array element whose index is the same as the hash value in the preset array, and calculating the total product of the array elements;

Step f, judging whether the total product is zero;

Step g, if the total product is zero, set each of the array elements that is not one to one, use the one original taint data as the target taint data, and return to the step of traversing the original taint data set;

Step h, if the total product is one, determine that the one original taint data is non-target taint data, and return to the step of traversing the original taint data set.

In this embodiment, the preset array may be a bit array or a byte array. Taking the bit array as an example, the preset array is an m-bit bit array A; referring to FIG. 2, the initial values of the array elements are all zero. Assuming that the number of hash algorithms is k, mapping one piece of original taint data through the k hash algorithms yields k hash values; the k indexes of the preset array that equal these k hash values are determined, and the k corresponding array elements of the bit array A are obtained through those indexes. It should be noted that the embodiment of the byte array is basically the same as that of the bit array and is not repeated here.

Specifically, the original taint data is deduplicated based on the array elements to obtain the target taint data. The product of the array elements, that is, the total product, is calculated, and it is judged whether the total product is zero. If the total product is zero, the original taint data is not repeated taint data and needs to be detected, so it is taken as target taint data and the process returns to the above step of traversing the original taint data set. If the total product is one, the original taint data is determined to be non-target taint data, and the process returns to the above step of traversing the original taint data set. Referring to FIG. 3, taking three hash algorithms as an example, the array elements corresponding to the original taint data x1 and x2 are all 1, that is, the product of the array elements is 1, so x1 and x2 are taint data that has already been detected, namely non-target taint data.

It should be noted that every time a piece of target taint data is found, the bit array A is updated once. Specifically, referring to FIG. 4, x1 and x2 are taint data that has already been detected. For the original taint data x3, at least one of its corresponding array elements is zero, so the product of the array elements corresponding to x3 is zero, and x3 can therefore be determined to be target taint data. After x3 is determined to be target taint data, the array elements corresponding to it that are zero are changed to one, so as to mark x3 as detected taint data.

Specifically, for example, the preset array is an m-bit bit array A, and there are k mutually independent hash algorithms H1, H2, ..., Hk. The result range of each hash algorithm is 1-m, matching the number of bits of the bit array A, so that the result of each hash algorithm can be any index of A. To determine whether a piece of original taint data has already been detected, the k hash algorithms are applied to it to obtain k hash results y1, y2, y3, y4, ..., yk. The array elements A[y1], A[y2], A[y3], A[y4], ..., A[yk] at those indexes are obtained, and their product A[y1]*A[y2]*A[y3]*A[y4]*…*A[yk] is calculated. If the product is zero, the original taint data has not yet undergone vulnerability detection: all of these array elements that are not one are set to one, and vulnerability detection is performed on the original taint data. If the product is one, the original taint data has already undergone vulnerability detection and is not detected again.
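
Steps c to h can be sketched as follows. The structure described above behaves like a Bloom filter; the class name `TaintDeduplicator`, the choice of MD5, SHA-1 and SHA-256 as the k = 3 hash algorithms, the modulo reduction of each digest to the range 1 to m, and the default array size are all assumptions made for this illustration.

```python
import hashlib
from typing import Iterable, List, Sequence


class TaintDeduplicator:
    """Bit array A of m elements plus k hash algorithms (steps c to h)."""

    def __init__(self, m: int = 1 << 16,
                 algorithms: Sequence[str] = ("md5", "sha1", "sha256")):
        self.m = m
        self.algorithms = algorithms   # assumed stand-ins for H1..Hk
        self.bits = [0] * m            # all array elements start at zero

    def _hash_values(self, taint: str) -> List[int]:
        # Step d: one hash value in 1..m per algorithm (modulo is assumed).
        data = taint.encode("utf-8")
        return [int.from_bytes(hashlib.new(a, data).digest(), "big") % self.m + 1
                for a in self.algorithms]

    def is_target(self, taint: str) -> bool:
        ys = self._hash_values(taint)
        # Step e: total product A[y1] * A[y2] * ... * A[yk].
        product = 1
        for y in ys:
            product *= self.bits[y - 1]   # 1-based hash value, 0-based list
        if product == 0:
            # Step g: mark the data as detected and treat it as target data.
            for y in ys:
                self.bits[y - 1] = 1
            return True
        # Step h: product is one, so the data counts as already detected.
        return False


def deduplicate_taint(original: Iterable[str], m: int = 1 << 16) -> List[str]:
    dedup = TaintDeduplicator(m)
    return [t for t in original if dedup.is_target(t)]


targets = deduplicate_taint(["id=1", "name=bob", "id=1", "cmd=ls", "name=bob"])
```

In this example the repeated `id=1` and `name=bob` entries are dropped on their second appearance. As with any Bloom-filter-style structure, a repeated item is always recognized, but a distinct item can occasionally be reported as already detected when all of its array elements were set by other items; a larger m reduces that chance.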

Step S30, obtaining a function call stack corresponding to the target taint data, wherein the function call stack is a record of calling a function when a preset application program responds to the preset user request.

In this embodiment, the function call stack corresponding to the target taint data is obtained, wherein the function call stack is a record of the functions called when the preset application program responds to the preset user request; that is, the function call stack records the one or more functions through which the data of the preset user request flows.

The preset application program is a web application program installed on the server.

It should be noted that an application program is composed of classes, and a class is composed of functions. The process by which the preset application program responds to the preset user request is the process of calling the relevant functions to handle the preset user request. Specifically, when a class receives the parameters corresponding to the preset user request, the action of obtaining the function call stack corresponding to the preset user request is triggered.

Before the class receives the preset user request, the class needs to be instrumented. The purpose of the instrumentation is to add a hook point to the code of the class; through the hook point, the flow of the data corresponding to the preset user request within the class can be tracked, wherein the hooked function is a function in the function list used to judge whether a vulnerability exists. It can be understood that instrumentation achieves the purpose of going deep into the preset application program at the code level.

It should be noted that the data corresponding to the preset user request includes at least the request parameters of the preset user request. The preset user request is an http request; for example, the user's client requests a corresponding page file (such as a page file in html format) from the server through the request parameters of the http request. The data corresponding to the preset user request also includes the intermediate parameters generated while the preset application program, after receiving the request parameters, processes them and finally obtains the corresponding page file.

The data corresponding to the preset user request that is tracked through the hook point contains both taint data and non-taint data. When obtaining the function call stack, function call stacks are not obtained for all of this data: the taint data is first selected from it, and the taint data is then further screened to determine the taint data for which a function call stack needs to be obtained. Through this selection and screening, a small amount of taint data is chosen from a large amount of taint data and used to obtain the function call stack. Specifically, the target taint data corresponding to the preset user request, which has undergone the above selection and screening, is obtained, and the corresponding function call stack is obtained through the target taint data. This reduces the amount of data to be processed and thereby improves detection efficiency.

Step S40, obtaining the function to be detected in the function call stack;

Step S50, comparing the function to be detected with a preset dangerous function to obtain a program internal vulnerability detection result, wherein the preset dangerous function is used to perform vulnerability detection on the preset application program.

In this embodiment, vulnerability detection is performed on the preset application program based on the above function call stack to obtain the program internal vulnerability detection result; that is, the functions called by the preset application program are determined through the function call stack, and whether the preset application program has a vulnerability is determined according to the functions it calls.

Specifically, the function call involved in the above data flow is used for vulnerability detection, and the function call is recorded in the function call stack.

Specifically, the function to be detected in the function call stack is obtained, and vulnerability detection is performed on it to determine whether a security vulnerability exists in the preset application program. That is, the function to be detected is compared with the preset dangerous function (such as runtime( )) to obtain the program internal vulnerability detection result, wherein the preset dangerous function is used to perform vulnerability detection on the preset application program.

Further, the function to be detected is a set of functions to be detected, and the comparison of the function to be detected and the preset dangerous function to obtain a program internal vulnerability detection result includes:

Step i, traversing the set of functions to be detected;

In this embodiment, the set of functions to be detected is traversed so that one function to be detected is obtained from the set each time, and the following steps j to l are performed.

Step j, each time a function to be detected is traversed, comparing the function to be detected with the preset dangerous function;

Step k, when the function to be detected hits the preset dangerous function, obtaining the weight of the hit preset dangerous function from the preset weight list, obtaining the detection intermediate result whose initial value is zero, accumulating the weight into the detection intermediate result to obtain an updated detection intermediate result, and returning to the step of traversing the set of functions to be detected until the traversal is completed, whereupon the updated detection intermediate result is used as the program internal vulnerability detection result;

Step l, when the function to be detected does not hit the preset dangerous function, returning to the step of traversing the set of functions to be detected.

In this embodiment, each time a function to be detected is traversed, it is compared with the preset dangerous functions, which are recorded in a dangerous function list. In addition, a preset weight list W needs to be maintained. During the comparison, each time the function to be detected hits a function in the dangerous function list, the weight of that function in the preset weight list W is obtained, together with the detection intermediate result whose initial value is zero, and the weight is recorded into the detection intermediate result. Recording here means accumulating the weight into the detection intermediate result to obtain the updated detection intermediate result, and then returning to the step of traversing the set of functions to be detected. When the traversal ends, the final updated detection intermediate result is used as the program internal vulnerability detection result. It can be understood that each accumulation adds the weight obtained in the current round of traversal to the updated detection intermediate result obtained in the previous round. When the function to be detected does not hit a preset dangerous function during the comparison, the process returns directly to the step of traversing the set of functions to be detected.

Specifically, after all functions to be detected in the function call stack have been traversed and compared, the final detection intermediate result, that is, the program internal vulnerability detection result, is obtained. It can be understood that this result is the total weight Q.

Specifically, whether the preset application program has a vulnerability is determined from the total weight Q by comparing Q with the preset weight threshold P. If Q is greater than P, the preset application program has a security vulnerability; if Q is less than or equal to P, the preset application program may still have a vulnerability, and further detection by the black box scanner is needed.
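The weight accumulation of steps i to l and the comparison of Q with P can be sketched roughly as follows. This is only an illustrative sketch in Python, not the implementation of the application: the function names, the entries of the weight list W and the example call stack are all hypothetical.

```python
from typing import Iterable, Mapping


def total_weight(functions_to_detect: Iterable[str],
                 weight_list: Mapping[str, int]) -> int:
    """Accumulate the weight of every function that hits the dangerous function list."""
    q = 0  # detection intermediate result, initial value zero
    for name in functions_to_detect:
        weight = weight_list.get(name)
        if weight is None:
            continue  # no hit: return directly to the traversal
        q += weight  # hit: accumulate the weight into the intermediate result
    return q


def has_vulnerability(q: int, p: int) -> bool:
    """Q greater than P means a vulnerability; otherwise black-box scanning follows."""
    return q > p


# Hypothetical weight list W (dangerous function -> weight) and call stack.
W = {"runtime": 5, "exec": 3}
stack = ["parseRequest", "runtime", "render", "exec"]
Q = total_weight(stack, W)
```

In this sketch the dangerous function list and the weight list W are merged into one mapping, so a lookup miss stands for "does not hit the preset dangerous function".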

The original taint data used in the above vulnerability detection process is obtained by a data flow tracking agent. Specifically, before a class of the preset application program is loaded into the JVM, an interceptor is generated through the Instrument API (Application Programming Interface) of the JDK (Java Development Kit) to modify the definition of the class before the program starts, and a data flow tracking agent is generated in the running application program. The data flow tracking agent obtains the context of the preset application program, the data flow is analyzed according to the context, the function call stack is extracted according to the data flow, and the program internal vulnerability detection result is obtained to determine whether the preset application program has a vulnerability.

Further, for the process of detecting vulnerabilities through a black box scanner, specifically, the vulnerability detection method further includes:

Step m, obtaining the Uniform Resource Locator URL corresponding to the preset user request;

Step n, preprocessing the URL based on a preset regular expression;

Step o, after completing the preprocessing, determine whether there is a website application level intrusion prevention system WAF in the server corresponding to the URL;

In this embodiment, the black box scanner first obtains the URL (Uniform Resource Locator) corresponding to the preset user request, and then uses the preset regular expression to judge whether the URL is legal. If it is legal, subsequent processing is performed; if it is not, the vulnerability detection process ends. The preset regular expression is (http|https)://[-A-Za-z0-9+&@#/%?=~_|!:,.;]+[-A-Za-z0-9+&@#/%=~_|]. For example, the URL https://www.baidu.com matches the preset regular expression, so the preprocessing result is that the URL is legal. As another example, hjttps://www.baidu.com is an incorrect URL: "hjttps" is neither "http" nor "https", so it cannot match the preset regular expression, and the preprocessing result is that the URL is illegal.
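The URL preprocessing can be illustrated with the preset regular expression given above. The following Python sketch is illustrative only; the function name is hypothetical, and requiring the whole URL to match (rather than a substring) is an assumption about how the expression is applied.

```python
import re

# The preset regular expression from the description, compiled unchanged.
URL_PATTERN = re.compile(
    r"(http|https)://[-A-Za-z0-9+&@#/%?=~_|!:,.;]+[-A-Za-z0-9+&@#/%=~_|]"
)


def is_legal_url(url: str) -> bool:
    """Return True when the entire URL matches the preset regular expression."""
    return URL_PATTERN.fullmatch(url) is not None
```

With this sketch, is_legal_url("https://www.baidu.com") is true and is_legal_url("hjttps://www.baidu.com") is false, which corresponds to the two examples in the text.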

After the preprocessing is completed and the format is confirmed to be correct, it is necessary to first determine whether a WAF (Web Application Firewall, website application-level intrusion prevention system) exists in the server corresponding to the URL, so that the subsequent vulnerability detection process can proceed smoothly. If a WAF exists, it intercepts abnormal requests, and the subsequent vulnerability detection cannot be performed through the black box scanner directly. Therefore, if a WAF exists in the server, the subsequent vulnerability detection process is performed after bypassing the WAF; if no WAF exists in the server, the subsequent vulnerability detection process is performed directly. Bypass methods include encoding bypass, capitalization bypass, space filtering bypass, and so on.

Wherein, the determining whether there is a WAF in the server corresponding to the URL includes:

Step o1, constructing a normal request, and sending the normal request to the server corresponding to the URL to obtain the original page;

Step o2, constructing an abnormal request, sending the abnormal request to the server, and determining the response status corresponding to the abnormal request;

Step o3, if the response status is response timeout, then determine that there is a WAF in the server;

Step o4, if the response status is that the response has not timed out, then obtain the abnormal page corresponding to the abnormal request;

Step o5, comparing the original page and the abnormal page, if the original page is the same as the abnormal page, there is a WAF in the server.

In this embodiment, the black box scanner determines whether a WAF exists in the server by comparing the similarity between the page returned for a normal request and the page returned for an abnormal request. Specifically, the black box scanner first constructs a normal request, sends it to the server corresponding to the URL, and obtains the original page with which the preset application program responds to the normal request. It then constructs an abnormal request corresponding to the normal request, sends it to the server, and determines the response status corresponding to the abnormal request. If the response status is response timeout, a WAF exists in the server. If the response has not timed out, the abnormal page with which the preset application program responds to the abnormal request is obtained.

Comparing the original page and the abnormal page, if the original page is the same as the abnormal page, there is a WAF in the server; if the original page is different from the abnormal page, there is no WAF in the server.
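Steps o1 to o5 can be sketched as a small decision function. The Python sketch below is illustrative only: the name waf_present and the fetch callable are hypothetical, and it is assumed that fetch returns the page text or None when the response times out.

```python
from typing import Callable, Optional

# Assumption: fetch(request) returns the page text, or None on response timeout.
Fetch = Callable[[str], Optional[str]]


def waf_present(fetch: Fetch, normal_request: str, abnormal_request: str) -> bool:
    """Decide whether a WAF exists, following steps o1 to o5."""
    original_page = fetch(normal_request)    # step o1: obtain the original page
    abnormal_page = fetch(abnormal_request)  # step o2: send the abnormal request
    if abnormal_page is None:
        return True                          # step o3: timeout means a WAF exists
    # steps o4 and o5: identical pages mean the abnormal request was intercepted
    return abnormal_page == original_page
```

Passing the transport in as a callable keeps the decision logic separate from the HTTP client, so the same logic can be exercised against recorded responses.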

Step p, if the WAF does not exist in the server, sending a data test request to the server after the Domain Name System (DNS) resolves successfully.

In this embodiment, if it is determined that the server has no WAF, or the WAF has been bypassed, vulnerability detection continues. This step checks network stability to decide whether to end the vulnerability detection process or to perform its subsequent steps. It can be understood that the black box scanner detects vulnerabilities through input and output analysis: after sending a request to the server, it receives the response fed back by the server, and detects vulnerabilities of the server according to the request and the response.

Specifically, the network stability check sends a request to the destination URL (such as the URL corresponding to the server) and judges whether the network is stable according to the data packet returned for that request. DNS (Domain Name System) resolution is first performed on the URL to determine whether it succeeds. If resolution fails, the website cannot be connected. If resolution succeeds, a data test request is sent to the server, and when the request succeeds a corresponding return data packet is received. If the return value in the return data packet is the preset return value, that is, it is not an HTTP error, the database is then used to identify whether the return data packet contains an error; if it does not, the website can be connected. If the return value is an HTTP error, or no return data packet is received, the website cannot be connected.
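The order of the network stability checks can be sketched as follows. This Python sketch is illustrative only: the three callables (resolve, send_test_request, packet_has_error) are hypothetical stand-ins for the DNS lookup, the data test request and the database-based error identification, and treating a status code of 400 or above as an HTTP error is an assumption.

```python
from typing import Callable, Optional, Tuple
from urllib.parse import urlsplit

Packet = Tuple[int, str]  # assumed shape of a return data packet: (status, body)


def website_reachable(url: str,
                      resolve: Callable[[str], bool],
                      send_test_request: Callable[[str], Optional[Packet]],
                      packet_has_error: Callable[[str], bool]) -> bool:
    """Return True when the website can be connected."""
    host = urlsplit(url).hostname
    if host is None or not resolve(host):
        return False                  # DNS resolution failed
    packet = send_test_request(url)
    if packet is None:
        return False                  # no return data packet
    status, body = packet
    if status >= 400:
        return False                  # return value is an HTTP error (assumed cutoff)
    return not packet_has_error(body)  # error identification on the packet
```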

Step q: If the preset return value fed back by the server is received, filter the original request parameters in the preset user request to obtain filtered request parameters.

In this embodiment, after the preset return value fed back by the server is received, that is, after the network stability check has determined that the website can be connected, each parameter in the preset user request is checked for whether it is repeated and whether it needs to be detected. Specifically, if a parameter is a repeated parameter or a parameter that does not need to be detected, it is filtered out; if a parameter is not repeated and needs to be detected, detection of that parameter continues. It can be understood that filtering out repeated parameters and parameters that do not need to be processed reduces the workload of vulnerability detection and thereby improves its efficiency.

Specifically, for repeated parameters, its embodiment is basically the same as the embodiment of performing deduplication processing on original tainted data in the above vulnerability detection method, and will not be repeated here.

Further, for non-repetitive parameters, the specific filtering process is as follows:

The original request parameters are a set of original request parameters, and the filtering of the original request parameters in the preset user request includes:

Step q1, traversing the original request parameter set;

Step q2, each time an original request parameter is traversed, sending the preset user request to the server and obtaining the original page fed back by the server; replacing the original request parameter in the preset user request with a first random number to obtain a first post-replacement request; and sending the first post-replacement request to the server to obtain the first result page fed back by the server;

Step q3, if the first similarity between the original page and the first result page is greater than or equal to a first preset similarity threshold, replacing the original request parameter in the preset user request with a second random number to obtain a second post-replacement request, and sending the second post-replacement request to the server to obtain a second result page fed back by the server, wherein the first random number is different from the second random number;

Step q4, if the second similarity between the first result page and the second result page is greater than or equal to a second preset similarity threshold, filtering the original request parameter and returning to the step of traversing the original request parameter set.

In this embodiment, the original request parameter set is traversed so that one original request parameter is obtained from the set each time, and steps q2 to q4 are performed. Each time an original request parameter is traversed, the black box scanner sends the preset user request to the server and obtains the original response returned by the server, that is, the original page. The original request parameter in the preset user request is then replaced with the first random number to obtain the first post-replacement request, which is sent to the server to obtain the first result page R1. The first similarity between the original page and R1 is determined. If the first similarity is smaller than the first preset similarity threshold, the parameter cannot be filtered. If the first similarity is greater than or equal to the first preset similarity threshold, the original request parameter is replaced with a second random number different from the first random number to obtain the second post-replacement request, which is sent to the server to obtain the second result page R2. The second similarity between R1 and R2 is determined. If the second similarity is less than the second preset similarity threshold, the original request parameter cannot be filtered. If the second similarity is greater than or equal to the second preset similarity threshold, the original request parameter can be filtered; after it is filtered, the process returns to the step of traversing the original request parameter set to process the next original request parameter.
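Steps q1 to q4 can be sketched as follows. This Python sketch is illustrative only: the description does not say how page similarity is computed, so difflib.SequenceMatcher is used here purely as an assumed stand-in, and the names filter_parameters and send, the 0.95 thresholds and the range of the random numbers are all hypothetical.

```python
import random
from difflib import SequenceMatcher
from typing import Callable, Dict, Optional


def similarity(a: str, b: str) -> float:
    """Assumed similarity measure; the description does not specify one."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def filter_parameters(params: Dict[str, str],
                      send: Callable[[Dict[str, str]], str],
                      t1: float = 0.95,
                      t2: float = 0.95,
                      rng: Optional[random.Random] = None) -> Dict[str, str]:
    """Return the request parameters that survive filtering (steps q1 to q4)."""
    rng = rng or random.Random()
    kept: Dict[str, str] = {}
    for name, value in params.items():                  # step q1: traverse
        original_page = send(params)                    # step q2: original page
        first, second = (str(n) for n in rng.sample(range(10**6), 2))  # distinct
        r1 = send({**params, name: first})              # first result page R1
        if similarity(original_page, r1) >= t1:         # step q3
            r2 = send({**params, name: second})         # second result page R2
            if similarity(r1, r2) >= t2:                # step q4
                continue                                # parameter is filtered out
        kept[name] = value                              # parameter cannot be filtered
    return kept
```

A parameter is dropped only when changing its value twice leaves the page essentially unchanged, which suggests the server ignores it and injection testing on it would be wasted work.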

Step r, determining a black-box vulnerability detection result based on the filtered request parameters.

In this embodiment, after the above parameter filtering, whether each filtered request parameter is a dynamic parameter is checked. If a filtered request parameter is a dynamic parameter, injection detection is performed on it and the detection result is recorded, thereby obtaining the black-box vulnerability detection result.

Further, after determining the black-box vulnerability detection result based on the filtered request parameters, it also includes:

Step s, obtaining the first score corresponding to the program internal vulnerability detection result;

Step t, obtaining a second score corresponding to the black-box vulnerability detection result;

Step u, calculating the sum of the first score and the second score to obtain the total score;

Step v, if the total score is greater than a preset score threshold, determining that there is a vulnerability in the preset application program.

In this embodiment, whether the preset application program has a vulnerability is determined according to both the program internal vulnerability detection result and the black-box vulnerability detection result. Specifically, the first score corresponding to the program internal vulnerability detection result and the second score corresponding to the black-box vulnerability detection result are obtained; that is, the preset application program is scored on whether it has vulnerabilities using both results. The sum of the first score and the second score is then calculated to obtain the total score, and the total score is compared with the preset score threshold. If the total score is greater than the preset score threshold, it is determined that the preset application program has a vulnerability; if the total score is less than or equal to the preset score threshold, the preset application program has no vulnerability.
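Steps s to v reduce to a sum and a strict comparison, which can be sketched as follows. The Python function name and the example values are hypothetical, and how each detection result is mapped to its score is not specified in the description.

```python
def combined_verdict(first_score: float,
                     second_score: float,
                     score_threshold: float) -> bool:
    """Return True when the total score exceeds the preset score threshold."""
    total_score = first_score + second_score  # step u: sum of the two scores
    return total_score > score_threshold      # step v: strictly greater than


# Hypothetical scores: internal detection 4, black-box detection 3, threshold 6.
verdict = combined_verdict(4, 3, 6)
```

Note that a total score exactly equal to the threshold is treated as "no vulnerability", matching the text above.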

It should be noted that, compared with performing vulnerability detection using only the program internal vulnerability detection result, combining it with detection through the black box scanner increases the diversity of vulnerability detection. Adding the black box scanner widens the detection range for the preset application program, so combining the program internal vulnerability detection result with the black-box vulnerability detection result improves the accuracy of vulnerability detection.

In the traditional black box detection method, dirty data is generated during detection and flows into the data generated during normal operation of the preset application program, polluting that data. To prevent the data generated during normal operation of the preset application program from being polluted, the dirty data can also be intercepted by a data interception agent.

Specifically, the system command issued by the black box scanner is intercepted by the data interception agent, so as to prevent the preset application program from executing the system command.

Specifically, similar to the data flow tracking agent, before a class of the preset application program is loaded into the JVM, an interceptor is generated through the Instrument API (Application Programming Interface) of the JDK (Java Development Kit) to modify the definition of the class before the program starts, and a data interception agent is generated in the running preset application program. The data interception agent intercepts the system commands issued by the black box scanner to the preset application program, that is, a system command is intercepted before the class executes it, so that the test data does not affect the server.

Compared with the prior art, in which black-box testing is used to detect vulnerabilities in application programs and the detection accuracy is therefore low, this embodiment obtains the original taint data corresponding to the preset user request; deduplicates the original taint data based on the preset array and the preset hash algorithm to obtain the target taint data; obtains the function call stack corresponding to the target taint data, wherein the function call stack is a record of the functions called when the preset application program responds to the preset user request; obtains the function to be detected in the function call stack; and compares the function to be detected with the preset dangerous function to obtain the program internal vulnerability detection result, wherein the preset dangerous function is used to perform vulnerability detection on the preset application program so as to determine whether the preset application program has a vulnerability. The present application deduplicates the original taint data through the preset array and the preset hash algorithm to obtain the target taint data, obtains the function call stack corresponding to the target taint data, and compares the recorded function calls with the preset dangerous function, thereby performing vulnerability detection on the preset application program and obtaining the program internal vulnerability detection result used to determine whether the preset application program has a vulnerability. It can be understood that the process by which the preset application program responds to the preset user request is the process of handling that request according to its own internal structure. Therefore, the present application goes deep into the internal structure of the preset application program to obtain the program internal vulnerability detection result, thereby improving the accuracy of vulnerability detection for the preset application program.

In addition, the present application also provides a vulnerability detection device, which includes:

The first obtaining module is used to obtain the original taint data corresponding to the preset user request;

A deduplication module, configured to perform deduplication processing on the original tainted data based on a preset array and a preset hash algorithm to obtain target tainted data;

The second obtaining module is configured to obtain a function call stack corresponding to the target taint data, wherein the function call stack is a record of calling a function when a preset application program responds to the preset user request;

The third obtaining module is used to obtain the function to be detected in the function call stack;

A comparison module, configured to compare the function to be detected with a preset dangerous function to obtain a program internal vulnerability detection result, wherein the preset dangerous function is used to perform vulnerability detection on the preset application program to determine whether the preset application program has a vulnerability.

In an embodiment, the first acquisition module is also used for:

In one embodiment, the deduplication module is also used for:

traverse the original taint data set;

judging whether the total product is zero;

In one embodiment, the comparison module is also used for:

Traversing the set of functions to be tested;

In one embodiment, the vulnerability detection device further includes:

A fourth obtaining module, configured to obtain the URL corresponding to the preset user request;

A preprocessing module, configured to preprocess the URL based on a preset regular expression;

The first determination module is used to determine whether there is a website application level intrusion prevention system WAF in the server corresponding to the URL after the preprocessing is completed;

A sending module, configured to send a data test request to the server after the domain name system DNS is successfully resolved if the WAF does not exist in the server;

A filtering module, configured to filter the original request parameters in the preset user request to obtain filtered request parameters if the preset return value fed back by the server is received;

The second determining module is configured to determine a black-box vulnerability detection result based on the filtered request parameters.

In an embodiment, the first determining module is also used for:

In one embodiment, the filtering module is also used for:

traverse the original request parameter set;

In one embodiment, the vulnerability detection device further includes:

The fifth obtaining module is used to obtain the first score corresponding to the program internal vulnerability detection result;

A sixth obtaining module, configured to obtain a second score corresponding to the black-box vulnerability detection result;

a calculation module, configured to calculate the sum of the first score and the second score to obtain a total score;

The third determining module is configured to determine that there is a vulnerability in the preset application program if the total score is greater than a preset score threshold.

The specific implementation manners of the vulnerability detection device of the present application are basically the same as the above-mentioned embodiments of the vulnerability detection method, and will not be repeated here.

In addition, the present application also provides a vulnerability detection device. As shown in FIG. 5 , FIG. 5 is a schematic structural diagram of a hardware operating environment involved in the solution of the embodiment of the present application.

It should be noted that FIG. 5 is a schematic structural diagram of a hardware operating environment of a vulnerability detection device.

As shown in FIG. 5, the vulnerability detection device may include: a processor 1001 (such as a CPU), a memory 1005, a user interface 1003, a network interface 1004, and a communication bus 1002. The communication bus 1002 is used to realize connection and communication between these components. The user interface 1003 may include a display screen (Display) and an input unit such as a keyboard (Keyboard); optionally, the user interface 1003 may also include a standard wired interface and a wireless interface. Optionally, the network interface 1004 may include a standard wired interface and a wireless interface (such as a WI-FI interface). The memory 1005 may be a high-speed RAM memory, or a stable memory (non-volatile memory), such as a disk memory. Optionally, the memory 1005 may also be a storage device independent of the aforementioned processor 1001.

In one embodiment, the vulnerability detection device may also include RF (Radio Frequency, radio frequency) circuits, sensors, audio circuits, WiFi modules, etc.

Those skilled in the art can understand that the structure of the vulnerability detection device shown in FIG. 5 does not constitute a limitation on the vulnerability detection device, which may include more or fewer components than those illustrated, combine certain components, or use a different arrangement of components.

As shown in FIG. 5 , the memory 1005 as a computer storage medium may include an operating system, a network communication module, a user interface module, and a vulnerability detection program. Wherein, the operating system is a program that manages and controls the hardware and software resources of the vulnerability detection device, and supports the operation of the vulnerability detection program and other software or programs.

In the vulnerability detection device shown in FIG. 5, the user interface 1003 is mainly used to connect to the terminal and perform data communication with the terminal, such as receiving a request sent by the terminal; the network interface 1004 is mainly used to connect to the background server and perform data communication with the background server; and the processor 1001 can be used to call the vulnerability detection program stored in the memory 1005 and execute the steps of the above-mentioned vulnerability detection method.

The specific implementation manners of the vulnerability detection device of the present application are basically the same as the embodiments of the above vulnerability detection method, and will not be repeated here.

In addition, the embodiment of the present application also proposes a computer-readable storage medium on which a vulnerability detection program is stored, and when the vulnerability detection program is executed by a processor, the steps of the above-mentioned vulnerability detection method are implemented.

The specific implementation manners of the computer-readable storage medium of the present application are basically the same as the above-mentioned embodiments of the vulnerability detection method, and will not be repeated here.

It should be noted that, in this document, the terms "comprising", "including" or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article or apparatus comprising a set of elements includes not only those elements but also other elements not expressly listed, or elements inherent in the process, method, article, or apparatus. Without further limitations, an element defined by the phrase "comprising a ..." does not preclude the presence of additional identical elements in the process, method, article, or apparatus comprising that element.

The serial numbers of the above embodiments of the present application are for description only, and do not represent the advantages and disadvantages of the embodiments.

Through the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus a necessary general-purpose hardware platform, and of course also by hardware, but in many cases the former is the better implementation. Based on this understanding, the essence of the technical solution of this application, or the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, disk, CD-ROM) and includes several instructions to enable a terminal device (which may be a mobile phone, computer, server, device, or network device, etc.) to execute the methods described in the various embodiments of the present application.

The above are only preferred embodiments of the present application and are not intended to limit the patent scope of the present application. All equivalent structures or equivalent process transformations made by using the contents of the specification and drawings of this application, or directly or indirectly used in other related technical fields, are likewise included in the patent protection scope of the present application.

Claims

A vulnerability detection method, wherein, the vulnerability detection method comprises the following steps:

Obtain the original tainted data corresponding to the preset user request;

Deduplicating the original tainted data based on a preset array and a preset hash algorithm to obtain target tainted data;

Obtaining a function call stack corresponding to the target taint data, wherein the function call stack is a record of calling a function when a preset application program responds to the preset user request;

Obtain the function to be detected in the function call stack;

Comparing the function to be detected with the preset dangerous function to obtain a program internal vulnerability detection result, wherein the preset dangerous function is used to perform vulnerability detection on the preset application program to determine whether the preset application program has a vulnerability.
The vulnerability detection method according to claim 1, wherein said acquiring the original tainted data corresponding to the preset user request comprises:

Insert the bytecode of the preset sensitive function to obtain the taint source data;

Eliminating non-user-input taint data from the taint source data to obtain the original taint data.
The vulnerability detection method according to claim 1, wherein the preset hash algorithm is composed of a preset number of mutually independent hash algorithms, the original taint data is a set of original taint data, and the deduplicating of the original tainted data based on the preset array and the preset hash algorithm to obtain the target tainted data comprises:

traverse the original taint data set;

When traversing to an original taint data each time, calculate the original taint data based on each of the hash algorithms to obtain the preset number of hash values;

Obtain the array element whose index is the same as the hash value in the preset array, and calculate the total product of the array elements;

judging whether the total product is zero;

If the total product is zero, set each of the array elements that is not one to one, use the one original taint data as the target taint data, and return to the step of traversing the original taint data set;

If the total product is one, it is determined that the one original taint data is non-target taint data, and return to the step of traversing the original taint data set.
The vulnerability detection method according to claim 1, wherein the function to be detected is a set of functions to be detected, and the comparison of the function to be detected and a preset dangerous function to obtain a program internal vulnerability detection result includes:

Traversing the set of functions to be detected;

Each time one function to be detected is traversed, comparing the function to be detected with the preset dangerous function;

When the function to be detected hits the preset dangerous function, obtaining the weight of the hit preset dangerous function from a preset weight list, obtaining a detection intermediate result whose initial value is zero, accumulating the weight into the detection intermediate result to obtain an updated detection intermediate result, and returning to the step of traversing the set of functions to be detected until the traversal is completed, whereupon the updated detection intermediate result is used as the program internal vulnerability detection result;

When the function to be detected does not match the preset dangerous function, return to the step of traversing the set of functions to be detected.
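The weighted comparison above can be sketched as follows. The function names and weights in the usage note are hypothetical examples, and representing the preset weight list as a dictionary keyed by function name is an assumption.

```python
def score_call_stack(functions_to_detect, dangerous_weights):
    """Accumulate the weights of every dangerous function found in the stack.

    dangerous_weights: mapping of preset dangerous function name -> weight
    (assumed representation of the 'preset weight list').
    """
    intermediate_result = 0                  # initial value is zero
    for name in functions_to_detect:         # traverse the set
        weight = dangerous_weights.get(name)
        if weight is None:
            continue                         # no hit: move on to the next one
        intermediate_result += weight        # hit: accumulate the weight
    return intermediate_result               # program internal detection result
```

For example, with the hypothetical weights `{"java.lang.Runtime.exec": 5, "java.sql.Statement.executeQuery": 3}` and a call stack containing both functions plus an unlisted controller method, the result is 8.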
The vulnerability detection method according to claim 1, wherein the vulnerability detection method further comprises:

Obtaining the Uniform Resource Locator (URL) corresponding to the preset user request;

Preprocessing the URL based on a preset regular expression;

After the preprocessing is completed, determining whether there is a website application-level intrusion prevention system (WAF) in the server corresponding to the URL;

If the WAF does not exist in the server, sending a data test request to the server after domain name system (DNS) resolution succeeds;

If the preset return value fed back by the server is received, filtering the original request parameters in the preset user request to obtain the filtered request parameters;

A black-box vulnerability detection result is determined based on the filtered request parameters.
The vulnerability detection method according to claim 5, wherein said determining whether there is a WAF in the server corresponding to the URL comprises:

constructing a normal request, and sending the normal request to the server corresponding to the URL to obtain the original page;

Constructing an abnormal request, sending the abnormal request to the server, and determining a response status corresponding to the abnormal request;

If the response status is a response timeout, determining that there is a WAF in the server;

If the response has not timed out, obtaining the abnormal page corresponding to the abnormal request;

Comparing the original page with the abnormal page, and if the original page is the same as the abnormal page, determining that there is a WAF in the server.
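The WAF check above can be sketched as follows. The `send` callable is a hypothetical transport that returns the page body and raises `TimeoutError` on a response timeout; returning `False` when the two pages differ is an assumption, since the claim only states the cases in which a WAF is present.

```python
def has_waf(send, normal_request, abnormal_request):
    """Decide whether a WAF appears to sit in front of the server.

    send: hypothetical callable(request) -> page text, raising TimeoutError
    when the server does not answer in time.
    """
    original_page = send(normal_request)     # page for the normal request
    try:
        abnormal_page = send(abnormal_request)
    except TimeoutError:
        return True                          # timeout: a WAF is assumed present
    # Identical pages mean the abnormal request did not reach the application.
    return abnormal_page == original_page
```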
The vulnerability detection method according to claim 5, wherein the original request parameters are a set of original request parameters, and the filtering of the original request parameters in the preset user request includes:

traverse the original request parameter set;

Each time one original request parameter is traversed, sending the preset user request to the server, obtaining the original page fed back by the server, replacing the original request parameter in the preset user request with a first random number to obtain a first post-replacement request, sending the first post-replacement request to the server, and obtaining a first result page fed back by the server;

If the first similarity between the original page and the first result page is greater than or equal to a first preset similarity threshold, replacing the original request parameter in the preset user request with a second random number to obtain a second post-replacement request, and sending the second post-replacement request to the server to obtain a second result page fed back by the server, wherein the first random number is different from the second random number;

If the second similarity between the first result page and the second result page is greater than or equal to a second preset similarity threshold, filtering out the one original request parameter and returning to the step of traversing the original request parameter set.
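The parameter filtering above removes parameters whose value does not influence the returned page. A minimal sketch follows; the `send` callable, the threshold values, the random number range, and the use of `difflib.SequenceMatcher` as the similarity measure are all assumptions for illustration.

```python
import random
from difflib import SequenceMatcher


def similarity(page_a, page_b):
    """Similarity in [0, 1]; SequenceMatcher is an assumed stand-in metric."""
    return SequenceMatcher(None, page_a, page_b, autojunk=False).ratio()


def two_distinct_random_values():
    """Return two different random numbers as strings."""
    first = random.randint(100000, 999999)
    second = random.randint(100000, 999999)
    while second == first:
        second = random.randint(100000, 999999)
    return str(first), str(second)


def filter_parameters(send, params, first_threshold=0.95, second_threshold=0.95):
    """Return the request parameters that remain after filtering.

    send: hypothetical callable(dict of parameters) -> page text.
    """
    kept = {}
    for name, value in params.items():       # traverse the parameter set
        original_page = send(params)
        first_value, second_value = two_distinct_random_values()
        first_page = send({**params, name: first_value})
        if similarity(original_page, first_page) >= first_threshold:
            second_page = send({**params, name: second_value})
            if similarity(first_page, second_page) >= second_threshold:
                continue                     # page ignores this parameter
        kept[name] = value                   # parameter affects the page
    return kept
```

With a server whose page depends only on `id`, calling `filter_parameters` on `{"id": "7", "utm": "x"}` keeps `id` and drops `utm`.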
The vulnerability detection method according to any one of claims 5-7, wherein, after said determining the black-box vulnerability detection result based on the filtered request parameters, the method further comprises:

Acquiring the first score corresponding to the program internal vulnerability detection result;

Acquiring a second score corresponding to the black-box vulnerability detection result;

Calculating the sum of the first score and the second score to obtain a total score;

If the total score is greater than the preset score threshold, it is determined that the preset application program has a vulnerability.
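The final decision combines the two scores. A minimal sketch, assuming numeric scores; the function and parameter names are hypothetical:

```python
def has_vulnerability(first_score, second_score, score_threshold):
    """Combine the internal and black-box scores and compare to the threshold."""
    total_score = first_score + second_score
    return total_score > score_threshold     # strictly greater, as claimed
```

For instance, scores of 8 and 3 against a threshold of 10 give a total of 11 and a positive result, while a total exactly equal to the threshold does not.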
A vulnerability detection device, wherein the vulnerability detection device includes a memory, a processor, and a vulnerability detection program stored in the memory and executable on the processor, and the vulnerability detection program, when executed by the processor, implements the steps of the vulnerability detection method according to any one of claims 1 to 8.
A computer-readable storage medium, wherein a vulnerability detection program is stored on the computer-readable storage medium, and when the vulnerability detection program is executed by a processor, the steps of the vulnerability detection method according to any one of claims 1 to 8 are implemented.