CN116962009A

CN116962009A - Network attack detection method and device

Info

Publication number: CN116962009A
Application number: CN202310729585.XA
Authority: CN
Inventors: 商清华; 汤良; 陈杨; 蔡玉光
Original assignee: Qianxin Technology Group Co Ltd
Current assignee: Qianxin Technology Group Co Ltd
Priority date: 2023-06-19
Filing date: 2023-06-19
Publication date: 2023-10-27

Abstract

The application provides a network attack detection method and device. The method comprises the following steps: acquiring HTTP request data to be analyzed, and segmenting the HTTP request data to obtain a plurality of corresponding target word segments; inputting the target word segments into a preset network attack detection model, obtaining a network attack prediction result output by the network attack detection model, and determining, based on the network attack prediction result, whether the HTTP request data exhibits network attack behavior; the network attack detection model is obtained by training based on sample word segmentation information and network attack detection result labels corresponding to the sample word segmentation information. The method improves the efficiency and accuracy of network attack detection for HTTP request data, and can effectively reduce missed detections and false alarms.

Description

Network attack detection method and device

Technical Field

The application relates to the technical field of network security, in particular to a network attack detection method and device. In addition, the application also relates to an electronic device and a processor readable storage medium.

Background

In recent years, with the rapid development of internet technology, Web attacks implemented through HTTP (Hypertext Transfer Protocol) requests have been increasing. A Web attack is behavior that targets a user's browsing activity or devices such as a website server, for example embedding malicious code, tampering with website permissions, or acquiring the private information of website users. Currently, attackers often hide attack code in HTTP request data to obtain website server permissions, which poses a great threat to the information security of enterprises and users. Traditional Web attack detection schemes have low efficiency and low recognition accuracy, and suffer from serious false alarms and missed detections.

Disclosure of Invention

Therefore, the application provides a network attack detection method and device to address the high false alarm rate, low efficiency, and low accuracy of network attack detection schemes in the prior art.

In a first aspect, the present application provides a network attack detection method, including:

acquiring HTTP request data to be analyzed, and carrying out segmentation processing on the HTTP request data to obtain a plurality of corresponding target segmentation words;

inputting the target word into a preset network attack detection model, obtaining a network attack prediction result output by the network attack detection model, and determining whether the HTTP request data has network attack behaviors or not based on the network attack prediction result; the network attack detection model is obtained by training based on sample word segmentation information and a network attack detection result label corresponding to the sample word segmentation information.

Further, the inputting the target word segments into a preset network attack detection model, obtaining a network attack prediction result output by the network attack detection model, and determining whether the HTTP request data exhibits network attack behavior based on the network attack prediction result specifically includes: analyzing, based on the network attack detection model, the context semantic information characterized by the target word segments of the HTTP request data to obtain a probability value, output by the network attack detection model, that the target word segments are abnormal; taking the probability value as the network attack prediction result; and determining whether the HTTP request data exhibits network attack behavior based on the probability value.

Further, the determining whether the HTTP request data has a network attack behavior based on the probability value specifically includes:

normalizing the probability value to obtain a target probability value of the network attack behavior of the HTTP request data;

and comparing the target probability value with a preset probability threshold value, and determining that network attack behaviors exist in the HTTP request data under the condition that the target probability value is larger than or equal to the probability threshold value.

Further, the processing of splitting the HTTP request data to obtain a plurality of corresponding target segmentation words includes:

based on a preset word segmentation device model, carrying out segmentation processing on the HTTP request data to obtain a plurality of target word segments corresponding to the HTTP request data output by the word segmentation device model; the word segmentation device model is trained based on sample HTTP request data and word segmentation results corresponding to the sample HTTP request data.

Further, the segmentation processing is performed on the HTTP request data based on a preset word segmentation device model, so as to obtain a plurality of target word segments corresponding to the HTTP request data, which are output by the word segmentation device model, specifically including:

inputting the HTTP request data into a preset word segmentation device model, and based on a word segmentation table in the word segmentation device model, segmenting the HTTP request data according to different word roots contained in the HTTP request data to obtain a plurality of target word segments corresponding to the HTTP request data.

Further, the analyzing the context semantic information of the target word segmentation characterization of the HTTP request data based on the network attack detection model to obtain a probability value of the target word segmentation abnormality output by the network attack detection model specifically includes:

analyzing the context semantic information of the target word segmentation characterization of the HTTP request data based on a pre-trained ELECTRA model in the network attack detection model to obtain a characterization vector of the target word segmentation; and inputting the characterization vector to a full-connection layer, obtaining a score corresponding to the target word, and analyzing and calculating the score by using a Softmax layer to determine a probability value of the abnormality of the target word.
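The fully connected layer and Softmax layer described above can be illustrated with a minimal sketch in plain Python. The 4-dimensional vector, the weights, the bias, and the two-class layout (index 0 for normal, index 1 for abnormal) are hypothetical values chosen only for illustration; the application does not specify dimensions or parameters, and a real characterization vector would come from the ELECTRA model.

```python
import math
from typing import List, Sequence


def fully_connected(vector: Sequence[float],
                    weights: Sequence[Sequence[float]],
                    bias: Sequence[float]) -> List[float]:
    """Compute one score (logit) per class: weights @ vector + bias."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def softmax(scores: Sequence[float]) -> List[float]:
    """Turn unnormalized scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical characterization vector and layer parameters (assumptions).
vector = [0.5, -1.0, 2.0, 0.0]
weights = [[0.1, 0.2, -0.3, 0.0],   # class 0: normal
           [-0.1, 0.0, 0.4, 0.2]]   # class 1: abnormal
bias = [0.0, 0.1]

logits = fully_connected(vector, weights, bias)
probabilities = softmax(logits)
abnormal_probability = probabilities[1]
```

Here `abnormal_probability` plays the role of the probability value that the target word segments are abnormal.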

Further, before analyzing the context semantic information of the target word segmentation characterization of the HTTP request data based on the pre-trained ELECTRA model in the network attack detection model, the method further includes: pre-training the ELECTRA model;

the pre-training of the ELECTRA model specifically comprises:

acquiring sample HTTP request data;

performing segmentation processing on the sample HTTP request data by using a word segmentation device model of a preset RoBERTa model to obtain corresponding sample word segmentation information;

and inputting the sample word segmentation information into an ELECTRA model for training.

In a second aspect, the present application provides a network attack detection device, including:

the request data processing unit is used for acquiring HTTP request data to be analyzed, and carrying out segmentation processing on the HTTP request data to obtain a plurality of corresponding target segmentation words;

the network attack detection unit is used for inputting the target word segments into a preset network attack detection model, obtaining a network attack prediction result output by the network attack detection model, and determining whether the HTTP request data exhibits network attack behavior based on the network attack prediction result; the network attack detection model is obtained by training based on sample word segmentation information and network attack detection result labels corresponding to the sample word segmentation information.

Further, the network attack detection unit is specifically configured to: and analyzing the context semantic information of the target word segmentation characterization of the HTTP request data based on the network attack detection model to obtain a probability value of the target word segmentation abnormality output by the network attack detection model, taking the probability value as a network attack prediction result, and determining whether the HTTP request data has network attack behaviors based on the probability value.

Further, the request data processing unit is specifically configured to:

Further, for use before the context semantic information of the target word segmentation characterization of the HTTP request data is analyzed based on the pre-trained ELECTRA model in the network attack detection model, the device further includes: a model training unit for pre-training the ELECTRA model;

the model training unit is specifically configured to:

acquiring sample HTTP request data;

In a third aspect, the present application also provides an electronic device, including: memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the network attack detection method according to any of the preceding claims when executing the computer program.

In a fourth aspect, the present application also provides a processor readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of a network attack detection method according to any of the preceding claims.

According to the network attack detection method, HTTP request data to be analyzed is acquired and segmented to obtain a plurality of corresponding target word segments; the target word segments are input into a preset network attack detection model, a network attack prediction result output by the network attack detection model is obtained, and whether the HTTP request data exhibits network attack behavior is determined based on the network attack prediction result; the network attack detection model is obtained by training based on sample word segmentation information and network attack detection result labels corresponding to the sample word segmentation information. The application improves the efficiency and accuracy of network attack detection for HTTP request data and can effectively reduce missed detections and false alarms.

Drawings

In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the following description will briefly describe the drawings that are required to be used in the embodiments or the prior art, and it is obvious that the drawings in the following description are some embodiments of the present application, and other drawings may be obtained according to these drawings without any inventive effort for a person skilled in the art.

Fig. 1 is a flow chart of a network attack detection method according to an embodiment of the present application;

Fig. 2 is a specific flow chart of a network attack detection method according to an embodiment of the present application;

Fig. 3 is a schematic diagram of a training flow of a network attack detection model in the network attack detection method according to the embodiment of the present application;

Fig. 4 is a schematic structural diagram of a network attack detection device according to an embodiment of the present application;

Fig. 5 is a schematic diagram of an entity structure of an electronic device according to an embodiment of the present application.

Detailed Description

For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments of the present application. All other embodiments, which are derived by a person skilled in the art from the embodiments according to the application without creative efforts, fall within the protection scope of the application.

It should be noted that the terms "first," "second," and the like in the description of the present application and the above-described figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the application described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

The following describes embodiments thereof in detail based on the network attack detection method of the present application. As shown in fig. 1, a flow chart of a network attack detection method provided by an embodiment of the present application includes the following steps:

Step 101: acquiring HTTP request data to be analyzed, and segmenting the HTTP request data to obtain a plurality of corresponding target word segments.

In the embodiment of the application, after HTTP request data to be analyzed is obtained, segmentation processing can be performed on the HTTP request data based on a preset word segmentation device model, so as to obtain a plurality of target word segments corresponding to the HTTP request data, which are output by the word segmentation device model. Specifically, the word segmentation device model is a RoBERTa model obtained by training based on sample HTTP request data and word segmentation results corresponding to the sample HTTP request data. The HTTP request data includes a code string. The code string can be effectively segmented by using the segmenter based on the RoBERTa model, i.e., segmentation of HTTP request data is facilitated. The HTTP request data is split into multiple target fragments (i.e., tokens) for understanding and processing using the cyber attack detection model in step 102. Wherein each token is a sub-sequence of HTTP request data.

When the HTTP request data is segmented based on the preset word segmentation device model, the HTTP request data can be input into the word segmentation device model and segmented, based on the word segmentation table in the word segmentation device model, according to the different word roots contained in the HTTP request data, so as to obtain the plurality of target word segments corresponding to the HTTP request data.
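As a rough illustration of table-driven segmentation, the sketch below splits a request string by greedy longest match against a small vocabulary. This is a simplified stand-in and not the tokenizer of the application: RoBERTa's tokenizer uses byte-level BPE with a learned table, whereas `VOCAB` and `segment` here are hypothetical names with a hand-written toy table.

```python
from typing import List, Set

# Hypothetical toy word segmentation table; a real table would be learned
# from sample HTTP request data.
VOCAB: Set[str] = {"GET", "POST", " ", "/", "?", "=", "&", "<", ">",
                   "(", ")", "index", ".php", "id", "script", "alert"}


def segment(request: str, vocab: Set[str]) -> List[str]:
    """Greedy longest-match segmentation over a vocabulary.

    Characters that match no vocabulary entry become single-character
    tokens, so every token is a sub-sequence of the input and joining
    the tokens reproduces the input exactly.
    """
    max_len = max((len(entry) for entry in vocab), default=1)
    tokens: List[str] = []
    i = 0
    while i < len(request):
        match = request[i]  # fallback: one unknown character
        for size in range(min(max_len, len(request) - i), 0, -1):
            piece = request[i:i + size]
            if piece in vocab:
                match = piece
                break
        tokens.append(match)
        i += len(match)
    return tokens


tokens = segment("GET /index.php?id=<script>", VOCAB)
```

Because the tokens concatenate back to the original string, no part of the request is lost before it reaches the detection model.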

Step 102: inputting the target word into a preset network attack detection model, obtaining a network attack prediction result output by the network attack detection model, and determining whether the HTTP request data has network attack behaviors or not based on the network attack prediction result; the network attack detection model is a deep learning pre-training model which is obtained by training based on sample word segmentation information and a network attack detection result label corresponding to the sample word segmentation information.

In the embodiment of the application, the context semantic information characterized by the target word segments of the HTTP request data can be analyzed based on the network attack detection model to obtain the probability value, output by the network attack detection model, that the target word segments are abnormal; the probability value is taken as the network attack prediction result, and whether the HTTP request data exhibits network attack behavior is determined based on the probability value. Detecting HTTP request data with a deep learning pre-trained model effectively improves accuracy. The context semantic information is the HTTP request context (HTTP Context object). The network attack behavior is Web attack behavior, that is, behavior that attacks a user's browsing activity or devices such as a website server. Web attack detection is the detection means adopted against Web attack behavior. The network attack detection model may use the deep learning pre-trained model ELECTRA (Efficiently Learning an Encoder that Classifies Token Replacements Accurately); compared with the BERT model, ELECTRA has fewer parameters, higher efficiency, and better results.

Specifically, a network attack detection model meeting the conditions can be obtained based on the deep learning pre-trained model (namely the ELECTRA model). In addition, the network attack detection model further comprises a fully connected layer and a Softmax layer (namely a Softmax function). The fully connected layer computes logits (scores) from the output layer of the ELECTRA model, giving unnormalized probability values, and the Softmax function of the Softmax layer computes the target probability value that the HTTP request data is a network attack. In this way, the latent representation of the HTTP request data can be accurately extracted while computing resources are saved, making the network attack prediction result more accurate. The overall model structure of the network attack detection model is shown in Fig. 2: E_1 to E_N respectively represent the plurality of target word segments obtained after the HTTP request data is segmented; E_cls and E_sep are special characters specified by the ELECTRA model, respectively marking the head and the tail of the code string; C represents the output at the cls position; and T_1 to T_N represent the outputs of the ELECTRA model. The ELECTRA model includes a generator model (Generator) and a discriminator model (Discriminator). The input is corrupted by replacing some of the input tokens with plausible substitute tokens sampled from the generator model. A discriminator model is then trained to predict whether each token in the input has been replaced by the generator, instead of training a model to predict the original identities of the replaced tokens.

The training flow of the ELECTRA model is shown in Fig. 3, where "def functions ()" represents one sample of word segmentation information. The grayed-out "def" and "()" indicate that "def" and "()" are masked; the generator produces "if" and "()" for the masked positions, giving "if functions ()". The discriminator model then judges each token, where "√" denotes original and "×" denotes replaced: "if" is judged "×", while "functions" and "()" are judged "√", thereby determining the network attack detection result label. In the training process of the ELECTRA model, the word segmentation device model of the ELECTRA model is likewise used to divide the HTTP request data into a series of tokens, which are then input into the ELECTRA model for training; the fully connected layer computes logits from the output layer of the ELECTRA model, and the softmax function computes the sample probability value that the HTTP request data is a Web attack.
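The discriminator targets in the "def functions ()" example can be derived mechanically by comparing the original token sequence with the generator's output. The sketch below assumes the usual ELECTRA convention that a position where the generator happens to reproduce the original token is labeled as original; `replaced_token_labels` is a hypothetical helper name, not part of any library.

```python
from typing import List, Sequence


def replaced_token_labels(original: Sequence[str],
                          corrupted: Sequence[str]) -> List[int]:
    """Return 1 where the generator changed the token and 0 where the
    token is identical to the original (including masked positions that
    the generator filled with the original token)."""
    if len(original) != len(corrupted):
        raise ValueError("sequences must have the same length")
    return [0 if o == c else 1 for o, c in zip(original, corrupted)]


original = ["def", "functions", "()"]
# "def" and "()" were masked; the generator sampled "if" and "()".
corrupted = ["if", "functions", "()"]

labels = replaced_token_labels(original, corrupted)
marks = ["×" if label else "√" for label in labels]
```

The discriminator is trained to predict `labels` from `corrupted` alone, which is why every input position contributes to the training signal rather than only the masked ones.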

In the process of determining whether the HTTP request data has network attack behaviors based on the probability values, normalizing the probability values to obtain target probability values of the network attack behaviors of the HTTP request data, comparing the target probability values with a preset probability threshold, and determining that the network attack behaviors exist in the HTTP request data under the condition that the target probability values are larger than or equal to the probability threshold.
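The normalization and threshold comparison can be sketched as follows. The two-element score layout (normal first, attack second) and the default threshold of 0.5 are assumptions made for illustration; the application only states that a preset probability threshold is used and that equality counts as an attack.

```python
import math
from typing import Sequence


def attack_probability(scores: Sequence[float]) -> float:
    """Normalize unnormalized scores [normal, attack] with softmax and
    return the probability assigned to the attack class."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    return exps[1] / sum(exps)


def has_attack(scores: Sequence[float], threshold: float = 0.5) -> bool:
    """Report an attack when the target probability is greater than or
    equal to the preset threshold."""
    return attack_probability(scores) >= threshold


benign = has_attack([2.0, -1.0])
malicious = has_attack([-1.0, 2.0], threshold=0.9)
boundary = has_attack([0.0, 0.0], threshold=0.5)
```

Raising the threshold trades missed detections for fewer false alarms, so in practice it would be tuned on labeled validation traffic.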

In addition, after determining that the network attack behavior exists in the HTTP request data, the abnormal code in the HTTP request data may be identified, and alarm prompt information for indicating the abnormal code in the HTTP request data may be generated and sent to the corresponding terminal device.
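One possible shape for the alarm prompt information is a small JSON message that carries the abnormal code and its position in the request. The field names and the `build_alert` helper are purely hypothetical; the application does not define a message format or how the abnormal code is located, so the fragment is taken here as an input.

```python
import json


def build_alert(request: str, abnormal_code: str, probability: float) -> str:
    """Serialize an alert describing the abnormal code found in a request.

    `offset` is the index of the first occurrence of the abnormal code in
    the request, or -1 if the fragment is not present.
    """
    alert = {
        "type": "web_attack",
        "probability": round(probability, 4),
        "abnormal_code": abnormal_code,
        "offset": request.find(abnormal_code),
        "request": request,
    }
    return json.dumps(alert, ensure_ascii=False)


message = build_alert("GET /?q=<script>", "<script>", 0.97312)
```

The resulting string could then be delivered to the terminal device over whatever channel the deployment already uses.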

Existing methods lack an understanding of the context semantic information of code, while attack techniques are continuously updated and upgraded, so the detection capability of existing methods is insufficient and missed detections and false alarms are serious. The method disclosed by the application uses the pre-trained ELECTRA model, so that the latent representation of HTTP request data can be accurately extracted while computing resources are saved. The word segmentation device based on the RoBERTa model achieves reasonable segmentation of HTTP request data, making the HTTP request data easier for the deep learning model to understand and process, so that the context information of the HTTP request is deeply understood and Web attack detection efficiency is improved.

According to the network attack detection method, the HTTP request data to be analyzed is acquired and segmented to obtain a plurality of corresponding target word segments; the target word segments are input into the preset network attack detection model, the network attack prediction result output by the network attack detection model is obtained, and whether the HTTP request data exhibits network attack behavior is determined based on the network attack prediction result. This improves the efficiency and accuracy of network attack detection for HTTP request data and can effectively reduce missed detections and false alarms.
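The two steps of the method can be composed into a single pipeline, sketched below with pluggable components. `whitespace_tokenizer` and `keyword_model` are deliberately trivial, hypothetical stand-ins for the trained word segmentation device model and the ELECTRA-based detection model; they exist only so the control flow can be run end to end.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class DetectionResult:
    tokens: List[str]
    probability: float
    is_attack: bool


def detect(request: str,
           tokenizer: Callable[[str], List[str]],
           model: Callable[[List[str]], float],
           threshold: float = 0.5) -> DetectionResult:
    """Step 101: segment the request; step 102: score it and compare the
    probability with the preset threshold (0.5 is an assumed default)."""
    tokens = tokenizer(request)
    probability = model(tokens)
    return DetectionResult(tokens, probability, probability >= threshold)


def whitespace_tokenizer(request: str) -> List[str]:
    """Stand-in tokenizer: split on whitespace."""
    return request.split()


def keyword_model(tokens: List[str]) -> float:
    """Stand-in model: high score if any token contains a script tag."""
    return 0.9 if any("<script>" in t.lower() for t in tokens) else 0.1


suspicious = detect("GET /?q=<script>alert(1)</script> HTTP/1.1",
                    whitespace_tokenizer, keyword_model)
ordinary = detect("GET /index.html HTTP/1.1",
                  whitespace_tokenizer, keyword_model)
```

Passing the tokenizer and model as callables keeps the decision logic independent of any particular model implementation.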

Corresponding to the network attack detection method provided by the application, the application also provides a network attack detection device. Since the device embodiment is similar to the method embodiment described above, it is described relatively briefly; for relevant details, refer to the description of the method embodiment. The embodiments of the network attack detection device described below are merely illustrative. Fig. 4 is a schematic structural diagram of a network attack detection device according to an embodiment of the present application.

The application relates to a network attack detection device, which specifically comprises the following parts:

a request data processing unit 401, configured to obtain HTTP request data to be analyzed, and perform segmentation processing on the HTTP request data to obtain a plurality of corresponding target segments;

a network attack detection unit 402, configured to input the target word segments into a preset network attack detection model, obtain a network attack prediction result output by the network attack detection model, and determine whether the HTTP request data exhibits network attack behavior based on the network attack prediction result; the network attack detection model is obtained by training based on sample word segmentation information and network attack detection result labels corresponding to the sample word segmentation information.

Further, the request data processing unit is specifically configured to:

the model training unit is specifically configured to:

acquiring sample HTTP request data;

According to the network attack detection device, HTTP request data to be analyzed is acquired and segmented to obtain a plurality of corresponding target word segments; the target word segments are input into a preset network attack detection model, a network attack prediction result output by the network attack detection model is obtained, and whether the HTTP request data exhibits network attack behavior is determined based on the network attack prediction result; the network attack detection model is obtained by training based on sample word segmentation information and network attack detection result labels corresponding to the sample word segmentation information. The application improves the efficiency and accuracy of network attack detection for HTTP request data and can effectively reduce missed detections and false alarms.

Corresponding to the network attack detection method provided by the application, the application also provides electronic equipment. Since the embodiments of the electronic device are similar to the method embodiments described above, the description is relatively simple, and reference should be made to the description of the method embodiments described above, and the electronic device described below is merely illustrative. Fig. 5 is a schematic diagram of the physical structure of an electronic device according to an embodiment of the present application. The electronic device may include: a processor (processor) 501, a memory (memory) 502 and a communication bus 503, wherein the processor 501 and the memory 502 complete communication with each other through the communication bus 503 and communicate with the outside through a communication interface 504. The processor 501 may invoke logic instructions in the memory 502 to perform a network attack detection method comprising: acquiring HTTP request data to be analyzed, and carrying out segmentation processing on the HTTP request data to obtain a plurality of corresponding target segmentation words; inputting the target word into a preset network attack detection model, obtaining a network attack prediction result output by the network attack detection model, and determining whether the HTTP request data has network attack behaviors or not based on the network attack prediction result; the network attack detection model is obtained by training based on sample word segmentation information and a network attack detection result label corresponding to the sample word segmentation information.

Further, the logic instructions in the memory 502 described above may be implemented in the form of software functional units and, when sold or used as a stand-alone product, stored in a computer readable storage medium. Based on this understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes: a memory chip, a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or other various media capable of storing program code.

In another aspect, embodiments of the present application further provide a computer program product, where the computer program product includes a computer program stored on a storage medium readable by a processor, and the computer program includes program instructions, where when the program instructions are executed by a computer, the computer is capable of executing the network attack detection method provided in the foregoing method embodiments. The method comprises the following steps: acquiring HTTP request data to be analyzed, and carrying out segmentation processing on the HTTP request data to obtain a plurality of corresponding target segmentation words; inputting the target word into a preset network attack detection model, obtaining a network attack prediction result output by the network attack detection model, and determining whether the HTTP request data has network attack behaviors or not based on the network attack prediction result; the network attack detection model is obtained by training based on sample word segmentation information and a network attack detection result label corresponding to the sample word segmentation information.

In yet another aspect, an embodiment of the present application further provides a processor readable storage medium storing a computer program which, when executed by a processor, performs the network attack detection method provided in the foregoing embodiments. The method comprises the following steps: acquiring HTTP request data to be analyzed, and segmenting the HTTP request data to obtain a plurality of corresponding target word segments; inputting the target word segments into a preset network attack detection model, obtaining a network attack prediction result output by the network attack detection model, and determining whether the HTTP request data exhibits network attack behavior based on the network attack prediction result; the network attack detection model is obtained by training based on sample word segmentation information and network attack detection result labels corresponding to the sample word segmentation information.

The processor-readable storage medium may be any available medium or data storage device that can be accessed by a processor, including, but not limited to, magnetic storage (e.g., floppy disks, hard disks, magnetic tape, magneto-optical disks (MOs), etc.), optical storage (e.g., CD, DVD, BD, HVD, etc.), semiconductor storage (e.g., ROM, EPROM, EEPROM, non-volatile memory (NAND flash), Solid State Disk (SSD)), and the like.

The apparatus embodiments described above are merely illustrative, wherein the elements illustrated as separate elements may or may not be physically separate, and the elements shown as elements may or may not be physical elements, may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art will understand and implement the present application without undue burden.

From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.

Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present application, and are not limiting; although the application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present application.

Claims

1. A network attack detection method, comprising:

inputting the target word into a preset network attack detection model, obtaining a network attack prediction result output by the network attack detection model, and determining whether the HTTP request data has network attack behaviors or not based on the network attack prediction result;

the network attack detection model is obtained by training based on sample word segmentation information and a network attack detection result label corresponding to the sample word segmentation information.

2. The method for detecting a network attack according to claim 1, wherein the inputting the target word into a preset network attack detection model, obtaining a network attack prediction result output by the network attack detection model, and determining whether the HTTP request data has a network attack behavior based on the network attack prediction result, specifically includes:

and analyzing the context semantic information of the target word segmentation characterization of the HTTP request data based on the network attack detection model to obtain a probability value of the target word segmentation abnormality output by the network attack detection model, taking the probability value as a network attack prediction result, and determining whether the HTTP request data has network attack behaviors based on the probability value.

3. The network attack detection method according to claim 2, wherein the determining whether the HTTP request data has a network attack behavior based on the probability value specifically includes:

4. The network attack detection method according to claim 1, wherein the performing segmentation processing on the HTTP request data to obtain a plurality of corresponding target segments includes:

5. The method for detecting a network attack according to claim 4, wherein the performing segmentation processing on the HTTP request data based on a preset word segmentation model to obtain a plurality of target words corresponding to the HTTP request data output by the word segmentation model specifically includes:

6. The network attack detection method according to claim 1, wherein the analyzing the context semantic information of the target word segmentation token of the HTTP request data based on the network attack detection model to obtain the probability value of the target word segmentation abnormality output by the network attack detection model specifically includes:

7. The network attack detection method according to claim 6, further comprising, prior to analyzing the context semantic information of the target word segmentation characterization of the HTTP request data based on a pre-trained ELECTRA model in the network attack detection model: pre-training the ELECTRA model;

the pre-training of the ELECTRA model specifically comprises:

acquiring sample HTTP request data;

8. A network attack detection device, comprising:

9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the network attack detection method according to any of claims 1 to 7 when the computer program is executed.

10. A processor readable storage medium having stored thereon a computer program, which when executed by a processor, implements the steps of the network attack detection method according to any of claims 1 to 7.