CN115426121A

CN115426121A - Method, apparatus and medium for detecting botnet

Info

Publication number: CN115426121A
Application number: CN202110600098.4A
Authority: CN
Inventors: 高岩; 袁涵; 马晨; 崔江琳
Original assignee: China Telecom Corp Ltd
Current assignee: China Telecom Corp Ltd
Priority date: 2021-05-31
Filing date: 2021-05-31
Publication date: 2022-12-02

Abstract

The present disclosure relates to a method for detecting botnets, comprising the steps of: acquiring a DNS data packet of traffic to be detected to obtain the source IP, destination IP and protocol information; sending the source IP, the destination IP and the protocol information to a threat information library for matching collision; if the matching succeeds, judging that the IP of the network request behavior corresponding to the traffic belongs to a botnet; if the matching fails, determining a text dimension feature of the traffic through a BiLSTM algorithm, determining a spatial dimension feature of the traffic through a CNN algorithm and determining an associated dimension feature of the traffic through a DeepWalk algorithm, fusing the text dimension feature, the spatial dimension feature and the associated dimension feature through an RNN neural network, calculating the probability that the IP of the network request behavior corresponding to the traffic belongs to the botnet by using a SoftMax classifier, and judging that the IP belongs to the botnet if the probability is greater than a threshold value.

Description

Method, apparatus and medium for detecting botnet

Technical Field

The present disclosure generally pertains to the field of computer security.

Background

A botnet is a group of non-cooperative user terminals that can be remotely controlled by an attacker, and is one of the main means of network intrusion. In a botnet, an infected host is called a bot, and a controller (botmaster) sends commands to the bots through command and control (C&C) channels to perform one-to-many operations. An attacker can use a botnet to launch large-scale denial-of-service attacks, distribute malicious software, send spam, mine virtual currency, and carry out other attacks. The emergence of botnets poses a significant threat to the network security environment.

Large-scale attacks carried out through botnets have been common in recent years. For example, in 2016, Dyn, a domain name resolution service provider, suffered a DDoS attack peaking at 1.1 Tbps, which caused large-area network disruption in the eastern United States; numerous U.S. websites, including Twitter and Facebook, could not be reached because domain name resolution failed. The chief culprit behind this event was the Mirai botnet malware. In 2017, researchers used honeypots to detect BrickerBot attack attempts from 1895 different IPs; the malware attacks open ports on Internet of Things devices running Linux and erases the data and code on the device so that it can no longer operate normally. It follows that botnets pose a huge potential hazard to the Internet.

Disclosure of Invention

The following presents a simplified summary of the disclosure in order to provide a basic understanding of some aspects of the disclosure. However, it should be understood that this summary is not an exhaustive overview of the disclosure. It is not intended to identify key or critical elements of the disclosure or to delineate the scope of the disclosure. Its sole purpose is to present some concepts of the disclosure in a simplified form as a prelude to the more detailed description that is presented later.

According to one aspect of the present disclosure, there is provided a method of detecting botnets, comprising the steps of:

acquiring a DNS data packet of traffic to be detected to obtain the source IP, destination IP and protocol information;

sending the source IP, the destination IP and the protocol information into a threat information library for matching collision;

if the matching is successful, judging that the IP of the network request behavior corresponding to the flow belongs to a botnet;

if the matching fails, determining a text dimension feature of the traffic through a BiLSTM algorithm, determining a spatial dimension feature of the traffic through a CNN algorithm and determining an associated dimension feature of the traffic through a DeepWalk algorithm, fusing the text dimension feature, the spatial dimension feature and the associated dimension feature through an RNN neural network, calculating the probability that the IP of the network request behavior corresponding to the traffic belongs to the botnet by using a SoftMax classifier, and judging that the IP of the network request behavior corresponding to the traffic belongs to the botnet if the probability is greater than a threshold value.

Further features of the invention and its advantages will become apparent from the detailed description of preferred embodiments of the invention which follows.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description, serve to explain the principles of the disclosure.

The present disclosure may be more clearly understood from the following detailed description taken in conjunction with the accompanying drawings, in which:

FIG. 1 shows a flow diagram of botnet detection according to the present invention.

FIG. 2 shows a schematic diagram of generating text dimensional features.

FIG. 3 shows a schematic diagram of generating spatial dimension features.

FIG. 4 shows a schematic diagram of generating associated dimensional features.

FIG. 5 shows an example of the association relationship between the IPs.

FIG. 6 shows a schematic diagram of the fusion of three features.

FIG. 7 illustrates an exemplary configuration of a computing device capable of implementing embodiments in accordance with the present disclosure.

Detailed Description

The following detailed description is made with reference to the accompanying drawings and is provided to assist in a comprehensive understanding of various exemplary embodiments of the disclosure. The following description includes various details to aid understanding, but these details are to be regarded as examples only and are not intended to limit the disclosure, which is defined by the appended claims and their equivalents. The words and phrases used in the following description are intended only to provide a clear and consistent understanding of the disclosure. In addition, descriptions of well-known structures, functions, and configurations may be omitted for clarity and conciseness. Those of ordinary skill in the art will recognize that various changes and modifications of the examples described herein can be made without departing from the spirit and scope of the disclosure.

Figure 1 shows a flow chart of botnet detection according to the present invention.

To carry out the botnet detection method of the invention, the threat information system captures open source threat information, including malicious IPs, malicious URLs and malware information, on a daily schedule and enters it into the system in a uniform way, so that the threat information stays timely and effective.

In some embodiments, the threat information system is started, the open source threat information capture scripts are launched on a schedule, and the latest open source threat information is imported into the threat information system after data processing.

In some embodiments, the present invention acquires threat intelligence as follows. For example, a crawler script captures data from open source threat information websites at a fixed time every day and obtains threat information related to botnets or malicious IPs. The captured data is turned into structured data of uniform style through operations such as data cleaning and information fusion, and the structured data is imported into a database to serve as the information source of the threat information system.
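The cleaning and fusion step described above can be sketched as follows. This is only a minimal illustration: the record fields `indicator`, `type` and `source`, the feed names, and the helper `normalize_feed` are hypothetical and do not come from the disclosure.

```python
import ipaddress

def normalize_feed(records):
    """Clean raw feed records and merge duplicates into uniform entries.

    Each raw record is assumed (hypothetically) to be a dict with
    'indicator', 'type' and 'source' keys.
    """
    merged = {}
    for rec in records:
        indicator = str(rec.get("indicator", "")).strip()
        kind = str(rec.get("type", "")).strip().lower()
        if not indicator or not kind:
            continue  # data cleaning: drop incomplete records
        if kind == "ip":
            try:
                indicator = str(ipaddress.ip_address(indicator))
            except ValueError:
                continue  # data cleaning: drop malformed IP addresses
        entry = merged.setdefault(
            (kind, indicator),
            {"type": kind, "indicator": indicator, "sources": set()},
        )
        # Information fusion: the same indicator reported by several feeds
        # becomes one entry that remembers every source.
        entry["sources"].add(rec.get("source", "unknown"))
    return sorted(merged.values(), key=lambda e: (e["type"], e["indicator"]))

raw = [
    {"indicator": " 203.0.113.7 ", "type": "IP", "source": "feed_a"},
    {"indicator": "203.0.113.7", "type": "ip", "source": "feed_b"},
    {"indicator": "not-an-ip", "type": "ip", "source": "feed_a"},
    {"indicator": "http://bad.example/x", "type": "url", "source": "feed_b"},
]
entries = normalize_feed(raw)
```

In this sketch the two reports of the same IP collapse into one entry with two sources, and the malformed address is discarded before anything is written to the database.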

In some embodiments, pre-trained botnet detection model parameters are loaded, and the components of the multi-dimensional collaborative botnet detection model use the trained parameters for prediction.

As shown in the "DNS analysis" step in FIG. 1, the present invention acquires a DNS data packet of the traffic to be detected to obtain the source IP, destination IP and protocol information. In some embodiments, a DNS data packet is acquired for a piece of network traffic to be detected, and the acquired DNS data packet is analyzed and filtered to obtain the source IP, destination IP, protocol information, and the like.
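As a rough illustration of extracting these fields, the sketch below reads the source IP, destination IP and protocol from the fixed part of an IPv4 header. It assumes the input is a raw IPv4 packet without a link-layer header; the function name `parse_ipv4_header` and the sample addresses are hypothetical, and the disclosure does not specify which DNS analysis tool is used.

```python
import socket
import struct

PROTOCOL_NAMES = {6: "TCP", 17: "UDP"}  # IANA protocol numbers

def parse_ipv4_header(packet: bytes):
    """Extract (source IP, destination IP, protocol) from a raw IPv4 packet."""
    if len(packet) < 20:
        raise ValueError("packet shorter than the minimum IPv4 header")
    if packet[0] >> 4 != 4:
        raise ValueError("not an IPv4 packet")
    proto_num = packet[9]                     # protocol field, byte offset 9
    src_ip = socket.inet_ntoa(packet[12:16])  # source address, bytes 12..15
    dst_ip = socket.inet_ntoa(packet[16:20])  # destination address, bytes 16..19
    return src_ip, dst_ip, PROTOCOL_NAMES.get(proto_num, str(proto_num))

# Build a 20-byte sample header: version/IHL, TOS, total length, ID,
# flags/fragment offset, TTL, protocol (17 = UDP), checksum, src, dst.
sample = struct.pack(
    "!BBHHHBBH4s4s", 0x45, 0, 28, 0, 0, 64, 17, 0,
    socket.inet_aton("192.0.2.10"), socket.inet_aton("198.51.100.53"),
)
src_ip, dst_ip, protocol = parse_ipv4_header(sample)
```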

The information is then sent to the threat information library for matching collision. If the matching succeeds, the IP of the network request behavior is confirmed as belonging to a botnet and the traffic is intercepted. If the matching fails, botnet detection is carried out with a deep learning method.
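The two-stage decision can be sketched as below. This is a minimal sketch under stated assumptions: the threat information library is modeled as a plain set of IP strings, `model_score` stands in for the deep learning model, and the default threshold of 0.5 is illustrative; none of these names or values come from the disclosure.

```python
def detect(flow, threat_library, model_score, threshold=0.5):
    """Return (is_botnet, reason) for one flow.

    flow: dict with 'src_ip', 'dst_ip' and 'protocol' keys.
    threat_library: set of known malicious IP strings (assumed representation).
    model_score: callable returning a botnet probability in [0, 1].
    """
    # Stage 1: matching collision against the threat information library.
    if flow["src_ip"] in threat_library or flow["dst_ip"] in threat_library:
        return True, "threat-library hit"
    # Stage 2: on a miss, fall back to the learned detection model.
    probability = model_score(flow)
    return probability > threshold, f"model probability {probability:.2f}"

library = {"203.0.113.7"}
known_bad = {"src_ip": "203.0.113.7", "dst_ip": "192.0.2.1", "protocol": "UDP"}
unknown = {"src_ip": "198.51.100.20", "dst_ip": "192.0.2.1", "protocol": "UDP"}

hit_result = detect(known_bad, library, model_score=lambda f: 0.0)
suspicious_result = detect(unknown, library, model_score=lambda f: 0.9)
benign_result = detect(unknown, library, model_score=lambda f: 0.1)
```

Checking the library first means the model is only evaluated for traffic that the intelligence data cannot already classify.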

In some embodiments, for an information flow accessing a cloud network, a DNS tool is used to parse basic information such as the source IP address and port, the destination IP address and port, and the protocol. This information is matched against the threat information system, and if there is a hit, operations such as interception and early warning are performed.

In some embodiments, if the intelligence system is not hit, deep learning is used: the information flow is preprocessed into the three input formats required by the fused-feature botnet detection model, namely text, grayscale image and associated structure, and the data is fed into the detection model. That is, the preprocessing puts the information flow into the data formats required for determining the text dimension feature of the flow through the BiLSTM algorithm, the spatial dimension feature of the flow through the CNN algorithm, and the associated dimension feature of the flow through the DeepWalk algorithm.

Text dimension feature learning is exemplarily described below.

FIG. 2 shows a schematic diagram of generating text dimensional features.

In some embodiments, a network data flow is defined as $p = (x_p, s_p, t_p)$, where $x_p$ is the five-tuple (src_IP, src_port, dst_IP, dst_port, protocol), whose elements are the source IP, source port, destination IP, destination port and protocol, respectively; $s_p$ is the data packet size; and $t_p$ is the start time of the data packet. A data set is represented by a number of such flows, i.e. $P = \{p_1, p_2, \ldots, p_n\}$.

In some embodiments, the text dimension feature learning module takes the first 16 data packets of each piece of data and the first 100 bytes of each data packet; any shortfall is padded with 0x00.
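The truncation and padding rule can be sketched as follows. The helper name `truncate_and_pad` is hypothetical, and padding a flow that has fewer than 16 packets with all-zero packets is an assumption, since the description does not say whether the 0x00 padding applies to missing packets as well as short ones.

```python
MAX_PACKETS = 16        # first 16 packets of each flow
BYTES_PER_PACKET = 100  # first 100 bytes of each packet

def truncate_and_pad(packets):
    """Return exactly 16 byte strings of exactly 100 bytes each."""
    rows = []
    for payload in packets[:MAX_PACKETS]:
        head = payload[:BYTES_PER_PACKET]
        # Short packets are right-padded with 0x00.
        rows.append(head + b"\x00" * (BYTES_PER_PACKET - len(head)))
    # Assumption: flows with fewer than 16 packets get all-zero packets.
    while len(rows) < MAX_PACKETS:
        rows.append(b"\x00" * BYTES_PER_PACKET)
    return rows

rows = truncate_and_pad([b"abc", b"x" * 150])
```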

The text data is then embedded (Embedding): each byte is encoded as a 128-dimensional vector, so that each data packet is converted into a 100 × 128 matrix by the embedding operation. The text dimension features of this 100 × 128 matrix are learned by a bidirectional LSTM neural network, which generates the text vector features.

In some embodiments, for example, the text dimension characteristics are determined as follows,

$$e_1, e_2, \ldots, e_n = E^T \cdot (p_1, p_2, \ldots, p_n) \tag{1}$$

$$h_{1\_lstm}, h_{2\_lstm}, \ldots, h_{n\_lstm} = \mathrm{BiLSTM}(e_1, e_2, \ldots, e_n) \tag{2}$$

where $e_n$ is the embedded vector of each byte, $E^T$ is the embedding matrix, $p_n$ is the one-hot encoding of each byte, and $h_{n\_lstm}$ is the hidden-layer representation of each embedded vector after passing through the BiLSTM (bidirectional LSTM) neural network.
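Equation (1) can be illustrated in plain Python: multiplying a one-hot byte vector by the embedding matrix is the same as looking up one row of an embedding table. In this sketch the table `E` is filled with random values purely for illustration (it would be learned in practice), the vocabulary size of 256 assumes one entry per byte value, and the BiLSTM of equation (2) is not implemented.

```python
import random

VOCAB_SIZE = 256  # assumed: one entry per possible byte value
EMBED_DIM = 128   # embedding width stated in the description

rng = random.Random(0)
# Embedding table with one 128-dimensional row per byte value.
E = [[rng.uniform(-1.0, 1.0) for _ in range(EMBED_DIM)] for _ in range(VOCAB_SIZE)]

def one_hot(byte_value):
    vec = [0.0] * VOCAB_SIZE
    vec[byte_value] = 1.0
    return vec

def embed_by_matmul(byte_value):
    """Compute E^T * p for the one-hot vector p of a byte, as in equation (1)."""
    p = one_hot(byte_value)
    return [sum(p[i] * E[i][j] for i in range(VOCAB_SIZE)) for j in range(EMBED_DIM)]

def embed_packet(payload):
    """Embed a packet by direct row lookup, giving one 128-d row per byte."""
    return [E[b] for b in payload]

packet = bytes([0x16, 0x03, 0x01]) + bytes(97)  # 100 bytes after zero padding
matrix = embed_packet(packet)                   # 100 rows of 128 values
via_matmul = embed_by_matmul(0x16)
```

The row lookup is the form normally used in practice, since it avoids materializing the one-hot vectors.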

The following exemplarily illustrates spatial dimension feature learning.

FIG. 3 shows a schematic diagram of generating spatial dimension features.

For the same data flow used for text dimension feature learning, a CNN neural network is used to obtain the spatial distribution feature of the network traffic. For each piece of data, the first 1024 bytes are taken, data of insufficient length is padded with 0x00, and the truncated data is converted into a 32 × 32 grayscale image that represents the spatial dimension feature of the data flow. This feature can be learned by a convolutional neural network whose structure is shown in FIG. 3: the grayscale image passes through a five-layer network consisting of convolutional layer $c_1$, average pooling layer $m_1$, convolutional layer $c_2$, max pooling layer $m_2$ and fully connected layer $f_1$, which finally generates the spatial dimension feature representation of the network traffic.
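The conversion of a flow into a 32 × 32 grayscale image can be sketched as below. The row-major layout (the first 32 bytes form the first image row) is an assumption, as the description does not state the pixel order, and `flow_to_grayscale` is a hypothetical helper name.

```python
IMAGE_SIDE = 32
IMAGE_BYTES = IMAGE_SIDE * IMAGE_SIDE  # 1024 bytes per flow

def flow_to_grayscale(flow_bytes):
    """Map the first 1024 bytes of a flow to a 32 x 32 grid of pixel values 0..255."""
    head = flow_bytes[:IMAGE_BYTES]
    padded = head + b"\x00" * (IMAGE_BYTES - len(head))  # pad short flows with 0x00
    return [
        list(padded[row * IMAGE_SIDE:(row + 1) * IMAGE_SIDE])
        for row in range(IMAGE_SIDE)
    ]

# A 512-byte flow fills the top 16 rows; the bottom 16 rows are padding.
image = flow_to_grayscale(bytes(range(256)) * 2)
```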

In some embodiments, the spatial dimension characteristics are determined in the following manner,

$$h_1, h_2, \ldots, h_n = \mathrm{Conv2d}(x_1, x_2, \ldots, x_n) \tag{3}$$

$$m_1, m_2, \ldots, m_n = \text{mean-pooling}(h_1, h_2, \ldots, h_n) \tag{4}$$

$$h'_1, h'_2, \ldots, h'_n = \mathrm{Conv2d}(m_1, m_2, \ldots, m_n) \tag{5}$$

$$m'_1, m'_2, \ldots, m'_n = \text{max-pooling}(h'_1, h'_2, \ldots, h'_n) \tag{6}$$

$$h_{1\_cnn}, h_{2\_cnn}, \ldots, h_{n\_cnn} = W^T (m'_1, m'_2, \ldots, m'_n) + b \tag{7}$$

where Conv2d denotes a convolutional neural network, $x_1$ to $x_n$ represent the pixel values of the grayscale image, $h_1$ to $h_n$ represent the hidden representation vectors after the convolutional neural network, $m_1$ to $m_n$ are the feature vectors after the mean pooling layer, and $h_{1\_cnn}$ to $h_{n\_cnn}$ represent the output vectors after the fully connected layer.
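The two pooling operations in equations (4) and (6) differ only in how each window is reduced. The sketch below shows both on a small 4 × 4 feature map; the 2 × 2 non-overlapping window is an assumption, because the description gives no kernel sizes or strides, and `pool2d` is a hypothetical helper.

```python
def pool2d(grid, size, reduce_fn):
    """Apply non-overlapping size x size pooling to a 2D list of numbers."""
    rows, cols = len(grid), len(grid[0])
    pooled = []
    for r in range(0, rows - size + 1, size):
        pooled_row = []
        for c in range(0, cols - size + 1, size):
            window = [grid[r + i][c + j] for i in range(size) for j in range(size)]
            pooled_row.append(reduce_fn(window))
        pooled.append(pooled_row)
    return pooled

def mean(values):
    return sum(values) / len(values)

feature_map = [
    [1, 2, 5, 6],
    [3, 4, 7, 8],
    [9, 10, 13, 14],
    [11, 12, 15, 16],
]
mean_pooled = pool2d(feature_map, 2, mean)  # averaging, as in equation (4)
max_pooled = pool2d(feature_map, 2, max)    # maximum, as in equation (6)
```

Mean pooling keeps the average intensity of each region, while max pooling keeps only its strongest response, which is why the two layers are not interchangeable.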

The associated dimension feature learning is exemplarily described below.

FIG. 4 shows a schematic diagram of generating associated dimensional features.

In the threat intelligence system, a graph database is used to construct a threat intelligence knowledge graph from the obtained open source threat intelligence. Threat intelligence is interrelated: one piece of threat intelligence often has several related pieces, for example the same domain name is attacked by multiple IPs, or an IP belongs to a malicious sample family. There is therefore correlation between different pieces of intelligence, as shown in FIG. 4, and this associated information is also an important feature for detecting and mining botnets.

Related intelligence in the threat intelligence system is stored in a graph database, and graph embedding is performed with the DeepWalk algorithm, i.e. each node in the graph is embedded to generate a high-dimensional vector representation. The idea of DeepWalk is similar to the word embedding algorithm Word2Vec: the vector representation of a node is learned from the co-occurrence of nodes in the graph. Vertex sequences are sampled from the graph by a random walk strategy, each vertex sequence is treated as a generated sentence, and the embedded representation of each node is produced by training a single-layer neural network with a hierarchical SoftMax strategy.
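The random-walk sampling stage of DeepWalk can be sketched as follows. The toy graph, node labels and parameter values are hypothetical, and the sketch covers only walk generation; the subsequent skip-gram training with hierarchical SoftMax that turns the walks into node embeddings is not shown.

```python
import random

def random_walk(graph, start, walk_length, rng):
    """Sample one walk: repeatedly step to a uniformly chosen neighbor."""
    walk = [start]
    while len(walk) < walk_length:
        neighbors = graph.get(walk[-1], [])
        if not neighbors:
            break  # dead end: stop the walk early
        walk.append(rng.choice(neighbors))
    return walk

def generate_walks(graph, walks_per_node, walk_length, seed=0):
    """Start several walks from every node; each walk acts as one 'sentence'."""
    rng = random.Random(seed)
    nodes = list(graph)
    walks = []
    for _ in range(walks_per_node):
        rng.shuffle(nodes)  # visit start nodes in a fresh order on each pass
        for node in nodes:
            walks.append(random_walk(graph, node, walk_length, rng))
    return walks

# Hypothetical intelligence graph: two IPs attack one domain, one IP
# also belongs to a malware sample family.
graph = {
    "ip:203.0.113.7": ["domain:victim.example", "family:sample_family"],
    "ip:198.51.100.9": ["domain:victim.example"],
    "domain:victim.example": ["ip:203.0.113.7", "ip:198.51.100.9"],
    "family:sample_family": ["ip:203.0.113.7"],
}
walks = generate_walks(graph, walks_per_node=2, walk_length=5)
```

Nodes that frequently appear in the same walks, such as the two IPs attacking the same domain, end up with similar embeddings once the walks are used as training sentences.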

For an information flow, its source IP and destination IP are obtained, the graph database is searched for intelligence related to them, the embedded representations of the related intelligence are obtained, and a single-layer LSTM neural network is used for feature learning to obtain the associated dimension feature of the information flow. If the information flow is not hit in the graph database, a fixed vector is used to represent the non-associated dimension feature of the flow, and learning is likewise carried out through the single-layer LSTM neural network.

In some embodiments, the associated dimensional characteristics are determined as follows,

for one information flow, obtaining the source IP and the destination IP, searching the related information through a graph database,

if the correlation intelligence is searched in the graph database, the embedded expression of the correlation intelligence is obtained and the single-layer LSTM neural network is used for feature learning, so as to obtain the correlation dimension feature,

if no relevant intelligence can be searched in the graph database, a non-relevant dimension characteristic is expressed by using a fixed vector, and single-layer LSTM neural network learning is carried out.
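The lookup with its fallback can be sketched as below. The graph database is modeled as a plain adjacency dict, the embeddings as a dict of short vectors, and the all-zero fixed vector is an assumption; the disclosure only says that a fixed vector is used. The single-layer LSTM that consumes the returned sequence is not shown.

```python
EMBED_DIM = 4                        # illustrative size only
NO_MATCH_VECTOR = [0.0] * EMBED_DIM  # assumed choice of fixed vector

def related_embeddings(src_ip, dst_ip, neighbors, node_embeddings):
    """Collect embeddings of intelligence related to either endpoint of a flow.

    neighbors: dict mapping an IP to the nodes related to it in the graph.
    node_embeddings: dict mapping a node to its DeepWalk embedding.
    Returns a non-empty sequence of vectors to feed to the LSTM.
    """
    related = []
    for ip in (src_ip, dst_ip):
        for node in neighbors.get(ip, []):
            if node in node_embeddings:
                related.append(node_embeddings[node])
    # Miss in the graph database: represent the flow with the fixed vector.
    return related if related else [NO_MATCH_VECTOR]

neighbors = {"203.0.113.7": ["domain:victim.example", "family:sample_family"]}
node_embeddings = {
    "domain:victim.example": [0.1, 0.2, 0.3, 0.4],
    "family:sample_family": [0.5, 0.6, 0.7, 0.8],
}
hit = related_embeddings("203.0.113.7", "192.0.2.1", neighbors, node_embeddings)
miss = related_embeddings("198.51.100.20", "192.0.2.1", neighbors, node_embeddings)
```

Returning a one-element sequence on a miss keeps the input shape the same for both branches, so the same LSTM can process either case.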

In some embodiments, $h_{node}$ denotes the associated dimension feature and belongs to a feature matrix; it can be obtained by training with the DeepWalk algorithm, which yields the feature representation of the graph. The intelligence repository contains many associated pieces of intelligence, such as one IP attacking many other hosts or servers, and the entire intelligence database can form many such graphs. FIG. 5 shows an example of an association relationship between IPs. The graph vector representation is generated by training with the DeepWalk algorithm; for each node, the algorithm generates a vector that represents the associated dimension feature ($h_{node}$) of that node.

Multidimensional collaboration is illustratively described below.

In the present invention, the three features need to be fused. FIG. 6 shows a schematic diagram of the fusion of the three features. The three features are fused using an RNN neural network in conjunction with an attention mechanism. The attention mechanism lets the model select the features better suited to the task and down-weight irrelevant features, so that the model can better adapt to complex situations.

In some embodiments, the three features are fused as follows (equations (8) to (10) are not reproduced in this text).

Here $h_{fusion}$ is the fusion of the spatial dimension feature, the text dimension feature and the associated dimension feature after passing through the RNN neural network; [symbol not reproduced] represents a weight; $\alpha_i$ is the attention weight; $W_f$ represents a weight; $b_f$ represents an offset value; $L$ represents the length of the fused vector $h_{fusion}$; and $s_i$ is the fused feature vector after weight calculation.

For the models in some embodiments, the weight whose symbol is not reproduced is randomly initialized and is continuously updated during training and optimization, so that the parameter better fits the task and accuracy improves. $W_f$ and $b_f$ are also parameters; they are adjusted during model iteration and may be randomly initialized when the model is initialized.
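Because the fusion equations themselves are not available in this text, the following is only a generic attention-pooling sketch and should not be read as the formulation of the disclosure. It assumes that each feature vector is scored with a tanh of a linear map using a weight vector `w_f` and offset `b_f`, that the scores are normalized by softmax into attention weights, and that the fused vector is the weighted sum; all names and values are illustrative.

```python
import math

def attention_pool(vectors, w_f, b_f):
    """Generic attention pooling over a list of equal-length vectors.

    Returns (attention weights, fused vector). This is an assumed,
    textbook-style form, not the equations of the disclosure.
    """
    # Score each vector: tanh(w_f . h + b_f).
    scores = [math.tanh(sum(w * x for w, x in zip(w_f, h)) + b_f) for h in vectors]
    # Softmax over the scores, shifted by the maximum for numerical stability.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]
    # Weighted sum of the input vectors.
    dim = len(vectors[0])
    fused = [sum(a * h[j] for a, h in zip(alphas, vectors)) for j in range(dim)]
    return alphas, fused

features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # stand-ins for the three features
uniform_alphas, uniform_fused = attention_pool(features, w_f=[0.0, 0.0], b_f=0.0)
skewed_alphas, skewed_fused = attention_pool(features, w_f=[1.0, 1.0], b_f=0.0)
```

With zero weights every feature receives the same attention; once the weights are non-zero, the feature with the highest score dominates the fused vector, which is the selection effect attributed to the attention mechanism.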

In some embodiments, the probability that the IP of the network request action corresponding to the traffic belongs to the botnet is calculated as follows,

$$y = \mathrm{softmax}(W^T s_i + b) \tag{11}$$

where $y$ is a confidence score representing the probability that the IP of the network request behavior corresponding to the traffic belongs to the botnet, and $W^T$ and $b$ are learned parameters.
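Equation (11) can be worked through in plain Python as below. The two-class layout with class 1 meaning "botnet", and the example weights, are assumptions made for illustration; the disclosure does not state the number or order of the output classes.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of numbers."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def botnet_probability(s, W, b):
    """Compute y = softmax(W^T s + b) and return the botnet-class probability.

    W[j][k] weights feature j for class k, so W^T s sums over features.
    Class 0 is benign and class 1 is botnet (an assumed ordering).
    """
    logits = [
        sum(W[j][k] * s[j] for j in range(len(s))) + b[k]
        for k in range(len(b))
    ]
    return softmax(logits)[1]

s = [1.0, 2.0]  # stand-in for the fused feature vector
neutral = botnet_probability(s, W=[[0.0, 0.0], [0.0, 0.0]], b=[0.0, 0.0])
confident = botnet_probability(s, W=[[0.0, 1.0], [0.0, 1.0]], b=[0.0, 0.0])
```

In the second call the logits are 0 and 3, so the botnet probability is $e^3 / (1 + e^3)$, which is then compared with the interception threshold.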

In some embodiments, information flows whose prediction confidence is greater than the interception threshold are intercepted and recorded in the log. The DNS information of the traffic and the confidence predicted by the model are written into the threat intelligence system.

Early botnet detection methods relied mainly on approaches such as IP-based blacklists. Although such methods are highly accurate, attackers have, as the technology has evolved, evaded blacklist detection by means of domain generation algorithms (DGA). In the face of endlessly emerging botnets, an automatic detection approach is needed to detect and mine botnets on the Internet and thereby reduce the threat they pose to Internet security.

In the invention, the threat intelligence system captures open source threat intelligence, including malicious IPs, malicious URLs and malware information, on a daily schedule and enters it into the system in a uniform way, which keeps the threat intelligence timely and effective. For any network traffic, the DNS data packet is analyzed to obtain data such as the source IP and destination IP, which are matched against the threat intelligence system; if the traffic is listed as belonging to a botnet, it is judged accordingly and measures are taken. If the match fails, prediction is performed by a deep neural network, which generates a threat confidence score for the network traffic, and measures such as interception or access restriction are selected according to that score.

Compared with the prior art, the invention has the following advantages. (1) The text dimension feature, the spatial dimension feature and the associated dimension feature of botnet information flows are combined through multi-dimensional collaboration in a neural network, so that the information flows are modeled and analyzed from a global perspective and the prediction accuracy of the system is improved. (2) By combining the threat intelligence system with a deep learning detection model that fuses the three features, the method can, unlike botnet detection based on threat intelligence alone, detect data that is not recorded in the intelligence library, which improves the robustness and generalization of the system. At the same time, compared with detection by a single neural network, the threat intelligence system improves detection efficiency and hit accuracy.

Traditional botnet detection based on malicious IP lists requires a malicious IP library to be built and has difficulty identifying encrypted traffic. Botnet detection based on machine learning requires manual feature extraction and suffers from insufficient detection capability. To address these problems, the invention provides a botnet detection method based on threat intelligence and multi-dimensional collaboration, which detects botnets from multiple dimensions. The threat intelligence system captures the latest botnet intelligence every day, which keeps the malicious IP list timely and effective. For network traffic that the threat intelligence system does not hit, deep learning is used to model the traffic from three dimensions, namely the text dimension feature, the spatial dimension feature and the associated dimension feature, so that botnets are detected from a global perspective. The method does not depend on prior knowledge and does not require manual feature selection; it adapts better to unfamiliar network environments while remaining timely, and detects large-scale, complex botnets more effectively than existing methods.

Fig. 7 illustrates an exemplary configuration of a computing device 700 capable of implementing embodiments in accordance with the present disclosure.

Computing device 700 is an example of a hardware device to which the above-described aspects of the disclosure can be applied. Computing device 700 may be any machine configured to perform processing and/or computing. The computing device 700 may be, but is not limited to, a workstation, a server, a desktop computer, a laptop computer, a tablet computer, a Personal Data Assistant (PDA), a smart phone, an in-vehicle computer, or a combination thereof.

As shown in fig. 7, computing device 700 may include one or more elements that may be connected to or in communication with a bus 702 via one or more interfaces. The bus 702 may include, but is not limited to, an Industry Standard Architecture (ISA) bus, a Micro Channel Architecture (MCA) bus, an Enhanced ISA (EISA) bus, a Video Electronics Standards Association (VESA) local bus, a Peripheral Component Interconnect (PCI) bus, and the like. Computing device 700 may include, for example, one or more processors 704, one or more input devices 706, and one or more output devices 708. The one or more processors 704 may be any kind of processor and may include, but are not limited to, one or more general-purpose processors or special-purpose processors (such as special-purpose processing chips). The processor 704 may be configured to perform the method illustrated in fig. 2 or fig. 3, for example. Input device 706 may be any type of input device capable of inputting information to a computing device and may include, but is not limited to, a mouse, a keyboard, a touch screen, a microphone, and/or a remote controller. Output device 708 may be any type of device capable of presenting information and may include, but is not limited to, a display, speakers, a video/audio output terminal, a vibrator, and/or a printer.

The computing device 700 may also include or be connected to a non-transitory storage device 714, which non-transitory storage device 714 may be any non-transitory and data storage enabled storage device, and may include, but is not limited to, disk drives, optical storage devices, solid state memory, floppy disks, flexible disks, hard disks, magnetic tape, or any other magnetic medium, compact disks or any other optical medium, cache memory, and/or any other memory chip or module, and/or any other medium from which a computer can read data, instructions, and/or code. The computing device 700 may also include Random Access Memory (RAM) 710 and Read Only Memory (ROM) 712. The ROM 712 may store programs, utilities or processes to be executed in a nonvolatile manner. The RAM 710 may provide volatile data storage and store instructions related to the operation of the computing device 700. Computing device 700 may also include a network/bus interface 716 coupled to a data link 718. The network/bus interface 716 can be any type of device or system capable of enabling communication with external devices and/or networks and can include, but is not limited to, a modem, a network card, an infrared communication device, a wireless communication device, and/or a chipset (such as Bluetooth™ devices, 802.11 devices, WiFi devices, WiMax devices, cellular communications facilities, etc.).

The present disclosure may be implemented as any combination of apparatus, systems, integrated circuits, and computer programs on non-transitory computer readable media. One or more processors may be implemented as an Integrated Circuit (IC), an Application Specific Integrated Circuit (ASIC), or a large scale integrated circuit (LSI), a system LSI, or a super LSI, or as an ultra LSI package that performs some or all of the functions described in this disclosure.

The present disclosure includes the use of software, applications, computer programs or algorithms. Software, applications, computer programs, or algorithms may be stored on a non-transitory computer readable medium to cause a computer, such as one or more processors, to perform the steps described above and depicted in the figures. For example, one or more memories store software or algorithms in executable instructions and one or more processors may associate a set of instructions to execute the software or algorithms to provide various functionality in accordance with embodiments described in this disclosure.

Software and computer programs (which may also be referred to as programs, software applications, components, or code) include machine instructions for a programmable processor, and may be implemented in a high-level procedural, object-oriented, functional, logical, or assembly or machine language. The term "computer-readable medium" refers to any computer program product, apparatus or device, such as magnetic disks, optical disks, solid state storage devices, memories, and Programmable Logic Devices (PLDs), used to provide machine instructions or data to a programmable data processor, including a computer-readable medium that receives machine instructions as a computer-readable signal.

By way of example, computer-readable media can comprise Dynamic Random Access Memory (DRAM), Random Access Memory (RAM), Read Only Memory (ROM), electrically erasable read only memory (EEPROM), compact disk read only memory (CD-ROM) or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to carry or store desired computer-readable program code in the form of instructions or data structures and that can be accessed by a general-purpose or special-purpose computer or a general-purpose or special-purpose processor. Disk and disc, as used herein, include Compact Disc (CD), laser disc, optical disc, Digital Versatile Disc (DVD), floppy disk and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above are also included within the scope of computer-readable media.

The subject matter of the present disclosure is provided as examples of apparatus, systems, methods, and programs for performing the features described in the present disclosure. However, other features or variations are contemplated in addition to the above-described features. It is contemplated that the implementation of the components and functions of the present disclosure may be accomplished with any emerging technology that may replace the technology of any of the implementations described above.

Additionally, the above description provides examples, and does not limit the scope, applicability, or configuration set forth in the claims. Changes may be made in the function and arrangement of elements discussed without departing from the spirit and scope of the disclosure. Various embodiments may omit, substitute, or add various procedures or components as appropriate. For example, features described with respect to certain embodiments may be combined in other embodiments.

Similarly, while operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. In some cases, multitasking and parallel processing may be advantageous.

Claims

1. A method of detecting botnets, comprising the steps of:

2. The method of claim 1, further comprising:

performing data capture on an open source threat information website by using a crawler script, acquiring threat information related to a botnet or malicious IPs, and importing the threat information into the threat information library.

3. The method of claim 1, further comprising:

performing data preprocessing on the information flow to form the data formats required for determining the text dimension feature of the flow through the BiLSTM algorithm, determining the spatial dimension feature of the flow through the CNN algorithm and determining the associated dimension feature of the flow through the DeepWalk algorithm.

4. The method of claim 1, wherein the traffic is intercepted if the IP of the network request behavior corresponding to the traffic is determined to belong to a botnet, and otherwise the traffic is released.

5. The method of claim 1, wherein the text dimension characteristic is determined as follows,

$$e_1, e_2, \ldots, e_n = E^T \cdot (p_1, p_2, \ldots, p_n) \tag{1}$$

$$h_{1\_lstm}, h_{2\_lstm}, \ldots, h_{n\_lstm} = \mathrm{BiLSTM}(e_1, e_2, \ldots, e_n) \tag{2}$$

wherein $e_n$ is the embedded vector of each byte, $E^T$ is the embedding matrix, $p_n$ is the one-hot encoding of each byte, and $h_{n\_lstm}$ is the hidden-layer representation of each embedded vector after passing through the BiLSTM neural network.

6. The method of claim 1, wherein the spatial dimension characteristic is determined in the following manner,

$$h_1, h_2, \ldots, h_n = \mathrm{Conv2d}(x_1, x_2, \ldots, x_n) \tag{3}$$

$$m_1, m_2, \ldots, m_n = \text{mean-pooling}(h_1, h_2, \ldots, h_n) \tag{4}$$

$$h'_1, h'_2, \ldots, h'_n = \mathrm{Conv2d}(m_1, m_2, \ldots, m_n) \tag{5}$$

$$m'_1, m'_2, \ldots, m'_n = \text{max-pooling}(h'_1, h'_2, \ldots, h'_n) \tag{6}$$

$$h_{1\_cnn}, h_{2\_cnn}, \ldots, h_{n\_cnn} = W^T (m'_1, m'_2, \ldots, m'_n) + b \tag{7}$$

7. The method of claim 1, wherein the associated dimensional features are determined in the following manner,

if the correlation intelligence is searched in the graph database, the embedded expression of the correlation intelligence is obtained and the single-layer LSTM neural network is used for feature learning, so that the correlation dimension feature is obtained,

8. The method of claim 1, wherein the probability that the IP of the network request behavior corresponding to the traffic belongs to the botnet is calculated as follows,

$$y = \mathrm{softmax}(W^T s_i + b) \tag{11}$$

wherein [symbol not reproduced] represents a weight; $\alpha_i$ is the attention weight; $W_f$ represents a weight; $b_f$ represents an offset value; $L$ represents the length of the fused vector $h_{fusion}$; $s_i$ is the fused feature vector obtained after weight calculation; $y$ is a confidence score representing the probability that the IP of the network request behavior corresponding to the traffic belongs to the botnet; and $W^T$ and $b$ are learned parameters.

9. An apparatus for detecting botnets, comprising:

a memory having instructions stored thereon; and

a processor configured to execute instructions stored on the memory to perform the method of any of claims 1 to 8.

10. A computer-readable storage medium comprising computer-executable instructions that, when executed by one or more processors, cause the one or more processors to perform the method of any one of claims 1-8.