CN115118491B - Botnet detection method, device, electronic equipment and readable storage medium - Google Patents

Publication number
CN115118491B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210731247.5A
Other languages
Chinese (zh)
Other versions
CN115118491A (en
Inventor
刘柱
鲍青波
张楠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Topsec Technology Co Ltd
Beijing Topsec Network Security Technology Co Ltd
Beijing Topsec Software Co Ltd
Original Assignee
Beijing Topsec Technology Co Ltd
Beijing Topsec Network Security Technology Co Ltd
Beijing Topsec Software Co Ltd
Application filed by Beijing Topsec Technology Co Ltd, Beijing Topsec Network Security Technology Co Ltd, Beijing Topsec Software Co Ltd
Priority to CN202210731247.5A
Publication of CN115118491A
Application granted
Publication of CN115118491B

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416 Event detection, e.g. attack signature detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1425 Traffic logging, e.g. anomaly detection
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1441 Countermeasures against malicious traffic
    • H04L63/145 Countermeasures against malicious traffic the attack involving the propagation of malware through the network, e.g. viruses, trojans or worms
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L2463/00 Additional details relating to network architectures or network communication protocols for network security covered by H04L63/00
    • H04L2463/144 Detection or countermeasures against botnets

Abstract

The application belongs to the technical field of communication and discloses a botnet detection method, a botnet detection device, electronic equipment and a readable storage medium, wherein the method comprises the steps of obtaining network original data, wherein the network original data are transmission data among different network nodes; based on network original data and a pre-trained network detection model, network address information of botnet nodes is obtained, and the network detection model is constructed based on a graph neural network and an attention mechanism. Therefore, a network detection model constructed based on the graph neural network and the attention mechanism is adopted to detect the botnet, so that the detection accuracy is improved.

Description

Botnet detection method, device, electronic equipment and readable storage medium
Technical Field
The present application relates to the field of communications technologies, and in particular, to a method, an apparatus, an electronic device, and a readable storage medium for botnet detection.
Background
Botnets are overlay networks for malicious activities, which are formed by a large number of bot hosts infected by bots under the control of attackers, and are generally composed of attackers, command and control channels, and bot hosts. Botnets can launch cyber attacks such as distributed denial of service, spam, phishing, click fraud, and theft of sensitive information, and have been considered one of the most serious threats to many industries such as finance, education, and medicine.
In the prior art, botnet detection is generally performed based on a neural network, but the accuracy of a detection result is low.
Disclosure of Invention
The embodiment of the application aims to provide a botnet detection method, device, electronic equipment and readable storage medium, which are used for improving detection accuracy when botnet detection is performed.
In one aspect, a method for botnet detection is provided, comprising:
acquiring network original data, wherein the network original data are transmission data among different network nodes;
based on network original data and a pre-trained network detection model, network address information of botnet nodes is obtained, and the network detection model is constructed based on a graph neural network and an attention mechanism.
In the implementation process, the botnet detection is performed by adopting a network detection model constructed based on the graph neural network and the attention mechanism, so that the detection accuracy is improved.
In one embodiment, obtaining network address information of a botnet node based on network raw data and a pre-trained network detection model includes:
analyzing each data packet in the network original data to respectively obtain the communication characteristics of each data packet, wherein the communication characteristics comprise a source network address and a destination network address;
dividing each data packet based on the communication characteristics of each data packet to obtain divided data streams, wherein the data streams represent sessions between two network nodes;
constructing a graph structure network among the network nodes according to the source network address and the destination network address of each data stream, wherein the source network address and the destination network address are used for determining the relationship among different network nodes in the graph structure network;
based on the graph structure network and the network detection model, network address information of the botnet nodes is obtained.
In the implementation process, detection is performed based on the network connection relation of each network node in the network original data.
In one embodiment, obtaining network address information of a botnet node based on a graph structure network and a network detection model includes:
acquiring botnet seed nodes, wherein the botnet seed nodes represent an invaded botnet host;
based on the botnet seed nodes, extracting local communities in the graph structural network, wherein the local communities are sub-networks in the graph structural network;
based on the local communities and the network detection model, network address information of the botnet nodes is obtained.
In the implementation process, local communities are extracted based on botnet seed nodes, so that the data processing amount is reduced, and the efficiency is improved.
In one embodiment, before obtaining the network address information of the botnet node based on the local community and the network detection model, the method further comprises:
if the local communities are multiple, determining the similarity between each network node and each botnet seed node according to the network distance between each network node and each botnet seed node in the local communities;
and merging the local communities according to the similarities to obtain merged local communities.
In the implementation process, the local communities are combined, so that subsequent detection can be facilitated.
In one embodiment, obtaining network address information of a botnet node based on a local community and a network detection model includes:
adopting an attention mechanism, respectively determining weights between every two network nodes based on data transmission characteristics among network nodes in the local community, wherein the data transmission characteristics are determined according to the sizes of data packets transmitted among different network nodes;
determining respective risk scores of the network nodes according to the local communities and weights between every two network nodes;
determining the network node with the risk score higher than the risk threshold as a botnet node;
obtaining network address information of the botnet node.
In the implementation process, detection is performed based on the weight, so that the detection accuracy is improved.
In one embodiment, the communication features further include a source port, a destination port, and a transport protocol for the data packet;
the data transmission characteristics include at least one of the following parameters: session time, packet size, total number of bytes transmitted, average packet length, standard deviation of packet length, maximum byte transmission rate, and packet transmission rate;
the network detection model is obtained based on positive sample data and negative sample data training, the negative sample data is obtained based on network nodes in the local community, and the positive sample data is obtained based on network nodes outside the local community.
In one embodiment, before obtaining the network address information of the botnet node based on the network raw data and the pre-trained network detection model, the method further comprises:
removing a data stream meeting any one of the following set security conditions from network original data:
the white list contains the destination network address of the data flow;
the transmission protocol of the data stream is a non-appointed transmission protocol;
the transmission time length of the data stream is less than the set time length;
the data flow does not meet the successful establishment condition of the session;
the transmission direction of the data stream is a non-set transmission direction.
In the implementation process, the data is filtered, so that the data processing amount is reduced.
In one aspect, an apparatus for botnet detection is provided, comprising:
the acquisition unit is used for acquiring network original data, wherein the network original data is transmission data among different network nodes;
the detection unit is used for obtaining the network address information of the botnet nodes based on the network original data and a pre-trained network detection model, and the network detection model is constructed based on the graph neural network and the attention mechanism.
In one embodiment, the detection unit is configured to:
analyzing each data packet in the network original data to respectively obtain the communication characteristics of each data packet, wherein the communication characteristics comprise a source network address and a destination network address;
dividing each data packet based on the communication characteristics of each data packet to obtain divided data streams, wherein the data streams represent sessions between two network nodes;
constructing a graph structure network among the network nodes according to the source network address and the destination network address of each data stream, wherein the source network address and the destination network address are used for determining the relationship among different network nodes in the graph structure network;
based on the graph structure network and the network detection model, network address information of the botnet nodes is obtained.
In one embodiment, the detection unit is configured to:
acquiring botnet seed nodes, wherein the botnet seed nodes represent an invaded botnet host;
based on the botnet seed nodes, extracting local communities in the graph structural network, wherein the local communities are sub-networks in the graph structural network;
based on the local communities and the network detection model, network address information of the botnet nodes is obtained.
In one embodiment, the detection unit is further configured to:
if the local communities are multiple, determining the similarity between each network node and each botnet seed node according to the network distance between each network node and each botnet seed node in the local communities;
and merging the local communities according to the similarities to obtain merged local communities.
In one embodiment, the detection unit is configured to:
adopting an attention mechanism, respectively determining weights between every two network nodes based on data transmission characteristics among network nodes in the local community, wherein the data transmission characteristics are determined according to the sizes of data packets transmitted among different network nodes;
determining respective risk scores of the network nodes according to the local communities and weights between every two network nodes;
determining the network node with the risk score higher than the risk threshold as a botnet node;
network address information of the botnet node is obtained.
In one embodiment, the communication features further include a source port, a destination port, and a transport protocol for the data packet;
the data transmission characteristics include at least one of the following parameters: session time, packet size, total number of bytes transmitted, average packet length, standard deviation of packet length, maximum byte transmission rate, and packet transmission rate;
the network detection model is obtained based on positive sample data and negative sample data training, the negative sample data is obtained based on network nodes in the local community, and the positive sample data is obtained based on network nodes outside the local community.
In one embodiment, the detection unit is further configured to:
removing a data stream meeting any one of the following set security conditions from network original data:
the white list contains the destination network address of the data flow;
the transmission protocol of the data stream is a non-appointed transmission protocol;
the transmission time length of the data stream is less than the set time length;
the data flow does not meet the successful establishment condition of the session;
the transmission direction of the data stream is a non-set transmission direction.
In one aspect, an electronic device is provided that includes a processor and a memory storing computer-readable instructions that, when executed by the processor, perform the steps of the method provided in any of the alternative implementations of botnet detection described above.
In one aspect, a computer-readable storage medium is provided, having stored thereon a computer program which, when executed by a processor, performs the steps of the method provided in any of the alternative implementations of botnet detection described above.
In one aspect, a computer program product is provided that, when run on a computer, causes the computer to perform the steps of the method provided in various alternative implementations of botnet detection as described above.
Additional features and advantages of the application will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the application. The objectives and other advantages of the application will be realized and attained by the structure particularly pointed out in the written description and claims thereof as well as the appended drawings.
Drawings
To illustrate the technical solutions of the embodiments of the present application more clearly, the drawings needed in the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present application and should not be considered as limiting its scope; a person skilled in the art may obtain other related drawings from these drawings without inventive effort.
FIG. 1 is a flowchart of a method for botnet detection according to an embodiment of the present disclosure;
FIG. 2 is a detailed flow chart of a botnet detection method provided in an embodiment of the present application;
FIG. 3 is a block diagram of a botnet detection device according to an embodiment of the present disclosure;
FIG. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and completely with reference to the drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. The components of the embodiments of the present application, which are generally described and illustrated in the figures herein, may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, as provided in the accompanying drawings, is not intended to limit the scope of the application, as claimed, but is merely representative of selected embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present application without making any inventive effort, are intended to be within the scope of the present application.
Some of the terms referred to in the embodiments of the present application will be described first to facilitate understanding by those skilled in the art.
Terminal device: may be a mobile terminal, a stationary terminal, or a portable terminal, for example a mobile handset, a site, a unit, a device, a multimedia computer, a multimedia tablet, an internet node, a communicator, a desktop computer, a laptop computer, a notebook computer, a netbook computer, a tablet computer, a personal communications system device, a personal navigation device, a personal digital assistant, an audio/video player, a digital camera/camcorder, a positioning device, a television receiver, a radio broadcast receiver, an electronic book device, a game device, or any combination thereof, including the accessories and peripherals of these devices. It is also contemplated that the terminal device can support any type of user interface (e.g., a wearable device).
Server: may be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, big data, and artificial intelligence platforms.
In order to improve detection accuracy when botnet detection is performed, the embodiment of the application provides a method, a device, electronic equipment and a readable storage medium for botnet detection.
Referring to FIG. 1, which is a flowchart of a botnet detection method provided by an embodiment of the present application, the method may be applied to an electronic device, which may be a server or a terminal device. The specific implementation flow of the method is as follows:
Step 100: acquiring network original data, wherein the network original data are transmission data among different network nodes;
Step 101: obtaining network address information of botnet nodes based on the network original data and a pre-trained network detection model, wherein the network detection model is constructed based on a graph neural network and an attention mechanism.
As an example, when acquiring the network original data in step 100, Wireshark may be used to capture data packets in the network, thereby obtaining the network original data.
Wireshark is a network packet capture tool that captures network packets and displays the packet data in as much detail as possible.
Furthermore, network original data that meets a set security condition (that is, legitimate data that does not need to be detected) can be filtered out to reduce the subsequent computation.
In one embodiment, a data stream meeting any of the following set security conditions is removed from the network raw data:
the white list (for example, the network addresses of legitimate servers such as those of Google and Baidu) contains the destination network address of the data stream;
the transmission protocol of the data stream is not the specified transmission protocol;
the transmission duration of the data stream is shorter than the set duration (e.g., 1 s);
the data stream does not meet the condition for successful session establishment (i.e., the session was not fully established);
the transmission direction of the data stream is not the set transmission direction (for example, the set transmission direction is transmission from an intranet host to an extranet host).
As one example, the specified transmission protocol is the Transmission Control Protocol (TCP). Protocol analysis is performed on the data stream to obtain its transmission protocol, and if the transmission protocol is not TCP, the data stream is determined to meet the set security condition.
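The filtering conditions above can be sketched as follows. This is a minimal illustration rather than the implementation described in the application: the `Flow` record, its field names, and the default values (TCP as the specified protocol, 1 s as the set duration, intranet-to-extranet as the set direction) are assumptions taken from the examples in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    # Hypothetical flow record; the field names are illustrative only.
    src_ip: str
    dst_ip: str
    protocol: str       # e.g. "TCP"
    duration_s: float   # transmission duration in seconds
    established: bool   # True if the session was fully established
    outbound: bool      # True if sent from an intranet host to an extranet host


def is_safe(flow, whitelist, min_duration_s=1.0, required_protocol="TCP"):
    """Return True if the flow meets ANY of the set security conditions."""
    return (
        flow.dst_ip in whitelist                # destination is whitelisted
        or flow.protocol != required_protocol   # not the specified protocol
        or flow.duration_s < min_duration_s     # shorter than the set duration
        or not flow.established                 # session never fully established
        or not flow.outbound                    # not the set transmission direction
    )


def filter_flows(flows, whitelist, **kwargs):
    """Keep only the flows that still need to be inspected."""
    return [f for f in flows if not is_safe(f, whitelist, **kwargs)]
```

A flow is removed as soon as one condition holds, which matches the "any one of the following" wording.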
To improve the detection accuracy, the following steps may be used when step 101 is performed:
S1011: parsing each data packet in the network original data (such as TCP audit log data) to obtain the communication characteristics of each data packet;
S1012: dividing the data packets based on their communication characteristics to obtain divided data streams;
S1013: constructing a graph structure network among the network nodes according to the source network address and the destination network address of each data stream, wherein the source network address and the destination network address are used for determining the relationship among different network nodes in the graph structure network;
S1014: obtaining network address information of the botnet nodes based on the graph structure network and the network detection model.
As one example, the communication characteristics include a source network address and a destination network address. Further, the communication characteristics may also include the source port, destination port, and transmission protocol of the data packet. For ease of description, the source network address, destination network address, source port, destination port, and transmission protocol may be referred to as a five-tuple. The network address information may be an Internet Protocol (IP) address.
A data stream represents a session between two network nodes. All data packets transmitted in the same session have the same five-tuple, and whether the session has ended can be determined from a session end identifier or by other means. Thus, when performing S1012, the set of data packets having the same five-tuple within the session time (as an example, the session time is determined by the session end identifier of the data packets) is taken as the set of all data packets transmitted between one network node (i.e., host) and another network node within the session time.
Furthermore, to facilitate botnet detection in subsequent steps, when parsing each data packet in S1011, the size of each data packet may also be obtained, and the data transmission characteristics between each pair of network nodes may be derived from the packet sizes.
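The division in S1012 can be sketched as below. The `Packet` record and its `end_of_session` flag are hypothetical stand-ins for the parsed communication characteristics and the session end identifier; the text does not specify how that identifier is encoded.

```python
from collections import namedtuple

# Hypothetical parsed-packet record; field names are illustrative only.
Packet = namedtuple(
    "Packet",
    "src_ip dst_ip src_port dst_port protocol size timestamp end_of_session",
)


def split_into_flows(packets):
    """Group packets that share a five-tuple into flows (sessions)."""
    open_flows = {}  # five-tuple -> packets of the session still in progress
    finished = []    # completed flows, in the order they were closed
    for pkt in sorted(packets, key=lambda p: p.timestamp):
        key = (pkt.src_ip, pkt.dst_ip, pkt.src_port, pkt.dst_port, pkt.protocol)
        open_flows.setdefault(key, []).append(pkt)
        if pkt.end_of_session:
            # The end identifier closes the session; a later packet with the
            # same five-tuple starts a new flow.
            finished.append(open_flows.pop(key))
    # Sessions without an end identifier are returned as they stand.
    finished.extend(open_flows.values())
    return finished
```

The sketch treats the five-tuple as directional, following the statement that all packets of a session share the same five-tuple.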
As one example, the data transmission characteristics include at least one of the following parameters: session time, packet size, total number of bytes transmitted, average packet length, standard deviation of packet length, maximum byte transmission rate, and packet transmission rate.
The session time is the duration of one session. The packet size is the size of each packet in the data stream. The total number of bytes transmitted is the total number of bytes transmitted in a given data stream. The average packet length is the mean of the packet lengths in a given data stream. The standard deviation of the packet length is the standard deviation of the packet lengths in a given data stream. The maximum byte transmission rate is the maximum value of the byte transmission rate (i.e., the number of bytes transmitted per second) in a given data stream. The packet transmission rate is the rate at which packets are transmitted in a given data stream (i.e., the number of packets transmitted per second).
To build the graph structure network in S1013, the source network address and the destination network address of each data stream may be used as network nodes, and the source network address and the destination network address of each data stream may be connected to generate the graph structure network. For convenience of the following description, the connection lines between different network nodes in the graph structure network are referred to as edges. The graph structure network may also include the data transmission characteristics associated with each edge.
To reduce computational complexity and improve detection efficiency, obtaining the network address information of the botnet nodes based on the graph structure network and the network detection model in S1014 may include the following steps: acquiring botnet seed nodes, wherein the botnet seed nodes represent an invaded botnet host; extracting local communities in the graph structure network based on the botnet seed nodes by adopting a local community algorithm, wherein the local communities are sub-networks in the graph structure network; and obtaining network address information of the botnet nodes based on the local communities and the network detection model.
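As a rough illustration of these definitions, the per-stream statistics could be computed as follows. The function name, the one-second buckets used for the maximum byte rate, and the use of the population standard deviation are assumptions; the text does not fix any of them.

```python
import statistics


def flow_features(sizes, timestamps):
    """Compute per-flow statistics.

    sizes: packet sizes in bytes; timestamps: matching arrival times in seconds.
    """
    start = min(timestamps)
    duration = max(timestamps) - start
    # Bytes observed in each one-second bucket, used for the maximum byte rate.
    bytes_per_second = {}
    for size, ts in zip(sizes, timestamps):
        bucket = int(ts - start)
        bytes_per_second[bucket] = bytes_per_second.get(bucket, 0) + size
    return {
        "session_time": duration,
        "total_bytes": sum(sizes),
        "mean_packet_length": statistics.mean(sizes),
        "std_packet_length": statistics.pstdev(sizes),  # population std (assumed)
        "max_byte_rate": max(bytes_per_second.values()),
        # Fall back to the packet count when all packets share one timestamp.
        "packet_rate": len(sizes) / duration if duration > 0 else float(len(sizes)),
    }
```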
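A minimal sketch of the graph construction in S1013 is given below, assuming each data stream has already been reduced to a `(src_ip, dst_ip, features)` tuple. Treating edges as undirected is an assumption; the text only says the two addresses are connected.

```python
from collections import defaultdict


def build_graph(flows):
    """Build an undirected graph from (src_ip, dst_ip, features) tuples.

    Returns an adjacency map and, for each edge, the list of per-flow
    data transmission characteristics observed on it.
    """
    adjacency = defaultdict(set)
    edge_features = defaultdict(list)
    for src, dst, features in flows:
        adjacency[src].add(dst)
        adjacency[dst].add(src)
        # frozenset makes (a, b) and (b, a) refer to the same edge.
        edge_features[frozenset((src, dst))].append(features)
    return adjacency, edge_features
```

Several data streams between the same pair of hosts accumulate on one edge, so their characteristics can later be averaged or stacked.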
It should be noted that a community is a local structure whose internal connections are dense, while the connections between different local structures are sparse. As one example, a local community discovery algorithm (i.e., one of the local community algorithms) may be employed to discover community structures in the graph structure network. The local community discovery algorithm may also be regarded as a clustering algorithm. Because botnet hosts are controlled by the same server, botnet nodes in the graph structure network generally cluster together, so local communities can be extracted through a local community discovery algorithm to preliminarily locate the botnet nodes. A traditional community discovery algorithm (e.g., Louvain) computes over the whole graph structure network, which is computationally expensive. In the embodiment of the application, a local community discovery algorithm (e.g., a local community algorithm based on Personalized PageRank) is adopted to perform community discovery based on the botnet seed nodes (i.e., known botnet nodes), so that local communities around the botnet seed nodes are obtained and the computation required for community discovery is reduced.
Further, if there are a plurality of local communities, similar local communities may be merged to facilitate subsequent calculation and detection. In one embodiment, the similarity between each network node and each botnet seed node is determined according to the network distance between them in the local communities, and the local communities are merged according to the similarities, so that hierarchical clustering is performed on the local communities to obtain the final (i.e., merged) local communities.
As an example, the similarity vector of each node is obtained with the Personalized PageRank-based local community algorithm, the similarity vectors are assembled into a matrix, and the local communities are merged based on the matrix to obtain the merged local communities. A similarity vector represents the similarity between a given node and each node.
To detect botnet nodes, obtaining the network address information of the botnet nodes based on the local community and the network detection model in S1014 may include:
adopting an attention mechanism, respectively determining weights between every two network nodes (which may also be every two adjacent network nodes) based on data transmission characteristics among the network nodes in the local community, wherein the data transmission characteristics are determined according to the sizes of data packets transmitted among different network nodes;
determining respective risk scores of the network nodes according to the local communities and the weights between every two network nodes;
determining a network node with a risk score above a risk threshold (e.g., 0.5) as a botnet node;
obtaining network address information of the botnet node.
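One way to picture seed-based local community extraction is a Personalized PageRank random walk that restarts at the seed node, followed by a score cut-off. The sketch below is an assumption-laden illustration: the damping factor `alpha`, the iteration count, and the `threshold` rule are not given in the text, and `adjacency` is assumed to be a symmetric map in which every node appears as a key.

```python
def personalized_pagerank(adjacency, seed, alpha=0.85, iterations=50):
    """Power iteration for a random walk that restarts at `seed`."""
    nodes = list(adjacency)
    scores = {n: 0.0 for n in nodes}
    scores[seed] = 1.0
    for _ in range(iterations):
        nxt = {n: 0.0 for n in nodes}
        for n in nodes:
            neighbours = adjacency[n]
            if not neighbours:
                # A node without edges hands its mass back to the seed.
                nxt[seed] += alpha * scores[n]
                continue
            share = alpha * scores[n] / len(neighbours)
            for m in neighbours:
                nxt[m] += share
        nxt[seed] += 1.0 - alpha  # restart probability
        scores = nxt
    return scores


def local_community(adjacency, seed, threshold=0.05, **kwargs):
    """Nodes whose score relative to the seed reaches the (assumed) threshold."""
    scores = personalized_pagerank(adjacency, seed, **kwargs)
    return {n for n, s in scores.items() if s >= threshold}
```

Only nodes reachable from the seed receive any score, which is why the walk stays local instead of clustering the whole graph.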
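The merging step can be illustrated with a simple threshold-based, single-linkage grouping. This is only a sketch: representing each community by one similarity vector, comparing vectors by cosine similarity, and the `min_similarity` value are all assumptions, since the text does not state how the matrix is clustered.

```python
import math


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(u.get(k, 0.0) * v.get(k, 0.0) for k in set(u) | set(v))
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def merge_communities(communities, vectors, min_similarity=0.5):
    """Merge communities whose similarity vectors are close.

    communities: list of node sets; vectors: one similarity vector per community.
    """
    parent = list(range(len(communities)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(communities)):
        for j in range(i + 1, len(communities)):
            if cosine(vectors[i], vectors[j]) >= min_similarity:
                parent[find(i)] = find(j)

    merged = {}
    for i, community in enumerate(communities):
        merged.setdefault(find(i), set()).update(community)
    return list(merged.values())
```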
In one embodiment, the data transmission characteristic between any two network nodes may be a matrix of data transmission characteristics of each data flow between the two network nodes, or an average value (e.g., an average value of data packet sizes) of data transmission characteristics of each data flow between the two network nodes.
As one example, the network nodes are ordered in order of high-to-low risk score, network nodes with risk scores above a risk threshold (e.g., 0.5) are determined to be botnet nodes, and the IP of the botnet nodes is obtained and output. In practical application, the network address information may be set according to a practical application scenario, which is not limited herein.
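The ranking and thresholding step is small enough to sketch directly. The function name and the shape of `risk_scores` (a mapping from IP address to score) are illustrative assumptions; 0.5 is the example threshold from the text.

```python
def flag_botnet_nodes(risk_scores, threshold=0.5):
    """Return the IPs whose risk score exceeds the threshold, highest first."""
    ranked = sorted(risk_scores.items(), key=lambda item: item[1], reverse=True)
    return [ip for ip, score in ranked if score > threshold]
```

A node whose score equals the threshold is not flagged, following the "above a risk threshold" wording.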
It should be noted that the network detection model is constructed based on a graph neural network and an attention mechanism. A graph neural network is a neural network model developed for deep learning on graph-structured data (e.g., a graph structure network). It can exploit not only the data transmission characteristics of each network node in the graph structure network but also the topology of the graph structure network. In the embodiment of the application, a graph attention network (GAT) is used: an attention layer is added to the conventional graph neural network (GNN) framework so that weights between adjacent network nodes can be learned and different network nodes are treated differently. Consequently, when botnet nodes are detected based on the aggregated network nodes, more attention is paid to network nodes with a larger effect (i.e., a higher weight), and network nodes with a smaller effect are ignored.
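To make the attention step concrete, the following sketch computes attention coefficients in the style of the commonly published single-head GAT formulation: a LeakyReLU score from the two endpoint feature vectors, followed by a softmax over each node's neighbors. It assumes node feature vectors that have already been linearly transformed, and it is not the application's trained model; `a_src`, `a_dst` and both function names are hypothetical.

```python
from math import exp


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def _dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def attention_weights(h, neighbors, a_src, a_dst):
    """Compute softmax-normalised attention weights alpha[i][j].

    h:         node -> feature vector (assumed already transformed)
    neighbors: node -> list of adjacent nodes
    a_src, a_dst: the two halves of the attention vector, so that
                  score(i, j) = LeakyReLU(a_src . h[i] + a_dst . h[j])
    """
    alpha = {}
    for i, nbrs in neighbors.items():
        if not nbrs:
            continue  # isolated node: nothing to attend to
        scores = {j: leaky_relu(_dot(a_src, h[i]) + _dot(a_dst, h[j])) for j in nbrs}
        top = max(scores.values())  # subtract the max for numerical stability
        exps = {j: exp(s - top) for j, s in scores.items()}
        total = sum(exps.values())
        alpha[i] = {j: e / total for j, e in exps.items()}
    return alpha


def aggregate(h, alpha):
    """Weighted sum of neighbor features using the attention weights."""
    return {
        i: [sum(w[j] * h[j][d] for j in w) for d in range(len(h[i]))]
        for i, w in alpha.items()
    }
```

Neighbors that receive a high score dominate the aggregated vector, which is the sense in which nodes with a larger effect receive more attention.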
In one embodiment, the network detection model is trained based on positive sample data and negative sample data, the negative sample data being obtained based on network nodes in the local community, the positive sample data being obtained based on network nodes outside the local community.
As one example, known botnet nodes are collected, community discovery is performed on network sample data based on the botnet nodes, a plurality of local communities are obtained, and network nodes in the local communities are sampled, so that negative sample nodes and botnet node labels, namely negative sample data, are obtained; sampling from network nodes outside the local community to obtain positive sample nodes and normal network node labels, namely positive sample data, and training the network detection model based on the positive sample data and the negative sample data to obtain a trained network detection model.
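The automatic labelling described above can be sketched as follows. Nodes inside any local community receive the botnet label (called negative sample data in this application), and nodes outside receive the normal label (called positive sample data). The function name, the numeric labels 1 and 0, and the `per_class` and `seed` parameters are assumptions made for illustration.

```python
import random


def build_training_labels(all_nodes, local_communities, per_class, seed=0):
    """Sample labelled training nodes without manual annotation.

    Returns a dict node -> label, with 1 for nodes sampled inside a local
    community (botnet label) and 0 for nodes sampled outside (normal label).
    """
    inside = set().union(*local_communities)
    outside = set(all_nodes) - inside
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    bots = rng.sample(sorted(inside), min(per_class, len(inside)))
    normal = rng.sample(sorted(outside), min(per_class, len(outside)))
    labels = {n: 1 for n in bots}
    labels.update({n: 0 for n in normal})
    return labels
```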
Referring to fig. 2, which is a detailed flowchart of a botnet detection method provided in an embodiment of the present application, the method of fig. 1 is described in detail below with reference to fig. 2. The specific implementation flow of the method is as follows:
Step 200: Acquire network original data.
Step 201: Filter the network original data.
Step 202: Analyze each data packet in the network original data to obtain the communication characteristics of each data packet.
Step 203: Construct a graph structure network according to the communication characteristics of each data packet.
Step 204: Based on the botnet seed nodes, perform community discovery on the graph structure network to obtain a plurality of local communities.
Step 205: Merge the local communities.
Step 206: Input the merged local communities into the network detection model to obtain the risk score of each network node.
Step 207: Take each network node with a risk score higher than the risk threshold as a botnet node, and obtain the network address information of the botnet node.
Specifically, when steps 200-207 are executed, specific steps refer to steps 100-101, and detailed descriptions are omitted herein.
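Steps 202 and 203 can be sketched as below. The sketch assumes that packets are already parsed into dictionaries, that a data stream is identified by the directed five-tuple of source address, destination address, source port, destination port and protocol, and that the graph is stored as an undirected adjacency map. The field and function names are hypothetical.

```python
from collections import defaultdict


def build_flows(packets):
    """Group parsed packets into data streams keyed by their five-tuple."""
    flows = defaultdict(list)
    for p in packets:
        key = (p["src"], p["dst"], p["sport"], p["dport"], p["proto"])
        flows[key].append(p)
    return flows


def build_graph(flows):
    """Connect the source and destination address of every data stream."""
    adj = defaultdict(set)
    for src, dst, *_rest in flows:
        adj[src].add(dst)
        adj[dst].add(src)
    return adj
```

The resulting adjacency map is the structure on which community discovery (step 204) would then operate.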
According to the embodiment of the application, filtering the network original data reduces the calculation amount of subsequent data processing. Extracting the local communities of the graph structure network built from the original network through the botnet seed nodes reduces the calculation amount of community discovery and, in turn, of the subsequent botnet detection based on the network detection model, which improves detection efficiency. Characteristics between adjacent network nodes are extracted based on the attention mechanism and the graph neural network, and each network node is treated differently according to its importance (namely its weight), which improves detection accuracy. Furthermore, when the model is trained, training data is acquired based on the local communities, so manual labeling is not needed and the labor and time costs are greatly reduced.
Based on the same inventive concept, the embodiments of the present application further provide a botnet detection device, and because the principle of solving the problem by using the device and the equipment is similar to that of a botnet detection method, the implementation of the device can refer to the implementation of the method, and the repetition is omitted.
Fig. 3 is a schematic structural diagram of a botnet detection device according to an embodiment of the present application, including:
an acquiring unit 301, configured to acquire network original data, where the network original data is transmission data between different network nodes;
a detection unit 302, configured to obtain network address information of a botnet node based on the network original data and a pre-trained network detection model, where the network detection model is constructed based on a graph neural network and an attention mechanism.
In one embodiment, the detection unit 302 is configured to:
analyzing each data packet in the network original data to respectively obtain the communication characteristics of each data packet, wherein the communication characteristics comprise a source network address and a destination network address;
dividing each data packet based on the communication characteristics of each data packet to obtain divided data streams, wherein the data streams represent sessions between two network nodes;
Constructing a graph structure network among the network nodes according to the source network address and the destination network address of each data stream, wherein the source network address and the destination network address are used for determining the relationship among different network nodes in the graph structure network;
based on the graph structure network and the network detection model, network address information of the botnet nodes is obtained.
In one embodiment, the detection unit 302 is configured to:
acquiring botnet seed nodes, wherein the botnet seed nodes represent an invaded botnet host;
based on the botnet seed nodes, extracting local communities in the graph structural network, wherein the local communities are sub-networks in the graph structural network;
based on the local communities and the network detection model, network address information of the botnet nodes is obtained.
In one embodiment, the detection unit 302 is further configured to:
if the local communities are multiple, determining the similarity between each network node and each botnet seed node according to the network distance between each network node and each botnet seed node in the local communities;
and merging the local communities according to the similarities to obtain merged local communities.
In one embodiment, the detection unit 302 is configured to:
Adopting an attention mechanism, respectively determining weights between every two network nodes based on data transmission characteristics among network nodes in the local community, wherein the data transmission characteristics are determined according to the sizes of data packets transmitted among different network nodes;
determining respective risk scores of the network nodes according to the local communities and weights between every two network nodes;
determining the network node with the risk score higher than the risk threshold as a botnet node;
network address information of the botnet node is obtained.
In one embodiment, the communication features further include a source port, a destination port, and a transport protocol for the data packet;
the data transmission characteristics include at least one of the following parameters: session time, packet size, total number of bytes transmitted, average packet length, standard deviation of packet length, maximum byte transmission rate, and packet transmission rate;
the network detection model is obtained based on positive sample data and negative sample data training, the negative sample data is obtained based on network nodes in the local community, and the positive sample data is obtained based on network nodes outside the local community.
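Several of the listed data transmission characteristics can be derived from the packet timestamps and sizes of a single data stream, as in the sketch below. It assumes a non-empty list of `(timestamp_seconds, size_bytes)` pairs and reports average rates over the whole session; a maximum byte rate would additionally require a windowing scheme that the text does not specify. All names are hypothetical.

```python
from statistics import fmean, pstdev


def flow_features(packets):
    """Compute simple per-flow statistics.

    packets: non-empty list of (timestamp_seconds, size_bytes) tuples.
    """
    times = [t for t, _ in packets]
    sizes = [s for _, s in packets]
    duration = max(times) - min(times)
    total = sum(sizes)
    return {
        "session_time": duration,
        "total_bytes": total,
        "mean_packet_len": fmean(sizes),
        "std_packet_len": pstdev(sizes),  # population standard deviation
        "avg_byte_rate": total / duration if duration > 0 else 0.0,
        "packet_rate": len(sizes) / duration if duration > 0 else 0.0,
    }
```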
In one embodiment, the detection unit 302 is further configured to:
Removing a data stream meeting any one of the following set security conditions from network original data:
the white list contains the destination network address of the data flow;
the transmission protocol of the data stream is a non-appointed transmission protocol;
the transmission time length of the data stream is less than the set time length;
the data flow does not meet the successful establishment condition of the session;
the transmission direction of the data stream is a non-set transmission direction.
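The five removal conditions above can be expressed as a single predicate, as in the following sketch. It assumes each data stream is summarized as a dictionary with the fields `dst`, `proto`, `duration`, `established` and `direction`; those field names, the function names and the parameter names are hypothetical.

```python
def should_remove(flow, whitelist, allowed_protocols, min_duration, allowed_directions):
    """Return True if the data stream meets any of the set security conditions."""
    return (
        flow["dst"] in whitelist                          # destination is whitelisted
        or flow["proto"] not in allowed_protocols         # non-appointed protocol
        or flow["duration"] < min_duration                # shorter than the set duration
        or not flow["established"]                        # session was not established
        or flow["direction"] not in allowed_directions    # non-set transmission direction
    )


def filter_flows(flows, **conditions):
    """Keep only the data streams that meet none of the removal conditions."""
    return [f for f in flows if not should_remove(f, **conditions)]
```

Because the conditions are joined with `or`, a stream is dropped as soon as any one of them holds, which matches the "any one of the following" wording.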
In the botnet detection method, the botnet detection device, the electronic equipment and the readable storage medium provided by the embodiment of the application, network original data is obtained, and the network original data is transmission data among different network nodes; based on network original data and a pre-trained network detection model, network address information of botnet nodes is obtained, and the network detection model is constructed based on a graph neural network and an attention mechanism. Therefore, a network detection model constructed based on the graph neural network and the attention mechanism is adopted to detect the botnet, so that the detection accuracy is improved.
Fig. 4 shows a schematic structural diagram of an electronic device 4000. Referring to fig. 4, an electronic device 4000 includes: the processor 4010 and the memory 4020, and may optionally include a power supply 4030, a display unit 4040, and an input unit 4050.
The processor 4010 is a control center of the electronic device 4000, connects the respective components using various interfaces and lines, and performs various functions of the electronic device 4000 by running or executing software programs and/or data stored in the memory 4020, thereby performing overall monitoring of the electronic device 4000.
In the embodiment of the present application, the processor 4010 executes the steps in the above embodiment when calling the computer program stored in the memory 4020.
Optionally, the processor 4010 may comprise one or more processing units; preferably, the processor 4010 may integrate an application processor and a modem processor, wherein the application processor mainly handles an operating system, a user interface, an application, etc., and the modem processor mainly handles wireless communication. It will be appreciated that the modem processor described above may not be integrated into the processor 4010. In some embodiments, the processor, memory, may be implemented on a single chip, and in some embodiments, they may be implemented separately on separate chips.
The memory 4020 may mainly include a storage program area that may store an operating system, various applications, and the like, and a storage data area; the storage data area may store data created according to the use of the electronic device 4000, and the like. In addition, the memory 4020 may include high-speed random access memory, and may also include nonvolatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device, and the like.
The electronic device 4000 further includes a power supply 4030 (e.g., a battery) for powering the various components that can be logically coupled to the processor 4010 via a power management system to facilitate management of charge, discharge, and power consumption via the power management system.
The display unit 4040 may be used to display information input by a user or information provided to the user, various menus of the electronic device 4000, and the like, and is mainly used to display a display interface of each application in the electronic device 4000 and objects such as text and pictures displayed in the display interface in the embodiment of the present invention. The display unit 4040 may include a display panel 4041. The display panel 4041 may be configured in the form of a liquid crystal display (Liquid Crystal Display, LCD), an Organic Light-Emitting Diode (OLED), or the like.
The input unit 4050 may be used to receive information such as numbers or characters entered by a user. The input unit 4050 may include a touch panel 4051 and other input devices 4052. Wherein the touch panel 4051, also referred to as a touch screen, may collect touch operations thereon or thereabout by a user (e.g., operations of the user on the touch panel 4051 or thereabout using any suitable object or accessory such as a finger, stylus, etc.).
Specifically, the touch panel 4051 may detect a touch operation by a user, detect a signal resulting from the touch operation, convert the signal into a touch point coordinate, send the touch point coordinate to the processor 4010, and receive and execute a command sent from the processor 4010. In addition, the touch panel 4051 may be implemented in various types such as resistive, capacitive, infrared, and surface acoustic wave. Other input devices 4052 may include, but are not limited to, one or more of a physical keyboard, function keys (e.g., volume control keys, on-off keys, etc.), a trackball, mouse, joystick, etc.
Of course, the touch panel 4051 may overlay the display panel 4041, and when the touch panel 4051 detects a touch operation thereon or thereabout, it is passed to the processor 4010 to determine the type of touch event, and the processor 4010 then provides a corresponding visual output on the display panel 4041 in accordance with the type of touch event. Although in fig. 4, the touch panel 4051 and the display panel 4041 are implemented as two separate components to implement the input and output functions of the electronic device 4000, in some embodiments, the touch panel 4051 may be integrated with the display panel 4041 to implement the input and output functions of the electronic device 4000.
The electronic device 4000 may also include one or more sensors, such as a pressure sensor, a gravitational acceleration sensor, a proximity light sensor, and the like. Of course, the electronic device 4000 may also include other components such as a camera, as needed in a specific application, and these components are not shown in fig. 4 and will not be described in detail since they are not the components that are important in the embodiments of the present application.
It will be appreciated by those skilled in the art that fig. 4 is merely an example of an electronic device and is not meant to be limiting, and that more or fewer components than shown may be included, or certain components may be combined, or different components may be included.
In an embodiment of the present application, a computer-readable storage medium has stored thereon a computer program that, when executed by a processor, enables a communication device to perform the steps of the above-described embodiments.
For convenience of description, the above parts are described as being functionally divided into modules (or units) respectively. Of course, the functions of each module (or unit) may be implemented in the same piece or pieces of software or hardware when implementing the present application.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the application.
It will be apparent to those skilled in the art that various modifications and variations can be made in the present application without departing from the spirit or scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims and the equivalents thereof, the present application is intended to cover such modifications and variations.

Claims (10)

1. A method of botnet detection, comprising:
Acquiring network original data, wherein the network original data are transmission data among different network nodes;
analyzing each data packet in the network original data to respectively obtain the communication characteristic of each data packet, wherein the communication characteristic comprises a source network address and a destination network address;
dividing each data packet based on the communication characteristics of each data packet to obtain divided data streams, wherein each data stream represents a session between two network nodes;
constructing a graph structure network among network nodes according to a source network address and a destination network address of each data stream, wherein the source network address and the destination network address are used for determining the relation among different network nodes in the graph structure network;
acquiring botnet seed nodes, wherein the botnet seed nodes represent an invaded botnet host;
based on the botnet seed node, extracting local communities in the graph structure network, wherein the local communities are sub-networks in the graph structure network;
adopting an attention mechanism, and respectively determining weights between every two network nodes based on data transmission characteristics among network nodes in the local community, wherein the data transmission characteristics are determined according to the sizes of data packets transmitted among different network nodes;
Determining respective risk scores of the network nodes according to the local communities and weights between every two network nodes;
determining a network node with a risk score higher than a risk threshold as the botnet node;
and acquiring network address information of the botnet node.
2. The method of claim 1, wherein prior to the obtaining network address information for a botnet node based on the local community and the network detection model, the method further comprises:
if the local communities are multiple, determining the similarity between each network node and each botnet seed node according to the network distance between each network node and each botnet seed node in the local communities;
and merging the local communities according to the similarities to obtain merged local communities.
3. The method of claim 1, wherein the communication features further comprise a source port, a destination port, and a transport protocol for the data packet;
the data transmission characteristics include at least one of the following parameters: session time, packet size, total number of bytes transmitted, average packet length, standard deviation of packet length, maximum byte transmission rate, and packet transmission rate;
The network detection model is obtained based on training of positive sample data and negative sample data, the negative sample data is obtained based on network nodes in the local community, and the positive sample data is obtained based on network nodes outside the local community.
4. The method of any of claims 1-2, wherein prior to the obtaining network address information for a botnet node based on the network raw data and a pre-trained network detection model, the method further comprises:
removing a data stream meeting any one of the following set security conditions from the network original data:
the white list contains the destination network address of the data flow;
the transmission protocol of the data stream is a non-appointed transmission protocol;
the transmission time length of the data stream is less than the set time length;
the data flow does not meet the successful establishment condition of the session;
the transmission direction of the data stream is a non-set transmission direction.
5. An apparatus for botnet detection, comprising:
an acquisition unit, configured to acquire network original data, wherein the network original data are transmission data among different network nodes;
the detection unit is used for analyzing each data packet in the network original data to respectively obtain the communication characteristic of each data packet, wherein the communication characteristic comprises a source network address and a destination network address;
Dividing each data packet based on the communication characteristics of each data packet to obtain divided data streams, wherein each data stream represents a session between two network nodes;
constructing a graph structure network among network nodes according to a source network address and a destination network address of each data stream, wherein the source network address and the destination network address are used for determining the relation among different network nodes in the graph structure network;
acquiring botnet seed nodes, wherein the botnet seed nodes represent an invaded botnet host;
based on the botnet seed node, extracting local communities in the graph structure network, wherein the local communities are sub-networks in the graph structure network;
adopting an attention mechanism, and respectively determining weights between every two network nodes based on data transmission characteristics among network nodes in the local community, wherein the data transmission characteristics are determined according to the sizes of data packets transmitted among different network nodes;
determining respective risk scores of the network nodes according to the local communities and weights between every two network nodes;
determining a network node with a risk score higher than a risk threshold as the botnet node;
And acquiring network address information of the botnet node.
6. The apparatus of claim 5, wherein the detection unit is further configured to:
if the local communities are multiple, determining the similarity between each network node and each botnet seed node according to the network distance between each network node and each botnet seed node in the local communities;
and merging the local communities according to the similarities to obtain merged local communities.
7. The apparatus of claim 5, wherein the communication features further comprise a source port, a destination port, and a transport protocol for the data packet;
the data transmission characteristics include at least one of the following parameters: session time, packet size, total number of bytes transmitted, average packet length, standard deviation of packet length, maximum byte transmission rate, and packet transmission rate;
the network detection model is obtained based on training of positive sample data and negative sample data, the negative sample data is obtained based on network nodes in the local community, and the positive sample data is obtained based on network nodes outside the local community.
8. The apparatus of any one of claims 5-6, wherein the detection unit is further configured to:
removing a data stream meeting any one of the following set security conditions from the network original data:
the white list contains the destination network address of the data flow;
the transmission protocol of the data stream is a non-appointed transmission protocol;
the transmission time length of the data stream is less than the set time length;
the data flow does not meet the successful establishment condition of the session;
the transmission direction of the data stream is a non-set transmission direction.
9. An electronic device comprising a processor and a memory storing computer readable instructions that, when executed by the processor, perform the method of any of claims 1-4.
10. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, performs the method according to any of claims 1-4.
CN202210731247.5A 2022-06-24 2022-06-24 Botnet detection method, device, electronic equipment and readable storage medium Active CN115118491B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210731247.5A CN115118491B (en) 2022-06-24 2022-06-24 Botnet detection method, device, electronic equipment and readable storage medium

Publications (2)

Publication Number Publication Date
CN115118491A CN115118491A (en) 2022-09-27
CN115118491B true CN115118491B (en) 2024-02-09

Family

ID=83329654

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210731247.5A Active CN115118491B (en) 2022-06-24 2022-06-24 Botnet detection method, device, electronic equipment and readable storage medium

Country Status (1)

Country Link
CN (1) CN115118491B (en)

Also Published As

Publication number Publication date
CN115118491A (en) 2022-09-27

Similar Documents

Publication Publication Date Title
Tian et al. A distributed deep learning system for web attack detection on edge devices
Dou et al. A confidence-based filtering method for DDoS attack defense in cloud environment
US9723016B2 (en) Detecting web exploit kits by tree-based structural similarity search
CN111368290B (en) Data anomaly detection method and device and terminal equipment
CN111600850B (en) Method, equipment and storage medium for detecting mine digging virtual currency
CN107196930B (en) The method of computer network abnormality detection
CN108306879B (en) Distributed real-time anomaly positioning method based on Web session flow
CN109246027B (en) Network maintenance method and device and terminal equipment
Patil et al. S-DDoS: Apache spark based real-time DDoS detection system
Li et al. PhishBox: An approach for phishing validation and detection
CN115580450A (en) Method and device for detecting flow, electronic equipment and computer readable storage medium
CN104021124B (en) Methods, devices and systems for handling web data
Elekar Combination of data mining techniques for intrusion detection system
CN114422211B (en) HTTP malicious traffic detection method and device based on graph attention network
Yang et al. Characterizing heterogeneous Internet of Things devices at Internet scale using semantic extraction
Liang et al. FECC: DNS Tunnel Detection model based on CNN and Clustering
Aldwairi et al. n‐Grams exclusion and inclusion filter for intrusion detection in Internet of Energy big data systems
CN114253866A (en) Malicious code detection method and device, computer equipment and readable storage medium
CN115118491B (en) Botnet detection method, device, electronic equipment and readable storage medium
CN115801366A (en) Attack detection method and device, electronic equipment and computer readable storage medium
CN104980459B (en) A kind of method and website access device of application program operation
CN113765924A (en) Safety monitoring method, terminal and equipment based on cross-server access of user
CN114513355A (en) Malicious domain name detection method, device, equipment and storage medium
CN112231700A (en) Behavior recognition method and apparatus, storage medium, and electronic device
Barrionuevo et al. Secure computer network: Strategies and challengers in big data era

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant