WO2022037191A1 - Method for generating a network flow anomaly detection model and computer device - Google Patents

Method for generating a network flow anomaly detection model and computer device (一种网络流异常检测模型的生成方法和计算机设备)

Info

Publication number
WO2022037191A1
WO2022037191A1 PCT/CN2021/098695 CN2021098695W WO2022037191A1 WO 2022037191 A1 WO2022037191 A1 WO 2022037191A1 CN 2021098695 W CN2021098695 W CN 2021098695W WO 2022037191 A1 WO2022037191 A1 WO 2022037191A1
Authority
WO
WIPO (PCT)
Prior art keywords
network
network flow
model
source domain
anomaly detection
Prior art date
Application number
PCT/CN2021/098695
Other languages
English (en)
French (fr)
Inventor
吕麒
李伟超
汪漪
金波
Original Assignee
鹏城实验室
南方科技大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 鹏城实验室 and 南方科技大学
Publication of WO2022037191A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1425 Traffic logging, e.g. anomaly detection

Definitions

  • the present application relates to the technical field of network data detection, and in particular, to a method and computer device for generating a network flow anomaly detection model.
  • Network attacks are a serious problem in today's increasingly interconnected society. As networks develop and their range of applications keeps expanding, the means of network intrusion evolve rapidly and cause ever greater damage. An intrusion is an attempt to gain access to a computer system, or to disrupt its operation, in an illegal or unauthorized manner. Anomaly detection is well suited to detecting new network intrusion behaviors.
  • In existing approaches, model training and model detection are performed on the same dataset, which only shows that a model trained on a particular dataset is effective at detecting anomalies in that dataset.
  • To apply such a model to a new environment, model adjustment is needed, and model adjustment relies on a large amount of labeled data, so it is not suitable for environments with little data and no labels.
  • the present invention provides a method and computer equipment for generating a network flow anomaly detection model.
  • the features extracted by the trained target domain feature extractor on the target domain in the present invention are similar to those extracted by the source domain feature extractor on the source domain. Therefore, the classifier in the network flow anomaly detection model, trained on the source domain, can perform anomaly detection on the target domain with high accuracy.
  • an embodiment of the present invention provides a method for generating a network flow anomaly detection model, including:
  • the first network model is trained based on the source domain to obtain a trained first network model, wherein the trained first network model includes a source domain feature extractor and a classifier;
  • the second network model is trained based on the target domain, the source domain, the source domain feature extractor and the discriminator to obtain a target domain feature extractor;
  • a network flow anomaly detection model is generated according to the target domain feature extractor and the classifier.
  • an embodiment of the present invention provides a network flow anomaly detection method, which is applied to a network flow anomaly detection model.
  • the network flow anomaly detection model includes a target domain feature extractor and a classifier.
  • the network flow anomaly detection model obtains the network flow to be detected in the target domain
  • the target domain feature extractor extracts the to-be-detected feature vector corresponding to the to-be-detected network flow, wherein the target domain feature extractor is the target domain feature extractor in the method for generating the above-mentioned network flow anomaly detection model;
  • the classifier classifies the feature vector to be detected to obtain an anomaly detection result corresponding to the feature vector to be detected, wherein the classifier is the classifier in the above-mentioned method for generating a network flow anomaly detection model.
  • an embodiment of the present invention provides a computer device, including a memory and a processor, the memory stores a computer program, and the processor implements the following steps when executing the computer program:
  • the first network model is trained based on the source domain to obtain a trained first network model, wherein the trained first network model includes a source domain feature extractor and a classifier;
  • the second network model is trained based on the target domain, the source domain, the source domain feature extractor and the discriminator to obtain a target domain feature extractor;
  • the network flow anomaly detection model acquires the network flow to be detected in the target domain
  • the target domain feature extractor extracts the to-be-detected feature vector corresponding to the to-be-detected network flow, wherein the target domain feature extractor is the target domain feature extractor in the method for generating the above-mentioned network flow anomaly detection model;
  • the classifier classifies the feature vector to be detected to obtain an anomaly detection result corresponding to the feature vector to be detected, wherein the classifier is a classifier in the above method for generating an abnormality detection model of network flow.
  • an embodiment of the present invention further provides a computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, the following steps are implemented:
  • the first network model is trained based on the source domain to obtain a trained first network model, wherein the trained first network model includes a source domain feature extractor and a classifier;
  • the second network model is trained based on the target domain, the source domain, the source domain feature extractor and the discriminator to obtain a target domain feature extractor;
  • the network flow anomaly detection model acquires the network flow to be detected in the target domain
  • the target domain feature extractor extracts the to-be-detected feature vector corresponding to the to-be-detected network flow, wherein the target domain feature extractor is the target domain feature extractor in the method for generating the above-mentioned network flow anomaly detection model;
  • the classifier classifies the feature vector to be detected to obtain an anomaly detection result corresponding to the feature vector to be detected, wherein the classifier is a classifier in the above method for generating an abnormality detection model of network flow.
  • the embodiment of the present invention has the following advantages:
  • the present invention provides a method for generating a network flow anomaly detection model, comprising: training a first network model based on a source domain to obtain a trained first network model, wherein the trained first network model includes a source domain feature extractor and a classifier; training a second network model based on the target domain, the source domain, the source domain feature extractor and the discriminator to obtain a target domain feature extractor; and generating a network flow anomaly detection model according to the target domain feature extractor and the classifier.
  • because the data in the target domain has no labels, the second network model is trained in a generative adversarial manner to obtain the target domain feature extractor, so that the target domain feature extractor maps the data in the target domain into a feature space similar to that of the source domain, minimizing the distance between the target domain feature space and the source domain features; as a result, the features the target domain feature extractor extracts from the target domain are similar to the features the source domain feature extractor extracts from the source domain, and the adaptation from the source domain to the target domain is completed. Consequently, when the classifier trained on the source domain is used in a new scenario, the new scenario does not need a large amount of labeled data for secondary training, and the classifier trained on the source domain can still perform anomaly detection on the target domain with high accuracy.
  • FIG. 1 is a schematic diagram of an application field of a method for generating a network flow anomaly detection model in an embodiment of the present invention
  • FIG. 2 is a schematic flowchart of a method for generating a network flow anomaly detection model according to an embodiment of the present invention
  • FIG. 3 is a schematic diagram of a process of determining each normal network flow and each abnormal network flow from a source domain in an embodiment of the present invention
  • FIG. 4 is a schematic diagram of the form after converting the first network flow into a first character string in an embodiment of the present invention
  • FIG. 5 is a schematic diagram of the form obtained by parsing a first network flow with the packet-capture tool Wireshark in an embodiment of the present invention
  • FIG. 6 is a schematic diagram of storing the first three-dimensional tensor in the form of a Numpy compressed (Numpy zip, NPZ) file in an embodiment of the present invention
  • FIG. 7 is a schematic diagram of a process of extracting a flow feature vector by a convolutional neural network in an embodiment of the present invention.
  • FIG. 8 is a schematic structural diagram of a convolutional neural network in an embodiment of the present invention.
  • FIG. 9 is a schematic structural diagram of a recurrent neural network in an embodiment of the present invention.
  • FIG. 10 is a schematic diagram of using a vector generator to generate an abnormal network flow when the abnormal network flow in the source domain is insufficient in an embodiment of the present invention
  • FIG. 11 is a schematic structural diagram of a vector generator in an embodiment of the present invention.
  • FIG. 12 is a schematic diagram of a process of training a second network model in an embodiment of the present invention.
  • FIG. 13 is a schematic structural diagram of a network flow anomaly detection model according to an embodiment of the present invention.
  • FIG. 14 is a schematic diagram of stages during specific implementation of a method for generating a network flow anomaly detection model according to an embodiment of the present invention.
  • FIG. 15 is a schematic flowchart of a method for detecting an anomaly in a network flow according to an embodiment of the present invention.
  • FIG. 16 is a schematic flowchart of a specific implementation of a method for detecting anomalies in a network flow according to an embodiment of the present invention
  • FIG. 17 is an internal structure diagram of a computer device in an embodiment of the present invention.
  • the present invention provides a method and computer equipment for generating a network flow anomaly detection model.
  • the present invention is further described in detail below with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are only used to explain the present invention, but not to limit the present invention.
  • a first network model is trained based on a source domain to obtain a trained first network model, wherein the trained first network model includes a source domain feature extractor and a classifier; a second network model is trained based on the target domain, the source domain, the source domain feature extractor and a discriminator to obtain a target domain feature extractor; and a network flow anomaly detection model is generated according to the target domain feature extractor and the classifier.
  • because the data in the target domain has no labels, the second network model is trained in a generative adversarial manner to obtain the target domain feature extractor, so that the target domain feature extractor maps the data in the target domain into a feature space similar to that of the source domain, minimizing the distance between the target domain feature space and the source domain features; the features extracted by the target domain feature extractor on the target domain are therefore similar to the features extracted by the source domain feature extractor on the source domain, which completes the adaptation from the source domain to the target domain. Furthermore, when the classifier obtained by training on the source domain is used in a new scenario, the new scenario does not need a large amount of labeled data for secondary training, and the classifier trained on the source domain can perform anomaly detection on the target domain with high accuracy.
  • This embodiment provides a method and computer device for generating a network flow anomaly detection model, and the method for generating a network flow anomaly detection model can be applied to the scenario shown in FIG. 1 .
  • the terminal device 1 can collect the source domain and the target domain and input them into the server 2, so that the server 2 can train the first network model and the second network model according to the source domain and the target domain.
  • the server 2 can store the first network model and the second network model in advance and, in response to the source domain and the target domain input by the terminal device 1, train the first network model and the second network model to obtain the target domain feature extractor and the classifier, and then generate a network flow anomaly detection model according to the target domain feature extractor and the classifier.
  • the network flow anomaly detection model is applied to an electronic device to detect whether the to-be-detected network flow that the electronic device obtains from the target domain is abnormal; the electronic device includes a PC, a server, a mobile phone, a tablet computer, a handheld computer, a personal digital assistant (PDA), and the like.
  • this embodiment provides a method for generating a network flow anomaly detection model, including:
  • the first network model is a deep learning network model
  • the traffic in the source domain is labeled traffic
  • the label is used to indicate whether the network traffic in the source domain is normal traffic or abnormal traffic.
  • a first network model is trained based on the source domain to obtain a trained first network model
  • the trained first network model includes a source domain feature extractor and a classifier.
  • step S1 includes:
  • the first network model includes a first sub-network and a second sub-network
  • the first sub-network is used to extract flow feature vectors of each network flow (including normal network flow and abnormal network flow)
  • the second sub-network is used to classify the extracted flow feature vectors and output the scores corresponding to the flow feature vectors.
  • the normal flow feature vector and the abnormal flow feature vector are input into the second sub-network to obtain a first detection score corresponding to the normal flow feature vector and a second detection score corresponding to the abnormal flow feature vector.
  • the first detection score is a score obtained by the second sub-network based on the normal flow feature vector
  • the second detection score is a score obtained by the second sub-network based on the abnormal flow feature vector.
  • each training group in the training data includes normal network flows from the source domain and abnormal network flows from the source domain; each normal network flow and each abnormal network flow can be determined from the source domain.
  • before step S11, the method further includes:
  • the source domain includes multiple network flows.
  • the label of each network flow can be determined according to the description file of the data set; the labels include a normal label and an abnormal label, and a label is added to each network flow to obtain a first network flow and a second network flow, where the first network flow is a network flow to which a normal label has been added and the second network flow is a network flow to which an abnormal label has been added.
  • a first three-dimensional tensor is generated according to the first network flow to obtain a normal flow
  • a second three-dimensional tensor is generated according to the second network flow to obtain an abnormal network flow.
  • step M includes:
  • the network flow corresponding to the source domain is captured.
  • the captured network flow corresponding to the source domain is usually stored in a PCAP file.
  • the network flow corresponding to the source domain is usually quite large, typically several GB to dozens of GB, and includes thousands of data packets, usually collected from a given network over a certain period of time. Since the network flow corresponding to the source domain is stored in a PCAP file, the PCAP file corresponding to the source domain can be obtained and cut to obtain sub-PCAP files, where each sub-PCAP file corresponds to one cut network flow.
  • cutting the PCAP file corresponding to the source domain specifically means using the pkt2flow tool to split the captured data packets (the PCAP file corresponding to the source domain contains multiple data packets) into flows, taking the five-tuple (source IP, source port, destination IP, destination port, protocol) as the unit, so as to obtain multiple sub-PCAP files; each sub-PCAP file represents one flow, and the file name of each sub-PCAP file can be derived from the flow's five-tuple.
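For illustration, the sketch below performs the same five-tuple splitting that the pkt2flow tool is described as doing, using Scapy instead; the output file naming is hypothetical and only TCP/UDP flows are handled.

```python
# Illustrative sketch (not the patent's implementation): split a capture into
# per-flow sub-PCAPs keyed by the five-tuple, similar to what pkt2flow does.
import os
from collections import defaultdict
from scapy.all import rdpcap, wrpcap, IP, TCP, UDP

def split_into_flows(pcap_path, out_dir="flows"):
    os.makedirs(out_dir, exist_ok=True)
    flows = defaultdict(list)
    for pkt in rdpcap(pcap_path):            # load the captured packets
        if IP not in pkt:
            continue                          # skip non-IP traffic in this sketch
        if TCP in pkt:
            l4, proto = pkt[TCP], "tcp"
        elif UDP in pkt:
            l4, proto = pkt[UDP], "udp"
        else:
            continue
        key = (pkt[IP].src, l4.sport, pkt[IP].dst, l4.dport, proto)   # five-tuple
        flows[key].append(pkt)
    for (src, sport, dst, dport, proto), pkts in flows.items():
        # name each sub-PCAP file after its five-tuple, as described above
        name = f"{src}_{sport}_{dst}_{dport}_{proto}.pcap"
        wrpcap(os.path.join(out_dir, name), pkts)
    return list(flows.keys())
```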
  • the sub-PCAP files of unrecognized types are filtered out of the sub-PCAP files, so as to obtain filtered sub-PCAP files.
  • sub-PCAP files of unrecognized types are those for which it cannot be determined whether the traffic is normal or abnormal.
  • labels are then added to the filtered sub-PCAP files according to the description file of the data set; the data set includes the network traffic stored in the PCAP file and a comma-separated values (CSV) description file, and the CSV file records whether each sub-PCAP file is normal traffic or abnormal traffic.
  • in this operation, code is written according to the CSV description to add a label to each sub-PCAP file; the labels include normal labels and abnormal labels.
  • Adding a label will change the file name of the sub-PCAP file, that is, whether the sub-PCAP file is normal traffic or abnormal traffic can be determined by the file name of the sub-PCAP file.
  • the sub-PCAP file after adding the normal label is recorded as the first network flow
  • the sub-PCAP file after adding the abnormal label is recorded as the second network flow.
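A minimal sketch of this labeling step follows; the CSV column names (flow_id, label) and the value used for normal traffic are assumptions, since the actual dataset description format is not reproduced in the text. The label is embedded into the sub-PCAP file name, as described above.

```python
# Sketch of the labeling step. The CSV schema (flow_id/label columns and the
# value "BENIGN" for normal traffic) is hypothetical; the real mapping comes
# from the dataset's description file.
import csv
import os

def label_sub_pcaps(flow_dir, csv_path, name_col="flow_id",
                    label_col="label", normal_value="BENIGN"):
    with open(csv_path, newline="") as f:
        labels = {row[name_col]: row[label_col] for row in csv.DictReader(f)}
    for fname in os.listdir(flow_dir):
        if not fname.endswith(".pcap"):
            continue
        flow_id = fname[: -len(".pcap")]
        if flow_id not in labels:
            continue                          # unrecognized type: already filtered out
        tag = "normal" if labels[flow_id] == normal_value else "abnormal"
        # embed the label in the file name, so it can be read back later
        os.rename(os.path.join(flow_dir, fname),
                  os.path.join(flow_dir, f"{tag}__{fname}"))
```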
  • each first three-dimensional tensor of the preset size is generated according to each first network flow, and each first three-dimensional tensor is used as a normal network flow.
  • for each first network flow, multiple first network data packets in the first network flow are extracted, and a first three-dimensional tensor is generated according to the first network data packets.
  • step M2 includes:
  • multiple first network data packets in the first network flow are extracted; specifically, for the first network flow, a packet list object is obtained through the rdpcap() function in the Scapy package, which yields the packet objects corresponding to the first network flow, that is, a plurality of first network data packets.
  • each first network data packet is first converted into a corresponding first character string, and a first two-dimensional tensor can be obtained from each first character string.
  • a first three-dimensional tensor is then obtained from the first two-dimensional tensors, and the preset size can be used to represent the size of each first two-dimensional tensor and the number of first two-dimensional tensors.
  • step M22 includes:
  • each first network data packet is first serialized to obtain a first character string, where the first character string is a character string of hexadecimal byte values.
  • each byte value is in the range [0, 255], which matches the grayscale value range of an image; that is, the first character string corresponding to the first network data packet can indirectly represent the grayscale values of an image.
  • Figure 4 shows the form obtained after converting the first network data packet into a first character string
  • Figure 5 shows the form obtained by parsing the first network data packet with the packet-capture tool Wireshark
  • the value of each field obtained by parsing the first network data packet matches the field values parsed by Wireshark, which shows that converting the first network data packet into a first character string in the embodiment of the present invention is practical.
  • the preset size includes the number of data packets and the size of each intercepted data packet, the number of network data packets is recorded as pkt_num, and the size of each intercepted data packet is recorded as pkt_size.
  • the size of the first two-dimensional tensor is determined according to the size of each intercepted data packet, and each intercepted data packet includes pkt_size valid characters.
  • valid characters refer to the byte characters in the string, such as \xff; the other symbols in the string are not converted content and are only used for string parsing.
  • each valid character corresponds to one hexadecimal byte value, for example \xff, and is stored in one byte.
  • a first two-dimensional tensor with a size of 22*22 is generated from the first 484 valid characters of the first character string. If the number of valid characters in the first character string exceeds 484 bytes, only the first 484 bytes are taken to generate the first two-dimensional tensor; if the number of valid characters is less than 484 bytes, zeros are appended to the end of the first character string to obtain a string of 484 bytes, and the 22*22 first two-dimensional tensor is then generated from the zero-padded string.
  • a first three-dimensional tensor is generated from the first two-dimensional tensors corresponding to the first character strings.
  • the number of first character strings is the same as the number of first network data packets, and a first three-dimensional tensor is generated according to the first character strings and the preset number of network data packets.
  • if the number of first network data packets is greater than the preset number of network data packets, only the first pkt_num first network data packets are used to generate the first three-dimensional tensor; if the number of first network data packets is less than the preset number (that is, the number of first character strings is less than the number of network data packets), zero matrices are added when generating the first three-dimensional tensor, so that the first three-dimensional tensor reaches the preset size.
  • for example, the preset size is 10*22*22, that is, the size of the first three-dimensional tensor is 10*22*22;
  • if a first network flow includes 15 first network data packets, a first three-dimensional tensor is generated from the first 10 of the 15 first network data packets, and the size of the first three-dimensional tensor is 10*22*22.
  • if the first network flow includes 8 first network data packets, a three-dimensional tensor is generated from the 8 first network data packets, and the np.zeros() function is used to generate zero matrices directly, so that after adding the zero matrices the first three-dimensional tensor has the size 10*22*22.
  • the first three-dimensional tensor is stored in the NPZ form to obtain a normal network flow; see FIG. 6 , which is a schematic diagram after the first three-dimensional tensor is stored in the NPZ form.
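The sketch below mirrors steps M21 to M23 with the example preset size of 10*22*22 (484 = 22*22 valid bytes per packet); the function names are illustrative, and raw packet bytes are used directly as the byte values that the hexadecimal string represents.

```python
# Sketch of steps M21-M23 (illustrative names): serialize each packet to raw
# bytes, pad/truncate to 484 bytes (22*22), stack pkt_num packets into a
# three-dimensional tensor, and store it as an NPZ file.
import numpy as np
from scapy.all import rdpcap, raw

PKT_NUM, PKT_SIZE, SIDE = 10, 484, 22   # preset size: 10 packets of 22*22 bytes

def packet_to_matrix(pkt):
    data = raw(pkt)[:PKT_SIZE]                        # keep at most 484 valid bytes
    data = data + b"\x00" * (PKT_SIZE - len(data))    # zero-pad short packets
    return np.frombuffer(data, dtype=np.uint8).reshape(SIDE, SIDE)

def flow_to_tensor(sub_pcap_path):
    pkts = rdpcap(sub_pcap_path)[:PKT_NUM]            # take at most pkt_num packets
    mats = [packet_to_matrix(p) for p in pkts]
    while len(mats) < PKT_NUM:                        # pad missing packets with zero matrices
        mats.append(np.zeros((SIDE, SIDE), dtype=np.uint8))
    return np.stack(mats)                             # shape: (10, 22, 22)

def save_flow(sub_pcap_path, npz_path):
    np.savez_compressed(npz_path, flow=flow_to_tensor(sub_pcap_path))
```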
  • each second network data packet in the second network flow is extracted, and a second three-dimensional tensor is generated according to each second network data packet.
  • step M3 includes:
  • multiple second network data packets in the second network flow are extracted; specifically, for the second network flow, a packet list object is obtained through the rdpcap() function in the Scapy package, which yields the packet objects corresponding to the second network flow, that is, a plurality of second network data packets.
  • each second network data packet is first converted into a corresponding second character string, and a second two-dimensional tensor can be obtained from each second character string.
  • a second three-dimensional tensor is then obtained from the second two-dimensional tensors, and the preset size can be used to represent the size of each second two-dimensional tensor and the number of second two-dimensional tensors.
  • step M32 includes:
  • in step M321, the execution process of "serializing each second network data packet to obtain a second character string corresponding to each second network data packet" is consistent with the execution process of "serializing each first network data packet to obtain a first character string corresponding to each first network data packet"; further, for the specific description of step M321, refer to the above description of step M221.
  • the execution process of "generating a second three-dimensional tensor of the preset size according to each second character string, and using the second three-dimensional tensor as the abnormal network flow" is the same as :
  • the execution process of "generating a first three-dimensional tensor with a preset size according to each first character string, and using the first three-dimensional tensor as the normal network flow” is consistent, and further, the specific description of step M322 can refer to the above Description of step M222.
  • the first network model includes a first sub-network and a second sub-network, and the first sub-network is used to extract a normal flow feature vector corresponding to a normal network flow and an abnormality corresponding to an abnormal network flow Flow feature vector.
  • the normal flow feature vector and the abnormal flow feature vector are input into the second sub-network to obtain a first detection score corresponding to the normal flow feature vector and a second detection score corresponding to the abnormal flow feature vector.
  • the first sub-network includes a convolutional neural network (CNN) and a recurrent neural network built from gated recurrent units (GRU).
  • each network flow is in the form of a three-dimensional tensor (n*m*m), which can be divided into n two-dimensional vectors (m*m), each of which is the packet vector of a data packet
  • the first three-dimensional tensor includes the first two-dimensional vector corresponding to each first network data packet, that is, the packet vector corresponding to each first network data packet;
  • the second three-dimensional tensor includes a packet vector corresponding to each of the second network data packets respectively.
  • each packet vector of size m*m is input into the CNN to obtain the corresponding packet feature vector, and the packet feature vectors are spliced together; the spliced vector is input into the GRU to learn its time-series features, and the flow feature vector corresponding to the network flow is obtained.
  • the normal flow feature vector corresponding to the normal network flow is obtained through CNN and GRU;
  • the abnormal flow feature vector corresponding to the abnormal network flow is obtained through CNN and GRU.
  • the input item of the CNN is in the form of a three-dimensional tensor (n*m*m).
  • in step M2, a first three-dimensional tensor of the preset size is generated, and in step M3, a second three-dimensional tensor of the preset size is generated.
  • the first three-dimensional tensor is a normal network flow, that is, the input item of the CNN, and the second three-dimensional tensor is also the input item of the CNN.
  • the preset size includes the number of data packets and the size of each intercepted data packet.
  • the number of data packets and the size of each intercepted data packet have a great impact on the algorithm.
  • some attack types, such as DoS attacks, are more related to packet header data and to the first few packets in a flow, while other attack types, such as XSS attacks, are more related to payload data; therefore, determining which part of the original network flow data is used for representation learning can have a significant impact on the detection accuracy of the algorithm.
  • the preset size can be set as follows: the number of data packets is 6, and the size of each intercepted data packet is 484.
  • such a flow is finally processed into a 6*22*22 three-dimensional tensor and input into the CNN; these settings can be tuned according to the characteristics of the data in actual use.
  • HAST-NAD first proposed using a convolutional neural network (CNN) to learn the spatial features of network flows and then using a long short-term memory (LSTM) recurrent network to learn the time-series features of network flows.
  • this application does not perform one-hot encoding, and uses a GRU instead of an LSTM because the overhead of a GRU is lower while its effect is almost the same; the GRU is therefore selected to capture the timing characteristics of the network flow.
  • the convolutional neural network includes three convolutional layers, two pooling layers and one linear layer, and the activation function in the convolutional layers is ReLU; when a network flow is input into the CNN, feature maps of different scales are obtained at each layer of the CNN, where the number before @ represents the number of channels and the number after @ represents the size of the feature map.
  • the essence of the feature map is the matrix obtained after feature extraction.
  • a three-dimensional tensor is input into the CNN.
  • each two-dimensional vector in a three-dimensional tensor is input into the CNN in turn, and each two-dimensional vector in a three-dimensional tensor is a packet vector corresponding to each network data packet corresponding to the network flow.
  • the final output of CNN is the packet feature vector corresponding to each packet vector.
  • the recurrent neural network includes two GRU layers and one Flatten layer. The input of the GRU is the spliced packet feature vectors, and the output of the GRU network is the flow feature vector, which is a one-dimensional feature vector.
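A PyTorch sketch of the first sub-network is given below. The layer counts follow the description (three convolutional layers, two pooling layers and one linear layer for the CNN; two GRU layers and a Flatten layer for the recurrent part), but the channel numbers, kernel sizes and feature dimensions are illustrative assumptions, since FIG. 8 and FIG. 9 are not reproduced here.

```python
# Sketch of the first sub-network (feature extractor). Layer counts follow the
# text; channel numbers, kernel sizes and feature dimensions are guesses.
import torch
import torch.nn as nn

class PacketCNN(nn.Module):
    def __init__(self, feat_dim=64):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                           # 22x22 -> 11x11
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                           # 11x11 -> 5x5
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
        )
        self.linear = nn.Linear(64 * 5 * 5, feat_dim)  # the single linear layer

    def forward(self, x):                   # x: (batch, 1, 22, 22) packet vectors
        h = self.features(x).flatten(1)
        return self.linear(h)               # packet feature vector per packet

class FlowFeatureExtractor(nn.Module):
    """CNN over each packet followed by a 2-layer GRU over the packet sequence."""
    def __init__(self, feat_dim=64, hidden=64):
        super().__init__()
        self.cnn = PacketCNN(feat_dim)
        self.gru = nn.GRU(feat_dim, hidden, num_layers=2, batch_first=True)
        self.flatten = nn.Flatten()

    def forward(self, flows):               # flows: (batch, pkt_num, 22, 22)
        b, n, h, w = flows.shape
        pkt_feats = self.cnn(flows.reshape(b * n, 1, h, w)).reshape(b, n, -1)
        out, _ = self.gru(pkt_feats)         # learn time-series features over packets
        return self.flatten(out[:, -1, :])   # one-dimensional flow feature vector

# Usage: extractor = FlowFeatureExtractor(); v = extractor(torch.rand(4, 10, 22, 22))
```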
  • the normal flow feature vector and the abnormal flow feature vector are collectively referred to as flow feature vector.
  • the essence of the second sub-network is a classifier, which is used to determine whether the extracted features are abnormal.
  • the output of the second sub-network is a floating-point number in the [0,1] interval; that is, both the first detection score and the second detection score are floating-point numbers in the [0,1] interval.
  • the normal network flow is a network flow with a normal label added, and the normal label is represented by 0;
  • the abnormal network flow is a network flow with an abnormal label added, and the abnormal label is represented by 1 ;
  • the first detection score is the detection score corresponding to the normal network flow, and the second detection score is the detection score corresponding to the abnormal network flow; that is, the first network model obtains the first detection score according to the normal network flow , the first network model obtains the second detection score according to the abnormal network flow.
  • the network structure of the second sub-network is relatively simple, and the second sub-network includes a fully connected layer and a sigmoid layer.
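A corresponding sketch of the second sub-network is shown below: a fully connected layer followed by a sigmoid, producing a detection score in [0, 1]; the input dimension is an assumption that must match the flow feature vector size.

```python
# Sketch of the second sub-network (the classifier): fully connected + sigmoid.
import torch.nn as nn

class FlowClassifier(nn.Module):
    def __init__(self, feat_dim=64):
        super().__init__()
        self.fc = nn.Linear(feat_dim, 1)       # fully connected layer
        self.sigmoid = nn.Sigmoid()            # squashes the score into [0, 1]

    def forward(self, flow_feature_vector):
        return self.sigmoid(self.fc(flow_feature_vector)).squeeze(-1)
```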
  • the first network model includes a first sub-network and a second sub-network; therefore, during training, the first sub-network and the second sub-network are trained according to the first detection score and the second detection score, so as to obtain the trained first sub-network, i.e., the source domain feature extractor, and the trained second sub-network, i.e., the classifier.
  • the normal flow feature vector and the abnormal flow feature vector are input into the second sub-network in one iteration, so that the trained second sub-network (the classifier) can distinguish whether the input is a normal network flow or an abnormal network flow.
  • the process of training the first network model according to the first detection score and the second detection score is: modifying the parameters of the first sub-network and the parameters of the second sub-network according to the first detection score and the second detection score. Specifically, the classification loss function value is calculated from the first detection score and the second detection score, and the parameters of the first sub-network and of the second sub-network are then modified according to the classification loss function value.
  • anomaly detection is a typical data imbalance problem, that is to say, the abnormal network flow in the training data is far less than the normal network flow.
  • if a neural network is trained directly on such data, its strong learning ability causes it to overfit the normal data flows; because abnormal data flows are rarely learned, it is difficult for the trained classifier to detect abnormal data flows, resulting in serious data bias and a very low anomaly detection rate.
  • the first network model further includes a vector generator, and when the abnormal network flow in the source domain is insufficient, random noise is input into the vector generator to obtain the abnormal network flow.
  • each normal network flow and each abnormal network flow in the source domain will be loaded first, and when the abnormal network flow in the source domain is insufficient, the The abnormal network flow is supplemented, so that regardless of the proportion of normal network flow and abnormal network flow in the source domain, the normal network flow and abnormal network flow actually input to the second sub-network can maintain a fixed ratio.
  • when the abnormal network flows in the source domain are sufficient, the abnormal network flow is not generated by the vector generator; that is to say, the whole training process is essentially divided into two stages.
  • the first stage extracts the normal network flows and abnormal network flows from the source domain
  • the second stage extracts the normal network flows from the source domain and uses the vector generator to generate abnormal network flows.
  • the detection scores output by the second sub-network in FIG. 10 include: a first detection score corresponding to a normal flow feature vector, and a second detection score corresponding to an abnormal flow feature vector.
  • the network structure of the vector generator is shown in FIG. 11; the vector generator includes 4 deconvolution layers, where each deconvolution layer is followed by BatchNorm2d for normalization and a ReLU activation, and the final output of the vector generator is a vector isomorphic to the vectors read from the NPZ files, that is, it has the same structure as the normal network flow obtained in step M2 and the abnormal network flow obtained in step M3.
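A PyTorch sketch of such a generator is shown below; the four ConvTranspose2d layers, each followed by BatchNorm2d and ReLU, follow the description, while the noise dimension, channel counts and kernel sizes are illustrative choices that happen to produce a (pkt_num, 22, 22) output.

```python
# Sketch of the vector generator: 4 deconvolution (ConvTranspose2d) layers,
# each followed by BatchNorm2d and ReLU, as described. Channel counts, kernel
# sizes and the noise dimension are illustrative; the output shape matches
# the (pkt_num, 22, 22) tensors stored in the NPZ files.
import torch
import torch.nn as nn

class VectorGenerator(nn.Module):
    def __init__(self, nz=100, pkt_num=10):
        super().__init__()
        def block(cin, cout, k, s, p):
            return [nn.ConvTranspose2d(cin, cout, k, s, p),
                    nn.BatchNorm2d(cout), nn.ReLU()]
        self.net = nn.Sequential(
            *block(nz, 128, 4, 1, 0),        # 1x1   -> 4x4
            *block(128, 64, 4, 2, 1),        # 4x4   -> 8x8
            *block(64, 32, 4, 2, 1),         # 8x8   -> 16x16
            *block(32, pkt_num, 7, 1, 0),    # 16x16 -> 22x22, pkt_num "packet" channels
        )

    def forward(self, z):                    # z: (batch, nz, 1, 1) random noise
        return self.net(z)                   # generated flow: (batch, pkt_num, 22, 22)

# Usage: g = VectorGenerator(); fake_flows = g(torch.randn(8, 100, 1, 1))
```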
  • the training process of the first network model may be implemented by the following algorithm.
  • the classification loss L_classifier(f_n, f_a) is back-propagated, and the parameters of C_s and E_S are updated at the same time;
  • output: the trained classifier C_s and the source domain feature extractor E_S;
  • the normal network flows and abnormal network flows extracted from the source domain are real data; by introducing a hyperparameter α with value range (0, 1], the priority of correctly classifying the real data can be increased, and when α is less than 1, C_s gives higher priority to correctly classifying the real data.
  • the abnormal network flows generated by the vector generator are labeled 1 and the normal network flows extracted from the source domain are labeled 0, which is exactly the opposite of the usual GAN convention in the cross-entropy loss, where real samples are labeled 1 and generated samples 0; the classification loss function corresponding to C_s is therefore shown in formula (1).
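Formula (1) itself is not reproduced in this text, so the following is only a hedged sketch of a classification loss consistent with the description: real normal flows are labeled 0, real and generated abnormal flows are labeled 1, and α is assumed to scale the generated-sample term, which matches the statement that α < 1 raises the relative priority of correctly classifying real data.

```python
# Hedged sketch of a classification loss consistent with the text; the exact
# form of formula (1) is an assumption, not the patent's equation.
import torch
import torch.nn.functional as F

def classification_loss(score_normal, score_abnormal, score_generated, alpha=0.5):
    # score_*: classifier outputs in [0, 1] for each kind of flow
    real_loss = (F.binary_cross_entropy(score_normal, torch.zeros_like(score_normal)) +
                 F.binary_cross_entropy(score_abnormal, torch.ones_like(score_abnormal)))
    fake_loss = F.binary_cross_entropy(score_generated, torch.ones_like(score_generated))
    return real_loss + alpha * fake_loss     # alpha < 1 down-weights generated samples
```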
  • the network parameters of the first sub-network and the second sub-network are modified according to the classification loss function value calculated by the classification loss function until the first preset condition is satisfied, so as to obtain the trained first network model
  • the trained first network model includes a source domain feature extractor E_S corresponding to the first sub-network and a classifier C_s corresponding to the second sub-network.
  • the abnormal network flow generated by the vector generator is used.
  • the vector generator is trained.
  • the goal of training the vector generator is to enable C_s to distinguish normal network flows from abnormal network flows well.
  • the abnormal network flows generated by the vector generator are closely distributed around the normal network flows, but do not follow the same distribution.
  • the ideal situation is that C_s identifies network flows with the same distribution as normal network flows, and network flows with a different distribution as abnormal network flows. If an abnormal network flow is not closely distributed around the normal network flows, it is easy for the classifier C_s to distinguish; for example, for a first abnormal network flow A and a second abnormal network flow B, if A is distributed near the normal network flows and B is distributed far away from them, it is easier for C_s to recognize B as an abnormal network flow than A.
  • in order to enable the vector generator to generate abnormal network flows that surround the real data (the network flows extracted from the source domain), the vector generator is trained with a surround loss and a dispersion loss.
  • the abnormal network flow generated by the vector generator is input into a first network model, a generation score is obtained through the first network model, and the vector generator is trained according to the generation score to obtain a trained vector generator.
  • the generated score is used to represent the score obtained by the second sub-network according to the abnormal network flow generated by the vector generator.
  • the surround loss value is calculated from the generated scores and can be computed by formula (2), where G(Z_i) is the abnormal network flow generated by the vector generator G from the random noise Z_i, the generated score is the score obtained by the second sub-network for the generated abnormal network flow G(Z_i), and the formula involves a hyperparameter with value range (0, 1].
  • the dispersion loss value DL(G, Z) is calculated by formula (3), where the formula uses the centroid of the generated abnormal network flows and G(Z_i) is the abnormal network flow generated by the vector generator G.
  • the overall loss function corresponding to the vector generator can be described by formula (4), which uses a hyperparameter to adjust the relative weights of the surround loss and the dispersion loss.
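Since formulas (2) to (4) are not reproduced in this text, the following is a hedged sketch only: it assumes the surround loss pulls the classifier's score on generated flows toward a target hyperparameter in (0, 1], and that the dispersion loss is the reciprocal of the mean distance of generated flows from their centroid, so that minimizing the combined loss yields generated flows that closely surround, but do not collapse onto, the real data.

```python
# Hedged sketch of the generator losses; the concrete forms are assumptions,
# not the patent's formulas (2)-(4).
import torch

def surround_loss(gen_scores, target=0.5):
    # gen_scores: classifier scores for generated abnormal flows; pull them toward `target`
    return torch.mean(torch.log(torch.abs(gen_scores - target) + 1e-8))

def dispersion_loss(gen_flows):
    flat = gen_flows.flatten(1)                        # (batch, features)
    centroid = flat.mean(dim=0, keepdim=True)          # centroid of generated flows
    return 1.0 / (torch.norm(flat - centroid, dim=1).mean() + 1e-8)

def generator_loss(gen_scores, gen_flows, lam=0.1):
    # lam weighs the dispersion term against the surround term
    return surround_loss(gen_scores) + lam * dispersion_loss(gen_flows)
```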
  • the network parameters of the first sub-network and the second sub-network are modified using the value of the classification loss function, and the network parameters of the vector generator are modified using the value of the loss function corresponding to the vector generator, until the first preset condition is satisfied, so as to obtain the source domain feature extractor, the classifier and the trained vector generator.
  • the first preset condition includes that the value of the classification loss function meets a preset requirement, or the number of training times reaches a preset number of times.
  • the preset requirement may be determined according to the accuracy of the classifier, which is not described in detail here, and the preset number of times may be the maximum number of training times of the second sub-network, for example, 4000 times.
  • if the classification loss function value meets the preset requirement, the training ends; if the classification loss function value does not meet the preset requirement, it is judged whether the number of training iterations of the second sub-network has reached the preset number. If the preset number has not been reached, the network parameters of the first sub-network and the second sub-network are modified using the classification loss function value, and at the same time the network parameters of the vector generator are modified using the value of the loss function corresponding to the vector generator; if the preset number has been reached, the training ends. Judging whether training is finished by both the classification loss function value and the number of training iterations prevents the training from entering an infinite loop when the classification loss function value fails to meet the preset requirement.
  • the source domain and the target domain are both network flows in essence; the network flows in the source domain are labeled, while the network flows in the target domain have no labels. In existing approaches, model training and model detection are performed on the same data set, which only shows that a model trained on a certain data set is effective for detection on that data set; to apply the model elsewhere, model adjustment is needed, and model adjustment relies on a large amount of labeled data, so it is not suitable for environments with little data and no labels.
  • the data in the target domain used for training has no labels.
  • the classifier is trained by the training data extracted from the source domain, and the classifier is transferred to the target domain to perform anomaly detection on the target domain; that is, the domain transfer is completed through the mapping of latent features , and the mapping of the latent features is done by the target domain feature extractor, which is optimized by the process of adversarial training.
  • the features extracted by the target domain feature extractor in the target domain are similar to the features extracted by the source domain feature extractor in the source domain.
  • the process of training the second network model includes: the source domain feature extractor E_S extracts source domain feature vectors from the source domain, the second network model extracts target domain feature vectors from the target domain, and the source domain feature vectors and the target domain feature vectors are input into the discriminator D_d, which outputs prediction scores.
  • the prediction scores include the first prediction score corresponding to the source domain feature vector and the second prediction score corresponding to the target domain feature vector.
  • the first prediction score and the second prediction score are used to modify the model parameters of the second network model so as to obtain the target domain feature extractor E_t.
  • the source domain feature extractor E_S is obtained by training the first sub-network in step S1.
  • the initial model parameters of the second network model are the same as the model parameters of the source domain feature extractor, and the structure of the second network model is the same as that of the source domain feature extractor.
  • the initial model parameters of the second network model are the model parameters before the second network model is trained; that is, the model parameters of the source domain feature extractor E_S are used to initialize the parameters of the second network model.
  • during this training, the parameters of the source domain feature extractor E_S are fixed, and only the model parameters of the second network model are updated.
  • the first prediction score is used to represent the source domain feature score corresponding to the source domain feature vector output by the discriminator
  • the second prediction score is used to represent the target domain feature score corresponding to the target domain feature vector output by the discriminator.
  • step S2 includes:
  • the source domain feature extractor extracts the source domain feature vector corresponding to the source domain.
  • the process by which the source domain feature extractor E_S extracts the source domain feature vector is consistent with the steps by which the first sub-network extracts the normal flow feature vector and the abnormal flow feature vector. Specifically, a network flow is obtained from the source domain, a three-dimensional tensor of the preset size is extracted from the obtained network flow, and the source domain feature extractor E_S outputs the source domain feature vector according to the extracted three-dimensional tensor.
  • the second network model extracts a target domain feature vector corresponding to the target domain.
  • the source domain feature extractor includes CNN and GRU.
  • the second network model also includes CNN and GRU.
  • the CNN includes three convolutional layers, two pooling layers and a linear layer
  • GRU includes two GRU layers and a Flatten layer
  • the network structure of CNN is shown in Figure 8
  • the network structure of GRU is shown in Figure 9
  • the second network model is obtained by concatenating CNN and GRU.
  • the structure of the second network model is the same as that of the source domain feature extractor E_S.
  • a network flow is obtained from the target domain, and a three-dimensional tensor of the preset size is extracted from the network flow corresponding to the target domain.
  • the second network model outputs the target domain feature vector according to the three-dimensional tensor from the target domain.
  • the goal of the discriminator D_d is to distinguish features from the source domain and features from the target domain, that is, to distinguish the source domain feature vectors from the target domain feature vectors.
  • the source domain feature vector is labeled 1 and the target domain feature vector is labeled 0.
  • the first prediction score represents the source domain feature score output by the discriminator for the source domain feature vector, and the second prediction score represents the target domain feature score output by the discriminator for the target domain feature vector.
  • D_d can thus distinguish whether an input feature comes from the target domain or the source domain.
  • the target domain loss function value corresponding to the second network model is calculated using the first prediction score and the second prediction score, and the parameters of the second network model are adjusted according to the target domain loss function value until the second preset condition is satisfied, so as to obtain the target domain feature extractor E_t.
  • the value of the target domain loss function can be calculated by formula (5), where X_s is the three-dimensional tensor extracted from the source domain, X_t is the three-dimensional tensor extracted from the target domain, M_t(X_t) is the target domain feature vector, D(M_t(x_t)) is the second prediction score, that is, the target domain feature score corresponding to the target domain feature vector, and D is the discriminator.
  • the model parameters of the discriminator are also updated. Specifically, the discriminant loss function value corresponding to the discriminator D_d is calculated from the first prediction score and the second prediction score, and the parameters of the discriminator are adjusted according to the discriminant loss function value, until the second preset condition is satisfied, so as to obtain the target domain feature extractor E_t.
  • the discriminant loss function corresponding to the discriminator D_d is shown in formula (6), where X_s is the three-dimensional tensor extracted from the source domain, M_s(X_s) is the source domain feature vector, D(M_s(x_s)) is the first prediction score, that is, the source domain feature score corresponding to the source domain feature vector, X_t is the three-dimensional tensor extracted from the target domain, M_t(X_t) is the target domain feature vector, and D(M_t(x_t)) is the second prediction score, that is, the target domain feature score corresponding to the target domain feature vector.
  • the data input into E_S are three-dimensional tensors extracted from the source domain (including the first three-dimensional tensors corresponding to normal network flows and the second three-dimensional tensors corresponding to abnormal network flows), while the input to the second network model is an unlabeled three-dimensional tensor from the target domain; the target domain feature vector output by the second network model and the source domain feature vector output by the source domain feature extractor are then input into the discriminator D_d.
  • the second network model tries to extract features from the target domain that are similar to the features E_S extracts from the source domain, so as to fool the discriminator.
  • when D(M_s(x_s)) and D(M_t(x_t)) are both close to 0.5, that is, when the discriminator D_d cannot distinguish whether an extracted feature comes from the source domain or the target domain, the training process is completed.
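A hedged sketch of this second-stage adversarial training is given below. It follows the stated conventions (source features labeled 1, target features labeled 0, E_S frozen, training judged complete when the discriminator's outputs approach 0.5), while the optimizers, learning rates and data-loading details are illustrative assumptions; D_d is assumed to end with a sigmoid.

```python
# Hedged sketch of adversarial domain adaptation for the second network model.
import torch
import torch.nn.functional as F

def train_target_extractor(E_s, E_t, D_d, source_loader, target_loader,
                           epochs=10, lr=1e-4):
    E_s.eval()                                           # source extractor stays fixed
    opt_t = torch.optim.Adam(E_t.parameters(), lr=lr)
    opt_d = torch.optim.Adam(D_d.parameters(), lr=lr)
    for _ in range(epochs):
        for (x_s, _), x_t in zip(source_loader, target_loader):
            with torch.no_grad():
                f_s = E_s(x_s)                           # source domain feature vectors
            f_t = E_t(x_t)                               # target domain feature vectors

            # discriminator step: source features -> 1, target features -> 0
            d_s, d_t = D_d(f_s), D_d(f_t.detach())
            loss_d = (F.binary_cross_entropy(d_s, torch.ones_like(d_s)) +
                      F.binary_cross_entropy(d_t, torch.zeros_like(d_t)))
            opt_d.zero_grad(); loss_d.backward(); opt_d.step()

            # target extractor step: fool D_d into scoring target features as source
            d_t = D_d(E_t(x_t))
            loss_t = F.binary_cross_entropy(d_t, torch.ones_like(d_t))
            opt_t.zero_grad(); loss_t.backward(); opt_t.step()
    return E_t
```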
  • the network flow anomaly detection model includes: a target domain feature extractor and a classifier, the classifier is obtained by training in step S1, and the target domain feature extractor is obtained by training in step S2.
  • a method for generating a network flow anomaly detection model can be divided into three stages.
  • in the first stage, the classifier C_s and the source domain feature extractor E_S are trained based on the source domain.
  • abnormal network flows are generated from Gaussian noise by the vector generator, so that the normal and abnormal network flows input into the source domain feature extractor are balanced, preventing the classifier C_s from developing a bias toward normal samples that would result in an extremely low anomaly detection rate.
  • in the second stage, an adversarial domain adaptation method is used to train the target domain feature extractor E_t corresponding to the target domain and to map the data in the target domain into a feature space similar to that of the source domain, so as to minimize the distance between the target domain feature space and the source domain features; the features extracted by the target domain feature extractor on the target domain are thus similar to those extracted by the source domain feature extractor on the source domain, completing the adaptation from the source domain to the target domain.
  • in the third stage, the classifier C_s trained in the first stage and the target domain feature extractor E_t trained in the second stage are cascaded, finally yielding a network flow anomaly detection model that can perform anomaly detection on the target domain.
  • in summary, the first network model is trained based on the source domain to obtain a trained first network model, wherein the trained first network model includes a source domain feature extractor and a classifier; the second network model is trained based on the target domain, the source domain, the source domain feature extractor and the discriminator to obtain a target domain feature extractor; and a network flow anomaly detection model is generated according to the target domain feature extractor and the classifier.
  • because the data in the target domain has no labels, the second network model is trained in a generative adversarial manner to obtain the target domain feature extractor, so that the target domain feature extractor maps the data in the target domain into a feature space similar to that of the source domain, minimizing the distance between the target domain feature space and the source domain features; the features extracted by the target domain feature extractor on the target domain are therefore similar to the features extracted by the source domain feature extractor on the source domain, which completes the adaptation from the source domain to the target domain. Furthermore, when the classifier obtained by training on the source domain is used in a new scenario, the new scenario does not need a large amount of labeled data for secondary training, and the classifier trained on the source domain can perform anomaly detection on the target domain with high accuracy.
  • based on the above method for generating a network flow anomaly detection model, the present invention also provides a network flow anomaly detection method, which applies the network flow anomaly detection model obtained by the generation method described in the above embodiments; the network flow anomaly detection model includes a target domain feature extractor and a classifier, and as shown in FIG. 15, the network flow anomaly detection method includes:
  • the network flow anomaly detection model acquires the network flow to be detected in the target domain.
  • the target domain feature extractor extracts the to-be-detected feature vector corresponding to the to-be-detected network flow, wherein the target domain feature extractor is the target domain feature extractor in the above-mentioned method for generating a network flow anomaly detection model;
  • the classifier classifies the feature vector to be detected to obtain an anomaly detection result corresponding to the feature vector to be detected, wherein the classifier is the classifier in the above-mentioned method for generating a network flow anomaly detection model.
  • the target domain is first preprocessed to obtain the network flow to be detected; the process of preprocessing the target domain to obtain the network flow to be detected is the same as the process of obtaining the normal network flows and the abnormal network flows from the source domain in steps M1 to M3, and further, for the specific process of preprocessing the target domain to obtain the network flow to be detected, refer to the descriptions of steps M1 to M3.
  • the spatiotemporal features of the network flow to be detected are extracted by the target domain feature extractor to obtain the feature vector to be detected, and the feature vector to be detected is input into the classifier; the classifier outputs a floating-point number in [0,1], which is converted into a label representing the anomaly detection result through a binarization function.
  • a floating-point number less than or equal to 0.5 corresponds to the label 0 and a floating-point number greater than 0.5 corresponds to the label 1; the label 0 indicates that the anomaly detection result of the network flow to be detected is normal, and the label 1 indicates that the anomaly detection result of the network flow to be detected is abnormal.
  • since the features extracted by the target domain feature extractor on the target domain are similar to those extracted by the source domain feature extractor on the source domain, the classifier trained on the source domain can perform anomaly detection on the target domain with high accuracy.
  • the present invention provides a computer device, the device may be a terminal, and the internal structure is shown in FIG. 17 .
  • the computer equipment includes a processor, memory, a network model interface, a display screen, and an input device connected by a system bus.
  • the processor of the computer device is used to provide computing and control capabilities.
  • the memory of the computer device includes non-volatile storage media, internal memory.
  • the nonvolatile storage medium stores an operating system and a computer program.
  • the internal memory provides an environment for the execution of the operating system and computer programs in the non-volatile storage medium.
  • the network model interface of the computer equipment is used to communicate with external terminals through the network model connection.
  • the computer program is executed by the processor to implement a method for generating an anomaly detection model of a network flow, or a method for detecting an anomaly in a network flow.
  • the display screen of the computer equipment may be a liquid crystal display screen or an electronic ink display screen
  • the input device of the computer equipment may be a touch layer covered on the display screen, or a button, a trackball or a touchpad set on the shell of the computer equipment , or an external keyboard, trackpad, or mouse.
  • FIG. 17 shows only part of the structure related to the solution of the present application and does not constitute a limitation on the computer equipment to which the solution of the present application is applied; a specific computer device may include more or fewer components than shown, or combine certain components, or have a different arrangement of components.
  • An embodiment of the present invention provides a computer device, including a memory and a processor, wherein the memory stores a computer program, wherein the processor implements the following steps when executing the computer program:
  • the first network model is trained based on the source domain to obtain a trained first network model, wherein the trained first network model includes a source domain feature extractor and a classifier;
  • the second network model is trained based on the target domain, the source domain, the source domain feature extractor and the discriminator to obtain a target domain feature extractor;
  • a network flow anomaly detection model is generated according to the target domain feature extractor and the classifier; or, the network flow anomaly detection model acquires the network flow to be detected in the target domain;
  • the target domain feature extractor extracts the to-be-detected feature vector corresponding to the to-be-detected network flow, wherein the target domain feature extractor is the target domain feature extractor in the above-mentioned network flow anomaly detection method;
  • the classifier classifies the to-be-detected feature vector to obtain an anomaly detection result corresponding to the to-be-detected feature vector, wherein the classifier is a classifier in the above-mentioned network flow anomaly detection method.
  • An embodiment of the present invention also provides a computer-readable storage medium on which a computer program is stored, and is characterized in that, when the computer program is executed by a processor, the following steps are implemented:
  • the first network model is trained based on the source domain to obtain a trained first network model, wherein the trained first network model includes a source domain feature extractor and a classifier;
  • the second network model is trained based on the target domain, the source domain, the source domain feature extractor and the discriminator to obtain a target domain feature extractor;
  • a network flow anomaly detection model is generated according to the target domain feature extractor and the classifier; or, the network flow anomaly detection model acquires the network flow to be detected in the target domain;
  • the target domain feature extractor extracts the to-be-detected feature vector corresponding to the to-be-detected network flow, wherein the target domain feature extractor is the target domain feature extractor in the above-mentioned network flow anomaly detection method;
  • the classifier classifies the to-be-detected feature vector to obtain an anomaly detection result corresponding to the to-be-detected feature vector, wherein the classifier is a classifier in the above-mentioned network flow anomaly detection method.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The present invention provides a method for generating a network flow anomaly detection model and a computer device. The method for generating a network flow anomaly detection model includes: training a first network model based on a source domain to obtain a trained first network model, wherein the trained first network model includes a source domain feature extractor and a classifier; training a second network model based on a target domain, the source domain, the source domain feature extractor and a discriminator to obtain a target domain feature extractor; and generating a network flow anomaly detection model according to the target domain feature extractor and the classifier. In the present invention, training makes the features extracted by the target domain feature extractor on the target domain similar to the features extracted by the source domain feature extractor on the source domain; therefore, the classifier in the network flow anomaly detection model, which is trained based on the source domain, can perform anomaly detection on the target domain with high accuracy.

Description

一种网络流异常检测模型的生成方法和计算机设备 技术领域
本申请涉及网络数据检测技术领域,特别是涉及一种网络流异常检测模型的生成方法和计算机设备。
背景技术
网络攻击是当今社会日益紧密联系的一个严重问题,随着网络的发展和应用范围的不断扩大,网络入侵手段日新月异,造成的破坏越来越大。入侵是指尝试访问有关计算机系统或以非法或未经授权的方式破坏系统运行。异常检测可以很好的检测新的网络入侵行为。
现有的异常检测方法没有考虑到网络数据特征场景变化对算法性能带来的影响,模型训练和模型检测是在同一个数据集上进行的,只能说明在某个数据集上训练的模型,对针对这个数据集的检测是有效的。在新的场景下需要使用对模型进行调整,而对模型进行调整依赖大量有标记的数据,因此,不适用于在数据少、无标签的环境。
因此,现有技术有待改进。
发明内容
本发明提供了一种网络流异常检测模型的生成方法和计算机设备,本发明中的已训练的目标域特征提取器在目标域上提取到的特征,与源域特征提取器在源域上提取的特征相似,因此,网络流异常检测模型中基于源域训练得到的分类器,可以对目标域进行异常检测,且准确性高。
第一方面,本发明实施例提供了网络流异常检测模型的生成方法,包括:
基于源域对第一网络模型进行训练,以得到已训练的第一网络模型,其中,所述已训练的第一网络模型包括源域特征提取器和分类器;
基于目标域、所述源域、所述源域特征提取器和判别器对第二网络模型进行训练,以得到目标域特征提取器;
根据所述目标域特征提取器和所述分类器生成网络流异常检测模型。
第二方面,本发明实施例提供了一种网络流的异常检测方法,应用于网络流异常检测模型,所述网络流异常检测模型包括目标域特征提取器和分类器,所述网络流的异常检测方法包括:
所述网络流异常检测模型获取目标域中的待检测网络流;
所述目标域特征提取器提取所述待检测网络流对应的待检测特征向量,其中,所述目标域特征提取器是上述网络流异常检测模型的生成方法中的目标域特征提取器;
所述分类器对所述待检测特征向量进行分类，以得到所述待检测特征向量对应的异常检测结果，其中，所述分类器是上述网络流异常检测模型的生成方法中的分类器。
第三方面,本发明实施例提供了一种计算机设备,包括存储器和处理器,所述存储器存储有计算机程序,所述处理器执行所述计算机程序时实现以下步骤:
基于源域对第一网络模型进行训练,以得到已训练的第一网络模型,其中,所述已训练的第一网络模型包括源域特征提取器和分类器;
基于目标域、所述源域、所述源域特征提取器和判别器对第二网络模型进行训练,以得到目标域特征提取器;
根据所述目标域特征提取器和所述分类器生成网络流异常检测模型;
或者,所述网络流异常检测模型获取目标域中的待检测网络流;
所述目标域特征提取器提取所述待检测网络流对应的待检测特征向量,其中,所述目标域特征提取器是上述网络流异常检测模型的生成方法中的目标域特征提取器;
所述分类器对所述待检测特征向量进行分类,以得到所述待检测特征向量对应的异常检测结果,其中,所述分类器是上述网络流异常检测模型的生成方法中的分类器。
第四方面,本发明实施例还提供了一种计算机可读存储介质,其上存储有计算机程序,所述计算机程序被处理器执行时实现以下步骤:
基于源域对第一网络模型进行训练,以得到已训练的第一网络模型,其中,所述已训练的第一网络模型包括源域特征提取器和分类器;
基于目标域、所述源域、所述源域特征提取器和判别器对第二网络模型进行训练,以得到目标域特征提取器;
根据所述目标域特征提取器和所述分类器生成网络流异常检测模型;
或者,所述网络流异常检测模型获取目标域中的待检测网络流;
所述目标域特征提取器提取所述待检测网络流对应的待检测特征向量,其中,所述目标域特征提取器是上述网络流异常检测模型的生成方法中的目标域特征提取器;
所述分类器对所述待检测特征向量进行分类,以得到所述待检测特征向量对应的异常检测结果,其中,所述分类器是上述网络流异常检测模型的生成方法中的分类器。
与现有技术相比,本发明实施例具有以下优点:
本发明提供了一种网络流异常检测模型的生成方法,包括:基于源域对第一网络模型进 行训练,以得到已训练的第一网络模型,其中,所述已训练的第一网络模型包括源域特征提取器和分类器;基于目标域、所述源域、所述源域特征提取器和判别器对第二网络模型进行训练,以得到目标域特征提取器;根据所述目标域特征提取器和所述分类器生成网络流异常检测模型。本发明中,目标域中的数据没有标签,通过生成对抗的方式训练第二网络模型,得到目标域特征提取器,使得目标域特征提取器可以将目标域上的数据映射到源域相似的特征空间,以实现最小化目标域的特征空间和源域的特征之间的空间距离,使得目标域特征提取器在目标域上提取到的特征,与源域特征提取器在源域上提取的特征相似,从而完成源域到目标域的适应过程;进而,在新场景下使用通过源域训练得到的分类器时,不需要新场景具有大量有标签的数据进行二次训练。本发明中的网络流异常检测模型中基于源域训练得到的分类器,可以对目标域进行异常检测,且准确性高。
附图说明
为了更清楚地说明本发明实施例或现有技术中的技术方案,下面将对实施例或现有技术描述中所需要使用的附图作简单地介绍,显而易见地,下面描述中的附图仅仅是本发明中记载的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其他的附图。
图1为本发明实施例中一种网络流异常检测模型的生成方法的应用场的示意图;
图2为本发明实施例中一种网络流异常检测模型的生成方法的流程示意图;
图3为本发明实施例中从源域中确定各正常网络流和各异常网络流的过程示意图;
图4为本发明实施例中将第一网络流转换为第一字符串后的形式的示意图;
图5为本发明实施例中通过抓包(Wiresharks)解析第一网络流得到的形式的示意图;
图6为本发明实施例中将第一三维张量以Numpy压缩(Numpy zip,NPZ)文件形式存储后的示意图;
图7为本发明实施例中通过卷积神经网络提取流特征向量的过程是示意图;
图8为本发明实施例中卷积神经网络的结构示意图;
图9为本发明实施例中循环神经网络的结构示意图;
图10为本发明实施例中当源域中的异常网络流不足时,利用向量生成器生成异常网络流的示意图;
图11为本发明实施例中向量生成器的结构示意图;
图12为本发明实施例中对第二网络模型进行训练的过程示意图;
图13为本发明实施例中网络流异常检测模型的结构示意图;
图14为本发明实施例中一种网络流异常检测模型的生成方法在具体实施时的阶段示意图;
图15为本发明实施例中一种网络流的异常检测方法的流程示意图;
图16为本发明实施例中一种网络流的异常检测方法在具体实施时的流程示意图;
图17为本发明实施例中计算机设备的内部结构图。
具体实施方式
本发明提供一种网络流异常检测模型的生成方法和计算机设备,为使本发明的目的、技术方案及效果更加清楚、明确,以下参照附图并举实施例对本发明进一步详细说明。应当理解,此处所描述的具体实施例仅用以解释本发明,并不用于限定本发明。
本技术领域技术人员可以理解,除非特意声明,这里使用的单数形式“一”、“一个”、“所述”和“该”也可包括复数形式。应该进一步理解的是,本发明的说明书中使用的措辞“包括”是指存在所述特征、整数、步骤、操作、元件和/或组件,但是并不排除存在或添加一个或多个其他特征、整数、步骤、操作、元件、组件和/或它们的组。应该理解,当我们称元件被“连接”或“耦接”到另一元件时,它可以直接连接或耦接到其他元件,或者也可以存在中间元件。此外,这里使用的“连接”或“耦接”可以包括无线连接或无线耦接。这里使用的措辞“和/或”包括一个或更多个相关联的列出项的全部或任一单元和全部组合。
本技术领域技术人员可以理解,除非另外定义,这里使用的所有术语(包括技术术语和科学术语),具有与本发明所属领域中的普通技术人员的一般理解相同的意义。还应该理解的是,诸如通用字典中定义的那些术语,应该被理解为具有与现有技术的上下文中的意义一致的意义,并且除非像这里一样被特定定义,否则不会用理想化或过于正式的含义来解释。
发明人经过研究发现,现有技术中,通过深度学习方法虽然可以在大数据集上训练一个很好的分类器,但是这种训练好的模型往往无法直接推广到新的具有不同数据分布特征的场景中。典型的解决方案是先训练好模型,然后针对特定任务的数据集再进一步调整(Fine-tuning)模型。但这是极其困难和代价昂贵的,尤其是在网络异常检测领域,通常很难获得足够具有标签的数据来调整具有巨量参数的深度神经网络。也就是说,现有的异常检测方法没有考虑到网络数据特征场景变化对算法性能带来的影响,模型训练和模型检测是在同一个数据集上进行的,只能说明在某个数据集上训练的模型,对针对这个数据集的检测是有效的。在新的场景下需要使用对模型进行调整,而对模型进行调整依赖大量有标记的数据, 因此,不适用于在数据少、无标签的环境。
为了解决上述问题,在本发明实施例中,基于源域对第一网络模型进行训练,以得到已训练的第一网络模型,其中,所述已训练的第一网络模型包括源域特征提取器和分类器;基于目标域、所述源域、所述源域特征提取器和判别器对第二网络模型进行训练,以得到目标域特征提取器;根据所述目标域特征提取器和所述分类器生成网络流异常检测模型。目标域中的数据没有标签,通过生成对抗的方式训练第二网络模型,得到目标域特征提取器,使得目标域特征提取器可以将目标域上的数据映射到源域相似的特征空间,以实现最小化目标域的特征空间和源域的特征之间的空间距离,使得目标域特征提取器在目标域上提取到的特征,与源域特征提取器在源域上提取的特征相似,从而完成源域到目标域的适应过程;进而,在新场景下使用通过源域训练得到的分类器时,不需要新场景具有大量有标签的数据进行二次训练。本发明中的网络流异常检测模型中基于源域训练得到的分类器,可以对目标域进行异常检测,且准确性高。
本实施例提供了一种网络流异常检测模型的生成方法和计算机设备,所述网络流异常检测模型的生成方法可以应用到如图1所示的场景。在该场景中,首先,终端设备1可以采集源域和目标域,并将源域和目标域输入服务器2,以使得服务器2依据所述源域和所述目标域训练第一网络模型和第二网络模型进行训练。所述服务器2可以预选存储有第一网络模型和第二网络模型,并响应终端设备1的输入的源域和目标域,以训练第一网络模型和第二网络模型,得到目标域特征提取器和分类器,再根据所述目标域特征提取器和所述分类器生成网络流异常检测模型。
可以理解的是,在上述应用场景中,虽然将本发明实施方式的动作描述为部分由终端设备1执行,部分由服务器2执行,但是这些动作也可以完全由服务器2执行,或者完全由终端设备1执行。本发明在执行主体方面不受限制,只要执行了本发明实施方式所公开的动作即可。
进一步,在生成网络流异常检测模型后,将所述网络流异常检测模型应用于电子设备中,用于检测电子设备从目标域获取的待检测网络流是否异常,电子设备包括PC机、服务器、手机、平板电脑、掌上电脑、个人数字助理(Personal Digital Assistant,PDA)等。
需要注意的是,上述应用场景仅是为了便于理解本发明而示出,本发明的实施方式在此方面不受任何限制。相反,本发明的实施方式可以应用于适用的任何场景。
下面结合附图,通过对实施例的描述,对发明内容作进一步说明。
参阅图2,本实施例提供了一种网络流异常检测模型的生成方法,包括:
S1、基于源域对第一网络模型进行训练,以得到已训练的第一网络模型,其中,所述已训练的第一网络模型包括源域特征提取器和分类器。
在本发明实施例中,所述第一网络模型为深度学习网络模型,所述源域中的流量是有标签的流量,标签用于表示源域中的网络流量是正常流量或者异常流量,通过源域训练第一网络模型,以得到已训练的第一网络模型,所述已训练的第一网络模型包括源域特征提取器和分类器。
具体的,步骤S1包括:
S11、将训练数据中的正常网络流和所述训练数据中的异常网络流输入所述第一网络模型,通过所述第一网络模型生成所述正常网络流对应的第一检测分数和所述异常网络流对应的第二检测分数,其中,所述训练数据包括多个训练组,每个训练组包括来自源域的正常网络流和来自源域的异常网络流。
在本发明实施例中,所述第一网络模型包括第一子网络和第二子网络,所述第一子网络用于提取各网络流(包括正常网络流和异常网络流)的流特征向量,所述第二子网络用于为提取的流特征向量进行分类,并输出流特征向量对应的分数。将源域的正常网络流和异常网络流输入所述第一子网络,以得到所述正常网络流对应的正常流特征向量,以及所述异常网络流对应的异常流特征向量,将所述正常流特征向量和所述异常流特征向量输入所述第二子网络,以得到所述正常流特征向量对应的第一检测分数,以及所述异常流特征向量对应的第二检测分数。所述第一检测分数为所述第二子网络基于所述正常流特征向量得到的分数,所述第二检测分数为所述第二子网络基于所述异常流特征向量得到的分数。在后文会详细介绍“将训练数据中的正常网络流和所述训练数据中的异常网络流输入所述第一网络模型,通过所述第一网络模型生成所述正常网络流对应的第一检测分数和所述异常网络流对应的第二检测分数”的详细过程。
在发明实施例中,所述训练数据中的每个训练组包括来自源域的正常网络流和来自源域的异常网络流;从源域中可以确定各正常网络流和各异常网络流。
具体的,参阅图3,从源域中确定各正常网络流和各异常网络流的过程如下:
11、获取源域;
12、对源域中的大的数据包获取(Packet capture,PCAP)文件进行分割,得到以网络流为分割粒度的PCAP文件;
21、对各网络流进行过滤,以滤除无法识别标签的网络流;
22、为过滤后的各网络流添加标签,得到各第一网络流和各第二网络流数据,其中,第 一网络流是添加了正常标签的网络流,第二网络流是添加了异常标签的网络流;
31、根据各第一网络流生成预设大小的各第一三维张量,根据各第二网络流生成预设大小的各第二三维张量;
41、将各第一三维张量保存为Numpy压缩(Numpy zip,NPZ)文件,得到各正常网络流,将各第二三维张量保存为NPZ文件,得到各异常网络流。正常网络流和异常网络流是第一网络模型的输入项。
接下来详细介绍如何得到训练数据中的正常网络流和异常网络流。在步骤S11之前,还包括:
M、基于所述源域确定各异常网络流和各正常网络流。
在本发明实施例中,源域包括多个网络流,首先根据数据集的描述文件可以确定各网络流的标签,所述标签包括正常标签和异常标签,为各网络流添加标签,以得到第一网络流和第二网络流,所述第一网络流为添加了正常标签的网络流,所述第二网络流为添加了异常标签的网络流。其次,根据第一网络流生成为第一三维张量,以得到正常流量,根据第二网络流生成第二三维张量,以得到异常网络流。
具体的,步骤M包括:
M1、提取所述源域中的各第一网络流和各第二网络流。
在本发明实施例中，捕获源域对应的网络流，捕获的源域对应的网络流通常是通过PCAP文件进行存储的，所述源域对应的网络流通常比较大，例如，所述源域对应的网络流的大小为几个G到几十个G，所述源域对应的网络流中包括成千上万个数据包，通常是采集某个网络在一定时间段内的数据包。由于所述源域对应的网络流通过PCAP文件进行存储，可以得到源域对应的PCAP文件，对源域对应的PCAP文件进行切割，以得到各子PCAP文件，每个子PCAP文件对应一个切割后的网络流。
所谓对源域对应的PCAP文件进行切割,具体的,利用pkt2flow工具将捕获的数据包(源域对应的PCAP文件包括多个数据包)以流(五元组:源IP,源端口,目的IP,目的端口,协议)为单位进行切割,以得到多个子PCAP文件每个子PCAP文件代表一个流,子PCAP文件的文件名可以通过每个PCAP文件的五元组命名。
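The splitting step above relies on the pkt2flow tool. As a rough illustration only (not pkt2flow's actual behaviour), the following Python/Scapy sketch groups the packets of one large PCAP by the (source IP, source port, destination IP, destination port, protocol) 5-tuple and writes one sub-PCAP per flow; the output naming scheme is an assumption.

```python
import os
from collections import defaultdict
from scapy.all import rdpcap, wrpcap, IP, TCP, UDP

def split_pcap_by_flow(pcap_path, out_dir):
    """Group packets by 5-tuple and write one sub-PCAP per flow (sketch of the pkt2flow step)."""
    os.makedirs(out_dir, exist_ok=True)
    flows = defaultdict(list)
    for pkt in rdpcap(pcap_path):
        if IP not in pkt:
            continue  # only IP traffic carries the 5-tuple used here
        layer = pkt[TCP] if TCP in pkt else (pkt[UDP] if UDP in pkt else None)
        proto = "tcp" if TCP in pkt else ("udp" if UDP in pkt else "other")
        sport = layer.sport if layer is not None else 0
        dport = layer.dport if layer is not None else 0
        flows[(pkt[IP].src, sport, pkt[IP].dst, dport, proto)].append(pkt)
    for (src, sport, dst, dport, proto), pkts in flows.items():
        # name each sub-PCAP after its 5-tuple, as the text describes
        wrpcap(f"{out_dir}/{src}_{sport}_{dst}_{dport}_{proto}.pcap", pkts)
    return flows
```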
对于切割好的各子PCAP文件，将各子PCAP文件中无法识别类型的子PCAP文件滤除，以得到滤除后的子PCAP文件。所谓无法识别类型的子PCAP文件是指无法识别子PCAP文件是正常或是异常的网络流量。
对于所述滤除后的子PCAP文件，根据数据集的描述文件为各子PCAP文件添加标签；所述数据集包括以PCAP文件存储的网络流量和以流为单位进行标记的逗号分离值（Comma-Separated Values，CSV）文件；CSV文件会记录每个子PCAP文件是正常的流量还是异常的流量，此操作就是根据CSV文件的描述编写代码给每个子PCAP文件添加标签，所述标签包括正常标签和异常标签。添加标签会改变子PCAP文件的文件名，也就是说，通过子PCAP文件的文件名可以确定该子PCAP文件是正常的流量还是异常的流量。为了便于说明，将添加正常标签后的子PCAP文件记为第一网络流，将添加异常标签后的子PCAP文件记为第二网络流。
M2、根据所述各第一网络流生成预设大小的各第一三维张量,并将所述各第一三维张量作为所述各正常网络流。
在本发明实施例中,对于每个第一网络流,提取所述第一网络流中的多个第一网络数据包,根据各第一网络数据包生成第一三维张量。
具体的,步骤M2包括:
M21、对于每个第一网络流,提取所述第一网络流对应的各第一网络数据包。
在本发明实施例中,提取所述第一网络流中的多个第一网络数据包;具体的,对于第一网络流,通过Scapy包中的rdpcap()函数获取一个Packets对象,此操作可以得到所述第一网络流对应的各Packets对象,即得到多个第一网络数据包。
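For reference, a minimal Scapy snippet for the rdpcap() step described here; the file path is a placeholder.

```python
from scapy.all import rdpcap

# rdpcap() returns a PacketList (the "Packets object") holding every packet of the flow
packets = rdpcap("flows/192.168.1.2_443_10.0.0.5_52310_tcp.pcap")
print(len(packets), "packets in this flow")
```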
M22、根据各第一网络数据包得到所述第一网络流对应的预设大小的各第一三维张量,并将所述各第一三维张量作为所述各正常网络流。
在本发明实施例中,首先将各第一网络数据包转换为各自分别对应的第一字符串,根据每个第一字符串可以得到第一二维张量,根据各第一字符串分别对应的第一二维张量,得到第一三维张量,所述预设大小可以用于表示第一二维张量的大小和第一二维张量的个数。
具体的,步骤M22包括:
M221、对每个第一网络数据包进行序列化处理,以得到各第一网络数据包各自分别对应的第一字符串。
在本发明实施例中,首先对每个第一网络数据包进行序列化处理,得到第一字符串,所述第一字符串为十六进制数字形式的字符串,十六进制数的取值在[0,255]区间内,十六进制数与图像的灰度取值范围一致,也就是说,第一网络数据包对应的第一字符串可以间接表示图像的灰度值。在实际应用时,参见图4和图5,图4是将第一网络数据包转换为第一字符串后的形式,图5是通过抓包(Wiresharks)解析第一网络数据包得到的形式,可见,解析所述第一网络数据包得到的每个字段的值,与通过Wiresharks解析出来的字段值完全一致,也 就是说,本发明实施例中将第一网络数据包转换为第一字符串是有实际意义的。
M222、根据各第一字符串生成预设大小的各第一三维张量,并将所述各第一三维张量作为所述各正常网络流。
在本发明实施例中,所述预设大小包括数据包数量和截取的每个数据包的大小,将网络数据包数量记为pkt_num,将截取的每个数据包的大小记为pkt_size。依次读取各第一网络数据包,对于每个第一网络数据包,根据该第一网络数据包对应的第一字符串和所述截取的每个数据包的大小,生成第一二维张量,其中,根据所述截取的每个数据包的大小确定所述第一二维张量的大小,截取的每个数据包中包括pkt_size个有效字符,所谓有效字符,是指字符串中的数字部分,将\x分隔的叫做一个有效字符,例如\xff;字符串中的其他符号作为标识符不进行转换,只用来进行字符串解析的标识。一个有效字符的含义对应一个十六进制数,如\xff,对应一个十六进制数,用一个字节存储。
例如,pkt_size为484,则根据第一字符串中前484个有效字符生成大小为22*22大小的第一二维张量。若第一字符串中有效字符的数量超过484字节,则只取该第一字符串的前484个字节,以生成第一二维张量,若第一字符串中的有效字符的数量不足484字节,则在第一字符串的末尾添加0,以得到具有484字节大小的第一字符串,再根据添加0后的第一字符串生成大小为22*22大小的第一二维张量。
根据各第一字符串各自分别对应第一二维张量,生成第一三维张量。所述第一字符串的数量与所述第一数据包的数量相同,根据各第一字符串和所述网络数据包数量生成第一三维张量。若所述第一数据包的数量大于所述网络数据包数量,则只取网络数据包数量个第一网络数据包,以生成第一三维张量;若所述第一网络数据包的数量小于所述网络数据包数量(第一字符串的数量小于所述网络数据包数量),则在生成第一三维张量时生成零矩阵,以使得添加零矩阵后的第一三维张量的大小为预设大小。
例如,假设pkt_size为484,pkt_num为10,则所述预设大小为10*22*22,即第一三维张量的尺寸为10*22*22;假设,对于一个第一网络流,该第一网络流包括15个第一网络数据包,则根据15个第一网络数据包中前10个第一网络数据包生成第一三维张量,该第一三维张量的大小为:10*22*22。假设,对于一个第一网络流,该第一网络流包括8个第一网络数据包,则根据8个第一网络数据包生成三维张量,并使用np.zeros()函数直接生成零矩阵,使得添加了零矩阵后得到大小为10*22*22的第一三维张量。
在本发明实施例中,将第一三维张量以NPZ形式存储,以得到正常网络流;参见图6,图6是将第一三维张量以NPZ形式存储后的示意图。
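A hedged sketch of the preprocessing described in steps M21-M222 and the NPZ storage above: each packet's raw bytes stand in for the serialized hexadecimal string, the first pkt_size=484 bytes are kept (zero-padded if shorter) and reshaped to 22×22, and the flow is zero-padded or truncated to pkt_num packets (10, as in the worked example) before being saved with np.savez. The helper names are illustrative, not from the original.

```python
import numpy as np
from scapy.all import rdpcap

PKT_SIZE = 484   # bytes kept per packet (22 * 22)
PKT_NUM = 10     # packets kept per flow, as in the worked example above

def packet_to_matrix(pkt):
    """Serialize one packet and turn its first PKT_SIZE bytes into a 22x22 matrix."""
    raw = bytes(pkt)[:PKT_SIZE]                      # each byte lies in [0, 255], like a grayscale pixel
    raw = raw + b"\x00" * (PKT_SIZE - len(raw))      # zero-pad short packets
    return np.frombuffer(raw, dtype=np.uint8).reshape(22, 22)

def flow_to_tensor(pcap_path):
    """Build the PKT_NUM x 22 x 22 tensor for one flow, zero-padding missing packets."""
    mats = [packet_to_matrix(p) for p in rdpcap(pcap_path)[:PKT_NUM]]
    while len(mats) < PKT_NUM:
        mats.append(np.zeros((22, 22), dtype=np.uint8))
    return np.stack(mats)

def save_flow(pcap_path, npz_path, label):
    """Store the flow tensor and its normal(0)/abnormal(1) label in an NPZ file."""
    np.savez(npz_path, flow=flow_to_tensor(pcap_path), label=label)
```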
M3、根据所述各第二网络流生成所述预设大小的各第二三维张量,并将所述各第二三维张量作为所述各异常网络流。
在本发明实施例中,对于每个第二网络流,提取所述第二网络流中的各第二网络数据包,根据各第二网络数据包生成第二三维张量。
具体的,步骤M3包括:
M31、对于每个第二网络流，提取所述第二网络流对应的各第二网络数据包。
在本发明实施例中,提取所述第二网络流中的多个第二网络数据包;具体的,对于第二网络流,通过Scapy包中的rdpcap()函数获取一个Packets对象,此操作可以得到所述第二网络流对应的各Packets对象,即得到多个第二网络数据包。
M32、根据各第二网络数据包得到所述第二网络流对应的预设大小的各第二三维张量，并将所述各第二三维张量作为所述各异常网络流。
在本发明实施例中,首先将各第二网络数据包转换为各自分别对应的第二字符串,根据每个第二字符串可以得到第二二维张量,根据各第二字符串各自分别对应的第二二维张量,得到第二三维张量,所述预设大小可以用于表示第二二维张量的大小和第二二维张量的个数。
具体的,步骤M32包括:
M321、对每个第二网络数据包进行序列化处理,以得到各第二网络数据包各自分别对应的第二字符串;
M322、根据各第二字符串生成所述预设大小的第二三维张量,并将所述第二三维张量作为所述异常网络流。
在本发明实施例中,所述“对每个第二网络数据包进行序列化处理,以得到各第二网络数据包各自分别对应的第二字符串”的执行过程,与:“对每个第一网络数据包进行序列化处理,以得到各第一网络数据包各自分别对应的第一字符串”的执行过程一致,进而,步骤M321的具体说明可以参考上述对步骤M221的说明。
在本发明实施例中,所述“根据各第二字符串生成所述预设大小的第二三维张量,并将所述第二三维张量作为所述异常网络流”的执行过程,与:“根据各第一字符串生成预设大小的第一三维张量,并将所述第一三维张量作为所述正常网络流”的执行过程一致,进而,步骤M322的具体说明可以参考上述对步骤M222的说明。
接下来详细介绍“将训练数据中的正常网络流和所述训练数据中的异常网络流输入所述第一网络模型,通过所述第一网络模型生成所述正常网络流对应的第一检测分数和所述异常网络流对应的第二检测分数”的具体过程。
在本发明实施例中,所述第一网络模型包括第一子网络和第二子网络,所述第一子网络用于提取正常网络流对应的正常流特征向量,以及异常网络流对应的异常流特征向量。将正常流特征向量和异常流特征向量输入第二子网络,得到正常流特征向量对应的第一检测分数,以及异常流特征向量对应的第二检测分数。
为了便于说明,将正常网络流和异常网络流统称为网络流。所述第一子网络包括卷积神经网络(CNN)和循环神经网络(GRU),首先利用CNN学习网络流的空间特征,再利用GRU学习网络流的时序特征。具体的,参见图7,每一个网络流均是三维张量(n*m*m)形式,可以分为n个二维向量(m*m),二维向量即是数据包的包特征向量;具体的,对于第一三维张量,第一三维张量包括各第一网络数据包各自分别对应的第一二维向量,即各第一网络数据包各自分别对应的包向量;对于第二三维张量,第二三维张量包括各第二网络数据包各自分别对应的包向量。
将各个大小为m*m的包向量输入CNN,以得到各包向量各自分别对应的包特征向量,利用np.concatenate()函数将各包特征向量拼接为一个特征向量,再将拼接后的特征向量输入GRU,以学习拼接后的特征向量的时序特征,得到网络流对应的流特征向量。对于正常网络流,通过CNN和GRU得到正常网络流对应的正常流特征向量;对于异常网络流,通过CNN和GRU得到异常网络流对应的异常流特征向量。
在本发明实施例中,CNN的输入项是三维张量(n*m*m)形式,在步骤M2中,生成了预设大小的第一三维张量,以及在步骤M3中,生成了预设大小的第二三维张量。所述第一三维张量为正常网络流,即CNN的输入项,所述第二三维张量也为CNN的输入项。
由于CNN要求输入的数据有固定的大小,所述预设大小包括数据包数量和截取的每个数据包的大小,数据包数量和截取的每个数据包的大小对算法有很大的影响。有些攻击类型如DoS攻击会跟包头数据以及一个流中的前几个包关系更大,有些攻击类型如XSS攻击会跟负载的数据关系更大,因此确定对原始网络流数据的哪一部分进行表征学习会对算法的检测准确度产生重要影响。
在本发明实施例中,通过在多个数据集上各种类型的攻击的综合实验结果分析,以及对数据集的当中流和数据包的统计结果,综合考虑各个指标,可以将预设大小设为:数据包数量为6,截取的每个数据包的大小为484。这样一个流最终被处理成一个6*22*22的三维张量输入到CNN中,实际使用过程中可以再根据数据的特点进行调优。
现有技术中,HAST-NAD首先提出使用卷积神经网络(CNN)学习网络流的空间特征,然后使用循环神经网络(LSTM)学习网络流之间的时序特征,与HAST-NAD不同的是,本 申请没有进行独热编码(One-Hot Encoding)。同时本申请选用GRU而不是LSTM是因为GRU的开销比LSTM更低而效果却相差无几,考虑到网络流检测对效率的要求,最终选择GRU来捕获网络流的时序特征。
在本发明实施例中,参见图8,卷积神经网络(CNN)包括三个卷积层、两个池化层和一个线形层,卷积层当中的激活函数使用ReLU;将网络流输入CNN,在CNN的各个层得到尺度不同的各特征图,其中,@前面的数字代表通道数,@后面的数字代表特征图大小,特征图的本质是特征提取后得到的矩阵。将一个三维张量输入CNN,具体的,将一个三维张量中的各二维向量依次输入CNN,一个三维张量中的各二维向量即网络流对应的各网络数据包对应的包向量。CNN的最终输出是各包向量各自分别对应的包特征向量。参见图9,循环神经网络(GRU)包括两个GRU层和一个Flatten层,GRU的输入项是拼接后的包特征向量,GRU网络的输出为流特征向量,流特征向量为一维的特征向量。
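A hedged PyTorch sketch of the first sub-network as described (three convolutional layers, two pooling layers and one linear layer with ReLU, followed by a two-layer GRU and a flatten). Channel counts, kernel sizes and the feature dimensions are assumptions, and the per-packet CNN features are arranged as a sequence for the GRU, which is one reading of the concatenation step described above.

```python
import torch
import torch.nn as nn

class FlowFeatureExtractor(nn.Module):
    """CNN per packet + GRU over the packet sequence; stands in for E_S / the second network model."""

    def __init__(self, pkt_num=6, feat_dim=64, hidden_dim=64):
        super().__init__()
        self.cnn = nn.Sequential(                       # 3 conv + 2 pool + 1 linear, ReLU activations
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                            # 22 -> 11
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                            # 11 -> 5
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.Flatten(),
            nn.Linear(64 * 5 * 5, feat_dim), nn.ReLU(),
        )
        self.gru = nn.GRU(feat_dim, hidden_dim, num_layers=2, batch_first=True)
        self.out_dim = pkt_num * hidden_dim             # length of the flattened flow feature vector

    def forward(self, x):                               # x: (batch, pkt_num, 22, 22), float
        b, n, h, w = x.shape
        pkt_feats = self.cnn(x.reshape(b * n, 1, h, w)) # per-packet spatial features
        pkt_feats = pkt_feats.reshape(b, n, -1)         # gather packet features back into a sequence
        seq, _ = self.gru(pkt_feats)                    # temporal features over the packet sequence
        return seq.reshape(b, -1)                       # flatten into one flow feature vector
```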
为了便于说明,将正常流特征向量和异常流特征向量统称为流特征向量。所述第二子网络的本质为一个分类器,用来对已经提取的特征做出是否是异常的判定,第二子网络的输出结果是[0,1]区间的浮点数,即所述第一检测分数和所述第二检测分数均是[0,1]区间的浮点数。
在本发明实施例中,所述正常网络流是添加了正常标签的网络流,所述正常标签用0表示,所述异常网络流是添加了异常标签的网络流,所述异常标签用1表示;所述第一检测分数是正常网络流对应的检测分数,所述第二检测分数是异常网络流对应的检测分数;也就是说,所述第一网络模型根据正常网络流得到第一检测分数,所述第一网络模型根据异常网络流得到第二检测分数。
在本发明实施例中,由于已经通过第一子网络得到了时空特征,因此,第二子网络的网络结构比较简单,第二子网络包括全连接层和Sigmoid层。
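A matching sketch of the second sub-network (the classifier C_s), i.e. a fully connected layer followed by Sigmoid; the input dimension must equal the flow feature vector length and is assumed here to match the extractor sketched above.

```python
import torch.nn as nn

class FlowClassifier(nn.Module):
    """C_s: fully connected layer + Sigmoid, outputs an anomaly score in [0, 1]."""

    def __init__(self, in_dim=6 * 64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 1), nn.Sigmoid())

    def forward(self, f):        # f: (batch, in_dim) flow feature vectors
        return self.net(f)
```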
S12、根据所述第一检测分数和所述第二检测分数对所述第一网络模型进行训练,直至满足第一预设条件,以得到已训练的第一网络模型。
在本发明实施例中,所述第一网络模型包括第一子网络和第二子网络,因此,在训练时,根据所述第一检测分数和所述第二检测分数对第一子网络和第二子网络进行训练,以得到已训练的第一子网络,即源域特征提取器,以及已训练的第二子网络,即分类器。在训练过程中会在一次迭代中同时将正常流特征向量和异常流特征向量输入第二子网络,这样,已训练的第二子网络(分类器)可以用于区分输入分别是正常网络流还是异常网络流。
根据所述第一检测分数和所述第二检测分数对所述第一网络模型进行训练的过程是:根据所述第一检测分数和所述第二检测分数修改第一子网络的参数和第二子网络的参数。具体 的,根据第一检测分数和第二检测分数计算分类损失函数值,再根据分类损失函数值修改第一子网络的参数和第二子网络的参数。
现有技术中,异常检测是一个典型的数据不平衡问题,也就是说训练数据中异常网络流远远少于正常网络流,在这种不平衡的训练数据上如果不进行特殊处理,直接使用神经网络进行训练,由于神经网络强大的学习能力,会对正常数据流过拟合,由于很少对异常数据流进行学习,所以训练出来的分类器很难检测出异常数据流,产生严重的数据偏差(Bias),造成极低的异常检测率。
在本发明实施例中,所述第一网络模型还包括向量生成器,当所述源域中的异常网络流不足时,将随机噪声输入所述向量生成器,以得到异常网络流。
在本发明实施例中,参见图10,在训练过程中首先会加载源域中的各正常网络流和各异常网络流,当源域中的异常网络流不足时,则使用向量生成器生成的异常网络流进行补齐,这样,不论源域中的正常网络流和异常网络流的占比如何,实际输入到第二子网络中的正常网络流和异常网络流都能维持固定比例。需要注意的是,当源域中还有异常网络流时,不会通过向量生成器生成异常网络流。也就是说,整个训练过程本质上是分为两个阶段,第一阶段从源域中提取正常网络流和异常网络流,第二阶段从源域中提取正常网络流,并利用向量生成器生成异常网络流。图10中第二子网络输出的检测分数包括:正常流特征向量对应的第一检测分数,以及异常流特征向量对应的第二检测分数。
在本发明实施例中,所述向量生成器的网络结构如图11所示,所述向量生成器包括4个反卷积层,其中,在每个反卷积层后使用BatchNorm2d进行归一化,激活函数使用ReLU,向量生成器最终输出的是一个与NPZ文件中读取的向量同构的向量,即与步骤M2得到的正常网络流,以及步骤M3得到的异常网络流的结构均相同。
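A hedged sketch of the vector generator G_θ: four transposed convolutions, each followed by BatchNorm2d and ReLU as stated, mapping noise to a tensor with the same shape as a preprocessed flow (pkt_num×22×22). The noise dimension, channel counts and kernel/stride choices are assumptions; only the layer pattern follows the text.

```python
import torch.nn as nn

class FlowGenerator(nn.Module):
    """G_theta: 4 transposed convolutions, each followed by BatchNorm2d + ReLU,
    turning noise into a pkt_num x 22 x 22 tensor shaped like a preprocessed flow."""

    def __init__(self, z_dim=100, pkt_num=6):
        super().__init__()
        def block(c_in, c_out, k, s, p):
            return [nn.ConvTranspose2d(c_in, c_out, k, s, p), nn.BatchNorm2d(c_out), nn.ReLU()]
        self.net = nn.Sequential(
            *block(z_dim, 128, 3, 1, 0),   # 1x1   -> 3x3
            *block(128, 64, 4, 2, 1),      # 3x3   -> 6x6
            *block(64, 32, 3, 2, 1),       # 6x6   -> 11x11
            *block(32, pkt_num, 4, 2, 1),  # 11x11 -> 22x22
        )

    def forward(self, z):                  # z: (batch, z_dim, 1, 1) Gaussian noise
        return self.net(z)
```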
在本发明实施例中,第一网络模型的训练过程可通过以下算法实现。
输入：从源域提取的正常网络流 X_n，从源域提取的异常网络流 X_a，噪声 Z；
输出：第一子网络（源域特征提取器）E_S，第二子网络（分类器）C_s。
开始；
从 1 到 N 迭代：
  从训练数据中分别加载一批正常网络流 X_n、一批异常网络流 X_a；
  通过第一子网络 E_S 提取 X_n 得到 f_n；
  如果 X_a 的数据大小等于一个批次的数据的大小，则通过第一子网络 E_S 提取 X_a 得到 f_a；
  否则，使用 E_S 提取 G_θ(Z) 得到 f_a，其中 G_θ 是向量生成器；
  将 f_n 和 f_a 输入到 C_s 中计算分类损失 L_classifier(f_n, f_a)；
  将分类损失 L_classifier(f_n, f_a) 后传，同时更新 C_s 和 E_S 的参数；
  计算向量生成器的损失（见公式(4)）；
  将向量生成器的损失后传，同时更新 G_θ 的参数；
迭代结束后，输出训练好的分类器 C_s 和源域特征提取器 E_S；
结束。
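A compact, hedged sketch of the stage-1 loop above, usable with the extractor, classifier and generator sketched earlier. Since formulas (1) and (4) are only given as images, a weighted binary cross-entropy (normal source flows labelled 0, abnormal/generated flows labelled 1, γ∈(0,1] keeping real data as the priority) and simple surrounding/dispersion terms are used as stand-ins; abnormal_iter is expected to be an iterator over batches of real abnormal flows, e.g. iter(abnormal_loader).

```python
import torch
import torch.nn as nn

def train_stage1(E_S, C_s, G_theta, normal_loader, abnormal_iter,
                 epochs=1, z_dim=100, gamma=0.5, beta=1.0, device="cpu"):
    """Stage-1 sketch: train E_S and C_s on source flows, falling back to G_theta
    when a full batch of real abnormal flows is not available."""
    opt_cls = torch.optim.Adam(list(E_S.parameters()) + list(C_s.parameters()), lr=1e-3)
    opt_gen = torch.optim.Adam(G_theta.parameters(), lr=1e-3)
    bce = nn.BCELoss()

    for _ in range(epochs):
        for x_n in normal_loader:                      # a batch of normal source flows
            x_n = x_n.float().to(device)
            x_a = next(abnormal_iter, None)            # a batch of real abnormal flows, if any left

            f_n = E_S(x_n)
            if x_a is not None and len(x_a) == len(x_n):
                f_a = E_S(x_a.float().to(device))      # enough real abnormal flows
            else:
                z = torch.randn(len(x_n), z_dim, 1, 1, device=device)
                f_a = E_S(G_theta(z).detach())         # fill in with generated abnormal flows

            s_n, s_a = C_s(f_n), C_s(f_a)
            # stand-in for formula (1): gamma in (0,1] down-weights the abnormal/generated term
            loss_cls = bce(s_n, torch.zeros_like(s_n)) + gamma * bce(s_a, torch.ones_like(s_a))
            opt_cls.zero_grad(); loss_cls.backward(); opt_cls.step()

            # stand-in for formula (4): keep generated flows near the decision boundary, spread out
            z = torch.randn(len(x_n), z_dim, 1, 1, device=device)
            fake = G_theta(z)
            score = C_s(E_S(fake))
            surround = ((score - 0.5) ** 2).mean()                           # stay near the boundary
            dispersion = -((fake - fake.mean(0, keepdim=True)) ** 2).mean()  # push away from centroid
            loss_gen = surround + beta * dispersion
            opt_gen.zero_grad(); loss_gen.backward(); opt_gen.step()
    return E_S, C_s, G_theta
```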
在本发明实施例中，从源域提取的正常网络流和异常网络流为真实数据，通过引入一个超参数γ，可以提高真实数据被分类正确的优先级，γ的取值范围为(0,1]，当γ小于1时，C_s提高正确分类真实数据的优先级。需要注意的是，本申请中将向量生成器生成的异常网络流标记为1，将从源域提取的正常网络流标记为0，这与通常GAN的交叉熵函数中默认将真实样本标记为1、生成样本标记为0刚好相反。因此，C_s对应的分类损失函数如公式(1)所示。其中，L_classifier(f_n, f_a) 为分类损失函数，γ是超参数，X_i 为 X_n 或者 X_a；第二子网络根据从源域提取的网络流 X_i 得到的分数，当输入为正常网络流 X_n 时为第一检测分数，当输入为异常网络流 X_a 时为第二检测分数；公式(1)中还包含第二子网络根据向量生成器 G_θ 生成的异常网络流 G_θ(Z_i) 得到的分数。
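Formula (1) appears only as an image in the source, so its exact form is not reproducible here; a hedged LaTeX reconstruction consistent with the surrounding description (y_i = 0 for normal and 1 for abnormal real flows, γ∈(0,1] balancing real data against generated samples) might read:

```latex
% Hedged reconstruction of formula (1); the original is an image.
\mathcal{L}_{\mathrm{classifier}}(f_n, f_a) =
  -\frac{1}{N}\sum_{i=1}^{N}\Big[(1-y_i)\log\big(1 - C_s(E_S(X_i))\big)
      + y_i\log C_s\big(E_S(X_i)\big)\Big]
  \;-\; \gamma\,\frac{1}{N}\sum_{i=1}^{N}\log C_s\big(E_S(G_\theta(Z_i))\big)
```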
在本发明实施例中,根据分类损失函数计算得到的分类损失函数值修改第一子网络和第二子网络的网络参数,直至满足第一预设条件,以得到已训练的第一网络模型,所述已训练的第一网络模型包括第一子网络对应的源域特征提取器E S,以及第二子网络对应的分类器C s
在本发明实施例中,当所述训练数据中来自源域中异常网络流不足时,使用向量生成器生成的异常网络流。为了使得向量生成器能够生成围绕在真实数据(从源域提取的网络流)周围的异常网络流,在本发明实施例中,对向量生成器进行训练。
训练向量生成器的目标是,使得C s可以很好的区分正常网络流和异常网络流,向量生成器生成的异常网络流会紧密分布在正常网络流周围,但不是同分布。理想的情况是C s能将同分布的网络流识别为正常网络流,将不同分布的网络流识别为异常网络流。若异常网络流不是紧密分布在正常网络流的周围,分类器C s会容易区分,例如,对于第一异常网络流A和第二异常网络流B,若A分布在正常网络流附近,而B分布在离正常网络流很远的位置,则相 比B而言,C s更容易区分出A是异常网络流。为了使得向量生成器能够生成围绕在真实数据(从源域提取的网络流)周围的异常网络流,通过环绕损失和弥散损失对向量生成器进行训练,使得向量生成器能够生成围绕在真实数据(从源域提取的网络流)周围的异常网络流。
具体的,将向量生成器生成的异常网络流输入第一网络模型,通过第一网络模型得到生成分数,根据所述生成分数对所述向量生成器进行训练,以得到已训练的向量生成器。所述生成分数用于表示第二子网络根据向量生成器生成的异常网络流得到的分数。
具体的，根据生成分数计算环绕损失值，环绕损失值可通过公式(2)计算。其中，G_θ(Z_i) 是向量生成器 G_θ 生成的异常网络流，第二子网络根据 G_θ(Z_i) 得到的分数即生成分数，α为超参数，α∈(0,1]。
通过弥散损失可以使生成的异常网络流尽可能地分散开来：最大化生成的异常网络流的数据点与其质心之间的距离，从而鼓励数据点覆盖整个边界。弥散损失值 DL(G_θ, Z) 通过公式(3)计算。其中，μ 是生成的异常网络流对应的质心，G_θ(Z_i) 是向量生成器 G_θ 生成的异常网络流。
综合考虑环绕损失和弥散损失，向量生成器对应的损失函数可以通过公式(4)描述。公式(4)综合了环绕损失和弥散损失，其中β是超参数，用于调整环绕损失和弥散损失的权重。
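Formulas (2)-(4) are likewise given only as images. The following hedged forms are guesses consistent with the prose: a surrounding loss keeping the classifier score of generated flows near a level set by α∈(0,1], a dispersion loss maximising the distance of generated flows from their centroid μ, and a β-weighted combination:

```latex
% Hedged reconstructions of formulas (2)-(4); the originals are images.
\mathrm{SL}(G_\theta, Z) = \frac{1}{N}\sum_{i=1}^{N}
    \big(C_s(E_S(G_\theta(Z_i))) - \alpha\big)^{2}
\\[4pt]
\mathrm{DL}(G_\theta, Z) = -\frac{1}{N}\sum_{i=1}^{N}
    \big\lVert G_\theta(Z_i) - \mu \big\rVert_2^{2},
\qquad \mu = \frac{1}{N}\sum_{i=1}^{N} G_\theta(Z_i)
\\[4pt]
\mathcal{L}_{G_\theta} = \mathrm{SL}(G_\theta, Z) + \beta\,\mathrm{DL}(G_\theta, Z)
```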
在本发明实施例中,在训练过程中,通过分类损失函数值修改第一子网络和第二子网络的网络参数,同时通过向量生成器对应的损失函数值修改向量生成器的网络参数,直至满足第一预设条件,以得到源域特征提取器、分类器和已训练的向量生成器。
在本发明实施例中,所述第一预设条件包括分类损失函数值满足预设要求,或者训练次数达到预设次数。所述预设要求可以是根据分类器的精度来确定,这里不做详细说明,所述预设次数可以为第二子网络的最大训练次数,例如,4000次等。由此,计算分类损失函数值后,判断所述分类损失函数值是否满足预设要求,若分类损失函数值满足预设要求,则结束训练;若分类损失函数值不满足预设要求,则判断所述第二子网络的训练次数是否达到预测次数,若未达到预设次数,则通过分类损失函数值修改第一子网络和第二子网络的网络参数,同时通过向量生成器对应的损失函数值修改向量生成器的网络参数;若达到预设次数,则结 束训练。这样通过分类损失函数值和训练次数来判断训练是否结束,可以避免因分类损失函数值无法达到预设要求而造成训练进入死循环。
S2、基于目标域、所述源域、所述源域特征提取器和判别器对第二网络模型进行训练,以得到目标域特征提取器。
在本发明实施例中,所述源域和目标域的本质都是网络流,源域中的网络流是有标签的网络流,目标域中的网络流是没有标签的网络流;现有的异常检测方法,模型训练和模型检测是在同一个数据集上进行的,只能说明在某个数据集上训练的模型,对针对这个数据集的检测是有效的。在新的场景下需要使用对模型进行调整,而对模型进行调整依赖大量有标记的数据,因此,不适用于在数据少、无标签的环境。本发明实施例中,用于训练的目标域中的数据没有标签。
在本发明实施例中,通过源域提取的训练数据训练分类器,将分类器迁移到目标域中,以对目标域进行异常检测;也就是说,域的迁移是通过潜在特征的映射来完成的,而潜在特征的映射又是通过目标域特征提取器来完成的,而目标域特征提取器是通过对抗式训练的过程来优化的。在步骤S2中,通过对抗式训练,使得目标域特征提取器在目标域中提取到的特征,与源域特征提取器在源域中提取的特征相似。
参见图12,对第二网络模型进行训练的过程包括:源域特征提取器E S从源域提取源域特征向量,第二网络模型从目标域提取目标域特征向量,将源域特征向量和目标域特征向量输入判别器D d,通过判别器D d输出预测分数,预测分数包括:源域特征向量对应的第一预测分数,以及目标域特征向量对应的第二预测分数,再根据第一预测分数和第二预测分数修改第二网络模型的模型参数,以得到目标域特征提取器E t
在本发明实施例中,源域特征提取器E S是通过步骤S1训练第一子网络得到的。在开始训练时,所述第二网络模型的初始模型参数与所述源域特征提取器的模型参数相同,所述第二网络模型的结构与所述源域特征提取器的结构相同。所述第二网络模型的初始模型参数是第二网络模型未经训练时的模型参数,也就是说,采用源域特征提取器E S的模型参数对第二网络模型进行参数初始化,在训练过程中,源域特征提取器E S的参数固定,仅仅更新第二网络模型的模型参数。所述第一预测分数用于表示判别器输出的源域特征向量对应的源域特征分数,所述第二预测分数用于表示判别器输出的目标域特征向量对应的目标域特征分数。
具体的,步骤S2包括:
S21、所述源域特征提取器提取所述源域对应的源域特征向量。
在本发明实施例中,所述源域特征提取器E S提取源域特征向量的过程,与第一子网络提 取正常流特征向量的步骤一致(与第一子网络提取异常流特征向量的步骤一致)。具体的,从源域中获取一个网络流,从获取的网络流中提取预设大小的三维张量,源域特征提取器E S根据提取的三维张量输出源域特征向量。
S22、所述第二网络模型提取目标域对应的目标域特征向量。
在本发明实施例中,所述源域特征提取器包括CNN和GRU,同样的,所述第二网络模型也包括CNN和GRU,具体的,CNN包括三个卷积层、两个池化层和一个线形层,GRU包括两个GRU层和一个Flatten层,CNN的网络结构参见图8,GRU的网络结构参见图9,将CNN和GRU级联,即得到第二网络模型。
在本发明实施例中,所述第二网络模型与源域特征提取器E S结构相同,同样的,从目标域中获取一个网络流,从目标域对应的网络流提取预设大小的三维张量,第二网络模型根据来自目标域的三维张量输出目标域特征向量。
S23、将所述源域特征向量和所述目标域特征向量输入所述判别器,以生成所述源域特征向量对应的第一预测分数,以及所述目标域特征向量对应的第二预测分数。
在本发明实施例中,判别器D d的目标是将来自源域和目标域的特征进行区分,也就是说,判别器的D d目标是区分源域特征向量和目标域特征向量。源域特征向量被标记为1,目标域特征向量被标记为0。所述第一预测分数用于表示判别器输出的源域特征向量对应的源域特征分数,所述第二预测分数用于表示判别器输出的目标域特征向量对应的目标域特征分数,判别器D d可以区分输入的特征来自目标域还是源域。
S24、基于所述第一预测分数和第二预测分数对所述第二网络模型进行训练,直至满足第二预设条件,以得到目标域特征提取器。
在本发明实施例中,通过第一预测分数和第二预测分数计算第二网络模型对应的目标域损失函数值,根据目标域损失函数值调整第二网络模型的参数,直至满足第二预设条件,以得到目标域特征提取器E t
在本发明实施例中，目标域损失函数值可以通过公式(5)计算得到。其中，X_s 是源域中提取的三维张量，X_t 是目标域中提取的三维张量，M_t(X_t) 是目标域特征向量，D(M_t(x_t)) 是第二预测分数，即目标域特征向量对应的目标域特征分数，D 是判别器。
在本发明实施例中,在训练第二网络模型的过程中,判别器的模型参数也会被更新。具体的,通过第一预测分数和第二预测分数计算判别器D d对应的判别损失函数值,根据判别损失函数值调整第二网络模型的参数,直至满足第二预设条件,以得到目标域特征提取器E t
在本发明实施例中，判别器 D_d 对应的判别损失函数如公式(6)所示。其中，X_s 是源域中提取的三维张量，M_s(X_s) 是源域特征向量，D(M_s(x_s)) 是第一预测分数，即源域特征向量对应的源域特征分数；X_t 是目标域中提取的三维张量，M_t(X_t) 是目标域特征向量，D(M_t(x_t)) 是第二预测分数，即目标域特征向量对应的目标域特征分数。
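Formulas (5) and (6) are also given only as images. The description (source features labelled 1, target features labelled 0, both discriminator outputs driven towards 0.5) matches the standard adversarial domain-adaptation objectives, so plausible forms are:

```latex
% Hedged reconstructions of the target-extractor loss (5) and discriminator loss (6).
\mathcal{L}_{M_t}(X_t) = -\,\mathbb{E}_{x_t \sim X_t}\big[\log D\big(M_t(x_t)\big)\big]
\\[4pt]
\mathcal{L}_{D}(X_s, X_t) =
  -\,\mathbb{E}_{x_s \sim X_s}\big[\log D\big(M_s(x_s)\big)\big]
  \;-\; \mathbb{E}_{x_t \sim X_t}\big[\log\big(1 - D\big(M_t(x_t)\big)\big)\big]
```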
在本发明实施例中，输入到 E_S 中的数据是源域中提取的三维张量（包括对应正常网络流的第一三维张量，以及对应异常网络流的第二三维张量），输入第二网络模型的是来自目标域的无标签的三维张量，再将第二网络模型输出的目标域特征向量，以及源域特征提取器输出的源域特征向量输入判别器 D_d。通过对抗式训练，让第二网络模型尝试从目标域中提取与 E_S 从源域提取的特征类似的特征来欺骗判别器；经过对抗式优化，使得 D(M_s(x_s)) 和 D(M_t(x_t)) 都趋近于0.5，即当判别器 D_d 无法区分一个被提取的特征是来自源域还是目标域时，就说明训练过程完毕。
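A hedged PyTorch sketch of this stage-2 adversarial adaptation: the target extractor M_t is initialised with E_S's parameters and E_S stays frozen, the discriminator D_d is trained to label source features 1 and target features 0 (formula (6)), and M_t is trained to fool it (formula (5)). The discriminator architecture and learning rates are assumptions.

```python
import copy
import torch
import torch.nn as nn

def train_stage2(E_S, source_loader, target_loader, feat_dim, epochs=1, device="cpu"):
    """Stage-2 sketch: adversarially adapt a target-domain extractor M_t to E_S's feature space."""
    M_t = copy.deepcopy(E_S).to(device)            # same structure, initialised with E_S's parameters
    for p in E_S.parameters():
        p.requires_grad_(False)                     # the source extractor stays fixed

    D_d = nn.Sequential(nn.Linear(feat_dim, 64), nn.ReLU(),
                        nn.Linear(64, 1), nn.Sigmoid()).to(device)   # assumed discriminator
    opt_d = torch.optim.Adam(D_d.parameters(), lr=1e-4)
    opt_t = torch.optim.Adam(M_t.parameters(), lr=1e-4)
    bce = nn.BCELoss()

    for _ in range(epochs):
        for x_s, x_t in zip(source_loader, target_loader):
            x_s, x_t = x_s.float().to(device), x_t.float().to(device)

            # discriminator step (formula (6)): source features labelled 1, target features 0
            d_s, d_t = D_d(E_S(x_s)), D_d(M_t(x_t).detach())
            loss_d = bce(d_s, torch.ones_like(d_s)) + bce(d_t, torch.zeros_like(d_t))
            opt_d.zero_grad(); loss_d.backward(); opt_d.step()

            # target-extractor step (formula (5)): make target features look like source features
            d_t = D_d(M_t(x_t))
            loss_t = bce(d_t, torch.ones_like(d_t))
            opt_t.zero_grad(); loss_t.backward(); opt_t.step()
    return M_t
```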
S3、根据所述目标域特征提取器和所述分类器生成网络流异常检测模型。
在本发明实施例中,参见图13,网络流异常检测模型包括:目标域特征提取器和分类器,分类器是经过步骤S1训练得到的,目标域特征提取器是经过步骤S2训练得到的。
参见图14,具体实施时,一种网络流异常检测模型的生成方法可以分为三个阶段。第一阶段,基于源域训练分类器C S以及源域特征提取器E S,考虑到来自源域的正常网络流和异常网络流的数量不平衡,当来自源域的异常网络流不足时,通过高斯噪声和向量生成器生成异常网络流,从而是输入到源域特征提取器的正常网络流和异常网络流平衡,防止分类器C S出现正常样本数据偏差(Bias)导致极低的异常检测率。第二阶段,使用对抗式域适应的方法,训练目标域对应的目标域特征提取器E t,将目标域上的数据映射到源域相似的特征空间,以实现最小化目标域的特征空间和源域的特征之间的空间距离,使得目标域特征提取器在目标域上提取到的特征,与源域特征提取器在源域上提取的特征相似,从而完成源域到目标域的适应过程。第三阶段,将第一阶段训练好的分类器C S和第二阶段训练好的目标域特征提取器E t级联,最终实现一个可以在目标域上进行异常检测的网络流异常检测模型。
本发明实施例中,基于源域对第一网络模型进行训练,以得到已训练的第一网络模型,其中,所述已训练的第一网络模型包括源域特征提取器和分类器;基于目标域、所述源域、所述源域特征提取器和判别器对第二网络模型进行训练,以得到目标域特征提取器;根据所述目标域特征提取器和所述分类器生成网络流异常检测模型。目标域中的数据没有标签,通过生成对抗的方式训练第二网络模型,得到目标域特征提取器,使得目标域特征提取器可以 将目标域上的数据映射到源域相似的特征空间,以实现最小化目标域的特征空间和源域的特征之间的空间距离,使得目标域特征提取器在目标域上提取到的特征,与源域特征提取器在源域上提取的特征相似,从而完成源域到目标域的适应过程;进而,在新场景下使用通过源域训练得到的分类器时,不需要新场景具有大量有标签的数据进行二次训练。本发明中的网络流异常检测模型中基于源域训练得到的分类器,可以对目标域进行异常检测,且准确性高。
基于上述一种网络流异常检测模型的生成方法,本发明还提供了一种网络流的异常检测方法,所述一种网络流的异常检测方法应用如上述实施例所述的网络流异常检测模型的生成方法得到的网络流异常检测模型,所述网络流异常检测模型包括目标域特征提取器和分类器,如图15所示,所述网络流的异常检测方法,包括:
K1、所述网络流异常检测模型获取目标域中的待检测网络流。
K2、所述目标域特征提取器提取所述待检测网络流对应的待检测特征向量,其中,所述目标域特征提取器是上述一种网络流的异常检测方法中的目标域特征提取器;
K3、所述分类器对所述待检测特征向量进行分类,以得到所述待检测特征向量对应的异常检测结果,其中,所述分类器是上述一种网络流的异常检测方法中的分类器。
在本发明实施例中,首先对目标域进行预处理,以得到待检测网络流,对目标域进行预处理,以得到待检测网络流的过程与步骤M1至步骤M3中基于源域得到正常网络流和异常网络流的过程相同,进而,“对目标域进行预处理,以得到待检测网络流”的具体过程,可以参考步骤M1至步骤M3中的描述。
在本发明实施例中,具体实施时,参见图16,通过目标域特征提取器提取待检测网络流的时空特征,以得到待检测特征向量,将待检测特征向量输入分类器,所述分类器输出是一个[0,1]的浮点类型数,所述浮点类型数通过binary函数即可得到用于表示异常检测结果的标签。通过binary函数,小于或等于0.5的浮点类型数对应的标签为0,大于0.5的浮点类型数对应的标签为1,标签为0表示待检测网络流的异常检测结果为正常,标签为1表示待检测网络流的异常检测结果为异常。
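A small illustrative sketch of the binary() thresholding described here (0.5 cut-off; 0 = normal, 1 = abnormal):

```python
def binary(score: float) -> int:
    """Map the classifier's [0, 1] score to a label: 0 = normal, 1 = abnormal."""
    return 0 if score <= 0.5 else 1

# example: a flow scored 0.83 by the classifier is reported as abnormal
print(binary(0.83))  # -> 1
```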
在本发明实施例中,由于目标域特征提取器在目标域上提取到的特征,与源域特征提取器在源域上提取的特征相似,因此,通过源域训练的分类器可以对目标域进行异常检测,且准确性高。
在一个实施例中,本发明提供了一种计算机设备,该设备可以是终端,内部结构如图17所示。该计算机设备包括通过系统总线连接的处理器、存储器、网络模型接口、显示屏和输入装置。其中,该计算机设备的处理器用于提供计算和控制能力。该计算机设备的存储器包 括非易失性存储介质、内存储器。该非易失性存储介质存储有操作系统和计算机程序。该内存储器为非易失性存储介质中的操作系统和计算机程序的运行提供环境。该计算机设备的网络模型接口用于与外部的终端通过网络模型连接通信。该计算机程序被处理器执行时以实现一种网络流异常检测模型的生成方法,或者一种网络流的异常检测方法。该计算机设备的显示屏可以是液晶显示屏或者电子墨水显示屏,该计算机设备的输入装置可以是显示屏上覆盖的触摸层,也可以是计算机设备外壳上设置的按键、轨迹球或触控板,还可以是外接的键盘、触控板或鼠标等。
本领域技术人员可以理解,图17所示的仅仅是与本申请方案相关的部分结构的框图,并不构成对本申请方案所应用于其上的计算机设备的限定,具体的计算机设备可以包括比图中所示更多或更少的部件,或者组合某些部件,或者具有不同的部件布置。
本发明实施例提供了一种计算机设备,包括存储器和处理器,所述存储器存储有计算机程序,其特征在于,所述处理器执行所述计算机程序时实现以下步骤:
基于源域对第一网络模型进行训练,以得到已训练的第一网络模型,其中,所述已训练的第一网络模型包括源域特征提取器和分类器;
基于目标域、所述源域、所述源域特征提取器和判别器对第二网络模型进行训练,以得到目标域特征提取器;
根据所述目标域特征提取器和所述分类器生成网络流异常检测模型;
或者,所述网络流异常检测模型获取目标域中的待检测网络流;
所述目标域特征提取器提取所述待检测网络流对应的待检测特征向量,其中,所述目标域特征提取器是上述一种网络流的异常检测方法中的目标域特征提取器;
所述分类器对所述待检测特征向量进行分类,以得到所述待检测特征向量对应的异常检测结果,其中,所述分类器是上述一种网络流的异常检测方法中的分类器。
本发明实施例还提供了一种计算机可读存储介质,其上存储有计算机程序,其特征在于,所述计算机程序被处理器执行时实现以下步骤:
基于源域对第一网络模型进行训练,以得到已训练的第一网络模型,其中,所述已训练的第一网络模型包括源域特征提取器和分类器;
基于目标域、所述源域、所述源域特征提取器和判别器对第二网络模型进行训练,以得到目标域特征提取器;
根据所述目标域特征提取器和所述分类器生成网络流异常检测模型;
或者,所述网络流异常检测模型获取目标域中的待检测网络流;
所述目标域特征提取器提取所述待检测网络流对应的待检测特征向量,其中,所述目标域特征提取器是上述一种网络流的异常检测方法中的目标域特征提取器;
所述分类器对所述待检测特征向量进行分类,以得到所述待检测特征向量对应的异常检测结果,其中,所述分类器是上述一种网络流的异常检测方法中的分类器。
以上实施例的各技术特征可以进行任意的组合,为使描述简洁,未对上述实施例中的各个技术特征所有可能的组合都进行描述,然而,只要这些技术特征的组合不存在矛盾,都应当认为是本说明书记载的范围。
以上所述实施例仅表达了本申请的几种实施方式,其描述较为具体和详细,但并不能因此而理解为对发明专利范围的限制。应当指出的是,对于本领域的普通技术人员来说,在不脱离本申请构思的前提下,还可以做出若干变形和改进,这些都属于本申请的保护范围。因此,本申请专利的保护范围应以所附权利要求为准。

Claims (13)

  1. 一种网络流异常检测模型的生成方法,其特征在于,包括:
    基于源域对第一网络模型进行训练,以得到已训练的第一网络模型,其中,所述已训练的第一网络模型包括源域特征提取器和分类器;
    基于目标域、所述源域、所述源域特征提取器和判别器对第二网络模型进行训练,以得到目标域特征提取器;
    根据所述目标域特征提取器和所述分类器生成网络流异常检测模型。
  2. 根据权利要求1所述的网络流异常检测模型的生成方法,其特征在于,所述基于源域对第一网络模型进行训练,以得到已训练的第一网络模型,具体包括:
    将训练数据中的正常网络流和所述训练数据中的异常网络流输入所述第一网络模型,通过所述第一网络模型生成所述正常网络流对应的第一检测分数和所述异常网络流对应的第二检测分数,其中,所述训练数据包括多个训练组,每个训练组包括来自源域的正常网络流和来自源域的异常网络流;
    根据所述第一检测分数和所述第二检测分数对所述第一网络模型进行训练,直至满足第一预设条件,以得到已训练的第一网络模型。
  3. 根据权利要求2所述的网络流异常检测模型的生成方法,其特征在于,所述第一网络根据训练数据中的正常网络流生成第一检测分数之前,还包括:
    基于所述源域确定各异常网络流和各正常网络流;
    所述基于所述源域确定各异常网络流和各正常网络流,具体包括:
    提取所述源域中的各第一网络流和各第二网络流;
    根据所述各第一网络流生成预设大小的各第一三维张量,并将所述各第一三维张量作为所述各正常网络流;
    根据所述各第二网络流生成所述预设大小的各第二三维张量,并将所述各第二三维张量作为所述各异常网络流。
  4. 根据权利要求3所述的网络流异常检测模型的生成方法,其特征在于,所述根据所述各第一网络流生成预设大小的各第一三维张量,并将所述各第一三维张量作为所述各正常网络流,具体包括:
    对于每个第一网络流,提取所述第一网络流对应的各第一网络数据包;
    根据各第一网络数据包得到所述第一网络流对应的预设大小的各第一三维张量,并将所述各第一三维张量作为所述各正常网络流。
  5. 根据权利要求4所述的网络流异常检测模型的生成方法,其特征在于,根据各第一网 络数据包得到所述第一网络流对应的预设大小的各第一三维张量,并将所述各第一三维张量作为所述各正常网络流,具体包括:
    对每个第一网络数据包进行序列化处理,以得到各第一网络数据包各自分别对应的第一字符串;
    根据各第一字符串生成预设大小的各第一三维张量,并将所述各第一三维张量作为所述各正常网络流。
  6. 根据权利要求3所述的网络流异常检测模型的生成方法,其特征在于,所述根据所述各第二网络流生成所述预设大小的各第二三维张量,并将所述各第二三维张量作为所述各异常网络流,具体包括:
    对于每个第二网络流,提取所述第二网络流对应的各第二网络数据包;
    根据各第二网络数据包得到所述第二网络流对应的预设大小的第二三维张量,并将所述各第二三维张量作为所述各异常网络流。
  7. 根据权利要求6所述的网络流异常检测模型的生成方法,其特征在于,所述根据各第二网络数据包得到所述第二网络流对应的预设大小的第二三维张量,并将所述各第二三维张量作为所述各异常网络流,具体包括:
    对每个第二网络数据包进行序列化处理,以得到各第二网络数据包各自分别对应的第二字符串;
    根据各第二字符串生成所述预设大小的各第二三维张量,并将所述各第二三维张量作为所述各异常网络流。
  8. 根据权利要求3所述的网络流异常检测模型的生成方法,其特征在于,所述第一网络模型包括向量生成器;
    当所述源域中的异常网络流不足时,所述基于所述源域确定异常网络流和正常网络流,还包括:
    将随机噪声输入所述向量生成器,以得到异常网络流。
  9. 根据权利要求1所述的网络流异常检测模型的生成方法,其特征在于,所述基于目标域、所述源域、所述源域特征提取器和判别器对第二网络模型进行训练,以得到目标域特征提取器,具体包括:
    所述源域特征提取器提取所述源域对应的源域特征向量;
    所述第二网络模型提取目标域对应的目标域特征向量;
    将所述源域特征向量和所述目标域特征向量输入所述判别器,以生成所述源域特征向量 对应的第一预测分数,以及所述目标域特征向量对应的第二预测分数;
    基于所述第一预测分数和第二预测分数对所述第二网络模型进行训练,直至满足第二预设条件,以得到目标域特征提取器。
  10. 根据权利要求9所述的网络流异常检测模型的生成方法,其特征在于,所述第二网络模型的初始模型参数与所述源域特征提取器的模型参数相同,所述第二网络模型的结构与所述源域特征提取器的结构相同,所述第二网络模型的初始模型参数是所述第二网络模型未经训练时的模型参数。
  11. 一种网络流的异常检测方法,其特征在于,应用于网络流异常检测模型,所述网络流异常检测模型包括目标域特征提取器和分类器,所述网络流的异常检测方法,具体包括:
    所述网络流异常检测模型获取目标域中的待检测网络流;
    所述目标域特征提取器提取所述待检测网络流对应的待检测特征向量,其中,所述目标域特征提取器是权利要求1-10中任意一项所述的目标域特征提取器;
    所述分类器对所述待检测特征向量进行分类,以得到所述待检测特征向量对应的异常检测结果,其中,所述分类器是权利要求1-10中任意一项所述的分类器。
  12. 一种计算机设备,包括存储器和处理器,所述存储器存储有计算机程序,其特征在于,所述处理器执行所述计算机程序时实现权利要求1至10中任意一项所述的网络流异常检测模型的生成方法或者权利要求11所述的网络流的异常检测方法中的步骤。
  13. 一种计算机可读存储介质,其上存储有计算机程序,其特征在于,所述计算机程序被处理器执行时实现权利要求1至10中任意一项所述的网络流异常检测模型的生成方法或者权利要求11所述的网络流的异常检测方法中的步骤。
PCT/CN2021/098695 2020-08-17 2021-06-07 一种网络流异常检测模型的生成方法和计算机设备 WO2022037191A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010823315.1A CN111683108B (zh) 2020-08-17 2020-08-17 一种网络流异常检测模型的生成方法和计算机设备
CN202010823315.1 2020-08-17

Publications (1)

Publication Number Publication Date
WO2022037191A1 true WO2022037191A1 (zh) 2022-02-24

Family

ID=72438791

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/098695 WO2022037191A1 (zh) 2020-08-17 2021-06-07 一种网络流异常检测模型的生成方法和计算机设备

Country Status (2)

Country Link
CN (1) CN111683108B (zh)
WO (1) WO2022037191A1 (zh)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114726749A (zh) * 2022-03-02 2022-07-08 阿里巴巴(中国)有限公司 数据异常检测模型获取方法、装置、设备、介质及产品
CN114928492A (zh) * 2022-05-20 2022-08-19 北京天融信网络安全技术有限公司 高级持续威胁攻击识别方法、装置和设备
CN114944926A (zh) * 2022-03-04 2022-08-26 北京邮电大学 势变谱构造方法、网络流异常行为识别方法及相关装置
CN115865534A (zh) * 2023-02-27 2023-03-28 深圳大学 一种基于恶意加密流量检测方法、系统、装置及介质
CN116015932A (zh) * 2022-12-30 2023-04-25 湖南大学 入侵检测网络模型生成方法以及数据流量入侵检测方法
CN116095089A (zh) * 2023-04-11 2023-05-09 云南远信科技有限公司 遥感卫星数据处理方法及系统
CN116723115A (zh) * 2023-08-08 2023-09-08 中国电信股份有限公司 流量异常处理方法、装置、电子设备及存储介质
CN116962083A (zh) * 2023-09-20 2023-10-27 西南交通大学 网络异常行为的检测方法、装置、设备及可读存储介质
CN117407733A (zh) * 2023-12-12 2024-01-16 南昌科晨电力试验研究有限公司 一种基于对抗生成shapelet的流量异常检测方法及系统
CN117811843A (zh) * 2024-02-29 2024-04-02 暨南大学 基于大数据分析和自主学习的网络入侵检测方法及系统
CN118037316A (zh) * 2024-03-18 2024-05-14 四川旌城电科科技发展有限公司 一种基于负荷自适应的异常用电识别方法

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111683108B (zh) * 2020-08-17 2020-11-17 鹏城实验室 一种网络流异常检测模型的生成方法和计算机设备
CN112383516A (zh) * 2020-10-29 2021-02-19 博雅正链(北京)科技有限公司 图神经网络构建方法、基于图神经网络的异常流量检测方法
CN112398862B (zh) * 2020-11-18 2022-06-10 深圳供电局有限公司 一种基于gru模型的充电桩攻击聚类检测方法
CN112839034B (zh) * 2020-12-29 2022-08-05 湖北大学 一种基于cnn-gru分层神经网络的网络入侵检测方法
CN112966261A (zh) * 2021-03-08 2021-06-15 中电积至(海南)信息技术有限公司 一种轻量级可拓展的网络流量特征提取工具和方法
CN118056429A (zh) * 2021-12-29 2024-05-17 Oppo广东移动通信有限公司 虚拟信道样本的质量评估方法和设备
CN116450399B (zh) * 2023-06-13 2023-08-22 西华大学 微服务系统故障诊断及根因定位方法
CN117955753A (zh) * 2024-03-27 2024-04-30 国网山西省电力公司晋城供电公司 网络流量检测方法、装置、电子设备及可读存储介质

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109376620A (zh) * 2018-09-30 2019-02-22 华北电力大学 一种风电机组齿轮箱故障的迁移诊断方法
CN110149280A (zh) * 2019-05-27 2019-08-20 中国科学技术大学 网络流量分类方法和装置
US20200193269A1 (en) * 2018-12-18 2020-06-18 Samsung Electronics Co., Ltd. Recognizer, object recognition method, learning apparatus, and learning method for domain adaptation
CN111444952A (zh) * 2020-03-24 2020-07-24 腾讯科技(深圳)有限公司 样本识别模型的生成方法、装置、计算机设备和存储介质
CN111683108A (zh) * 2020-08-17 2020-09-18 鹏城实验室 一种网络流异常检测模型的生成方法和计算机设备

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10817668B2 (en) * 2018-11-26 2020-10-27 Sap Se Adaptive semi-supervised learning for cross-domain sentiment classification
CN111290947B (zh) * 2020-01-16 2022-06-14 华南理工大学 一种基于对抗判别的跨软件缺陷预测方法
CN111444951B (zh) * 2020-03-24 2024-02-20 腾讯科技(深圳)有限公司 样本识别模型的生成方法、装置、计算机设备和存储介质

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109376620A (zh) * 2018-09-30 2019-02-22 华北电力大学 一种风电机组齿轮箱故障的迁移诊断方法
US20200193269A1 (en) * 2018-12-18 2020-06-18 Samsung Electronics Co., Ltd. Recognizer, object recognition method, learning apparatus, and learning method for domain adaptation
CN110149280A (zh) * 2019-05-27 2019-08-20 中国科学技术大学 网络流量分类方法和装置
CN111444952A (zh) * 2020-03-24 2020-07-24 腾讯科技(深圳)有限公司 样本识别模型的生成方法、装置、计算机设备和存储介质
CN111683108A (zh) * 2020-08-17 2020-09-18 鹏城实验室 一种网络流异常检测模型的生成方法和计算机设备

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
WANG QI; GAO JUNYU; LI XUELONG: "Weakly Supervised Adversarial Domain Adaptation for Semantic Segmentation in Urban Scenes", IEEE TRANSACTIONS ON IMAGE PROCESSING, IEEE, USA, vol. 28, no. 9, 1 September 2019 (2019-09-01), USA, pages 4376 - 4386, XP011733034, ISSN: 1057-7149, DOI: 10.1109/TIP.2019.2910667 *

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114726749B (zh) * 2022-03-02 2023-10-31 阿里巴巴(中国)有限公司 数据异常检测模型获取方法、装置、设备及介质
CN114726749A (zh) * 2022-03-02 2022-07-08 阿里巴巴(中国)有限公司 数据异常检测模型获取方法、装置、设备、介质及产品
CN114944926B (zh) * 2022-03-04 2023-12-22 北京邮电大学 势变谱构造方法、网络流异常行为识别方法、相关装置、电子设备及存储介质
CN114944926A (zh) * 2022-03-04 2022-08-26 北京邮电大学 势变谱构造方法、网络流异常行为识别方法及相关装置
CN114928492B (zh) * 2022-05-20 2023-11-24 北京天融信网络安全技术有限公司 高级持续威胁攻击识别方法、装置和设备
CN114928492A (zh) * 2022-05-20 2022-08-19 北京天融信网络安全技术有限公司 高级持续威胁攻击识别方法、装置和设备
CN116015932A (zh) * 2022-12-30 2023-04-25 湖南大学 入侵检测网络模型生成方法以及数据流量入侵检测方法
CN115865534A (zh) * 2023-02-27 2023-03-28 深圳大学 一种基于恶意加密流量检测方法、系统、装置及介质
CN116095089A (zh) * 2023-04-11 2023-05-09 云南远信科技有限公司 遥感卫星数据处理方法及系统
CN116723115A (zh) * 2023-08-08 2023-09-08 中国电信股份有限公司 流量异常处理方法、装置、电子设备及存储介质
CN116723115B (zh) * 2023-08-08 2023-11-07 中国电信股份有限公司 流量异常处理方法、装置、电子设备及存储介质
CN116962083B (zh) * 2023-09-20 2023-12-05 西南交通大学 网络异常行为的检测方法、装置、设备及可读存储介质
CN116962083A (zh) * 2023-09-20 2023-10-27 西南交通大学 网络异常行为的检测方法、装置、设备及可读存储介质
CN117407733A (zh) * 2023-12-12 2024-01-16 南昌科晨电力试验研究有限公司 一种基于对抗生成shapelet的流量异常检测方法及系统
CN117407733B (zh) * 2023-12-12 2024-04-02 南昌科晨电力试验研究有限公司 一种基于对抗生成shapelet的流量异常检测方法及系统
CN117811843A (zh) * 2024-02-29 2024-04-02 暨南大学 基于大数据分析和自主学习的网络入侵检测方法及系统
CN117811843B (zh) * 2024-02-29 2024-05-03 暨南大学 基于大数据分析和自主学习的网络入侵检测方法及系统
CN118037316A (zh) * 2024-03-18 2024-05-14 四川旌城电科科技发展有限公司 一种基于负荷自适应的异常用电识别方法

Also Published As

Publication number Publication date
CN111683108A (zh) 2020-09-18
CN111683108B (zh) 2020-11-17

Similar Documents

Publication Publication Date Title
WO2022037191A1 (zh) 一种网络流异常检测模型的生成方法和计算机设备
ElSayed et al. A novel hybrid model for intrusion detection systems in SDNs based on CNN and a new regularization technique
Yu et al. PBCNN: Packet bytes-based convolutional neural network for network intrusion detection
Görnitz et al. Toward supervised anomaly detection
CN112953924A (zh) 网络异常流量检测方法、系统、存储介质、终端及应用
US20150135318A1 (en) Method of detecting intrusion based on improved support vector machine
CN110808971A (zh) 一种基于深度嵌入的未知恶意流量主动检测系统及方法
CN111786951B (zh) 流量数据特征提取方法、恶意流量识别方法及网络系统
CN113821793B (zh) 基于图卷积神经网络的多阶段攻击场景构建方法及系统
Manganiello et al. Multistep attack detection and alert correlation in intrusion detection systems
US20240185582A1 (en) Annotation-efficient image anomaly detection
CN111355671B (zh) 基于自注意机制的网络流量分类方法、介质及终端设备
Callegari et al. Improving stability of PCA-based network anomaly detection by means of kernel-PCA
CN116915450A (zh) 基于多步网络攻击识别和场景重构的拓扑剪枝优化方法
Jain Network traffic identification with convolutional neural networks
Vu et al. Handling imbalanced data in intrusion detection systems using generative adversarial networks
CN117527391A (zh) 基于注意力机制和一维卷积神经网络的加密流量分类方法
CN116545944A (zh) 一种网络流量分类方法及系统
Someya et al. FCGAT: Interpretable malware classification method using function call graph and attention mechanism
CN114548678A (zh) 分阶段的设备细粒度类型识别方法及系统
Sharma et al. Identification of device type using transformers in heterogeneous internet of things traffic
CN117278336B (zh) 基于时频域变换的物联网设备异常流量检测方法和系统
Liu et al. Autonomous Anti-interference Identification of $\text {IoT} $ Device Traffic based on Convolutional Neural Network
Rueda et al. A hybrid intrusion detection approach based on deep learning techniques
Bao et al. Towards Open-Set APT Malware Classification under Few-Shot Setting

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21857290

Country of ref document: EP

Kind code of ref document: A1

122 Ep: pct application non-entry in european phase

Ref document number: 21857290

Country of ref document: EP

Kind code of ref document: A1