WO2023024506A1 - Traffic detection method, apparatus, electronic device, and storage medium - Google Patents

Traffic detection method, apparatus, electronic device, and storage medium

Info

Publication number
WO2023024506A1
Authority
WO
WIPO (PCT)
Prior art keywords
traffic
suspicious
data
traffic data
layer
Prior art date
Application number
PCT/CN2022/083871
Other languages
English (en)
French (fr)
Inventor
谷勇浩
张晓青
徐昊
黄泽祺
王翼翡
王继刚
田甜
马苏安
王静
付鹏
Original Assignee
中兴通讯股份有限公司
北京邮电大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司, 北京邮电大学
Publication of WO2023024506A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/06 Generation of reports
    • H04L43/062 Generation of reports related to network traffic

Definitions

  • The embodiments of the present application relate to the technical field of network security, and in particular to a traffic detection method, apparatus, electronic device, and storage medium.
  • Network traffic reflects the basic form of the load a network carries. As networks become widespread and network usage grows, network traffic is also increasing exponentially. The volume of network traffic reflects the security of the network to a certain extent, and many network attacks cause abnormal network traffic. For example, a distributed denial of service (DDoS) attack uses a large number of normal access requests to attack a server, occupying a large share of the server's service resources so that legitimate users cannot get a response from the server, and may even paralyze the server. Therefore, detecting network traffic to find abnormal traffic and taking corresponding measures is an important means of protecting network security.
  • DDoS: Distributed Denial of Service
  • However, some traffic detection methods divide the feature set by hierarchical clustering and use multiple autoencoders to process the feature subsets separately. When hierarchical clustering is used to divide the feature subsets, an upper limit and a lower limit on the number of features must be set for each feature subset, and the repeated aggregation or splitting operations bring high time and space complexity. In addition, the choice of distance metric and linkage algorithm has a large impact on the detection results. As a result, traffic detection becomes more difficult and abnormal network traffic is detected with low accuracy.
  • An embodiment of the present application provides a traffic detection method, including: acquiring network traffic data; inputting the network traffic data to the i-th layer autoencoder of a preset n-layer autoencoder for reconstruction processing to obtain reconstructed traffic data, and obtaining a reconstruction error according to the network traffic data and the reconstructed traffic data, where n is an integer greater than 1 and i is an integer greater than 0 and less than n; treating the network traffic data whose reconstruction error is greater than the preset error threshold of the i-th layer autoencoder as suspicious traffic, inputting the suspicious traffic to the (i+1)-th layer autoencoder for reconstruction processing to obtain suspicious reconstructed traffic, and obtaining the suspicious reconstruction error of the suspicious traffic according to the suspicious traffic and the suspicious reconstructed traffic; and, when the (i+1)-th layer autoencoder is the n-th layer autoencoder, if the suspicious reconstruction error of the suspicious traffic is greater than the preset error threshold of the n-th layer autoencoder, determining that the suspicious traffic is abnormal traffic in the network traffic data.
  • An embodiment of the present application also provides a traffic detection device, including: an acquisition module, configured to acquire network traffic data; and an n-layer autoencoder, configured to input the network traffic data to the i-th layer autoencoder of the n-layer autoencoder for reconstruction processing to obtain reconstructed traffic data, and obtain a reconstruction error according to the network traffic data and the reconstructed traffic data; treat the network traffic data whose reconstruction error is greater than the preset error threshold of the i-th layer autoencoder as suspicious traffic, input the suspicious traffic to the (i+1)-th layer autoencoder for reconstruction processing to obtain suspicious reconstructed traffic, and obtain the suspicious reconstruction error of the suspicious traffic according to the suspicious traffic and the suspicious reconstructed traffic; and, when the (i+1)-th layer autoencoder is the n-th layer autoencoder, if the suspicious reconstruction error of the suspicious traffic is greater than the preset error threshold of the n-th layer autoencoder, determine that the suspicious traffic is abnormal traffic in the network traffic data; where n is an integer greater than 1 and i is an integer greater than 0 and less than n.
  • An embodiment of the present application also provides an electronic device, including: at least one processor; and a memory communicatively connected to the at least one processor; where the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can execute the above traffic detection method.
  • An embodiment of the present application also provides a computer-readable storage medium storing a computer program, where the above traffic detection method is implemented when the computer program is executed by a processor.
  • FIG. 1 is a flowchart of the traffic detection method provided by an embodiment of the present application.
  • FIG. 2 is a schematic structural diagram of the n-layer autoencoder provided by an embodiment of the present application.
  • FIG. 3 is a schematic structural diagram of each autoencoder layer provided by an embodiment of the present application.
  • FIG. 4 is a flowchart of step 102 of the traffic detection method provided by an embodiment of the present application.
  • FIG. 5 is a first flowchart of the traffic detection method provided by an embodiment of the present application.
  • FIG. 6 is a second flowchart of the traffic detection method provided by an embodiment of the present application.
  • FIG. 7 is a third flowchart of the traffic detection method provided by an embodiment of the present application.
  • FIG. 8 is a schematic structural diagram of the traffic detection device provided by an embodiment of the present application.
  • FIG. 9 is a schematic structural diagram of the electronic device provided by an embodiment of the present application.
  • The main purpose of the embodiments of the present application is to provide a traffic detection method, apparatus, electronic device, and storage medium.
  • The aim is to reduce the difficulty of detecting network traffic data and to improve the accuracy of detecting abnormal traffic in network traffic data.
  • An embodiment of the present application relates to a traffic detection method which, as shown in FIG. 1, includes:
  • Step 101: acquire network traffic data.
  • Specifically, the acquired network traffic data may be original traffic data obtained from a network environment or a local cache file, or traffic data obtained by performing feature extraction on the original traffic data.
  • Step 102: input the network traffic data to the i-th layer autoencoder of the n-layer autoencoder for reconstruction processing to obtain reconstructed traffic data, and obtain the reconstruction error according to the network traffic data and the reconstructed traffic data.
  • Specifically, n is an integer greater than 1, and i is an integer greater than 0 and less than n. The structure of the n-layer autoencoder is shown in FIG. 2.
  • Step 102 can be implemented by the sub-steps shown in FIG. 4, specifically:
  • Sub-step 1021: the i-th layer autoencoder encodes the network traffic data to generate encoded data of the network traffic data.
  • Specifically, the network traffic data is input into the i-th layer autoencoder, which first uses its encoder to encode the network traffic data, reducing the feature dimension of the network traffic data and generating the encoded data.
  • Sub-step 1022: the i-th layer autoencoder decodes the encoded data to generate reconstructed traffic data.
  • Specifically, after the network traffic data has been encoded, the i-th layer autoencoder uses its decoder to decode the encoded data, restoring the feature dimension and generating the reconstructed traffic data.
  • Sub-step 1023: take the difference between the network traffic data and the reconstructed traffic data as the reconstruction error.
  • Specifically, the reconstructed traffic data is obtained by encoding the network traffic data to reduce its dimension and then decoding it to raise the dimension again. The reconstructed traffic data and the network traffic data have the same feature dimension, and the difference between them is the reconstruction error of the i-th layer autoencoder for that network traffic data.
  • Step 103: treat the network traffic data whose reconstruction error is greater than the error threshold of the i-th layer autoencoder as suspicious traffic, input the suspicious traffic to the (i+1)-th layer autoencoder for reconstruction processing to obtain suspicious reconstructed traffic, and obtain the suspicious reconstruction error according to the suspicious traffic and the suspicious reconstructed traffic.
  • Specifically, each layer of the n-layer autoencoder has its own corresponding error threshold, and each layer judges the reconstruction error of the network traffic data against that threshold. When the reconstruction error of the network traffic data is less than or equal to the error threshold of the i-th layer autoencoder, the network traffic data is judged to be normal traffic, and the corresponding result is output directly without further processing. When the reconstruction error is greater than the error threshold of the i-th layer autoencoder, the network traffic data is judged to be suspicious traffic and is input to the (i+1)-th layer autoencoder for reconstruction processing to generate the corresponding suspicious reconstructed traffic. The suspicious traffic and the suspicious reconstructed traffic have the same feature dimension, and the suspicious reconstruction error is obtained from the difference between the two. The (i+1)-th layer then uses the suspicious reconstruction error and its own error threshold to judge whether the suspicious traffic is normal traffic. If not, the traffic is passed on to the next autoencoder layer for processing, until it is input to the n-th layer autoencoder.
  • Step 104: when the (i+1)-th layer autoencoder is the n-th layer autoencoder, if the suspicious reconstruction error of the suspicious traffic is greater than the error threshold of the n-th layer autoencoder, the suspicious traffic is abnormal traffic in the network traffic data.
  • Specifically, when the (i+1)-th layer autoencoder is the n-th layer autoencoder, the suspicious traffic is being judged for the last time. If the suspicious reconstruction error of the suspicious traffic is greater than the error threshold of the n-th layer autoencoder, the suspicious traffic is abnormal traffic in the network traffic data acquired in step 101; otherwise it is normal traffic.
  • Note that every layer of the n-layer autoencoder processes the traffic data in the same way; the only difference is that the parameters of each layer are different.
  • In this embodiment, the network traffic data is input to the i-th layer autoencoder of the n-layer autoencoder for reconstruction processing to obtain the reconstructed traffic data, and the reconstruction error is obtained according to the network traffic data and the reconstructed traffic data; the network traffic data whose reconstruction error is greater than the preset error threshold of the i-th layer autoencoder is treated as suspicious traffic and input to the (i+1)-th layer autoencoder for reconstruction processing to obtain suspicious reconstructed traffic; the suspicious reconstruction error of the suspicious traffic is obtained according to the suspicious traffic and the suspicious reconstructed traffic; and when the (i+1)-th layer autoencoder is the n-th layer autoencoder, if the suspicious reconstruction error is greater than the preset error threshold of the n-th layer autoencoder, the suspicious traffic is abnormal traffic in the network traffic data. Because the n-layer autoencoder processes the acquired network traffic data directly, there is no need to aggregate or split the traffic data, which reduces the difficulty of traffic detection.
  • An embodiment of the present application relates to a traffic detection method which, as shown in FIG. 5, includes:
  • Step 201: acquire a traffic training set, where the traffic training set includes several pieces of normal traffic data.
  • Specifically, the traffic training set consists of several pieces of normal traffic data that cover multiple types of normal traffic, such as uploading, downloading, and browsing.
  • Step 202: input the traffic training set to the i-th layer autoencoder for reconstruction processing to obtain a reconstructed traffic set, and train the i-th layer autoencoder using a preset loss function, where the reconstructed traffic set includes several pieces of reconstructed traffic, each obtained by reconstructing one piece of normal traffic data.
  • Specifically, the i-th layer autoencoder reconstructs each piece of normal traffic data in the traffic training set as described in sub-steps 1021 and 1022, and the reconstructed traffic corresponding to each piece of normal traffic data forms the reconstructed traffic set. In each training iteration, the mean squared error between the normal traffic data in the traffic training set and the reconstructed traffic data in the reconstructed traffic set is calculated and taken as the loss function of the i-th layer autoencoder. When the loss function does not meet the convergence condition of the autoencoder, a preset optimizer calculates the gradient of the loss function, and the parameters of the i-th layer autoencoder are adjusted according to the gradient, which completes one training iteration. These training steps are repeated until the i-th layer autoencoder converges, at which point its training process ends.
  • Step 203: when the i-th layer autoencoder converges, obtain the reconstruction error of each piece of normal traffic data in the traffic training set according to the reconstructed traffic set, determine the error threshold of the i-th layer autoencoder according to the reconstruction errors and the data division ratio, and filter out from the traffic training set the normal traffic data whose reconstruction error is greater than the error threshold of the i-th layer autoencoder as the suspicious traffic data set.
  • Specifically, the normal traffic data can be divided into normal traffic (normal traffic data whose reconstruction error is less than or equal to the error threshold) and suspicious traffic (normal traffic data whose reconstruction error is greater than the error threshold). The normal traffic data whose reconstruction error is greater than the error threshold forms the suspicious traffic data set, which is input to the (i+1)-th layer autoencoder for training.
  • Step 204: input the suspicious traffic data set to the (i+1)-th layer autoencoder for reconstruction processing to obtain a suspicious reconstructed traffic set, train the (i+1)-th layer autoencoder using the loss function until convergence, obtain the suspicious reconstruction error of each piece of normal traffic data in the suspicious traffic data set according to the suspicious reconstructed traffic set, and determine the error threshold of the (i+1)-th layer autoencoder according to those suspicious reconstruction errors and the data division ratio.
  • Specifically, the training process of the (i+1)-th layer autoencoder is the same as the training process of the i-th layer autoencoder in step 203, and after the (i+1)-th layer autoencoder converges, its error threshold is obtained by following the method for obtaining the error threshold of the i-th layer autoencoder given in step 204.
  • Step 205: when the (i+1)-th layer autoencoder is the n-th layer autoencoder, generate the n-layer autoencoder.
  • Specifically, when the (i+1)-th layer autoencoder is the n-th layer autoencoder, and the n-th layer autoencoder has converged and its error threshold has been obtained, the training process of the n-layer autoencoder is complete, and the trained n-layer autoencoder can be applied to traffic detection.
  • Step 206: acquire network traffic data.
  • This step is substantially the same as step 101 in the embodiment of the present application, and details are not repeated here.
  • Step 207: input the network traffic data to the i-th layer autoencoder of the n-layer autoencoder for reconstruction processing to obtain reconstructed traffic data, and obtain the reconstruction error according to the network traffic data and the reconstructed traffic data.
  • This step is substantially the same as step 102 in the embodiment of the present application, and details are not repeated here.
  • Step 208: treat the network traffic data whose reconstruction error is greater than the error threshold of the i-th layer autoencoder as suspicious traffic, input the suspicious traffic to the (i+1)-th layer autoencoder for reconstruction processing to obtain suspicious reconstructed traffic, and obtain the suspicious reconstruction error according to the suspicious traffic and the suspicious reconstructed traffic.
  • This step is substantially the same as step 103 in the embodiment of the present application, and details are not repeated here.
  • Step 209: when the (i+1)-th layer autoencoder is the n-th layer autoencoder, if the suspicious reconstruction error of the suspicious traffic is greater than the error threshold of the n-th layer autoencoder, the suspicious traffic is abnormal traffic in the network traffic data.
  • This step is substantially the same as step 104 in the embodiment of the present application, and details are not repeated here.
  • Abnormal traffic detection that formulates rules according to the specific pattern of each attack behavior, and raises an alert for network traffic that does not conform to the rules, can only detect known attacks and cannot detect unknown network attacks, which results in a high false negative rate for abnormal network traffic. In contrast, this application uses autoencoders to learn the distribution characteristics of normal traffic in different scenarios, which improves the ability of the autoencoders to reconstruct normal traffic data and reduces the reconstruction error of normal traffic data.
  • An embodiment of the present application relates to a traffic detection method which, as shown in FIG. 6, includes:
  • Step 301: acquire network traffic data.
  • This step is substantially the same as step 101 in the embodiment of the present application, and details are not repeated here.
  • Step 302: determine the traffic direction of the network traffic data according to the source IP, the destination IP, the source port, and the destination port.
  • Specifically, the acquired network traffic data contains five-tuple information: source IP, destination IP, source port, destination port, and protocol information. The source IP, destination IP, source port, and destination port can be used to determine the traffic direction of the network traffic data.
  • Step 303: perform feature extraction on the network traffic data according to the traffic direction and the protocol information, and acquire statistical feature data of the network traffic data.
  • Specifically, feature extraction can be performed on the network traffic data to extract 76 items of statistical feature information, such as flow duration, packet length, packet count, sending interval between adjacent packets, and sending rate.
  • Step 304: input the network traffic data to the i-th layer autoencoder of the n-layer autoencoder for reconstruction processing to obtain reconstructed traffic data, and obtain the reconstruction error according to the network traffic data and the reconstructed traffic data.
  • This step is substantially the same as step 102 in the embodiment of the present application, and details are not repeated here.
  • Step 305: treat the network traffic data whose reconstruction error is greater than the error threshold of the i-th layer autoencoder as suspicious traffic, input the suspicious traffic to the (i+1)-th layer autoencoder for reconstruction processing to obtain suspicious reconstructed traffic, and obtain the suspicious reconstruction error according to the suspicious traffic and the suspicious reconstructed traffic.
  • This step is substantially the same as step 103 in the embodiment of the present application, and details are not repeated here.
  • Step 306: when the (i+1)-th layer autoencoder is the n-th layer autoencoder, if the suspicious reconstruction error of the suspicious traffic is greater than the error threshold of the n-th layer autoencoder, the suspicious traffic is abnormal traffic in the network traffic data.
  • This step is substantially the same as step 104 in the embodiment of the present application, and details are not repeated here.
  • Embodiments of the present application can also perform feature extraction on the network traffic data, so that only the feature data of the network traffic data is input to the n-layer autoencoder. This speeds up the processing of network traffic data by the n-layer autoencoder and thereby improves the efficiency of traffic detection.
  • An embodiment of the present application relates to a traffic detection method which, as shown in FIG. 7, includes:
  • Step 401: acquire network traffic data.
  • This step is substantially the same as step 101 in the embodiment of the present application, and details are not repeated here.
  • Step 402: determine the traffic direction of the network traffic data according to the source IP, the destination IP, the source port, and the destination port.
  • This step is substantially the same as step 302 in the embodiment of the present application, and details are not repeated here.
  • Step 403: perform feature extraction on the network traffic data according to the traffic direction and the protocol information, and acquire statistical feature data of the network traffic data.
  • This step is substantially the same as step 303 in the embodiment of the present application, and details are not repeated here.
  • Step 404: perform data cleaning and data normalization on the statistical feature data.
  • Step 405: input the network traffic data to the i-th layer autoencoder of the n-layer autoencoder for reconstruction processing to obtain reconstructed traffic data, and obtain the reconstruction error according to the network traffic data and the reconstructed traffic data.
  • This step is substantially the same as step 102 in the embodiment of the present application, and details are not repeated here.
  • Step 406: treat the network traffic data whose reconstruction error is greater than the error threshold of the i-th layer autoencoder as suspicious traffic, input the suspicious traffic to the (i+1)-th layer autoencoder for reconstruction processing to obtain suspicious reconstructed traffic, and obtain the suspicious reconstruction error according to the suspicious traffic and the suspicious reconstructed traffic.
  • This step is substantially the same as step 103 in the embodiment of the present application, and details are not repeated here.
  • Step 407: when the (i+1)-th layer autoencoder is the n-th layer autoencoder, if the suspicious reconstruction error of the suspicious traffic is greater than the error threshold of the n-th layer autoencoder, the suspicious traffic is abnormal traffic in the network traffic data.
  • This step is substantially the same as step 104 in the embodiment of the present application, and details are not repeated here.
  • Data cleaning and normalization can be performed on the statistical feature data of the network traffic data to keep traffic containing outliers, and features measured on different scales, from affecting the traffic detection results, thereby improving the accuracy of the results.
  • FIG. 8 is a schematic diagram of the traffic detection device described in this embodiment, which includes: an acquisition module 501 and an n-layer autoencoder 502.
  • The acquisition module 501 is configured to acquire network traffic data.
  • The n-layer autoencoder 502 is configured to: input the network traffic data to the i-th layer autoencoder of the n-layer autoencoder for reconstruction processing to obtain reconstructed traffic data, and obtain the reconstruction error according to the network traffic data and the reconstructed traffic data; treat the network traffic data whose reconstruction error is greater than the preset error threshold of the i-th layer autoencoder as suspicious traffic, input the suspicious traffic to the (i+1)-th layer autoencoder for reconstruction processing to obtain suspicious reconstructed traffic, and obtain the suspicious reconstruction error of the suspicious traffic according to the suspicious traffic and the suspicious reconstructed traffic; and, when the (i+1)-th layer autoencoder is the n-th layer autoencoder, if the suspicious reconstruction error of the suspicious traffic is greater than the preset error threshold of the n-th layer autoencoder, determine that the suspicious traffic is abnormal traffic in the network traffic data; where n is an integer greater than 1 and i is an integer greater than 0 and less than n.
  • This embodiment is a system embodiment corresponding to the above method embodiments and can be implemented in cooperation with them.
  • The relevant technical details and technical effects mentioned in the above embodiments remain valid in this embodiment and are not repeated here.
  • Correspondingly, the relevant technical details mentioned in this embodiment can also be applied in the above embodiments.
  • The modules involved in this embodiment are logical modules.
  • In practical applications, a logical unit can be a physical unit, a part of a physical unit, or a combination of multiple physical units.
  • Units that are not closely related to solving the technical problem proposed in the present application are not introduced in this embodiment, but this does not mean that there are no other units in this embodiment.
  • Another embodiment of the present application relates to an electronic device which, as shown in FIG. 9, includes: at least one processor 601; and a memory 602 communicatively connected to the at least one processor 601; where the memory 602 stores instructions executable by the at least one processor 601, and the instructions are executed by the at least one processor 601 so that the at least one processor 601 can execute the traffic detection method in the foregoing embodiments.
  • The memory and the processor are connected by a bus.
  • The bus may include any number of interconnected buses and bridges, and it connects one or more processors and the various circuits of the memory together.
  • The bus may also connect various other circuits, such as peripherals, voltage regulators, and power management circuits, all of which are well known in the art and therefore are not described further here.
  • The bus interface provides an interface between the bus and the transceiver.
  • A transceiver may be a single element or multiple elements, such as multiple receivers and transmitters, providing a means for communicating with various other devices over a transmission medium.
  • The data processed by the processor is transmitted over the wireless medium through the antenna; the antenna also receives data and passes it to the processor.
  • The processor is responsible for managing the bus and for general processing, and can also provide various functions, including timing, peripheral interfaces, voltage regulation, power management, and other control functions. The memory can be used to store data that the processor uses when performing operations.
  • Another embodiment of the present application relates to a computer-readable storage medium storing a computer program.
  • The above method embodiments are implemented when the computer program is executed by a processor.
  • The program is stored in a storage medium and includes several instructions that cause a device (which may be a single-chip microcomputer, a chip, etc.) or a processor to execute all or part of the steps of the methods described in the various embodiments of the present application.
  • The aforementioned storage medium includes media that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.
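Steps 203 and 204 above derive each layer's error threshold from the reconstruction errors of the training data and a data division ratio, then pass the samples above the threshold on to the next layer. The excerpt does not say how the ratio is applied, so the following Python sketch is only one plausible reading: it assumes the ratio is the fraction of training samples a layer should accept as normal. The function names, sample labels, and error values are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def threshold_from_ratio(errors: Sequence[float], normal_ratio: float) -> float:
    """Smallest training error such that at least `normal_ratio` of the
    samples have an error less than or equal to it (assumed interpretation
    of the "data division ratio")."""
    if not errors:
        raise ValueError("errors must not be empty")
    if not 0.0 < normal_ratio <= 1.0:
        raise ValueError("normal_ratio must be in (0, 1]")
    ordered = sorted(errors)
    keep = math.ceil(normal_ratio * len(ordered))  # samples accepted as normal
    return ordered[keep - 1]


def split_by_threshold(
    samples: Sequence[str], errors: Sequence[float], threshold: float
) -> Tuple[List[str], List[str]]:
    """Split samples into (normal, suspicious) by comparing each error
    with the layer's threshold, as in step 203."""
    normal: List[str] = []
    suspicious: List[str] = []
    for sample, error in zip(samples, errors):
        (normal if error <= threshold else suspicious).append(sample)
    return normal, suspicious


# Hypothetical reconstruction errors of four normal training samples.
train_samples = ["flow_a", "flow_b", "flow_c", "flow_d"]
train_errors = [0.1, 0.4, 0.2, 0.9]

layer_threshold = threshold_from_ratio(train_errors, normal_ratio=0.75)
kept, passed_on = split_by_threshold(train_samples, train_errors, layer_threshold)
```

Under this reading, `passed_on` plays the role of the suspicious traffic data set used to train the next layer, and the same two functions would be applied again to that layer's errors.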

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

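The embodiments of the present application relate to the technical field of network security, and in particular to a traffic detection method, apparatus, electronic device, and storage medium. The method includes: acquiring network traffic data; inputting the network traffic data to the i-th layer autoencoder of an n-layer autoencoder for reconstruction processing to obtain reconstructed traffic data; obtaining a reconstruction error according to the network traffic data and the reconstructed traffic data; treating the network traffic data whose reconstruction error is greater than the error threshold of the i-th layer autoencoder as suspicious traffic; inputting the suspicious traffic to the (i+1)-th layer autoencoder for reconstruction processing to obtain suspicious reconstructed traffic; obtaining the suspicious reconstruction error of the suspicious traffic according to the suspicious traffic and the suspicious reconstructed traffic; and, when the (i+1)-th layer autoencoder is the n-th layer autoencoder, if the suspicious reconstruction error of the suspicious traffic is greater than the error threshold of the n-th layer autoencoder, determining that the suspicious traffic is abnormal traffic in the network traffic data.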

Description

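Traffic detection method, apparatus, electronic device, and storage medium
Cross-reference to related applications
This application is based on and claims priority to Chinese patent application No. 202110975446.6, filed on August 24, 2021, the entire contents of which are incorporated herein by reference.
Technical field
The embodiments of the present application relate to the technical field of network security, and in particular to a traffic detection method, apparatus, electronic device, and storage medium.
Background
Network traffic reflects the basic form of the load a network carries. As networks become widespread and network usage grows, network traffic is also increasing exponentially. The volume of network traffic reflects the security of the network to a certain extent, and many network attacks cause abnormal network traffic. For example, a distributed denial of service (DDoS) attack uses a large number of normal access requests to attack a server, occupying a large share of the server's service resources so that legitimate users cannot get a response from the server, and may even paralyze the server. Therefore, detecting network traffic to find abnormal traffic and taking corresponding measures is an important means of protecting network security.
However, some traffic detection methods divide the feature set by hierarchical clustering and use multiple autoencoders to process the feature subsets separately. When hierarchical clustering is used to divide the feature subsets, an upper limit and a lower limit on the number of features must be set for each feature subset, and the repeated aggregation or splitting operations bring high time and space complexity. In addition, the choice of distance metric and linkage algorithm has a large impact on the detection results. As a result, traffic detection becomes more difficult and abnormal network traffic is detected with low accuracy.
Summary
An embodiment of the present application provides a traffic detection method, including: acquiring network traffic data; inputting the network traffic data to the i-th layer autoencoder of a preset n-layer autoencoder for reconstruction processing to obtain reconstructed traffic data, and obtaining a reconstruction error according to the network traffic data and the reconstructed traffic data, where n is an integer greater than 1 and i is an integer greater than 0 and less than n; treating the network traffic data whose reconstruction error is greater than the preset error threshold of the i-th layer autoencoder as suspicious traffic, inputting the suspicious traffic to the (i+1)-th layer autoencoder for reconstruction processing to obtain suspicious reconstructed traffic, and obtaining the suspicious reconstruction error of the suspicious traffic according to the suspicious traffic and the suspicious reconstructed traffic; and, when the (i+1)-th layer autoencoder is the n-th layer autoencoder, if the suspicious reconstruction error of the suspicious traffic is greater than the preset error threshold of the n-th layer autoencoder, determining that the suspicious traffic is abnormal traffic in the network traffic data.
An embodiment of the present application also provides a traffic detection device, including: an acquisition module, configured to acquire network traffic data; and an n-layer autoencoder, configured to input the network traffic data to the i-th layer autoencoder of the n-layer autoencoder for reconstruction processing to obtain reconstructed traffic data, and obtain a reconstruction error according to the network traffic data and the reconstructed traffic data; treat the network traffic data whose reconstruction error is greater than the preset error threshold of the i-th layer autoencoder as suspicious traffic, input the suspicious traffic to the (i+1)-th layer autoencoder for reconstruction processing to obtain suspicious reconstructed traffic, and obtain the suspicious reconstruction error of the suspicious traffic according to the suspicious traffic and the suspicious reconstructed traffic; and, when the (i+1)-th layer autoencoder is the n-th layer autoencoder, if the suspicious reconstruction error of the suspicious traffic is greater than the preset error threshold of the n-th layer autoencoder, determine that the suspicious traffic is abnormal traffic in the network traffic data; where n is an integer greater than 1 and i is an integer greater than 0 and less than n.
An embodiment of the present application also provides an electronic device, including: at least one processor; and a memory communicatively connected to the at least one processor; where the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can execute the above traffic detection method.
An embodiment of the present application also provides a computer-readable storage medium storing a computer program, where the above traffic detection method is implemented when the computer program is executed by a processor.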
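Brief description of the drawings
FIG. 1 is a flowchart of the traffic detection method provided by an embodiment of the present application;
FIG. 2 is a schematic structural diagram of the n-layer autoencoder provided by an embodiment of the present application;
FIG. 3 is a schematic structural diagram of each autoencoder layer provided by an embodiment of the present application;
FIG. 4 is a flowchart of step 102 of the traffic detection method provided by an embodiment of the present application;
FIG. 5 is a first flowchart of the traffic detection method provided by an embodiment of the present application;
FIG. 6 is a second flowchart of the traffic detection method provided by an embodiment of the present application;
FIG. 7 is a third flowchart of the traffic detection method provided by an embodiment of the present application;
FIG. 8 is a schematic structural diagram of the traffic detection apparatus provided by an embodiment of the present application;
FIG. 9 is a schematic structural diagram of the electronic device provided by an embodiment of the present application.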
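Detailed description
The main purpose of the embodiments of the present application is to provide a traffic detection method, apparatus, electronic device, and storage medium that reduce the difficulty of detecting network traffic data and improve the accuracy with which abnormal traffic in network traffic data is detected.
To make the purposes, technical solutions, and advantages of the embodiments of the present application clearer, the embodiments are described in detail below with reference to the drawings. A person of ordinary skill in the art will understand that many technical details are given in the embodiments so that the reader can better understand the present application. The technical solutions claimed in the present application can nevertheless be implemented without these technical details and with various changes and modifications based on the following embodiments. The division into embodiments below is for convenience of description and should not limit the specific implementation of the present application in any way; the embodiments may be combined with and refer to one another where they do not conflict.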
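One embodiment of the present application relates to a traffic detection method which, as shown in FIG. 1, includes:
Step 101: acquire network traffic data.
Specifically, the acquired network traffic data may be raw traffic data obtained from the network environment or from a local cache file, or traffic data obtained by performing feature extraction on the raw traffic data.
Step 102: input the network traffic data into the i-th layer autoencoder of the n-layer autoencoder for reconstruction to obtain reconstructed traffic data, and obtain a reconstruction error from the network traffic data and the reconstructed traffic data.
Specifically, n is an integer greater than 1 and i is an integer greater than 0 and less than n. The structure of the n-layer autoencoder is shown in FIG. 2; it consists of multiple autoencoder layers. Taking input network traffic data with 76-dimensional features as an example, the structure of each autoencoder layer is shown in FIG. 3: each layer consists of an encoder and a decoder, and each autoencoder also contains a regularization term that constrains it in order to prevent overfitting.
Step 102 may be implemented by the sub-steps shown in FIG. 4, which specifically include:
Sub-step 1021: the i-th layer autoencoder encodes the network traffic data to generate encoded data of the network traffic data.
Specifically, the network traffic data is input into the i-th layer autoencoder, which first uses its encoder to encode the network traffic data. This reduces the feature dimension of the network traffic data and produces its encoded data.
Sub-step 1022: the i-th layer autoencoder decodes the encoded data to generate the reconstructed traffic data.
Specifically, after the network traffic data has been encoded, the i-th layer autoencoder uses its decoder to decode the encoded data. This restores the feature dimension of the encoded data and produces the reconstructed traffic data.
Sub-step 1023: take the difference between the network traffic data and the reconstructed traffic data as the reconstruction error.
Specifically, the reconstructed traffic data is the data obtained after the network traffic data has been reduced in dimension by encoding and restored in dimension by decoding. The reconstructed traffic data and the network traffic data have the same feature dimension, and the difference between them is the reconstruction error of the i-th layer autoencoder for that network traffic data.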
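Sub-steps 1021 to 1023 can be illustrated with a minimal sketch in plain Python. The layer sizes (4 input features, 2 code units), the tanh activation, the fixed untrained weights, and the use of the mean squared difference as a scalar reconstruction error are all assumptions made for illustration; the application only states that the error is the difference between the input and its reconstruction.

```python
import math

def affine(x, weights, bias):
    """Return one output per row of `weights`; weights[j] holds the input weights of unit j."""
    return [sum(xi * w for xi, w in zip(x, row)) + b for row, b in zip(weights, bias)]

def encode(x, enc_weights, enc_bias):
    """Sub-step 1021: map the sample to a lower-dimensional code."""
    return [math.tanh(v) for v in affine(x, enc_weights, enc_bias)]

def decode(code, dec_weights, dec_bias):
    """Sub-step 1022: map the code back to the input dimensionality."""
    return affine(code, dec_weights, dec_bias)

def reconstruction_error(x, x_hat):
    """Sub-step 1023: mean squared difference between input and reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)

# Hypothetical, untrained parameters: 4 features -> 2 code units -> 4 features.
ENC_W = [[0.5, -0.5, 0.0, 0.0], [0.0, 0.0, 0.5, -0.5]]
ENC_B = [0.0, 0.0]
DEC_W = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
DEC_B = [0.0, 0.0, 0.0, 0.0]

sample = [0.2, 0.1, 0.9, 0.4]
code = encode(sample, ENC_W, ENC_B)          # 2 values: reduced dimension
rebuilt = decode(code, DEC_W, DEC_B)         # 4 values: same dimension as the input
error = reconstruction_error(sample, rebuilt)
```

In a real layer the weights would come from training on normal traffic, so that normal samples yield small errors and unfamiliar samples yield larger ones.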
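Step 103: designate the network traffic data whose reconstruction error is greater than the error threshold of the i-th layer autoencoder as suspicious traffic, input the suspicious traffic into the (i+1)-th layer autoencoder for reconstruction to obtain suspicious reconstructed traffic, and obtain a suspicious reconstruction error from the suspicious traffic and the suspicious reconstructed traffic.
Specifically, each autoencoder layer of the n-layer autoencoder has its own error threshold, and each layer uses that threshold to judge the reconstruction error of the network traffic data. When the reconstruction error of the network traffic data is less than or equal to the error threshold of the i-th layer autoencoder, the network traffic data is judged to be normal traffic; no further processing is needed and the result is output directly. When the reconstruction error is greater than the error threshold of the i-th layer autoencoder, the network traffic data is judged to be suspicious traffic, and the suspicious traffic is input into the (i+1)-th layer autoencoder for reconstruction to generate the corresponding suspicious reconstructed traffic. The suspicious traffic and the suspicious reconstructed traffic have the same feature dimension, and the suspicious reconstruction error of the suspicious traffic is obtained from their difference. The (i+1)-th layer then uses this suspicious reconstruction error and its own error threshold to judge whether the suspicious traffic is normal traffic. If it is not, the traffic is passed on to the next autoencoder layer, and so on until it reaches the n-th layer autoencoder.
Step 104: when the (i+1)-th layer autoencoder is the n-th layer autoencoder, the suspicious traffic is abnormal traffic in the network traffic data if its suspicious reconstruction error is greater than the error threshold of the n-th layer autoencoder.
Specifically, when the (i+1)-th layer autoencoder is the n-th layer autoencoder, the suspicious traffic is being judged for the last time. If the suspicious reconstruction error of the suspicious traffic is greater than the error threshold of the n-th layer autoencoder, the suspicious traffic is abnormal traffic in the network traffic data acquired in step 101; otherwise it is normal traffic.
It should be noted that every autoencoder layer of the n-layer autoencoder processes traffic data in the same way; the only difference between the layers is their parameters.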
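The layer-by-layer decision of steps 102 to 104 can be sketched as a simple filter cascade. In this sketch each trained layer is represented by a hypothetical callable in `error_fns` that returns the reconstruction error of one sample; the function name `detect_anomalies` and this representation are assumptions for illustration, not part of the application.

```python
def detect_anomalies(samples, error_fns, thresholds):
    """Return the indices of samples whose error exceeds the threshold at every layer."""
    if len(error_fns) != len(thresholds) or len(error_fns) < 2:
        raise ValueError("need n > 1 layers with one threshold each")
    suspicious = list(range(len(samples)))
    for error_fn, threshold in zip(error_fns, thresholds):
        # Samples at or below the threshold are judged normal and leave the cascade;
        # only the remaining suspicious samples are passed to the next layer.
        suspicious = [i for i in suspicious if error_fn(samples[i]) > threshold]
        if not suspicious:
            break
    # Whatever is still suspicious after the n-th layer is reported as abnormal.
    return suspicious
```

Because every layer only sees what the previous layer could not explain, later layers process progressively fewer samples.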
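In this embodiment, when the acquired network traffic data is checked, the network traffic data is input into the i-th layer autoencoder of the n-layer autoencoder for reconstruction to obtain reconstructed traffic data, and a reconstruction error is obtained from the network traffic data and the reconstructed traffic data; network traffic data whose reconstruction error is greater than the preset error threshold of the i-th layer autoencoder is designated as suspicious traffic and input into the (i+1)-th layer autoencoder for reconstruction to obtain suspicious reconstructed traffic; the suspicious reconstruction error of the suspicious traffic is obtained from the suspicious traffic and the suspicious reconstructed traffic; and when the (i+1)-th layer autoencoder is the n-th layer autoencoder, the suspicious traffic is abnormal traffic in the network traffic data if its suspicious reconstruction error is greater than the preset error threshold of the n-th layer autoencoder. Because the acquired network traffic data is processed by the n-layer autoencoder, the data does not need to be aggregated or split, which reduces the difficulty of traffic detection. In addition, the suspicious traffic in the network traffic data is processed several times before the final abnormal traffic is obtained, which improves the accuracy of the detected abnormal traffic.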
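One embodiment of the present application relates to a traffic detection method which, as shown in FIG. 5, includes:
Step 201: acquire a traffic training set, where the traffic training set contains a number of items of normal traffic data.
Specifically, the acquired traffic training set consists of a number of items of normal traffic data covering several types of normal traffic, such as uploading, downloading, and browsing.
Step 202: input the traffic training set into the i-th layer autoencoder for reconstruction to obtain a reconstructed traffic set, and train the i-th layer autoencoder using a preset loss function; where the reconstructed traffic set includes a number of items of reconstructed traffic, and each item of normal traffic data yields one item of reconstructed traffic after reconstruction.
Specifically, after the traffic training set is input into the i-th layer autoencoder, that autoencoder performs the reconstruction of sub-steps 1021 and 1022 on each item of normal traffic data in the training set, obtaining the reconstructed traffic corresponding to each item; together these form the reconstructed traffic set. In each training iteration, the mean squared error between the normal traffic data in the training set and the reconstructed traffic data in the reconstructed traffic set is computed and used as the loss function of the i-th layer autoencoder. When the loss function does not satisfy the convergence condition of the autoencoder, a preset optimizer computes the gradient of the loss function and the autoencoder parameters of the i-th layer autoencoder are adjusted according to the gradient, which completes one training iteration. The iteration is repeated until the i-th layer autoencoder converges, after which training of the (i+1)-th layer autoencoder can begin.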
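The loss used in step 202 can be written out explicitly. The symbols below are introduced here for illustration and do not appear in the application: x_k is the k-th of N normal samples in the training set, \hat{x}_k is its reconstruction by layer i, d is the feature dimension, \theta_i denotes the parameters of layer i, and \eta is a learning rate. The update rule shown is plain gradient descent, chosen only as the simplest instance of a "preset optimizer"; the application does not name a specific optimizer, and the regularization term mentioned for each layer is omitted here.

```latex
\mathcal{L}_i(\theta_i)
  = \frac{1}{N} \sum_{k=1}^{N} \frac{1}{d}
    \left\lVert x_k - \hat{x}_k \right\rVert_2^2 ,
\qquad
\theta_i \leftarrow \theta_i - \eta \, \nabla_{\theta_i} \mathcal{L}_i(\theta_i)
```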
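Step 203: when the i-th layer autoencoder has converged, obtain the reconstruction error of each item of normal traffic data in the traffic training set from the reconstructed traffic set, determine the error threshold of the i-th layer autoencoder from these reconstruction errors and a preset data split ratio, and select from the traffic training set the normal traffic data whose reconstruction error is greater than the error threshold of the i-th layer autoencoder as the suspicious traffic data set.
Specifically, when the i-th layer autoencoder has converged, the reconstruction error of each item of normal traffic data is computed from the normal traffic data in the training set and the corresponding reconstructed traffic data in the reconstructed traffic set, and the error threshold of the i-th layer autoencoder is determined from these reconstruction errors and the preset split ratio. This threshold divides the normal traffic data into normal traffic (items whose reconstruction error is less than or equal to the threshold) and suspicious traffic (items whose reconstruction error is greater than the threshold). The items whose reconstruction error is greater than the threshold form the suspicious traffic data set, which is input into the (i+1)-th layer autoencoder for training.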
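The application does not spell out how the split ratio maps to a threshold. One plausible reading, which is an assumption of this sketch, is that the ratio is the fraction of training samples that should count as normal at this layer, so the threshold is the error found at that position in the sorted list of errors. The function names are hypothetical.

```python
import math

def threshold_from_ratio(errors, normal_ratio):
    """Pick a threshold so that about `normal_ratio` of the errors are <= it."""
    if not errors:
        raise ValueError("errors must not be empty")
    if not 0.0 < normal_ratio <= 1.0:
        raise ValueError("normal_ratio must be in (0, 1]")
    ordered = sorted(errors)
    # Index of the largest error that still counts as normal.
    cut = max(1, math.ceil(normal_ratio * len(ordered))) - 1
    return ordered[cut]

def split_by_threshold(samples, errors, threshold):
    """Separate samples into (normal, suspicious) using error <= threshold as normal."""
    normal = [s for s, e in zip(samples, errors) if e <= threshold]
    suspicious = [s for s, e in zip(samples, errors) if e > threshold]
    return normal, suspicious
```

A smaller ratio sends more of the training set on to the next layer; a ratio of 1 leaves nothing for later layers to learn from.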
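Step 204: input the suspicious traffic data set into the (i+1)-th layer autoencoder for reconstruction to obtain a suspicious reconstructed traffic set, train the (i+1)-th layer autoencoder with the loss function until it converges, obtain the suspicious reconstruction error of each item of normal traffic data in the suspicious traffic data set from the suspicious reconstructed traffic set, and determine the error threshold of the (i+1)-th layer autoencoder from these suspicious reconstruction errors and the data split ratio.
Specifically, the (i+1)-th layer autoencoder is trained in the same way as the i-th layer autoencoder in step 202, and once the (i+1)-th layer autoencoder has converged, its error threshold is likewise obtained using the method given in step 203 for the error threshold of the i-th layer autoencoder.
Step 205: when the (i+1)-th layer autoencoder is the n-th layer autoencoder, generate the n-layer autoencoder.
Specifically, when the (i+1)-th layer autoencoder is the n-th layer autoencoder, training of the n-layer autoencoder is complete once the n-th layer autoencoder has converged and its error threshold has been obtained. The trained n-layer autoencoder can then be used for traffic detection.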
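The layer-wise training procedure of steps 201 to 205 can be sketched as a loop in which each later layer is fitted only on the samples the previous layer reconstructed poorly. The sketch abstracts the actual autoencoder training behind a hypothetical `fit_layer` callable that takes a list of samples and returns an error function; `fit_mean_model` is a toy stand-in that is not an autoencoder, and the way the threshold is derived from the split ratio is an assumption.

```python
import math

def train_cascade(training_set, n_layers, fit_layer, normal_ratio):
    """Fit up to n_layers layers and return (error_fns, thresholds)."""
    if n_layers < 2:
        raise ValueError("the application requires n > 1")
    layers, thresholds = [], []
    current = list(training_set)
    for _ in range(n_layers):
        if not current:
            break  # no poorly reconstructed samples left to train on
        error_fn = fit_layer(current)  # stands for training one layer to convergence
        errors = [error_fn(x) for x in current]
        ordered = sorted(errors)
        cut = max(1, math.ceil(normal_ratio * len(ordered))) - 1
        threshold = ordered[cut]
        layers.append(error_fn)
        thresholds.append(threshold)
        # Only samples above the threshold form the next layer's training set.
        current = [x for x, e in zip(current, errors) if e > threshold]
    return layers, thresholds

def fit_mean_model(data):
    """Toy stand-in for training: 'reconstructs' every sample as the data mean."""
    dim = len(data[0])
    mean = [sum(x[j] for x in data) / len(data) for j in range(dim)]
    return lambda x: sum((a - m) ** 2 for a, m in zip(x, mean)) / dim
```

Swapping `fit_mean_model` for a routine that trains a real autoencoder and returns its per-sample reconstruction error leaves the loop unchanged.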
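Step 206: acquire network traffic data.
Specifically, this step is substantially the same as step 101 of the embodiments of the present application and is not repeated here.
Step 207: input the network traffic data into the i-th layer autoencoder of the n-layer autoencoder for reconstruction to obtain reconstructed traffic data, and obtain a reconstruction error from the network traffic data and the reconstructed traffic data.
Specifically, this step is substantially the same as step 102 of the embodiments of the present application and is not repeated here.
Step 208: designate the network traffic data whose reconstruction error is greater than the error threshold of the i-th layer autoencoder as suspicious traffic, input the suspicious traffic into the (i+1)-th layer autoencoder for reconstruction to obtain suspicious reconstructed traffic, and obtain a suspicious reconstruction error from the suspicious traffic and the suspicious reconstructed traffic.
Specifically, this step is substantially the same as step 103 of the embodiments of the present application and is not repeated here.
Step 209: when the (i+1)-th layer autoencoder is the n-th layer autoencoder, the suspicious traffic is abnormal traffic in the network traffic data if its suspicious reconstruction error is greater than the error threshold of the n-th layer autoencoder.
Specifically, this step is substantially the same as step 104 of the embodiments of the present application and is not repeated here.
Regarding this embodiment: conventional abnormal traffic detection formulates rules according to the specific pattern of each attack behavior and raises alerts on network traffic that does not conform to those rules. Such methods can detect only known attacks and cannot detect unknown network attacks, which leads to a high false-negative rate for abnormal network traffic. The present application instead uses autoencoders to learn the distribution characteristics of normal traffic in different scenarios, which improves the autoencoders' ability to reconstruct normal traffic data and lowers its reconstruction error. Abnormal traffic, by contrast, cannot be reconstructed well by any layer, so its reconstruction error differs greatly from that of normal traffic data, and the present application can therefore identify abnormal traffic well. In addition, when the (i+1)-th layer autoencoder is trained, it learns only the traffic data that the i-th layer autoencoder learned poorly and does not relearn traffic data that was already learned well, which reduces the training and testing time of each autoencoder layer.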
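One embodiment of the present application relates to a traffic detection method which, as shown in FIG. 6, includes:
Step 301: acquire network traffic data.
Specifically, this step is substantially the same as step 101 of the embodiments of the present application and is not repeated here.
Step 302: determine the flow direction of the network traffic data from the source IP, destination IP, source port, and destination port.
Specifically, the acquired network traffic data carries five-tuple information consisting of the source IP, destination IP, source port, destination port, and protocol information. The source IP, destination IP, source port, and destination port can be used to determine the flow direction of the network traffic data.
Step 303: perform feature extraction on the network traffic data according to the flow direction and the protocol information to obtain statistical feature data of the network traffic data.
Specifically, features can be extracted from the network traffic data according to its flow direction and protocol information, yielding 76-dimensional statistical feature information such as the flow duration, packet length, packet count, time interval between adjacent packets, and sending rate.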
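Steps 302 and 303 can be illustrated with a small sketch that splits the packets of one flow by direction and computes a handful of statistics. The `Packet` fields, the rule that the sender of the first packet defines the forward direction, and the particular features chosen are all assumptions for illustration; the application does not enumerate its 76 features or define how direction is decided.

```python
from dataclasses import dataclass
from statistics import mean

@dataclass(frozen=True)
class Packet:
    timestamp: float  # seconds
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    protocol: str
    length: int       # bytes

def flow_features(packets):
    """Compute a few direction-aware statistics for the packets of one flow."""
    if not packets:
        raise ValueError("flow has no packets")
    packets = sorted(packets, key=lambda p: p.timestamp)
    first = packets[0]
    forward, backward = [], []
    for p in packets:
        # Assumption: packets sent by the initiator of the flow are "forward".
        is_forward = (p.src_ip, p.src_port) == (first.src_ip, first.src_port)
        (forward if is_forward else backward).append(p)
    duration = packets[-1].timestamp - first.timestamp
    gaps = [b.timestamp - a.timestamp for a, b in zip(packets, packets[1:])]
    total_bytes = sum(p.length for p in packets)
    return {
        "protocol": first.protocol,
        "duration": duration,
        "fwd_packets": len(forward),
        "bwd_packets": len(backward),
        "mean_length": mean(p.length for p in packets),
        "mean_gap": mean(gaps) if gaps else 0.0,
        "bytes_per_second": total_bytes / duration if duration > 0 else 0.0,
    }
```

A full extractor would compute such statistics separately per direction and add minima, maxima, and deviations to reach a feature vector of the size the application describes.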
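Step 304: input the network traffic data into the i-th layer autoencoder of the n-layer autoencoder for reconstruction to obtain reconstructed traffic data, and obtain a reconstruction error from the network traffic data and the reconstructed traffic data.
Specifically, this step is substantially the same as step 102 of the embodiments of the present application and is not repeated here.
Step 305: designate the network traffic data whose reconstruction error is greater than the error threshold of the i-th layer autoencoder as suspicious traffic, input the suspicious traffic into the (i+1)-th layer autoencoder for reconstruction to obtain suspicious reconstructed traffic, and obtain a suspicious reconstruction error from the suspicious traffic and the suspicious reconstructed traffic.
Specifically, this step is substantially the same as step 103 of the embodiments of the present application and is not repeated here.
Step 306: when the (i+1)-th layer autoencoder is the n-th layer autoencoder, the suspicious traffic is abnormal traffic in the network traffic data if its suspicious reconstruction error is greater than the error threshold of the n-th layer autoencoder.
Specifically, this step is substantially the same as step 104 of the embodiments of the present application and is not repeated here.
This embodiment builds on the other embodiments by additionally performing feature extraction on the network traffic data, so that only the feature data of the network traffic data is input into the n-layer autoencoder. This speeds up the n-layer autoencoder's processing of network traffic data and thus improves the efficiency of traffic detection.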
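One embodiment of the present application relates to a traffic detection method which, as shown in FIG. 7, includes:
Step 401: acquire network traffic data.
Specifically, this step is substantially the same as step 101 of the embodiments of the present application and is not repeated here.
Step 402: determine the flow direction of the network traffic data from the source IP, destination IP, source port, and destination port.
Specifically, this step is substantially the same as step 302 of the embodiments of the present application and is not repeated here.
Step 403: perform feature extraction on the network traffic data according to the flow direction and the protocol information to obtain statistical feature data of the network traffic data.
Specifically, this step is substantially the same as step 303 of the embodiments of the present application and is not repeated here.
Step 404: perform data cleaning and data normalization on the statistical feature data.
Specifically, incomplete capture or other network causes during acquisition may leave abnormal values such as NaN or Inf in a few flows after feature extraction, which would affect subsequent data processing. Because such flows account for no more than 1% of the traffic, flows containing abnormal values are removed as "dirty data". In addition, to eliminate the effect of the different scales of the feature dimensions, linear (min-max) normalization can be used to normalize the data of each dimension to the range 0 to 1.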
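Step 404 can be sketched in a few lines of plain Python: rows containing NaN or Inf are dropped, then each feature column is rescaled to the range 0 to 1 with min-max normalization. The function names and the choice to map constant columns to 0 are assumptions of this sketch.

```python
import math

def drop_dirty_rows(rows):
    """Remove feature vectors that contain NaN or +/-Inf."""
    return [row for row in rows if all(math.isfinite(v) for v in row)]

def min_max_normalize(rows):
    """Rescale every feature column to [0, 1]; constant columns map to 0."""
    if not rows:
        return []
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

raw = [
    [1.0, 10.0],
    [float("nan"), 20.0],   # dropped: contains NaN
    [3.0, float("inf")],    # dropped: contains Inf
    [2.0, 30.0],
    [3.0, 20.0],
]
clean = drop_dirty_rows(raw)
scaled = min_max_normalize(clean)
```

In practice the per-column minimum and maximum would usually be computed once on the training set and reused at detection time, so that training and detection data share the same scale; the application does not state this, so treat it as a suggestion.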
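Step 405: input the network traffic data into the i-th layer autoencoder of the n-layer autoencoder for reconstruction to obtain reconstructed traffic data, and obtain a reconstruction error from the network traffic data and the reconstructed traffic data.
Specifically, this step is substantially the same as step 102 of the embodiments of the present application and is not repeated here.
Step 406: designate the network traffic data whose reconstruction error is greater than the error threshold of the i-th layer autoencoder as suspicious traffic, input the suspicious traffic into the (i+1)-th layer autoencoder for reconstruction to obtain suspicious reconstructed traffic, and obtain a suspicious reconstruction error from the suspicious traffic and the suspicious reconstructed traffic.
Specifically, this step is substantially the same as step 103 of the embodiments of the present application and is not repeated here.
Step 407: when the (i+1)-th layer autoencoder is the n-th layer autoencoder, the suspicious traffic is abnormal traffic in the network traffic data if its suspicious reconstruction error is greater than the error threshold of the n-th layer autoencoder.
Specifically, this step is substantially the same as step 104 of the embodiments of the present application and is not repeated here.
This embodiment builds on the other embodiments by additionally cleaning and normalizing the statistical feature data of the network traffic data. This prevents flows containing abnormal values, and differences in scale between feature dimensions, from affecting the detection result, and thus improves the accuracy of traffic detection.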
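Another embodiment of the present application relates to a traffic detection apparatus. Details of the apparatus are described below; they are implementation details provided only to aid understanding and are not required to implement this embodiment. FIG. 8 is a schematic diagram of the traffic detection apparatus of this embodiment, which includes an acquisition module 501 and an n-layer autoencoder 502.
The acquisition module 501 is configured to acquire network traffic data.
The n-layer autoencoder 502 is configured to input the network traffic data into the i-th layer autoencoder of the n-layer autoencoder for reconstruction to obtain reconstructed traffic data, and obtain a reconstruction error from the network traffic data and the reconstructed traffic data; designate the network traffic data whose reconstruction error is greater than the preset error threshold of the i-th layer autoencoder as suspicious traffic, and input the suspicious traffic into the (i+1)-th layer autoencoder for reconstruction to obtain suspicious reconstructed traffic; obtain a suspicious reconstruction error of the suspicious traffic from the suspicious traffic and the suspicious reconstructed traffic; and, when the (i+1)-th layer autoencoder is the n-th layer autoencoder, determine that the suspicious traffic is abnormal traffic in the network traffic data if its suspicious reconstruction error is greater than the preset error threshold of the n-th layer autoencoder; where n is an integer greater than 1 and i is an integer greater than 0 and less than n.
It is easy to see that this embodiment is the system embodiment corresponding to the method embodiments above, and that it can be implemented in cooperation with them. The relevant technical details and technical effects mentioned in the method embodiments remain valid in this embodiment and are not repeated here to reduce duplication. Correspondingly, the relevant technical details mentioned in this embodiment can also be applied to the method embodiments.
It is worth mentioning that the modules involved in this embodiment are all logical modules. In practice, a logical unit may be one physical unit, part of a physical unit, or a combination of several physical units. In addition, to highlight the innovative part of the present application, units that are not closely related to solving the technical problem addressed by the present application are not introduced in this embodiment, but this does not mean that no other units exist in this embodiment.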
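Another embodiment of the present application relates to an electronic device which, as shown in FIG. 9, includes: at least one processor 601; and a memory 602 communicatively connected to the at least one processor 601; where the memory 602 stores instructions executable by the at least one processor 601, and the instructions are executed by the at least one processor 601 so that the at least one processor 601 can perform the traffic detection method of the embodiments above.
The memory and the processor are connected by a bus. The bus may include any number of interconnected buses and bridges, and it links together the various circuits of the one or more processors and the memory. The bus may also link together various other circuits such as peripherals, voltage regulators, and power management circuits, all of which are well known in the art and are therefore not described further here. A bus interface provides an interface between the bus and a transceiver. The transceiver may be a single element or multiple elements, such as multiple receivers and transmitters, and provides a unit for communicating with various other apparatuses over a transmission medium. Data processed by the processor is transmitted over a wireless medium through an antenna; the antenna also receives data and passes it to the processor.
The processor is responsible for managing the bus and general processing, and may also provide various functions including timing, peripheral interfaces, voltage regulation, power management, and other control functions. The memory may be used to store data used by the processor when performing operations.
Another embodiment of the present application relates to a computer-readable storage medium storing a computer program. The computer program implements the method embodiments above when executed by a processor.
That is, a person skilled in the art will understand that all or some of the steps of the methods in the above embodiments can be completed by a program instructing the relevant hardware. The program is stored in a storage medium and includes several instructions that cause a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to perform all or some of the steps of the methods described in the embodiments of the present application. The aforementioned storage media include USB flash drives, removable hard disks, read-only memory (ROM), random access memory (RAM), magnetic disks, optical discs, and other media that can store program code.
A person of ordinary skill in the art will understand that the embodiments above are specific embodiments for implementing the present application, and that in practice various changes in form and detail may be made to them without departing from the spirit and scope of the present application.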

Claims (10)

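  1. A traffic detection method, comprising:
    acquiring network traffic data;
    inputting the network traffic data into the i-th layer autoencoder of a preset n-layer autoencoder for reconstruction to obtain reconstructed traffic data, and obtaining a reconstruction error from the network traffic data and the reconstructed traffic data, wherein n is an integer greater than 1 and i is an integer greater than 0 and less than n;
    designating the network traffic data whose reconstruction error is greater than a preset error threshold of the i-th layer autoencoder as suspicious traffic, inputting the suspicious traffic into the (i+1)-th layer autoencoder for reconstruction to obtain suspicious reconstructed traffic, and obtaining a suspicious reconstruction error of the suspicious traffic from the suspicious traffic and the suspicious reconstructed traffic; and
    when the (i+1)-th layer autoencoder is the n-th layer autoencoder, determining that the suspicious traffic is abnormal traffic in the network traffic data if the suspicious reconstruction error of the suspicious traffic is greater than a preset error threshold of the n-th layer autoencoder.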
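  2. The traffic detection method according to claim 1, further comprising, before the acquiring of network traffic data:
    acquiring a traffic training set, wherein the traffic training set contains a number of items of normal traffic data;
    inputting the traffic training set into the i-th layer autoencoder for reconstruction to obtain a reconstructed traffic set, and training the i-th layer autoencoder using a preset loss function, wherein the reconstructed traffic set includes a number of items of reconstructed traffic, and each item of normal traffic data yields one item of reconstructed traffic after reconstruction;
    when the i-th layer autoencoder has converged, obtaining the reconstruction error of each item of normal traffic data in the traffic training set from the reconstructed traffic set, determining the error threshold of the i-th layer autoencoder from the reconstruction errors of the items of normal traffic data and a preset data split ratio, and selecting from the traffic training set the normal traffic data whose reconstruction error is greater than the error threshold of the i-th layer autoencoder as a suspicious traffic data set;
    inputting the suspicious traffic data set into the (i+1)-th layer autoencoder for reconstruction to obtain a suspicious reconstructed traffic set, training the (i+1)-th layer autoencoder with the loss function until it converges, obtaining the suspicious reconstruction error of each item of normal traffic data in the suspicious traffic data set from the suspicious reconstructed traffic set, and determining the error threshold of the (i+1)-th layer autoencoder from the suspicious reconstruction errors of the items of normal traffic data and the data split ratio; and
    when the (i+1)-th layer autoencoder is the n-th layer autoencoder, generating the n-layer autoencoder.
  3. The traffic detection method according to claim 2, wherein the loss function is the mean squared error between the items of normal traffic data in the traffic training set and the items of reconstructed traffic data in the reconstructed traffic set; and
    the training of the i-th layer autoencoder using a preset loss function comprises:
    when the value of the loss function does not satisfy a preset convergence condition, computing the gradient of the loss function using a preset optimizer, and adjusting the autoencoder parameters of the i-th layer autoencoder according to the gradient.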
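  4. The traffic detection method according to any one of claims 1 to 3, wherein the network traffic data carries five-tuple information, and the five-tuple information includes a source IP, a destination IP, a source port, a destination port, and protocol information; and
    after the acquiring of network traffic data, the method further comprises:
    determining the flow direction of the network traffic data from the source IP, the destination IP, the source port, and the destination port; and
    performing feature extraction on the network traffic data according to the flow direction and the protocol information to obtain statistical feature data of the network traffic data.
  5. The traffic detection method according to claim 4, further comprising, after the obtaining of the statistical feature data of the network traffic data: performing data cleaning and data normalization on the statistical feature data.
  6. The traffic detection method according to any one of claims 1 to 5, wherein the inputting of the network traffic data into the i-th layer autoencoder of the preset n-layer autoencoder for reconstruction to obtain reconstructed traffic data, and the obtaining of a reconstruction error from the network traffic data and the reconstructed traffic data, specifically comprise:
    encoding, by the i-th layer autoencoder, the network traffic data to generate encoded data of the network traffic data;
    decoding, by the i-th layer autoencoder, the encoded data to generate the reconstructed traffic data; and
    taking the difference between the network traffic data and the reconstructed traffic data as the reconstruction error.
  7. The traffic detection method according to any one of claims 1 to 6, wherein each autoencoder layer of the n-layer autoencoder contains a regularization term for constraining that autoencoder layer.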
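  8. A traffic detection apparatus, comprising:
    an acquisition module configured to acquire network traffic data; and
    an n-layer autoencoder configured to: input the network traffic data into the i-th layer autoencoder of the n-layer autoencoder for reconstruction to obtain reconstructed traffic data, and obtain a reconstruction error from the network traffic data and the reconstructed traffic data; designate the network traffic data whose reconstruction error is greater than a preset error threshold of the i-th layer autoencoder as suspicious traffic, and input the suspicious traffic into the (i+1)-th layer autoencoder for reconstruction to obtain suspicious reconstructed traffic; obtain a suspicious reconstruction error of the suspicious traffic from the suspicious traffic and the suspicious reconstructed traffic; and, when the (i+1)-th layer autoencoder is the n-th layer autoencoder, determine that the suspicious traffic is abnormal traffic in the network traffic data if the suspicious reconstruction error of the suspicious traffic is greater than a preset error threshold of the n-th layer autoencoder; wherein n is an integer greater than 1 and i is an integer greater than 0 and less than n.
  9. An electronic device, comprising:
    at least one processor; and
    a memory communicatively connected to the at least one processor; wherein
    the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can perform the traffic detection method according to any one of claims 1 to 7.
  10. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the traffic detection method according to any one of claims 1 to 7.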
PCT/CN2022/083871 2021-08-24 2022-03-29 Traffic detection method, apparatus, electronic device, and storage medium WO2023024506A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110975446.6 2021-08-24
CN202110975446.6A CN115941218A (zh) 2021-08-24 2021-08-24 Traffic detection method, apparatus, electronic device, and storage medium

Publications (1)

Publication Number Publication Date
WO2023024506A1 true WO2023024506A1 (zh) 2023-03-02

Family

ID=85321508

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/083871 WO2023024506A1 (zh) 2021-08-24 2022-03-29 Traffic detection method, apparatus, electronic device, and storage medium

Country Status (2)

Country Link
CN (1) CN115941218A (zh)
WO (1) WO2023024506A1 (zh)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3635932A1 (en) * 2017-06-09 2020-04-15 British Telecommunications Public Limited Company Anomaly detection in computer networks
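CN108737406A (zh) * 2018-05-10 2018-11-02 北京邮电大学 Method and system for detecting abnormal traffic data
CN110691100A (zh) * 2019-10-28 2020-01-14 中国科学技术大学 Deep learning based hierarchical network attack identification and unknown attack detection method
CN111988277A (zh) * 2020-07-18 2020-11-24 郑州轻工业大学 Attack detection method based on a bidirectional generative adversarial network
CN112434298A (zh) * 2021-01-26 2021-03-02 浙江大学 Network threat detection system based on an autoencoder ensemble
CN113067754A (zh) * 2021-04-13 2021-07-02 南京航空航天大学 Semi-supervised time series anomaly detection method and system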

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZHANG CHANGHUA, ZHOU XIONGTU, ZHANG YONG’AI, YAO JIANMIN, GUO TAILIANG, YAN QUN: "Application Research of Deep Auto Encoder in Data Anomaly Detection", COMPUTER ENGINEERING AND APPLICATIONS, HUABEI JISUAN JISHU YANJIUSUO, CN, vol. 56, no. 17, 1 January 2020 (2020-01-01), CN , pages 93 - 99, XP093039577, ISSN: 1002-8331, DOI: 10.3778/j.issn.1002-8331.1908-0298 *

Also Published As

Publication number Publication date
CN115941218A (zh) 2023-04-07

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22859860

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE