WO2020119481A1 - Network traffic classification method and system based on deep learning, and electronic device

Info

Publication number
WO2020119481A1
WO2020119481A1 PCT/CN2019/122001 CN2019122001W WO2020119481A1 WO 2020119481 A1 WO2020119481 A1 WO 2020119481A1 CN 2019122001 W CN2019122001 W CN 2019122001W WO 2020119481 A1 WO2020119481 A1 WO 2020119481A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
network traffic
network
classification
sample data
Application number
PCT/CN2019/122001
Inventor
赵世林
叶可江
须成忠
Original Assignee
深圳先进技术研究院
Application filed by 深圳先进技术研究院
Publication of WO2020119481A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/04 Network management architectures or arrangements
    • H04L41/044 Network management architectures or arrangements comprising hierarchical management structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 Network analysis or design
    • H04L41/145 Network analysis or design involving simulating, designing, planning or modelling of a network
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/02 Capturing of monitoring data
    • H04L43/028 Capturing of monitoring data by filtering
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/10 Flow control; Congestion control
    • H04L47/24 Traffic characterised by specific attributes, e.g. priority or QoS
    • H04L47/2441 Traffic characterised by specific attributes, e.g. priority or QoS relying on flow classification, e.g. using integrated services [IntServ]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/10 Flow control; Congestion control
    • H04L47/31 Flow control; Congestion control by tagging of packets, e.g. using discard eligibility [DE] bits
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/16 Implementation or adaptation of Internet protocol [IP], of transmission control protocol [TCP] or of user datagram protocol [UDP]
    • H04L69/161 Implementation details of TCP/IP or UDP/IP stack architecture; Specification of modified or new header fields

Definitions

  • This application belongs to the technical field of network traffic classification, and in particular relates to a network traffic classification method, system, and electronic device based on deep learning.
  • Network traffic classification plays an important role in network management, resource allocation, on-demand services, and security systems.
  • By classifying network traffic, network resources can be managed accurately. This supports the effective reuse of resources and the provision of personalized services, and it is also important for enterprises that want to avoid unnecessary network expenses. Therefore, how to classify network traffic accurately, and thereby improve network resource reuse and personalized services, is a major challenge.
  • Network traffic classification based on representation learning: the obtained network traffic data is preprocessed, a representation learning algorithm extracts features from the preprocessed data to generate network flow vectors, and the network traffic data is classified according to the network flow vectors, which enables efficient classification of network traffic.
  • Network traffic classification based on two-stage sequence feature learning: long short-term memory (LSTM) neural networks learn the sequence characteristics of network traffic in two stages, at the data packet level and at the network flow level. The first stage generates a sequence of packet vectors from the flow byte sequence. The second stage generates a network flow vector from the sequence of packet vectors. Finally, a classifier performs traffic classification on the network flow vector. This method fully considers the internal structure and organization of network traffic, makes effective use of the time-series feature learning ability of LSTM networks, and classifies only after obtaining more comprehensive traffic characteristics, which can achieve more accurate network traffic classification.
  • Network traffic classification based on hierarchical spatio-temporal feature learning: the spatial features of the network traffic data are obtained through a first neural network, the temporal features are obtained through a second neural network, and the network traffic is classified according to the spatial features and the temporal features.
  • This method can get more comprehensive and accurate traffic feature information, which can effectively improve the network traffic classification ability; using a better traffic feature set can effectively reduce the false alarm rate.
  • Existing network traffic classification methods are based on traditional machine learning technology. Their classification performance depends heavily on the design of the traffic features, and accurately describing the traffic feature set requires a great deal of manual design. This remains a difficulty in network traffic classification.
  • Most current network traffic classification methods propose optimizations and improvements to the classification algorithm module in the training phase, while the local features contained in the raw network traffic data are rarely studied or mined, and classification performance is unstable.
  • the present application provides a method, system and electronic device for network traffic classification based on deep learning, aiming to solve at least to a certain extent one of the above technical problems in the prior art.
  • a network traffic classification method based on deep learning including the following steps:
  • Step a Capture network traffic sample data
  • Step b Extract the global feature data set of the network traffic sample data through a deep learning classification algorithm
  • Step c Construct a random forest classification model according to the global feature data set, and output the network traffic classification result through the random forest classification model.
  • The technical solution adopted in the embodiment of the present application further includes: in step a, capturing network traffic sample data specifically includes: selecting a network data center and collecting all network data packets; and, at the same time, acquiring the system network logs generated by communication between network traffic during the time period corresponding to the network data packets.
  • The technical solution adopted in the embodiment of the present application further includes: in step a, capturing the network traffic sample data further includes: inspecting the network traffic sample data, preprocessing it, filtering out incomplete network packets, and deleting retransmitted network packets.
  • Capturing the network traffic sample data further includes: performing sample labeling on the preprocessed network traffic sample data to obtain a network flow data set;
  • The sample labeling is specifically: analyzing the network traffic sample data to find the natural attributes of each application and the IP addresses and transport protocols it uses when communicating with other applications; extracting from the system network log the IP endpoints and the number of transmitted packets associated with each application to determine the category of the network traffic sample data, and merging this with the IP address and transport protocol of each application to complete the labeling of the network traffic sample data; finally, using deep packet inspection to perform feature fingerprint matching on unknown traffic data to complete the labeling of the unknown traffic data.
  • The technical solution adopted in the embodiment of the present application further includes: in step b, extracting the global feature data set of the network traffic sample data through the deep learning classification algorithm specifically includes:
  • Step b1 Enter the network flow data set
  • Step b2 Use the correlation between the traffic data contained in the four layers of the TCP/IP protocol to extract, in sequence, the traffic data of the application layer, transport layer, network layer, and data link layer of each network packet;
  • Step b3 According to the importance of the data contained in the four layers of the TCP/IP protocol, divide and extract traffic data of different sizes from each layer in proportion;
  • Step b4 Combine the extracted traffic data into a one-dimensional sequence of M bytes, and convert the M bytes into N pixels;
  • Step b5 Convert the N pixels into a grayscale image of standard size to form a new grayscale image data set
  • Step b6 Send the grayscale image data set to the input layer of the convolutional neural network model. After continuously and adaptively adjusting the size and number of the convolutional and pooling layers, perform the convolution operation according to the stride to obtain a high-dimensional global feature data set.
  • a network traffic classification system based on deep learning including:
  • Data acquisition module used to capture network traffic sample data
  • Feature extraction module used to extract the global feature data set of the network traffic sample data through a deep learning classification algorithm
  • Classification model building module used to build a random forest classification model according to the global feature data set
  • Result output module used to output network traffic classification results.
  • The technical solution adopted in the embodiments of the present application further includes: the data acquisition module capturing network traffic sample data specifically includes: selecting a network data center and collecting all network data packets; and, at the same time, acquiring the system network logs generated by communication between network traffic during the time period corresponding to the network data packets.
  • The technical solution adopted in the embodiment of the present application further includes a data preprocessing module, which is used to inspect the network traffic sample data, preprocess it, filter out incomplete network data packets, and delete retransmitted network data packets.
  • The technical solution adopted in the embodiment of the present application further includes a data labeling module, which is used to perform sample labeling on the preprocessed network traffic sample data to obtain a network flow data set. The sample labeling is specifically: analyzing the network traffic sample data to find the natural attributes of each application and the IP addresses and transport protocols it uses when communicating with other applications; extracting from the system network log the IP endpoints and the number of transmitted packets associated with each application to determine the category of the network traffic sample data, and merging this with the IP address and transport protocol of each application to complete the labeling of the network traffic sample data; finally, using deep packet inspection to perform feature fingerprint matching on unknown traffic data to complete the labeling of the unknown traffic data.
  • The technical solution adopted in the embodiments of the present application further includes: the feature extraction module extracting the global feature data set of the network traffic sample data through a deep learning classification algorithm is specifically: inputting the network flow data set; using the degree of correlation between the traffic data contained in the four layers of the TCP/IP protocol to extract, in sequence, the traffic data of the application layer, transport layer, network layer, and data link layer of each network packet; according to the importance of the data contained in the four layers of the TCP/IP protocol, dividing and extracting traffic data of different sizes from each layer in proportion; combining the extracted traffic data into a one-dimensional sequence of M bytes and converting the M bytes into N pixels; converting the N pixels into standard-sized grayscale images to form a new grayscale image data set; and sending the grayscale image data set to the input layer of the convolutional neural network model, where, after the size and number of the convolutional and pooling layers are continuously and adaptively adjusted, the convolution operation is performed according to the stride to obtain a high-dimensional global feature data set.
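  • As an illustration of extracting per-layer traffic data from a packet, the following minimal sketch splits one captured frame into the four TCP/IP layers. It assumes an untagged Ethernet II frame carrying IPv4 and TCP; the function name `split_layers` and the returned dictionary keys are hypothetical and do not come from the application.

```python
import struct
from typing import Dict

ETH_HEADER_LEN = 14  # Ethernet II header without a VLAN tag (assumption)


def split_layers(frame: bytes) -> Dict[str, bytes]:
    """Split one Ethernet/IPv4/TCP frame into its four TCP/IP layers."""
    if len(frame) < ETH_HEADER_LEN + 20:
        raise ValueError("frame too short for Ethernet + IPv4 headers")
    ethertype = struct.unpack("!H", frame[12:14])[0]
    if ethertype != 0x0800:
        raise ValueError("only IPv4 frames are handled in this sketch")

    ip_start = ETH_HEADER_LEN
    ip_header_len = (frame[ip_start] & 0x0F) * 4  # IHL field, in bytes
    if frame[ip_start + 9] != 6:                  # IPv4 protocol number 6 = TCP
        raise ValueError("only TCP segments are handled in this sketch")

    tcp_start = ip_start + ip_header_len
    if len(frame) < tcp_start + 20:
        raise ValueError("frame too short for a TCP header")
    tcp_header_len = (frame[tcp_start + 12] >> 4) * 4  # data offset, in bytes
    payload_start = tcp_start + tcp_header_len

    return {
        "application": frame[payload_start:],
        "transport": frame[tcp_start:payload_start],
        "network": frame[ip_start:tcp_start],
        "data_link": frame[:ip_start],
    }


# Hand-built example frame: 14-byte Ethernet, 20-byte IPv4, 20-byte TCP, payload.
example_frame = (
    bytes(12) + b"\x08\x00"
    + bytes([0x45, 0, 0, 45, 0, 0, 0, 0, 64, 6, 0, 0]) + bytes(8)
    + bytes(12) + bytes([0x50]) + bytes(7)
    + b"GET /"
)
example_layers = split_layers(example_frame)
```

  • Other link types, IPv6, UDP, and VLAN-tagged frames would need additional branches; they are left out to keep the sketch short.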
  • an electronic device including:
  • at least one processor; and
  • a memory communicatively connected to the at least one processor; wherein,
  • the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the following operations of the deep learning-based network traffic classification method described above:
  • Step a Capture network traffic sample data
  • Step b Extract the global feature data set of the network traffic sample data through a deep learning classification algorithm
  • Step c Construct a random forest classification model according to the global feature data set, and output the network traffic classification result through the random forest classification model.
  • The beneficial effects produced by the embodiments of the present application are: the deep learning-based network traffic classification method, system, and electronic device of the embodiments of the present application use the latent features of the traffic data of each layer of the TCP/IP protocol for classification, which improves classification accuracy; at the same time, the data of each layer is mined in depth in proportion to its importance, which guarantees high cohesion of the features of each layer.
  • The random forest classification model is trained on the extracted global features; the results show stable classification performance, high-dimensional traffic data can be handled, and feature selection is not required.
  • The present application can effectively guarantee high accuracy and high performance of network traffic classification, and at the same time it can improve classification efficiency, shorten training time, and reduce computation overhead.
  • FIG. 1 is a flowchart of a network traffic classification method based on deep learning according to an embodiment of the present application
  • FIG. 2 is a flowchart of feature extraction by a deep learning classification algorithm according to an embodiment of the present application
  • FIG. 3 is a schematic structural diagram of a network traffic classification system based on deep learning according to an embodiment of the present application
  • FIG. 4 is a schematic structural diagram of a hardware device of a network traffic classification method based on deep learning provided by an embodiment of the present application.
  • The deep learning-based network traffic classification method of the embodiment of the present application uses deep learning hidden-feature extraction to accurately mine the large number of hidden traffic feature sets in network traffic, so that during network traffic classification the traffic feature sets are used fully and efficiently to classify and identify the network traffic accurately.
  • FIG. 1 is a flowchart of a deep learning-based network traffic classification method according to an embodiment of the present application.
  • the network traffic classification method based on deep learning in the embodiment of the present application includes the following steps:
  • Step 100 Capture sample data of network traffic
  • Capturing network traffic sample data specifically includes: selecting a large network data center and using Wireshark to collect all network data packets; at the same time, in order to label the data, setting up high-performance network monitoring software that captures continuously and obtains the system network logs generated by communication between network traffic during the time period corresponding to the network data packets.
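  • As a hedged illustration of how captured packets might be read back for later processing, the sketch below parses a capture saved in the classic pcap file format (24-byte global header, 16-byte per-record header, microsecond timestamps). It assumes the capture was exported in that format rather than pcapng; the function name `read_pcap` is hypothetical.

```python
import io
import struct
from typing import BinaryIO, Iterator, Tuple


def read_pcap(stream: BinaryIO) -> Iterator[Tuple[float, bytes]]:
    """Yield (timestamp_seconds, raw_packet_bytes) from a classic pcap stream."""
    header = stream.read(24)
    if len(header) < 24:
        raise ValueError("truncated pcap global header")
    magic = struct.unpack("<I", header[:4])[0]
    if magic == 0xA1B2C3D4:
        endian = "<"   # file written in little-endian byte order
    elif magic == 0xD4C3B2A1:
        endian = ">"   # file written in big-endian byte order
    else:
        raise ValueError("not a classic microsecond pcap file")

    while True:
        record = stream.read(16)
        if len(record) < 16:
            return  # end of file
        ts_sec, ts_usec, incl_len, _orig_len = struct.unpack(endian + "IIII", record)
        data = stream.read(incl_len)
        if len(data) < incl_len:
            return  # truncated final record
        yield ts_sec + ts_usec / 1_000_000, data


# In-memory example: one global header followed by a single 3-byte record.
sample = (
    struct.pack("<IHHiIII", 0xA1B2C3D4, 2, 4, 0, 0, 65535, 1)
    + struct.pack("<IIII", 10, 500000, 3, 3)
    + b"abc"
)
packets = list(read_pcap(io.BytesIO(sample)))
```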
  • Step 200 Detect network traffic sample data, and preprocess the network traffic sample data
  • Preprocessing the network traffic sample data specifically includes: first, because an unstable TCP (Transmission Control Protocol) three-way handshake can break the connection and produce incomplete network data packets, the incomplete network data packets need to be filtered out; second, because the loss of acknowledgement packets during a TCP connection causes network data packets to be retransmitted, the retransmitted network data packets need to be deleted.
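  • The two preprocessing rules can be sketched as follows. This is one possible reading, not the application's implementation: it assumes packets have already been parsed into a hypothetical `Packet` record, treats every packet of a connection whose SYN, SYN/ACK, ACK handshake never completed as incomplete, and treats a payload-carrying packet with the same flow, sequence number, and payload length as an earlier one as a retransmission.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set, Tuple

FlowKey = Tuple[str, int, str, int]  # (src_ip, src_port, dst_ip, dst_port)


@dataclass(frozen=True)
class Packet:
    """Hypothetical, already-parsed TCP packet record."""
    flow: FlowKey
    seq: int
    flags: str       # TCP flags as letters, e.g. "S", "SA", "A", "PA"
    payload: bytes


def connection_key(flow: FlowKey) -> Tuple[Tuple[str, int], Tuple[str, int]]:
    """Direction-independent key, so both directions map to one connection."""
    a, b = (flow[0], flow[1]), (flow[2], flow[3])
    return (a, b) if a <= b else (b, a)


def preprocess(packets: Iterable[Packet]) -> List[Packet]:
    packets = list(packets)

    # Pass 1: follow SYN -> SYN/ACK -> ACK to find completed handshakes.
    stage: Dict[object, int] = {}
    for p in packets:
        key = connection_key(p.flow)
        current = stage.get(key, 0)
        if p.flags == "S" and current == 0:
            stage[key] = 1
        elif p.flags == "SA" and current == 1:
            stage[key] = 2
        elif p.flags == "A" and current == 2:
            stage[key] = 3
    complete = {key for key, value in stage.items() if value == 3}

    # Pass 2: keep packets of complete connections and drop retransmissions.
    # Only payload-carrying packets are deduplicated, because pure ACKs
    # legitimately repeat the same sequence number.
    seen: Set[Tuple[FlowKey, int, int]] = set()
    kept: List[Packet] = []
    for p in packets:
        if connection_key(p.flow) not in complete:
            continue
        if p.payload:
            ident = (p.flow, p.seq, len(p.payload))
            if ident in seen:
                continue
            seen.add(ident)
        kept.append(p)
    return kept


client = ("10.0.0.1", 1234, "10.0.0.2", 80)
server = ("10.0.0.2", 80, "10.0.0.1", 1234)
half_open = ("10.0.0.3", 5555, "10.0.0.2", 80)
cleaned = preprocess([
    Packet(client, 0, "S", b""),
    Packet(server, 0, "SA", b""),
    Packet(client, 1, "A", b""),
    Packet(client, 1, "PA", b"hello"),
    Packet(client, 1, "PA", b"hello"),   # retransmission, dropped
    Packet(half_open, 0, "S", b""),      # handshake never completed, dropped
])
```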
  • Step 300 Perform sample labeling processing on the preprocessed network traffic sample data to obtain a network flow data set
  • The sample labeling specifically includes: first, analyzing the network traffic sample data to find the natural attributes of each application and the key information it exchanges when communicating with other applications, including the IP address, transport protocol, etc.; second, extracting from the system network log the IP endpoints and the number of transmitted packets associated with each application to determine the category of the network traffic sample data, and associating and merging this with the IP address and transport protocol of each application to complete the labeling of the network traffic sample data; finally, using DPI (Deep Packet Inspection) to perform feature fingerprint matching on unknown traffic data to complete the labeling of the unknown traffic data.
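  • A minimal sketch of this labeling order is shown below: a flow is first labeled from the endpoint information found in the system network log, and only flows that remain unknown fall back to fingerprint matching on the payload. The names `label_flow`, `log_index`, and `FINGERPRINTS`, and the few byte-prefix fingerprints, are illustrative assumptions; a real DPI engine uses far richer signatures.

```python
from typing import Dict, Tuple

Endpoint = Tuple[str, int, str]  # (ip, port, transport protocol)

# Illustrative payload prefixes only (assumption, not from the application).
FINGERPRINTS: Dict[bytes, str] = {
    b"GET ": "http",
    b"POST ": "http",
    b"\x16\x03": "tls",   # TLS handshake record: content type 22, major version 3
    b"SSH-": "ssh",
}


def label_flow(endpoint: Endpoint, first_payload: bytes,
               log_index: Dict[Endpoint, str]) -> str:
    """Label a flow from the log first, then by fingerprint matching."""
    application = log_index.get(endpoint)
    if application is not None:
        return application
    for prefix, name in FINGERPRINTS.items():
        if first_payload.startswith(prefix):
            return name
    return "unknown"


# Hypothetical index built from the system network log.
log_index = {("10.0.0.2", 443, "tcp"): "video-service"}
labels = [
    label_flow(("10.0.0.2", 443, "tcp"), b"", log_index),
    label_flow(("10.0.0.9", 22, "tcp"), b"SSH-2.0-OpenSSH", log_index),
    label_flow(("10.0.0.7", 9999, "udp"), b"\x00\x01", log_index),
]
```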
  • Step 400 Extract the global feature data set of the network flow data set through a deep learning classification algorithm
  • In step 400, the embodiment of the present application uses the degree of association among the protocol data of each layer in the traffic packets of the network traffic to re-extract and redistribute the data set.
  • FIG. 2 is a flowchart of extracting global feature data of the deep learning classification algorithm according to an embodiment of the present application, which specifically includes the following steps:
  • Step 401 Enter the network stream data set
  • Step 402 Use the correlation between the traffic data contained in the four layers of the TCP/IP protocol to extract, in sequence, the traffic data of the application layer, transport layer, network layer, and data link layer of each network packet;
  • Step 403 According to the importance of the data contained in the four layers of the TCP/IP protocol, divide and extract traffic data of different sizes from each layer according to a certain ratio;
  • In step 403, the present application mines the data in depth in proportion to the importance of the data contained in each layer, which guarantees high cohesion of the features of each layer.
  • Step 404 Combine the extracted traffic data into one-dimensional M bytes, and convert the M bytes into N pixels;
  • Step 405 Convert the N pixels into a gray image of standard size (X, X, 1) to form a new gray image data set;
  • Step 406 Send the grayscale image data set to the input layer of the convolutional neural network model. After continuously and adaptively adjusting the size and number of the convolutional and pooling layers, perform the convolution operation according to the stride to obtain a high-dimensional global feature data set;
  • Step 407 Use downsampling to compress the images in the global feature data set, reducing the number of parameters without affecting image quality;
  • The downsampling method is specifically: the pooling layer uses MaxPooling (maximum pooling) with a window size of 2*2 and a step size of 1, each window is replaced by its maximum value, and for the 2*2 window the image size changes from Feature_map to (Feature_map-2)+1.
  • Step 408 Repeat steps 406 and 407 until a large number of local features have been extracted, and terminate the convolution operation once the set learning rate is satisfied;
  • Step 409 The local feature extraction result is input to the Flatten layer, and the Flatten layer outputs a one-dimensional global feature data set.
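  • The conversion from per-layer bytes to a square grayscale image (steps 403 to 405) can be sketched as follows. The application does not state the per-layer ratio, the image side X, or how bytes map to pixels, so the percentages in `LAYER_PERCENT`, the default side length, zero padding, and the one-byte-per-pixel mapping are all assumptions made for illustration. The trailing channel dimension of (X, X, 1) is omitted.

```python
from typing import Dict, List

# Hypothetical share of the image given to each layer, in percent.
LAYER_PERCENT = {
    "application": 70,
    "transport": 15,
    "network": 10,
    "data_link": 5,
}


def packet_to_image(layers: Dict[str, bytes], side: int = 28) -> List[List[int]]:
    """Build a side x side grayscale image (pixel values 0..255) from layer bytes."""
    total = side * side
    flat = bytearray()
    for name in ("application", "transport", "network", "data_link"):
        quota = total * LAYER_PERCENT[name] // 100
        chunk = layers.get(name, b"")[:quota]   # truncate long layers
        flat += chunk.ljust(quota, b"\x00")     # zero-pad short layers
    flat = flat.ljust(total, b"\x00")[:total]   # fill any rounding remainder
    return [list(flat[row * side:(row + 1) * side]) for row in range(side)]


image = packet_to_image(
    {
        "application": b"\xff" * 20,
        "transport": b"\x01",
        "network": b"\x02\x02",
        "data_link": b"\x03",
    },
    side=4,
)
```

  • With a side of 4 the quotas are 11, 2, 1, and 0 bytes, so the example also shows that a very small image can leave a layer with no pixels; a realistic side length avoids this.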
  • Step 500 Perform classification training on the extracted global feature data set, construct a random forest classification model, and output a network traffic classification result through the random forest classification model.
  • In step 500, the present application first uses a convolutional neural network to extract a global feature data set, and then uses the extracted global feature data set to train a random forest classification model. During training, the model can detect the mutual influence between features, which effectively guarantees high accuracy and high performance of network traffic classification.
  • A random forest algorithm using supervised learning is used for modeling. According to the results given by each decision tree in the forest, not only can the category of known traffic be judged, but the classification of unknown traffic can also be determined by voting.
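  • The voting step can be illustrated with a small sketch of majority voting across trees. It covers only the aggregation of per-tree predictions, not the training of a random forest; the three hand-written "trees" and the class names are hypothetical stand-ins for trained decision trees.

```python
from collections import Counter
from typing import Callable, List, Sequence

Tree = Callable[[Sequence[float]], str]


def forest_predict(trees: List[Tree], features: Sequence[float]) -> str:
    """Return the class that receives the most votes from the trees."""
    votes = Counter(tree(features) for tree in trees)
    label, _count = votes.most_common(1)[0]
    return label


# Hypothetical stand-ins for trained decision trees.
example_trees: List[Tree] = [
    lambda x: "video" if x[0] > 0.5 else "web",
    lambda x: "video" if x[1] > 0.5 else "web",
    lambda x: "video" if x[0] + x[1] > 0.8 else "web",
]
prediction = forest_predict(example_trees, [0.9, 0.1])
```

  • In practice the forest would be trained on the flattened CNN features with an existing library implementation rather than written by hand.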
  • the test results show that the random forest classification model of the embodiment of the present application has very high classification accuracy, and at the same time, it can improve classification efficiency, shorten training time, and reduce calculation overhead.
  • FIG. 3 is a schematic structural diagram of a network traffic classification system based on deep learning according to an embodiment of the present application.
  • the network flow classification system based on deep learning in the embodiment of the present application includes a data acquisition module, a data preprocessing module, a data labeling module, a feature extraction module, a classification model construction module, and a result output module.
  • Data acquisition module used to capture network traffic sample data; capturing network traffic sample data specifically includes: selecting a large network data center and using Wireshark to collect all network data packets; at the same time, in order to label the data, setting up high-performance network monitoring software that captures continuously and obtains the system network logs generated by communication between network traffic during the time period corresponding to the network data packets.
  • Data pre-processing module used to inspect and pre-process the network traffic sample data; the pre-processing specifically includes: first, because an unstable TCP (Transmission Control Protocol) three-way handshake can break the connection and produce incomplete network data packets, the incomplete network data packets need to be filtered out; second, because the loss of acknowledgement packets during a TCP connection causes network data packets to be retransmitted, the retransmitted network data packets need to be deleted.
  • Data labeling module used to perform sample labeling on the pre-processed network traffic sample data to obtain a network flow data set; the sample labeling specifically includes: first, analyzing the network traffic sample data to find the natural attributes of each application and the key information it exchanges with other applications, including IP addresses, transport protocols, etc.; second, extracting from the system network log the IP endpoints and the number of transmitted packets associated with each application to determine the category of the network traffic sample data, and associating and merging this with the IP address and transport protocol of each application to complete the labeling of the network traffic sample data; finally, using DPI (Deep Packet Inspection) to perform feature fingerprint matching on unknown traffic data to complete the labeling of the unknown traffic data.
  • Feature extraction module used to extract the global feature data set of the network flow data set through a deep learning classification algorithm; the embodiments of the present application use the degree of association of each layer of protocol data in the traffic packets in the network traffic to re-extract and distribute the data set.
  • the global feature data set extraction method includes:
  • the traffic data of different sizes of each layer is sequentially divided and extracted according to a certain ratio
  • the extracted flow data is composed of one-dimensional M bytes, and the M bytes are converted into N pixels;
  • Feature_map (feature map) = (width+2*padding_size-filter_size)/stride+1; the specific size can be set according to the actual application.
  • the images in the global feature data set are compressed by downsampling to reduce the number of parameters without affecting image quality;
  • the downsampling method is specifically: the pooling layer uses MaxPooling (maximum pooling) with a window size of 2*2 and a step size of 1, each window is replaced by its maximum value, and for the 2*2 window the image size changes from Feature_map to (Feature_map-2)+1.
  • the local feature extraction result is input to the Flatten (flattening) layer, and the Flatten layer outputs a one-dimensional global feature dataset.
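  • The size arithmetic above can be checked with a short sketch. It computes the convolution output size from the stated formula and applies 2*2 max pooling with a step size of 1 to a small matrix, showing that the side length goes from Feature_map to (Feature_map-2)+1. The function names are hypothetical, and floor division is assumed for strides that do not divide evenly.

```python
from typing import List


def conv_output_size(width: int, padding_size: int, filter_size: int, stride: int) -> int:
    """Feature_map = (width + 2*padding_size - filter_size) / stride + 1."""
    return (width + 2 * padding_size - filter_size) // stride + 1


def max_pool_2x2_stride1(image: List[List[int]]) -> List[List[int]]:
    """Replace every 2*2 window (moved one pixel at a time) by its maximum."""
    rows, cols = len(image), len(image[0])
    return [
        [
            max(image[r][c], image[r][c + 1], image[r + 1][c], image[r + 1][c + 1])
            for c in range(cols - 1)
        ]
        for r in range(rows - 1)
    ]


feature_map = conv_output_size(width=28, padding_size=0, filter_size=3, stride=1)
pooled = max_pool_2x2_stride1([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
```

  • Because the step size is 1, each pooling pass shrinks the side by only one pixel; a step size of 2 is the more common choice when stronger downsampling is wanted.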
  • Classification model building module used to perform classification training on the extracted global feature data set and build a random forest classification model; the present application first uses a convolutional neural network to extract the global feature data set, and then uses the extracted global feature data set to train the random forest classification model. During training, the model can detect the interaction between features, which effectively guarantees high accuracy and high performance of network traffic classification.
  • Result output module used to output network traffic classification results.
  • the device includes one or more processors and memory. Taking a processor as an example, the device may further include: an input system and an output system.
  • the processor, memory, input system, and output system may be connected through a bus or in other ways.
  • connection through a bus is used as an example.
  • the memory can be used to store non-transitory software programs, non-transitory computer executable programs, and modules.
  • the processor runs non-transitory software programs, instructions, and modules stored in the memory to execute various functional applications and data processing of the electronic device, that is, to implement the processing methods of the foregoing method embodiments.
  • the memory may include a storage program area and a storage data area, where the storage program area may store an operating system and application programs required by at least one function; the storage data area may store data, and the like.
  • the memory may include a high-speed random access memory, and may also include a non-transitory memory, such as at least one magnetic disk storage device, a flash memory device, or other non-transitory solid-state storage devices.
  • the memory optionally includes memories remotely located with respect to the processor, and these remote memories may be connected to the processing system via a network. Examples of the above network include but are not limited to the Internet, intranet, local area network, mobile communication network, and combinations thereof.
  • the input system can receive input digital or character information, and generate signal input.
  • the output system may include display devices such as display screens.
  • the one or more modules are stored in the memory, and when executed by the one or more processors, perform the following operations of any of the foregoing method embodiments:
  • Step a Capture network traffic sample data
  • Step b Extract the global feature data set of the network traffic sample data through a deep learning classification algorithm
  • Step c Construct a random forest classification model according to the global feature data set, and output the network traffic classification result through the random forest classification model.
  • The above-mentioned products can execute the method provided in the embodiments of the present application, and have function modules and beneficial effects corresponding to the executed method. For technical details that are not described in detail in this embodiment, refer to the method provided in the embodiments of the present application.
  • An embodiment of the present application provides a non-transitory (non-volatile) computer storage medium that stores computer-executable instructions, and the computer-executable instructions can perform the following operations:
  • Step a: Capture network traffic sample data;
  • Step b: Extract the global feature data set of the network traffic sample data through a deep learning classification algorithm;
  • Step c: Construct a random forest classification model according to the global feature data set, and output the network traffic classification result through the random forest classification model.
  • An embodiment of the present application provides a computer program product.
  • The computer program product includes a computer program stored on a non-transitory computer-readable storage medium.
  • The computer program includes program instructions which, when executed by a computer, cause the computer to perform the following operations:
  • Step a: Capture network traffic sample data;
  • Step b: Extract the global feature data set of the network traffic sample data through a deep learning classification algorithm;
  • Step c: Construct a random forest classification model according to the global feature data set, and output the network traffic classification result through the random forest classification model.
  • The deep learning-based network traffic classification method, system, and electronic device of the embodiments of the present application classify traffic using the latent features of each layer of traffic data in the TCP/IP protocol, which improves classification accuracy; at the same time, each layer is mined in depth in proportion to the importance of the data it contains, which preserves the high cohesion of the features of each layer.
  • A random forest classification model is trained on the extracted global features; the results show stable classification performance, the model can handle very high-dimensional traffic data, and no feature selection is needed.
  • Compared with the prior art, the present application can effectively ensure high accuracy and high performance of network traffic classification while improving classification efficiency, shortening training time, and reducing computation overhead.

Abstract

The present application relates to a network traffic classification method and system based on deep learning, and an electronic device. The method comprises: step a: capturing network traffic sample data; step b: extracting a global feature data set of the network traffic sample data by means of a deep learning classification algorithm; and step c: constructing a random forest classification model according to the global feature data set, and outputting a network traffic classification result by means of the random forest classification model. In the present application, the random forest classification model is trained by utilizing extracted global features, the result shows stable classification performance, ultra-high-dimension traffic data can be processed, and feature selection is not necessary. Compared with the prior art, the present application can effectively guarantee high precision and high performance of network traffic classification; in addition, the classification efficiency can be improved, the training time can be shortened, and the computation overhead can be reduced.

Description

Network traffic classification method and system based on deep learning, and electronic device
Technical field
This application belongs to the technical field of network traffic classification, and in particular relates to a network traffic classification method and system based on deep learning, and an electronic device.
Background
With the rapid development of Internet technology, large numbers of new applications keep appearing on the network, each carrying a variety of services and functions, which makes the network environment extremely large, complex, and changeable. An effective method of monitoring network activity has become indispensable for the normal operation of the network and for the real-time allocation of services and resources. Network traffic classification plays an important role in network management, resource allocation, on-demand services, and security systems. For enterprise managers, for example, fine-grained classification and identification of network traffic supports precise management of network resources, effective reuse of resources, and personalized services, and is also very important for saving unnecessary network expenses. How to classify network traffic accurately, and thereby improve network resource reuse and personalized services, is therefore a major challenge.
In the prior art, commonly used network traffic classification methods include the following:
1. Network traffic classification based on representation learning: the acquired network traffic data is preprocessed, a representation learning algorithm extracts features from the preprocessed data to generate network flow vectors, and the network traffic data is classified according to the network flow vectors, so that network traffic can be classified efficiently.
2. Network traffic classification based on two-stage sequence feature learning: a long short-term memory (LSTM) neural network learns the sequence features of network traffic in two stages, at the data packet level and at the network flow level. The first stage generates a sequence of packet vectors from the traffic byte sequence; the second stage further generates a network flow vector from the sequence of packet vectors; finally, a classifier performs traffic classification on the network flow vector. This method fully considers the internal structural organization of network traffic, makes effective use of the temporal feature learning ability of the LSTM neural network, and classifies only after obtaining relatively comprehensive traffic features, which can achieve a more accurate classification result.
3. Network traffic classification based on hierarchical spatio-temporal feature learning: a first neural network obtains the spatial features of the network traffic data; a second neural network obtains the temporal features of the network traffic data; and the network traffic is classified according to the spatial features and the temporal features. This method obtains relatively comprehensive and accurate traffic feature information and can effectively improve classification ability; using a better traffic feature set can effectively reduce the false alarm rate.
In summary, existing network traffic classification methods are all based on traditional machine learning techniques, and their classification performance depends heavily on the design of traffic features. Obtaining a feature set that accurately characterizes traffic requires a great deal of manual design, which remains a difficulty in solving the network traffic classification problem. In addition, most current methods propose various optimizations and improvements to the classification algorithm module of the training stage, but the local features contained in the raw network traffic data itself are rarely studied or mined, and classification performance is unstable.
Summary of the invention
The present application provides a network traffic classification method and system based on deep learning, and an electronic device, aiming to solve, at least to some extent, one of the above technical problems in the prior art.
To solve the above problems, the present application provides the following technical solutions:
A network traffic classification method based on deep learning, comprising the following steps:
Step a: capturing network traffic sample data;
Step b: extracting a global feature data set of the network traffic sample data by means of a deep learning classification algorithm;
Step c: constructing a random forest classification model according to the global feature data set, and outputting a network traffic classification result by means of the random forest classification model.
The technical solution adopted in the embodiments of the present application further includes: in step a, capturing the network traffic sample data specifically includes: selecting a network data center and collecting all network data packets; and, at the same time, acquiring the system network logs generated by communication between network traffic during the time period corresponding to the network data packets.
The technical solution adopted in the embodiments of the present application further includes: step a further includes inspecting the network traffic sample data and preprocessing it, filtering out the incomplete network data packets in the network traffic sample data and deleting the retransmitted network data packets.
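As an illustrative sketch only, the preprocessing described above (filtering incomplete packets and deleting retransmissions) might look like the following. The `Packet` record, its field names, and the two detection rules are assumptions made for illustration; the application does not specify how incomplete or retransmitted packets are identified.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    flow: tuple        # hypothetical flow key, e.g. (src, sport, dst, dport)
    seq: int           # TCP sequence number
    payload_len: int   # bytes of TCP payload
    captured_len: int  # bytes actually captured
    wire_len: int      # bytes the packet had on the wire


def preprocess(packets):
    """Drop truncated packets and repeated data segments of the same flow."""
    seen = set()
    kept = []
    for p in packets:
        # Assumption: "incomplete" means the capture is shorter than the packet.
        if p.captured_len < p.wire_len:
            continue
        key = (p.flow, p.seq, p.payload_len)
        # Assumption: a data segment seen before in this flow is a retransmission.
        # Pure ACKs (no payload) legitimately repeat a sequence number, so keep them.
        if p.payload_len > 0 and key in seen:
            continue
        seen.add(key)
        kept.append(p)
    return kept
```

A production pipeline would more likely rely on the capture tool's own TCP analysis than on this simplified rule.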
The technical solution adopted in the embodiments of the present application further includes: step a further includes performing sample labeling on the preprocessed network traffic sample data to obtain a network flow data set. The sample labeling is specifically: analyzing the network traffic sample data to find the natural attributes of each application and the IP addresses and transport protocols it uses when communicating with other applications; extracting from the system network logs the IP endpoints and transmitted packet counts associated with each application, determining the category to which the network traffic sample data belongs, and correlating and merging the two sources using each application's IP address and transport protocol to complete the labeling of the network traffic sample data; and finally, using deep packet inspection technology to match feature fingerprints against unknown traffic data to complete the labeling of the unknown traffic data.
The technical solution adopted in the embodiments of the present application further includes: in step b, extracting the global feature data set of the network traffic sample data by means of the deep learning classification algorithm specifically includes:
Step b1: inputting the network flow data set;
Step b2: using the degree of correlation between the traffic data contained in the four layers of the TCP/IP protocol, extracting in proportion, in order, the traffic data of the application layer, transport layer, network layer, and data link layer of each network data packet;
Step b3: according to the importance of the data contained in each of the four layers of the TCP/IP protocol, segmenting and extracting in proportion, in order, traffic data of a different size for each layer;
Step b4: assembling the extracted traffic data into a one-dimensional sequence of M bytes, and converting the M bytes into N pixels;
Step b5: converting the N pixels into a grayscale image of standard size to form a new grayscale image data set;
Step b6: feeding the grayscale image data set into the input layer of a convolutional neural network model and, while continuously and adaptively adjusting the size and number of the convolutional layers and pooling layers, performing convolution operations in a loop to obtain a high-dimensional global feature data set.
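Steps b2 to b5 can be sketched as follows, purely as an illustration. The per-layer byte quotas, the one-byte-per-pixel mapping, the zero padding, and the function and layer names are all assumptions; the application states only that bytes are taken from each layer in proportion and converted into a grayscale image of standard size.

```python
def layers_to_image(layers, quotas, side):
    """Build a side x side grayscale image (list of rows) from per-layer bytes.

    layers: dict mapping a layer name to that layer's bytes for one packet
    quotas: dict mapping a layer name to the number of bytes kept from it
    """
    flat = bytearray()
    # Order follows the text: application, transport, network, data link.
    for name in ("application", "transport", "network", "link"):
        chunk = layers.get(name, b"")[:quotas[name]]
        # Zero-pad each layer to its quota so layer regions line up across packets.
        flat += chunk + bytes(quotas[name] - len(chunk))
    flat = flat[:side * side]
    flat += bytes(side * side - len(flat))  # zero-pad to the standard size
    # Assumption: each byte (0-255) becomes one grayscale pixel.
    return [list(flat[r * side:(r + 1) * side]) for r in range(side)]


# Illustrative quotas only; the application does not state the actual ratio.
quotas = {"application": 8, "transport": 4, "network": 3, "link": 1}
```

With these quotas and `side=4`, every packet yields a 4 x 4 image whose first two rows hold application-layer bytes.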
Another technical solution adopted in the embodiments of the present application is: a network traffic classification system based on deep learning, comprising:
a data acquisition module, configured to capture network traffic sample data;
a feature extraction module, configured to extract a global feature data set of the network traffic sample data by means of a deep learning classification algorithm;
a classification model construction module, configured to construct a random forest classification model according to the global feature data set;
a result output module, configured to output a network traffic classification result.
The technical solution adopted in the embodiments of the present application further includes: the data acquisition module capturing the network traffic sample data specifically includes: selecting a network data center and collecting all network data packets; and, at the same time, acquiring the system network logs generated by communication between network traffic during the time period corresponding to the network data packets.
The technical solution adopted in the embodiments of the present application further includes a data preprocessing module, configured to inspect the network traffic sample data and preprocess it, filtering out the incomplete network data packets in the network traffic sample data and deleting the retransmitted network data packets.
The technical solution adopted in the embodiments of the present application further includes a data labeling module, configured to perform sample labeling on the preprocessed network traffic sample data to obtain a network flow data set. The sample labeling is specifically: analyzing the network traffic sample data to find the natural attributes of each application and the IP addresses and transport protocols it uses when communicating with other applications; extracting from the system network logs the IP endpoints and transmitted packet counts associated with each application, determining the category to which the network traffic sample data belongs, and correlating and merging the two sources using each application's IP address and transport protocol to complete the labeling of the network traffic sample data; and finally, using deep packet inspection technology to match feature fingerprints against unknown traffic data to complete the labeling of the unknown traffic data.
The technical solution adopted in the embodiments of the present application further includes: the feature extraction module extracting the global feature data set of the network traffic sample data by means of the deep learning classification algorithm is specifically: inputting the network flow data set; using the degree of correlation between the traffic data contained in the four layers of the TCP/IP protocol, extracting in proportion, in order, the traffic data of the application layer, transport layer, network layer, and data link layer of each network data packet; according to the importance of the data contained in each of the four layers of the TCP/IP protocol, segmenting and extracting in proportion, in order, traffic data of a different size for each layer; assembling the extracted traffic data into a one-dimensional sequence of M bytes, and converting the M bytes into N pixels; converting the N pixels into a grayscale image of standard size to form a new grayscale image data set; and feeding the grayscale image data set into the input layer of a convolutional neural network model and, while continuously and adaptively adjusting the size and number of the convolutional layers and pooling layers, performing convolution operations in a loop to obtain a high-dimensional global feature data set.
Yet another technical solution adopted in the embodiments of the present application is: an electronic device, comprising:
at least one processor; and
a memory communicatively connected to the at least one processor; wherein
the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the following operations of the above deep learning-based network traffic classification method:
Step a: capturing network traffic sample data;
Step b: extracting a global feature data set of the network traffic sample data by means of a deep learning classification algorithm;
Step c: constructing a random forest classification model according to the global feature data set, and outputting a network traffic classification result by means of the random forest classification model.
Compared with the prior art, the beneficial effects of the embodiments of the present application are as follows. The deep learning-based network traffic classification method, system, and electronic device of the embodiments classify traffic using the latent features of each layer of traffic data in the TCP/IP protocol, which improves classification accuracy; at the same time, each layer is mined in depth in proportion to the importance of the data it contains, which preserves the high cohesion of the features of each layer. A random forest classification model is trained on the extracted global features; the results show stable classification performance, the model can handle very high-dimensional traffic data, and no feature selection is needed. Compared with the prior art, the present application can effectively ensure high accuracy and high performance of network traffic classification while improving classification efficiency, shortening training time, and reducing computation overhead.
Brief description of the drawings
FIG. 1 is a flowchart of a network traffic classification method based on deep learning according to an embodiment of the present application;
FIG. 2 is a flowchart of feature extraction by the deep learning classification algorithm according to an embodiment of the present application;
FIG. 3 is a schematic structural diagram of a network traffic classification system based on deep learning according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a hardware device for the network traffic classification method based on deep learning provided by an embodiment of the present application.
Detailed description
To make the purpose, technical solutions, and advantages of the present application clearer, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the present application and are not intended to limit it.
To address the technical problems of existing network traffic classification methods, the deep learning-based network traffic classification method of the embodiments of the present application uses deep learning hidden feature extraction to accurately mine the large set of hidden traffic features in network traffic, so that the traffic feature set is used fully and efficiently during classification and network traffic is classified and identified precisely.
Specifically, refer to FIG. 1, which is a flowchart of the deep learning-based network traffic classification method according to an embodiment of the present application. The method includes the following steps:
Step 100: capture network traffic sample data;
In step 100, capturing the network traffic sample data specifically includes: selecting a large network data center and using Wireshark software to collect all network data packets; at the same time, for the purpose of labeling the data, high-performance network monitoring software is set up to capture continuously, obtaining the system network logs generated by communication between network traffic during the time period corresponding to the network data packets.
Step 200: inspect the network traffic sample data and preprocess it;
In step 200, preprocessing the network traffic sample data specifically includes the following. First, an unstable TCP (Transmission Control Protocol) three-way handshake can break the transmission and produce incomplete network data packets, so incomplete network data packets are filtered out. Second, loss of acknowledgment messages on a TCP connection causes network data packets to be retransmitted, so retransmitted network data packets are deleted.
Step 300: perform sample labeling on the preprocessed network traffic sample data to obtain a network flow data set;
In step 300, sample labeling specifically includes the following. First, the network traffic sample data is analyzed to find the natural attributes of each application and the key information exchanged when it communicates with other applications, including IP addresses, transport protocols, and so on. Second, the IP endpoints and transmitted packet counts associated with each application are extracted from the system network logs, the category to which the network traffic sample data belongs is determined, and the two sources are correlated and merged using each application's IP address and transport protocol to complete the labeling of the network traffic sample data. Finally, DPI (Deep Packet Inspection) technology is used to match feature fingerprints against unknown traffic data to complete the labeling of the unknown traffic data.
Step 400: extract the global feature data set of the network flow data set by means of a deep learning classification algorithm;
In step 400, the embodiment of the present application uses the degree of correlation between the protocol data of each layer of the packets in the network traffic to re-extract and redistribute the data set. Specifically, refer also to FIG. 2, which is a flowchart of global feature extraction by the deep learning classification algorithm according to an embodiment of the present application, and which includes the following steps:
Step 401: input the network flow data set;
Step 402: using the degree of correlation between the traffic data contained in the four layers of the TCP/IP protocol, extract in proportion, in order, the traffic data of the application layer, transport layer, network layer, and data link layer of each network data packet;
Step 403: according to the importance of the data contained in each of the four layers of the TCP/IP protocol, segment and extract in a certain proportion, in order, traffic data of a different size for each layer;
In step 403, the present application mines each layer in depth in proportion to the importance of the data it contains, which preserves the high cohesion of the features of each layer.
Step 404: assemble the extracted traffic data into a one-dimensional sequence of M bytes, and convert the M bytes into N pixels;
Step 405: convert the N pixels into a grayscale image of standard size (X, X, 1) to form a new grayscale image data set;
Step 406: feed the grayscale image data set into the input layer of the convolutional neural network model and, while continuously and adaptively adjusting the size and number of the convolutional layers and pooling layers, perform convolution operations in a loop to obtain a high-dimensional global feature data set;
In step 406, the convolution operation of the convolutional neural network model is specifically as follows. First, a small number of convolution kernels is set in the convolutional layers close to the input layer, and the number of kernels set in the convolutional layers increases as the training loop proceeds. Once the kernel size Y*Y, the kernel count C, and the sliding stride W have been designed, training proceeds automatically. To keep the original image size unchanged after the convolution operation, this embodiment selects a convolution kernel of size 3*3 and a zero padding of 1 (padding with zero values). Feature_map (feature map) size = (wide+2*padding_size-filter_size)/stride+1; the specific sizes can be set according to the actual application.
Step 407: by downsampling, compress the images in the global feature data set without affecting image quality, reducing the number of parameters;
In step 407, the downsampling is specifically as follows: the pooling layer uses MaxPooling (maximum pooling) with a 2*2 window and a stride of 1, and each window is replaced by its largest value, so the image size changes from Feature_map to (Feature_map-2)+1.
Step 408: repeat steps 407 and 408 until a large number of local features has been extracted, and terminate the convolution operation once the set learning rate is satisfied;
Step 409: input the local feature extraction result into the Flatten layer, which outputs a one-dimensional global feature data set.
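The size arithmetic in steps 406 and 407 can be checked with a short sketch. The function names and the example input side of 28 are assumptions for illustration; the two formulas are the ones given in the text, applied with integer division.

```python
def conv_output_size(wide, filter_size, padding_size, stride):
    """Feature_map size = (wide + 2*padding_size - filter_size) / stride + 1."""
    return (wide + 2 * padding_size - filter_size) // stride + 1


def pool_output_size(feature_map, window=2, stride=1):
    """Side length after max pooling with the given window and stride."""
    return (feature_map - window) // stride + 1


side = 28  # hypothetical standard image side X
after_conv = conv_output_size(side, filter_size=3, padding_size=1, stride=1)
after_pool = pool_output_size(after_conv)
print(after_conv, after_pool)  # 28 27
```

A 3*3 kernel with a padding of 1 and a stride of 1 leaves the side length unchanged, while each 2*2 pooling pass with a stride of 1 shrinks it by one.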
Step 500: perform classification training on the extracted global feature data set, construct a random forest classification model, and output the network traffic classification result by means of the random forest classification model.
In step 500, the present application first uses the convolutional neural network to extract the global feature data set and then uses the extracted global feature data set to train the random forest classification model. During training, interactions between features can be detected, which effectively ensures high accuracy and high performance of network traffic classification.
The present application builds the model with the supervised random forest algorithm. From the results given by each decision tree in the forest, it can not only determine the category of known traffic but also decide the category of unknown traffic by voting. Test results show that the random forest classification model of the embodiments of the present application has very high classification accuracy and, at the same time, can improve classification efficiency, shorten training time, and reduce computation overhead.
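The voting step alone can be sketched as follows. This is a minimal illustration, not the application's model: the "trees" are hypothetical hand-written rules standing in for trained decision trees, and the class names and feature values are invented for the example.

```python
from collections import Counter


def forest_predict(trees, features):
    """Return the majority label among the trees and its share of the votes."""
    votes = Counter(tree(features) for tree in trees)
    label, count = votes.most_common(1)[0]
    return label, count / len(trees)


# Hypothetical stand-ins for trained decision trees.
trees = [
    lambda x: "video" if x[0] > 0.5 else "web",
    lambda x: "video" if x[1] > 0.2 else "chat",
    lambda x: "web",
]
label, share = forest_predict(trees, [0.9, 0.3])
```

In practice the trees would be trained on the flattened global feature vectors from step 409, and the vote share can serve as a rough confidence for traffic whose class is not known in advance.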
Please refer to FIG. 3, which is a schematic structural diagram of the deep-learning-based network traffic classification system according to an embodiment of the present application. The system includes a data acquisition module, a data preprocessing module, a data labeling module, a feature extraction module, a classification model construction module, and a result output module.
Data acquisition module: used to capture network traffic sample data. Capturing network traffic sample data specifically includes: selecting a large network data center and collecting all network data packets with Wireshark; at the same time, in order to label the data, setting up high-performance network monitoring software for continuous capture, so as to obtain the system network logs generated by the communication between network traffic during the time period corresponding to the network data packets.
Data preprocessing module: used to inspect the network traffic sample data and preprocess it. Preprocessing specifically includes: first, filtering out incomplete network data packets, which arise when an unstable TCP (Transmission Control Protocol) three-way handshake causes the transmission to disconnect; second, deleting retransmitted network data packets, which arise when acknowledgement messages are lost on a TCP connection.
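The two preprocessing rules can be sketched as follows. This is only an illustration under stated assumptions: packets are represented as plain dictionaries with hypothetical keys (`src`, `sport`, `dst`, `dport`, `flags`, `seq`, `length`), a connection counts as complete if SYN, SYN-ACK, and ACK appear in order, and a packet counts as a retransmission if the same sender repeats the same sequence number and payload length. The application does not specify these criteria.

```python
def flow_key(pkt):
    """Direction-independent connection key: the two endpoints, sorted."""
    a = (pkt["src"], pkt["sport"])
    b = (pkt["dst"], pkt["dport"])
    return tuple(sorted((a, b)))


def has_full_handshake(packets):
    """True if SYN, SYN-ACK and a following ACK appear in that order."""
    stage = 0
    for p in packets:
        flags = p["flags"]
        if stage == 0 and flags == {"SYN"}:
            stage = 1
        elif stage == 1 and flags == {"SYN", "ACK"}:
            stage = 2
        elif stage == 2 and "ACK" in flags and "SYN" not in flags:
            return True
    return False


def preprocess(packets):
    """Drop packets of incomplete connections and repeated data segments."""
    flows = {}
    for p in packets:
        flows.setdefault(flow_key(p), []).append(p)
    cleaned = []
    for pkts in flows.values():
        if not has_full_handshake(pkts):
            continue  # incomplete connection: discard all of its packets
        seen = set()
        for p in pkts:
            ident = (p["src"], p["sport"], p["seq"], p["length"])
            if p["length"] > 0 and ident in seen:
                continue  # same sender, sequence number and size: retransmission
            seen.add(ident)
            cleaned.append(p)
    return cleaned


def make_pkt(src, sport, dst, dport, flags, seq, length):
    """Helper that builds one hypothetical packet record."""
    return {"src": src, "sport": sport, "dst": dst, "dport": dport,
            "flags": set(flags), "seq": seq, "length": length}


A, B, C = "10.0.0.1", "10.0.0.2", "10.0.0.3"
sample = [
    make_pkt(A, 1234, B, 80, ["SYN"], 0, 0),
    make_pkt(B, 80, A, 1234, ["SYN", "ACK"], 0, 0),
    make_pkt(A, 1234, B, 80, ["ACK"], 1, 0),
    make_pkt(A, 1234, B, 80, ["ACK", "PSH"], 1, 100),
    make_pkt(A, 1234, B, 80, ["ACK", "PSH"], 1, 100),  # retransmitted segment
    make_pkt(C, 5555, B, 80, ["SYN"], 0, 0),           # handshake never completes
]
cleaned = preprocess(sample)
```

In this sample, the retransmitted data segment and the lone SYN from the third host are removed, leaving four packets.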
Data labeling module: used to label the preprocessed network traffic sample data to obtain a network flow data set. Sample labeling specifically includes: first, analyzing the network traffic sample data to find the natural attributes of each application and the key information in its communication with other applications, including IP addresses, transport protocols, and the like; second, extracting from the system network logs the IP endpoints and the number of transmitted packets associated with each application, determining the class to which the network traffic sample data belongs, and associating and merging this with each application's IP address and transport protocol to complete the labeling of the network traffic sample data; finally, using DPI (Deep Packet Inspection) technology to match feature fingerprints against unknown traffic data to complete the labeling of the unknown traffic data.
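A minimal sketch of this labeling logic is given below. It assumes the system network log has already been reduced to records of the form (IP endpoint, transport protocol, application) and that DPI fingerprints are simple byte patterns; the record layout, the function name `label_flows`, and the sample values are hypothetical. The packet-count matching mentioned above is omitted for brevity.

```python
def label_flows(flows, log_entries, fingerprints):
    """Attach an application label to each flow record.

    flows:        dicts with src_ip, dst_ip, protocol, payload (hypothetical layout)
    log_entries:  dicts with ip, protocol, app, derived from the system network log
    fingerprints: (byte pattern, app) pairs used as a DPI-style fallback
    """
    # Index the log by (endpoint IP, transport protocol).
    log_index = {(e["ip"], e["protocol"]): e["app"] for e in log_entries}
    labeled = []
    for f in flows:
        label = (log_index.get((f["dst_ip"], f["protocol"]))
                 or log_index.get((f["src_ip"], f["protocol"])))
        if label is None:
            # No log match: try fingerprint matching on the payload bytes.
            for pattern, app in fingerprints:
                if pattern in f["payload"]:
                    label = app
                    break
        labeled.append({**f, "label": label or "unknown"})
    return labeled


log_entries = [{"ip": "10.0.0.5", "protocol": "TCP", "app": "web"}]
fingerprints = [(b"BitTorrent protocol", "p2p")]
flows = [
    {"src_ip": "10.0.0.9", "dst_ip": "10.0.0.5", "protocol": "TCP",
     "payload": b"GET / HTTP/1.1"},
    {"src_ip": "10.0.0.9", "dst_ip": "10.0.0.7", "protocol": "TCP",
     "payload": b"\x13BitTorrent protocol"},
    {"src_ip": "10.0.0.9", "dst_ip": "10.0.0.8", "protocol": "UDP",
     "payload": b"\x00\x01"},
]
labels = [f["label"] for f in label_flows(flows, log_entries, fingerprints)]
```

The first flow is labeled from the log, the second by fingerprint, and the third stays unlabeled because neither source matches.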
Feature extraction module: used to extract the global feature data set of the network flow data set through a deep learning classification algorithm. The embodiment of the present application uses the degree of correlation among the protocol data of each layer of the packets in the network traffic to re-extract and redistribute the data set. Specifically, the global feature data set is extracted as follows:
1. Input the network flow data set;
2. Using the degree of correlation among the traffic data contained in the four layers of the TCP/IP protocol, extract in turn, in proportion, the traffic data of the application layer, transport layer, network layer, and data link layer of each network data packet;
3. According to the importance of the data contained in each of the four TCP/IP layers, split and extract in turn, in a certain proportion, traffic data of a different size from each layer;
4. Assemble the extracted traffic data into a one-dimensional sequence of M bytes and convert the M bytes into N pixels;
5. Convert the N pixels into a grayscale image of standard size (X, X, 1) to form a new grayscale image data set;
6. Feed the grayscale image data set into the input layer of the convolutional neural network model and, while continuously and adaptively adjusting the size and number of the convolutional and pooling layers, perform the convolution operation in a loop to obtain a high-dimensional global feature data set. Specifically: first, a small number of convolution kernels is set in the convolutional layers close to the input layer, and the number of kernels per convolutional layer increases as the training loop proceeds. Once the kernel size Y*Y, the number of kernels C, and the sliding stride W are designed, training can proceed automatically. To keep the image size unchanged after convolution, the embodiment of the present application uses a 3*3 convolution kernel with zero padding of 1; the feature map size is Feature_map = (width + 2*padding_size - filter_size)/stride + 1, which equals width when the stride is 1. The specific sizes can be set according to the actual application.
7. Compress the images in the global feature data set by downsampling, reducing the number of parameters without affecting image quality. Specifically: the pooling layer uses MaxPooling with a 2*2 window and a stride of 1, keeping the maximum value in each window, so the image size changes from Feature_map to (Feature_map - 2)/1 + 1 = Feature_map - 1.
8. Repeat the convolution and downsampling operations until a large number of local features has been extracted, and terminate the convolution operations once the set learning rate is satisfied;
9. Input the local feature extraction results into the Flatten layer, which outputs a one-dimensional global feature data set.
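Steps 2 to 5 and the size formulas of steps 6 and 7 can be sketched in plain Python. The per-layer byte quotas, the 8*8 image size, and the one-byte-per-pixel mapping are assumptions chosen for illustration; the application leaves the proportions, M, N, and X open. All function names are hypothetical.

```python
LAYER_ORDER = ("application", "transport", "network", "link")


def take_layer_bytes(layers, quotas):
    """Concatenate a fixed quota of bytes per protocol layer, zero-padded."""
    out = bytearray()
    for name in LAYER_ORDER:
        quota = quotas[name]
        chunk = layers.get(name, b"")[:quota]
        out += chunk + bytes(quota - len(chunk))  # pad short layers with zeros
    return bytes(out)


def to_gray_image(data, side):
    """Map side*side bytes to a side x side grid of 0..255 pixel values."""
    padded = data[:side * side].ljust(side * side, b"\x00")
    return [list(padded[r * side:(r + 1) * side]) for r in range(side)]


def conv_output_size(width, kernel, padding, stride):
    """Feature map width after a convolution (integer division)."""
    return (width + 2 * padding - kernel) // stride + 1


def pool_output_size(width, window, stride):
    """Feature map width after pooling without padding."""
    return (width - window) // stride + 1


# Assumed quotas that favour the application layer; they sum to 8 * 8 = 64.
quotas = {"application": 32, "transport": 16, "network": 12, "link": 4}
layers = {
    "application": b"GET / HTTP/1.1\r\n",
    "transport": bytes(range(20)),
    "network": bytes(20),
    "link": b"\xaa\xbb\xcc\xdd\xee\xff",
}
flat = take_layer_bytes(layers, quotas)
image = to_gray_image(flat, 8)

same_size = conv_output_size(28, kernel=3, padding=1, stride=1)
after_pool = pool_output_size(28, window=2, stride=1)
```

With a 3*3 kernel, padding of 1, and stride of 1, a 28-pixel-wide input stays 28 pixels wide; a 2*2 pooling window with stride 1 then reduces it to 27. A pooling stride of 2 would instead halve the width, which is the more common way to reduce parameters.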
Classification model construction module: used to perform classification training on the extracted global feature data set to build a random forest classification model. The present application first uses a convolutional neural network to extract the global feature data set and then uses the extracted global feature data set to train the random forest classification model. During training, the model can detect interactions between features, which helps ensure high accuracy and high performance in network traffic classification.
Result output module: used to output the network traffic classification result.
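The way the modules hand data to one another can be sketched as a simple pipeline. This is an illustrative wiring only: the class name, the field names, and the stub callables in the example are hypothetical, and the `classify` field stands in for a random forest model that has already been trained.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class TrafficClassificationSystem:
    """Chains the modules described above; every stage is pluggable."""
    acquire: Callable[[], List[Any]]
    preprocess: Callable[[List[Any]], List[Any]]
    label: Callable[[List[Any]], List[Any]]
    extract_features: Callable[[List[Any]], List[List[float]]]
    classify: Callable[[List[List[float]]], List[str]]

    def run(self) -> List[str]:
        packets = self.acquire()                  # data acquisition module
        cleaned = self.preprocess(packets)        # data preprocessing module
        flows = self.label(cleaned)               # data labeling module
        features = self.extract_features(flows)   # feature extraction module
        return self.classify(features)            # trained model + result output


# Stub stages, for illustration only.
system = TrafficClassificationSystem(
    acquire=lambda: [b"\x01\x02", b"", b"\xff\xfe"],
    preprocess=lambda pkts: [p for p in pkts if p],
    label=lambda pkts: pkts,
    extract_features=lambda flows: [[b / 255 for b in f] for f in flows],
    classify=lambda feats: ["high" if sum(v) > 1 else "low" for v in feats],
)
results = system.run()
```

Keeping each stage behind a callable makes it possible to swap, for example, the feature extractor without touching the other modules.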
FIG. 4 is a schematic structural diagram of the hardware device for the deep-learning-based network traffic classification method provided by an embodiment of the present application. As shown in FIG. 4, the device includes one or more processors and a memory. Taking one processor as an example, the device may further include an input system and an output system.
The processor, memory, input system, and output system may be connected by a bus or in other ways; FIG. 4 takes connection by a bus as an example.
As a non-transitory computer-readable storage medium, the memory can be used to store non-transitory software programs, non-transitory computer-executable programs, and modules. By running the non-transitory software programs, instructions, and modules stored in the memory, the processor executes the various functional applications and data processing of the electronic device, that is, it implements the processing method of the foregoing method embodiments.
The memory may include a program storage area and a data storage area, where the program storage area may store an operating system and the application programs required by at least one function, and the data storage area may store data and the like. In addition, the memory may include high-speed random access memory and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some embodiments, the memory optionally includes memories located remotely from the processor, and these remote memories may be connected to the processing system via a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input system can receive input numeric or character information and generate signal inputs. The output system may include a display device such as a display screen.
The one or more modules are stored in the memory and, when executed by the one or more processors, perform the following operations of any of the foregoing method embodiments:
Step a: capture network traffic sample data;
Step b: extract a global feature data set of the network traffic sample data through a deep learning classification algorithm;
Step c: build a random forest classification model from the global feature data set, and output the network traffic classification result through the random forest classification model.
The above product can execute the method provided by the embodiments of the present application and has the functional modules and beneficial effects corresponding to the method. For technical details not described in detail in this embodiment, refer to the method provided by the embodiments of the present application.
An embodiment of the present application provides a non-transitory (non-volatile) computer storage medium that stores computer-executable instructions, and the computer-executable instructions can perform the following operations:
Step a: capture network traffic sample data;
Step b: extract a global feature data set of the network traffic sample data through a deep learning classification algorithm;
Step c: build a random forest classification model from the global feature data set, and output the network traffic classification result through the random forest classification model.
An embodiment of the present application provides a computer program product. The computer program product includes a computer program stored on a non-transitory computer-readable storage medium, and the computer program includes program instructions that, when executed by a computer, cause the computer to perform the following operations:
Step a: capture network traffic sample data;
Step b: extract a global feature data set of the network traffic sample data through a deep learning classification algorithm;
Step c: build a random forest classification model from the global feature data set, and output the network traffic classification result through the random forest classification model.
The deep-learning-based network traffic classification method, system, and electronic device of the embodiments of the present application classify traffic using the latent features of the traffic data in each layer of the TCP/IP protocol, which improves classification accuracy; at the same time, the data in each layer is mined in depth in proportion to its importance, which ensures high cohesion of the features of each layer. A random forest classification model trained on the extracted global features shows stable classification performance, can handle very high-dimensional traffic data, and requires no feature selection. Compared with the prior art, the present application can effectively ensure high accuracy and high performance in network traffic classification while improving classification efficiency, shortening training time, and reducing computational overhead.
The above are only preferred embodiments of the present invention. It should be noted that those of ordinary skill in the art can make several improvements and refinements without departing from the principles of the present invention, and these improvements and refinements shall also be regarded as falling within the scope of protection of the present invention.

Claims (11)

  1. A network traffic classification method based on deep learning, characterized by comprising the following steps:
    Step a: capturing network traffic sample data;
    Step b: extracting a global feature data set of the network traffic sample data through a deep learning classification algorithm;
    Step c: building a random forest classification model from the global feature data set, and outputting a network traffic classification result through the random forest classification model.
  2. The network traffic classification method based on deep learning according to claim 1, characterized in that, in step a, capturing the network traffic sample data specifically includes: selecting a network data center and collecting all network data packets; and, at the same time, obtaining the system network logs generated by the communication between network traffic during the time period corresponding to the network data packets.
  3. The network traffic classification method based on deep learning according to claim 2, characterized in that, in step a, capturing the network traffic sample data further includes: inspecting the network traffic sample data, preprocessing the network traffic sample data, filtering out incomplete network data packets from the network traffic sample data, and deleting retransmitted network data packets.
  4. The network traffic classification method based on deep learning according to claim 3, characterized in that, in step a, capturing the network traffic sample data further includes: labeling the preprocessed network traffic sample data to obtain a network flow data set; the sample labeling is specifically: analyzing the network traffic sample data to find the natural attributes of each application and the IP addresses and transport protocols used in its communication with other applications; extracting from the system network logs the IP endpoints and the number of transmitted packets associated with each application, determining the class to which the network traffic sample data belongs, and associating and merging this with each application's IP address and transport protocol to complete the labeling of the network traffic sample data; and finally, using deep packet inspection technology to match feature fingerprints against unknown traffic data to complete the labeling of the unknown traffic data.
  5. The network traffic classification method based on deep learning according to claim 4, characterized in that, in step b, extracting the global feature data set of the network traffic sample data through the deep learning classification algorithm specifically includes:
    Step b1: inputting the network flow data set;
    Step b2: using the degree of correlation among the traffic data contained in the four layers of the TCP/IP protocol, extracting in turn, in proportion, the traffic data of the application layer, transport layer, network layer, and data link layer of each network data packet;
    Step b3: according to the importance of the data contained in each of the four TCP/IP layers, splitting and extracting in turn, in proportion, traffic data of a different size from each layer;
    Step b4: assembling the extracted traffic data into a one-dimensional sequence of M bytes, and converting the M bytes into N pixels;
    Step b5: converting the N pixels into a grayscale image of standard size to form a new grayscale image data set;
    Step b6: feeding the grayscale image data set into the input layer of a convolutional neural network model and, while continuously and adaptively adjusting the size and number of the convolutional and pooling layers, performing the convolution operation in a loop to obtain a high-dimensional global feature data set.
  6. A network traffic classification system based on deep learning, characterized by comprising:
    a data acquisition module, used to capture network traffic sample data;
    a feature extraction module, used to extract a global feature data set of the network traffic sample data through a deep learning classification algorithm;
    a classification model construction module, used to build a random forest classification model from the global feature data set;
    a result output module, used to output a network traffic classification result.
  7. The network traffic classification system based on deep learning according to claim 6, characterized in that the capturing of network traffic sample data by the data acquisition module specifically includes: selecting a network data center and collecting all network data packets; and, at the same time, obtaining the system network logs generated by the communication between network traffic during the time period corresponding to the network data packets.
  8. The network traffic classification system based on deep learning according to claim 7, characterized by further comprising a data preprocessing module, the data preprocessing module being used to inspect the network traffic sample data, preprocess the network traffic sample data, filter out incomplete network data packets from the network traffic sample data, and delete retransmitted network data packets.
  9. The network traffic classification system based on deep learning according to claim 8, characterized by further comprising a data labeling module, the data labeling module being used to label the preprocessed network traffic sample data to obtain a network flow data set; the sample labeling is specifically: analyzing the network traffic sample data to find the natural attributes of each application and the IP addresses and transport protocols used in its communication with other applications; extracting from the system network logs the IP endpoints and the number of transmitted packets associated with each application, determining the class to which the network traffic sample data belongs, and associating and merging this with each application's IP address and transport protocol to complete the labeling of the network traffic sample data; and finally, using deep packet inspection technology to match feature fingerprints against unknown traffic data to complete the labeling of the unknown traffic data.
  10. The network traffic classification system based on deep learning according to claim 9, characterized in that the extraction of the global feature data set of the network traffic sample data by the feature extraction module through the deep learning classification algorithm is specifically: inputting the network flow data set; using the degree of correlation among the traffic data contained in the four layers of the TCP/IP protocol, extracting in turn, in proportion, the traffic data of the application layer, transport layer, network layer, and data link layer of each network data packet; according to the importance of the data contained in each of the four TCP/IP layers, splitting and extracting in turn, in proportion, traffic data of a different size from each layer; assembling the extracted traffic data into a one-dimensional sequence of M bytes, and converting the M bytes into N pixels; converting the N pixels into a grayscale image of standard size to form a new grayscale image data set; and feeding the grayscale image data set into the input layer of a convolutional neural network model and, while continuously and adaptively adjusting the size and number of the convolutional and pooling layers, performing the convolution operation in a loop to obtain a high-dimensional global feature data set.
  11. An electronic device, comprising:
    at least one processor; and
    a memory communicatively connected to the at least one processor; wherein
    the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can perform the following operations of the network traffic classification method based on deep learning according to any one of claims 1 to 5:
    Step a: capturing network traffic sample data;
    Step b: extracting a global feature data set of the network traffic sample data through a deep learning classification algorithm;
    Step c: building a random forest classification model from the global feature data set, and outputting a network traffic classification result through the random forest classification model.
PCT/CN2019/122001 2018-12-11 2019-11-29 Network traffic classification method and system based on deep learning, and electronic device WO2020119481A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811507380.2 2018-12-11
CN201811507380.2A CN109639481B (en) 2018-12-11 2018-12-11 Deep learning-based network traffic classification method and system and electronic equipment

Publications (1)

Publication Number Publication Date
WO2020119481A1 (en) 2020-06-18

Family

ID=66072697

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/122001 WO2020119481A1 (en) 2018-12-11 2019-11-29 Network traffic classification method and system based on deep learning, and electronic device

Country Status (2)

Country Link
CN (1) CN109639481B (en)
WO (1) WO2020119481A1 (en)

Cited By (34)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111817982A (en) * 2020-07-27 2020-10-23 南京信息工程大学 Encrypted flow identification method for category imbalance
CN111860628A (en) * 2020-07-08 2020-10-30 上海乘安科技集团有限公司 Deep learning-based traffic identification and feature extraction method
CN112187664A (en) * 2020-09-23 2021-01-05 东南大学 Application flow automatic classification method based on semi-supervised learning
CN112235264A (en) * 2020-09-28 2021-01-15 国家计算机网络与信息安全管理中心 Network traffic identification method and device based on deep migration learning
CN112235314A (en) * 2020-10-29 2021-01-15 东巽科技(北京)有限公司 Network flow detection method, device and equipment
CN112364878A (en) * 2020-09-25 2021-02-12 江苏师范大学 Power line classification method based on deep learning under complex background
CN112468509A (en) * 2020-12-09 2021-03-09 湖北松颢科技有限公司 Deep learning technology-based automatic flow data detection method and device
CN112615713A (en) * 2020-12-22 2021-04-06 东软集团股份有限公司 Detection method and device of hidden channel, readable storage medium and electronic equipment
CN112651435A (en) * 2020-12-22 2021-04-13 中国南方电网有限责任公司 Self-learning-based detection method for flow abnormity of power network probe
CN113124949A (en) * 2021-04-06 2021-07-16 深圳市联恒星科技有限公司 Multiphase flow detection method and system
CN113177209A (en) * 2021-04-19 2021-07-27 北京邮电大学 Encryption traffic classification method based on deep learning and related equipment
CN113256507A (en) * 2021-04-01 2021-08-13 南京信息工程大学 Attention enhancement method for generating image aiming at binary flux data
CN113660273A (en) * 2021-08-18 2021-11-16 国家电网公司东北分部 Intrusion detection method and device based on deep learning under super-fusion framework
CN113783795A (en) * 2021-07-19 2021-12-10 北京邮电大学 Encrypted flow classification method and related equipment
CN113872939A (en) * 2021-08-30 2021-12-31 济南浪潮数据技术有限公司 Flow detection method, device and storage medium
CN113949653A (en) * 2021-10-18 2022-01-18 中铁二院工程集团有限责任公司 Encryption protocol identification method and system based on deep learning
CN113965524A (en) * 2021-09-29 2022-01-21 河海大学 Network flow classification method and flow control system based on same
CN114338437A (en) * 2022-01-13 2022-04-12 北京邮电大学 Network traffic classification method and device, electronic equipment and storage medium
CN114500387A (en) * 2022-02-14 2022-05-13 重庆邮电大学 Mobile application traffic identification method and system based on machine learning
CN114553790A (en) * 2022-03-12 2022-05-27 北京工业大学 Multi-mode feature-based small sample learning Internet of things traffic classification method and system
CN114615007A (en) * 2022-01-13 2022-06-10 中国科学院信息工程研究所 Tunnel mixed flow classification method and system based on random forest
CN114765634A (en) * 2021-01-13 2022-07-19 腾讯科技(深圳)有限公司 Network protocol identification method and device, electronic equipment and readable storage medium
CN114915575A (en) * 2022-06-02 2022-08-16 电子科技大学 Network flow detection device based on artificial intelligence
CN115065560A (en) * 2022-08-16 2022-09-16 国网智能电网研究院有限公司 Data interaction leakage-prevention detection method and device based on service time sequence characteristic analysis
CN115134168A (en) * 2022-08-29 2022-09-30 成都盛思睿信息技术有限公司 Method and system for detecting cloud platform hidden channel based on convolutional neural network
CN115150840A (en) * 2022-05-18 2022-10-04 西安交通大学 Mobile network flow prediction method based on deep learning
CN115242496A (en) * 2022-07-20 2022-10-25 安徽工业大学 Tor encrypted traffic application behavior classification method and device based on residual error network
CN115277113A (en) * 2022-07-06 2022-11-01 国网山西省电力公司信息通信分公司 Power grid network intrusion event detection and identification method based on ensemble learning
CN115514720A (en) * 2022-09-19 2022-12-23 华东师范大学 Programmable data plane-oriented user activity classification method and application
CN114884704B (en) * 2022-04-21 2023-03-10 中国科学院信息工程研究所 Network traffic abnormal behavior detection method and system based on involution and voting
CN115993831A (en) * 2023-03-23 2023-04-21 安徽大学 Method for planning path of robot non-target network based on deep reinforcement learning
CN116599779A (en) * 2023-07-19 2023-08-15 中国电信股份有限公司江西分公司 IPv6 cloud conversion method for improving network security performance
CN116842459A (en) * 2023-09-01 2023-10-03 国网信息通信产业集团有限公司 Electric energy metering fault diagnosis method and diagnosis terminal based on small sample learning
CN116915512A (en) * 2023-09-14 2023-10-20 国网江苏省电力有限公司常州供电分公司 Method and device for detecting communication flow in power grid

Families Citing this family (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109639481B (en) * 2018-12-11 2020-10-27 深圳先进技术研究院 Deep learning-based network traffic classification method and system and electronic equipment
CN110012029B (en) * 2019-04-22 2020-05-26 中国科学院声学研究所 Method and system for distinguishing encrypted and non-encrypted compressed flow
CN110048962A (en) * 2019-04-24 2019-07-23 广东工业大学 A kind of method of net flow assorted, system and equipment
CN110097120B (en) * 2019-04-30 2022-08-26 南京邮电大学 Network flow data classification method, equipment and computer storage medium
CN110311829B (en) * 2019-05-24 2021-03-16 西安电子科技大学 Network traffic classification method based on machine learning acceleration
CN110225009B (en) * 2019-05-27 2020-06-05 四川大学 Proxy user detection method based on communication behavior portrait
CN110896381B (en) * 2019-11-25 2021-10-29 中国科学院深圳先进技术研究院 Deep neural network-based traffic classification method and system and electronic equipment
CN111131069B (en) * 2019-11-25 2021-06-08 北京理工大学 Abnormal encryption flow detection and classification method based on deep learning strategy
CN111224892B (en) * 2019-12-26 2023-08-01 中国人民解放军国防科技大学 Flow classification method and system based on FPGA random forest model
CN111917600A (en) * 2020-06-12 2020-11-10 贵州大学 Spark performance optimization-based network traffic classification device and classification method
CN112200256A (en) * 2020-10-16 2021-01-08 鹏城实验室 Sketch network measuring method based on deep learning and electronic equipment
CN112511384B (en) * 2020-11-26 2022-09-02 广州品唯软件有限公司 Flow data processing method and device, computer equipment and storage medium
CN112580708B (en) * 2020-12-10 2024-03-05 上海阅维科技股份有限公司 Method for identifying internet surfing behavior from encrypted traffic generated by application program
CN112804253B (en) * 2021-02-04 2022-07-12 湖南大学 Network flow classification detection method, system and storage medium
CN115514686A (en) * 2021-06-23 2022-12-23 深信服科技股份有限公司 Flow acquisition method and device, electronic equipment and storage medium
CN113591950A (en) * 2021-07-19 2021-11-02 中国海洋大学 Random forest network traffic classification method, system and storage medium
CN115296919B (en) * 2022-08-15 2023-04-25 江西师范大学 Method and system for calculating special traffic packet by edge gateway
CN116051883A (en) * 2022-12-09 2023-05-02 哈尔滨理工大学 Network traffic classification method based on CNN-converter hybrid architecture

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150295805A1 (en) * 2013-04-15 2015-10-15 International Business Machines Corporation Identification and classification of web traffic inside encrypted network tunnels
CN105141455A (en) * 2015-08-24 2015-12-09 西南大学 Noisy network traffic classification modeling method based on statistical characteristics
CN108900432A (en) * 2018-07-05 2018-11-27 中山大学 Content awareness method based on network flow behavior
CN109639481A (en) * 2018-12-11 2019-04-16 深圳先进技术研究院 Network traffic classification method and system based on deep learning, and electronic device

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104601486A (en) * 2013-10-30 2015-05-06 阿里巴巴集团控股有限公司 Method and device for shunt of network flow
US20160283859A1 (en) * 2015-03-25 2016-09-29 Cisco Technology, Inc. Network traffic classification
CN106096411B (en) * 2016-06-08 2018-09-18 浙江工业大学 Android malicious code family classification method based on bytecode image clustering
CN108021940B (en) * 2017-11-30 2023-04-18 中国银联股份有限公司 Data classification method and system based on machine learning

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
LEI ZANG ET AL: "Application of Machine Learning in Cyberspace Security Research", CHINESE JOURNAL OF COMPUTERS, vol. 41, no. 9, 30 September 2018 (2018-09-30), pages 1 - 35, XP055712299 *
SHUANG ZHAO ET AL: "Review: Traffic Identification Based on Machine Learning", COMPUTER ENGINEERING & SCIENCE, vol. 40, no. 10, 25 October 2018 (2018-10-25), pages 1746 - 1756, XP055712306, ISSN: 1007-130X, DOI: 10.3969/j.issn.1007-130X.2018.10.005 *

Cited By (49)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111860628A (en) * 2020-07-08 2020-10-30 上海乘安科技集团有限公司 Deep learning-based traffic identification and feature extraction method
CN111817982A (en) * 2020-07-27 2020-10-23 南京信息工程大学 Encrypted flow identification method for category imbalance
CN112187664A (en) * 2020-09-23 2021-01-05 东南大学 Application flow automatic classification method based on semi-supervised learning
CN112364878A (en) * 2020-09-25 2021-02-12 江苏师范大学 Power line classification method based on deep learning under complex background
CN112235264A (en) * 2020-09-28 2021-01-15 国家计算机网络与信息安全管理中心 Network traffic identification method and device based on deep migration learning
CN112235314A (en) * 2020-10-29 2021-01-15 东巽科技(北京)有限公司 Network flow detection method, device and equipment
CN112468509A (en) * 2020-12-09 2021-03-09 湖北松颢科技有限公司 Deep learning technology-based automatic flow data detection method and device
CN112615713A (en) * 2020-12-22 2021-04-06 东软集团股份有限公司 Detection method and device of hidden channel, readable storage medium and electronic equipment
CN112651435A (en) * 2020-12-22 2021-04-13 中国南方电网有限责任公司 Self-learning-based detection method for flow abnormity of power network probe
CN112615713B (en) * 2020-12-22 2024-02-23 东软集团股份有限公司 Method and device for detecting hidden channel, readable storage medium and electronic equipment
CN112651435B (en) * 2020-12-22 2022-12-20 中国南方电网有限责任公司 Self-learning-based power network probe flow abnormity detection method
CN114765634A (en) * 2021-01-13 2022-07-19 腾讯科技(深圳)有限公司 Network protocol identification method and device, electronic equipment and readable storage medium
CN114765634B (en) * 2021-01-13 2023-12-12 腾讯科技(深圳)有限公司 Network protocol identification method, device, electronic equipment and readable storage medium
CN113256507B (en) * 2021-04-01 2023-11-21 南京信息工程大学 Attention enhancement method for generating image aiming at binary flow data
CN113256507A (en) * 2021-04-01 2021-08-13 南京信息工程大学 Attention enhancement method for generating image aiming at binary flux data
CN113124949A (en) * 2021-04-06 2021-07-16 深圳市联恒星科技有限公司 Multiphase flow detection method and system
CN113177209A (en) * 2021-04-19 2021-07-27 北京邮电大学 Encryption traffic classification method based on deep learning and related equipment
CN113783795B (en) * 2021-07-19 2023-07-25 北京邮电大学 Encryption traffic classification method and related equipment
CN113783795A (en) * 2021-07-19 2021-12-10 北京邮电大学 Encrypted flow classification method and related equipment
CN113660273A (en) * 2021-08-18 2021-11-16 国家电网公司东北分部 Intrusion detection method and device based on deep learning under super-fusion framework
CN113872939A (en) * 2021-08-30 2021-12-31 济南浪潮数据技术有限公司 Flow detection method, device and storage medium
CN113965524A (en) * 2021-09-29 2022-01-21 河海大学 Network flow classification method and flow control system based on same
CN113949653A (en) * 2021-10-18 2022-01-18 中铁二院工程集团有限责任公司 Encryption protocol identification method and system based on deep learning
CN114338437A (en) * 2022-01-13 2022-04-12 北京邮电大学 Network traffic classification method and device, electronic equipment and storage medium
CN114338437B (en) * 2022-01-13 2023-12-29 北京邮电大学 Network traffic classification method and device, electronic equipment and storage medium
CN114615007A (en) * 2022-01-13 2022-06-10 中国科学院信息工程研究所 Tunnel mixed flow classification method and system based on random forest
CN114615007B (en) * 2022-01-13 2023-05-23 中国科学院信息工程研究所 Tunnel mixed flow classification method and system based on random forest
CN114500387A (en) * 2022-02-14 2022-05-13 重庆邮电大学 Mobile application traffic identification method and system based on machine learning
CN114553790A (en) * 2022-03-12 2022-05-27 北京工业大学 Multi-mode feature-based small sample learning Internet of things traffic classification method and system
CN114884704B (en) * 2022-04-21 2023-03-10 中国科学院信息工程研究所 Network traffic abnormal behavior detection method and system based on involution and voting
CN115150840B (en) * 2022-05-18 2024-03-12 西安交通大学 Mobile network flow prediction method based on deep learning
CN115150840A (en) * 2022-05-18 2022-10-04 西安交通大学 Mobile network flow prediction method based on deep learning
CN114915575B (en) * 2022-06-02 2023-04-07 电子科技大学 Network flow detection device based on artificial intelligence
CN114915575A (en) * 2022-06-02 2022-08-16 电子科技大学 Network flow detection device based on artificial intelligence
CN115277113A (en) * 2022-07-06 2022-11-01 国网山西省电力公司信息通信分公司 Power grid network intrusion event detection and identification method based on ensemble learning
CN115242496B (en) * 2022-07-20 2024-04-16 安徽工业大学 Method and device for classifying Tor encrypted traffic application behaviors based on residual network
CN115242496A (en) * 2022-07-20 2022-10-25 安徽工业大学 Tor encrypted traffic application behavior classification method and device based on residual network
CN115065560A (en) * 2022-08-16 2022-09-16 国网智能电网研究院有限公司 Data interaction leakage-prevention detection method and device based on service time sequence characteristic analysis
CN115134168A (en) * 2022-08-29 2022-09-30 成都盛思睿信息技术有限公司 Method and system for detecting cloud platform hidden channel based on convolutional neural network
CN115514720B (en) * 2022-09-19 2023-09-19 华东师范大学 User activity classification method and application for programmable data plane
CN115514720A (en) * 2022-09-19 2022-12-23 华东师范大学 Programmable data plane-oriented user activity classification method and application
CN115993831B (en) * 2023-03-23 2023-06-09 安徽大学 Method for planning path of robot non-target network based on deep reinforcement learning
CN115993831A (en) * 2023-03-23 2023-04-21 安徽大学 Method for planning path of robot non-target network based on deep reinforcement learning
CN116599779B (en) * 2023-07-19 2023-10-27 中国电信股份有限公司江西分公司 IPv6 cloud conversion method for improving network security performance
CN116599779A (en) * 2023-07-19 2023-08-15 中国电信股份有限公司江西分公司 IPv6 cloud conversion method for improving network security performance
CN116842459B (en) * 2023-09-01 2023-11-21 国网信息通信产业集团有限公司 Electric energy metering fault diagnosis method and diagnosis terminal based on small sample learning
CN116842459A (en) * 2023-09-01 2023-10-03 国网信息通信产业集团有限公司 Electric energy metering fault diagnosis method and diagnosis terminal based on small sample learning
CN116915512A (en) * 2023-09-14 2023-10-20 国网江苏省电力有限公司常州供电分公司 Method and device for detecting communication flow in power grid
CN116915512B (en) * 2023-09-14 2023-12-01 国网江苏省电力有限公司常州供电分公司 Method and device for detecting communication flow in power grid

Also Published As

Publication number Publication date
CN109639481B (en) 2020-10-27
CN109639481A (en) 2019-04-16

Similar Documents

Publication Publication Date Title
WO2020119481A1 (en) Network traffic classification method and system based on deep learning, and electronic device
Ravanbakhsh et al. Plug-and-play CNN for crowd motion analysis: An application in abnormal event detection
CN110896381B (en) Deep neural network-based traffic classification method and system and electronic equipment
WO2020062390A1 (en) Network traffic classification method and system, and electronic device
CN113162908B (en) Encrypted flow detection method and system based on deep learning
WO2019015684A1 (en) Facial image reduplication removing method and apparatus, electronic device, storage medium, and program
CN111860628A (en) Deep learning-based traffic identification and feature extraction method
CN112511555A (en) Private encryption protocol message classification method based on sparse representation and convolutional neural network
WO2019105131A1 (en) Image identification method and system for monitoring, computer device, and readable storage medium
CN111147394B (en) Multi-stage classification detection method for remote desktop protocol traffic behavior
WO2021103868A1 (en) Method for structuring pedestrian information, device, apparatus and storage medium
CN111311570A (en) Transmission line key device defect identification method based on unmanned aerial vehicle inspection
CN110034966B (en) Data flow classification method and system based on machine learning
CN110532959B (en) Real-time violent behavior detection system based on two-channel three-dimensional convolutional neural network
CN114039901A (en) Protocol identification method based on residual error network and recurrent neural network mixed model
KR20180123810A (en) Data enrichment processing technology and method for decoding x-ray medical image
CN112653749A (en) Edge computing-based complex event processing system and method for Internet of things
Nazeer et al. Real time object detection and recognition in machine learning using Jetson Nano
Wang et al. Sessionvideo: A novel approach for encrypted traffic classification via 3D-CNN model
CN114092746A (en) Multi-attribute identification method and device, storage medium and electronic equipment
KR102642446B1 (en) Method and device for image augmentation with masking of invoice for classifying cargo according to damage
CN115147895B (en) Face fake identifying method and device
CN113703977B (en) Intelligent face and human body detection and filtration device and picture output device
CN117235559B (en) Internet of things data acquisition method and system based on edge calculation
Sumalatha et al. An efficient approach for robust image classification based on extremely randomized decision trees

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application. Ref document number: 19897053; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase. Ref country code: DE
32PN EP: Public notification in the EP bulletin as address of the addressee cannot be established. Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 03/11/2021)
122 EP: PCT application non-entry in European phase. Ref document number: 19897053; Country of ref document: EP; Kind code of ref document: A1