WO2024109083A1 - Network traffic inspection method, electronic device, and storage medium - Google Patents

Network traffic inspection method, electronic device, and storage medium

Info

Publication number
WO2024109083A1
WO2024109083A1 PCT/CN2023/105206 CN2023105206W WO2024109083A1 WO 2024109083 A1 WO2024109083 A1 WO 2024109083A1 CN 2023105206 W CN2023105206 W CN 2023105206W WO 2024109083 A1 WO2024109083 A1 WO 2024109083A1
Authority
WO
WIPO (PCT)
Prior art keywords
port
target
flow
target port
network traffic
Prior art date
Application number
PCT/CN2023/105206
Other languages
French (fr)
Chinese (zh)
Inventor
姚云国
石辰
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Publication of WO2024109083A1

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/40Network security protocols

Definitions

  • the present application belongs to the field of information processing technology, and specifically relates to a network traffic detection method, an electronic device and a storage medium.
  • Existing detection methods fall broadly into two types. The first is the rule-based detection method, which records flow information features extracted from abnormal network traffic in a predetermined rule file based on empirical values, and then matches the traffic to be detected against those rules. This method is usually used to deal with threats that have already been discovered; for zero-day threats, its detection accuracy is low. The second is the artificial intelligence-based detection method, which detects network traffic by applying machine learning algorithms to the flow information corresponding to the network traffic, such as the source IP, destination IP and other features. This method has difficulty identifying attack programs implanted in normal network flows, so its detection accuracy is also low.
  • the purpose of the embodiments of the present application is to provide a network traffic detection method, device, electronic device and storage medium, which can realize abnormal detection of end-to-end communication traffic in the network with high detection accuracy.
  • an embodiment of the present application provides a network traffic detection method, the method comprising: determining target flow information of target network traffic; determining a target port of the target network traffic based on the target flow information, wherein the target port comprises a source port and a destination port; determining a flow abnormality event based on characteristics of the target port and target flow characteristics corresponding to the target flow information.
  • an embodiment of the present application provides a network traffic detection device, which includes: a first determination module, used to determine the target flow information of the target network traffic; a second determination module, used to determine the target port of the target network traffic based on the target flow information, wherein the target port includes a source port and a destination port; a detection module, used to determine the flow abnormality event based on the characteristics of the target port and the target flow characteristics corresponding to the target flow information.
  • an embodiment of the present application provides an electronic device, which includes a processor and a memory, wherein the memory stores programs or instructions that can be run on the processor, and when the program or instructions are executed by the processor, the steps of the network traffic detection method described in the first aspect are implemented.
  • An embodiment of the present application provides a readable storage medium on which a program or instruction is stored. When the program or instruction is executed by a processor, the steps of the network traffic detection method described in the first aspect are implemented.
  • an embodiment of the present application provides a chip, comprising a processor and a communication interface, wherein the communication interface is coupled to the processor, and the processor is used to run programs or instructions to implement the network traffic detection method as described in the first aspect.
  • an embodiment of the present application provides a computer program product, which is stored in a storage medium and is executed by at least one processor to implement the network traffic detection method as described in the first aspect.
  • FIG1 is a flow chart of a network traffic detection method provided in an embodiment of the present application.
  • FIG2 is a flow chart of a network traffic detection method provided in an embodiment of the present application.
  • FIG3 is a schematic diagram of the structure of a port pattern recognition model provided in an embodiment of the present application.
  • FIG. 4 is a schematic diagram of the structure of a port pattern recognition model provided in an embodiment of the present application.
  • FIG5 is a schematic diagram of the structure of a flow anomaly detection model provided in an embodiment of the present application.
  • FIG6 is a schematic diagram of the structure of a network traffic detection device provided in an embodiment of the present application.
  • FIG. 7 is a schematic diagram of the hardware structure of an electronic device for executing the network traffic detection method provided in an embodiment of the present application.
  • The terms "first", "second", etc. in the specification and claims of this application are used to distinguish similar objects, and are not used to describe a specific order or sequence. It should be understood that terms used in this way are interchangeable under appropriate circumstances, so that the embodiments of the present application can be implemented in an order other than those illustrated or described here. The objects distinguished by "first", "second", etc. are generally of one type, and the number of objects is not limited.
  • For example, the first object can be one or more.
  • In addition, "and/or" in the specification and claims represents at least one of the connected objects, and the character "/" generally indicates that the objects associated with each other are in an "or" relationship.
  • FIG1 shows a flow chart of a network traffic detection method provided by an embodiment of the present application.
  • the method can be executed by an electronic device, and the electronic device may include: a server or a terminal device.
  • the method can be executed by software or hardware installed in the electronic device.
  • the method includes the following steps:
  • S101 Determine target flow information of target network traffic.
  • This step obtains the target network traffic.
  • the target network traffic is analyzed by the traffic probe to obtain the target flow information.
  • The extracted flow information may include, for example, the 80+ dimensional network flow features defined by the Canadian Institute for Cybersecurity, or the 41 flow features defined by the KDD data set of the IDS Laboratory of Columbia University. Network traffic detection is performed by analyzing and studying these flow information features.
  • the obtained network traffic data can be the original traffic obtained in the network environment or local cache file.
  • S102 Determine a target port of the target network traffic according to the target flow information.
  • This step determines the target port of the target network traffic according to the target flow information, and the target port includes a source port and a destination port.
  • S103 Determine a flow abnormality event according to the characteristics of the target port and the target flow characteristics corresponding to the target flow information.
  • This step takes into account that processes that transmit traffic often communicate through ports.
  • The communication between 5th Generation Mobile Communication Technology (5G) devices and device management centers is usually point-to-point, and the source port and destination port are fixed.
  • In other cases, flows largely show a pattern of fixed source ports and random destination ports.
  • Although different devices may be similar in traffic, they have different communication ports.
  • the flow abnormality event is determined based on the characteristics of the target port and the target flow characteristics, including: splicing the characteristics of the target port and the target flow characteristics to obtain a spliced value; encoding the spliced value and then restoring it to obtain a restored value; comparing the spliced value with the restored value to obtain an abnormal value; when the abnormal value is greater than a threshold, determining the target network traffic as the flow abnormality event.
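  • The splice, encode, restore, and compare steps above can be sketched as follows. This is a minimal illustration, not the application's implementation: the function names, the toy encoder and decoder, and the use of root-mean-square error as the comparison are all assumptions, since the source does not name a specific distance measure.

```python
from math import sqrt
from typing import Callable, List, Sequence

Vector = Sequence[float]


def anomaly_score(port_features: Vector, flow_features: Vector,
                  encode: Callable[[Vector], List[float]],
                  decode: Callable[[Vector], List[float]]) -> float:
    """Splice, encode, restore, and compare; returns the abnormal value."""
    spliced = list(port_features) + list(flow_features)  # splicing step
    restored = decode(encode(spliced))                   # encode, then restore
    # Assumption: root-mean-square error is the comparison measure.
    squared = sum((s - r) ** 2 for s, r in zip(spliced, restored))
    return sqrt(squared / len(spliced))


def is_flow_abnormal(score: float, threshold: float) -> bool:
    """A flow is abnormal when its abnormal value exceeds the threshold."""
    return score > threshold


# Toy stand-ins for a trained model (hypothetical): keep the first two
# values when encoding, pad with zeros when restoring.
def toy_encode(v: Vector) -> List[float]:
    return list(v[:2])


def toy_decode(z: Vector) -> List[float]:
    return list(z) + [0.0, 0.0]


normal = anomaly_score([0.2, 0.8], [0.0, 0.0], toy_encode, toy_decode)
odd = anomaly_score([0.2, 0.8], [3.0, 4.0], toy_encode, toy_decode)
```

  • With these toy stand-ins, the first input is restored exactly and scores 0, while the second loses its last two values and scores higher, so it would be flagged for any threshold below that score.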
  • a network traffic detection method determines the target flow information of the target network traffic; determines the target port of the target network traffic based on the target flow information, wherein the target port includes a source port and a destination port; determines the flow abnormality event based on the characteristics of the target port and the target flow characteristics corresponding to the target flow information, thereby realizing abnormal detection of end-to-end communication traffic in the network with high detection accuracy.
  • FIG2 is a schematic diagram showing a flow chart of another network traffic detection method provided in an embodiment of the present application.
  • the method can be executed by an electronic device, which may include a server or a terminal device.
  • the method can be executed by software or hardware installed in the electronic device, as shown in FIG2 , and the method includes the following steps:
  • S201 Determine target flow information of target network traffic.
  • S202 Determine a target port of the target network traffic according to the target flow information.
  • the network flow data is passed through a flow probe or flow information extraction tool to extract a flow information set.
  • the flow information includes the source Internet Protocol (IP), source port, destination IP, destination port, time, number of upstream bytes, number of downstream bytes, number of upstream packets, etc.
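  • As an illustration of the flow information listed above, one possible record layout is sketched here. The class name, field names, and sample values are hypothetical; only the set of fields comes from the description.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FlowInfo:
    """One flow record as extracted by a flow probe (field names assumed)."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    time: str          # time of flow generation, e.g. "05:30:22"
    up_bytes: int      # number of upstream bytes
    down_bytes: int    # number of downstream bytes
    up_packets: int    # number of upstream packets


def target_port(flow: FlowInfo) -> Tuple[int, int]:
    """Return the target port as a (source port, destination port) pair."""
    return (flow.src_port, flow.dst_port)


# Sample values are made up for illustration.
flow = FlowInfo("10.0.0.5", 50001, "10.0.0.9", 80, "05:30:22", 1200, 5300, 14)
ports = target_port(flow)
```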
  • S203 Input the characteristics of the target port into a port pattern recognition model to obtain the pattern of the target port.
  • the port information of the target flow information is passed as a feature into the port pattern recognition model to obtain a port pattern.
  • the output of the port pattern recognition model includes four dimensions, namely: the source port is fixed, the source port is random, the destination port is fixed, and the destination port is random.
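  • One way to read the four output dimensions is sketched below. The decoding rule (compare the fixed and random scores for each port) and the function name are assumptions; the source only specifies the order of the four outputs.

```python
from typing import Sequence, Tuple


def decode_port_pattern(output: Sequence[float]) -> Tuple[str, str]:
    """Map (src fixed, src random, dst fixed, dst random) scores to modes."""
    if len(output) != 4:
        raise ValueError("expected four output values")
    src_fixed, src_random, dst_fixed, dst_random = output
    # Assumption: the larger of the two scores decides each port's mode.
    src_mode = "fixed" if src_fixed >= src_random else "random"
    dst_mode = "fixed" if dst_fixed >= dst_random else "random"
    return (src_mode, dst_mode)


pattern = decode_port_pattern([0.1, 0.9, 0.8, 0.2])
```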
  • the port pattern recognition model can be trained using network traffic with comprehensive port patterns. For example, the network traffic for port pattern recognition comes from traditional IT office networks, laboratory network traffic, etc.
  • the characteristics of the target port are input into a port pattern recognition model to obtain the pattern of the target port, including: inputting the characteristics of the target port into a first network layer of the port pattern recognition model; inputting characteristics of multiple ports corresponding to multiple historical flow information and multiple time period information into a second network layer of the port pattern recognition model; adding the output of the first network layer to the output of the second network layer, inputting the sum into a third network layer of the port pattern recognition model, and outputting the pattern of the target port; wherein the output dimension of the first network layer is the same as the output dimension of the second network layer.
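  • The additive combination of the first and second network layers can be sketched in plain Python as follows. The layer sizes, random weights, ReLU activation, and input values are illustrative assumptions; the point is only that the two branch outputs must share a dimension so they can be added before the third layer.

```python
import random
from typing import List, Sequence, Tuple

Layer = Tuple[List[List[float]], List[float]]


def dense(x: Sequence[float], weights: List[List[float]],
          bias: List[float]) -> List[float]:
    """Fully connected layer with ReLU; one weight row per output neuron."""
    return [max(0.0, sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, bias)]


def make_layer(n_in: int, n_out: int, rng: random.Random) -> Layer:
    """Create a layer with small random weights and zero bias."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


rng = random.Random(0)
HIDDEN = 8                            # shared output dimension (assumed size)
layer1 = make_layer(2, HIDDEN, rng)   # branch for the target port features
layer2 = make_layer(6, HIDDEN, rng)   # branch for historical ports and time
layer3 = make_layer(HIDDEN, 4, rng)   # produces the four pattern outputs


def port_pattern_forward(target_ports: Sequence[float],
                         history: Sequence[float]) -> List[float]:
    out1 = dense(target_ports, *layer1)
    out2 = dense(history, *layer2)
    if len(out1) != len(out2):
        raise ValueError("branch outputs must have the same dimension")
    fused = [a + b for a, b in zip(out1, out2)]   # element-wise sum
    return dense(fused, *layer3)


scores = port_pattern_forward([0.76, 0.001],
                              [0.76, 0.001, 5.0, 0.75, 0.007, 6.0])
```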
  • Before the flow abnormality event is determined according to the characteristics of the target port and the target flow characteristics corresponding to the target flow information, the method also includes: performing dimension conversion on the characteristics of the target port so that the dimension of the characteristics of the target port reaches the target dimension.
  • the port pattern recognition model is divided into an input layer, a network layer, and an output layer, as shown in FIG3 .
  • the basic idea is to query the specific port pattern of this combination based on all historical flow port information (input data 2) combined with the characteristics of the target port (input data 1).
  • the input layer is the entrance of the model.
  • input data 1 represents the port combination (source port, destination port) passed through in the current training data set; when the model is tested, it represents the (source port, destination port) passed through by the current test flow information;
  • input data 2 represents all flow port information in the current training data set, where the time when the flow corresponding to the source port/destination port is generated can also be used.
  • the time characteristics of the flow generation can be processed by generalized mapping. For example, if divided by hours, 00:10:00 can be mapped to 0, and 05:30:22 can be mapped to 5.
  • the relevant generalization granularity is not specifically limited here.
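  • The hourly generalization of flow generation time described above can be sketched as follows. The function name and the optional bucket width parameter are assumptions added for illustration.

```python
def time_period(timestamp: str, hours_per_bucket: int = 1) -> int:
    """Map an "HH:MM:SS" timestamp to a generalized time period index."""
    hour = int(timestamp.split(":")[0])
    if not 0 <= hour <= 23:
        raise ValueError(f"invalid hour in timestamp: {timestamp!r}")
    if hours_per_bucket < 1:
        raise ValueError("hours_per_bucket must be at least 1")
    return hour // hours_per_bucket


hourly = [time_period("00:10:00"), time_period("05:30:22")]
```

  • With the default hourly granularity this reproduces the mapping in the text; a coarser granularity can be chosen by passing a larger bucket width.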
  • The specific structure of the port pattern recognition model can be as shown in FIG4. Network layer 1 may adopt, without limitation, a network structure such as a fully connected or convolutional layer.
  • Network layer 2 may adopt, without limitation, a network structure such as a fully connected layer, a convolutional layer, long short-term memory (LSTM), or multi-head attention.
  • Network layer 3 may adopt, without limitation, a network structure such as a fully connected or convolutional layer.
  • the input layer uses two embedding (65536, 16) layers.
  • the source port uses one embedding1, and the destination port uses embedding2.
  • 65536 represents the maximum total number of ports, and 16 represents the dimension after port mapping. This value can be adjusted as needed.
  • Each index along the first dimension of the embedding corresponds one-to-one to a replaced (encoded) port number.
  • the time period information does not need to use the embedding technology.
  • Network layer 1 uses a single fully connected layer with 64 neurons; network layer 2 uses 2 convolutional layers, a Flatten layer, and a single fully connected layer with 64 neurons.
  • Network layer 3 uses a single-layer fully connected layer with 128 neurons.
  • the output layer uses a single-layer fully connected network with 4 neurons.
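  • To give a sense of scale for the example configuration above, the parameter counts of the embedding and fully connected layers can be worked out as below. This assumes the two 16-dimensional embeddings are concatenated into a 32-dimensional input for network layer 1 and that every fully connected layer has a bias term; neither is stated in the source. Network layer 2 is left out because the convolution sizes are not given.

```python
def dense_params(n_in: int, n_out: int) -> int:
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


NUM_PORTS, EMBED_DIM = 65536, 16

# Two embedding tables: one for source ports, one for destination ports.
embedding_params = 2 * NUM_PORTS * EMBED_DIM

# Assumption: network layer 1 sees both embeddings concatenated (32 inputs).
layer1_params = dense_params(2 * EMBED_DIM, 64)

# Network layer 3 takes the 64-dimensional sum of the two branches.
layer3_params = dense_params(64, 128)

# Output layer: four port pattern dimensions.
output_params = dense_params(128, 4)
```

  • Under these assumptions the two embedding tables hold 2,097,152 parameters, far more than the fully connected layers combined.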
  • S204 According to the mode of the target port and the characteristics of the target port, convert the characteristics of the target port to obtain the converted characteristics of the target port.
  • the characteristics of the target port are converted according to the pattern of the target port and the characteristics of the target port, including one of the following: when the pattern of the target port is fixed, retaining the original value of the characteristics of the target port; when the pattern of the target port is random, converting the characteristics of the target port to a preset generalized value.
  • the input data 1 is passed through the pattern recognition model to obtain the port pattern, and then port conversion is performed. If the port mode is fixed, the original value of the port is retained. If the port pattern is recognized as random, the port is replaced with "random" to obtain a new port feature.
  • the target port (50001, 80) has a port mode of (random, fixed), and the converted result is represented as (random, 80).
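  • The port conversion rule can be sketched as follows. The function name and the use of the string "random" as the preset generalized value follow the example in the text; the type aliases are assumptions.

```python
from typing import Tuple, Union

RANDOM_PORT = "random"            # preset generalized value
PortValue = Union[int, str]


def convert_ports(ports: Tuple[int, int],
                  pattern: Tuple[str, str]) -> Tuple[PortValue, PortValue]:
    """Keep ports whose mode is fixed; generalize ports whose mode is random."""
    for mode in pattern:
        if mode not in ("fixed", "random"):
            raise ValueError(f"unknown port mode: {mode!r}")
    src, dst = ports
    src_mode, dst_mode = pattern
    return (src if src_mode == "fixed" else RANDOM_PORT,
            dst if dst_mode == "fixed" else RANDOM_PORT)


converted = convert_ports((50001, 80), ("random", "fixed"))
```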
  • S205 Input the converted target port features and the target flow features into a flow anomaly detection model to obtain the flow anomaly event.
  • the flow anomaly detection model described in this step includes an autoencoder AE or a variational autoencoder VAE, and the network structure can be as shown in Figure 5, without specific limitation.
  • Before the characteristics of multiple ports corresponding to multiple historical flow information and multiple time period information are input into the second network layer of the port pattern recognition model, the method also includes: determining a first number of first ports, wherein the first port includes at least one port corresponding to the multiple historical flow information; sorting the multiple first numbers to determine a first sorting value corresponding to the first port; and encoding the characteristics of the first port according to the first sorting value. Inputting the characteristics of multiple ports corresponding to multiple historical flow information and multiple time period information into the second network layer of the port pattern recognition model then includes: inputting the encoded characteristics of the multiple ports and multiple time period information into the second network layer of the port pattern recognition model.
  • Each flow information in the flow information set can be separated into two parts. One part contains three pieces of information (source port, destination port, and time), giving the flow information set feature_1_1; the port mode of each flow in this set is labeled, for example (1,0,0,1), whose entries respectively indicate that the source port is fixed, the source port is random, the destination port is fixed, and the destination port is random, where 1 indicates confirmation and 0 indicates negation. The other part contains the remaining information in the flow information, excluding the 4-tuple (source IP, source port, destination IP, destination port) and the time, giving the flow information set feature_2_1.
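  • The split into feature_1_1 and feature_2_1 and the four-value port mode label can be sketched as follows. The dictionary keys, helper names, and sample record are hypothetical.

```python
from typing import Dict, Tuple

PORT_KEYS = ("src_port", "dst_port", "time")
EXCLUDED = {"src_ip", "src_port", "dst_ip", "dst_port", "time"}


def split_flow(flow: Dict[str, object]) -> Tuple[Dict[str, object],
                                                 Dict[str, object]]:
    """Return (port/time part, remaining flow features) for one record."""
    port_part = {key: flow[key] for key in PORT_KEYS}
    rest = {key: value for key, value in flow.items() if key not in EXCLUDED}
    return port_part, rest


def pattern_label(src_fixed: bool, dst_fixed: bool) -> Tuple[int, int, int, int]:
    """Encode modes as (src fixed, src random, dst fixed, dst random)."""
    return (int(src_fixed), int(not src_fixed),
            int(dst_fixed), int(not dst_fixed))


# Sample record with made-up values.
record = {"src_ip": "10.0.0.5", "src_port": 2554, "dst_ip": "10.0.0.9",
          "dst_port": 50001, "time": "05:30:22",
          "up_bytes": 1200, "down_bytes": 5300, "up_packets": 14}
feature_1, feature_2 = split_flow(record)
label = pattern_label(src_fixed=True, dst_fixed=False)
```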
  • The ports in feature_1_1 are then generalized: the numbers of occurrences of each source port and each destination port in feature_1_1 are counted separately and sorted to obtain the sorting results, as shown in Table 1 below. Finally, each port number is replaced with its sorting sequence number, for example encoding port 2554 as its corresponding sequence number 1.
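  • The count, sort, and replace encoding can be sketched as follows. The sample port list is made up, and two details are assumptions because the source does not state them: ports are ranked from most to least frequent, and ties are broken by port number.

```python
from collections import Counter
from typing import Dict, Iterable


def rank_encode(ports: Iterable[int]) -> Dict[int, int]:
    """Map each port to its 1-based rank by number of occurrences."""
    counts = Counter(ports)
    # Assumption: most frequent first, ties broken by smaller port number.
    ordered = sorted(counts, key=lambda port: (-counts[port], port))
    return {port: rank for rank, port in enumerate(ordered, start=1)}


# Hypothetical source ports observed in a flow information set.
src_ports = [2554, 2554, 2554, 443, 443, 8080]
ranks = rank_encode(src_ports)
encoded = [ranks[port] for port in src_ports]
```

  • Source ports and destination ports would each get their own mapping, since the text counts them separately.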
  • The present application may also include a data enhancement step. For example, the (source port, destination port) combinations in feature_1_1 are deduplicated to obtain the feature_3_1 set, and the feature_3_1 data is then negatively sampled with port values ranging from 1 to 65536. The port mode label of a source or destination port that does not appear in feature_3_1 is recorded as random. For example, for the negatively sampled source and destination port pair (1, 10), where 10 does not appear as a destination port, the port mode label is (1, 0, 0, 1). The enhanced data is placed in the feature_3_1 set.
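  • A minimal sketch of the negative sampling step follows. The helper names, the uniform sampling strategy, and the lookup tables of known port modes are assumptions; the port range of 1 to 65536 and the rule that unseen ports are labeled random come from the description.

```python
import random
from typing import Dict, List, Set, Tuple

Pair = Tuple[int, int]
Label = Tuple[int, int, int, int]


def label_pair(pair: Pair, src_modes: Dict[int, str],
               dst_modes: Dict[int, str]) -> Label:
    """Label a pair; ports never seen in that role are treated as random."""
    src_fixed = src_modes.get(pair[0], "random") == "fixed"
    dst_fixed = dst_modes.get(pair[1], "random") == "fixed"
    return (int(src_fixed), int(not src_fixed),
            int(dst_fixed), int(not dst_fixed))


def negative_samples(existing: Set[Pair], count: int, seed: int = 0) -> List[Pair]:
    """Draw (source, destination) pairs that are not in the existing set."""
    rng = random.Random(seed)
    samples: List[Pair] = []
    while len(samples) < count:
        pair = (rng.randint(1, 65536), rng.randint(1, 65536))
        if pair not in existing and pair not in samples:
            samples.append(pair)
    return samples


# Hypothetical known modes: port 1 is a fixed source, port 80 a fixed destination.
src_modes = {1: "fixed"}
dst_modes = {80: "fixed"}
example_label = label_pair((1, 10), src_modes, dst_modes)
extra = negative_samples({(1, 80)}, count=5)
```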
  • The training process of the port pattern recognition model in the embodiment of the present application includes traversing the aforementioned feature_3_1 one by one, using the extracted source port and destination port as input data 1, the source ports, destination ports, and time period information of all of feature_1_1 as input data 2, and the label in feature_3_1 as the output layer label. Each full traversal of feature_3_1 counts as one epoch; the number of epochs can be set to 200 or specified as needed. After the training is completed, the port pattern recognition model is obtained.
  • The flow anomaly detection model is then trained. The specific steps include: traversing each flow feature in feature_2_1, extracting the corresponding source port and destination port information in feature_1_1, and sending it to the port pattern recognition model to obtain the port mode; performing feature conversion on the source port and the destination port according to the port mode to obtain a new feature new_port_feature; and splicing feature_2_1 and new_port_feature and sending them to the flow anomaly detection model for training, which encodes the input data and then decodes and restores it, using the reconstruction error between the original input data and the restored data as the outlier value. Each full traversal of feature_2_1 counts as one epoch, and the number of epochs can be specified as needed. After the training is completed, the maximum outlier value in the training set is taken as the threshold of the abnormal event, and the flow anomaly detection model is obtained.
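  • The threshold selection at the end of training can be sketched as follows. The function names and the sample outlier values are hypothetical; the rule of taking the maximum training outlier value as the threshold comes from the description.

```python
from typing import Sequence


def fit_threshold(training_outliers: Sequence[float]) -> float:
    """Use the largest outlier value seen in training as the threshold."""
    if not training_outliers:
        raise ValueError("at least one training outlier value is required")
    return max(training_outliers)


def exceeds_threshold(outlier: float, threshold: float) -> bool:
    """A flow is reported as abnormal only when strictly above the threshold."""
    return outlier > threshold


# Made-up outlier values for three training flows.
threshold = fit_threshold([0.12, 0.30, 0.27])
```

  • One consequence of this choice is that no training flow is itself flagged, since none can exceed the maximum of the training set.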
  • the network traffic data used in the present application for training the port pattern recognition model and the network traffic data used for training the flow anomaly detection model may be the same or different, and the present application does not impose any specific restrictions.
  • the network traffic data used by the port pattern recognition model may be made richer, so that the training effect of the port pattern recognition model is better, thereby improving the accuracy of the flow anomaly detection model detection.
  • the network traffic detection method provided in the embodiment of the present application can be executed by a network traffic detection device, or a control module in the network traffic detection device for executing the network traffic detection method.
  • the method for executing network traffic detection by a network traffic detection device is taken as an example to illustrate the network traffic detection device provided in the embodiment of the present application.
  • Fig. 6 is a schematic diagram of the structure of a network traffic detection device provided by an embodiment of the present application.
  • the network traffic detection device 600 includes: a first determination module 610 , a second determination module 620 and a detection module 630 .
  • the first determination module 610 is used to determine the target flow information of the target network traffic; the second determination module 620 is used to determine the target port of the target network traffic based on the target flow information, wherein the target port includes a source port and a destination port; the detection module 630 is used to determine the flow abnormality event based on the characteristics of the target port and the target flow characteristics corresponding to the target flow information.
  • the detection module 630 is used to: input the characteristics of the target port into a port pattern recognition model to obtain the pattern of the target port, wherein the port pattern recognition model is used to simulate the correspondence between the characteristics of multiple ports and multiple port patterns, and the pattern of the target port is used to represent the combined characteristics of the target port and the target port attributes; according to the pattern of the target port and the characteristics of the target port, convert the characteristics of the target port to obtain the characteristics of the converted target port; according to the characteristics of the converted target port and the target flow characteristics, determine the flow abnormality event.
  • the detection module 630 is used to: input the characteristics of the target port into the first network layer of the port pattern recognition model; input the characteristics of multiple ports corresponding to multiple historical flow information and multiple time period information into the second network layer of the port pattern recognition model; add the output of the first network layer to the output of the second network layer, input the result to the third network layer of the port pattern recognition model, and output the pattern of the target port; wherein the output dimension of the first network layer is the same as the output dimension of the second network layer.
  • the characteristics of the target port are converted according to the pattern of the target port and the characteristics of the target port, including one of the following: when the pattern of the target port is fixed, retaining the original value of the characteristics of the target port; when the pattern of the target port is random, converting the characteristics of the target port to a preset generalized value.
  • the detection module 630 is used to: input the characteristics of the target port and the target flow characteristics into a flow anomaly detection model to obtain the flow anomaly event, wherein the flow anomaly detection model includes an autoencoder AE or a variational autoencoder VAE.
  • the detection module 630 is used to: splice the characteristics of the target port and the target flow characteristics to obtain a spliced value; encode the spliced value and then restore it to obtain a restored value; compare the spliced value with the restored value to obtain an abnormal value; when the abnormal value is greater than a threshold, determine the target network traffic as the flow abnormal event.
  • the device 600 also includes: an encoding module, which is used to: determine a first number of first ports, wherein the first port includes at least one port corresponding to the multiple historical flow information; sort the multiple first numbers to determine a first sorting value corresponding to the first port; encode the characteristics of the first port according to the first sorting value; the detection module 630 is used to input the encoded characteristics of the multiple ports and multiple time period information into the second network layer of the port pattern recognition model.
  • the device 600 further includes: a preprocessing module, wherein the preprocessing module is configured to: perform dimension conversion on the feature of the target port so that the dimension of the feature of the target port reaches a target dimension.
  • a network traffic detection device uses a first determination module to determine target flow information of target network traffic; a second determination module to determine a target port of the target network traffic based on the target flow information, wherein the target port includes a source port and a destination port; a detection module to determine a flow anomaly event based on characteristics of the target port and target flow characteristics corresponding to the target flow information, thereby enabling abnormal detection of end-to-end communication traffic in the network with high detection accuracy.
  • the network traffic detection device provided in the embodiment of the present application can implement the various processes implemented by the network traffic detection method embodiment described in Figures 1 to 2, and achieve the same technical effect. To avoid repetition, it will not be repeated here.
  • the network traffic detection device in the embodiment of the present application can be a device, or a component, integrated circuit, or chip in a terminal device.
  • the device can be a mobile electronic device or a non-mobile electronic device.
  • the mobile electronic device can be a mobile phone, a tablet computer, a laptop computer, an in-vehicle electronic device, a wearable device, an ultra-mobile personal computer (UMPC), a netbook or a personal digital assistant (PDA), etc.
  • the non-mobile electronic device can be a server, a network attached storage (NAS), a personal computer (PC), a television (TV), a teller machine or a self-service machine, etc., which is not specifically limited in the embodiment of the present application.
  • the network traffic detection device in the embodiment of the present application may be a device having an operating system.
  • the operating system may be an Android operating system, an iOS operating system, or other optional operating systems, which are not specifically limited in the embodiment of the present application.
  • an embodiment of the present application further provides an electronic device 700, including a processor 701, a memory 702, a program or instruction stored in the memory 702 and executable on the processor 701, and when the program or instruction is executed by the processor 701, the network traffic detection method described in at least one of the embodiments of FIG1 and FIG2 is implemented.
  • the electronic device in the embodiment of the present application includes: a server, a terminal device, or other devices other than a terminal device.
  • the above electronic device structure does not constitute a limitation on the electronic device.
  • the electronic device may include more or fewer components than shown in the figure, or combine certain components, or arrange the components differently.
  • the input unit may include a graphics processing unit (GPU) and a microphone
  • the display unit may be configured with a display panel in the form of a liquid crystal display, an organic light-emitting diode, etc.
  • the user input unit includes a touch panel and at least one of other input devices.
  • the touch panel is also called a touch screen.
  • Other input devices may include, but are not limited to, a physical keyboard, function keys (such as volume control keys, switch keys, etc.), a trackball, a mouse, and a joystick, which will not be repeated here.
  • the memory can be used to store software programs and various data.
  • the memory may mainly include a first storage area for storing programs or instructions and a second storage area for storing data, wherein the first storage area may store an operating system, an application program or instructions required for at least one function (such as a sound playback function, an image playback function, etc.), etc.
  • the memory may include a volatile memory or a non-volatile memory, or the memory may include both volatile and non-volatile memories.
  • the non-volatile memory may be a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), or a flash memory.
  • The volatile memory can be a random access memory (RAM), a static random access memory (SRAM), a dynamic random access memory (DRAM), a synchronous dynamic random access memory (SDRAM), a double data rate synchronous dynamic random access memory (DDR SDRAM), an enhanced synchronous dynamic random access memory (ESDRAM), a synchronous link dynamic random access memory (SLDRAM) or a direct Rambus random access memory (DRRAM).
  • the processor may include one or more processing units; optionally, the processor integrates an application processor and a modem processor, wherein the application processor mainly processes operations related to the operating system, user interface, and application programs, and the modem processor mainly processes communication signals, such as a baseband processor. It is understandable that the modem processor may not be integrated into the processor.
  • An embodiment of the present application also provides a readable storage medium, on which a program or instruction is stored.
  • When the program or instruction is executed by a processor, the network traffic detection method described in at least one of the embodiments of Figures 1 and 2 is implemented, and the same technical effect can be achieved. To avoid repetition, details are not repeated here.
  • the processor is a processor in the electronic device described in the above embodiment.
  • the readable storage medium includes a computer readable storage medium, such as a computer read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk, etc.
  • An embodiment of the present application further provides a chip, which includes a processor and a communication interface, wherein the communication interface is coupled to the processor, and the processor is used to run programs or instructions to implement the various processes of the above-mentioned network traffic detection method embodiment, and can achieve the same technical effect. To avoid repetition, it will not be repeated here.
  • the chip mentioned in the embodiments of the present application can also be called a system-level chip, a system chip, a chip system or a system-on-chip chip, etc.
  • the technical solution of the present application can be embodied in the form of a computer software product, which is stored in a storage medium (such as ROM/RAM, a disk, or an optical disk), and includes a number of instructions for a terminal (which can be a mobile phone, a computer, a server, or a network device, etc.) to execute the methods described in each embodiment of the present application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The present application relates to the technical field of information processing, and discloses a network traffic inspection method, an electronic device, and a storage medium. The method comprises: determining target flow information of target network traffic; determining target ports for the target network traffic according to the target flow information, wherein the target ports comprise a source port and a destination port; and determining a flow anomaly event according to features of the target ports and target flow features corresponding to the target flow information.

Description

A network traffic detection method, electronic device, and storage medium
Cross-Reference
This application claims priority to the Chinese patent application filed with the China Patent Office on November 23, 2022, with application number 202211476183.5 and entitled "A network traffic detection method, electronic device and storage medium", the entire contents of which are incorporated herein by reference.
Technical Field
The present application belongs to the field of information processing technology, and specifically relates to a network traffic detection method, an electronic device, and a storage medium.
Background
With the rapid development of information technology, network security problems are becoming increasingly severe. Network traffic anomaly detection, as one way of addressing these problems, has long attracted attention.
Detection based on flow information takes two forms. The first is rule-based detection, which typically records flow information features extracted from abnormal network traffic in a predetermined rule file, based on empirical values, and then matches the traffic under inspection against those rules. This approach is usually used against known threats, but its detection accuracy is low for zero-day (0day) threats. The second is artificial-intelligence-based detection, which typically applies machine learning algorithms to the flow information of the network traffic, such as the source IP and destination IP. It has difficulty identifying attack programs embedded in otherwise normal network flows, so its detection accuracy is low.
Summary of the Invention
The purpose of the embodiments of the present application is to provide a network traffic detection method, device, electronic device, and storage medium that can detect anomalies in end-to-end communication traffic in a network with high detection accuracy.
To solve the above technical problem, the present application is implemented as follows:
In a first aspect, an embodiment of the present application provides a network traffic detection method, the method comprising: determining target flow information of target network traffic; determining a target port of the target network traffic according to the target flow information, wherein the target port comprises a source port and a destination port; and determining a flow anomaly event according to features of the target port and target flow features corresponding to the target flow information.
In a second aspect, an embodiment of the present application provides a network traffic detection device, the device comprising: a first determination module, configured to determine target flow information of target network traffic; a second determination module, configured to determine a target port of the target network traffic according to the target flow information, wherein the target port comprises a source port and a destination port; and a detection module, configured to determine a flow anomaly event according to features of the target port and target flow features corresponding to the target flow information.
In a third aspect, an embodiment of the present application provides an electronic device comprising a processor and a memory, wherein the memory stores programs or instructions that can run on the processor, and the programs or instructions, when executed by the processor, implement the steps of the network traffic detection method described in the first aspect.
In a fourth aspect, an embodiment of the present application provides a readable storage medium on which a program or instruction is stored, wherein the program or instruction, when executed by a processor, implements the steps of the network traffic detection method described in the first aspect.
In a fifth aspect, an embodiment of the present application provides a chip comprising a processor and a communication interface, wherein the communication interface is coupled to the processor, and the processor is configured to run programs or instructions to implement the network traffic detection method described in the first aspect.
In a sixth aspect, an embodiment of the present application provides a computer program product that is stored in a storage medium and is executed by at least one processor to implement the network traffic detection method described in the first aspect.
Brief Description of the Drawings
FIG. 1 is a schematic flow chart of a network traffic detection method provided in an embodiment of the present application.
FIG. 2 is a schematic flow chart of a network traffic detection method provided in an embodiment of the present application.
FIG. 3 is a schematic structural diagram of a port pattern recognition model provided in an embodiment of the present application.
FIG. 4 is a schematic structural diagram of a port pattern recognition model provided in an embodiment of the present application.
FIG. 5 is a schematic structural diagram of a flow anomaly detection model provided in an embodiment of the present application.
FIG. 6 is a schematic structural diagram of a network traffic detection device provided in an embodiment of the present application.
FIG. 7 is a schematic diagram of the hardware structure of an electronic device for executing the network traffic detection method provided in an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application are described clearly below with reference to the drawings in the embodiments of the present application. The described embodiments are some, but not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application fall within the scope of protection of the present application.
The terms "first", "second", and the like in the specification and claims of the present application are used to distinguish similar objects, not to describe a specific order or sequence. It should be understood that terms used in this way are interchangeable where appropriate, so that the embodiments of the present application can be implemented in orders other than those illustrated or described here. Objects distinguished by "first", "second", and the like are generally of one type, and their number is not limited; for example, the first object may be one or more. In addition, "and/or" in the specification and claims denotes at least one of the connected objects, and the character "/" generally indicates an "or" relationship between the objects before and after it.
A network traffic detection method, device, electronic device, and storage medium provided in the embodiments of the present application are described in detail below through specific embodiments and their application scenarios, with reference to the drawings.
FIG. 1 shows a schematic flow chart of a network traffic detection method provided by an embodiment of the present application. The method may be executed by an electronic device, which may include a server or a terminal device. In other words, the method may be executed by software or hardware installed in the electronic device. As shown in FIG. 1, the method includes the following steps:
S101: Determine target flow information of target network traffic.
In this step, the target network traffic is obtained and analyzed by a traffic probe to obtain the target flow information. Examples of flow information that can be extracted include the 80+ dimensional network flow features defined by the Canadian Institute for Cybersecurity and the 41 flow features defined for the KDD dataset by the IDS laboratory at Columbia University. Network traffic detection is performed by analyzing these flow information features. The network traffic data may be raw traffic obtained from the network environment or from a local cache file.
S102: Determine a target port of the target network traffic according to the target flow information.
In this step, the target port of the target network traffic is determined according to the target flow information. The target port includes a source port and a destination port.
S103: Determine a flow anomaly event according to the features of the target port and the target flow features corresponding to the target flow information.
This step takes into account that the processes exchanging traffic usually communicate through ports. For example, communication between a 5th Generation Mobile Communication Technology (5G) device and a device management center is usually point-to-point, with both the source port and the destination port fixed. When a device performs scanning or reconnaissance, by contrast, its flows largely show a fixed source port and random destination ports. Different devices may also produce similar traffic while using different communication ports. Therefore, on top of the target flow features corresponding to the target flow information, the port features of the target flow are added as detection features for determining flow anomaly events. This further improves flow anomaly detection and enables accurate anomaly detection for end-to-end communication traffic in the network.
In an optional implementation, determining the flow anomaly event according to the features of the target port and the target flow features includes: splicing the features of the target port and the target flow features to obtain a spliced value; encoding the spliced value and then restoring it to obtain a restored value; comparing the spliced value with the restored value to obtain an anomaly value; and, when the anomaly value is greater than a threshold, determining the target network traffic to be a flow anomaly event.
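The splice, encode, restore, and compare sequence above can be sketched as follows. This is a minimal illustration rather than the method of the application: the mean squared error used as the comparison, the function names, and the toy encoder and decoder are all assumptions, since the text only says the two values are "compared".

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]


def anomaly_score(original: Vector, restored: Vector) -> float:
    """Mean squared error between the spliced vector and its reconstruction.

    MSE is an assumed choice of comparison; the text does not name one.
    """
    if len(original) != len(restored):
        raise ValueError("vectors must have the same length")
    return sum((o - r) ** 2 for o, r in zip(original, restored)) / len(original)


def is_flow_anomaly(
    port_features: Vector,
    flow_features: Vector,
    encode: Callable[[List[float]], List[float]],
    decode: Callable[[List[float]], List[float]],
    threshold: float,
) -> bool:
    """Splice, encode, restore, compare, and apply the threshold."""
    spliced = list(port_features) + list(flow_features)
    restored = decode(encode(spliced))
    return anomaly_score(spliced, restored) > threshold


# Hypothetical stand-in for a trained autoencoder on 4-dimensional input:
# it keeps the first two components and restores the rest as zeros.
def toy_encode(v: List[float]) -> List[float]:
    return list(v[:2])


def toy_decode(z: List[float]) -> List[float]:
    return list(z) + [0.0, 0.0]
```

With the toy pair, a spliced vector whose last two components are zero is reconstructed exactly and is not flagged, while one with large trailing components produces a high score and is flagged.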
The network traffic detection method provided by the embodiments of the present application determines target flow information of target network traffic; determines a target port of the target network traffic according to the target flow information, wherein the target port includes a source port and a destination port; and determines a flow anomaly event according to the features of the target port and the target flow features corresponding to the target flow information. It can thereby detect anomalies in end-to-end communication traffic in the network with high detection accuracy.
FIG. 2 shows a schematic flow chart of another network traffic detection method provided by an embodiment of the present application. The method may be executed by an electronic device, which may include a server or a terminal device. In other words, the method may be executed by software or hardware installed in the electronic device. As shown in FIG. 2, the method includes the following steps:
S201: Determine target flow information of target network traffic.
S202: Determine a target port of the target network traffic according to the target flow information.
The network traffic data is passed through a traffic probe or a flow information extraction tool to extract a set of flow information. The flow information includes the source Internet Protocol (IP) address, source port, destination IP address, destination port, time, number of upstream bytes, number of downstream bytes, number of upstream packets, and so on.
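As a rough illustration of the flow information listed above, a single flow could be held in a record like the one below. The class name, field names, and types are assumptions made for this sketch; the application does not prescribe a data layout.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FlowRecord:
    """One flow as produced by a traffic probe (hypothetical layout)."""

    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    time: str          # flow start time, e.g. "05:30:22"
    up_bytes: int
    down_bytes: int
    up_packets: int

    def target_ports(self) -> Tuple[int, int]:
        """The target port of the flow: (source port, destination port)."""
        return (self.src_port, self.dst_port)

    def flow_features(self) -> Tuple[int, int, int]:
        """Remaining features, excluding the 4-tuple and the time."""
        return (self.up_bytes, self.down_bytes, self.up_packets)


flow = FlowRecord("10.0.0.5", 50001, "10.0.0.9", 80, "05:30:22", 1200, 5400, 9)
```

Here `flow.target_ports()` gives the pair used for port pattern recognition, and `flow.flow_features()` gives the part that would later be spliced with the converted port features.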
S203: Input the features of the target port into a port pattern recognition model to obtain the pattern of the target port.
In this step, the port information in the target flow information is passed as a feature into the port pattern recognition model to obtain the port pattern. The output of the port pattern recognition model has four dimensions: source port fixed, source port random, destination port fixed, and destination port random. The model can be trained on network traffic that covers a comprehensive range of port patterns, for example traffic from a traditional IT office network or a laboratory network.
In one implementation, as shown in FIG. 3, inputting the features of the target port into the port pattern recognition model to obtain the pattern of the target port includes: inputting the features of the target port into a first network layer of the port pattern recognition model; inputting the features of multiple ports corresponding to multiple pieces of historical flow information, together with multiple pieces of time period information, into a second network layer of the port pattern recognition model; and adding the output of the first network layer to the output of the second network layer, inputting the sum into a third network layer of the port pattern recognition model, and outputting the pattern of the target port; wherein the output dimension of the first network layer is the same as the output dimension of the second network layer.
In an optional implementation, before determining the flow anomaly event according to the features of the target port and the target flow features corresponding to the target flow information, the method further includes: performing dimension conversion on the features of the target port so that their dimension reaches a target dimension.
Specifically, the port pattern recognition model is divided into an input layer, network layers, and an output layer, as shown in FIG. 3. The basic idea is to look up the port pattern of a specific port combination by combining the features of the target port (input data 1) with all historical flow port information (input data 2). The input layer is the entry point of the model. During training, input data 1 is a port combination (source port, destination port) that appears in the current training data set; during detection, it is the (source port, destination port) of the flow currently being inspected. Input data 2 is all flow port information in the current training data set, and may also include the time at which the flow corresponding to each source port/destination port was generated. This time feature may be generalized by mapping; for example, when bucketing by hour, 00:10:00 maps to 0 and 05:30:22 maps to 5. The granularity of the generalization is not specifically limited here.
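The hourly generalization of the flow time can be sketched in a few lines. The function name and the assumed "HH:MM:SS" input format are illustrative; other bucket sizes would work the same way.

```python
def hour_bucket(timestamp: str) -> int:
    """Map an "HH:MM:SS" timestamp to its hour of day (0 to 23)."""
    hours, minutes, seconds = (int(part) for part in timestamp.split(":"))
    if not (0 <= hours < 24 and 0 <= minutes < 60 and 0 <= seconds < 60):
        raise ValueError(f"not a valid time of day: {timestamp!r}")
    return hours
```

For the two examples in the text, `hour_bucket("00:10:00")` returns 0 and `hour_bucket("05:30:22")` returns 5.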
The specific structure of the port pattern recognition model may be as shown in FIG. 4. Network layer 1 may use, without limitation, a fully connected or convolutional network structure. Network layer 2 may use, without limitation, a fully connected, convolutional, long short-term memory (LSTM), or multi-head attention network structure. The outputs of network layer 1 and network layer 2 are added together and then input into network layer 3, which may use, without limitation, a fully connected or convolutional network structure.
The input layer uses two embedding(65536, 16) layers: the source port uses embedding1 and the destination port uses embedding2. Here 65536 is the maximum total number of ports and 16 is the dimension of a port after mapping, which can be adjusted as needed. The first dimension of the embedding corresponds one-to-one with the replaced port numbers. The time period information does not need embedding. Network layer 1 is a single fully connected layer with 64 neurons; network layer 2 consists of two convolutional layers, a Flatten operation, and a single fully connected layer with 64 neurons; network layer 3 is a single fully connected layer with 128 neurons; and the output layer is a single fully connected layer with 4 neurons.
S204: Convert the features of the target port according to the pattern of the target port and the features of the target port, to obtain the converted features of the target port.
In an optional implementation, converting the features of the target port according to the pattern of the target port and the features of the target port includes one of the following: when the pattern of the target port is fixed, retaining the original value of the feature of the target port; when the pattern of the target port is random, converting the feature of the target port to a preset generalized value.
Specifically, input data 1 is passed through the pattern recognition model to obtain the port pattern, and port conversion is then performed. If the port pattern is fixed, the original value of the port is retained; if the port pattern is recognized as random, the port is replaced with "random", giving the new port feature. For example, if the target port is (50001, 80) and its port pattern is (random, fixed), the converted result is (random, 80).
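The conversion rule can be sketched as below. The helper names are hypothetical, and reading the four-dimensional model output by taking the larger score in each fixed/random pair is an assumption; the text only specifies the order of the four outputs and the replacement rule.

```python
from typing import Sequence, Tuple, Union

RANDOM_TOKEN = "random"  # preset generalized value used in the example above


def decode_modes(scores: Sequence[float]) -> Tuple[str, str]:
    """Turn the 4-dimensional output into (source mode, destination mode).

    Output order: source fixed, source random, destination fixed,
    destination random. Picking the larger score per pair is an assumption.
    """
    if len(scores) != 4:
        raise ValueError("expected exactly four output scores")
    src = "fixed" if scores[0] >= scores[1] else "random"
    dst = "fixed" if scores[2] >= scores[3] else "random"
    return src, dst


def convert_ports(
    ports: Tuple[int, int], modes: Tuple[str, str]
) -> Tuple[Union[int, str], ...]:
    """Keep a fixed port's original value; generalize a random port."""
    return tuple(
        port if mode == "fixed" else RANDOM_TOKEN
        for port, mode in zip(ports, modes)
    )
```

Applied to the example in the text, `convert_ports((50001, 80), ("random", "fixed"))` yields `("random", 80)`.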
S205: Input the converted features of the target port and the target flow features into a flow anomaly detection model to obtain the flow anomaly event.
The flow anomaly detection model in this step includes an autoencoder (AE) or a variational autoencoder (VAE). The network structure may be as shown in FIG. 5 and is not specifically limited.
In an optional implementation, before inputting the features of the multiple ports corresponding to the multiple pieces of historical flow information and the multiple pieces of time period information into the second network layer of the port pattern recognition model, the method further includes: determining a first number for each first port, wherein the first port includes at least one port corresponding to the multiple pieces of historical flow information; sorting the multiple first numbers to determine a first sorting value corresponding to the first port; and encoding the features of the first port according to the first sorting value. Inputting the features of the multiple ports and the multiple pieces of time period information into the second network layer of the port pattern recognition model then includes: inputting the encoded features of the multiple ports and the multiple pieces of time period information into the second network layer of the port pattern recognition model.
Specifically, data preprocessing can separate each piece of flow information in the flow information set into two parts. One part contains three items, the source port, destination port, and time, giving the flow information set feature_1_1, and the port pattern of each flow in this set is labeled. For example, the label (1,0,0,1) gives, in order, source port fixed, source port random, destination port fixed, and destination port random, where 1 means yes and 0 means no. The other part is the information remaining after removing the 4-tuple (source IP, source port, destination IP, destination port) and the time, giving the flow information set feature_2_1. The ports in feature_1_1 are then generalized: the occurrences of each source port and each destination port in feature_1_1 are counted separately and sorted to obtain a ranking, as shown in Table 1 below. Finally, each port number is replaced by its rank; for example, port 2554 is encoded as its rank, 1.
Table 1: Example of port number sorting results
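A small sketch of the rank encoding described above follows. The text does not state the sort direction or how ties are broken, so ranking the most frequent port first and breaking ties by the lower port number are assumptions; the function name and the sample ports are illustrative.

```python
from collections import Counter
from typing import Dict, Iterable


def rank_encode(ports: Iterable[int]) -> Dict[int, int]:
    """Map each port to its 1-based rank by occurrence count.

    Assumed ordering: most frequent first, ties broken by port number.
    """
    counts = Counter(ports)
    ordered = sorted(counts, key=lambda port: (-counts[port], port))
    return {port: rank for rank, port in enumerate(ordered, start=1)}


# Source ports and destination ports are ranked separately.
source_ports = [2554, 80, 2554, 443, 80, 2554]
source_ranks = rank_encode(source_ports)
encoded = [source_ranks[port] for port in source_ports]
```

In this made-up sample, port 2554 occurs most often and is therefore encoded as 1, in line with the example in the text.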
In addition, the present application may include a data augmentation step. The (source port, destination port) combinations in feature_1_1 are deduplicated to obtain the set feature_3_1, and negative sampling is then performed on feature_3_1 with port values ranging from 1 to 65536. For a sampled pair, a source or destination port that does not appear in feature_3_1 has its port pattern labeled as random. For example, for the negatively sampled (source port, destination port) pair (1, 10), where 10 does not appear among the destination ports, the port pattern label is (1,0,0,1). The augmented data is added to the feature_3_1 set.
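The augmentation step might look like the sketch below. It rests on several assumptions that the text leaves open: a pair counts as a negative sample when at least one of its ports was never observed in that position, an observed port is labeled fixed, and sampling is uniform over the stated range. All names are hypothetical.

```python
import random
from typing import Dict, Iterable, Set, Tuple

Label = Tuple[int, int, int, int]  # (src fixed, src random, dst fixed, dst random)


def port_mode_label(src: int, dst: int, seen_src: Set[int], seen_dst: Set[int]) -> Label:
    """Label a port as fixed if it was observed in that position, else random."""
    src_fixed = int(src in seen_src)
    dst_fixed = int(dst in seen_dst)
    return (src_fixed, 1 - src_fixed, dst_fixed, 1 - dst_fixed)


def negative_samples(
    pairs: Iterable[Tuple[int, int]],
    count: int,
    rng: random.Random,
    low: int = 1,
    high: int = 65536,
) -> Dict[Tuple[int, int], Label]:
    """Draw up to `count` pairs that contain at least one unseen port."""
    pairs = list(pairs)
    seen_src = {src for src, _ in pairs}
    seen_dst = {dst for _, dst in pairs}
    samples: Dict[Tuple[int, int], Label] = {}
    for _ in range(count * 100):  # bounded number of attempts
        if len(samples) >= count:
            break
        src, dst = rng.randint(low, high), rng.randint(low, high)
        if src in seen_src and dst in seen_dst:
            continue  # both ports already observed, so not a negative sample
        samples[(src, dst)] = port_mode_label(src, dst, seen_src, seen_dst)
    return samples


augmented = negative_samples([(1, 80), (2, 80)], 5, random.Random(0))
```

For the pair from the text, `port_mode_label(1, 10, {1, 2}, {80})` gives `(1, 0, 0, 1)`: the source port was observed and stays fixed, while the unseen destination port is labeled random.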
The training process of the port pattern recognition model in the embodiments of the present application includes: traversing feature_3_1 item by item and extracting the source port and destination port as input data 1; using the source port, destination port, and time period information of all of feature_1_1 as input data 2; using the label in feature_3_1 as the output layer label; and then training. Each full traversal of feature_3_1 is one epoch, and the number of epochs can be set to 200 or specified as needed. When training is complete, the port pattern recognition model is obtained.
The flow anomaly detection model is then trained. The specific steps include: traversing each flow feature in feature_2_1, extracting the corresponding source port and destination port information from feature_1_1, and feeding it into the port pattern recognition model to obtain the port pattern; converting the source port and destination port features according to the port pattern, using the conversion described above, to obtain the new feature new_port_feature; and splicing feature_2_1 with new_port_feature and feeding the result into the flow anomaly detection model for training. The model encodes the input data and then decodes it to restore it, and the reconstruction error between the original input and the restored data is used as the anomaly value. Each full traversal of feature_2_1 is one epoch, and the number of epochs can be specified as needed. When training is complete, the largest anomaly value in the training set is taken as the threshold for anomaly events, and the flow anomaly detection model is obtained.
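The threshold rule at the end of training is simple enough to show directly. This sketch assumes the per-flow anomaly values have already been computed by the trained model; the function names and the sample scores are made up for illustration.

```python
from typing import List, Sequence


def calibrate_threshold(training_scores: Sequence[float]) -> float:
    """Use the largest anomaly value seen on the training set as the threshold."""
    if not training_scores:
        raise ValueError("at least one training score is required")
    return max(training_scores)


def flag_anomalies(scores: Sequence[float], threshold: float) -> List[bool]:
    """A flow is anomalous when its anomaly value exceeds the threshold."""
    return [score > threshold for score in scores]


threshold = calibrate_threshold([0.02, 0.05, 0.01])
flags = flag_anomalies([0.03, 0.05, 0.40], threshold)
```

One consequence of this choice is that no training flow is ever flagged, so the training traffic is implicitly treated as entirely normal.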
It should be noted that the network traffic data used in the present application to train the port pattern recognition model and the network traffic data used to train the flow anomaly detection model may be the same or different, and the present application imposes no specific restriction. For example, richer network traffic data may be used for the port pattern recognition model so that it trains better, which in turn improves the detection accuracy of the flow anomaly detection model.
It should be noted that the network traffic detection method provided in the embodiments of the present application may be executed by a network traffic detection device, or by a control module in the network traffic detection device that is used to execute the network traffic detection method. In the embodiments of the present application, the network traffic detection device executing the network traffic detection method is taken as an example to describe the network traffic detection device provided in the embodiments of the present application.
FIG. 6 is a schematic structural diagram of a network traffic detection device provided in an embodiment of the present application. As shown in FIG. 6, the network traffic detection device 600 includes a first determination module 610, a second determination module 620, and a detection module 630.
The first determination module 610 is configured to determine target flow information of target network traffic; the second determination module 620 is configured to determine a target port of the target network traffic according to the target flow information, wherein the target port includes a source port and a destination port; and the detection module 630 is configured to determine a flow anomaly event according to the features of the target port and the target flow features corresponding to the target flow information.
In an optional implementation, the detection module 630 is configured to: input the features of the target port into a port pattern recognition model to obtain the pattern of the target port, wherein the port pattern recognition model is used to model the correspondence between the features of multiple ports and multiple port patterns, and the pattern of the target port represents the combined features of the target port and the attributes of the target port; convert the features of the target port according to the pattern of the target port and the features of the target port to obtain converted features of the target port; and determine the flow anomaly event according to the converted features of the target port and the target flow features.
In an optional implementation, the detection module 630 is configured to: input the features of the target port into a first network layer of the port pattern recognition model; input the features of multiple ports corresponding to multiple pieces of historical flow information, together with multiple pieces of time period information, into a second network layer of the port pattern recognition model; and add the output of the first network layer to the output of the second network layer, input the sum into a third network layer of the port pattern recognition model, and output the pattern of the target port; wherein the output dimension of the first network layer is the same as the output dimension of the second network layer.
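The additive fusion of the first and second network layers can be illustrated with a small sketch. Everything concrete here is an assumption made for illustration: the layers are plain bias-free linear maps, the weights are hand-picked rather than trained, and the third layer is read as producing one score per pattern ("fixed" or "random").

```python
from typing import List

Matrix = List[List[float]]


def linear(x: List[float], weights: Matrix) -> List[float]:
    """Bias-free linear layer: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def port_pattern(target_feat: List[float], history_feat: List[float],
                 w1: Matrix, w2: Matrix, w3: Matrix) -> str:
    h1 = linear(target_feat, w1)    # first network layer: target port features
    h2 = linear(history_feat, w2)   # second network layer: historical ports and time periods
    if len(h1) != len(h2):
        # Element-wise addition requires equal output dimensions.
        raise ValueError("first and second layer outputs must have the same dimension")
    fused = [a + b for a, b in zip(h1, h2)]
    scores = linear(fused, w3)      # third network layer: one score per pattern
    return "fixed" if scores[0] >= scores[1] else "random"


# Hand-picked illustrative weights (not trained values).
w1 = [[1.0, 0.0], [0.0, 1.0]]
w2 = [[1.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
w3 = [[1.0, 0.0], [0.0, 1.0]]
pattern = port_pattern([1.0, 0.0], [0.5, 0.5, 1.0], w1, w2, w3)
```

The dimension check reflects the stated constraint that the two layers share an output dimension; without it the element-wise sum would silently truncate.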
In an optional implementation, converting the features of the target port according to the pattern of the target port and the features of the target port includes one of the following: when the pattern of the target port is fixed, retaining the original value of the features of the target port; when the pattern of the target port is random, converting the features of the target port to a preset generalized value.
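The conversion rule can be written as a short function. This is a sketch only: the pattern labels are represented as the strings "fixed" and "random", and the generalized value `-1` is a hypothetical placeholder, since the text does not state which preset value is used.

```python
GENERALIZED_PORT = -1  # hypothetical preset generalized value


def convert_port_feature(port: int, pattern: str,
                         generalized: int = GENERALIZED_PORT) -> int:
    """Keep a fixed-pattern port as-is; replace a random-pattern port with the preset value."""
    if pattern == "fixed":
        return port
    if pattern == "random":
        return generalized
    raise ValueError(f"unknown port pattern: {pattern!r}")


# A well-known service port keeps its value; an ephemeral client port is generalized.
new_port_feature = [
    convert_port_feature(51234, "random"),  # source port
    convert_port_feature(443, "fixed"),     # destination port
]
```

Generalizing random ports keeps arbitrary ephemeral port numbers from being treated as meaningful by the downstream model.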
In an optional implementation, the detection module 630 is configured to: input the features of the target port and the target flow features into a flow anomaly detection model to obtain the flow anomaly event, wherein the flow anomaly detection model includes an autoencoder (AE) or a variational autoencoder (VAE).
In an optional implementation, the detection module 630 is configured to: concatenate the features of the target port and the target flow features to obtain a concatenated value; encode the concatenated value and then restore it to obtain a restored value; compare the concatenated value with the restored value to obtain an outlier value; and when the outlier value is greater than a threshold, determine the target network traffic to be the flow anomaly event.
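The four detection steps (concatenate, encode and restore, compare, apply the threshold) can be sketched as below. The comparison is assumed to be a mean squared difference, and `toy_encode` / `toy_decode` are hypothetical stand-ins for a trained AE or VAE; neither choice comes from the application.

```python
from typing import Callable, List, Tuple


def detect_flow_anomaly(port_features: List[float], flow_features: List[float],
                        encode: Callable[[List[float]], List[float]],
                        decode: Callable[[List[float]], List[float]],
                        threshold: float) -> Tuple[bool, float]:
    """Return (is_anomalous, outlier_value) for one flow."""
    spliced = list(port_features) + list(flow_features)  # concatenated value
    restored = decode(encode(spliced))                    # restored value
    # Assumed comparison: mean squared difference between the two vectors.
    outlier = sum((s - r) ** 2 for s, r in zip(spliced, restored)) / len(spliced)
    return outlier > threshold, outlier


def toy_encode(v: List[float]) -> List[float]:
    """Hypothetical encoder: keeps only the first two components."""
    return v[:2]


def toy_decode(z: List[float]) -> List[float]:
    """Hypothetical decoder: pads the code back to length four with zeros."""
    return list(z) + [0.0, 0.0]


normal = detect_flow_anomaly([0.5, 0.25], [0.0, 0.0], toy_encode, toy_decode, 1.0)
abnormal = detect_flow_anomaly([0.5, 0.25], [3.0, 4.0], toy_encode, toy_decode, 1.0)
```

In this toy setup the first flow is reconstructed exactly and passes, while the second cannot be reconstructed from the two-value code and exceeds the threshold.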
In an optional implementation, the device 600 further includes an encoding module configured to: determine a first number for each first port, wherein the first ports include at least one port corresponding to the multiple pieces of historical flow information; sort the multiple first numbers to determine a first ranking value corresponding to each first port; and encode the features of the first port according to the first ranking value. The detection module 630 is configured to input the encoded features of the multiple ports and the multiple pieces of time period information into the second network layer of the port pattern recognition model.
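One possible reading of this encoding step is a frequency-rank encoding, sketched below. The interpretation is an assumption: the "first number" is taken to be how often a port appears in the historical flows, ranks start at 0 for the most frequent port, and ties are broken by port number so the result is deterministic.

```python
from collections import Counter
from typing import Dict, List


def rank_encode_ports(historical_ports: List[int]) -> Dict[int, int]:
    """Map each port to its rank by occurrence count (0 = most frequent)."""
    counts = Counter(historical_ports)
    # Sort by descending count; break ties by ascending port number (assumption).
    ordered = sorted(counts, key=lambda port: (-counts[port], port))
    return {port: rank for rank, port in enumerate(ordered)}


# Ports observed in hypothetical historical flow records.
history = [443, 80, 443, 22, 80, 443]
encoding = rank_encode_ports(history)
encoded_history = [encoding[p] for p in history]
```

Rank encoding replaces raw port numbers, whose numeric magnitude carries little meaning, with small integers ordered by how common each port is.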
In an optional implementation, the device 600 further includes a preprocessing module configured to: perform dimension conversion on the features of the target port so that the dimension of the features of the target port reaches a target dimension.
In the network traffic detection device provided in the embodiments of the present application, the first determination module determines the target flow information of the target network traffic; the second determination module determines the target port of the target network traffic according to the target flow information, wherein the target port includes a source port and a destination port; and the detection module determines a flow anomaly event according to the features of the target port and the target flow features corresponding to the target flow information. This enables anomaly detection on end-to-end communication traffic in the network with high detection accuracy.
The network traffic detection device provided in the embodiments of the present application can implement each process implemented by the network traffic detection method embodiments described in FIG. 1 and FIG. 2 and achieve the same technical effect. To avoid repetition, details are not repeated here.
The network traffic detection device in the embodiments of the present application may be a standalone device, or a component, integrated circuit, or chip in a terminal device. The device may be a mobile electronic device or a non-mobile electronic device. By way of example, the mobile electronic device may be a mobile phone, a tablet computer, a laptop computer, a palmtop computer, an in-vehicle electronic device, a wearable device, an ultra-mobile personal computer (UMPC), a netbook, or a personal digital assistant (PDA), and the non-mobile electronic device may be a server, a network attached storage (NAS) device, a personal computer (PC), a television (TV), a teller machine, or a self-service machine, which is not specifically limited in the embodiments of the present application.
The network traffic detection device in the embodiments of the present application may be a device having an operating system. The operating system may be the Android operating system, the iOS operating system, or another possible operating system, which is not specifically limited in the embodiments of the present application.
Optionally, as shown in FIG. 7, an embodiment of the present application further provides an electronic device 700, including a processor 701, a memory 702, and a program or instructions stored in the memory 702 and executable on the processor 701. When executed by the processor 701, the program or instructions implement the network traffic detection method described in at least one of the embodiments of FIG. 1 and FIG. 2. It should be noted that the electronic device in the embodiments of the present application includes a server, a terminal device, or a device other than a terminal device.
The above electronic device structure does not constitute a limitation on the electronic device. The electronic device may include more or fewer components than shown, combine certain components, or arrange the components differently. For example, the input unit may include a graphics processing unit (GPU) and a microphone, and the display unit may include a display panel configured in the form of a liquid crystal display, an organic light-emitting diode, or the like. The user input unit includes at least one of a touch panel and other input devices. The touch panel is also called a touch screen. Other input devices may include, but are not limited to, a physical keyboard, function keys (such as volume control keys and switch keys), a trackball, a mouse, and a joystick, which are not described in detail here.
The memory can be used to store software programs and various data. The memory may mainly include a first storage area for storing programs or instructions and a second storage area for storing data, wherein the first storage area may store an operating system and the application programs or instructions required for at least one function (such as a sound playback function or an image playback function). In addition, the memory may include volatile memory or non-volatile memory, or both. The non-volatile memory may be read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), or flash memory. The volatile memory may be random access memory (RAM), static random access memory (SRAM), dynamic random access memory (DRAM), synchronous dynamic random access memory (SDRAM), double data rate synchronous dynamic random access memory (DDR SDRAM), enhanced synchronous dynamic random access memory (ESDRAM), synchlink dynamic random access memory (SLDRAM), or direct Rambus random access memory (DRRAM).
The processor may include one or more processing units. Optionally, the processor integrates an application processor and a modem processor, wherein the application processor mainly handles operations involving the operating system, the user interface, and application programs, and the modem processor, such as a baseband processor, mainly handles communication signals. It can be understood that the modem processor may also not be integrated into the processor.
An embodiment of the present application further provides a readable storage medium storing a program or instructions. When executed by a processor, the program or instructions implement the network traffic detection method described in at least one of the embodiments of FIG. 1 and FIG. 2 and achieve the same technical effect. To avoid repetition, details are not repeated here.
The processor is the processor in the electronic device described in the above embodiment. The readable storage medium includes a computer-readable storage medium, such as a computer read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
An embodiment of the present application further provides a chip. The chip includes a processor and a communication interface coupled to the processor, and the processor is configured to run a program or instructions to implement each process of the above network traffic detection method embodiments and achieve the same technical effect. To avoid repetition, details are not repeated here.
It should be understood that the chip mentioned in the embodiments of the present application may also be called a system-level chip, a system chip, a chip system, or a system-on-chip.
It should be noted that, in this document, the terms "comprise", "include", and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element. In addition, it should be pointed out that the scope of the methods and devices in the embodiments of the present application is not limited to performing functions in the order shown or discussed; it may also include performing functions substantially simultaneously or in the reverse order, depending on the functions involved. For example, the described methods may be performed in an order different from that described, and various steps may be added, omitted, or combined. In addition, features described with reference to certain examples may be combined in other examples.
From the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, and can also be implemented by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present application, in essence or in the part that contributes to the prior art, can be embodied in the form of a computer software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disk) and includes a number of instructions that cause a terminal (which may be a mobile phone, a computer, a server, a network device, or the like) to execute the methods described in the embodiments of the present application.
The embodiments of the present application have been described above with reference to the accompanying drawings, but the present application is not limited to the specific implementations described above, which are merely illustrative rather than restrictive. Guided by the present application, those of ordinary skill in the art can devise many further variations without departing from the purpose of the present application and the scope protected by the claims, all of which fall within the protection of the present application.

Claims (10)

  1. A network traffic detection method, the method comprising:
    determining target flow information of target network traffic;
    determining a target port of the target network traffic according to the target flow information, wherein the target port includes a source port and a destination port;
    determining a flow anomaly event according to features of the target port and target flow features corresponding to the target flow information.
  2. The method according to claim 1, wherein determining the flow anomaly event according to the features of the target port and the target flow features corresponding to the target flow information comprises:
    inputting the features of the target port into a port pattern recognition model to obtain a pattern of the target port, wherein the port pattern recognition model is used to model the correspondence between features of ports and patterns of ports, and the pattern of the target port represents the combined features of the target port and attributes of the target port;
    converting the features of the target port according to the pattern of the target port and the features of the target port to obtain converted features of the target port;
    determining the flow anomaly event according to the converted features of the target port and the target flow features.
  3. The method according to claim 2, wherein inputting the features of the target port into the port pattern recognition model to obtain the pattern of the target port comprises:
    inputting the features of the target port into a first network layer of the port pattern recognition model;
    inputting features of multiple ports corresponding to multiple pieces of historical flow information, and multiple pieces of time period information, into a second network layer of the port pattern recognition model;
    adding the output of the first network layer to the output of the second network layer, inputting the sum into a third network layer of the port pattern recognition model, and outputting the pattern of the target port;
    wherein the output dimension of the first network layer is the same as the output dimension of the second network layer.
  4. The method according to claim 2, wherein converting the features of the target port according to the pattern of the target port and the features of the target port comprises one of the following:
    when the pattern of the target port is fixed, retaining the original value of the features of the target port;
    when the pattern of the target port is random, converting the features of the target port to a preset generalized value.
  5. The method according to claim 1, wherein determining the flow anomaly event according to the features of the target port and the target flow features comprises:
    inputting the features of the target port and the target flow features into a flow anomaly detection model to obtain the flow anomaly event, wherein the flow anomaly detection model includes an autoencoder (AE) or a variational autoencoder (VAE).
  6. The method according to claim 1, wherein determining the flow anomaly event according to the features of the target port and the target flow features comprises:
    concatenating the features of the target port and the target flow features to obtain a concatenated value;
    encoding the concatenated value and then restoring it to obtain a restored value;
    comparing the concatenated value with the restored value to obtain an outlier value;
    when the outlier value is greater than a threshold, determining the target network traffic to be the flow anomaly event.
  7. The method according to claim 3, wherein, before inputting the features of the multiple ports corresponding to the multiple pieces of historical flow information and the multiple pieces of time period information into the second network layer of the port pattern recognition model, the method further comprises:
    determining a first number for each first port, wherein the first ports include at least one port corresponding to the multiple pieces of historical flow information;
    sorting the multiple first numbers to determine a first ranking value corresponding to each first port;
    encoding the features of the first port according to the first ranking value;
    wherein inputting the features of the multiple ports corresponding to the multiple pieces of historical flow information and the multiple pieces of time period information into the second network layer of the port pattern recognition model comprises:
    inputting the encoded features of the multiple ports and the multiple pieces of time period information into the second network layer of the port pattern recognition model.
  8. The method according to claim 1, wherein, before determining the flow anomaly event according to the features of the target port and the target flow features corresponding to the target flow information, the method further comprises:
    performing dimension conversion on the features of the target port so that the dimension of the features of the target port reaches a target dimension.
  9. An electronic device, comprising a processor, a memory, and a program or instructions stored in the memory and executable on the processor, wherein the program or instructions, when executed by the processor, implement the steps of the network traffic detection method according to any one of claims 1 to 8.
  10. A readable storage medium storing a program or instructions, wherein the program or instructions, when executed by a processor, implement the steps of the network traffic detection method according to any one of claims 1 to 8.
PCT/CN2023/105206 2022-11-23 2023-06-30 Network traffic inspection method, electronic device, and storage medium WO2024109083A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211476183.5A CN118074930A (en) 2022-11-23 2022-11-23 Network traffic detection method, electronic equipment and storage medium
CN202211476183.5 2022-11-23

Publications (1)

Publication Number Publication Date
WO2024109083A1 true WO2024109083A1 (en) 2024-05-30

Family

ID=91094289

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/105206 WO2024109083A1 (en) 2022-11-23 2023-06-30 Network traffic inspection method, electronic device, and storage medium

Country Status (2)

Country Link
CN (1) CN118074930A (en)
WO (1) WO2024109083A1 (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101510841A (en) * 2008-12-31 2009-08-19 成都市华为赛门铁克科技有限公司 Method and system for recognizing end-to-end flux
US20140059216A1 (en) * 2012-08-27 2014-02-27 Damballa, Inc. Methods and systems for network flow analysis
CN109587008A (en) * 2018-12-28 2019-04-05 华为技术服务有限公司 Detect the method, apparatus and storage medium of abnormal flow data
CN112153044A (en) * 2020-09-23 2020-12-29 腾讯科技(深圳)有限公司 Flow data detection method and related equipment
CN113079143A (en) * 2021-03-24 2021-07-06 北京锐驰信安技术有限公司 Flow data-based anomaly detection method and system
CN114338109A (en) * 2021-12-17 2022-04-12 北京安天网络安全技术有限公司 Flow detection method and device, electronic equipment and computer readable storage medium

Also Published As

Publication number Publication date
CN118074930A (en) 2024-05-24
