WO2024065185A1 - Device classification method and apparatus, electronic device, and computer-readable storage medium - Google Patents

Device classification method and apparatus, electronic device, and computer-readable storage medium

Info

Publication number
WO2024065185A1
Authority
WO
WIPO (PCT)
Prior art keywords
feature group
network traffic
classification result
classification
destination device
Prior art date
Application number
PCT/CN2022/121762
Other languages
French (fr)
Chinese (zh)
Inventor
宋杰
刁海洋
Original Assignee
西门子股份公司
西门子(中国)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 西门子股份公司, 西门子(中国)有限公司 filed Critical 西门子股份公司
Priority to PCT/CN2022/121762 priority Critical patent/WO2024065185A1/en
Publication of WO2024065185A1 publication Critical patent/WO2024065185A1/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks

Definitions

  • the present invention relates to the field of data processing technology, and in particular to a device classification method, an apparatus, an electronic device and a computer-readable storage medium.
  • IT Information technology
  • OT operational technology
  • the embodiments of the present invention provide a device classification method, an apparatus, an electronic device, and a computer-readable storage medium.
  • a device classification method comprising:
  • the first feature group includes a network address of a device and a common flag (Common Flag) in the network traffic associated with the network address
  • the second feature group includes a network address of a destination device of the network traffic and a Transmission Control Protocol (TCP) port status of the destination device
  • the third feature group includes a network address of a destination device of the network traffic and a User Datagram Protocol (UDP) port status of the destination device
  • a classification result of the device is determined.
  • the first feature group characterizing the common attributes of the network traffic and the second feature group and the third feature group characterizing the specific attributes of the network traffic are comprehensively considered in a weighted manner to achieve accurate device classification.
  • the device includes OT equipment and/or IT equipment;
  • the obtaining of network traffic between devices comprises: obtaining, via a monitoring port on a switch connected to the device, mirror traffic of network traffic flowing through the switch within a predetermined time;
  • the common flag includes at least one of the following:
  • Time to Live TTL
  • Window Size WinSize
  • Do not Fragment DF
  • Maximum Segment Size MSS
  • Window Scaling Factor WinScale
  • the weight of the second feature group is greater than the weight of the third feature group, and the weight of the third feature group is greater than the weight of the first feature group, wherein the respective weights of the first feature group, the second feature group and the third feature group are adjustable.
  • it also includes:
  • the status information of the TCP port represents the switch status of each TCP port of the destination device
  • the status information of the UDP port represents the switch status of each UDP port of the destination device
  • a device classification apparatus comprising:
  • An acquisition module is configured to acquire network traffic between devices
  • the extraction module is configured to extract a first feature group, a second feature group and a third feature group of the network traffic, wherein the first feature group includes a network address of a device and a common flag in the network traffic associated with the network address; the second feature group includes a network address of a destination device of the network traffic and a TCP port status of the destination device; the third feature group includes a network address of a destination device of the network traffic and a UDP port status of the destination device;
  • a first determining module is configured to determine a union feature group of the first feature group, the second feature group and the third feature group based on respective weights of the first feature group, the second feature group and the third feature group;
  • the second determination module is configured to determine the classification result of the device based on the clustering result of the union feature group.
  • the first feature group characterizing the common attributes of the network traffic and the second feature group and the third feature group characterizing the specific attributes of the network traffic are comprehensively considered in a weighted manner to achieve accurate device classification.
  • the device includes OT equipment and/or IT equipment;
  • the acquisition module is configured to acquire the mirrored traffic of the network traffic within a predetermined time via a monitoring port of a switch connected to the device;
  • the common flag includes at least one of the following:
  • the weight of the second feature group is greater than the weight of the third feature group, and the weight of the third feature group is greater than the weight of the first feature group, wherein the respective weights of the first feature group, the second feature group and the third feature group are adjustable.
  • it also includes:
  • the adjustment module is configured to perform iteration when the overlap between the classification result and the predetermined target classification result is less than a predetermined threshold value, until the overlap between the classification result and the target classification result is greater than or equal to the threshold value, wherein the iteration includes: adjusting at least one of the respective weights of the first feature group, the second feature group and the third feature group; determining an adjusted union feature group of the first feature group, the second feature group and the third feature group based on the respective adjusted weights of the first feature group, the second feature group and the third feature group; determining an adjusted classification result based on a clustering result of the adjusted union feature group; and comparing the overlap between the adjusted classification result and the target classification result.
  • the TCP port status information represents the switch status of each TCP port of the destination device
  • the UDP port status information represents the switch status of each UDP port of the destination device
  • An electronic device comprising:
  • a memory configured to store executable instructions of the processor
  • the processor is used to read the executable instructions from the memory and execute the executable instructions to implement the device classification method as described in any one of the above items.
  • a computer-readable storage medium stores computer instructions, wherein the computer instructions, when executed by a processor, implement any of the above device classification methods.
  • a computer program product comprises a computer program, wherein when the computer program is executed by a processor, the device classification method as described in any one of the above items is implemented.
  • FIG. 1 is a flow chart of a device classification method according to an embodiment of the present invention.
  • FIG. 2 is a schematic diagram of obtaining network traffic according to an embodiment of the present invention.
  • FIG. 3 is a schematic diagram of parsing network traffic according to an embodiment of the present invention.
  • FIG. 4 is an exemplary flow chart of a device classification method according to an embodiment of the present invention.
  • FIG. 5 is a block diagram of a device classification apparatus according to an embodiment of the present invention.
  • FIG. 6 is an exemplary structural diagram of an electronic device according to an embodiment of the present invention.
  • FIG. 1 is a flow chart of a device classification method according to an embodiment of the present invention.
  • the method of FIG. 1 is applicable to both IT systems that only include IT devices and OT systems that only include OT devices. Considering the number and complexity of devices in an IT/OT fusion system, the method of FIG. 1 is particularly suitable for OT systems and IT/OT fusion systems.
  • the method includes:
  • Step 101: Obtain network traffic between devices.
  • the device may be an IT device in an IT system, an OT device in an OT system, an IT device in an IT/OT fusion system, or an OT device in an IT/OT fusion system.
  • the protocols for transmitting network traffic between devices can include: transmission protocols and communication protocols.
  • transmission protocols are generally responsible for networking and communication between devices in a subnet;
  • communication protocols are mainly device communication protocols running on TCP/IP protocols, responsible for data exchange and communication between devices through the Internet.
  • protocols for transmitting network traffic may include: Representational State Transfer (REST)/Hyper Text Transfer Protocol (HTTP), Constrained Application Protocol (CoAP), Message Queuing Telemetry Transport (MQTT) protocol, Data Distribution Service for Real-Time Systems (DDS) protocol, Advanced Message Queuing Protocol (AMQP), Extensible Messaging and Presence Protocol (XMPP), Java Message Service (JMS) protocol, and so on.
  • REST Representational State Transfer
  • HTTP Hyper Text Transfer Protocol
  • CoAP Constrained Application Protocol
  • MQTT Message Queuing Telemetry Transport
  • DDS Data Distribution Service for Real-Time Systems
  • AMQP Advanced Message Queuing Protocol
  • XMPP Extensible Messaging and Presence Protocol
  • JMS Java Message Service
  • FIG. 2 is a schematic diagram of obtaining network traffic according to an embodiment of the present invention.
  • the IT system 16 includes multiple IT devices, and the first OT system 14 and the second OT system 15 each include multiple OT devices.
  • the IT system 16, the first OT system 14, and the second OT system 15 are each connected to the Internet 10 via a switch 11.
  • the switch 11 is also connected to a router 12 and a firewall 13, respectively.
  • By setting a mirrored traffic port on the switch 11, the network traffic between all devices in the IT system 16, the first OT system 14, and the second OT system 15 can be obtained.
  • the device includes IT equipment and/or OT equipment; step 101 specifically includes: obtaining, via a monitoring port on a switch connected to the device, mirror traffic of network traffic flowing through the switch within a predetermined time (preferably, the predetermined time is long enough to ensure that the devices have completed their communication). Obtaining the mirror traffic on the switch therefore does not interfere with the communication between the devices. The common flag can be implemented in multiple ways.
  • Step 102: Extract the first feature group, the second feature group and the third feature group of the network traffic, wherein the first feature group includes the network address of the device and the common flag in the network traffic associated with the network address; the second feature group includes the network address of the destination device of the network traffic, and the TCP port status of the destination device; the third feature group includes the network address of the destination device of the network traffic, and the UDP port status of the destination device.
  • FIG. 3 is a schematic diagram of parsing network traffic according to an embodiment of the present invention.
  • In the network traffic parsing process 24, the network traffic is first obtained from the switch 20 using the sniffer service 21; then, a deterministic finite automaton (DFA) service 22, such as a tshark service, is executed on the traffic to extract header attributes and payload attributes from the network traffic messages, thereby obtaining attribute data 23.
  • DFA deterministic finite automaton
  • common flags representing common attributes of a device's network traffic (e.g., with the device as the source device and/or destination device) can be extracted from the network traffic header.
  • the TCP port status and UDP port status of the message's destination device can also be extracted from the network traffic header.
  • the common flag includes at least one of the following:
  • TTL Time To Live
  • TTL specifies the maximum number of network segments that an IP packet is allowed to pass through before it is discarded by a router.
  • TTL is an 8-bit field located in the 9th byte of the IPv4 packet header.
  • the TCP header contains a window size field, which refers to the window of the receiving end, that is, the receive window; it informs the sending end of the amount of data the receiver can accept, thereby achieving flow control.
  • MSS is an option of the TCP protocol. It is used by the sender and receiver to negotiate, when the TCP connection is established, the maximum data length that each segment can carry during communication (excluding the segment header).
  • WinScale is located in the Options field of the TCP packet header and represents the multiple by which the window can be enlarged.
  • Table 1 is a typical schematic table of the first feature group.
  • Table 1 contains the correspondence between the network address of each device and the common flags that capture the common attributes of that device's network traffic.
  • the state information of the TCP port represents the switch state of each TCP port of the destination device (that is, whether the port is open or closed)
  • the state information of the UDP port represents the switch state of each UDP port of the destination device (that is, whether the port is open or closed). For example, when the port is open, the corresponding state value is 1; when the port is closed, the corresponding state value is 0.
  • Table 2 is a typical schematic table of the second feature group.
  • the status of each TCP port of a destination device can be determined from all network traffic sent to that destination device. For example, when all network traffic contains traffic with a destination address of 192.168.0.1 and a destination port of TCP#1, the status value of the TCP#1 port of the device with IP address 192.168.0.1 is 1; when all network traffic contains no traffic with a destination address of 192.168.0.2 and a destination port of TCP#2, the status value of the TCP#2 port of the device with IP address 192.168.0.2 is 0. Similarly, for the destination devices corresponding to the remaining IP addresses, the status values of all TCP ports can be parsed, thereby forming Table 2.
  • Table 3 is a typical schematic table of the third feature group.
  • the status of each UDP port of a destination device can be determined from all network traffic sent to that destination device. For example, when all network traffic contains traffic with a destination address of 192.168.0.1 and a destination port of UDP#1, the status value of the UDP#1 port of the device with IP address 192.168.0.1 is 1; when all network traffic contains no traffic with a destination address of 192.168.0.2 and a destination port of UDP#2, the status value of the UDP#2 port of the device with IP address 192.168.0.2 is 0. Similarly, for the destination devices corresponding to the remaining IP addresses, the status values of all UDP ports can be parsed, thereby forming Table 3.
  • Table 2 and Table 3 respectively include the switch status of the TCP port and UDP port of the device, that is, they include the specific attributes of the destination device to which the network traffic is directed.
  • Step 103: Determine a union feature group of the first feature group, the second feature group and the third feature group based on respective weights of the first feature group, the second feature group and the third feature group.
  • the union feature group means merging the first feature group, the second feature group and the third feature group.
  • the first feature group, the second feature group, and the third feature group can be represented in matrix form respectively, and then the three matrices are merged to obtain the matrix of the union feature group.
  • when merging, the coefficients of each feature group's matrix are multiplied by that group's own weight.
  • the matrix of the first feature group is:
  • the matrix of the second feature group is:
  • the matrix of the third feature group is:
  • Weight 0 is the weight of the first feature group
  • weight 1 is the weight of the second feature group
  • weight 2 is the weight of the third feature group.
  • the union feature group has the following expression:
  • the weight of the second feature group is greater than the weight of the third feature group, and the weight of the third feature group is greater than the weight of the first feature group, wherein the weights of the first feature group, the second feature group, and the third feature group are all adjustable. It can be seen that by adjusting the weights of the first feature group, the second feature group, and the third feature group, the accuracy of grouping can be improved.
  • Step 104: Determine the classification result of the device based on the clustering result of the union feature group.
  • Cluster analysis is based on similarity, and there are more similarities between patterns in a cluster than between patterns that are not in the same cluster.
  • There are many types of clustering algorithms that can be used to achieve classification.
  • the K-means clustering algorithm can be used.
  • In the K-means clustering algorithm, k data objects are first randomly selected as the initial cluster centers from the n data objects contained in the union feature group; each remaining data object is then assigned to the cluster (represented by its cluster center) to which it is most similar, according to its similarity (distance) to the cluster centers; the center of each resulting cluster (the mean of all objects in the cluster) is then recalculated; and this process is repeated until the standard measurement function converges.
  • it also includes: when the overlap between the classification result and the predetermined target classification result is less than a predetermined threshold, iterate until the overlap between the classification result and the target classification result is greater than or equal to the threshold, wherein the iteration includes: adjusting at least one of the respective weights of the first feature group, the second feature group and the third feature group; determining the adjusted union feature group of the first feature group, the second feature group and the third feature group based on the respective adjusted weights of the first feature group, the second feature group and the third feature group; determining the adjusted classification result based on the clustering result of the adjusted union feature group; comparing the overlap between the adjusted classification result and the target classification result.
  • the network traffic of a known type of device can be obtained, and then the network traffic of the known type of device can be classified based on the above method to obtain the calculated classification result, and the calculated classification result can be compared with the actual classification result of the device (that is, the predetermined target classification result).
  • When the overlap is less than the threshold, it is determined that the respective weights of the first feature group, the second feature group and the third feature group need to be adjusted, wherein the weight of the second feature group is increased first.
  • the classification of unknown devices can be performed, and the classification results can be mapped to actual device type definitions.
  • for example, the devices can be listed in a device number matrix;
  • the device classification matrix is shown in Figure 1, where PLC represents that the device is classified as a programmable logic controller and HMI represents that the device is classified as a human-machine interface device.
  • FIG. 4 is an exemplary flow chart of a device classification method according to an embodiment of the present invention.
  • an OT network is taken as an example for exemplary description.
  • the method includes:
  • Step 401: Perform data preparation.
  • traffic attributes can be extracted from OT network traffic.
  • passive monitoring is performed to sniff network traffic, and then attribute data is extracted from the network traffic as data to be used later.
  • Step 402: Perform feature identification and weight setting.
  • features including common flags, TCP port status, and UDP port status are identified
  • the features are grouped to form three feature groups, and each feature group is assigned a respective weight.
  • Step 403: Perform clustering processing.
  • a clustering algorithm is used to cluster the feature matrix after the three feature groups are combined.
  • Step 404: Determine the device category based on the clustering result.
  • FIG. 5 is a structural diagram of a device classification apparatus according to an embodiment of the present invention. As shown in FIG. 5, the device classification apparatus 500 includes:
  • the acquisition module 501 is configured to acquire network traffic between devices
  • the extraction module 502 is configured to extract a first feature group, a second feature group, and a third feature group of the network traffic, wherein the first feature group includes a network address of a device and a common flag in the network traffic associated with the network address; the second feature group includes a network address of a destination device of the network traffic and a transmission control protocol port state of the destination device; the third feature group includes a network address of a destination device of the network traffic and a user datagram protocol port state of the destination device;
  • a first determining module 503 is configured to determine a union feature group of the first feature group, the second feature group and the third feature group based on respective weights of the first feature group, the second feature group and the third feature group;
  • the second determination module 504 is configured to determine a classification result of the device based on the clustering result of the union feature group.
  • the device includes an OT device and/or an IT device; the acquisition module 501 is configured to obtain the mirror traffic of the network traffic within a predetermined time via the monitoring port of the switch connected to the device; wherein the common flag includes at least one of the following: time to live (TTL); window size; Do not Fragment (DF) bit; maximum segment size (MSS); window scaling factor.
  • the weight of the second feature group is greater than the weight of the third feature group, and the weight of the third feature group is greater than the weight of the first feature group, wherein the respective weights of the first feature group, the second feature group, and the third feature group are all adjustable.
  • an adjustment module 505 is further included, which is configured to perform iterations when the overlap between the classification result and the predetermined target classification result is less than a predetermined threshold value, until the overlap between the classification result and the target classification result is greater than or equal to the threshold value, wherein the iterations include: adjusting at least one of the respective weights of the first feature group, the second feature group and the third feature group; determining an adjusted union feature group of the first feature group, the second feature group and the third feature group based on the respective adjusted weights of the first feature group, the second feature group and the third feature group; determining an adjusted classification result based on the clustering result of the adjusted union feature group; and comparing the overlap between the adjusted classification result and the target classification result.
  • the TCP port status information represents the switch status of each TCP port of the destination device
  • the UDP port status information represents the switch status of each UDP port of the destination device
  • the embodiment of the present invention further provides an electronic device with a processor-memory architecture.
  • FIG. 6 is an exemplary structural diagram of an electronic device according to an embodiment of the present invention.
  • the electronic device 600 includes a processor 601, a memory 602, and a computer program stored in the memory 602 and executable on the processor 601.
  • the memory 602 can be specifically implemented as a variety of storage media such as an electrically erasable programmable read-only memory (EEPROM), a flash memory (Flash memory), and a programmable read-only memory (PROM).
  • the processor 601 can be implemented as including one or more central processing units or one or more field programmable gate arrays, wherein the field programmable gate array integrates one or more central processing unit cores.
  • the central processing unit or the central processing unit core can be implemented as a CPU or an MCU or a DSP, and so on.
  • a hardware module may include a specially designed permanent circuit or logic device (such as a dedicated processor, such as an FPGA or ASIC) to perform a specific operation.
  • the hardware module may also include a programmable logic device or circuit (such as a general-purpose processor or other programmable processor) temporarily configured by software to perform a specific operation.
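The clustering of Step 104 and the overlap comparison against a predetermined target classification can be sketched as follows. This is only an illustrative sketch: the function names `kmeans` and `overlap` are hypothetical, the initial centers are taken as the first k rows instead of being chosen at random (so that the example is reproducible), and the overlap measure (share of devices that carry the majority target label of their cluster) is an assumption, since the text does not define how overlap is computed.

```python
import math
from collections import Counter

def kmeans(points, k, max_iter=100):
    """Plain K-means on a list of equal-length numeric rows; returns one cluster label per row."""
    # Assumption: deterministic initialization with the first k rows
    # (the text describes random selection of the initial centers).
    centers = [list(p) for p in points[:k]]
    labels = []
    for _ in range(max_iter):
        # Assign every row to its nearest center (Euclidean distance).
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        # Recompute each center as the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[c])
        if new_centers == centers:  # centers stopped moving: converged
            break
        centers = new_centers
    return labels

def overlap(labels, target):
    """Hypothetical overlap measure: share of devices matching their cluster's majority target label."""
    matched = 0
    for cluster in set(labels):
        members = [t for lab, t in zip(labels, target) if lab == cluster]
        matched += Counter(members).most_common(1)[0][1]
    return matched / len(labels)

# Rows of an already weighted union feature matrix (made-up values).
union_rows = [[3.0, 0.0, 2.0], [0.0, 3.0, 0.0], [3.0, 0.0, 1.9], [0.0, 3.0, 0.1]]
labels = kmeans(union_rows, k=2)
score = overlap(labels, ["PLC", "HMI", "PLC", "HMI"])
```

If `score` fell below the chosen threshold, the weights would be adjusted (the text suggests increasing the weight of the second feature group first), the union matrix rebuilt, and the two functions run again.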

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

Embodiments of the present invention disclose a device classification method and apparatus, an electronic device, and a computer-readable storage medium. The method comprises: acquiring network traffic between devices; extracting a first feature set, a second feature set, and a third feature set of the network traffic, the first feature set comprising a network address of a device and a common flag in the network traffic associated with the network address, the second feature set comprising a network address of a destination device of the network traffic and a transmission control protocol (TCP) port state of the destination device, and the third feature set comprising the network address of the destination device of the network traffic and a user datagram protocol (UDP) port state of the destination device; determining a union feature set on the basis of respective weights of the first feature set, the second feature set, and the third feature set; and determining a classification result of the device on the basis of a clustering result of the union feature set. Accurate device classification can be implemented by combining common attributes and specific attributes of the network traffic; in addition, weighting can be adjusted, improving classification accuracy.

Description

Device classification method, apparatus, electronic device and computer-readable storage medium
Technical Field
The present invention relates to the field of data processing technology, and in particular to a device classification method, an apparatus, an electronic device and a computer-readable storage medium.
Background
Information technology (IT) and operational technology (OT) can be used to handle different aspects of an enterprise's technology infrastructure. The convergence of IT and OT is a key technology for the successful implementation of industrial IoT systems. However, this convergence is challenging: the two sides have significantly different priorities, system models, and terminology.
As information and communication technologies are increasingly integrated into OT systems, the need for automatic classification of devices (also called assets) is growing.
Summary of the Invention
The embodiments of the present invention provide a device classification method, an apparatus, an electronic device, and a computer-readable storage medium.
A device classification method, comprising:
acquiring network traffic between devices;
extracting a first feature group, a second feature group and a third feature group of the network traffic, wherein the first feature group includes a network address of a device and a common flag (Common Flag) in the network traffic associated with the network address; the second feature group includes a network address of a destination device of the network traffic and a Transmission Control Protocol (TCP) port status of the destination device; the third feature group includes a network address of a destination device of the network traffic and a User Datagram Protocol (UDP) port status of the destination device;
determining a union feature group of the first feature group, the second feature group and the third feature group based on the respective weights of the first feature group, the second feature group and the third feature group;
determining a classification result of the device based on the clustering result of the union feature group.
Therefore, the first feature group, which characterizes the common attributes of the network traffic, and the second and third feature groups, which characterize the specific attributes of the network traffic, are considered together in a weighted manner to achieve accurate device classification.
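The extraction step can be illustrated with a minimal sketch that turns simplified packet records into the three feature groups. Everything here is an assumption made for illustration: the record layout, the field names (`src`, `dst`, `proto`, `dport`, `ttl`, `win`), the helper names, and the choice to take a device's common flags from the first packet it sends.

```python
from collections import defaultdict

# Hypothetical, simplified packet records (not a real capture format).
packets = [
    {"src": "192.168.0.1", "dst": "192.168.0.2", "proto": "TCP", "dport": 102, "ttl": 64, "win": 8192},
    {"src": "192.168.0.2", "dst": "192.168.0.1", "proto": "TCP", "dport": 502, "ttl": 128, "win": 65535},
    {"src": "192.168.0.1", "dst": "192.168.0.3", "proto": "UDP", "dport": 161, "ttl": 64, "win": None},
]

def extract_feature_groups(packets):
    """Return (common flags per address, TCP ports seen per destination, UDP ports seen per destination)."""
    common = {}                    # first feature group
    tcp_ports = defaultdict(set)   # second feature group (open TCP ports)
    udp_ports = defaultdict(set)   # third feature group (open UDP ports)
    for p in packets:
        # Assumption: keep the flags of the first packet sent by each device.
        common.setdefault(p["src"], {"ttl": p["ttl"], "win": p["win"]})
        if p["proto"] == "TCP":
            tcp_ports[p["dst"]].add(p["dport"])
        elif p["proto"] == "UDP":
            udp_ports[p["dst"]].add(p["dport"])
    return common, dict(tcp_ports), dict(udp_ports)

def port_state_row(open_ports, all_ports):
    """Encode port status as 1 (traffic to the port was seen) or 0 (none seen)."""
    return [1 if port in open_ports else 0 for port in all_ports]

common, tcp, udp = extract_feature_groups(packets)
row = port_state_row(tcp.get("192.168.0.2", set()), [102, 502])
```

Here `row` encodes the TCP port status of 192.168.0.2 over the port list `[102, 502]`, in the same 1/0 form that the description uses for port state values.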
In one embodiment, the device includes OT equipment and/or IT equipment;
the acquiring of network traffic between devices comprises: obtaining, via a monitoring port on a switch connected to the device, mirror traffic of the network traffic flowing through the switch within a predetermined time;
wherein the common flag includes at least one of the following:
time to live (TTL); window size (WinSize); Do not Fragment (DF) bit; maximum segment size (Max Segment Size, MSS); window scaling factor (WinScale).
Therefore, capturing mirrored traffic on the switch does not interfere with communication between the devices, and the common flag can be implemented in multiple ways.
In one embodiment, the weight of the second feature group is greater than the weight of the third feature group, and the weight of the third feature group is greater than the weight of the first feature group, wherein the respective weights of the first feature group, the second feature group and the third feature group are all adjustable.
Therefore, because the TCP port status is of significant importance to device classification, it is given a higher weight, which improves classification accuracy.
In one embodiment, the method further includes:
when the degree of coincidence between the classification result and a predetermined target classification result is less than a predetermined threshold, performing iteration until the degree of coincidence between the classification result and the target classification result is greater than or equal to the threshold, wherein the iteration includes:
adjusting at least one of the respective weights of the first feature group, the second feature group and the third feature group;
determining an adjusted union feature group of the first feature group, the second feature group and the third feature group based on the respective adjusted weights of the first feature group, the second feature group and the third feature group;
determining an adjusted classification result based on a clustering result of the adjusted union feature group; and
comparing the degree of coincidence between the adjusted classification result and the target classification result.
Therefore, the weights are adjusted to reasonable values through iteration.
In one embodiment, the TCP port status information represents the open/closed state of each TCP port of the destination device, and the UDP port status information represents the open/closed state of each UDP port of the destination device.
A device classification apparatus, comprising:
an acquisition module, configured to acquire network traffic between devices;
an extraction module, configured to extract a first feature group, a second feature group and a third feature group of the network traffic, wherein the first feature group includes a network address of a device and a common flag in the network traffic associated with the network address; the second feature group includes a network address of a destination device of the network traffic and a TCP port status of the destination device; and the third feature group includes a network address of a destination device of the network traffic and a UDP port status of the destination device;
a first determination module, configured to determine a union feature group of the first feature group, the second feature group and the third feature group based on the respective weights of the first feature group, the second feature group and the third feature group; and
a second determination module, configured to determine a classification result of the device based on a clustering result of the union feature group.
Therefore, accurate device classification is achieved by considering, in a weighted manner, both the first feature group, which characterizes common attributes of the network traffic, and the second and third feature groups, which characterize specific attributes of the network traffic.
In one embodiment, the device includes an OT device and/or an IT device;
the acquisition module is configured to acquire, via a monitoring port of a switch connected to the device, mirrored traffic of the network traffic within a predetermined time;
wherein the common flag includes at least one of the following:
TTL; WinSize; DF bit; MSS; WinScale.
Therefore, acquiring mirrored traffic on the switch does not interfere with communication between the devices, and the common flag can be implemented in multiple ways.
In one embodiment, the weight of the second feature group is greater than the weight of the third feature group, and the weight of the third feature group is greater than the weight of the first feature group, wherein the respective weights of the first feature group, the second feature group and the third feature group are all adjustable.
Therefore, because the TCP port status is of significant importance to device classification, it is given a higher weight, which improves classification accuracy.
In one embodiment, the apparatus further includes:
an adjustment module, configured to perform iteration when the degree of coincidence between the classification result and a predetermined target classification result is less than a predetermined threshold, until the degree of coincidence between the classification result and the target classification result is greater than or equal to the threshold, wherein the iteration includes: adjusting at least one of the respective weights of the first feature group, the second feature group and the third feature group; determining an adjusted union feature group of the first feature group, the second feature group and the third feature group based on the respective adjusted weights of the first feature group, the second feature group and the third feature group; determining an adjusted classification result based on a clustering result of the adjusted union feature group; and comparing the degree of coincidence between the adjusted classification result and the target classification result.
Therefore, the weights are adjusted to reasonable values through iteration.
In one embodiment, the TCP port status information represents the open/closed state of each TCP port of the destination device, and the UDP port status information represents the open/closed state of each UDP port of the destination device.
An electronic device, comprising:
a processor; and
a memory, configured to store executable instructions of the processor;
wherein the processor is configured to read the executable instructions from the memory and execute the executable instructions to implement the device classification method according to any one of the above.
A computer-readable storage medium having computer instructions stored thereon, wherein the computer instructions, when executed by a processor, implement the device classification method according to any one of the above.
A computer program product, comprising a computer program, wherein the computer program, when executed by a processor, implements the device classification method according to any one of the above.
BRIEF DESCRIPTION OF THE DRAWINGS
Preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings, so that the above and other features and advantages of the present invention become clearer to those of ordinary skill in the art. In the drawings:
FIG. 1 is a flow chart of a device classification method according to an embodiment of the present invention.
FIG. 2 is a schematic diagram of acquiring network traffic according to an embodiment of the present invention.
FIG. 3 is a schematic diagram of parsing network traffic according to an embodiment of the present invention.
FIG. 4 is an exemplary flow chart of a device classification method according to an embodiment of the present invention.
FIG. 5 is a structural diagram of a device classification apparatus according to an embodiment of the present invention.
FIG. 6 is an exemplary structural diagram of an electronic device according to an embodiment of the present invention.
The reference numerals are as follows:
| Label | Meaning |
|---|---|
| 101~104 | Steps |
| 10 | Internet |
| 11 | Switch |
| 12 | Router |
| 13 | Firewall |
| 14 | First OT system |
| 15 | Second OT system |
| 16 | IT system |
| 20 | Switch |
| 21 | Sniffer service |
| 22 | DFA service |
| 23 | Attribute data |
| 24 | Network traffic parsing process |
| 401~405 | Steps |
| 500 | Device classification apparatus |
| 501 | Acquisition module |
| 502 | Extraction module |
| 503 | First determination module |
| 504 | Second determination module |
| 505 | Adjustment module |
| 600 | Electronic device |
| 601 | Processor |
| 602 | Memory |
DETAILED DESCRIPTION
In order to make the purpose, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below with reference to embodiments.
For brevity and clarity, the solution of the present invention is explained below by describing several representative embodiments. The many details in the embodiments are provided only to aid understanding of the solution. It is evident, however, that implementation of the technical solution of the present invention need not be limited to these details. To avoid unnecessarily obscuring the solution, some embodiments are not described in detail and only a framework is given. Hereinafter, "including" means "including but not limited to", and "according to..." means "at least according to..., but not limited to only according to...". Owing to Chinese language conventions, where the number of a component is not specified below, the component may be one or more, or may be understood as at least one.
As IT technology is increasingly integrated into OT systems, the demand for automatic device classification keeps growing. It should be noted that IT systems, OT systems and IT/OT fusion systems all need automatic device classification. In OT systems and IT/OT fusion systems in particular, many important attributes of OT devices cannot be obtained directly, so the demand for automatic device classification is especially strong and it is also harder to achieve.
FIG. 1 is a flow chart of a device classification method according to an embodiment of the present invention. The method of FIG. 1 is applicable both to IT systems that contain only IT devices and to OT systems that contain only OT devices. Given the number and variety of devices in an IT/OT fusion system, the method of FIG. 1 is particularly suitable for OT systems and IT/OT fusion systems.
As shown in FIG. 1, the method includes:
Step 101: Acquire network traffic between devices.
Here, the device may be an IT device in an IT system, an OT device in an OT system, an IT device in an IT/OT fusion system, or an OT device in an IT/OT fusion system.
The protocols used to transmit network traffic between devices may include transmission protocols and communication protocols. Transmission protocols are generally responsible for networking and communication between devices within a subnet. Communication protocols are mainly device communication protocols that run on top of TCP/IP and are responsible for data exchange and communication between devices over the Internet.
For example, the protocols for transmitting network traffic may include: Representational State Transfer (REST)/Hypertext Transfer Protocol (HTTP), Constrained Application Protocol (CoAP), Message Queuing Telemetry Transport (MQTT) protocol, Data Distribution Service for Real-Time Systems (DDS) protocol, Advanced Message Queuing Protocol (AMQP), Extensible Messaging and Presence Protocol (XMPP), Java Message Service (JMS) protocol, and so on.
Specific examples of protocols for transmitting network traffic are described above. Those skilled in the art will appreciate that this description is merely exemplary and is not intended to limit the scope of protection of the embodiments of the present invention.
FIG. 2 is a schematic diagram of acquiring network traffic according to an embodiment of the present invention. In FIG. 2, the IT system 16 includes multiple IT devices, and the first OT system 14 and the second OT system 15 each include multiple OT devices. The IT system 16, the first OT system 14 and the second OT system 15 are each connected to the Internet 10 via the switch 11. The switch 11 is also connected to the router 12 and the firewall 13. Network traffic flows between the IT system 16, the first OT system 14 and the second OT system 15. By configuring a traffic mirroring port on the switch 11, the network traffic between all devices in the IT system 16, the first OT system 14 and the second OT system 15 can be acquired.
In one embodiment, the device includes an IT device and/or an OT device, and step 101 specifically includes: acquiring, via a monitoring port on a switch connected to the device, mirrored traffic of the network traffic flowing through the switch within a predetermined time (preferably, the predetermined time is long enough to ensure that the devices have fully communicated with each other). Therefore, acquiring mirrored traffic on the switch does not interfere with communication between the devices, and the common flag can be implemented in multiple ways.
Step 102: Extract a first feature group, a second feature group and a third feature group of the network traffic, wherein the first feature group includes a network address of a device and a common flag in the network traffic associated with the network address; the second feature group includes a network address of a destination device of the network traffic and a TCP port status of the destination device; and the third feature group includes a network address of a destination device of the network traffic and a UDP port status of the destination device.
FIG. 3 is a schematic diagram of parsing network traffic according to an embodiment of the present invention. In the network traffic parsing process 24, the sniffer service 21 first obtains the network traffic from the switch 20; a deterministic finite automaton (DFA) service 22, such as a tshark service, is then run on the traffic to extract header attributes and payload attributes from the packets of the network traffic, thereby obtaining the attribute data 23.
For example, the following can be extracted from the packet headers of the network traffic: common flags that characterize common attributes of a device's network traffic (for example, traffic with the device as source device and/or destination device), and the TCP port status and UDP port status of the destination device of each packet.
Here, the common flag includes at least one of the following:
(1) Time to live (TTL):
TTL specifies the maximum number of network segments (hops) that an IP packet may traverse before it is discarded by a router. For example, in the IPv4 header, TTL is an 8-bit field located in the 9th byte of the IPv4 packet.
(2) Window size (WinSize):
The TCP header contains a window size field. It refers to the receiver's window, that is, the receive window, and tells the sender how much data the receiver can accept, thereby providing flow control.
(3) Do Not Fragment (DF) bit:
A DF bit of 1 means the packet must not be fragmented; a DF bit of 0 means fragmentation is allowed.
(4) Maximum segment size (MSS):
MSS is a TCP option. When a TCP connection is established, the sender and receiver use it to negotiate the maximum length of data that each segment can carry during communication (excluding the segment header).
(5) Window scale factor (WinScale):
WinScale is located in the Options field of the TCP header and indicates the factor by which the window can be enlarged.
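To illustrate where these five fields sit in a packet, the following Python sketch reads them from a raw IPv4/TCP packet using only the standard library. The function name `parse_common_flags`, the dictionary keys, and the assumption that the byte string starts at the IPv4 header (no Ethernet framing) are choices made for this sketch; the embodiment itself obtains the attributes through the DFA service.

```python
import struct

def parse_common_flags(packet: bytes) -> dict:
    """Extract TTL, DF, WinSize, MSS and WinScale from a raw IPv4/TCP packet.

    Assumption of this sketch: `packet` starts at the IPv4 header.
    """
    version_ihl = packet[0]
    if version_ihl >> 4 != 4:
        raise ValueError("not an IPv4 packet")
    ihl = (version_ihl & 0x0F) * 4            # IPv4 header length in bytes
    flags_frag = struct.unpack("!H", packet[6:8])[0]
    flags = {
        "ttl": packet[8],                      # 9th byte of the IPv4 header
        "df": (flags_frag >> 14) & 1,          # 1 = do not fragment
        "win_size": None, "mss": None, "win_scale": None,
    }
    if packet[9] != 6:                         # protocol number 6 = TCP
        return flags
    tcp = packet[ihl:]
    flags["win_size"] = struct.unpack("!H", tcp[14:16])[0]
    data_offset = (tcp[12] >> 4) * 4           # TCP header length in bytes
    options = tcp[20:data_offset]
    i = 0
    while i < len(options):
        kind = options[i]
        if kind == 0:                          # end of option list
            break
        if kind == 1:                          # NOP is a single byte
            i += 1
            continue
        if i + 1 >= len(options):
            break
        length = options[i + 1]
        if length < 2:                         # malformed option, stop parsing
            break
        if kind == 2 and length == 4:          # MSS option
            flags["mss"] = struct.unpack("!H", options[i + 2:i + 4])[0]
        elif kind == 3 and length == 3:        # window scale option
            flags["win_scale"] = options[i + 2]
        i += length
    return flags
```

MSS and WinScale are TCP options that normally appear only in SYN segments, so a row of the first feature group would typically be filled from the connection-setup packets of each device.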
Table 1 is a typical schematic table of the first feature group.
Figure PCTCN2022121762-appb-000001
Figure PCTCN2022121762-appb-000002
Table 1
In Table 1, for the device with IP address 192.168.0.1, the common flags contained in the network traffic sent by that device can be extracted (for example, from the packet headers of the traffic). Likewise, for devices with other IP addresses, the common flags can be extracted from the network traffic each of them sends, thereby forming Table 1. Similarly, for the device corresponding to each IP address, the network traffic that has the device as its destination device can be extracted, thereby forming Table 1.
Table 1 thus contains the correspondence between the network address of each device and the common flags that characterize common attributes of the device's network traffic.
In one embodiment, the TCP port status information represents the open/closed state of each TCP port of the destination device, and the UDP port status information represents the open/closed state of each UDP port of the destination device. For example, when a port is open, the corresponding status value is 1; when a port is closed, the corresponding status value is 0.
Table 2 is a typical schematic table of the second feature group.

| IP address | TCP#1 | TCP#2 | ... | TCP#65534 | TCP#65535 |
|---|---|---|---|---|---|
| 192.168.0.1 | 1 | 0 | ... | 1 | 1 |
| 192.168.0.2 | 1 | 1 | ... | 0 | 1 |
| 192.168.0.3 | 0 | 0 | ... | 1 | 0 |
| ... | ... | ... | ... | ... | ... |

Table 2
In Table 2, for the device whose IP address 192.168.0.1 is the destination IP address of network traffic (that is, the destination device), the status of each TCP port of that destination device can be obtained by aggregating all network traffic sent to it. For example, when the network traffic contains traffic whose destination address is 192.168.0.1 and whose destination port is TCP#1, the status value of port TCP#1 of the device with IP address 192.168.0.1 is 1; when the network traffic contains no traffic whose destination address is 192.168.0.1 and whose destination port is TCP#2, the status value of port TCP#2 of the device with IP address 192.168.0.1 is 0. Similarly, the status values of all TCP ports can be derived for the destination devices corresponding to the remaining IP addresses, thereby forming Table 2.
Table 3 is a typical schematic table of the third feature group.

| IP address | UDP#1 | UDP#2 | ... | UDP#65534 | UDP#65535 |
|---|---|---|---|---|---|
| 192.168.0.1 | 1 | 0 | ... | 1 | 1 |
| 192.168.0.2 | 1 | 1 | ... | 0 | 1 |
| 192.168.0.3 | 0 | 0 | ... | 1 | 0 |
| ... | ... | ... | ... | ... | ... |

Table 3
In Table 3, for the device whose IP address 192.168.0.1 is the destination IP address of network traffic (that is, the destination device), the status of each UDP port of that destination device can be obtained by aggregating all network traffic sent to it. For example, when the network traffic contains traffic whose destination address is 192.168.0.1 and whose destination port is UDP#1, the status value of port UDP#1 of the device with IP address 192.168.0.1 is 1; when the network traffic contains no traffic whose destination address is 192.168.0.1 and whose destination port is UDP#2, the status value of port UDP#2 of the device with IP address 192.168.0.1 is 0. Similarly, the status values of all UDP ports can be derived for the destination devices corresponding to the remaining IP addresses, thereby forming Table 3.
Tables 2 and 3 thus contain the open/closed states of the TCP ports and UDP ports of each device, that is, specific attributes of the destination device to which the network traffic is directed.
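The way Tables 2 and 3 are assembled from observed traffic can be sketched as follows. This is a minimal illustration under stated assumptions: the flow-record layout `(dst_ip, protocol, dst_port)`, the function name `build_port_status`, and the reduced port list in the usage lines are hypothetical and not part of the embodiment, which uses one column per port from 1 to 65535.

```python
from collections import defaultdict

def build_port_status(flows, ports):
    """Build per-device TCP and UDP port-status rows from observed flows.

    flows: iterable of (dst_ip, protocol, dst_port), protocol "tcp" or "udp".
    ports: port numbers used as table columns.
    A port is marked 1 if any flow targeted it on that device, otherwise 0.
    """
    seen = {"tcp": defaultdict(set), "udp": defaultdict(set)}
    devices = set()
    for dst_ip, protocol, dst_port in flows:
        devices.add(dst_ip)
        seen[protocol][dst_ip].add(dst_port)
    tables = {}
    for protocol in ("tcp", "udp"):
        tables[protocol] = {
            ip: [1 if port in seen[protocol][ip] else 0 for port in ports]
            for ip in sorted(devices)
        }
    return tables

# Hypothetical flow records for two destination devices.
flows = [
    ("192.168.0.1", "tcp", 1),
    ("192.168.0.2", "tcp", 1),
    ("192.168.0.2", "tcp", 2),
    ("192.168.0.1", "udp", 1),
]
tables = build_port_status(flows, ports=[1, 2])
```

With these records, `tables["tcp"]` reproduces the first two columns of the first two rows of Table 2.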
Step 103: Determine a union feature group of the first feature group, the second feature group and the third feature group based on the respective weights of the first feature group, the second feature group and the third feature group.
Here, the union feature group is the result of merging the first feature group, the second feature group and the third feature group.
For example, the first feature group, the second feature group and the third feature group can each be represented as a matrix, and the three matrices are then merged to obtain the matrix of the union feature group. During merging, the coefficients of each matrix are multiplied by the weight of the corresponding feature group. Because the open/closed state of a device's ports is an important indicator of the device type, it is valuable to include the specific attributes of the network traffic as a factor in device classification.
For example, suppose the matrix of the first feature group is:
Figure PCTCN2022121762-appb-000003
the matrix of the second feature group is:
Figure PCTCN2022121762-appb-000004
and the matrix of the third feature group is:
Figure PCTCN2022121762-appb-000005
where weight_0 is the weight of the first feature group, weight_1 is the weight of the second feature group, and weight_2 is the weight of the third feature group. The union feature group then has the following form:
Figure PCTCN2022121762-appb-000006
The applicant has found that the specific attributes of the network traffic have a greater influence on the device classification result than the common attributes of the network traffic, and that the TCP port status is particularly significant for the classification result.
In one embodiment, the weight of the second feature group is greater than the weight of the third feature group, and the weight of the third feature group is greater than the weight of the first feature group, wherein the respective weights of the first feature group, the second feature group and the third feature group are all adjustable. By adjusting the respective weights of the three feature groups, the accuracy of the classification can be improved.
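The weighted merge of step 103 can be sketched as follows. The exact form of the union matrix is given only in the figures, so this sketch assumes the simplest reading: the three groups are joined column-wise per IP address and every column is scaled by its group's weight. The function name `weighted_union`, the dictionary layout, and the zero-fill for devices missing from a group are assumptions of the sketch.

```python
def weighted_union(group1, group2, group3, weights):
    """Merge three per-device feature tables into one weighted feature matrix.

    Each group maps an IP address to a list of numeric features.
    weights = (weight_0, weight_1, weight_2) for the first, second, third group.
    Devices missing from a group get zeros for that group's columns.
    """
    groups = (group1, group2, group3)
    widths = [len(next(iter(g.values()))) if g else 0 for g in groups]
    devices = sorted(set().union(*groups))
    union = {}
    for ip in devices:
        row = []
        for group, weight, width in zip(groups, weights, widths):
            values = group.get(ip, [0] * width)
            row.extend(weight * v for v in values)
        union[ip] = row
    return union
```

A call such as `weighted_union(flags, tcp_rows, udp_rows, (0.5, 2.0, 1.0))` would respect the ordering weight_1 > weight_2 > weight_0 described above; the numeric values are placeholders. Common-flag values such as TTL span a much larger range than the 0/1 port states, so some scaling before weighting may be needed; the source does not address this.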
Step 104: Determine the classification result of the device based on the clustering result of the union feature group.
Here, a clustering algorithm is run on the union feature group to determine the classification result of the device. Cluster analysis is based on similarity: patterns within the same cluster are more similar to each other than to patterns in other clusters. Various types of clustering algorithms can be used to perform the classification.
For example, the K-means clustering algorithm can be used. In K-means, k data objects are first selected arbitrarily from the n data objects in the union feature group as the initial cluster centers. Each remaining data object is then assigned to the cluster whose center it is most similar to, according to its similarity (distance) to the cluster centers. The center of each resulting cluster is then recomputed as the mean of all objects in that cluster. This process is repeated until the criterion function converges.
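The K-means procedure described above can be sketched in plain Python as follows. This is a minimal illustration rather than the embodiment's implementation: squared Euclidean distance as the similarity measure, the fixed random seed, the iteration cap `max_iter`, and stopping once the assignments no longer change are all assumptions of the sketch.

```python
import random

def kmeans(points, k, max_iter=100, seed=0):
    """Cluster `points` (lists of numbers) into k groups; returns one label per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]   # arbitrary initial centers
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        if new_labels == labels:       # assignments are stable: converged
            break
        labels = new_labels
        # Update step: each center becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:                # keep the old center if a cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels
```

Each row of the union feature group would be passed as one entry of `points`, and the returned labels give the cluster of the device at the same position.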
In one embodiment, the method further includes: when the degree of coincidence between the classification result and a predetermined target classification result is less than a predetermined threshold, performing iteration until the degree of coincidence between the classification result and the target classification result is greater than or equal to the threshold, wherein the iteration includes: adjusting at least one of the respective weights of the first feature group, the second feature group and the third feature group; determining an adjusted union feature group of the first feature group, the second feature group and the third feature group based on the respective adjusted weights; determining an adjusted classification result based on a clustering result of the adjusted union feature group; and comparing the degree of coincidence between the adjusted classification result and the target classification result. For example, the network traffic of devices of known types can be acquired and classified using the above method to obtain a computed classification result, which is then compared with the actual classification of the devices (that is, the predetermined target classification result).
When the degree of coincidence is found to be less than the threshold, the respective weights of the first feature group, the second feature group and the third feature group need to be adjusted, with priority given to increasing the weight of the second feature group.
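The iteration can be sketched as a simple loop. The source does not define how the degree of coincidence is computed or by how much the weights change per round, so this sketch assumes the fraction of devices whose computed class equals the target class, a fixed increment `step` applied to the second group's weight, and a safety cap `max_rounds`. The callback `classify` is hypothetical and stands for the merge, clustering and type-mapping steps.

```python
def tune_weights(classify, target, weights, threshold, step=0.1, max_rounds=50):
    """Raise the second group's weight until the result overlaps the target enough.

    classify(weights) -> dict device -> class label (hypothetical callback).
    target: dict device -> known class label.
    Returns (weights, overlap) from the last round.
    """
    def overlap(result):
        hits = sum(1 for dev, label in target.items() if result.get(dev) == label)
        return hits / len(target)

    weights = list(weights)
    score = overlap(classify(weights))
    rounds = 0
    while score < threshold and rounds < max_rounds:
        weights[1] += step             # prefer raising the TCP port-status weight
        score = overlap(classify(weights))
        rounds += 1
    return weights, score
```

A fuller version might also adjust the other two weights when raising the second one stops helping; that choice is left open here.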
Once the weights have been determined, unknown devices can be classified, and the classification results can be mapped to actual device type definitions.
For example:
Figure PCTCN2022121762-appb-000007
where
Figure PCTCN2022121762-appb-000008
is the device number matrix, and
Figure PCTCN2022121762-appb-000009
is the device classification matrix, in which PLC indicates that a device is classified as a programmable logic controller and HMI indicates that a device is classified as a human-machine interface device.
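The mapping from cluster results to device type definitions can be illustrated with a short sketch. It assumes, purely for illustration, that a few devices of known type are available and that each cluster takes the majority type of its known members; the function names and the `"unknown"` default are hypothetical and do not come from the figures above.

```python
from collections import Counter

def name_clusters(cluster_of, known_types):
    """Give each cluster the most common type among its devices of known type."""
    votes = {}
    for device, dev_type in known_types.items():
        votes.setdefault(cluster_of[device], Counter())[dev_type] += 1
    return {cluster: count.most_common(1)[0][0] for cluster, count in votes.items()}

def classify_devices(cluster_of, known_types, default="unknown"):
    """Map every device number to a device type through its cluster."""
    names = name_clusters(cluster_of, known_types)
    return {device: names.get(cluster, default) for device, cluster in cluster_of.items()}

# Device numbers 1..5 with cluster indices; devices 1 and 3 have known types.
clusters = {1: 0, 2: 0, 3: 1, 4: 1, 5: 2}
types = classify_devices(clusters, {1: "PLC", 3: "HMI"})
```

A cluster without any known member keeps the default label, which makes it visible that a type definition is still missing for it.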
FIG. 4 is an exemplary flowchart of the device classification method according to an embodiment of the present invention. FIG. 4 takes an OT network as an example. As shown in FIG. 4, the method includes:
Step 401: Perform data preparation. Here, traffic attributes are extracted from OT network traffic. Typically, passive monitoring is performed to sniff the network traffic, and attribute data is then extracted from the traffic for subsequent use.
Step 402: Perform feature identification and weight setting. Here, features (including common flags, TCP port states and UDP port states) are identified from the attribute data, the features are grouped into three feature groups, and each feature group is assigned its own weight.
Step 403: Perform clustering. Here, a clustering algorithm is applied to the feature matrix obtained by merging the three feature groups.
Step 404: Determine the device category based on the clustering result.
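Steps 402 to 404 can be sketched as follows. This is a minimal sketch, assuming that each feature group is held as a mapping from network address to a numeric feature row, that an address missing from a group contributes zeros, and that a simple leader (distance-threshold) clustering stands in for whatever clustering algorithm is actually used; all names, addresses and values are hypothetical.

```python
def merge_feature_groups(groups, weights):
    """Step 402: weight each group and merge the rows per network address."""
    widths = [len(next(iter(g.values()))) for g in groups]
    addresses = sorted(set().union(*groups))
    return {
        ip: [
            w * x
            for g, w, n in zip(groups, weights, widths)
            for x in g.get(ip, [0.0] * n)  # address absent from a group: zeros
        ]
        for ip in addresses
    }

def leader_cluster(rows, radius):
    """Step 403 stand-in: join the first cluster whose leader is within radius."""
    leaders, labels = [], {}
    for ip, row in rows.items():
        for index, leader in enumerate(leaders):
            if sum((a - b) ** 2 for a, b in zip(row, leader)) <= radius ** 2:
                labels[ip] = index
                break
        else:
            leaders.append(row)
            labels[ip] = len(leaders) - 1
    return labels

# Hypothetical per-address features for the three groups.
flags = {"10.0.0.1": [0.25, 1.0], "10.0.0.2": [0.25, 1.0], "10.0.0.3": [0.5, 1.0]}
tcp = {"10.0.0.1": [1, 0], "10.0.0.2": [1, 0], "10.0.0.3": [0, 1]}
udp = {"10.0.0.3": [1]}

rows = merge_feature_groups([flags, tcp, udp], weights=(1.0, 2.0, 1.5))
labels = leader_cluster(rows, radius=1.0)  # step 404 then names each cluster
```

Scaling each group before merging is what lets the TCP port state dominate the distance between two devices when its weight is the largest.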
FIG. 5 is a structural diagram of a device classification apparatus according to an embodiment of the present invention. As shown in FIG. 5, the device classification apparatus 500 includes:
an acquisition module 501, configured to obtain network traffic between devices;
an extraction module 502, configured to extract a first feature group, a second feature group and a third feature group of the network traffic, wherein the first feature group includes a network address of a device and a common flag in the network traffic associated with the network address; the second feature group includes a network address of a destination device of the network traffic and a Transmission Control Protocol (TCP) port state of the destination device; and the third feature group includes a network address of a destination device of the network traffic and a User Datagram Protocol (UDP) port state of the destination device;
a first determination module 503, configured to determine a union feature group of the first feature group, the second feature group and the third feature group based on their respective weights;
a second determination module 504, configured to determine a classification result of the devices based on a clustering result of the union feature group.
In one embodiment, the devices include OT devices and/or IT devices; the acquisition module 501 is configured to obtain, via a monitoring port of a switch connected to the devices, mirrored traffic of the network traffic within a predetermined time; wherein the common flag includes at least one of the following: time to live (TTL); window size; don't-fragment (DF) bit; maximum segment size (MSS); window scale factor.
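How the listed common flags might be read from a captured packet can be sketched as follows. This sketch assumes the input is a raw IPv4 packet carrying TCP (no link-layer header) and relies only on the standard IPv4 and TCP header layouts; the function name `common_flags` and the returned dictionary keys are hypothetical.

```python
import struct

def common_flags(packet: bytes) -> dict:
    """Read TTL, DF bit, window size, MSS and window scale from an IPv4/TCP packet."""
    if packet[0] >> 4 != 4 or packet[9] != 6:
        raise ValueError("not an IPv4 packet carrying TCP")
    ip_header_len = (packet[0] & 0x0F) * 4
    # Bytes 6-7 hold the flags and fragment offset, byte 8 holds the TTL.
    flags_frag, ttl = struct.unpack_from("!HB", packet, 6)
    tcp = packet[ip_header_len:]
    tcp_header_len = (tcp[12] >> 4) * 4
    window = struct.unpack_from("!H", tcp, 14)[0]
    result = {
        "ttl": ttl,
        "df": bool(flags_frag & 0x4000),
        "window": window,
        "mss": None,     # only present as a TCP option, usually on SYN segments
        "wscale": None,  # likewise
    }
    i = 20  # TCP options start after the fixed 20-byte header
    while i < tcp_header_len:
        kind = tcp[i]
        if kind == 0:  # end of option list
            break
        if kind == 1:  # no-operation padding, a single byte
            i += 1
            continue
        length = tcp[i + 1]
        if length < 2:  # malformed option, stop parsing
            break
        if kind == 2 and length == 4:
            result["mss"] = struct.unpack_from("!H", tcp, i + 2)[0]
        elif kind == 3 and length == 3:
            result["wscale"] = tcp[i + 2]
        i += length
    return result
```

The MSS and window scale options normally appear only in SYN and SYN-ACK segments, so a collector would typically take these two values from connection setup packets.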
In one embodiment, the weight of the second feature group is greater than the weight of the third feature group, and the weight of the third feature group is greater than the weight of the first feature group, wherein the respective weights of the first feature group, the second feature group and the third feature group are all adjustable.
In one embodiment, the apparatus further includes an adjustment module 505, configured to iterate, when the overlap between the classification result and a predetermined target classification result is less than a predetermined threshold, until the overlap between the classification result and the target classification result is greater than or equal to the threshold, wherein the iteration includes: adjusting at least one of the respective weights of the first feature group, the second feature group and the third feature group; determining an adjusted union feature group of the first feature group, the second feature group and the third feature group based on their respective adjusted weights; determining an adjusted classification result based on the clustering result of the adjusted union feature group; and comparing the overlap between the adjusted classification result and the target classification result.
In one embodiment, the TCP port state information represents the open or closed state of each TCP port of the destination device, and the UDP port state information represents the open or closed state of each UDP port of the destination device.
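One possible encoding of the port state is a binary vector with one column per port, sketched below. The sketch assumes that the set of open ports per destination address has already been inferred from the captured traffic; the function names and the example ports are hypothetical.

```python
def port_state_vector(open_ports, port_columns):
    """Return 1 for each column whose port is open on the device, else 0."""
    open_set = set(open_ports)
    return [1 if port in open_set else 0 for port in port_columns]

def port_state_matrix(open_by_ip, port_columns=None):
    """Build one binary row per destination address over a shared column order."""
    if port_columns is None:
        # Assumed default: one column per port seen open on any device.
        port_columns = sorted({p for ports in open_by_ip.values() for p in ports})
    rows = {ip: port_state_vector(ports, port_columns) for ip, ports in open_by_ip.items()}
    return port_columns, rows

columns, tcp_rows = port_state_matrix({"10.0.0.1": {102}, "10.0.0.2": {80, 443}})
```

The same helper can be applied separately to TCP and UDP observations to obtain the second and third feature groups.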
An embodiment of the present invention further provides an electronic device with a processor-memory architecture. FIG. 6 is an exemplary structural diagram of an electronic device according to an embodiment of the present invention.
As shown in FIG. 6, the electronic device 600 includes a processor 601, a memory 602, and a computer program stored in the memory 602 and executable on the processor 601. When executed by the processor 601, the computer program implements any of the device classification methods described above. The memory 602 may be implemented as any of a variety of storage media, such as an electrically erasable programmable read-only memory (EEPROM), a flash memory, or a programmable read-only memory (PROM). The processor 601 may be implemented as one or more central processing units or one or more field programmable gate arrays, where a field programmable gate array integrates one or more central processing unit cores. Specifically, a central processing unit or central processing unit core may be implemented as a CPU, an MCU, a DSP, and so on.
It should be noted that not all steps and modules in the above flowcharts and structural diagrams are necessary, and some steps or modules may be omitted according to actual needs. The execution order of the steps is not fixed and may be adjusted as needed. The division into modules is merely a functional division adopted for ease of description. In an actual implementation, one module may be split into multiple modules, and the functions of multiple modules may be implemented by a single module; these modules may be located in the same device or in different devices.
The hardware modules in the embodiments may be implemented mechanically or electronically. For example, a hardware module may include specially designed permanent circuits or logic devices (such as a dedicated processor, for example an FPGA or ASIC) for performing specific operations. A hardware module may also include programmable logic devices or circuits temporarily configured by software (for example, including a general-purpose processor or another programmable processor) for performing specific operations. Whether a hardware module is implemented mechanically, with dedicated permanent circuits, or with temporarily configured circuits (for example, configured by software) may be decided based on cost and time considerations.
The above is merely a preferred embodiment of the present invention and is not intended to limit the protection scope of the present invention. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.

Claims (13)

  1. A device classification method, characterized by comprising:
    obtaining network traffic between devices (101);
    extracting a first feature group, a second feature group and a third feature group of the network traffic, wherein the first feature group includes a network address of a device and a common flag in the network traffic associated with the network address; the second feature group includes a network address of a destination device of the network traffic and a Transmission Control Protocol port state of the destination device; and the third feature group includes a network address of a destination device of the network traffic and a User Datagram Protocol port state of the destination device (102);
    determining a union feature group of the first feature group, the second feature group and the third feature group based on the respective weights of the first feature group, the second feature group and the third feature group (103);
    determining a classification result of the devices based on a clustering result of the union feature group (104).
  2. The method according to claim 1, characterized in that the devices include operational technology devices and/or information technology devices;
    the obtaining of network traffic between devices (101) includes: obtaining, via a monitoring port on a switch connected to the devices, mirrored traffic of the network traffic flowing through the switch within a predetermined time;
    wherein the common flag includes at least one of the following:
    time to live; window size; don't-fragment bit; maximum segment size; window scale factor.
  3. The method according to claim 1, characterized in that the weight of the second feature group is greater than the weight of the third feature group, and the weight of the third feature group is greater than the weight of the first feature group, wherein the respective weights of the first feature group, the second feature group and the third feature group are all adjustable.
  4. The method according to claim 1, characterized by further comprising:
    when the overlap between the classification result and a predetermined target classification result is less than a predetermined threshold, iterating until the overlap between the classification result and the target classification result is greater than or equal to the threshold, wherein the iteration includes:
    adjusting at least one of the respective weights of the first feature group, the second feature group and the third feature group;
    determining an adjusted union feature group of the first feature group, the second feature group and the third feature group based on the respective adjusted weights of the first feature group, the second feature group and the third feature group;
    determining an adjusted classification result based on the clustering result of the adjusted union feature group;
    comparing the overlap between the adjusted classification result and the target classification result.
  5. The method according to any one of claims 1-4, characterized in that the state information of the Transmission Control Protocol ports represents the open or closed state of each Transmission Control Protocol port of the destination device, and the state information of the User Datagram Protocol ports represents the open or closed state of each User Datagram Protocol port of the destination device.
  6. A device classification apparatus, characterized by comprising:
    an acquisition module (501), configured to obtain network traffic between devices;
    an extraction module (502), configured to extract a first feature group, a second feature group and a third feature group of the network traffic, wherein the first feature group includes a network address of a device and a common flag in the network traffic associated with the network address; the second feature group includes a network address of a destination device of the network traffic and a Transmission Control Protocol port state of the destination device; and the third feature group includes a network address of a destination device of the network traffic and a User Datagram Protocol port state of the destination device;
    a first determination module (503), configured to determine a union feature group of the first feature group, the second feature group and the third feature group based on the respective weights of the first feature group, the second feature group and the third feature group;
    a second determination module (504), configured to determine a classification result of the devices based on a clustering result of the union feature group.
  7. The apparatus according to claim 6, characterized in that the devices include operational technology devices and/or information technology devices;
    the acquisition module (501) is configured to obtain, via a monitoring port of a switch connected to the devices, mirrored traffic of the network traffic within a predetermined time;
    wherein the common flag includes at least one of the following:
    time to live; window size; don't-fragment bit; maximum segment size; window scale factor.
  8. The apparatus according to claim 6, characterized in that the weight of the second feature group is greater than the weight of the third feature group, and the weight of the third feature group is greater than the weight of the first feature group, wherein the respective weights of the first feature group, the second feature group and the third feature group are all adjustable.
  9. The apparatus according to claim 6, characterized by further comprising:
    an adjustment module (505), configured to iterate, when the overlap between the classification result and a predetermined target classification result is less than a predetermined threshold, until the overlap between the classification result and the target classification result is greater than or equal to the threshold, wherein the iteration includes: adjusting at least one of the respective weights of the first feature group, the second feature group and the third feature group; determining an adjusted union feature group of the first feature group, the second feature group and the third feature group based on the respective adjusted weights of the first feature group, the second feature group and the third feature group; determining an adjusted classification result based on the clustering result of the adjusted union feature group; and comparing the overlap between the adjusted classification result and the target classification result.
  10. The apparatus according to any one of claims 6-9, characterized in that the state information of the Transmission Control Protocol ports represents the open or closed state of each Transmission Control Protocol port of the destination device, and the state information of the User Datagram Protocol ports represents the open or closed state of each User Datagram Protocol port of the destination device.
  11. An electronic device, characterized by comprising:
    a processor (601);
    a memory (602) for storing instructions executable by the processor (601);
    wherein the processor (601) is configured to read the executable instructions from the memory (602) and execute the executable instructions to implement the device classification method according to any one of claims 1-6.
  12. A computer-readable storage medium having computer instructions stored thereon, characterized in that the computer instructions, when executed by a processor, implement the device classification method according to any one of claims 1-6.
  13. A computer program product, characterized by comprising a computer program which, when executed by a processor, implements the device classification method according to any one of claims 1-6.
PCT/CN2022/121762 2022-09-27 2022-09-27 Device classification method and apparatus, electronic device, and computer-readable storage medium WO2024065185A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2022/121762 WO2024065185A1 (en) 2022-09-27 2022-09-27 Device classification method and apparatus, electronic device, and computer-readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2022/121762 WO2024065185A1 (en) 2022-09-27 2022-09-27 Device classification method and apparatus, electronic device, and computer-readable storage medium

Publications (1)

Publication Number Publication Date
WO2024065185A1 true WO2024065185A1 (en) 2024-04-04

Family

ID=90475066

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/121762 WO2024065185A1 (en) 2022-09-27 2022-09-27 Device classification method and apparatus, electronic device, and computer-readable storage medium

Country Status (1)

Country Link
WO (1) WO2024065185A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109309630A (en) * 2018-09-25 2019-02-05 深圳先进技术研究院 A kind of net flow assorted method, system and electronic equipment
CN110149239A (en) * 2019-04-01 2019-08-20 电子科技大学 A kind of network flow monitoring method based on sFlow
CN112398779A (en) * 2019-08-12 2021-02-23 中国科学院国家空间科学中心 Network traffic data analysis method and system
US20210160266A1 (en) * 2019-11-27 2021-05-27 Telefonaktiebolaget Lm Ericsson (Publ) Computer-implemented method and arrangement for classifying anomalies
