CN114390118A - Industrial control asset identification method and device, electronic equipment and storage medium - Google Patents

Publication number
CN114390118A
Authority
CN
China
Prior art keywords
industrial control
identified
message
control equipment
equipment
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111625685.5A
Other languages
Chinese (zh)
Other versions
CN114390118B (en)
Inventor
王建明
陈景妹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nsfocus Technologies Inc
Nsfocus Technologies Group Co Ltd
Original Assignee
Nsfocus Technologies Inc
Nsfocus Technologies Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nsfocus Technologies Inc, Nsfocus Technologies Group Co Ltd
Priority to CN202111625685.5A
Publication of CN114390118A
Application granted
Publication of CN114390118B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/22 Parsing or analysis of headers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416 Event detection, e.g. attack signature detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The embodiment of the application provides an industrial control asset identification method and device, electronic equipment and a storage medium, which are used for improving the accuracy of industrial control asset identification. The method comprises the following steps: acquiring a plurality of messages corresponding to the industrial control equipment to be identified; acquiring characteristic information of each message in the plurality of messages; the characteristic information comprises a general protocol of industrial control equipment, a message transmission direction, a data packet length, a source port address, a destination port address, the industrial control protocol used by the industrial control equipment to be identified, a function code field corresponding to the industrial control protocol and message arrival time; splicing the characteristic information of each message according to a preset rule to obtain a characteristic vector corresponding to the industrial control equipment to be identified; and determining the equipment model of the industrial control equipment to be identified according to the characteristic vector.

Description

Industrial control asset identification method and device, electronic equipment and storage medium
Technical Field
The invention relates to the technical field of industrial control safety, in particular to an industrial control asset identification method, an industrial control asset identification device, electronic equipment and a storage medium.
Background
Industrial Control Systems (ICS) are Cyber-Physical Systems (CPS) that monitor and control industrial production, and are widely used in infrastructure industries such as electric power, water utilities and petrochemicals. In recent years, with the development of industrial internet technology, the "partition wall" between industrial control systems and Information Technology (IT) networks has gradually been opened, so industrial control equipment with security vulnerabilities may be exposed on the network. The security situation of industrial control systems is becoming more severe, and more and more enterprises are beginning to perform security assessments of their ICS. The first task in a security assessment of an ICS is to identify the assets in the ICS and clarify their condition. Compared with a traditional IT network, an ICS has many asset models and complex communication connections, and is not subject to outside interference, so determining the asset condition in an ICS is more difficult.
At present, assets in an ICS are identified mainly by active detection or passive identification. Active detection generally sends a certain number of network data packets to the target system, detects the host fingerprint information and web fingerprint information of surviving hosts, and judges whether a detected host is an asset by configuring rules for different assets. However, the probe packets have some impact on the target network and therefore affect the safety of the industrial control network; moreover, because ICS resources are limited and sensitive to response time, many ICS cannot permit active detection. Passive identification generally adopts bypass deployment, which does not affect the operation of the equipment and therefore does not affect the safety of the industrial control system. It mainly relies on binding device fingerprints to the physical address (Media Access Control address, MAC), but the observed MAC address may change after Network Address Translation (NAT), so the accuracy of industrial control asset identification is low.
Disclosure of Invention
The embodiment of the application provides an industrial control asset identification method and device, electronic equipment and a storage medium, which are used for improving the accuracy of industrial control asset identification.
In a first aspect, a method for identifying industrial control assets is provided, the method comprising:
acquiring a plurality of messages corresponding to the industrial control equipment to be identified;
acquiring characteristic information of each message in the plurality of messages; the characteristic information comprises a general protocol of industrial control equipment, a message transmission direction, a data packet length, a source port address, a destination port address, the industrial control protocol used by the industrial control equipment to be identified, a function code field corresponding to the industrial control protocol and message arrival time;
splicing the characteristic information of each message according to a preset rule to obtain a characteristic vector corresponding to the industrial control equipment to be identified;
and determining the equipment model of the industrial control equipment to be identified according to the characteristic vector.
Optionally, the splicing the feature information of each packet according to a preset rule to obtain a feature vector corresponding to the industrial control device to be identified includes:
selecting n continuous messages from the plurality of messages;
and splicing the characteristic information corresponding to the n continuous messages according to a preset rule to obtain a characteristic vector corresponding to the industrial control equipment to be identified.
Optionally, the splicing the feature information corresponding to the n consecutive messages according to a preset rule to obtain the feature vector corresponding to the industrial control device to be identified includes:
splicing a general protocol, a message transmission direction, a data packet length, a source port address, a destination port address and an industrial control protocol corresponding to each message in the n continuous messages according to a message sequence to obtain a protocol feature vector;
acquiring a first message arrival time corresponding to a first message in the n continuous messages and a second message arrival time corresponding to an nth message in the n continuous messages;
determining a time difference value between the arrival time of the second message and the arrival time of the first message to obtain time characteristic vectors corresponding to the n continuous messages;
and splicing the protocol characteristic vector and the time characteristic vector to obtain a characteristic vector corresponding to the industrial control equipment to be identified.
Optionally, before the feature information of each packet is spliced according to a preset rule to obtain a feature vector corresponding to the industrial control device to be identified, the method further includes:
acquiring a function code field of an industrial control protocol used by the industrial control equipment to be identified;
determining whether the format corresponding to the function code field is a preset format;
and if the format corresponding to the function code field is not a preset format, filling the function code field according to the preset format.
Optionally, the determining the device model of the industrial control device to be identified according to the feature vector corresponding to the industrial control device to be identified includes:
determining the equipment model of the industrial control equipment to be identified based on the feature vector corresponding to the industrial control equipment to be identified through the trained classifier model; the training mode of the classifier model is as follows:
acquiring a full-period message corresponding to an industrial control system; the industrial control system is corresponding to the industrial control equipment to be identified, and the full-period messages are corresponding to all the industrial control equipment in the industrial control system;
acquiring characteristic information of each message in n continuous messages corresponding to each industrial control device;
splicing the characteristic information of each message corresponding to each industrial control device according to the preset rule to obtain a characteristic vector set corresponding to all the industrial control devices;
inputting each feature vector in the feature vector set into a preset classifier model, and determining the equipment model corresponding to each industrial control equipment through the preset classifier model;
judging whether the equipment model of each industrial control equipment is correct or not according to the mapping relation; the mapping relation is a corresponding relation between the message and the equipment model which is established according to the full-period message;
if the number of the industrial control devices with the correct device models is smaller than the preset number, the classifier parameters corresponding to the preset classifier are adjusted, and when the number of the industrial control devices with the correct device models is larger than or equal to the preset number, model training is finished, and the classifier model is obtained.
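The training loop described above can be sketched as follows. This is only an illustration under stated assumptions: the application does not name a classifier, so a multi-class perceptron is swapped in here, and the names `train`, `predict`, `count_correct`, `required_correct`, `max_rounds` and the device models `PLC-A` and `PLC-B` are hypothetical. The loop counts how many samples are assigned the correct device model according to the mapping relation, and adjusts the classifier parameters until that count reaches the preset number.

```python
def score(weight_row, x):
    """Linear score of feature vector x (a bias input of 1.0 is appended)."""
    return sum(w * v for w, v in zip(weight_row, list(x) + [1.0]))

def predict(weights, x):
    """Return the device model whose weight row scores highest for x."""
    return max(weights, key=lambda label: score(weights[label], x))

def count_correct(weights, samples):
    """Number of samples whose predicted model matches the mapping relation."""
    return sum(predict(weights, x) == label for x, label in samples)

def train(samples, required_correct, max_rounds=100):
    """samples: (feature_vector, device_model) pairs built from full-period messages.

    Assumption: a multi-class perceptron stands in for the unspecified
    "preset classifier model"; max_rounds is a safety limit, not from the text.
    """
    labels = sorted({label for _, label in samples})
    dim = len(samples[0][0]) + 1  # +1 for the bias input
    weights = {label: [0.0] * dim for label in labels}
    for _ in range(max_rounds):
        # Training ends once enough devices are identified correctly.
        if count_correct(weights, samples) >= required_correct:
            break
        # Otherwise adjust the classifier parameters on each mistake.
        for x, label in samples:
            guess = predict(weights, x)
            if guess != label:
                for i, v in enumerate(list(x) + [1.0]):
                    weights[label][i] += v
                    weights[guess][i] -= v
    return weights

# Toy, made-up feature vectors for two hypothetical device models.
samples = [([1, 0], "PLC-A"), ([0, 1], "PLC-B"), ([1, 0], "PLC-A"), ([0, 1], "PLC-B")]
weights = train(samples, required_correct=4)
```

A real deployment would likely scale features such as port numbers before using a linear model, or use a different classifier altogether; the sketch only shows the stop condition and the parameter adjustment step.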
Optionally, after the trained classifier model determines the device model of the industrial control device to be identified based on the feature vector corresponding to the industrial control device to be identified, the method further includes:
determining the probability that the industrial control equipment to be identified is determined to be the same equipment type within a preset time length;
if the probability is smaller than a preset threshold value, determining that the industrial control equipment to be identified is abnormal, and sending first alarm information to enable a worker to confirm the reason of the abnormality;
if the reason that the industrial control equipment to be identified is abnormal is that the production environment of the industrial control equipment to be identified is changed, acquiring a full-period message corresponding to a current industrial control system corresponding to the production environment of the industrial control equipment to be identified;
and optimizing and updating the classifier model according to the full-period message corresponding to the current industrial control system.
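The stability check described above might be sketched as below. It assumes one reading of "the probability that the equipment is determined to be the same equipment type": the fraction of identifications within the preset time window that agree with the most frequent result. The function names, the `threshold` default and the device model strings are hypothetical.

```python
from collections import Counter

def same_model_probability(predictions):
    """Fraction of identifications in the window that agree with the most common model."""
    if not predictions:
        raise ValueError("at least one identification result is required")
    _, count = Counter(predictions).most_common(1)[0]
    return count / len(predictions)

def is_abnormal(predictions, threshold=0.8):
    """True when the agreement falls below the preset threshold (alarm condition)."""
    return same_model_probability(predictions) < threshold

# Identification results for one device within the preset time window (made-up values).
window = ["PLC-A", "PLC-A", "PLC-A", "PLC-B"]
if is_abnormal(window):
    print("first alarm: identification is unstable, ask a worker to confirm the cause")
```

If the confirmed cause is a change in the production environment, the classifier would then be retrained on the full-period messages of the current system, as the text describes.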
In a second aspect, an industrial asset identification apparatus is provided, the apparatus comprising:
the acquisition module is used for acquiring a plurality of messages corresponding to the industrial control equipment to be identified;
the obtaining module is further configured to obtain feature information of each of the plurality of messages; the characteristic information comprises a general protocol of industrial control equipment, a message transmission direction, a data packet length, a source port address, a destination port address, the industrial control protocol used by the industrial control equipment to be identified, a function code field corresponding to the industrial control protocol and message arrival time;
the processing module is used for splicing the characteristic information of each message according to a preset rule to obtain a characteristic vector corresponding to the industrial control equipment to be identified;
the processing module is further configured to determine the device model of the industrial control device to be identified according to the feature vector.
Optionally, the processing module is specifically configured to:
selecting n continuous messages from the plurality of messages;
and splicing the characteristic information corresponding to the n continuous messages according to a preset rule to obtain a characteristic vector corresponding to the industrial control equipment to be identified.
Optionally, the processing module is specifically configured to:
splicing a general protocol, a message transmission direction, a data packet length, a source port address, a destination port address and an industrial control protocol corresponding to each message in the n continuous messages according to a message sequence to obtain a protocol feature vector;
acquiring a first message arrival time corresponding to a first message in the n continuous messages and a second message arrival time corresponding to an nth message in the n continuous messages;
determining a time difference value between the arrival time of the second message and the arrival time of the first message to obtain time characteristic vectors corresponding to the n continuous messages;
and splicing the protocol characteristic vector and the time characteristic vector to obtain a characteristic vector corresponding to the industrial control equipment to be identified.
Optionally, the obtaining module is further configured to:
acquiring a function code field of an industrial control protocol used by the industrial control equipment to be identified;
the processing module is further configured to:
determining whether the format corresponding to the function code field is a preset format;
and when the format corresponding to the function code field is not a preset format, filling the function code field according to the preset format.
Optionally, the processing module is specifically configured to:
and determining the equipment model of the industrial control equipment to be identified based on the feature vector corresponding to the industrial control equipment to be identified through the trained classifier model.
Optionally, the industrial control asset identification apparatus further includes a model training module, configured to:
acquiring a full-period message corresponding to an industrial control system; the industrial control system is corresponding to the industrial control equipment to be identified, and the full-period messages are corresponding to all the industrial control equipment in the industrial control system;
acquiring characteristic information of each message in n continuous messages corresponding to each industrial control device;
splicing the characteristic information of each message corresponding to each industrial control device according to the preset rule to obtain a characteristic vector set corresponding to all the industrial control devices;
inputting each feature vector in the feature vector set into a preset classifier model, and determining the equipment model corresponding to each industrial control equipment through the preset classifier model;
judging whether the equipment model of each industrial control equipment is correct or not according to the mapping relation; the mapping relation is a corresponding relation between the message and the equipment model which is established according to the full-period message;
if the number of the industrial control devices with the correct device models is smaller than the preset number, the classifier parameters corresponding to the preset classifier are adjusted, and when the number of the industrial control devices with the correct device models is larger than or equal to the preset number, model training is finished, and the classifier model is obtained.
Optionally, the processing module is further configured to:
determining the probability that the industrial control equipment to be identified is determined to be the same equipment type within a preset time length;
when the probability is smaller than a preset threshold value, determining that the industrial control equipment to be identified is abnormal, and sending first alarm information to enable a worker to confirm the reason of the abnormality;
if the reason that the industrial control equipment to be identified is abnormal is that the production environment of the industrial control equipment to be identified has changed, the model training module is further configured to:
acquiring a full-period message corresponding to a current industrial control system corresponding to the current production environment of the industrial control equipment to be identified;
and optimizing and updating the classifier model according to the full-period message corresponding to the current industrial control system.
In a third aspect, an electronic device is provided, which includes:
a memory for storing program instructions;
a processor for calling the program instructions stored in the memory and executing the steps comprised in any of the methods of the first aspect according to the obtained program instructions.
In a fourth aspect, there is provided a computer-readable storage medium having stored thereon computer-executable instructions for causing a computer to perform the steps included in the method of any one of the first aspects.
In a fifth aspect, a computer program product containing instructions is provided, which when run on a computer causes the computer to perform the industrial control asset identification method described in the various possible implementations described above.
In the embodiment of the application, a plurality of messages corresponding to the industrial control equipment to be identified are determined, feature information of each message in the plurality of messages, which is related to a general protocol, a message transmission direction, a data packet length, a source port address, a destination port address, an industrial control protocol used by the industrial control equipment to be identified, a function code field corresponding to the industrial control protocol, message arrival time and the like of the industrial control equipment, is obtained, the feature information of each message is spliced according to a preset rule to obtain a feature vector corresponding to the industrial control equipment to be identified, and the equipment model of the industrial control equipment to be identified is determined according to the feature vector.
That is, the present application identifies the industrial control equipment to be identified by passively monitoring the plurality of messages corresponding to it. Asset identification is thereby achieved without affecting the industrial control system, which avoids the problem that active detection affects the safety of the industrial control system. In addition, because the industrial control system is not subject to outside interference and the industrial control equipment in it operates in a stable production environment, the messages exchanged between the industrial control equipment show regular and predictable behavior characteristics, so the message interaction characteristics of the industrial control equipment to be identified can effectively improve the accuracy of industrial control asset identification.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application.
FIG. 1 is a flow chart of a method for identifying industrial assets according to an embodiment of the present application;
FIG. 2 is a flow chart of model training provided by an embodiment of the present application;
FIG. 3 is a block diagram of an industrial control asset identification device according to an embodiment of the present disclosure;
FIG. 4 is a schematic structural diagram of a computer device in an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the technical solutions in the embodiments of the present application will be described clearly and completely with reference to the accompanying drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application. In the present application, the embodiments and features of the embodiments may be arbitrarily combined with each other without conflict. Also, while a logical order is shown in the flow diagrams, in some cases, the steps shown or described may be performed in an order different than here.
The terms "first" and "second" in the description and claims of the present application and the above-described drawings are used for distinguishing between different objects and not for describing a particular order. Furthermore, the term "comprises" and any variations thereof are intended to cover a non-exclusive inclusion. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus. The "plurality" in the present application may mean at least two, for example, two, three or more, and the embodiments of the present application are not limited.
In addition, the term "and/or" herein only describes an association relationship between associated objects, and means that three kinds of relationships may exist. For example, A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, the character "/" in this document generally indicates that the preceding and following related objects are in an "or" relationship unless otherwise specified.
The industrial control asset identification method provided by the embodiment of the application is described below with reference to the drawings in the specification. Referring to fig. 1, a flow of the industrial control asset identification method in the embodiment of the present application is described as follows:
Step 101: acquiring a plurality of messages corresponding to the industrial control equipment to be identified;
In the embodiment of the application, a data acquisition module is deployed in bypass mode in the industrial control system corresponding to the industrial control equipment to be identified, and acquires the messages corresponding to each industrial control equipment to be identified in the industrial control system; one industrial control equipment may correspond to a plurality of messages.
Step 102: acquiring characteristic information of each message in a plurality of messages;
The characteristic information comprises information such as the general protocol of the industrial control equipment, the message transmission direction, the data packet length, the source port address sport, the destination port address dport, the industrial control protocol used by the industrial control equipment to be identified, the function code field corresponding to the industrial control protocol, and the message arrival time. The general protocols of the industrial control equipment include, for example, the Address Resolution Protocol (ARP), the Internet Control Message Protocol (ICMP), the Transmission Control Protocol (TCP) and the User Datagram Protocol (UDP). The industrial control protocols used by the industrial control equipment to be identified include, for example, the Modbus/TCP protocol, EtherNet/IP (ENIP), the Common Industrial Protocol (CIP), the CIP PCCC protocol, the Omron (OMRON) communication protocol, and the Connection-Oriented Transport Protocol (COTP).
In this embodiment of the present application, when the feature information of each of the plurality of messages is obtained, the feature information of each message may be quantized as values, for example as shown in Table 1:
Feature                                         Value
ARP                                             0, 1
ICMP                                            0, 1
IP                                              0, 1
TCP                                             0, 1
UDP                                             0, 1
Length                                          0~8
sport                                           Actual port number; 0 if not set
dport                                           Actual port number; 0 if not set
Modbus/TCP                                      0, 1
ENIP                                            0, 1
CIP                                             0, 1
CIP PCCC                                        0, 1
OMRON                                           0, 1
COTP                                            0, 1
Industrial control protocol function code field Actual field value
Time                                            Time of arrival of message
TABLE 1
In a specific implementation process, when the feature information of each message is obtained, the value corresponding to each item of feature information may be obtained directly. For protocols such as ARP, ICMP, TCP, UDP, Modbus/TCP, ENIP, CIP, CIP PCCC, OMRON and COTP, the value of the corresponding feature is 1 if the protocol is present in the message and 0 if it is not. For example, if the ARP protocol is present in a message and the TCP protocol is not, the value of the ARP feature is 1 and the value of the TCP feature is 0. It should be noted that the industrial control protocols used by industrial control devices may differ between industrial control systems, so the classification features should be set according to the industrial control protocols present in the industrial control system concerned.
As another example, Length in Table 1 indicates the packet length; the value of IP identifies the transmission direction of the message, that is, whether it is a received message or a sent message; and sport and dport indicate the source port address and the destination port address respectively. For IP, the value is 0 if the message is a sent message and 1 if it is a received message. For sport and dport, the value is 0 if the source port and/or destination port is not used, and otherwise is the port number actually used. For Length, the value is taken according to the ranges in Table 2, with the following division standard:
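The 0/1 quantization of protocol features can be illustrated with a short sketch. It assumes that an upstream parser (not shown) has already produced the set of protocol names found in a message; the `PROTOCOLS` list and the function name `protocol_flags` are hypothetical, and the list should be adapted to the protocols present in the system at hand.

```python
# Protocol features from Table 1 (IP is left out here because the
# description uses it to encode the transmission direction instead).
PROTOCOLS = ["ARP", "ICMP", "TCP", "UDP", "Modbus/TCP",
             "ENIP", "CIP", "CIP PCCC", "OMRON", "COTP"]

def protocol_flags(protocols_in_message):
    """Return one 0/1 value per protocol: 1 if present in the message, else 0."""
    present = set(protocols_in_message)
    return [1 if name in present else 0 for name in PROTOCOLS]

# A message that carries ARP but not TCP, as in the example in the text.
flags = protocol_flags(["ARP"])
```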
f    Packet length l
0    l ∈ [0, 50]
1    l ∈ [50, 100]
2    l ∈ [100, 150]
3    l ∈ [150, 200]
4    l ∈ [200, 250]
5    l ∈ [250, 300]
6    l ∈ [300, 350]
7    l ∈ [350, 400]
8    l > 400
TABLE 2
Here l represents the packet length. For example, if the packet length of a certain message corresponding to the industrial control device to be identified is 255, the Length value of that message is 5.
In a possible implementation manner, after the value corresponding to the feature information of each message is obtained, the function code field of the industrial control protocol used by the industrial control device to be identified may also be obtained, and it is determined whether the format of the function code field is a preset format; if it is not, the function code field is filled according to the preset format. For example, suppose the preset format is a length of 4 characters (the longest function code field among the industrial control protocols is 4 characters) and the function code field of the industrial control protocol used by the industrial control equipment to be identified is 2 characters, for example 11. The function code field can then be filled with the value "0", giving 0011, so that the formats of the function code fields are unified. In other embodiments, the function codes of different industrial control protocols can be separated, and the function code of each industrial control protocol is used independently as an item of feature information of a message.
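The mapping of Table 2 can be written as a small function. Because adjacent ranges in Table 2 share their boundary values, this sketch assumes that a boundary belongs to the lower bucket (so 50 maps to 0 and 400 maps to 7), which is consistent with the last row being l > 400; that choice and the name `length_bucket` are assumptions, not taken from the text.

```python
def length_bucket(length):
    """Map an integer packet length to the bucket index f of Table 2.

    Assumption: shared boundaries fall into the lower bucket,
    i.e. the buckets are [0, 50], (50, 100], ..., (350, 400], > 400.
    """
    if length < 0:
        raise ValueError("packet length must be non-negative")
    if length > 400:
        return 8
    return max(length - 1, 0) // 50

bucket = length_bucket(255)  # 5, matching the example in the text
```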
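The padding step is simple enough to show directly. The sketch below assumes the function code is handled as a string of characters and is left-padded with "0" to the preset width of 4; the function name `pad_function_code` and the `width` parameter are hypothetical.

```python
def pad_function_code(code, width=4):
    """Left-pad a function code with '0' so that all codes share one format."""
    text = str(code)
    if len(text) > width:
        raise ValueError("function code is longer than the preset format")
    return text.rjust(width, "0")

padded = pad_function_code("11")  # "0011", as in the example in the text
```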
Step 103: splicing the characteristic information of each message according to a preset rule to obtain a characteristic vector corresponding to the industrial control equipment to be identified;
In a possible implementation, in order to enrich the feature information of the device and capture the interaction features between devices, multi-packet splicing may be used: the values of the feature information of a preset number of messages are spliced according to a preset rule. For example, n consecutive messages are randomly selected from the plurality of messages, and the values of the feature information of these n consecutive messages are spliced according to the preset rule to obtain the feature vector corresponding to the industrial control device to be identified. The value of n can be chosen according to the data scale and the interaction behavior among the devices; preferably, n is greater than or equal to 6.
Specifically, the general protocol, the data packet length, the source port address, the destination port address and the industrial control protocol of each of the n consecutive messages are spliced in message order; that is, the features in table 1 other than time are spliced in message order. For example, the last feature (COTP) of the first message is followed by the first feature (ARP) of the second message, the COTP of the second message is followed by the ARP of the third message, and so on, which yields the protocol feature vector f_1, f_2, f_3, f_4, …, f_n corresponding to the n messages. Then the first message arrival time (time1) of the first of the n consecutive messages and the second message arrival time (time2) of the nth message are acquired, and the time difference between them (namely time2 - time1) is determined to obtain the time feature vector t corresponding to the n consecutive messages. The time feature vector t is spliced with the protocol feature vector to obtain the feature vector F = (f_1, f_2, f_3, f_4, …, f_n, t) corresponding to the industrial control device to be identified.
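The splicing of step 103 can be sketched as follows. This is an illustration only: the dictionary keys in `PROTOCOL_FIELDS` are hypothetical stand-ins for the non-time columns of table 1, each message is assumed to be already encoded as numeric values, and the name `build_feature_vector` is not taken from the application.

```python
import random
from typing import Dict, List, Optional, Sequence

# Hypothetical keys standing in for the non-time columns of table 1.
PROTOCOL_FIELDS = ("general_protocol", "length", "sport", "dport", "ics_protocol")


def build_feature_vector(messages: Sequence[Dict[str, float]], n: int = 6,
                         start: Optional[int] = None) -> List[float]:
    """Splice n consecutive messages into F = (f_1, ..., f_n, t)."""
    if len(messages) < n:
        raise ValueError("need at least n messages")
    if start is None:
        # Randomly choose where the window of n consecutive messages begins.
        start = random.randrange(len(messages) - n + 1)
    window = messages[start:start + n]

    protocol_vector: List[float] = []
    for msg in window:  # message order is preserved
        protocol_vector.extend(msg[field] for field in PROTOCOL_FIELDS)

    # Time feature: arrival time of the nth message minus that of the first.
    t = window[-1]["time"] - window[0]["time"]
    return protocol_vector + [t]
```

The resulting vector has 5n + 1 entries in this sketch; the actual width depends on how many columns table 1 defines.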
Step 104: and determining the equipment model of the industrial control equipment to be identified according to the characteristic vector.
In the embodiment of the application, the equipment model of the industrial control equipment to be identified can be determined based on the feature vector corresponding to the industrial control equipment to be identified through the trained classifier model; the training process of the classifier model is shown in fig. 2:
step 201: acquiring a full-period message corresponding to an industrial control system;
in the embodiment of the application, the full-period messages corresponding to the industrial control system are obtained by obtaining a plurality of messages corresponding to each industrial control device in the industrial control system.
Step 202: acquiring characteristic information of each message in n continuous messages corresponding to each industrial control device;
In this embodiment of the application, the feature information of each of the n consecutive messages may be obtained once every preset duration to train the model; the feature information of each message is obtained in the same manner as in step 102, which is not repeated here.
Step 203: splicing the characteristic information of each message corresponding to each industrial control device according to a preset rule to obtain a set of characteristic vectors corresponding to all industrial control devices;
step 204: inputting each feature vector in the feature vector set into a preset classifier model, and determining the equipment model corresponding to each industrial control equipment through the preset classifier model;
the preset classifier model can be decision trees, random forests and other models capable of realizing multi-classification of equipment.
Step 205: judging whether the equipment model of each industrial control equipment is correct or not according to the mapping relation;
The mapping relationship is the correspondence between messages and device models established from the full-period messages. Specifically, after the full-period messages corresponding to the industrial control system are obtained, the device models are labeled and the correspondence between device models and messages is established.
Step 206: if the number of industrial control devices whose device model is correct is smaller than a preset number, the classifier parameters of the preset classifier model are adjusted; when the number of industrial control devices whose device model is correct is greater than or equal to the preset number, model training ends and the classifier model is obtained.
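Steps 204 to 206 can be sketched as a loop that tries classifier parameters until enough devices are classified correctly. The sketch makes several assumptions that are not in the text: a toy k-nearest-neighbour classifier stands in for the decision tree or random forest, "adjusting the classifier parameters" is modeled as trying the next value from a candidate list, each feature vector stands for one device, and all names (`fit_knn`, `train_until_enough_correct`) are hypothetical.

```python
from collections import Counter
from typing import Callable, Hashable, Iterable, Sequence, Tuple

Vector = Sequence[float]
Predictor = Callable[[Vector], Hashable]


def fit_knn(vectors: Sequence[Vector], labels: Sequence[Hashable], k: int) -> Predictor:
    """Toy k-nearest-neighbour classifier (stand-in for the preset classifier)."""
    def predict(query: Vector) -> Hashable:
        # Rank the training vectors by squared Euclidean distance to the query.
        ranked = sorted(
            range(len(vectors)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(vectors[i], query)),
        )
        votes = Counter(labels[i] for i in ranked[:k])
        return votes.most_common(1)[0][0]
    return predict


def train_until_enough_correct(vectors: Sequence[Vector], labels: Sequence[Hashable],
                               candidate_ks: Iterable[int],
                               preset_number: int) -> Tuple[int, Predictor]:
    """Adjust the parameter until at least preset_number devices are correct."""
    for k in candidate_ks:
        predict = fit_knn(vectors, labels, k)
        # Step 205: compare each prediction with the labeled device model.
        correct = sum(1 for v, y in zip(vectors, labels) if predict(v) == y)
        if correct >= preset_number:  # step 206: training is finished
            return k, predict
    raise RuntimeError("no candidate parameter reached the preset number")
```

The labels play the role of the mapping relationship between messages and device models; in practice the check would more likely be made on held-out data, which the text does not specify.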
In other embodiments, after the trained classifier model has determined the device model of the industrial control device to be identified from its feature vector, the probability that the industrial control device to be identified is determined to be the same device model within a preset time period may also be determined. If the probability is smaller than a preset threshold, the industrial control device to be identified is abnormal: either the production environment in which it is located may have changed, causing a large change in the communication traffic between industrial control devices, or the industrial control device to be identified itself has failed. The cause of the abnormality then needs to be checked manually, so first alarm information may be sent to notify the relevant staff to confirm it on site. If the cause is determined to be a change in the production environment of the industrial control device to be identified, the full-period messages corresponding to the current industrial control system, that is, the one corresponding to the production environment in which the device is currently located, need to be reacquired, and the trained classifier model is optimized and updated. The preset threshold may be set according to the classification accuracy achieved by the classifier model during model training.
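The consistency check described above can be sketched as follows. The text does not say how the probability is computed; this sketch assumes it is the fraction of identification results within the preset time period that agree with the most frequent device model. The function name `check_model_stability` and the model labels used below are hypothetical.

```python
from collections import Counter
from typing import Hashable, Sequence, Tuple


def check_model_stability(predictions: Sequence[Hashable],
                          threshold: float) -> Tuple[Hashable, float, bool]:
    """Return (most frequent model, its share of the window, is_abnormal).

    predictions holds the device models identified for one device during
    the preset time period.
    """
    if not predictions:
        raise ValueError("no identification results in the time period")
    model, count = Counter(predictions).most_common(1)[0]
    probability = count / len(predictions)
    # Below the threshold the device is treated as abnormal, and the first
    # alarm information would be sent for on-site confirmation.
    return model, probability, probability < threshold
```

For example, nine results of "model_a" and one of "model_b" with a threshold of 0.8 give a probability of 0.9, so no alarm is raised.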
In a specific implementation, the acquired fingerprint information (namely, the feature information) of the industrial control device to be identified is independent of the MAC address. If the communication traffic of the industrial control device to be identified does not change, then even if its MAC address changes after NAT, its communication behavior features change little and therefore so does the feature vector. The correct device model can still be identified, which effectively improves the accuracy of industrial control asset identification.
Based on the same inventive concept, the embodiment of the application provides an industrial control asset identification device, and the industrial control asset identification device can realize the corresponding function of the industrial control asset identification method. The industrial control asset identification device can be a hardware structure, a software module or a hardware structure and a software module. The industrial control asset identification device can be realized by a chip system, and the chip system can be formed by a chip and can also comprise the chip and other discrete devices. Referring to fig. 3, the industrial control asset identification apparatus includes an obtaining module 301, a processing module 302, and a model training module 303. Wherein:
an obtaining module 301, configured to obtain multiple messages corresponding to an industrial control device to be identified;
the obtaining module 301 is further configured to obtain feature information of each of the multiple messages; the characteristic information comprises a general protocol of industrial control equipment, a message transmission direction, a data packet length, a source port address, a destination port address, the industrial control protocol used by the industrial control equipment to be identified, a function code field corresponding to the industrial control protocol and message arrival time;
the processing module 302 is configured to splice the feature information of each packet according to a preset rule to obtain a feature vector corresponding to the industrial control device to be identified;
the processing module 302 is further configured to determine an equipment model of the industrial control equipment to be identified according to the feature vector.
Optionally, the processing module 302 is specifically configured to:
selecting n continuous messages from the plurality of messages;
and splicing the characteristic information corresponding to the n continuous messages according to a preset rule to obtain a characteristic vector corresponding to the industrial control equipment to be identified.
Optionally, the processing module 302 is specifically configured to:
splicing the general protocol, the data packet length, the source port address, the destination port address and the industrial control protocol corresponding to each message in the n continuous messages according to the message sequence to obtain a protocol characteristic vector;
acquiring a first message arrival time corresponding to a first message in the n continuous messages and a second message arrival time corresponding to an nth message in the n continuous messages;
determining a time difference value between the arrival time of the second message and the arrival time of the first message to obtain time characteristic vectors corresponding to the n continuous messages;
and splicing the protocol characteristic vector and the time characteristic vector to obtain a characteristic vector corresponding to the industrial control equipment to be identified.
Optionally, the obtaining module 301 is further configured to:
acquiring a function code field of an industrial control protocol used by the industrial control equipment to be identified;
the processing module 302 is further configured to:
determining whether the format corresponding to the function code field is a preset format;
and when the format corresponding to the function code field is not a preset format, filling the function code field according to the preset format.
Optionally, the processing module 302 is specifically configured to:
and determining the equipment model of the industrial control equipment to be identified based on the feature vector corresponding to the industrial control equipment to be identified through the trained classifier model.
Optionally, the industrial control asset identification apparatus further includes a model training module 303, configured to:
acquiring a full-period message corresponding to an industrial control system; the industrial control system is corresponding to the industrial control equipment to be identified, and the full-period messages are corresponding to all the industrial control equipment in the industrial control system;
acquiring characteristic information of each message in n continuous messages corresponding to each industrial control device;
splicing the characteristic information of each message corresponding to each industrial control device according to the preset rule to obtain a characteristic vector set corresponding to all the industrial control devices;
inputting each feature vector in the feature vector set into a preset classifier model, and determining the equipment model corresponding to each industrial control equipment through the preset classifier model;
judging whether the equipment model of each industrial control equipment is correct or not according to the mapping relation; the mapping relation is a corresponding relation between the message and the equipment model which is established according to the full-period message;
if the number of the industrial control devices with the correct device models is smaller than the preset number, the classifier parameters corresponding to the preset classifier are adjusted, and when the number of the industrial control devices with the correct device models is larger than or equal to the preset number, model training is finished, and the classifier model is obtained.
Optionally, the processing module 302 is further configured to:
determining the probability that the industrial control equipment to be identified is determined to be the same equipment model within a preset time length;
when the probability is smaller than a preset threshold value, determining that the industrial control equipment to be identified is abnormal, and sending first alarm information to enable a worker to confirm the reason of the abnormality;
if the reason that the industrial control equipment to be identified is abnormal is that the production environment of the industrial control equipment to be identified has changed, the model training module 303 is further configured to:
acquiring a full-period message corresponding to a current industrial control system corresponding to the current production environment of the industrial control equipment to be identified;
and optimizing and updating the classifier model according to the full-period message corresponding to the current industrial control system.
For all relevant contents of each step of the foregoing embodiment of the industrial control asset identification method, reference may be made to the functional description of the corresponding functional module of the industrial control asset identification device in the embodiment of the present application, and details are not described herein again.
The division of the modules in the embodiments of the present application is schematic, and only one logical function division is provided, and in actual implementation, there may be another division manner, and in addition, each functional module in each embodiment of the present application may be integrated in one processor, may also exist alone physically, or may also be integrated in one module by two or more modules. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode.
Based on the same inventive concept, the embodiment of the application provides electronic equipment. Referring to fig. 4, the electronic device includes at least one processor 401 and a memory 402 connected to the at least one processor, a specific connection medium between the processor 401 and the memory 402 is not limited in this embodiment, in fig. 4, the processor 401 and the memory 402 are connected by a bus 400 as an example, the bus 400 is represented by a thick line in fig. 4, and a connection manner between other components is only schematically illustrated and is not limited. The bus 400 may be divided into an address bus, a data bus, a control bus, etc., and is shown with only one thick line in fig. 4 for ease of illustration, but does not represent only one bus or type of bus.
In the embodiment of the present application, the memory 402 stores instructions executable by the at least one processor 401, and the at least one processor 401 may execute the steps included in the industrial control asset identification method by executing the instructions stored in the memory 402.
The processor 401 is a control center of the electronic device, and may connect various portions of the whole electronic device by using various interfaces and lines, and perform various functions and process data of the electronic device by operating or executing instructions stored in the memory 402 and calling data stored in the memory 402, thereby performing overall monitoring on the electronic device. Optionally, the processor 401 may include one or more processing units, and the processor 401 may integrate an application processor and a modem processor, wherein the application processor mainly handles operating systems, application programs, and the like, and the modem processor mainly handles wireless communication. It will be appreciated that the modem processor described above may not be integrated into the processor 401. In some embodiments, processor 401 and memory 402 may be implemented on the same chip, or in some embodiments, they may be implemented separately on separate chips.
The processor 401 may be a general-purpose processor, such as a Central Processing Unit (CPU), digital signal processor, application specific integrated circuit, field programmable gate array or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or the like, that may implement or perform the methods, steps, and logic blocks disclosed in embodiments of the present application. A general purpose processor may be a microprocessor or any conventional processor or the like. The steps of the industrial control asset identification method disclosed by the embodiment of the application can be directly implemented by a hardware processor, or implemented by a combination of hardware and software modules in the processor.
Memory 402, which is a non-volatile computer-readable storage medium, may be used to store non-volatile software programs, non-volatile computer-executable programs, and modules. The Memory 402 may include at least one type of storage medium, for example a flash Memory, a hard disk, a multimedia card, a card-type Memory, a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Programmable Read Only Memory (PROM), a Read Only Memory (ROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a magnetic Memory, a magnetic disk, an optical disk, and so on. The memory 402 may also be any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto. The memory 402 in the embodiments of the present application may also be circuitry or any other device capable of performing a storage function for storing program instructions and/or data.
By programming the processor 401, the code corresponding to the industrial control asset identification method described in the foregoing embodiment may be solidified into a chip, so that the chip can execute the steps of the industrial control asset identification method when running.
Based on the same inventive concept, embodiments of the present application further provide a computer-readable storage medium, which stores computer instructions, and when the computer instructions are executed on a computer, the computer is caused to execute the steps of the industrial control asset identification method.
In some possible embodiments, the aspects of the industrial control asset identification method provided by the present application may also be implemented in the form of a program product, which includes program code that, when the program product is run on an electronic device, causes the electronic device to perform the steps of the industrial control asset identification method according to the various exemplary embodiments of the present application described above in this specification.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.

Claims (10)

1. An industrial control asset identification method, characterized in that the method comprises:
acquiring a plurality of messages corresponding to the industrial control equipment to be identified;
acquiring characteristic information of each message in the plurality of messages; the characteristic information comprises a general protocol of industrial control equipment, a message transmission direction, a data packet length, a source port address, a destination port address, the industrial control protocol used by the industrial control equipment to be identified, a function code field corresponding to the industrial control protocol and message arrival time;
splicing the characteristic information of each message according to a preset rule to obtain a characteristic vector corresponding to the industrial control equipment to be identified;
and determining the equipment model of the industrial control equipment to be identified according to the characteristic vector.
2. The method according to claim 1, wherein the splicing the feature information of each packet according to a preset rule to obtain a feature vector corresponding to the industrial control device to be identified comprises:
selecting n continuous messages from the plurality of messages;
and splicing the characteristic information corresponding to the n continuous messages according to a preset rule to obtain a characteristic vector corresponding to the industrial control equipment to be identified.
3. The method according to claim 2, wherein the splicing the feature information corresponding to the n consecutive messages according to a preset rule to obtain a feature vector corresponding to the industrial control device to be identified comprises:
splicing a general protocol, a message transmission direction, a data packet length, a source port address, a destination port address and an industrial control protocol corresponding to each message in the n continuous messages according to a message sequence to obtain a protocol feature vector;
acquiring a first message arrival time corresponding to a first message in the n continuous messages and a second message arrival time corresponding to an nth message in the n continuous messages;
determining a time difference value between the arrival time of the second message and the arrival time of the first message to obtain time characteristic vectors corresponding to the n continuous messages;
and splicing the protocol characteristic vector and the time characteristic vector to obtain a characteristic vector corresponding to the industrial control equipment to be identified.
4. The method according to claim 1, wherein before the splicing the feature information of each packet according to a preset rule to obtain the feature vector corresponding to the industrial control device to be identified, the method further comprises:
acquiring a function code field of an industrial control protocol used by the industrial control equipment to be identified;
determining whether the format corresponding to the function code field is a preset format;
and if the format corresponding to the function code field is not a preset format, filling the function code field according to the preset format.
5. The method of claim 1, wherein the determining the device model of the industrial control device to be identified according to the feature vector corresponding to the industrial control device to be identified comprises:
determining the equipment model of the industrial control equipment to be identified based on the feature vector corresponding to the industrial control equipment to be identified through the trained classifier model; the training mode of the classifier model is as follows:
acquiring a full-period message corresponding to an industrial control system; the industrial control system is corresponding to the industrial control equipment to be identified, and the full-period messages are corresponding to all the industrial control equipment in the industrial control system;
acquiring characteristic information of each message in n continuous messages corresponding to each industrial control device;
splicing the characteristic information of each message corresponding to each industrial control device according to the preset rule to obtain a characteristic vector set corresponding to all the industrial control devices;
inputting each feature vector in the feature vector set into a preset classifier model, and determining the equipment model corresponding to each industrial control equipment through the preset classifier model;
judging whether the equipment model of each industrial control equipment is correct or not according to the mapping relation; the mapping relation is a corresponding relation between the message and the equipment model which is established according to the full-period message;
if the number of the industrial control devices with the correct device models is smaller than the preset number, the classifier parameters corresponding to the preset classifier are adjusted, and when the number of the industrial control devices with the correct device models is larger than or equal to the preset number, model training is finished, and the classifier model is obtained.
6. The method of claim 5, wherein after determining, by the trained classifier model, the device model of the industrial control device to be identified based on the feature vector corresponding to the industrial control device to be identified, the method further comprises:
determining the probability that the industrial control equipment to be identified is determined to be the same equipment type within a preset time length;
if the probability is smaller than a preset threshold value, determining that the industrial control equipment to be identified is abnormal, and sending first alarm information to enable a worker to confirm the reason of the abnormality;
if the reason that the industrial control equipment to be identified is abnormal is that the production environment of the industrial control equipment to be identified is changed, acquiring a full-period message corresponding to a current industrial control system corresponding to the production environment of the industrial control equipment to be identified;
and optimizing and updating the classifier model according to the full-period message corresponding to the current industrial control system.
7. An industrial asset identification device, the device comprising:
the acquisition module is used for acquiring a plurality of messages corresponding to the industrial control equipment to be identified;
the obtaining module is further configured to obtain feature information of each of the plurality of messages; the characteristic information comprises a general protocol of industrial control equipment, a message transmission direction, a data packet length, a source port address, a destination port address, the industrial control protocol used by the industrial control equipment to be identified, a function code field corresponding to the industrial control protocol and message arrival time;
the processing module is used for splicing the characteristic information of each message according to a preset rule to obtain a characteristic vector corresponding to the industrial control equipment to be identified;
the processing module is further configured to determine the device model of the industrial control device to be identified according to the feature vector.
8. The apparatus of claim 7, wherein the processing module is specifically configured to:
selecting n continuous messages from the plurality of messages;
and splicing the characteristic information corresponding to the n continuous messages according to a preset rule to obtain a characteristic vector corresponding to the industrial control equipment to be identified.
9. An electronic device, comprising:
a memory for storing program instructions;
a processor for calling program instructions stored in said memory and for executing the steps comprised by the method of any one of claims 1 to 6 in accordance with the obtained program instructions.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program comprising program instructions that, when executed by a computer, cause the computer to perform the method according to any one of claims 1-6.
CN202111625685.5A 2021-12-28 2021-12-28 Industrial control asset identification method and device, electronic equipment and storage medium Active CN114390118B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111625685.5A CN114390118B (en) 2021-12-28 2021-12-28 Industrial control asset identification method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111625685.5A CN114390118B (en) 2021-12-28 2021-12-28 Industrial control asset identification method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN114390118A true CN114390118A (en) 2022-04-22
CN114390118B CN114390118B (en) 2023-11-07

Family

ID=81197626

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111625685.5A Active CN114390118B (en) 2021-12-28 2021-12-28 Industrial control asset identification method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114390118B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117575878A (en) * 2023-11-16 2024-02-20 杭州众诚咨询监理有限公司 Intelligent management method and device for traffic facility asset data, electronic equipment and medium

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2009070994A1 (en) * 2007-11-30 2009-06-11 Hangzhou H3C Technologies Co., Ltd. A method and device for matching message rule
US20190171693A1 (en) * 2017-12-06 2019-06-06 Microsoft Technology Licensing, Llc Personalized presentation of messages on a computing device
WO2019173439A1 (en) * 2018-03-07 2019-09-12 Saudi Arabian Oil Company Asset discovery using network connections of known assets
CN111027048A (en) * 2019-12-11 2020-04-17 北京天融信网络安全技术有限公司 Operating system identification method and device, electronic equipment and storage medium
CN112311755A (en) * 2020-06-11 2021-02-02 北京威努特技术有限公司 Industrial control protocol reverse analysis method and device
CN112468364A (en) * 2020-11-25 2021-03-09 杭州安恒信息技术股份有限公司 CIP asset detection method and device, computer equipment and readable storage medium
CN112688932A (en) * 2020-12-21 2021-04-20 杭州迪普科技股份有限公司 Honeypot generation method, honeypot generation device, honeypot generation equipment and computer readable storage medium
CN113055127A (en) * 2021-03-17 2021-06-29 网宿科技股份有限公司 Data message duplicate removal and transmission method, electronic equipment and storage medium
CN113572761A (en) * 2021-07-22 2021-10-29 四川英得赛克科技有限公司 Equipment identification method and device, electronic equipment and storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117575878A (en) * 2023-11-16 2024-02-20 杭州众诚咨询监理有限公司 Intelligent management method and device for traffic facility asset data, electronic equipment and medium
CN117575878B (en) * 2023-11-16 2024-04-26 杭州众诚咨询监理有限公司 Intelligent management method and device for traffic facility asset data, electronic equipment and medium

Also Published As

Publication number Publication date
CN114390118B (en) 2023-11-07

Similar Documents

Publication Publication Date Title
CN109802953B (en) Industrial control asset identification method and device
CN112181804A (en) Parameter checking method, equipment and storage medium
CN109951494B (en) Simulation data processing method and device, simulation equipment and storage medium
CN114584582A (en) In-vehicle message processing method and device, vehicle-mounted terminal and storage medium
CN112272184B (en) Industrial flow detection method, device, equipment and medium
CN114390118B (en) Industrial control asset identification method and device, electronic equipment and storage medium
CN110099074A (en) A kind of method for detecting abnormality of internet of things equipment, system and electronic equipment
CN110602234A (en) Block chain network node management method, device, equipment and storage medium
CN112988670A (en) Log data processing method and device
CN118094450B (en) Fault early warning method and related equipment
CN109560964B (en) Equipment compliance checking method and device
CN113051571B (en) Method and device for detecting false alarm vulnerability and computer equipment
CN108306865B (en) Modbus packet-sticking processing method and device based on Netty framework
CN111159009A (en) Pressure testing method and device for log service system
CN112699000A (en) Data processing method and device, readable storage medium and electronic equipment
CN114567613B (en) Real IP identification method and device, electronic equipment and storage medium
WO2019207764A1 (en) Extraction device, extraction method, recording medium, and detection device
CN113304482A (en) Cloud game player portrait processing method, server and medium applied to cloud computing
CN116016252B (en) Gateway protocol detection method and device
CN107608809A (en) Abnormality eliminating method and device
CN116800637B (en) Method for estimating base number of data item in data stream and related equipment
CN117560285B (en) Intelligent control internet of things OTA upgrading method, client and server
CN110493818A (en) Detection method, device, storage medium and the electronic device of wireless fidelity module
CN112288990A (en) Method, system, medium and device for generating internet of things event based on internet of things data
CN117540071B (en) Configuration method and device for attribute table item of search engine

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant