CN116668145A - Industrial control equipment manufacturer identification method based on industrial control protocol communication model - Google Patents

Info

Publication number
CN116668145A
Authority
CN
China
Prior art keywords
industrial control
control equipment
fingerprint
fingerprint vector
message
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310691586.XA
Other languages
Chinese (zh)
Inventor
盛川
赵剑明
刘贤达
张博文
王天宇
曾鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenyang Institute of Automation of CAS
Original Assignee
Shenyang Institute of Automation of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenyang Institute of Automation of CAS
Priority to CN202310691586.XA
Publication of CN116668145A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1433 Vulnerability analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/40 Network security protocols
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/02 Total factory control, e.g. smart factories, flexible manufacturing systems [FMS] or integrated manufacturing systems [IMS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Selective Calling Equipment (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Communication Control (AREA)

Abstract

The invention provides a method for identifying industrial control equipment manufacturers based on an industrial control protocol communication model. The method comprises: constructing, based on the industrial control protocol communication model, an industrial control equipment fingerprint vector applicable to various industrial control protocols; performing safety scanning on industrial control equipment on the Internet according to the industrial control protocol communication model to form an industrial control equipment fingerprint vector library; acquiring manufacturer information of the industrial control equipment and labeling the corresponding fingerprint vectors; preprocessing the attribute values in the fingerprint vectors; clustering the fingerprint vectors with a density-peak-based random clustering algorithm to form a fingerprint vector classification model; monitoring the traffic in the target industrial control network and extracting the fingerprint vectors of the industrial control equipment; and identifying the manufacturers of the industrial control equipment in the target industrial control network with a composite classification algorithm based on fingerprint vector packets. The method addresses the difficulty of identifying the manufacturer information of the various industrial control equipment in an industrial control network and improves the accuracy of manufacturer identification.

Description

Industrial control equipment manufacturer identification method based on industrial control protocol communication model
Technical Field
The invention relates to the field of industrial control system network security, in particular to an industrial control equipment manufacturer identification method based on an industrial control protocol communication model.
Background
As industrial control systems become increasingly open, the traditional closed environment isolated from external networks is gradually being broken and network boundaries are increasingly blurred. Industrial control devices deployed inside a system can be accessed from the enterprise management network and even directly from the Internet. However, because industrial control devices lack sufficient built-in information security functions, and resistance to external malicious access was rarely considered when they were designed, a device that is directly exposed on the Internet faces a serious threat to its own information security. With vulnerabilities in industrial control devices being publicly exploited and zero-day vulnerabilities continually emerging, the security situation of industrial control devices is becoming more severe.
To protect industrial control devices comprehensively, identifying the relevant information of each device and building an accurate profile of it is basic and key work. Because existing vulnerability information is generally published for the industrial control devices of particular manufacturers, manufacturer information is especially important for protecting industrial control devices compared with other, simpler profile information (such as device roles, connection objects and traffic statistics). On the one hand, an attacker can mount targeted network attacks against the industrial control devices of different manufacturers, which raises the success rate and makes defense harder. On the other hand, when vulnerability information for the industrial control devices of a certain manufacturer is published, a defender can use the manufacturer information to apply targeted and timely patch upgrades or security policy upgrades to the affected devices, which raises the difficulty of attack and dynamically strengthens network defense.
However, the manufacturer information of industrial control devices is often difficult to obtain directly. This is especially true for industrial control systems that have been deployed for a long time, where data loss or changes in network topology make it difficult to obtain the manufacturer information of the relevant devices by inquiry. Although current active detection technologies (such as Nmap, ZMap and PLCScan) perform well in discovering exposed industrial control devices, an industrial control system in operation places high requirements on the availability and real-time performance of its devices, and unsafe active detection can disturb the normal operation of the devices and even interrupt the production process. These technologies therefore cannot be applied to asset detection inside an industrial control system.
Passive detection technologies (such as p0f, GRASSMARLIN and netdiscover) identify information such as the identity, manufacturer and type of industrial control devices by analyzing network traffic in the target industrial control network. Their main drawback is a heavy dependence on the effectiveness and comprehensiveness of the industrial control device fingerprint library and on the accuracy of the fingerprint matching algorithm. How to build a reasonably effective and comprehensive fingerprint library of industrial control devices has not yet been solved effectively, the accuracy of the related fingerprint matching algorithms is difficult to guarantee, and existing methods are suitable only for specific types of industrial control devices or industrial control protocols and generalize poorly. Therefore, if a representative industrial control device fingerprint vector applicable to various industrial control protocols can be constructed, an effective and comprehensive industrial control device fingerprint vector library can be established, and a more accurate multi-class classification algorithm suited to industrial control device fingerprint matching can be provided, then the safety and accuracy of industrial control device manufacturer identification can be effectively improved.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides an industrial control equipment manufacturer identification method based on an industrial control protocol communication model, which aims to solve the problems of insufficient fingerprint vector distinguishing capability, poor generality, low accuracy of a fingerprint identification algorithm and the like of industrial control equipment.
The technical scheme adopted by the invention for achieving the purpose is as follows:
An industrial control equipment manufacturer identification method based on an industrial control protocol communication model comprises the following steps:
1) Constructing an industrial control equipment fingerprint vector applicable to various industrial control protocols based on the industrial control protocol communication model;
2) Carrying out safety scanning on industrial control equipment on the Internet according to the industrial control protocol communication model, and extracting fingerprint vectors based on scanning flow to form an industrial control equipment fingerprint vector library;
3) Acquiring manufacturer information of industrial control equipment, and marking fingerprint vectors of the corresponding industrial control equipment in a fingerprint vector library;
4) Preprocessing attribute values in fingerprint vectors of industrial control equipment;
5) Clustering the preprocessed industrial control equipment fingerprint vectors by a density peak value-based random clustering algorithm to obtain an industrial control equipment fingerprint vector classification model;
6) Monitoring the flow in the target industrial control network, and extracting the fingerprint vector of industrial control equipment;
7) Identifying the industrial control equipment manufacturers in the target industrial control network through the industrial control equipment fingerprint vector classification model.
The industrial control protocol communication model ICS_CM in step 1) is expressed as:
$ICS\_CM = (C_E, C_D, C_T)$
where $C_E$ denotes the message sequence of the connection establishment phase, expressed as:
[formula not preserved in the source text]
where the three elements denote, respectively, the SYN message, the SYN_ACK message and the ACK message of the connection establishment phase; the superscripts src and dst indicate that the message sender is, respectively, the host that requests the TCP connection (the Client end) or the host that accepts the TCP connection (the Server end), and the subscript indicates the message type;
$C_D$ denotes the message sequence of the data transmission phase, which consists of one or more data transmissions, expressed as:
[formula not preserved in the source text]
where the i-th element denotes the i-th data transmission, $i = 1, \ldots, n$, and $n$ is the total number of data transmissions; each data transmission is expressed as:
[formula not preserved in the source text]
where the three elements denote, respectively, the request message of the i-th data transmission, the TCP ACK response message of the i-th data transmission, and the industrial control protocol data response message of the i-th data transmission;
$C_T$ denotes the message sequence of the connection ending phase, which consists of one or more message subsequences, expressed as:
[formula not preserved in the source text]
where the two kinds of subsequence denote, respectively, the message sequence with which the Client end requests termination of the TCP connection and the corresponding message sequence returned by the Server end; both are expressed as:
[formula not preserved in the source text]
The industrial control equipment fingerprint vector DF in step 1) is expressed as:
$DF = \{ITTL, IPDF, IWS, MSS, WSC, SAP, ILRT, TON, TSCON, TCF, RTD, FTS\}$
where ITTL and IPDF denote, respectively, the initial value of the TTL field and the DF flag bit in the IP protocol header of the [symbol missing] message; IWS, MSS, WSC and SAP denote, respectively, the values of the Window Size field, the Maximum Segment Size field, the Window Scale field and the SACK Permitted field in the TCP protocol header of the [symbol missing] message; ILRT denotes the time interval between the [symbol missing] message and the [symbol missing] message; TON denotes whether the [symbol missing] message contains the TCP timestamp option; TSCON denotes whether the TSecr field value of the [symbol missing] message is consistent with the TSval field value of the TCP timestamp option of the [symbol missing] message; TCF denotes the update frequency of the TSval field value of the TCP timestamp option at the Server end; RTD denotes the difference between the time interval of the [symbol missing] and [symbol missing] messages and the time interval of the [symbol missing] and [symbol missing] messages; FTS denotes the sequence of message types sent by the Server end in the connection ending phase.
The step 2) specifically comprises the following steps:
According to the industrial control protocol communication model, safety scanning is performed on industrial control equipment on the Internet using data messages in standard industrial control protocol format, and fingerprint vectors are extracted from the scanning traffic to form the industrial control equipment fingerprint vector library.
Acquiring the manufacturer information of the industrial control equipment in step 3) specifically comprises the following steps:
3.1) For industrial control equipment that supports returning manufacturer information through an industrial control protocol function code, during the safety scanning of the industrial control equipment the manufacturer information is queried using the function code of the corresponding industrial control protocol for acquiring device information;
3.2) For industrial control equipment that does not support returning manufacturer information through an industrial control protocol function code, access to the common ports supporting the HTTP/HTTPS protocol is further attempted, Banner information is extracted from the response headers and response body returned over HTTP/HTTPS, and the manufacturer information of the industrial control equipment is thereby obtained.
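As one illustration of step 3.1), Modbus/TCP defines function code 0x2B with MEI type 0x0E (Read Device Identification), whose basic objects include the vendor name. The sketch below only builds such a request frame; it is not taken from the patent, and the helper name `build_read_device_id_request` and its default transaction and unit identifiers are assumptions made for illustration.

```python
import struct


def build_read_device_id_request(transaction_id: int = 1, unit_id: int = 1) -> bytes:
    """Build a Modbus/TCP 'Read Device Identification' request frame.

    Hypothetical helper, not part of the patent. The PDU is:
      0x2B  function code (Encapsulated Interface Transport)
      0x0E  MEI type (Read Device Identification)
      0x01  read device ID code (basic identification, stream access)
      0x00  first object id to read (VendorName)
    """
    pdu = bytes([0x2B, 0x0E, 0x01, 0x00])
    # MBAP header: transaction id, protocol id (0 = Modbus),
    # length of the remaining bytes (unit id + PDU), unit id.
    mbap = struct.pack(">HHHB", transaction_id, 0, len(pdu) + 1, unit_id)
    return mbap + pdu
```

A scanner would send this frame over an established TCP connection and parse the returned objects; devices that do not implement the function typically answer with an exception response, in which case the HTTP/HTTPS fallback of step 3.2) applies.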
Step 4) is specifically as follows:
Data normalization and data discretization are performed on all attribute values in the industrial control equipment fingerprint vector, and data distribution fitting is performed on the time-type attributes ILRT and RTD;
The data distribution fitting is specifically as follows: if the number of fingerprint vectors of the industrial control equipment of a known manufacturer is greater than 10, the time-type attributes of that manufacturer are fitted with a normal distribution to obtain the corresponding mean $\mu$ and standard deviation $\sigma$; if an attribute value has a corresponding normal distribution, the normal distribution distance $d_n(x_i, x_j)$ from another attribute value to that attribute value is:
[formula not preserved in the source text]
where $x_j$ denotes the attribute value that has a normal distribution, $\mu_j$ and $\sigma_j$ denote the mean and standard deviation of the corresponding normal distribution, and $x_i$ denotes the attribute value whose normal distribution distance is to be calculated. The distance between two attribute values is expressed as:
[formula not preserved in the source text]
where dist denotes the distance between attribute values.
The step 5) specifically comprises the following steps:
5.1) Creating m feature subsets of the industrial control equipment fingerprint vector, each feature subset comprising k different features randomly selected from the industrial control equipment fingerprint vector;
5.2) Creating a density-peak-based clustering model (DPCM) for each feature subset using the industrial control equipment fingerprint vector library, giving m independent DPCM models in total;
5.3) For each cluster in a DPCM model, if the cluster contains industrial control equipment fingerprint vectors with manufacturer labels, the manufacturer label of the cluster is set, by voting, to the manufacturer label with the largest number of votes; otherwise, its manufacturer label is marked as unknown.
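Steps 5.1) and 5.3) can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the function names, the use of `None` for unlabeled vectors and the tie-breaking rule (the label seen first wins) are assumptions.

```python
import random
from collections import Counter

# The twelve attributes of the fingerprint vector DF named in the text.
FEATURES = ["ITTL", "IPDF", "IWS", "MSS", "WSC", "SAP",
            "ILRT", "TON", "TSCON", "TCF", "RTD", "FTS"]


def make_feature_subsets(m, k, seed=None):
    """Step 5.1: draw m subsets, each with k distinct randomly chosen features."""
    rng = random.Random(seed)
    return [rng.sample(FEATURES, k) for _ in range(m)]


def label_cluster(vendor_labels):
    """Step 5.3: majority vote over the labeled vectors of one cluster.

    vendor_labels holds one entry per fingerprint vector in the cluster;
    None marks a vector without a manufacturer label (an assumption of
    this sketch). Ties go to the label encountered first.
    """
    votes = Counter(v for v in vendor_labels if v is not None)
    if not votes:
        return "unknown"
    return votes.most_common(1)[0][0]
```

Each feature subset would then be used to build one DPCM model, so that no single attribute dominates every model.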
Step 5.2) specifically comprises the following steps:
5.2.1) According to the set cut-off distance $d_c$, calculate the local density $\rho_i$ of each fingerprint vector $DF_i$ and its $d_c$ neighborhood. The local density $\rho_i$ is calculated as:
[formula not preserved in the source text]
where $d_{ij}$ denotes the distance between the industrial control equipment fingerprint vectors $DF_i$ and $DF_j$;
the $d_c$ neighborhood is expressed as:
[formula not preserved in the source text]
where $DF_j$ denotes a fingerprint vector whose distance $d_{ij}$ is less than $d_c$;
5.2.2) Sort all fingerprint vectors in descending order of local density to obtain the number sequence $\Gamma$ of the fingerprint vectors, expressed as:
$\Gamma = \langle s_1, s_2, \ldots, s_n \rangle$
where $s_i$ denotes the number of the fingerprint vector with the i-th largest local density; for example, if the local density of $DF_j$ is the i-th largest, then $s_i = j$;
5.2.3) Calculate, for each fingerprint vector $DF_i$, the distance $\delta_i$ to the nearest fingerprint vector $DF_j$ with a higher local density, where the nearest distance $\delta_i$ is calculated as:
[formula not preserved in the source text]
5.2.4) Create the first cluster from the fingerprint vector numbered $s_1$; thereafter, perform the cluster assignment check and the cluster merge check on each fingerprint vector in turn, in the order of the number sequence $\Gamma$.
The cluster assignment check is: if the nearest distance $\delta$ of the fingerprint vector is smaller than the cut-off distance, the fingerprint vector is assigned to the cluster of the nearest fingerprint vector with a higher density; otherwise, a new cluster is created for the fingerprint vector;
The cluster merge check is: after the cluster of the fingerprint vector has been determined, check whether all clusters to which the higher-density fingerprint vectors in its $d_c$ neighborhood belong are the same as the cluster to which the fingerprint vector itself belongs; if not, merge all clusters to which the higher-density fingerprint vectors in the $d_c$ neighborhood of the fingerprint vector belong into a new cluster.
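The quantities used in steps 5.2.1) to 5.2.3) can be sketched as below. Because the density formula is not preserved in the source text, this sketch assumes the cut-off kernel commonly used in density-peak clustering (the number of other vectors closer than $d_c$), which may differ from the patent's formula. It also treats "higher density" as "earlier in the sequence Γ" so that ties are resolved, and leaves the nearest distance of the densest vector at infinity; both are assumptions. Indices are zero-based, and all function names are hypothetical.

```python
import math


def local_density(dist, d_c):
    """Assumed cut-off kernel: count the other vectors closer than d_c."""
    n = len(dist)
    return [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
            for i in range(n)]


def density_order(rho):
    """Sequence Gamma: indices sorted by descending density (stable on ties)."""
    return sorted(range(len(rho)), key=lambda i: -rho[i])


def nearest_higher_density(dist, order):
    """For each vector, the distance to (and index of) the nearest vector
    that precedes it in Gamma, i.e. that has higher density."""
    n = len(order)
    delta = [math.inf] * n
    parent = [None] * n
    for pos, i in enumerate(order):
        for j in order[:pos]:
            if dist[i][j] < delta[i]:
                delta[i] = dist[i][j]
                parent[i] = j
    return delta, parent
```

With these values, the cluster assignment check of step 5.2.4) compares `delta[i]` with `d_c` and, when it is smaller, places vector `i` in the cluster of `parent[i]`.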
Step 7) specifically comprises the following steps:
7.1) The set of different fingerprint vectors generated by the same industrial control device is called a fingerprint vector packet, BoDF, expressed as:
$BoDF_i = \{DF_1, DF_2, \ldots, DF_p\}$
where $p$ denotes the number of fingerprint vectors generated by industrial control device $i$;
7.2) The j-th fingerprint vector $DF_j$ of industrial control device $i$ is input to the m independent DPCM models generated in step 5) for classification. A DPCM model classifies the fingerprint vector $DF_j$ as follows: calculate the minimum distance from $DF_j$ to all clusters in the DPCM model; if the minimum distance is smaller than the cut-off distance $d_c$, classify $DF_j$ to the manufacturer to which the corresponding cluster belongs; otherwise, judge that the fingerprint vector belongs to an unknown manufacturer outside the fingerprint vector library;
7.3) The classification results generated by the m DPCM models for the fingerprint vector $DF_j$ are aggregated by a voting method, expressed as:
[formula not preserved in the source text]
where $G(DF_j)$ denotes the final classification result of $DF_j$, namely the manufacturer label with the highest number of votes; $g_l$ denotes the l-th DPCM model, and the output of $g_l$ for $DF_j$ on manufacturer label $c_k$ is 1 if the classification result of $g_l$ is $c_k$ and 0 otherwise; if the number of votes obtained by every manufacturer label is 0, $DF_j$ is classified as an unknown manufacturer;
7.4) The classification results $C = \{C_1, C_2, \ldots, C_p\}$ corresponding to the fingerprint vector packet $BoDF_i$ of industrial control device $i$ are integrated by a weighted voting method to obtain the manufacturer label of industrial control device $i$. The weighted voting method is expressed as:
$\Gamma(BoDF_i) = \arg\max_{c_k} \sum_{j=1}^{p} \omega_j \, G_k(DF_j)$
where $\Gamma(BoDF_i)$ denotes the final classification result of $BoDF_i$; $G_k(DF_j)$ denotes the output of $G(DF_j)$ on manufacturer label $c_k$: if the classification result of $G(DF_j)$ is $c_k$, the value of $G_k(DF_j)$ is 1, and otherwise it is 0; $\omega_j$ denotes the weight of the j-th fingerprint vector $DF_j$ in $BoDF_i$, expressed as:
[formula not preserved in the source text]
where $T_j$ denotes the number of data messages involved in the fingerprint vector $DF_j$.
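The two voting stages of steps 7.3) and 7.4) can be sketched as follows. The sketch makes several assumptions that are not stated in the source: the string "unknown" stands for an unknown manufacturer, an unknown result is treated like any other label in the weighted vote, ties go to the label seen first, and the weights are supplied by the caller because the weight formula is not preserved in the source text. The function names are hypothetical.

```python
from collections import Counter, defaultdict

UNKNOWN = "unknown"  # placeholder label assumed by this sketch


def vote_models(model_outputs):
    """Step 7.3: majority vote over the outputs of the m DPCM models.

    Outputs equal to UNKNOWN cast no vote; if no manufacturer label
    receives a vote, the fingerprint vector is classified as unknown.
    """
    votes = Counter(o for o in model_outputs if o != UNKNOWN)
    if not votes:
        return UNKNOWN
    return votes.most_common(1)[0][0]


def vote_packet(results, weights):
    """Step 7.4: weighted vote over the per-vector results of one device.

    results[j] is the label of the j-th fingerprint vector and weights[j]
    its weight; the label with the largest summed weight wins.
    """
    scores = defaultdict(float)
    for label, weight in zip(results, weights):
        scores[label] += weight
    if not scores:
        return UNKNOWN
    return max(scores, key=scores.get)
```

For a device with three fingerprint vectors, `vote_models` would be called once per vector with that vector's m model outputs, and `vote_packet` once with the three resulting labels and their weights.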
The invention has the following beneficial effects and advantages:
1. A network communication model capable of describing the communication processes of various industrial control protocols is provided, and an industrial control equipment fingerprint vector usable in both active and passive detection scenarios is constructed, which effectively addresses the problems that an industrial control equipment fingerprint vector library is difficult to establish and has poor universality.
2. A density-peak-based random clustering algorithm and a composite classification algorithm based on fingerprint vector packets are provided. They effectively address the problem that different attributes in the industrial control equipment fingerprint vector differ in importance during manufacturer identification, as well as the problem of aggregating the multiple different fingerprint vectors that the same industrial control equipment generates under different network communication conditions, thereby improving the accuracy of manufacturer identification.
Drawings
FIG. 1 is a schematic flow chart of the method of the present invention;
FIG. 2 is a communication data message generated by performing security scanning on an industrial control device according to an embodiment of the present invention;
FIG. 3 is a set of data messages generated by an industrial control device in a target industrial control network according to an embodiment of the present invention;
FIG. 4 is a set of data packets generated by an industrial control device in a target industrial control network according to an embodiment of the present invention.
Detailed Description
The invention is described in further detail below with reference to the drawings and the specific embodiments, but is not intended to limit the technical scope of the invention.
As shown in fig. 1, an industrial control equipment manufacturer identification method based on an industrial control protocol communication model includes the following steps:
1) Constructing an industrial control equipment fingerprint vector applicable to various industrial control protocols based on the industrial control protocol communication model;
2) Carrying out safety scanning on industrial control equipment on the Internet according to the industrial control protocol communication model, and extracting fingerprint vectors based on scanning flow to form an industrial control equipment fingerprint vector library;
3) Acquiring manufacturer information of industrial control equipment, and marking fingerprint vectors of the corresponding industrial control equipment;
4) Preprocessing attribute values in fingerprint vectors of industrial control equipment to enable the attribute values to be suitable for a clustering algorithm;
5) Clustering the fingerprint vectors of the industrial control equipment by a random clustering algorithm based on the density peak values to form a classification model of the fingerprint vectors of the industrial control equipment;
6) Monitoring the flow in the target industrial control network, and extracting the fingerprint vector of industrial control equipment;
7) And identifying industrial control equipment manufacturers in the target industrial control network through a composite classification algorithm based on the fingerprint vector packet.
The industrial control protocol communication model (ICS_CM) in step 1) is a network communication model that can be used to describe the main communication process of various industrial control protocols (such as Modbus/TCP, EtherNet/IP, etc.) that are based on TCP and use the Client/Server communication mode. It is expressed as:
$ICS\_CM = (C_E, C_D, C_T)$
ICS_CM describes the whole communication procedure over one TCP connection, which mainly consists of a connection establishment phase, a data transmission phase and a connection ending phase. $C_E$ denotes the message sequence of the connection establishment phase, expressed as:
[formula not preserved in the source text]
where the three elements denote, respectively, the SYN message, the SYN_ACK message and the ACK message of the connection establishment phase; the superscripts src and dst indicate that the message sender is, respectively, the host that requests the TCP connection (the Client end) or the host that accepts the TCP connection (the Server end), and the subscript indicates the message type. $C_D$ denotes the message sequence of the data transmission phase, which consists of one or more data transmissions, expressed as:
[formula not preserved in the source text]
where the i-th element denotes the i-th data transmission and $n$ is the total number of data transmissions; each data transmission is expressed as:
[formula not preserved in the source text]
where the three elements denote, respectively, the request message of the i-th data transmission, the TCP ACK response message of the i-th data transmission, and the industrial control protocol data response message of the i-th data transmission. The TCP ACK response message and the industrial control protocol data response message are optional: the Server end may reply with both messages, with only one of them, or with neither. $C_T$ denotes the message sequence of the connection ending phase, which consists of one or more message subsequences, expressed as:
[formula not preserved in the source text]
where the two kinds of subsequence denote, respectively, the message sequence with which the Client end requests termination of the TCP connection and the corresponding message sequence returned by the Server end; both are expressed as:
[formula not preserved in the source text]
where the types and number of the messages contained in $C_T$ and [symbol missing] are determined by the protocol stack implementation of the specific industrial control equipment.
Based on the industrial control protocol communication model (ICS_CM) described above, the industrial control equipment fingerprint vector (DF) in step 1) is expressed as:
$DF = \{ITTL, IPDF, IWS, MSS, WSC, SAP, ILRT, TON, TSCON, TCF, RTD, FTS\}$
where ITTL and IPDF denote, respectively, the initial value of the TTL field and the DF flag bit in the IP protocol header of the [symbol missing] message; IWS, MSS, WSC and SAP denote, respectively, the values of the Window Size field, the Maximum Segment Size field, the Window Scale field and the SACK Permitted field in the TCP protocol header of the [symbol missing] message; ILRT denotes the time interval between the [symbol missing] message and the [symbol missing] message; TON denotes whether the [symbol missing] message contains the TCP timestamp option; TSCON denotes whether the TSecr field value of the [symbol missing] message is consistent with the TSval field value of the TCP timestamp option of the [symbol missing] message; TCF denotes the update frequency of the TSval field value of the TCP timestamp option at the Server end; RTD denotes the difference between the time interval of the [symbol missing] and [symbol missing] messages and the time interval of the [symbol missing] and [symbol missing] messages; FTS denotes the sequence of message types sent by the Server end in the connection ending phase, where an [ACK] message has type 1, a [FIN, ACK] message has type 2, a [RST, ACK] message has type 3, a [RST] message has type 4, an [ACK] message whose TCP data part length is greater than 0 has type 5, and all other messages have type 0.
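The FTS encoding described above can be sketched as a small lookup. The sketch assumes that each Server-end packet is given as a pair of its TCP flag names and its TCP payload length, and that a plain [ACK] with an empty payload is the case mapped to type 1; the function names are hypothetical.

```python
def fts_type(flags, payload_len=0):
    """Map one Server-end packet of the connection ending phase to its type code."""
    f = frozenset(name.upper() for name in flags)
    if f == {"ACK"}:
        # A pure ACK carrying data is type 5; an empty pure ACK is type 1.
        return 5 if payload_len > 0 else 1
    if f == {"FIN", "ACK"}:
        return 2
    if f == {"RST", "ACK"}:
        return 3
    if f == {"RST"}:
        return 4
    return 0  # every other flag combination


def fts_sequence(packets):
    """Encode a list of (flags, payload_len) pairs as the FTS attribute."""
    return [fts_type(flags, payload_len) for flags, payload_len in packets]
```

For example, a Server end that acknowledges the Client's FIN and then sends its own [FIN, ACK] would be encoded as the sequence `[1, 2]`.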
Step 2) specifically comprises the following:
According to the industrial control protocol communication model, safety scanning is performed on industrial control equipment on the Internet using data messages in standard industrial control protocol format, and fingerprint vectors are extracted from the scanning traffic to form the industrial control equipment fingerprint vector library.
Acquiring the manufacturer information of the industrial control equipment in step 3) specifically comprises the following steps:
3.1) For industrial control equipment that supports returning manufacturer information through an industrial control protocol function code, during the safety scanning of step 2) the manufacturer information is queried using the function code of the corresponding industrial control protocol for acquiring device information;
3.2) For industrial control equipment that does not support the above function, access to the common ports supporting the HTTP/HTTPS protocol (such as 80, 8080, etc.) is further attempted, Banner information (usually including the Server name, the web page Title, etc.) is extracted from the response headers and response body returned over HTTP/HTTPS, and the manufacturer information of the industrial control equipment is thereby obtained.
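The Banner extraction of step 3.2) can be sketched as a parser over a raw HTTP response. This is an illustration under assumptions: the response is already available as decoded text with CRLF line endings, only the Server header and the HTML title are extracted, and mapping the banner to a manufacturer name is left out. The function name `extract_banner` is hypothetical.

```python
import re


def extract_banner(raw_response: str) -> dict:
    """Pull the Server header and the page title out of a raw HTTP response."""
    head, _, body = raw_response.partition("\r\n\r\n")
    server = None
    # Skip the status line, then look for a case-insensitive 'Server' header.
    for line in head.split("\r\n")[1:]:
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "server":
            server = value.strip()
            break
    match = re.search(r"<title[^>]*>(.*?)</title>", body,
                      re.IGNORECASE | re.DOTALL)
    title = match.group(1).strip() if match else None
    return {"server": server, "title": title}
```

The returned fields would then be matched against known manufacturer names to label the corresponding fingerprint vectors.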
The preprocessing of the attribute values in the industrial control device fingerprint vector in step 4) mainly comprises data normalization, data discretization and data distribution fitting.
Data distribution fitting applies to the time-class attributes ILRT and RTD. If the number of fingerprint vectors of the industrial control devices of a known manufacturer is greater than 10, the time-class attributes of that manufacturer are fitted with a normal distribution to obtain the corresponding mean μ and standard deviation σ. If an attribute value has a corresponding normal distribution, the normal distribution distance from another attribute value to it is calculated as:
d_n(x_i, x_j) = |x_i − μ_j| / (3σ_j)
where x_j denotes the attribute value for which a normal distribution exists, μ_j and σ_j denote the mean and standard deviation of that distribution, and x_i denotes the attribute value whose normal distribution distance is to be calculated. The distance between two attribute values is expressed as:
where dist denotes the distance between attribute values and is used by the subsequent clustering algorithm to compute the distance between industrial control device fingerprint vectors; for the other attributes of the fingerprint vector, the distance is computed as in the case where no normal distribution exists. The distance between industrial control device fingerprint vectors is the Euclidean distance.
The specific steps of the density-peak-based random clustering algorithm in step 5) are as follows:
5.1) Create m feature subsets of the industrial control device fingerprint vector, each containing k different features randomly selected from the fingerprint vector;
5.2) Using the industrial control device fingerprint vector library, create a density-peak-based clustering model (DPCM) for each feature subset, giving m independent DPCM models in total;
5.3) For each cluster in a DPCM model, if the cluster contains fingerprint vectors that carry manufacturer labels, assign the cluster, by voting, the manufacturer label with the most votes; otherwise, mark its manufacturer label as unknown.
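Steps 5.1) and 5.3) can be sketched as below. This is a minimal illustration, assuming the twelve attribute names of the fingerprint vector DF as features, Python's `random.sample` for the random selection, and a simple majority count for the voting; the function names and the use of `None` for unlabeled vectors are hypothetical.

```python
import random
from collections import Counter

# Attribute names of the fingerprint vector DF, in the order given in the text.
FEATURES = ["ITTL", "IPDF", "IWS", "MSS", "WSC", "SAP",
            "ILRT", "TON", "TSCON", "TCF", "RTD", "FTS"]


def make_feature_subsets(m, k, seed=None):
    """Step 5.1: draw m subsets, each with k distinct features."""
    rng = random.Random(seed)
    return [rng.sample(FEATURES, k) for _ in range(m)]


def label_clusters(cluster_ids, vendor_labels):
    """Step 5.3: give each cluster the most frequent known vendor label.

    vendor_labels[i] is the vendor of vector i, or None if it is unlabeled.
    Clusters without any labeled vector are marked "unknown".
    """
    votes = {}
    for cid, vendor in zip(cluster_ids, vendor_labels):
        counter = votes.setdefault(cid, Counter())
        if vendor is not None:
            counter[vendor] += 1
    return {cid: (c.most_common(1)[0][0] if c else "unknown")
            for cid, c in votes.items()}
```

With m = 10 and k = 6, `make_feature_subsets(10, 6)` yields the kind of subsets used in the embodiment; each subset then feeds one DPCM model.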
The specific steps of creating the density-peak-based clustering model (DPCM) in step 5.2) are as follows:
5.2.1) According to the cutoff distance d_c, calculate the local density ρ_i of each fingerprint vector DF_i and its d_c neighborhood. The local density ρ_i is calculated as:
where d_ij denotes the distance between the industrial control device fingerprint vectors DF_i and DF_j. The d_c neighborhood is expressed as:
where DF_j denotes a fingerprint vector whose distance d_ij is less than d_c.
5.2.2) Sort all fingerprint vectors in descending order of local density to obtain the index sequence Γ of the fingerprint vectors, expressed as:
Γ = <s_1, s_2, …, s_n>
where s_i denotes the index of the fingerprint vector with the i-th largest local density; for example, if the local density of DF_j is the i-th largest, then s_i = j.
5.2.3) Calculate the distance δ_i between each fingerprint vector DF_i and the nearest fingerprint vector DF_j with higher local density, where the nearest distance δ_i is calculated as:
5.2.4) Use the fingerprint vector numbered s_1 to create the first cluster; then perform a cluster attribution test and a cluster merge test on each fingerprint vector in turn, in the order of the index sequence Γ. Cluster attribution test: if the nearest distance δ of the current fingerprint vector is smaller than the cutoff distance, assign the vector to the cluster of the nearest fingerprint vector with higher density; otherwise, create a new cluster for it. Cluster merge test: once the cluster of the current fingerprint vector has been determined, check whether all higher-density fingerprint vectors in its d_c neighborhood belong to the same cluster as it does; if not, merge all the clusters involved into one new cluster.
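Steps 5.2.1) to 5.2.4) can be sketched as follows. Several details are assumptions because the formulas are not legible in this text: the local density is taken to be the count of other vectors closer than d_c (a cutoff kernel), ties in density are broken by index, "higher density" is interpreted as "earlier in Γ", and the function name `dpcm_cluster` is hypothetical.

```python
import math


def dpcm_cluster(vectors, d_c):
    """Return one cluster id per vector, following steps 5.2.1 to 5.2.4."""
    n = len(vectors)
    dist = [[math.dist(a, b) for b in vectors] for a in vectors]

    # 5.2.1: d_c neighbourhood and local density (cutoff kernel, assumed).
    neigh = [[j for j in range(n) if j != i and dist[i][j] < d_c]
             for i in range(n)]
    rho = [len(nb) for nb in neigh]

    # 5.2.2: index sequence in descending order of local density.
    order = sorted(range(n), key=lambda i: (-rho[i], i))
    rank = {idx: r for r, idx in enumerate(order)}

    label = [None] * n
    next_id = 0
    for r, i in enumerate(order):
        denser = order[:r]
        if not denser:
            # 5.2.4: the densest vector starts the first cluster.
            label[i] = next_id
            next_id += 1
            continue
        # 5.2.3: nearest vector with higher density and the distance delta_i.
        j = min(denser, key=lambda k: dist[i][k])
        if dist[i][j] < d_c:
            label[i] = label[j]          # cluster attribution test passed
        else:
            label[i] = next_id           # otherwise open a new cluster
            next_id += 1
        # Cluster merge test: denser neighbours must share this cluster.
        for k in neigh[i]:
            if rank[k] < r and label[k] != label[i]:
                old = label[k]
                label = [label[i] if c == old else c for c in label]
    return label
```

Cluster ids are not renumbered after a merge, so they may be non-contiguous; only equality of ids matters.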
Step 6) is specifically as follows:
Network traffic in the target industrial control network is exported by switch port mirroring; the traffic of the relevant industrial control protocol is filtered out according to the protocol's port number, and the fingerprint vectors of the corresponding industrial control devices are then extracted.
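A minimal sketch of the port-based filtering in step 6) is shown below, assuming packets have already been parsed into dictionaries with `src_ip`, `src_port`, `dst_ip` and `dst_port` keys (a hypothetical format). Port 502 is the well-known Modbus/TCP port; the grouping by connection reflects the embodiment, where each TCP connection yields one fingerprint vector.

```python
from collections import defaultdict

MODBUS_PORT = 502  # well-known Modbus/TCP port


def split_by_connection(packets, port):
    """Keep packets to or from `port` and group them per TCP connection.

    The key is (client_ip, client_port, server_ip, server_port), where the
    server is the side listening on the industrial control protocol port.
    """
    conns = defaultdict(list)
    for p in packets:
        if p["dst_port"] == port:
            key = (p["src_ip"], p["src_port"], p["dst_ip"], p["dst_port"])
        elif p["src_port"] == port:
            key = (p["dst_ip"], p["dst_port"], p["src_ip"], p["src_port"])
        else:
            continue  # not traffic of the selected protocol
        conns[key].append(p)
    return dict(conns)
```

Each value in the returned dictionary is the message list from which one fingerprint vector would be extracted.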
The specific steps of the fingerprint-vector-packet-based composite classification algorithm in step 7) are as follows:
7.1) The set of different fingerprint vectors generated by the same industrial control device is called a fingerprint vector packet (BoDF), expressed as:
BoDF_i = {DF_1, DF_2, …, DF_p}
where p denotes the number of fingerprint vectors generated by industrial control device i.
7.2) The j-th fingerprint vector DF_j of industrial control device i is input to the m independent DPCM models generated in step 5) for classification. A DPCM model classifies DF_j as follows: calculate the minimum distance from DF_j to all clusters in the DPCM model; if the minimum distance is smaller than the cutoff distance d_c, classify DF_j under the manufacturer of the corresponding cluster; otherwise, determine that it belongs to an unknown manufacturer outside the fingerprint vector library.
7.3) The classification results that the m DPCM models produce for fingerprint vector DF_j are aggregated by voting, expressed as:
G(DF_j) = argmax_{c_k} Σ_{l=1}^{m} g_l^k(DF_j)
where G(DF_j) denotes the final classification result (i.e., the manufacturer label) of DF_j, argmax selects the manufacturer label with the most votes, g_l denotes the l-th DPCM model, and g_l^k(DF_j) denotes the output of g_l for DF_j on manufacturer label c_k: if the classification result of g_l is c_k, its value is 1; otherwise it is 0. If every manufacturer label receives 0 votes, DF_j is classified as an unknown manufacturer.
7.4) The classification results C = {C_1, C_2, …, C_p} corresponding to the fingerprint vector packet BoDF_i of industrial control device i are integrated by weighted voting to obtain the manufacturer label of device i, expressed as:
Γ(BoDF_i) = argmax_{c_k} Σ_{j=1}^{p} ω_j · G_k(DF_j)
where Γ(BoDF_i) denotes the final classification result of BoDF_i, and G_k(DF_j) denotes the output of G(DF_j) on manufacturer label c_k: if the classification result of G(DF_j) is c_k, its value is 1; otherwise it is 0. ω_j denotes the weight of the j-th fingerprint vector DF_j in BoDF_i, expressed as:
ω_j = T_j / Σ_{q=1}^{p} T_q
where T_j denotes the number of data messages contained in fingerprint vector DF_j.
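Steps 7.2) to 7.4) can be sketched as follows. The sketch makes several assumptions: the distance from a vector to a cluster is taken as the distance to its nearest member, the vector is assumed to be already projected onto the model's feature subset, "unknown" votes are ignored in step 7.3) but treated as an ordinary label in step 7.4), and the weight is taken as T_j divided by the total number of data messages, which matches the 0.5 weights of the two 12-message vectors in the embodiment. All function names are hypothetical.

```python
import math
from collections import Counter, defaultdict

UNKNOWN = "unknown"


def classify_with_model(vector, clusters, d_c):
    """Step 7.2: clusters is a list of (vendor_label, member_vectors)."""
    best_label, best_dist = UNKNOWN, math.inf
    for label, members in clusters:
        d = min(math.dist(vector, member) for member in members)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label if best_dist < d_c else UNKNOWN


def majority_vote(model_outputs):
    """Step 7.3: most voted vendor label, or unknown if no label got a vote."""
    votes = Counter(c for c in model_outputs if c != UNKNOWN)
    return votes.most_common(1)[0][0] if votes else UNKNOWN


def classify_device(vector_labels, message_counts):
    """Step 7.4: weighted vote over the fingerprint vector packet."""
    total = sum(message_counts)
    scores = defaultdict(float)
    for label, t in zip(vector_labels, message_counts):
        scores[label] += t / total   # weight of this fingerprint vector
    return max(scores, key=scores.get)
```

For two vectors that are both classified as Schneider Electric and contain 12 data messages each, `classify_device` gives each a weight of 0.5 and returns Schneider Electric.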
Examples
The embodiment of the invention provides an overall flow of an industrial control equipment manufacturer identification method based on an industrial control protocol communication model, which specifically comprises the following steps:
Step 1: Construct an industrial control device fingerprint vector applicable to multiple industrial control protocols, based on the industrial control protocol communication model.
Step 2: Perform a security scan of industrial control devices on the Internet according to the industrial control protocol communication model, and extract fingerprint vectors from the scan traffic to form the industrial control device fingerprint vector library. For example, when a scan server with IP address 192.168.158.111 scans an Internet-exposed industrial control device with IP address 95.173..40, the communication message sequence shown in Fig. 2 is obtained, from which the following fingerprint vector is extracted:
DF={255,false,6000,1400,-1,false,0,false,false,-1,0.01911,12}
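To make the positional vector above easier to read, the sketch below pairs each value with its attribute name, following the order {ITTL, IPDF, IWS, MSS, WSC, SAP, ILRT, TON, TSCON, TCF, RTD, FTS} given for DF. The `DeviceFingerprint` type is hypothetical, and reading -1 as "option or value not present" is an assumption rather than something stated in the text.

```python
from typing import NamedTuple


class DeviceFingerprint(NamedTuple):
    ITTL: int      # initial TTL
    IPDF: bool     # IP DF flag
    IWS: int       # TCP Window Size
    MSS: int       # Maximum Segment Size
    WSC: int       # Window Scale (-1 assumed to mean "absent")
    SAP: bool      # SACK Permitted
    ILRT: float    # time interval attribute
    TON: bool      # TCP timestamp option present
    TSCON: bool    # TSecr/TSval consistency
    TCF: float     # TSval update frequency (-1 assumed to mean "absent")
    RTD: float     # difference of time intervals
    FTS: int       # connection-ending message type sequence


# The example vector from Step 2.
df = DeviceFingerprint(255, False, 6000, 1400, -1, False, 0, False, False, -1, 0.01911, 12)
fields = df._asdict()
```

Read this way, the example device reports an initial TTL of 255, a window size of 6000, an MSS of 1400, and an FTS of 12.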
Step 3: Acquire the manufacturer information of the industrial control devices and label the corresponding fingerprint vectors. For a device that supports returning manufacturer information through an industrial control protocol function code, such as the device with IP address 95.173..40 in Fig. 2, the manufacturer is ABB. For a device that does not support this function, such as the device with IP address 104.169..178, port 80 is open and serves HTTP, and the manufacturer, Rockwell Automation, can be extracted from the web page Title in the HTTP response body.
Step 4: Preprocess the attribute values in the industrial control device fingerprint vectors so that they are suitable for the clustering algorithm. Taking as an example the RTD attribute of Schneider Electric devices supporting the Modbus protocol obtained in one round of scanning, 524 attribute values are obtained in total, with a mean of -0.0272 and a standard deviation of 0.27091. When the RTD attribute value of a device is 0.035, its normal distribution distance is 0.0765; when the RTD attribute value is 0.739, its normal distribution distance is 0.9427.
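The two worked values in Step 4 can be reproduced with a normal distribution distance of the form |x − μ| / (3σ). The short check below assumes that form; the function name `normal_distance` is hypothetical.

```python
def normal_distance(x, mu, sigma):
    """Distance of x from a fitted normal distribution, scaled by 3 sigma."""
    return abs(x - mu) / (3 * sigma)


# Mean and standard deviation of the fitted RTD distribution from Step 4.
MU, SIGMA = -0.0272, 0.27091

d_small = round(normal_distance(0.035, MU, SIGMA), 4)  # 0.0765
d_large = round(normal_distance(0.739, MU, SIGMA), 4)  # 0.9427
```

Under this form, a value three standard deviations from the mean has distance 1, so typical values of a fitted manufacturer fall within [0, 1].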
Step 5: Cluster the industrial control device fingerprint vectors with the density-peak-based random clustering algorithm to form the classification model of the fingerprint vectors. First, 10 feature subsets of the fingerprint vector are created, each containing 6 different features randomly selected from the fingerprint vector; next, the cutoff distance d_c is set to 0.01 and 10 independent DPCM models are created from the feature subsets; finally, a manufacturer label is created for each cluster in the DPCM models by voting.
Step 6: Export the network traffic in the target industrial control network by switch port mirroring, filter out the traffic of the relevant industrial control protocol according to the protocol's port number, and extract the fingerprint vectors of the corresponding industrial control devices. Taking the industrial control device with IP address 192.168.0.2 as an example, the two TCP connections it generates are shown in Fig. 3 and Fig. 4, and two differing fingerprint vectors DF_1 and DF_2 are obtained:
DF_1 = {64,true,8192,-1,-1,false,0,false,false,-1,-0.0027,1234}
DF_2 = {64,true,8192,-1,-1,false,0,false,false,-1,0.00587,1234}
Step 7: Identify the manufacturers of the industrial control devices in the target industrial control network with the composite classification algorithm based on the fingerprint vector packet. Taking the industrial control device with IP address 192.168.0.2 as an example, its fingerprint vector packet (BoDF) is expressed as:
BoDF = {DF_1, DF_2}
The fingerprint vectors are classified with the 10 independent DPCM models generated in Step 5, and the classification result for both DF_1 and DF_2 is Schneider Electric. Furthermore, DF_1 and DF_2 each contain 12 data messages, so each has a weight of 0.5. Integrating the classification results by weighted voting therefore gives Schneider Electric as the manufacturer label of the industrial control device with IP address 192.168.0.2.

Claims (10)

1. An industrial control equipment manufacturer identification method based on an industrial control protocol communication model is characterized by comprising the following steps:
1) Constructing an industrial control equipment fingerprint vector applicable to various industrial control protocols based on the industrial control protocol communication model;
2) Carrying out safety scanning on industrial control equipment on the Internet according to the industrial control protocol communication model, and extracting fingerprint vectors based on scanning flow to form an industrial control equipment fingerprint vector library;
3) Acquiring manufacturer information of industrial control equipment, and marking fingerprint vectors of the corresponding industrial control equipment in a fingerprint vector library;
4) Preprocessing attribute values in fingerprint vectors of industrial control equipment;
5) Clustering the preprocessed industrial control equipment fingerprint vectors by a density peak value-based random clustering algorithm to obtain an industrial control equipment fingerprint vector classification model;
6) Monitoring the flow in the target industrial control network, and extracting the fingerprint vector of industrial control equipment;
7) And identifying industrial control equipment manufacturers in the target industrial control network through the industrial control equipment fingerprint vector classification model.
2. The industrial control equipment manufacturer identification method based on the industrial control protocol communication model according to claim 1, wherein the industrial control protocol communication model ICS_CM in step 1) is expressed as:
ICS_CM = (C_E, C_D, C_T)
where C_E denotes the message sequence of the connection establishment stage, expressed as:
where […], […] and […] denote, respectively, the SYN message, the SYN_ACK message and the ACK message of the connection establishment stage; the superscripts src and dst indicate that the sender of a message is, respectively, the host requesting the TCP connection (the Client) or the host accepting it (the Server); the subscript indicates the message type;
C_D denotes the message sequence of the data transmission stage, which consists of one or more data transmissions and is expressed as:
where […] denotes the i-th data transmission, i = 1, …, n, n denotes the total number of data transmissions, and the i-th data transmission is expressed as:
where […], […] and […] denote, respectively, the request message, the TCP ACK response message and the industrial control protocol data response message of the i-th data transmission;
C_T denotes the message sequence of the connection ending stage, which consists of one or more message subsequences and is expressed as:
where […] denotes the message sequence with which the Client requests termination of the TCP connection, […] denotes the corresponding message sequence returned by the Server, and the two sequences are expressed as:
3. The industrial control equipment manufacturer identification method based on the industrial control protocol communication model according to claim 1, wherein the industrial control equipment fingerprint vector DF in the step 1) is expressed as:
DF={ITTL,IPDF,IWS,MSS,WSC,SAP,ILRT,TON,TSCON,TCF,RTD,FTS}
where ITTL and IPDF denote, respectively, the initial value of the TTL field and the DF flag bit in the IP protocol header of the […] message; IWS, MSS, WSC and SAP denote, respectively, the values of the Window Size field, the Maximum Segment Size field, the Window Scale field and the SACK Permitted field in the TCP protocol header of the […] message; ILRT denotes the time interval between the […] message and the […] message; TON indicates whether the […] message contains the TCP timestamp option; TSCON indicates whether the TSecr field value of the TCP timestamp option of the […] message is consistent with the TSval field value of the TCP timestamp option of the […] message; TCF denotes the update frequency of the TSval field value of the TCP timestamp option on the Server side; RTD denotes the difference between the time interval of the […] and […] messages and the time interval of the […] and […] messages; FTS denotes the sequence of message types sent by the Server in the connection ending stage.
4. The industrial control equipment manufacturer identification method based on the industrial control protocol communication model according to claim 1, wherein the step 2) specifically comprises:
according to the industrial control protocol communication model, industrial control devices on the Internet are security-scanned using data messages in the standard industrial control protocol format, and fingerprint vectors are extracted from the scan traffic to form the industrial control device fingerprint vector library.
5. The industrial control equipment manufacturer identification method based on the industrial control protocol communication model according to claim 1, wherein the manufacturer information of the industrial control equipment is obtained in the step 3), and the specific steps are as follows:
3.1 For industrial control equipment supporting the return of manufacturer information through the industrial control protocol function code, the manufacturer information of the industrial control equipment is queried by using the acquired equipment information type function code corresponding to the industrial control protocol in the process of carrying out safety scanning on the industrial control equipment;
3.2 For the industrial control equipment which does not support the return of manufacturer information through the industrial control protocol function code, continuing to try to access the common port supporting the HTTP/HTTPS protocol, extracting Banner information from the response head and the response body returned by the HTTP/HTTPS protocol, and further acquiring the manufacturer information of the industrial control equipment.
6. The industrial control equipment manufacturer identification method based on the industrial control protocol communication model according to claim 1, wherein the step 4) specifically comprises:
performing data normalization and data discretization on all attribute values in the industrial control device fingerprint vector, and performing data distribution fitting on the time-class attributes ILRT and RTD;
the data distribution fitting is specifically as follows: if the number of fingerprint vectors of the industrial control devices of a known manufacturer is greater than 10, the time-class attributes of that manufacturer are fitted with a normal distribution to obtain the corresponding mean μ and standard deviation σ; if an attribute value has a corresponding normal distribution, the normal distribution distance d_n(x_i, x_j) from another attribute value to it is:
d_n(x_i, x_j) = |x_i − μ_j| / (3σ_j)
where x_j denotes the attribute value for which a normal distribution exists, μ_j and σ_j denote the mean and standard deviation of that distribution, and x_i denotes the attribute value whose normal distribution distance is to be calculated; the distance between two attribute values is expressed as:
where dist denotes the distance between attribute values.
7. The industrial control equipment manufacturer identification method based on the industrial control protocol communication model according to claim 1, wherein the step 5) specifically comprises:
5.1) creating m feature subsets of the industrial control device fingerprint vector, each containing k different features randomly selected from the fingerprint vector;
5.2) using the industrial control device fingerprint vector library, creating a density-peak-based clustering model DPCM for each feature subset, giving m independent DPCM models in total;
5.3) for each cluster in a DPCM model, if the cluster contains fingerprint vectors that carry manufacturer labels, assigning the cluster, by voting, the manufacturer label with the most votes; otherwise, marking its manufacturer label as unknown.
8. The industrial control equipment manufacturer identification method based on the industrial control protocol communication model according to claim 7, wherein the step 5.2) specifically comprises:
5.2.1) according to the set cutoff distance d_c, calculating the local density ρ_i of each fingerprint vector DF_i and its d_c neighborhood, the local density ρ_i being calculated as:
where d_ij denotes the distance between the industrial control device fingerprint vectors DF_i and DF_j;
the d_c neighborhood is expressed as:
where DF_j denotes a fingerprint vector whose distance d_ij is less than d_c;
5.2.2) sorting all fingerprint vectors in descending order of local density to obtain the index sequence Γ of the fingerprint vectors, expressed as:
Γ = <s_1, s_2, …, s_n>
where s_i denotes the index of the fingerprint vector with the i-th largest local density; for example, if the local density of DF_j is the i-th largest, then s_i = j;
5.2.3) calculating the distance δ_i between each fingerprint vector DF_i and the nearest fingerprint vector DF_j with higher local density, where the nearest distance δ_i is calculated as:
5.2.4) using the fingerprint vector numbered s_1 to create the first cluster; thereafter, performing a cluster attribution test and a cluster merge test on each fingerprint vector in turn, in the order of the index sequence Γ.
9. The industrial control equipment manufacturer identification method based on the industrial control protocol communication model according to claim 8, wherein the cluster attribution test is: if the nearest distance δ of the current fingerprint vector is smaller than the cutoff distance, the vector is assigned to the cluster of the nearest fingerprint vector with higher density; otherwise, a new cluster is created for it;
the cluster merge test is: once the cluster of the current fingerprint vector has been determined, checking whether all higher-density fingerprint vectors in its d_c neighborhood belong to the same cluster as it does; if not, all clusters to which the higher-density fingerprint vectors in its d_c neighborhood belong are merged to form a new cluster.
10. The industrial control equipment manufacturer identification method based on the industrial control protocol communication model according to claim 1, wherein the step 7) specifically comprises:
7.1) the set of different fingerprint vectors generated by the same industrial control device is called a fingerprint vector packet BoDF, expressed as:
BoDF_i = {DF_1, DF_2, …, DF_p}
where p denotes the number of fingerprint vectors generated by industrial control device i;
7.2) the j-th fingerprint vector DF_j of industrial control device i is input to the m independent DPCM models generated in step 5) for classification, a DPCM model classifying DF_j as follows: calculating the minimum distance from DF_j to all clusters in the DPCM model; if the minimum distance is smaller than the cutoff distance d_c, classifying DF_j under the manufacturer of the corresponding cluster; otherwise, determining that it belongs to an unknown manufacturer outside the fingerprint vector library;
7.3) aggregating, by voting, the classification results that the m DPCM models produce for fingerprint vector DF_j, expressed as:
G(DF_j) = argmax_{c_k} Σ_{l=1}^{m} g_l^k(DF_j)
where G(DF_j) denotes the final classification result of DF_j, i.e., the manufacturer label, argmax selects the manufacturer label with the most votes, g_l denotes the l-th DPCM model, and g_l^k(DF_j) denotes the output of g_l for DF_j on manufacturer label c_k; if the classification result of g_l is c_k, its value is 1, otherwise it is 0; if every manufacturer label receives 0 votes, DF_j is classified as an unknown manufacturer;
7.4) integrating, by weighted voting, the classification results C = {C_1, C_2, …, C_p} corresponding to the fingerprint vector packet BoDF_i of industrial control device i to obtain the manufacturer label of device i, the weighted voting being expressed as:
Γ(BoDF_i) = argmax_{c_k} Σ_{j=1}^{p} ω_j · G_k(DF_j)
where Γ(BoDF_i) denotes the final classification result of BoDF_i, and G_k(DF_j) denotes the output of G(DF_j) on manufacturer label c_k; if the classification result of G(DF_j) is c_k, its value is 1, otherwise it is 0; ω_j denotes the weight of the j-th fingerprint vector DF_j in BoDF_i, expressed as:
ω_j = T_j / Σ_{q=1}^{p} T_q
where T_j denotes the number of data messages contained in fingerprint vector DF_j.
CN202310691586.XA 2023-06-12 2023-06-12 Industrial control equipment manufacturer identification method based on industrial control protocol communication model Pending CN116668145A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310691586.XA CN116668145A (en) 2023-06-12 2023-06-12 Industrial control equipment manufacturer identification method based on industrial control protocol communication model

Publications (1)

Publication Number Publication Date
CN116668145A true CN116668145A (en) 2023-08-29

Family

ID=87710332

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310691586.XA Pending CN116668145A (en) 2023-06-12 2023-06-12 Industrial control equipment manufacturer identification method based on industrial control protocol communication model

Country Status (1)

Country Link
CN (1) CN116668145A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination