CN112311755A - Industrial control protocol reverse analysis method and device - Google Patents


Info

Publication number
CN112311755A
Authority
CN
China
Prior art keywords
byte
field
message
byte array
protocol
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010532058.6A
Other languages
Chinese (zh)
Inventor
石凌志
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Winicssec Technologies Co Ltd
Original Assignee
Beijing Winicssec Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Winicssec Technologies Co Ltd filed Critical Beijing Winicssec Technologies Co Ltd
Priority to CN202010532058.6A priority Critical patent/CN112311755A/en
Publication of CN112311755A publication Critical patent/CN112311755A/en
Pending legal-status Critical Current

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/03 Protocol definition or specification
    • H04L69/06 Notations for structuring of protocol data, e.g. abstract syntax notation one [ASN.1]

Abstract

The invention provides an industrial control protocol reverse analysis method and device. The method comprises: acquiring a message set to be analyzed; decomposing all messages in the message set to be analyzed into byte arrays, and assembling the decomposed messages into a two-dimensional protocol matrix; performing multi-sequence comparison on the byte arrays in each column of the two-dimensional protocol matrix, in combination with the message time of each message, to determine the field type of the byte array in each column; determining the protocol content corresponding to the byte array in each column based on the field type and the numerical values of the byte array in that column; and combining the protocol contents corresponding to all the byte arrays in sequence to obtain the industrial control protocol corresponding to the messages. After the messages are decomposed, the invention analyzes the protocol content corresponding to the byte array of each column by drawing on the characteristics of industrial control protocols, so that the protocol content can be analyzed even for a brand-new industrial control protocol.

Description

Industrial control protocol reverse analysis method and device
Technical Field
The invention relates to the field of industrial control systems, in particular to an industrial control protocol reverse analysis method and device.
Background
Industrial control systems are widely used in rail transit, power plants, power grids, intelligent manufacturing, the petroleum and petrochemical industry and similar fields. Many of these systems bear on the national economy and people's livelihood and belong to critical infrastructure, so their security receives close attention. Industrial control systems use a wide variety of industrial control protocols, and nearly every manufacturer uses its own private protocol; some of these protocols are published, but many are not. To protect industrial control systems, the industrial control protocols must be reverse analyzed so that the protocol details are understood, and the protocols must then be studied and analyzed from a security standpoint so that targeted security policies can be formulated. The data source for protocol reverse analysis is generally a PCAP capture or a real-time data stream captured inline, so the message data may belong to a brand-new, unknown protocol. A general reverse analysis algorithm, however, compares the message data with known protocols and outputs the protocol corresponding to the message only when a match is found.
Disclosure of Invention
Therefore, the technical problem to be solved by the present invention is to overcome the defect that in the prior art, if a protocol corresponding to a message is a brand new unknown protocol, the protocol cannot be obtained through analysis by a general reverse analysis algorithm, thereby providing an industrial control protocol reverse analysis method and device.
A first aspect of the invention provides an industrial control protocol reverse analysis method, which comprises the following steps: acquiring a message set to be analyzed; decomposing all messages in the message set to be analyzed into byte arrays, and assembling the decomposed messages into a two-dimensional protocol matrix, wherein all the byte arrays obtained by decomposing one message form a row of the two-dimensional protocol matrix, and the byte arrays located at the same position in different messages form a column of the two-dimensional protocol matrix; performing multi-sequence comparison on the byte arrays in each column of the two-dimensional protocol matrix, in combination with the message time of each message, to determine the field type of the byte array in each column, wherein byte arrays of different field types follow different change rules; determining the protocol content corresponding to the byte array in each column based on the field type and the numerical values of the byte array in that column; and combining the protocol contents corresponding to all the byte arrays in sequence to obtain the industrial control protocol corresponding to the messages.
Optionally, after obtaining the set of messages to be analyzed, before decomposing all messages in the set of messages to be analyzed into byte arrays, the method further includes: comparing the messages in the message set to be analyzed with a priori knowledge base to judge whether the messages in the message set to be analyzed conform to a known industrial control protocol or not, wherein the priori knowledge base is a pre-established database containing the known industrial control protocol; when the messages in the message set to be analyzed conform to the known industrial control protocol, outputting the industrial control protocol corresponding to the messages; when the messages in the message set to be analyzed do not accord with the known industrial control protocol, the steps of decomposing all the messages in the message set to be analyzed into byte arrays and synthesizing all the decomposed messages into a two-dimensional protocol matrix are executed.
Optionally, the step of obtaining a set of messages to be analyzed includes: collecting a plurality of initial messages of different services; removing communication protocol message headers of a plurality of initial messages to obtain a plurality of original messages; and clustering the plurality of original messages according to the message direction and the message length of the original messages to obtain a plurality of message sets to be analyzed.
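The clustering step above can be illustrated with a minimal Python sketch. The `RawMessage` type, its field names, the direction labels and the choice to group by exact payload length are all assumptions made for illustration; the source only states that messages are clustered by direction and length after the communication protocol headers are removed.

```python
from collections import defaultdict
from typing import NamedTuple


class RawMessage(NamedTuple):
    """Hypothetical container for one message after header removal."""
    timestamp: float  # capture time in seconds
    direction: str    # assumed label, e.g. "request" or "response"
    payload: bytes    # bytes left after the communication headers are stripped


def cluster_messages(messages):
    """Group messages by (direction, payload length); each group is one set to analyze."""
    clusters = defaultdict(list)
    for msg in messages:
        clusters[(msg.direction, len(msg.payload))].append(msg)
    return dict(clusters)


msgs = [
    RawMessage(0.0, "request", bytes.fromhex("0103000a")),
    RawMessage(0.1, "response", bytes.fromhex("010302001122")),
    RawMessage(1.0, "request", bytes.fromhex("0103000b")),
]
clusters = cluster_messages(msgs)
```

Grouping by exact length is the simplest reading; an implementation could just as well group by length ranges.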
Optionally, the field type includes a gradual change field, and the step of determining the protocol content corresponding to the byte array in each column based on the field type of the byte array in each column and the value of the byte array in each column includes: combining a byte array whose field type is the gradual change field in the two-dimensional protocol matrix with the byte arrays of the N1 columns preceding it to form a first combined byte set, wherein N1 is greater than or equal to 1, and the first combined byte set comprises a plurality of first combined bytes; calculating the difference between the numerical values of any two first combined bytes in the first combined byte set and the difference between the message times corresponding to those two first combined bytes; and if the difference between the numerical values of any two first combined bytes is the same as the difference between the corresponding message times, determining that the byte array whose field type is the gradual change field is a time field representing time.
Optionally, the field type includes a changeable field, and the changeable field indicates that the value of the byte array of the corresponding column changes irregularly; the step of determining the protocol content corresponding to the byte array in each column based on the field type of the byte array in each column and the value of the byte array in each column includes: combining a byte array whose field type is the changeable field in the two-dimensional protocol matrix with the byte arrays of the N2 columns preceding it to form a second combined byte set, wherein N2 is greater than or equal to 1, and the second combined byte set comprises a plurality of second combined bytes; and if the numerical value of any second combined byte in the second combined byte set is the same as the message length of the message corresponding to that second combined byte, determining that the byte array whose field type is the changeable field is a length field representing the message length.
Optionally, the field type includes a changeable field, and the changeable field indicates that the value of the byte array of the corresponding column changes irregularly; the step of determining the protocol content corresponding to the byte array in each column based on the field type of the byte array in each column and the value of the byte array in each column includes: combining a byte array whose field type is the changeable field in the two-dimensional protocol matrix with the byte arrays of the N3 columns preceding it to form a third combined byte set, wherein N3 is greater than or equal to 1, and the third combined byte set comprises a plurality of third combined bytes; calculating the difference between the numerical values of any two third combined bytes in the third combined byte set and the difference between the message lengths of the messages corresponding to those two third combined bytes; and if the difference between the numerical values of any two third combined bytes is the same as the difference between the corresponding message lengths, determining that the byte array whose field type is the changeable field is a length field representing the message length.
Optionally, the field type includes a changeable field, and the changeable field indicates that the value of the byte array of the corresponding column changes irregularly; the step of determining the protocol content corresponding to the byte array in each column based on the field type of the byte array in each column and the value of the byte array in each column includes: combining a byte array whose field type is the changeable field in the two-dimensional protocol matrix with the byte arrays of the N4 columns preceding it to form a fourth combined byte set, wherein N4 is greater than or equal to 1, and the fourth combined byte set comprises a plurality of fourth combined bytes; verifying any fourth combined byte in the fourth combined byte set with a preset verification algorithm to obtain a verification result; and if the verification result is consistent with the expected value of the preset verification algorithm, determining that the byte array whose field type is the changeable field is a check code field representing a check code.
Optionally, the field type includes a changeable field, and the changeable field indicates that the value of the byte array of the corresponding column changes irregularly; the step of determining the protocol content corresponding to the byte array in each column based on the field type of the byte array in each column and the value of the byte array in each column includes: combining a byte array whose field type is the changeable field in the two-dimensional protocol matrix with the byte arrays of the N5 columns preceding it to form a fifth combined byte set, wherein N5 is greater than or equal to 1, and the fifth combined byte set comprises a plurality of fifth combined bytes; calculating the difference between the numerical values of any two fifth combined bytes in the fifth combined byte set and the difference between the message times corresponding to those two fifth combined bytes; and if the difference between the numerical values of any two fifth combined bytes is the same as the difference between the corresponding message times, determining that the byte array whose field type is the changeable field is a time field representing time.
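The two length-field checks described above (value equal to the message length, or value differences equal to length differences) can be sketched together, because requiring equal pairwise differences is the same as requiring one constant offset between value and length. The function name, the big-endian byte order and the sample payloads are assumptions for illustration only.

```python
def looks_like_length_field(values, message_lengths):
    """True if each value equals its message length up to one constant offset.

    An offset of 0 is the exact-match case; a non-zero offset covers length
    fields that exclude a fixed-size header or trailer.
    """
    if not values or len(values) != len(message_lengths):
        return False
    offsets = {length - value for value, length in zip(values, message_lengths)}
    return len(offsets) == 1


# Hypothetical payloads whose first two bytes carry the total length (big-endian).
payloads = [
    bytes.fromhex("0006aabbccdd"),
    bytes.fromhex("0008aabbccddeeff"),
    bytes.fromhex("0005aabbcc"),
]
values = [int.from_bytes(p[0:2], "big") for p in payloads]
lengths = [len(p) for p in payloads]
is_length = looks_like_length_field(values, lengths)
```

The check is only informative when the compared messages have different lengths; on a set of equal-length messages any constant column would pass.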
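As a rough illustration of the check code test, the sketch below assumes that the preset verification algorithm is CRC-16/MODBUS, that the check code covers all bytes preceding it, and that it is stored little-endian. None of these choices is fixed by the source, which only names a preset verification algorithm; the function names are hypothetical.

```python
def crc16_modbus(data: bytes) -> int:
    """CRC-16/MODBUS: reflected polynomial 0xA001, initial value 0xFFFF."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0xA001 if crc & 1 else crc >> 1
    return crc


def looks_like_check_code(payloads, field_start, width=2, byteorder="little"):
    """True if, in every message, the candidate field holds the CRC of the bytes before it."""
    if not payloads:
        return False
    for payload in payloads:
        stored = int.from_bytes(payload[field_start:field_start + width], byteorder)
        if stored != crc16_modbus(payload[:field_start]):
            return False
    return True


# Hypothetical messages: a 6-byte body followed by its CRC, low byte first.
bodies = [bytes.fromhex("010300000002"), bytes.fromhex("010300100001")]
payloads = [body + crc16_modbus(body).to_bytes(2, "little") for body in bodies]
is_check_code = looks_like_check_code(payloads, field_start=6)
```

In practice several candidate algorithms, coverage ranges and byte orders would likely be tried in turn, since the protocol under analysis is unknown.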
Optionally, the step of obtaining a set of messages to be analyzed includes: collecting a plurality of initial messages of different services, wherein the plurality of initial messages comprise request messages and response messages, and the plurality of initial messages are message sets to be analyzed; decomposing all messages in a message set to be analyzed into byte arrays, and synthesizing all the decomposed messages into a two-dimensional protocol matrix, wherein the two-dimensional protocol matrix comprises the following steps: decomposing a plurality of initial messages into byte arrays, synthesizing all the decomposed messages into a two-dimensional protocol matrix, wherein all the byte arrays obtained by decomposing each initial message are a row of the two-dimensional protocol matrix, and the byte arrays of the initial messages at the same position form a column of the two-dimensional protocol matrix.
Optionally, the field type includes an alternate field, and the alternate field indicates that the numerical values of the byte arrays of the corresponding column alternate; the step of determining the protocol content corresponding to the byte array in each column based on the field type of the byte array in each column and the value of the byte array in each column includes: if, in the two-dimensional protocol matrix, the field type of the byte array in a column adjacent to the column containing a byte array whose field type is the alternate field is also the alternate field, and the numerical value of the byte array of the alternate field is the same as the numerical value of the corresponding byte array in the adjacent column, determining that the byte array whose field type is the alternate field is a source field representing the source address, and that the corresponding byte array in the adjacent column is a destination field representing the destination address.
Optionally, the field type includes a changeable field, and the changeable field indicates that the values of the byte arrays in the corresponding column change irregularly; the step of determining the protocol content corresponding to the byte array in each column based on the field type of the byte array in each column and the value of the byte array in each column includes: if, in the column of the two-dimensional protocol matrix containing a byte array whose field type is the changeable field, the numerical value of the byte array of any request message is the same as that of the response message corresponding to the request message, and the numerical values of all the byte arrays in that column are smaller than a preset threshold, determining that the byte array whose field type is the changeable field is a function code field representing a function code.
Optionally, the step of determining the protocol content corresponding to the byte array in each column based on the field type of the byte array in each column and the value of the byte array in each column further includes: determining the byte arrays in the N6 columns following the function code field in the two-dimensional protocol matrix to be the identification data field.
Optionally, the field type includes a gradual change field, and the step of determining the protocol content corresponding to the byte array in each column based on the field type of the byte array in each column and the value of the byte array in each column includes: combining a byte array whose field type is the gradual change field in the two-dimensional protocol matrix with the byte arrays of the N7 columns preceding it to form a seventh combined byte set, wherein N7 is greater than or equal to 1, and the seventh combined byte set comprises a plurality of seventh combined bytes; and if the values of the seventh combined bytes in the seventh combined byte set increase in arithmetic progression, that is, by a constant difference, determining that the byte array whose field type is the gradual change field is a sequence number field representing the communication sequence number.
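One possible reading of the source and destination test is sketched below. It assumes that the "corresponding byte array in the adjacent column" is taken from the paired response message, so that the two address bytes swap places between a request and its response. That interpretation, the function name and the sample bytes are assumptions for illustration.

```python
def looks_like_address_pair(pairs, col):
    """pairs: list of (request_bytes, response_bytes) tuples.

    True if columns col and col + 1 hold two different values that swap
    places between each request and its response.
    """
    if not pairs:
        return False
    return all(
        req[col] == resp[col + 1]
        and req[col + 1] == resp[col]
        and req[col] != req[col + 1]
        for req, resp in pairs
    )


# Hypothetical exchange between station 0x01 and station 0x02.
pairs = [
    (bytes.fromhex("010203ff"), bytes.fromhex("020103ee")),
    (bytes.fromhex("010204aa"), bytes.fromhex("020104bb")),
]
is_address_pair = looks_like_address_pair(pairs, col=0)
```

Which of the two columns is the source and which the destination cannot be decided from the swap alone; the text above simply assigns the source role to the first identified column.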
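The function code test can be sketched as follows. The threshold value 0x80 and the function name are assumptions; the source only requires that request and response carry the same value in the column and that all values stay below a preset threshold.

```python
def looks_like_function_code(pairs, col, threshold=0x80):
    """pairs: list of (request_bytes, response_bytes) tuples.

    True if every request and its response share the same value in column col
    and that value is below the (assumed) threshold.
    """
    if not pairs:
        return False
    return all(req[col] == resp[col] and req[col] < threshold for req, resp in pairs)


# Hypothetical request/response pairs with the function code in column 1.
pairs = [
    (bytes.fromhex("0103000a"), bytes.fromhex("01030211")),
    (bytes.fromhex("0106000b"), bytes.fromhex("0106000b")),
]
is_function_code = looks_like_function_code(pairs, col=1)
```

The sketch does not re-check that the column was classified as changeable; that filter is assumed to have been applied beforehand.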
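A minimal sketch of the sequence number test follows. It treats the combined values of a time-ordered column as a sequence number when consecutive values grow by one constant positive step; the minimum of three samples is an assumption, and counter wrap-around is deliberately ignored.

```python
def looks_like_sequence_number(values):
    """True if consecutive values increase by one constant positive step."""
    if len(values) < 3:
        return False
    steps = {later - earlier for earlier, later in zip(values, values[1:])}
    return len(steps) == 1 and steps.pop() > 0


is_sequence = looks_like_sequence_number([100, 101, 102, 103])
```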
A second aspect of the present invention provides an industrial control protocol reverse analysis device, including: a to-be-analyzed message set acquisition module, used for acquiring a message set to be analyzed; a two-dimensional protocol matrix construction module, used for decomposing all messages in the message set to be analyzed into byte arrays and assembling the decomposed messages into a two-dimensional protocol matrix, wherein all the byte arrays obtained by decomposing one message form a row of the two-dimensional protocol matrix, and the byte arrays located at the same position in different messages form a column of the two-dimensional protocol matrix; a field type determining module, used for performing multi-sequence comparison on the byte arrays in each column of the two-dimensional protocol matrix, in combination with the message time of each message, and determining the field type of the byte array in each column, wherein byte arrays of different field types follow different change rules; a protocol content determining module, used for determining the protocol content corresponding to the byte array in each column based on the field type and the numerical values of the byte array in that column; and an industrial control protocol determining module, used for combining the protocol contents corresponding to all the byte arrays in sequence to obtain the industrial control protocol corresponding to the messages.
A third aspect of the present invention provides a computer device, comprising: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to perform the industrial control protocol reverse analysis method provided by the first aspect of the present invention.
A fourth aspect of the present invention provides a computer-readable storage medium, wherein the computer-readable storage medium stores computer instructions for causing a computer to execute the industrial control protocol reverse analysis method provided by the first aspect of the present invention.
The technical scheme of the invention has the following advantages:
1. In the industrial control protocol reverse analysis method provided by the invention, after the message set to be analyzed is obtained, all messages in the set are decomposed into byte arrays and the decomposed messages are assembled into a two-dimensional protocol matrix. Multi-sequence comparison is performed on the byte arrays in each column of the two-dimensional protocol matrix, in combination with the message time of each message, to determine the field type of the byte array in each column, and the protocol content corresponding to the byte array in each column is determined from that field type and the data of the byte array. Because each field of an industrial control protocol has fixed characteristics, the protocol content corresponding to each column of byte arrays can be analyzed from the field type and the data of the byte arrays in that column. Finally, the protocol contents corresponding to all the byte arrays are combined in sequence to obtain the industrial control protocol corresponding to the messages. Because the method analyzes the protocol content corresponding to the byte array of each column by drawing on the characteristics of industrial control protocols after the messages are decomposed, the protocol content can be analyzed even for a brand-new industrial control protocol.
2. In the industrial control protocol reverse analysis method provided by the invention, when the message set to be analyzed is obtained, a plurality of initial messages are collected, the communication protocol message headers of the initial messages are removed to obtain original messages, and the original messages are clustered according to their message direction and message length to obtain the message sets to be analyzed. Even when messages are sent under the same industrial control protocol, the values of some byte arrays differ between senders or receivers; to keep such differences from interfering with the reverse analysis, the method clusters the messages by message direction. In addition, the method performs multi-sequence comparison on the byte arrays located at the same position in different messages, and large differences in message length within a message set would distort the analysis result; the method therefore also clusters the messages by message length.
3. In the industrial control protocol reverse analysis method provided by the invention, when the message set to be analyzed is obtained, a plurality of initial messages are collected and used directly as the message set to be analyzed, wherein the plurality of initial messages comprise request messages and response messages. Some protocol fields in an industrial control protocol define the interaction format between the sender and the receiver and cannot be obtained by analyzing messages in a single direction, so both directions are kept in the message set.
4. In the industrial control protocol reverse analysis device provided by the invention, after the message set to be analyzed is obtained, all messages in the set are decomposed into byte arrays and the decomposed messages are assembled into a two-dimensional protocol matrix. Multi-sequence comparison is performed on the byte arrays in each column of the two-dimensional protocol matrix, in combination with the message time of each message, to determine the field type of the byte array in each column, and the protocol content corresponding to the byte array in each column is determined from that field type and the data of the byte array. Because each field of an industrial control protocol has fixed characteristics, the protocol content corresponding to each column of byte arrays can be analyzed from the field type and the data of the byte arrays in that column. Finally, the protocol contents corresponding to all the byte arrays are combined in sequence to obtain the industrial control protocol corresponding to the messages. Because the device analyzes the protocol content corresponding to the byte array of each column by drawing on the characteristics of industrial control protocols after the messages are decomposed, the protocol content can be analyzed even for a brand-new industrial control protocol.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
Figs. 1-13 are flowcharts illustrating specific examples of the industrial control protocol reverse analysis method in an embodiment of the present invention;
Fig. 14 is a schematic block diagram of a specific example of an industrial control protocol reverse analysis device in an embodiment of the present invention;
Fig. 15 is a schematic block diagram of a specific example of a computer device provided in an embodiment of the present invention.
Detailed Description
The technical solutions of the present invention will be described clearly and completely with reference to the accompanying drawings, and it should be understood that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the description of the present invention, it should be noted that the terms "first", "second", and "third" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
In addition, the technical features involved in the different embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
Example 1
To protect an industrial control system, the industrial control protocol must be reverse analyzed so that the protocol details are understood, and the protocol must then be studied and analyzed from a security standpoint. A general protocol reverse analysis algorithm compares the message data with known protocols and, if the message data matches a known protocol, outputs the protocol corresponding to the message. However, the data source for protocol reverse analysis is generally a PCAP capture or a real-time data stream captured inline, so the message data may belong to a brand-new, unknown protocol, and a general protocol reverse analysis algorithm cannot reverse analyze a completely new protocol.
The embodiment of the invention provides an industrial control protocol reverse analysis method, as shown in fig. 1, comprising the following steps:
Step S10: acquiring a message set to be analyzed. The message set to be analyzed comprises a plurality of messages, and the messages in the set are generated when a target client executes a service.
Step S20: decomposing all messages in a message set to be analyzed into byte arrays, synthesizing all the decomposed messages into a two-dimensional protocol matrix, wherein all the byte arrays obtained by decomposing each message are a row of the two-dimensional protocol matrix, and the byte arrays of the messages at the same position form a column of the two-dimensional protocol matrix. Because the message is sent in units of bytes, in the embodiment of the present invention, when the message is decomposed, each byte in the message is decomposed into a byte array, and in a specific embodiment, a two-dimensional protocol matrix synthesized by all decomposed messages is shown in table 1:
TABLE 1
[Table 1: two-dimensional protocol matrix; reproduced as an image in the original publication]
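Step S20 can be illustrated with a minimal sketch in which each message becomes one row of byte values and the bytes at the same offset form a column. The function names and sample payloads are hypothetical, and the sketch assumes that all messages in one set have the same length.

```python
def build_protocol_matrix(payloads):
    """Return a list of rows; each row holds the byte values of one message."""
    lengths = {len(p) for p in payloads}
    if len(lengths) != 1:
        raise ValueError("messages in one set are assumed to have equal length")
    return [list(p) for p in payloads]


def column(matrix, index):
    """Return the byte values found at the same position in every message."""
    return [row[index] for row in matrix]


payloads = [bytes.fromhex("aa0001"), bytes.fromhex("aa0002"), bytes.fromhex("aa0003")]
matrix = build_protocol_matrix(payloads)
third_column = column(matrix, 2)
```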
Step S30: and performing multi-sequence comparison on the byte arrays in each column in the two-dimensional protocol matrix by combining the message time of each message, and determining the field types of the byte arrays in each column, wherein the byte arrays corresponding to different field types have different change rules. The field type comprises a fixed field, a gradual change field and a changeable field, wherein the fixed field indicates that the numerical value of the byte array in the corresponding column does not change, the gradual change field indicates that the numerical value of the byte array in the corresponding column increases or decreases along with the sequence of the message, and the changeable field indicates that the numerical value of the byte array in the corresponding column changes irregularly. In a specific embodiment, in order to determine the field type of the byte array in each column according to the multi-sequence comparison result, when the two-dimensional protocol matrix is constructed, the messages may be sorted according to the message time.
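The three field types named in step S30 can be sketched as a simple per-column classifier over a matrix whose rows are already sorted by message time. The label strings and the monotonicity test are assumptions for illustration; the sketch does not handle a counter byte that wraps around.

```python
def classify_column(values):
    """Label one column of a time-ordered matrix as fixed, gradual or changeable."""
    if len(set(values)) == 1:
        return "fixed"
    diffs = [later - earlier for earlier, later in zip(values, values[1:])]
    if all(d >= 0 for d in diffs) or all(d <= 0 for d in diffs):
        return "gradual"
    return "changeable"


def classify_matrix(matrix):
    """Classify every column; zip(*matrix) transposes rows into columns."""
    return [classify_column(list(col)) for col in zip(*matrix)]


matrix = [
    [0xAA, 0x01, 0x37],
    [0xAA, 0x02, 0x05],
    [0xAA, 0x03, 0x90],
]
field_types = classify_matrix(matrix)
```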
Step S40: and determining the protocol content corresponding to the byte array in each column based on the field type of the byte array in each column and the numerical value of the byte array in each column. In determining the protocol content, the determination is performed based on field characteristics of an industrial control protocol, in the industrial control protocol, some protocol fields are composed of a plurality of bytes, therefore, in determining the protocol content, the content corresponding to the byte array in each column is determined based on the field type of the byte array in each column and the numerical value of the combined byte formed by each column and the adjacent preset column, in an optional embodiment, the number of the adjacent preset columns is determined by the field type of the byte array in each column, and in a specific embodiment, common fields and characteristics thereof in the industrial control protocol include:
1) session identifier: unique within a session and different between sessions; the length is generally 2 B or 4 B;
2) timestamp: the length is generally 4 B with precision to the second, or 8 B with a fractional part;
3) sequence number: the length is generally 2 B or 4 B, and the value increases sequentially;
4) function code: the length is generally 1 B;
5) message length: the length is generally 2 B, but may be 4 B;
6) check code: typically computed with the CRC16 algorithm; the length is 2 B;
7) numerical value: service data, which may be an integer or a floating-point number; the length is generally 2 B or 4 B, and the value is generally confined to a value range.
Taking the timestamp as an example, the value of a field representing a timestamp increases with time, and such a field is typically 4 bytes long. Therefore, after a byte array is determined to be a gradual change field, the byte array in that column can be combined with the byte arrays of the 3 preceding columns to form a 4-byte value, and the combined value is used to judge whether the byte array represents a timestamp.
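The timestamp example can be sketched as follows: four adjacent columns are combined into one integer per message, and the differences between consecutive values are compared with the differences between capture times. The big-endian byte order, the one-second tolerance, the function names and the sample data are assumptions for illustration.

```python
def combine_columns(matrix, last_col, width):
    """Combine `width` adjacent columns ending at `last_col` into one big-endian integer per row."""
    start = last_col - width + 1
    return [int.from_bytes(bytes(row[start:last_col + 1]), "big") for row in matrix]


def looks_like_timestamp(values, message_times, tolerance=1):
    """True if value differences track capture-time differences (seconds) for consecutive messages."""
    if len(values) < 2 or len(values) != len(message_times):
        return False
    value_steps = [b - a for a, b in zip(values, values[1:])]
    time_steps = [b - a for a, b in zip(message_times, message_times[1:])]
    return all(abs(v - t) <= tolerance for v, t in zip(value_steps, time_steps))


# Hypothetical messages: a 4-byte second counter followed by one payload byte.
base = 1_600_000_000
message_times = [0.0, 2.0, 5.0]
matrix = [list((base + int(t)).to_bytes(4, "big")) + [0x10] for t in message_times]
values = combine_columns(matrix, last_col=3, width=4)
is_time_field = looks_like_timestamp(values, message_times)
```

The tolerance allows for the gap between the moment a device stamps a message and the moment the capture tool records it.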
In a specific embodiment, common fields and characteristics of the industrial control protocol can be stored in an industrial control protocol characteristic library, the industrial control protocol characteristic library is a part of an industrial control protocol priori knowledge library and is used for storing common industrial control protocol field characteristics, when the industrial control protocol is subjected to reverse analysis, the industrial control protocol field characteristics can be obtained based on the industrial control protocol characteristic library, then the protocol is subjected to reverse analysis based on the industrial control protocol field characteristics, and the industrial control protocol characteristic library is continuously filled and updated in the using process, so that the result of the protocol reverse analysis is more accurate.
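One way to picture such a feature library is as a small lookup table. The following Python sketch transcribes the seven common fields listed above; the names `FieldFeature`, `FEATURE_LIBRARY` and `candidate_widths`, and the wording of the `behaviour` strings, are illustrative assumptions and are not part of the described method.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FieldFeature:
    name: str                 # protocol content, e.g. "timestamp"
    lengths: Tuple[int, ...]  # typical lengths in bytes
    behaviour: str            # how the value behaves across messages


# Entries transcribed from the list of common fields above.
FEATURE_LIBRARY = (
    FieldFeature("session_id", (2, 4), "unique within a session"),
    FieldFeature("timestamp", (4, 8), "increases with time"),
    FieldFeature("sequence_number", (2, 4), "increases sequentially"),
    FieldFeature("function_code", (1,), "small set of values"),
    FieldFeature("message_length", (2, 4), "equals the message length"),
    FieldFeature("check_code", (2,), "typically CRC16"),
    FieldFeature("value", (2, 4), "service data in a limited range"),
)


def candidate_widths(name: str) -> Tuple[int, ...]:
    """Return the byte widths worth trying when testing a column for `name`."""
    for feature in FEATURE_LIBRARY:
        if feature.name == name:
            return feature.lengths
    return ()
```

With a table of this kind, the number of adjacent columns to combine with a candidate column is simply the field width minus one.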
Step S50: combine the protocol contents corresponding to all the byte arrays in sequence to obtain the industrial control protocol corresponding to the message. Different byte arrays in a message represent different protocol contents. After the protocol content of each byte array has been obtained by reverse analysis, the industrial control protocol corresponding to the message is obtained by combining those protocol contents. Note that some protocol contents are made up of several byte arrays.
In the industrial control protocol reverse analysis method provided by the invention, after the message set to be analyzed is obtained, all messages in the set are decomposed into byte arrays and the decomposed messages are assembled into a two-dimensional protocol matrix. Multi-sequence comparison is performed on the byte arrays in each column of the matrix in combination with the message time of each message to determine the field type of the byte array in each column, and the protocol content corresponding to the byte array in each column is then determined from its field type and its data. Because each field of an industrial control protocol has fixed characteristics, the protocol content of each column can be inferred from the field type and data of its byte array. Finally, the protocol contents corresponding to all the byte arrays are combined in sequence to obtain the industrial control protocol corresponding to the message. In short, after decomposing the messages, the method analyzes the protocol content corresponding to the byte array of each column in light of the characteristics of industrial control protocols, and thereby obtains the industrial control protocol corresponding to the message.
In an alternative embodiment, as shown in fig. 2, in the inverse analysis method for industrial control protocol provided in the embodiment of the present invention, after the step S10 and before the step S20, the method further includes:
Step S60: compare the messages in the message set to be analyzed with a prior knowledge base, where the prior knowledge base is a pre-established database containing known industrial control protocols. In addition to the industrial control protocol field feature library described in step S40, the prior knowledge base includes an industrial control protocol knowledge base, which summarizes typical message segments and protocol format definitions of known industrial control protocols and is used to compare against and identify known protocols. Comparing a message with the prior knowledge base means comparing it with the known protocols in the industrial control protocol knowledge base.
Step S61: judge whether the messages in the message set to be analyzed conform to a known industrial control protocol; if they do, execute step S62.
Step S62: output the industrial control protocol corresponding to the message.
If the messages in the message set to be analyzed do not conform to any known industrial control protocol, execute step S20.
In an optional embodiment, if a message in the message set to be analyzed is only partly the same as a known industrial control protocol, the part that differs from the known industrial control protocol is separated out and analyzed by performing steps S20 to S50 above.
In an optional embodiment, after the step of comparing the message in the message set to be analyzed with the prior knowledge base is performed, if the message in the message set to be analyzed has the same part as the known industrial control protocol, before the protocol content corresponding to the same part is output, the content of the same part needs to be verified.
In the industrial control protocol reverse analysis method provided by the invention, after the message set to be analyzed is obtained, the messages in the set are compared with the known industrial control protocols in the prior knowledge base. If the messages are the same as a known industrial control protocol, the corresponding industrial control protocol is output. If the messages are only partly the same as a known industrial control protocol, the protocol content corresponding to the matching part is output, the part that differs from the known protocol is separated out, and that part is further analyzed through steps S20 to S50. By comparing messages with the prior knowledge base, protocol content that matches a known industrial control protocol can be output directly and only the differing part needs further analysis, which speeds up the reverse analysis of the industrial control protocol.
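The comparison in steps S60 to S62 can be sketched in Python as follows. The text does not specify how the knowledge base stores protocol format definitions, so this sketch assumes a hypothetical representation in which each known protocol is a set of fixed byte values at fixed offsets; the protocol names and signatures are invented for illustration only.

```python
from typing import Dict, List, Optional

# Hypothetical knowledge base: protocol name -> {byte offset: expected value}.
KNOWN_PROTOCOLS: Dict[str, Dict[int, int]] = {
    "demo-proto-a": {0: 0x68, 3: 0x68},
    "demo-proto-b": {0: 0xAA, 1: 0x55},
}


def matches(message: bytes, signature: Dict[int, int]) -> bool:
    """True when every signature byte is present at its offset."""
    return all(offset < len(message) and message[offset] == value
               for offset, value in signature.items())


def identify(messages: List[bytes]) -> Optional[str]:
    """Steps S60/S61: return the known protocol that all messages conform to.

    A return value of None means no known protocol matched, in which case
    the analysis would continue with step S20.
    """
    if not messages:
        return None
    for name, signature in KNOWN_PROTOCOLS.items():
        if all(matches(m, signature) for m in messages):
            return name
    return None
```

Partial matches, which the text handles by separating out the differing part, are deliberately left out of this sketch.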
In an alternative embodiment, as shown in fig. 3, the step S10 specifically includes:
Step S11: collect a plurality of initial messages. In the embodiment of the present invention, reverse analysis of the industrial control protocol is based on the result of multi-sequence comparison of the values of the byte arrays at the same position in each message. To avoid interference caused by repeated messages, messages with completely identical contents need to be deleted after the initial messages are collected.
Step S12: remove the communication protocol message headers of the plurality of initial messages to obtain a plurality of original messages.
Step S13: cluster the plurality of original messages according to their message direction and message length to obtain a plurality of message sets to be analyzed. Specifically, the messages sent by the same client are obtained, and among them the messages that have the same receiver and whose lengths fall within the same threshold range are clustered together. In the embodiment of the present invention, reverse analysis is based on the result of multi-sequence comparison of the byte arrays at the same position in each message of the set, and that result may be affected by differences in receiver and in message length. The original messages therefore need to be clustered by direction and length, and only messages of the same class are analyzed together. Messages in a set that are shorter than the others are padded with 0 so that all messages in the same set have the same length.
In the industrial control protocol reverse analysis method provided by the invention, the message set to be analyzed is obtained by collecting a plurality of initial messages, removing their communication protocol message headers to obtain original messages, and clustering the original messages by message direction and message length. Even for messages sent under the same industrial control protocol, the values of some byte arrays change with the sender or receiver, so clustering by message direction avoids the interference that different senders or receivers would cause. In addition, because the method performs multi-sequence comparison on the byte arrays at the same position in each message, large differences in message length within a set would interfere with the result, so the messages are also clustered by message length.
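A minimal Python sketch of this preprocessing is given below. It assumes that header removal (step S12) has already produced a payload per packet, that "direction" is the (sender, receiver) pair, and that the length threshold ranges are fixed-width buckets; the `Packet` type and the `bucket` parameter are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple


class Packet(NamedTuple):
    src: str
    dst: str
    payload: bytes  # communication protocol header already removed


def build_message_sets(packets: List[Packet], bucket: int = 16
                       ) -> Dict[Tuple[str, str, int], List[bytes]]:
    """Deduplicate, cluster by direction and length bucket, then zero-pad."""
    seen = set()
    clusters = defaultdict(list)
    for packet in packets:
        if packet in seen:  # drop messages with identical content
            continue
        seen.add(packet)
        key = (packet.src, packet.dst, len(packet.payload) // bucket)
        clusters[key].append(packet.payload)
    padded = {}
    for key, payloads in clusters.items():
        width = max(len(p) for p in payloads)
        # Pad shorter messages with 0 so every row has the same length.
        padded[key] = [p.ljust(width, b"\x00") for p in payloads]
    return padded
```

Each value of the returned dictionary is one message set to be analyzed, ready to be treated as the rows of a two-dimensional protocol matrix.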
In an optional embodiment, the field type includes a gradient field, and the gradient field indicates that the numerical value of the byte array in the corresponding column increases or decreases with the message sequence, as shown in fig. 4, the step S40 specifically includes:
Step S411: combine the byte array whose field type is the gradient field in the two-dimensional protocol matrix with the byte arrays of the N1 columns preceding it to form a first combined byte set, where N1 is greater than or equal to 1 and the first combined byte set comprises a plurality of first combined bytes. In an industrial control protocol, the value of the field representing a timestamp increases with the message sequence and the timestamp is 4B long. To verify whether a byte array whose field type is the gradient field represents a timestamp, the byte array is therefore combined with the byte arrays of the 3 preceding columns and the combined array is converted into a 32-bit integer, that is, a time accurate to the second. Note that the value of such a timestamp is the number of seconds elapsed since 00:00 on January 1, 1900 or since 00:00 on January 1, 1970.
Step S412: calculate the difference between the values of any two first combined bytes in the first combined byte set and the difference between the message times corresponding to those two first combined bytes.
Step S413: judge whether the difference between the values of any two first combined bytes is the same as the difference between the corresponding message times; if so, execute step S414.
Step S414: determine that the byte array whose field type is the gradient field is a time field representing time. Because that byte array was analyzed in combination with the byte arrays of the N1 columns preceding it, the byte array together with those N1 preceding columns can be determined to be the time field. Since the time field is 4 bytes long, the byte array and the byte arrays of the 3 preceding columns are determined to be the time field.
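Steps S411 to S414 can be sketched in Python as follows. The sketch assumes big-endian byte order and message times given as whole seconds, neither of which is stated in the text; `combine` and `is_time_field` are hypothetical helper names.

```python
from typing import List, Sequence


def combine(matrix: Sequence[bytes], col: int, n_prev: int) -> List[int]:
    """Combine column `col` with the `n_prev` columns before it (big-endian)."""
    start = col - n_prev
    return [int.from_bytes(row[start:col + 1], "big") for row in matrix]


def is_time_field(matrix: Sequence[bytes], times: Sequence[int], col: int,
                  n_prev: int = 3) -> bool:
    """Value differences must equal message-time differences (S412, S413)."""
    if col - n_prev < 0 or len(matrix) < 2:
        return False
    values = combine(matrix, col, n_prev)
    # If every message agrees with the first one, all pairwise differences
    # agree as well, so comparing against the first message is sufficient.
    return all(v - values[0] == t - times[0] for v, t in zip(values, times))
```

Because only differences are compared, the check does not depend on whether the device counts seconds from 1900 or from 1970, nor on a constant offset between the device clock and the capture clock.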
In an alternative embodiment, the field type includes a changeable field, and the changeable field indicates that the value of the byte array of the corresponding column changes irregularly, as shown in fig. 5, the step S40 specifically includes:
Step S421: combine the byte array whose field type is the changeable field in the two-dimensional protocol matrix with the byte arrays of the N2 columns preceding it to form a second combined byte set, where N2 is greater than or equal to 1 and the second combined byte set comprises a plurality of second combined bytes. In an industrial control protocol, the value of the field representing the message length changes irregularly and the field is generally 2 bytes or 4 bytes long. To verify whether a byte array whose field type is the changeable field represents the message length, the byte array is therefore combined with the byte arrays of the preceding 1 column or the preceding 3 columns.
Step S422: judge whether the value of any second combined byte in the second combined byte set is the same as the message length of the message to which that second combined byte belongs; if so, execute step S423.
Step S423: determine that the byte array whose field type is the changeable field is a length field representing the message length. Because that byte array was analyzed in combination with the byte arrays of the N2 columns preceding it, the byte array together with those N2 preceding columns can be determined to be the length field. If the second combined bytes were obtained by combining the byte array with the preceding 1 column, the byte array and that 1 column are the length field; if they were obtained by combining it with the preceding 3 columns, the byte array and those 3 columns are the length field.
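A minimal Python sketch of steps S421 to S423 follows. It assumes big-endian byte order and that the original (unpadded) length of each message is available alongside the matrix; `is_length_field` is a hypothetical name.

```python
from typing import Sequence


def is_length_field(matrix: Sequence[bytes], lengths: Sequence[int],
                    col: int, n_prev: int = 1) -> bool:
    """True when the combined value equals the message length in every row.

    `lengths` holds the original length of each message before zero padding.
    Use n_prev=1 for a 2-byte field and n_prev=3 for a 4-byte field.
    """
    start = col - n_prev
    if start < 0 or not matrix or any(len(row) <= col for row in matrix):
        return False
    return all(int.from_bytes(row[start:col + 1], "big") == length
               for row, length in zip(matrix, lengths))
```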
In an alternative embodiment, the field type includes a changeable field, and the changeable field indicates that the value of the byte array of the corresponding column changes irregularly, as shown in fig. 6, the step S40 specifically includes:
Step S431: combine the byte array whose field type is the changeable field in the two-dimensional protocol matrix with the byte arrays of the N3 columns preceding it to form a third combined byte set, where N3 is greater than or equal to 1 and the third combined byte set comprises a plurality of third combined bytes. In an industrial control protocol, the value of the field representing the message length changes irregularly and the field is generally 2 bytes or 4 bytes long. To verify whether a byte array whose field type is the changeable field represents the message length, the byte array is therefore combined with the byte arrays of the preceding 1 column or the preceding 3 columns.
Step S432: calculate the difference between the values of any two third combined bytes in the third combined byte set and the difference between the message lengths of the messages to which those two third combined bytes belong.
Step S433: judge whether the difference between the values of any two third combined bytes is the same as the difference between the corresponding message lengths; if so, execute step S434.
Step S434: determine that the byte array whose field type is the changeable field is a length field representing the message length. For details, see the description of step S423 above.
The determination of the length field may be performed by one of the above steps S421 to S423 or steps S431 to S434, or may be performed by one of the above methods and then verified by the other method.
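The difference-based variant (steps S431 to S434) can be sketched in Python as below, again assuming big-endian byte order and a hypothetical function name. One plausible reason to use it, offered here as an interpretation rather than a statement from the text, is that it still succeeds when the length field counts only part of the message, for example the payload without a fixed-size header.

```python
from typing import Sequence


def is_length_field_by_delta(matrix: Sequence[bytes], lengths: Sequence[int],
                             col: int, n_prev: int = 1) -> bool:
    """True when value differences equal message-length differences."""
    start = col - n_prev
    if start < 0 or len(matrix) < 2 or any(len(r) <= col for r in matrix):
        return False
    values = [int.from_bytes(row[start:col + 1], "big") for row in matrix]
    # Agreement with the first message implies agreement for every pair.
    return all(v - values[0] == n - lengths[0]
               for v, n in zip(values, lengths))
```

A field that passes the equality check of steps S421 to S423 also passes this check, which is why one method can be used to verify the other.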
In an alternative embodiment, the field type includes a changeable field, and the changeable field indicates that the value of the byte array of the corresponding column changes irregularly, as shown in fig. 7, the step S40 specifically includes:
Step S441: combine the byte array whose field type is the changeable field in the two-dimensional protocol matrix with the byte arrays of the N4 columns preceding it to form a fourth combined byte set, where N4 is greater than or equal to 1 and the fourth combined byte set comprises a plurality of fourth combined bytes. In an industrial control protocol, the value of the field representing the check code is changeable, so if the field type of a byte array is the changeable field, the characteristics of check code fields can be used to judge whether the byte array represents a check code.
Combining the byte array whose field type is the changeable field in the two-dimensional protocol matrix with the byte arrays of the N4 columns preceding it comprises: combining them according to the algorithm rule of a preset check algorithm. In a specific embodiment, different check algorithms produce check codes of different lengths: for example, the check code of the CRC8 algorithm is 1 byte long, that of the CRC16 algorithm is 2 bytes long, and that of the CRC32 algorithm is 4 bytes long. The byte array whose field type is the changeable field therefore needs to be combined with a different number of preceding columns for different check algorithms.
Step S442: verify any fourth combined byte in the fourth combined byte set with the preset check algorithm to obtain a verification result.
Step S443: judge whether the verification result is consistent with the expected value of the preset check algorithm; if so, execute step S444.
Step S444: determine that the byte array whose field type is the changeable field is a check code field representing the check code. Because that byte array was analyzed in combination with the byte arrays of the N4 columns preceding it, the byte array together with those N4 preceding columns can be determined to be the check code field.
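As an illustration of steps S441 to S444, the Python sketch below uses CRC-16/MODBUS as the preset check algorithm. The choice of this CRC variant, the assumption that the check code covers all bytes before the field, and the low-byte-first storage order are assumptions borrowed from Modbus RTU framing and are not stated in the text.

```python
from typing import Sequence


def crc16_modbus(data: bytes) -> int:
    """CRC-16/MODBUS: initial value 0xFFFF, reflected polynomial 0xA001."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0xA001
            else:
                crc >>= 1
    return crc


def is_check_code_field(messages: Sequence[bytes], col: int,
                        n_prev: int = 1) -> bool:
    """True when columns col-n_prev..col hold the CRC of the preceding bytes."""
    start = col - n_prev
    if start <= 0 or not messages:
        return False
    return all(
        len(m) > col
        and int.from_bytes(m[start:col + 1], "little") == crc16_modbus(m[:start])
        for m in messages
    )
```

Testing for a different check algorithm, such as CRC8 or CRC32, would mean swapping the CRC function and changing `n_prev` to match the check code length.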
In an alternative embodiment, the field type includes a changeable field, and the changeable field indicates that the value of the byte array of the corresponding column changes irregularly, as shown in fig. 8, the step S40 specifically includes:
Step S451: combine the byte array whose field type is the changeable field in the two-dimensional protocol matrix with the byte arrays of the N5 columns preceding it to form a fifth combined byte set, where N5 is greater than or equal to 1 and the fifth combined byte set comprises a plurality of fifth combined bytes. For details, see the description of step S411 above.
Step S452: calculate the difference between the values of any two fifth combined bytes in the fifth combined byte set and the difference between the message times corresponding to those two fifth combined bytes. For details, see the description of step S412 above.
Step S453: judge whether the difference between the values of any two fifth combined bytes is the same as the difference between the corresponding message times; if so, execute step S454.
Step S454: determine that the byte array whose field type is the changeable field is a time field representing time. For details, see the description of step S414 above.
In an alternative embodiment, as shown in fig. 9, the step S10 specifically includes:
Step S14: collect a plurality of initial messages of different services, where the plurality of initial messages include request messages and response messages and constitute the message set to be analyzed. In this case, decomposing all messages in the message set to be analyzed into byte arrays and assembling the decomposed messages into a two-dimensional protocol matrix comprises: decomposing the plurality of initial messages into byte arrays and assembling all the decomposed messages into a two-dimensional protocol matrix, where all the byte arrays obtained by decomposing each initial message form a row of the matrix and the byte arrays at the same position in the initial messages form a column of the matrix.
In the industrial control protocol reverse analysis method provided by the invention, the message set to be analyzed is obtained by collecting a plurality of initial messages, which together constitute the set. The initial messages include both request messages and response messages because some protocol fields in an industrial control protocol specify the interaction format between sender and receiver and cannot be recovered by analyzing messages in a single direction.
In an alternative embodiment, the field type includes an alternate field, and the alternate field indicates that the values of the byte arrays of the corresponding column are changed alternately, as shown in fig. 10, the step S40 specifically includes:
Step S461: judge whether the field type of the byte array in the column adjacent to the column containing the byte array whose field type is the alternate field is also the alternate field, and whether the value of the byte array of the alternate field is the same as the value of the corresponding byte array in the adjacent column; if so, execute step S462.
Step S462: determine that the byte array whose field type is the alternate field is a source field representing a source address and that the corresponding byte array in the adjacent column is a destination field representing a destination address. In a specific embodiment, the industrial control protocol is interactive, that is, communication between client and server proceeds as question and answer. When an alternate field appears, it can therefore be checked whether the field is a source field representing a source address or a destination field representing a destination address.
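The check in steps S461 and S462 can be sketched in Python under one particular reading of the text: the rows of the matrix alternate between request and response, and "the same value in the adjacent column" is taken to mean that the value in one column of a message reappears in the adjacent column of the next message (source and destination swap). Both the reading and the helper names are assumptions.

```python
from typing import Optional, Sequence, Tuple


def alternates(values: Sequence[int]) -> bool:
    """True when exactly two values occur and they switch on every message."""
    return (len(values) >= 2 and len(set(values)) == 2
            and all(a != b for a, b in zip(values, values[1:])))


def find_address_pair(matrix: Sequence[bytes], col: int
                      ) -> Optional[Tuple[int, int]]:
    """Return (source_col, destination_col) if col and col+1 swap values."""
    nxt = col + 1
    if not matrix or any(len(row) <= nxt for row in matrix):
        return None
    left = [row[col] for row in matrix]
    right = [row[nxt] for row in matrix]
    if not (alternates(left) and alternates(right)):
        return None
    # The source of one message is the destination of the next, and vice versa.
    swapped = all(left[i] == right[i + 1] and right[i] == left[i + 1]
                  for i in range(len(matrix) - 1))
    return (col, nxt) if swapped else None
```

The swap test is symmetric, so labelling the first column as the source follows the wording of step S462 rather than anything the data itself proves.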
In an alternative embodiment, the field type includes a changeable field, and the changeable field indicates that the value of the byte array of the corresponding column changes irregularly, as shown in fig. 11, the step S40 specifically includes:
Step S471: judge whether, in the column containing the byte array whose field type is the changeable field, the value of the byte array of any request message is the same as that of the response message corresponding to the request message, and whether the values of all byte arrays in that column are smaller than a preset threshold; if so, execute step S472.
Step S472: determine that the byte array whose field type is the changeable field is a function code field representing a function code. The function code field has the same value in a request message and in its corresponding response message. Moreover, because a message has only a small number of functions, the function code takes only a small number of values and does not change completely at random; the number of function code values in captured messages does not exceed 50. The byte array can therefore be determined to be the function code field.
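A minimal Python sketch of steps S471 and S472 follows. It assumes that requests have already been paired with their responses, and it reads the preset threshold as a cap on the number of distinct values in the column, following the remark that captured function codes take no more than 50 values; this reading and the name `is_function_code_field` are assumptions.

```python
from typing import Sequence, Tuple


def is_function_code_field(pairs: Sequence[Tuple[bytes, bytes]], col: int,
                           max_distinct: int = 50) -> bool:
    """True when responses echo the request byte and few values occur.

    `pairs` is a sequence of (request, response) messages.
    """
    if not pairs or any(len(q) <= col or len(r) <= col for q, r in pairs):
        return False
    echoed = all(q[col] == r[col] for q, r in pairs)
    distinct = {q[col] for q, _ in pairs}
    return echoed and len(distinct) <= max_distinct
```

A constant column would also pass this test, so it is meant to be applied only to columns already classified as changeable fields, as in the text.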
In an alternative embodiment, as shown in fig. 12, after step S472, in step S40, the method further includes:
Step S473: determine that the byte arrays in the N6 columns following the function code field in the two-dimensional protocol matrix are the identification data field. In one embodiment, the identification data field is generally 2B long, so the byte arrays in the 2 columns following the function code can be determined to be the identification data field.
In an optional embodiment, the field type includes a gradient field, and the gradient field indicates that the numerical value of the byte array in the corresponding column increases or decreases with the message sequence, as shown in fig. 13, the step S40 specifically includes:
Step S481: combine the byte array whose field type is the gradient field in the two-dimensional protocol matrix with the byte arrays of the N7 columns preceding it to form a seventh combined byte set, where N7 is greater than or equal to 1 and the seventh combined byte set comprises a plurality of seventh combined bytes. In an industrial control protocol, the value of the field representing the sequence number increases sequentially and the sequence number is generally 2 bytes or 4 bytes long. To verify whether a byte array whose field type is the gradient field represents a sequence number, the byte array is therefore combined with the byte arrays of the preceding 1 column or the preceding 3 columns.
Step S482: judge whether the values of the seventh combined bytes in the seventh combined byte set form an arithmetic progression; if so, execute step S483.
Step S483: determine that the byte array whose field type is the gradient field is a sequence number field representing a communication sequence number. Because that byte array was analyzed in combination with the byte arrays of the N7 columns preceding it, the byte array together with those N7 preceding columns can be determined to be the sequence number field. If the seventh combined bytes were obtained by combining the byte array with the preceding 1 column, the byte array and that 1 column are the sequence number field; if they were obtained by combining it with the preceding 3 columns, the byte array and those 3 columns are the sequence number field.
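Steps S481 to S483 can be sketched in Python as follows, assuming big-endian byte order and rows in message order. The sketch ignores counter wrap-around, which a practical implementation would probably need to handle; `is_sequence_field` is a hypothetical name.

```python
from typing import Sequence


def is_sequence_field(matrix: Sequence[bytes], col: int,
                      n_prev: int = 1) -> bool:
    """True when the combined values form an arithmetic progression."""
    start = col - n_prev
    if start < 0 or len(matrix) < 3 or any(len(r) <= col for r in matrix):
        return False
    values = [int.from_bytes(row[start:col + 1], "big") for row in matrix]
    step = values[1] - values[0]
    # A zero step would be a constant field, not a sequence number.
    return step != 0 and all(b - a == step
                             for a, b in zip(values, values[1:]))
```

Combining adjacent columns matters here: a 2-byte counter passing from 255 to 256 looks irregular in either single column but is an arithmetic progression once the two bytes are read together.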
In an alternative embodiment, after the step S50 is executed, the following steps are also executed:
The obtained industrial control protocol is verified against all the collected initial messages of different services. If the proportion of initial messages that match the obtained industrial control protocol is greater than a threshold, the obtained industrial control protocol is judged to be the industrial control protocol of the initial messages. The threshold can be adjusted according to actual requirements; if high accuracy is required of the reverse analysis result, a larger threshold can be set.
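This final verification can be sketched in Python as a simple ratio test. How a message is judged to match the inferred protocol is not specified in the text, so the sketch takes that decision as a caller-supplied predicate; `accept_protocol`, `conforms` and the default threshold of 0.9 are assumptions.

```python
from typing import Callable, Iterable


def accept_protocol(messages: Iterable[bytes],
                    conforms: Callable[[bytes], bool],
                    threshold: float = 0.9) -> bool:
    """Accept the inferred protocol if enough initial messages conform to it."""
    collected = list(messages)
    if not collected:
        return False
    ratio = sum(1 for m in collected if conforms(m)) / len(collected)
    return ratio > threshold
```

Raising `threshold` corresponds to the stricter setting the text recommends when a more reliable reverse analysis result is required.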
Example 2
An embodiment of the present invention provides an industrial control protocol reverse analysis device, as shown in fig. 14, including:
The to-be-analyzed message set obtaining module 10 is configured to obtain a message set to be analyzed. For details, see the description of step S10 in Embodiment 1 above.
The two-dimensional protocol matrix building module 20 is configured to decompose all messages in the message set to be analyzed into byte arrays and assemble all the decomposed messages into a two-dimensional protocol matrix, where all the byte arrays obtained by decomposing each message form a row of the matrix and the byte arrays at the same position in all messages form a column of the matrix. For details, see the description of step S20 in Embodiment 1 above.
The field type determining module 30 is configured to perform multi-sequence comparison on the byte array in each column of the two-dimensional protocol matrix in combination with the message time of each message and to determine the field type of the byte array in each column, where byte arrays of different field types change according to different rules. For details, see the description of step S30 in Embodiment 1 above.
The protocol content determining module 40 is configured to determine the protocol content corresponding to the byte array in each column based on the field type of the byte array in each column and the value of the byte array in each column. For details, see the description of step S40 in Embodiment 1 above.
The industrial control protocol determining module 50 is configured to combine the protocol contents corresponding to all the byte arrays in sequence to obtain the industrial control protocol corresponding to the message. For details, see the description of step S50 in Embodiment 1 above.
After obtaining the message set to be analyzed, the industrial control protocol reverse analysis device provided by the invention decomposes all messages in the set into byte arrays and assembles the decomposed messages into a two-dimensional protocol matrix. It performs multi-sequence comparison on the byte arrays in each column of the matrix in combination with the message time of each message to determine the field type of the byte array in each column, and then determines the protocol content corresponding to the byte array in each column from its field type and its data. Because each field of an industrial control protocol has fixed characteristics, the protocol content of each column can be inferred from the field type and data of its byte array. Finally, the protocol contents corresponding to all the byte arrays are combined in sequence to obtain the industrial control protocol corresponding to the message. In short, after decomposing the messages, the device analyzes the protocol content corresponding to the byte array of each column in light of the characteristics of industrial control protocols, and thereby obtains the industrial control protocol corresponding to the message.
Example 3
An embodiment of the present invention provides a computer device, as shown in fig. 15, the computer device mainly includes one or more processors 61 and a memory 62, and one processor 61 is taken as an example in fig. 15.
The computer device may further include: an input device 63 and an output device 64.
The processor 61, the memory 62, the input device 63 and the output device 64 may be connected by a bus or other means, and the bus connection is exemplified in fig. 15.
The processor 61 may be a Central Processing Unit (CPU). The processor 61 may also be another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or a combination thereof. A general-purpose processor may be a microprocessor or any conventional processor. The memory 62 may include a program storage area and a data storage area, where the program storage area may store an operating system and an application program required for at least one function, and the data storage area may store data created through use of the industrial control protocol reverse analysis device. Further, the memory 62 may include high-speed random access memory and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some embodiments, the memory 62 optionally includes memory located remotely from the processor 61, and such remote memory may be connected to the industrial control protocol reverse analysis device via a network. The input device 63 may receive a calculation request (or other numeric or character information) entered by a user and generate key signal inputs related to the industrial control protocol reverse analysis device. The output device 64 may include a display device such as a display screen for outputting the calculation result.
Example 4
An embodiment of the present invention provides a computer-readable storage medium storing computer-executable instructions, where the computer-executable instructions can execute the industrial control protocol reverse analysis method in any of the method embodiments described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a flash memory, a Hard Disk Drive (HDD), a Solid State Drive (SSD), or the like; the storage medium may also comprise a combination of the above kinds of memory.
It should be understood that the above examples are given only for clarity of illustration and are not intended to limit the embodiments. Other variations and modifications in different forms will be apparent to persons skilled in the art in light of the above description. It is neither necessary nor possible to enumerate all embodiments here, and obvious variations or modifications derived therefrom remain within the scope of protection of the invention.

Claims (16)

1. An industrial control protocol reverse analysis method, characterized by comprising the following steps:
acquiring a message set to be analyzed;
decomposing all messages in the message set to be analyzed into byte arrays and combining all the decomposed messages into a two-dimensional protocol matrix, wherein all the byte arrays obtained by decomposing each message form one row of the two-dimensional protocol matrix, and the byte arrays at the same position in the messages form one column of the two-dimensional protocol matrix;
performing multiple sequence comparison on the byte arrays in each column of the two-dimensional protocol matrix in combination with the message time of each message, and determining the field type of the byte arrays in each column, wherein byte arrays corresponding to different field types have different change rules;
determining the protocol content corresponding to the byte array in each column based on the field type of the byte array in each column and the numerical value of the byte array in each column;
and combining the protocol contents corresponding to all the byte arrays in sequence to obtain the industrial control protocol corresponding to the message.
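As a non-limiting illustration, the matrix construction and column classification of claim 1 might be sketched as follows. The function names, the `"constant"` label, and the concrete classification rules are assumptions made for this sketch and are not taken from the claims; the sketch also assumes that all messages in a set have the same length and that rows are ordered by message time.

```python
from typing import List, Sequence, Tuple


def build_protocol_matrix(records: Sequence[Tuple[float, bytes]]) -> List[List[int]]:
    """Build a two-dimensional protocol matrix from (message_time, payload) pairs.

    Each row is one message (ordered by message time); column j holds the byte
    found at offset j of every message.
    """
    ordered = sorted(records, key=lambda r: r[0])
    if len({len(payload) for _, payload in ordered}) > 1:
        raise ValueError("this sketch assumes equal-length messages")
    return [list(payload) for _, payload in ordered]


def classify_column(values: Sequence[int]) -> str:
    """Assign a field type to one column using simple, hypothetical rules."""
    steps = list(zip(values, values[1:]))
    if len(set(values)) == 1:
        return "constant"  # label added for completeness, not named in the claims
    if len(set(values)) == 2 and all(a != b for a, b in steps):
        return "alternate"  # two values that flip on every message
    if all(b >= a for a, b in steps):
        return "gradient"  # never decreases over message time
    return "changeable"  # no regular pattern detected


def classify_matrix(matrix: List[List[int]]) -> List[str]:
    """Classify every column of the matrix."""
    if not matrix:
        return []
    return [classify_column([row[j] for row in matrix]) for j in range(len(matrix[0]))]
```

Later checks (time, length, check code, and so on) would then be applied only to columns of the relevant field type.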
2. The industrial control protocol reverse analysis method according to claim 1, wherein after obtaining the message set to be analyzed and before decomposing all messages in the message set to be analyzed into byte arrays, the method further comprises:
comparing the messages in the message set to be analyzed with a prior knowledge base to judge whether the messages in the message set to be analyzed conform to a known industrial control protocol, wherein the prior knowledge base is a pre-established database containing known industrial control protocols;
when the messages in the message set to be analyzed conform to a known industrial control protocol, outputting the industrial control protocol corresponding to the messages;
when the messages in the message set to be analyzed do not conform to any known industrial control protocol, executing the step of decomposing all the messages in the message set to be analyzed into byte arrays and combining all the decomposed messages into a two-dimensional protocol matrix.
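The prior knowledge base check of claim 2 might be sketched as follows. The claims do not say how the knowledge base is organized; here it is assumed, purely for illustration, to be a dictionary mapping a protocol name to a matcher function, with Modbus/TCP (whose MBAP header carries a protocol identifier of 0 and a length field counting the bytes that follow it) as the only entry.

```python
from typing import Callable, Dict, List, Optional


def looks_like_modbus_tcp(msg: bytes) -> bool:
    """Check the MBAP header: transaction id (2), protocol id (2), length (2), unit id (1)."""
    if len(msg) < 8:  # 7-byte header plus at least a function code
        return False
    protocol_id = int.from_bytes(msg[2:4], "big")
    length = int.from_bytes(msg[4:6], "big")
    return protocol_id == 0 and length == len(msg) - 6


# Hypothetical knowledge base: protocol name -> matcher
KNOWN_PROTOCOLS: Dict[str, Callable[[bytes], bool]] = {
    "modbus_tcp": looks_like_modbus_tcp,
}


def match_known_protocol(messages: List[bytes]) -> Optional[str]:
    """Return the name of a known protocol that every message conforms to, else None."""
    for name, matcher in KNOWN_PROTOCOLS.items():
        if messages and all(matcher(m) for m in messages):
            return name
    return None
```

When `match_known_protocol` returns `None`, the method would fall through to the matrix construction step.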
3. The industrial control protocol reverse analysis method according to claim 1, wherein the step of obtaining the message set to be analyzed includes:
collecting a plurality of initial messages of different services;
removing the communication protocol message headers of the plurality of initial messages to obtain a plurality of original messages;
and clustering the plurality of original messages according to the message direction and the message length of the original message to obtain a plurality of message sets to be analyzed.
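The preprocessing of claim 3 might be sketched as follows. The `CapturedMessage` type, the direction labels, and the fixed `header_len` parameter are assumptions of this sketch; a real capture would normally need a proper parser to locate the end of the communication protocol header.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Sequence, Tuple


class CapturedMessage(NamedTuple):
    direction: str  # e.g. "request" or "response" (hypothetical labels)
    time: float     # capture timestamp
    frame: bytes    # bytes as captured, communication protocol header included


def strip_header(frame: bytes, header_len: int) -> bytes:
    """Drop a fixed-size communication protocol header (simplifying assumption)."""
    return frame[header_len:]


def cluster_messages(
    captured: Sequence[CapturedMessage], header_len: int
) -> Dict[Tuple[str, int], List[bytes]]:
    """Group original messages by (direction, length); each group is one set to analyze."""
    clusters: Dict[Tuple[str, int], List[bytes]] = defaultdict(list)
    for msg in captured:
        payload = strip_header(msg.frame, header_len)
        clusters[(msg.direction, len(payload))].append(payload)
    return dict(clusters)
```

Because every cluster contains messages of a single length and direction, each one can be turned into a rectangular protocol matrix.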
4. The industrial control protocol reverse analysis method according to claim 3, wherein the field types include a gradient field;
the step of determining the protocol content corresponding to the byte array in each column based on the field type of the byte array in each column and the value of the byte array in each column includes:
combining a byte array whose field type is a gradient field in the two-dimensional protocol matrix with the byte arrays of the N1 columns preceding it to form a first combined byte set, wherein N1 is greater than or equal to 1, and the first combined byte set comprises a plurality of first combined bytes;
calculating the difference between the numerical values of any two first combined bytes in the first combined byte set and the difference between the message times corresponding to the two first combined bytes;
and if the difference between the numerical values of any two first combined bytes is the same as the difference between the corresponding message times, determining that the byte array whose field type is a gradient field is a time field representing time.
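The time field test of claim 4 might be sketched as follows. Big-endian byte order and message times expressed as integers in the same unit as the field are assumptions of this sketch; the function names are hypothetical. Comparing every row against the first row is sufficient, because if all differences relative to the first row agree, the differences between any two rows agree as well.

```python
from typing import List, Sequence


def combine_columns(
    matrix: Sequence[Sequence[int]], j: int, n_prev: int, byteorder: str = "big"
) -> List[int]:
    """Combine column j with the n_prev columns before it into one integer per row."""
    if n_prev < 0 or j - n_prev < 0:
        raise ValueError("not enough preceding columns")
    return [int.from_bytes(bytes(row[j - n_prev : j + 1]), byteorder) for row in matrix]


def is_time_field(
    matrix: Sequence[Sequence[int]], times: Sequence[int], j: int, n_prev: int
) -> bool:
    """True if value differences equal message time differences for all rows."""
    values = combine_columns(matrix, j, n_prev)
    if len(values) < 2 or len(values) != len(times):
        return False
    return all(v - values[0] == t - times[0] for v, t in zip(values, times))
```

For example, with message times 1000, 1005 and 1300 and a two-byte field holding 0x03E8, 0x03ED and 0x0514, the differences 5 and 300 match in both sequences, so the field would be treated as a time field.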
5. The industrial control protocol reverse analysis method according to claim 3, wherein the field types include a changeable field, and the changeable field indicates that the values of the byte arrays in the corresponding column change irregularly;
the step of determining the protocol content corresponding to the byte array in each column based on the field type of the byte array in each column and the value of the byte array in each column includes:
combining a byte array whose field type is a changeable field in the two-dimensional protocol matrix with the byte arrays of the N2 columns preceding it to form a second combined byte set, wherein N2 is greater than or equal to 1, and the second combined byte set comprises a plurality of second combined bytes;
and if the numerical value of any second combined byte in the second combined byte set is the same as the message length of the message corresponding to that second combined byte, determining that the byte array whose field type is a changeable field is a length field representing the message length.
6. The industrial control protocol reverse analysis method according to claim 3, wherein the field types include a changeable field, and the changeable field indicates that the values of the byte arrays in the corresponding column change irregularly;
the step of determining the protocol content corresponding to the byte array in each column based on the field type of the byte array in each column and the value of the byte array in each column includes:
combining a byte array whose field type is a changeable field in the two-dimensional protocol matrix with the byte arrays of the N3 columns preceding it to form a third combined byte set, wherein N3 is greater than or equal to 1, and the third combined byte set comprises a plurality of third combined bytes;
calculating the difference between the numerical values of any two third combined bytes in the third combined byte set and the difference between the message lengths of the messages corresponding to the two third combined bytes;
and if the difference between the numerical values of any two third combined bytes is the same as the difference between the corresponding message lengths, determining that the byte array whose field type is a changeable field is a length field representing the message length.
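The two length field tests (direct equality as in claim 5, equal differences as in claim 6) might be sketched as follows. The sketch works on raw messages rather than on a rectangular matrix so that lengths can vary, assumes big-endian fields, and also accepts `n_prev = 0` for single-byte fields, which is broader than the claim wording. The guard requiring at least two distinct message lengths is an addition of this sketch, intended to keep a constant column from matching trivially.

```python
from typing import List, Sequence


def field_values(
    messages: Sequence[bytes], j: int, n_prev: int, byteorder: str = "big"
) -> List[int]:
    """Read the field ending at offset j and spanning n_prev preceding bytes."""
    start = j - n_prev
    if start < 0 or any(len(m) <= j for m in messages):
        raise ValueError("field does not fit in every message")
    return [int.from_bytes(m[start : j + 1], byteorder) for m in messages]


def is_exact_length_field(messages: Sequence[bytes], j: int, n_prev: int) -> bool:
    """Claim 5 style: the field value equals the message length."""
    values = field_values(messages, j, n_prev)
    return bool(messages) and all(v == len(m) for v, m in zip(values, messages))


def is_offset_length_field(messages: Sequence[bytes], j: int, n_prev: int) -> bool:
    """Claim 6 style: value differences equal message length differences."""
    values = field_values(messages, j, n_prev)
    lengths = [len(m) for m in messages]
    if len(set(lengths)) < 2:
        return False  # added guard: needs messages of differing length
    return all(v - values[0] == n - lengths[0] for v, n in zip(values, lengths))
```

The difference-based test covers length fields that count only part of the message, for example the bytes that follow the field itself.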
7. The industrial control protocol reverse analysis method according to claim 3, wherein the field types include a changeable field, and the changeable field indicates that the values of the byte arrays in the corresponding column change irregularly;
the step of determining the protocol content corresponding to the byte array in each column based on the field type of the byte array in each column and the value of the byte array in each column includes:
combining a byte array whose field type is a changeable field in the two-dimensional protocol matrix with the byte arrays of the N4 columns preceding it to form a fourth combined byte set, wherein N4 is greater than or equal to 1, and the fourth combined byte set comprises a plurality of fourth combined bytes;
verifying any fourth combined byte in the fourth combined byte set by using a preset verification algorithm to obtain a verification result;
and if the verification result is consistent with the expected value of the preset verification algorithm, determining that the byte array whose field type is a changeable field is a check code field representing a check code.
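The check code test of claim 7 might be sketched as follows. The claims do not name the preset verification algorithm; the three candidates below (8-bit sum, 8-bit XOR, CRC-32 via Python's `zlib.crc32`) are illustrative choices only. The sketch further assumes that the check code covers all bytes preceding the field, that it is stored big-endian, and that single-byte fields (`n_prev = 0`) are allowed.

```python
import zlib
from functools import reduce
from typing import Callable, Dict, Optional, Sequence, Tuple

# Hypothetical candidate algorithms: name -> (field width in bytes, function)
CHECK_ALGORITHMS: Dict[str, Tuple[int, Callable[[bytes], int]]] = {
    "sum8": (1, lambda data: sum(data) & 0xFF),
    "xor8": (1, lambda data: reduce(lambda a, b: a ^ b, data, 0)),
    "crc32": (4, lambda data: zlib.crc32(data) & 0xFFFFFFFF),
}


def detect_check_field(messages: Sequence[bytes], j: int, n_prev: int) -> Optional[str]:
    """Return the name of the algorithm whose result matches the field in every message."""
    start, width = j - n_prev, n_prev + 1
    if start < 0 or not messages or any(len(m) <= j for m in messages):
        return None
    for name, (size, compute) in CHECK_ALGORITHMS.items():
        if size != width:
            continue  # candidate produces a different field width
        if all(
            int.from_bytes(m[start : j + 1], "big") == compute(m[:start])
            for m in messages
        ):
            return name
    return None
```

A return value of `None` means that none of the candidate algorithms explains the field, which does not rule out a check code computed by some other algorithm.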
8. The industrial control protocol reverse analysis method according to claim 3, wherein the field types include a changeable field, and the changeable field indicates that the values of the byte arrays in the corresponding column change irregularly;
the step of determining the protocol content corresponding to the byte array in each column based on the field type of the byte array in each column and the value of the byte array in each column includes:
combining a byte array whose field type is a changeable field in the two-dimensional protocol matrix with the byte arrays of the N5 columns preceding it to form a fifth combined byte set, wherein N5 is greater than or equal to 1, and the fifth combined byte set comprises a plurality of fifth combined bytes;
calculating the difference between the numerical values of any two fifth combined bytes in the fifth combined byte set and the difference between the message times corresponding to the two fifth combined bytes;
and if the difference between the numerical values of any two fifth combined bytes is the same as the difference between the corresponding message times, determining that the byte array whose field type is a changeable field is a time field representing time.
9. The industrial control protocol reverse analysis method according to claim 1, wherein the step of obtaining the message set to be analyzed includes:
collecting a plurality of initial messages of different services, wherein the plurality of initial messages comprise request messages and response messages, and the plurality of initial messages are the message set to be analyzed;
the decomposing all the messages in the message set to be analyzed into byte arrays and synthesizing all the decomposed messages into a two-dimensional protocol matrix comprises the following steps: decomposing the initial messages into byte arrays, synthesizing all the decomposed messages into a two-dimensional protocol matrix, wherein all the byte arrays obtained by decomposing each initial message are a row of the two-dimensional protocol matrix, and the byte arrays of the initial messages at the same position form a column of the two-dimensional protocol matrix.
10. The industrial control protocol reverse analysis method according to claim 9, wherein the field types include an alternate field, and the alternate field indicates that the values of the byte arrays in the corresponding column alternate;
the step of determining the protocol content corresponding to the byte array in each column based on the field type of the byte array in each column and the value of the byte array in each column includes:
if, in the two-dimensional protocol matrix, the field type of the byte array in a column adjacent to the column containing a byte array whose field type is an alternate field is also an alternate field, and the numerical value of the byte array of the alternate field is the same as the numerical value of the corresponding byte array in the adjacent column, determining that the byte array whose field type is an alternate field is a source field representing a source address, and that the corresponding byte array in the adjacent column is a destination field representing a destination address.
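The source and destination test of claim 10 might be sketched as follows. The claim does not spell out which byte arrays "correspond"; this sketch assumes that rows alternate between request and response and that the value in one column of a row reappears in the adjacent column of the next row, that is, the two addresses swap places. The function name and that interpretation are assumptions.

```python
from typing import List, Sequence, Tuple


def find_address_columns(matrix: Sequence[Sequence[int]]) -> List[Tuple[int, int]]:
    """Return (source_col, destination_col) pairs of adjacent columns that swap values."""
    pairs: List[Tuple[int, int]] = []
    if len(matrix) < 2:
        return pairs
    width = len(matrix[0])
    for j in range(width - 1):
        left = [row[j] for row in matrix]
        right = [row[j + 1] for row in matrix]
        swaps = all(
            left[i] == right[i + 1] and right[i] == left[i + 1]
            for i in range(len(matrix) - 1)
        )
        # Require two distinct addresses so that constant columns are not reported.
        if swaps and left[0] != right[0]:
            pairs.append((j, j + 1))
    return pairs
```

Which column of a reported pair is the source and which is the destination cannot be decided from the swap alone; the sketch simply lists the left column first.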
11. The industrial control protocol reverse analysis method according to claim 9, wherein the field types include a changeable field, and the changeable field indicates that the values of the byte arrays in the corresponding column change irregularly;
the step of determining the protocol content corresponding to the byte array in each column based on the field type of the byte array in each column and the value of the byte array in each column includes:
if, in the column of the two-dimensional protocol matrix containing a byte array whose field type is a changeable field, the numerical values of the byte arrays of any request message and of the response message corresponding to that request message are the same, and the numerical values of all the byte arrays in that column are smaller than a preset threshold, determining that the byte array whose field type is a changeable field is a function code field representing a function code.
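The function code test of claim 11 might be sketched as follows. The pairing of each request with its response is assumed to be done beforehand, and the default threshold of 0x80 is a placeholder chosen for this sketch, since the claim only speaks of a preset threshold. The caller is expected to pass only columns already classified as changeable.

```python
from typing import Sequence, Tuple


def is_function_code_column(
    pairs: Sequence[Tuple[bytes, bytes]], j: int, threshold: int = 0x80
) -> bool:
    """True if byte j matches in every request/response pair and stays below threshold."""
    if not pairs:
        return False
    for request, response in pairs:
        if len(request) <= j or len(response) <= j:
            return False  # column missing in one of the messages
        if request[j] != response[j] or request[j] >= threshold:
            return False
    return True
```

The threshold reflects the observation that function codes usually come from a small set of low values, whereas data bytes range over the whole byte space.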
12. The industrial control protocol reverse analysis method according to claim 11, wherein the step of determining the protocol content corresponding to the byte array in each column based on the field type of the byte array in each column and the value of the byte array in each column further comprises:
determining the byte arrays located in the N6 columns following the function code field in the two-dimensional protocol matrix as an identification data field.
13. The industrial control protocol reverse analysis method according to claim 9, wherein the field types include a gradient field;
the step of determining the protocol content corresponding to the byte array in each column based on the field type of the byte array in each column and the value of the byte array in each column includes:
combining a byte array whose field type is a gradient field in the two-dimensional protocol matrix with the byte arrays of the N7 columns preceding it to form a seventh combined byte set, wherein N7 is greater than or equal to 1, and the seventh combined byte set comprises a plurality of seventh combined bytes;
and if the values of the seventh combined bytes in the seventh combined byte set form an arithmetic progression, determining that the byte array whose field type is a gradient field is a sequence number field representing a communication sequence number.
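The sequence number test of claim 13 might be sketched as follows. Big-endian byte order, rows ordered by message time, a minimum of three rows, and a non-zero common difference are assumptions of this sketch; counter wrap-around is not handled.

```python
from typing import List, Sequence


def combined_values(
    matrix: Sequence[Sequence[int]], j: int, n_prev: int, byteorder: str = "big"
) -> List[int]:
    """Combine column j with the n_prev columns before it into one integer per row."""
    if n_prev < 0 or j - n_prev < 0:
        raise ValueError("not enough preceding columns")
    return [int.from_bytes(bytes(row[j - n_prev : j + 1]), byteorder) for row in matrix]


def is_sequence_number_field(
    matrix: Sequence[Sequence[int]], j: int, n_prev: int
) -> bool:
    """True if the combined values form an arithmetic progression with non-zero step."""
    values = combined_values(matrix, j, n_prev)
    if len(values) < 3:
        return False  # too few rows to call it a progression
    step = values[1] - values[0]
    return step != 0 and all(b - a == step for a, b in zip(values, values[1:]))
```

Combining a column with its preceding columns matters here: a low-order byte running 0xFE, 0xFF, 0x00, 0x01 is not a progression on its own, but together with the high-order byte it yields 254, 255, 256, 257.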
14. An industrial control protocol reverse analysis device, comprising:
a to-be-analyzed message set acquisition module, configured to acquire a message set to be analyzed;
a two-dimensional protocol matrix construction module, configured to decompose all messages in the message set to be analyzed into byte arrays and combine all the decomposed messages into a two-dimensional protocol matrix, wherein all the byte arrays obtained by decomposing each message form one row of the two-dimensional protocol matrix, and the byte arrays at the same position in the messages form one column of the two-dimensional protocol matrix;
a field type determining module, configured to perform multiple sequence comparison on the byte arrays in each column of the two-dimensional protocol matrix in combination with the message time of each message, and determine the field type of the byte arrays in each column, wherein byte arrays corresponding to different field types have different change rules;
a protocol content determining module, configured to determine the protocol content corresponding to the byte array in each column based on the field type of the byte array in each column and the numerical value of the byte array in each column;
and an industrial control protocol determining module, configured to combine the protocol contents corresponding to all the byte arrays in sequence to obtain the industrial control protocol corresponding to the message.
15. A computer device, comprising:
at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to perform the industrial control protocol reverse analysis method according to any one of claims 1 to 13.
16. A computer-readable storage medium storing computer instructions for causing a computer to perform the industrial control protocol reverse analysis method according to any one of claims 1 to 13.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010532058.6A CN112311755A (en) 2020-06-11 2020-06-11 Industrial control protocol reverse analysis method and device

Publications (1)

Publication Number Publication Date
CN112311755A (en) 2021-02-02

Family

ID=74336432

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010532058.6A Pending CN112311755A (en) 2020-06-11 2020-06-11 Industrial control protocol reverse analysis method and device

Country Status (1)

Country Link
CN (1) CN112311755A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130094376A1 (en) * 2011-10-18 2013-04-18 Randall E. Reeves Network protocol analyzer apparatus and method
CN104270392A (en) * 2014-10-24 2015-01-07 中国科学院信息工程研究所 Method and system for network protocol recognition based on tri-classifier cooperative training learning
CN107404487A (en) * 2017-08-07 2017-11-28 浙江国利信安科技有限公司 A kind of industrial control system safety detection method and device
CN108933784A (en) * 2018-06-26 2018-12-04 北京威努特技术有限公司 A kind of statement of industry control protocol-decoding rule and optimization coding/decoding method
CN110098959A (en) * 2019-04-23 2019-08-06 广东技术师范大学 Modeling method, device, system and the storage medium of industry control protocol interaction behavior

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113810471A (en) * 2021-08-18 2021-12-17 深圳市元征科技股份有限公司 Data transmission method, sending equipment and receiving equipment
CN114390118A (en) * 2021-12-28 2022-04-22 绿盟科技集团股份有限公司 Industrial control asset identification method and device, electronic equipment and storage medium
CN114390118B (en) * 2021-12-28 2023-11-07 绿盟科技集团股份有限公司 Industrial control asset identification method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination