CN113132392A - Industrial control network flow abnormity detection method, device and system - Google Patents

Industrial control network flow abnormity detection method, device and system

Publication number
CN113132392A
Authority
CN
China
Prior art keywords
symbol
sub
period
state
dfa
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110434596.6A
Other languages
Chinese (zh)
Other versions
CN113132392B (en
Inventor
唐玉维
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Liandian Energy Development Co ltd
Original Assignee
Suzhou Liandian Energy Development Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Liandian Energy Development Co ltd filed Critical Suzhou Liandian Energy Development Co ltd
Priority to CN202110434596.6A priority Critical patent/CN113132392B/en
Publication of CN113132392A publication Critical patent/CN113132392A/en
Application granted granted Critical
Publication of CN113132392B publication Critical patent/CN113132392B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/14Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1425Traffic logging, e.g. anomaly detection
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14Network analysis or design
    • H04L41/145Network analysis or design involving simulating, designing, planning or modelling of a network
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/14Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416Event detection, e.g. attack signature detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The application relates to a method, a device and a system for detecting industrial control network traffic anomalies, belonging to the technical field of network security. The method comprises: matching message pairs consisting of request messages and response messages in industrial control network flow data; converting the successfully matched message pairs into a symbol sequence carrying timestamps, wherein each symbol in the symbol sequence indicates a unique state event; acquiring the symbols of the timestamped symbol sequence one by one in order, and inputting them into a pre-constructed anomaly detection model for anomaly detection; if the currently acquired symbol is detected to be a known symbol, inputting the symbol into the corresponding sub-period DFA model; and determining an anomaly detection result according to the state transition result after the sub-period DFA model receives the symbol. The application can address the problem that semantic attacks aimed at SCADA systems damage industrial equipment or disrupt industrial production.

Description

Industrial control network flow abnormity detection method, device and system
Technical Field
The application relates to a method, a device and a system for detecting abnormal flow of an industrial control network, belonging to the technical field of network security.
Background
SCADA (supervisory control and data acquisition) systems are used to monitor and control critical infrastructure such as wastewater distribution facilities, natural gas production systems and power plants. A SCADA system operates mainly through communication between the HMI and the PLC: according to service requirements, the HMI periodically sends instructions to the PLC following certain logic; the PLC accesses the field devices according to the content of the received instructions and returns the information to the HMI; and the HMI displays the returned information, thereby achieving monitoring and control. Actual industrial production follows definite periodic behavior and operation sequences, so SCADA traffic is also highly periodic in its business logic.
SCADA traffic has two cycle types: polling cycles and timing cycles. In a polling cycle, the SCADA system executes a series of instructions in order according to the service logic of industrial production, mainly to retrieve data from field devices. In a timing cycle, the SCADA system performs a certain type of operation at regular intervals, typically to adjust the state of field devices. Multiple polling cycles and multiple timing cycles may be mixed in the communication channel between the HMI and the PLC. In more complex cases, for example, the communication between the HMI and the PLC may use a multi-threaded architecture in which each thread is responsible for an independent task and the threads execute concurrently. In this case, the traffic in the industrial control system is multiplexed, i.e., a given kind of traffic may occur in several periodic patterns.
In real industrial production, the SCADA system faces not only traditional network attacks, such as function code anomalies, DoS (denial of service) and buffer overflow, but also semantic attacks aimed specifically at the SCADA system. An attacker with detailed knowledge of the industrial process and the physical equipment can deliberately damage industrial equipment or disrupt industrial production by constructing a sequence of seemingly "legal" messages.
Disclosure of Invention
The application provides a method, a device and a system for detecting abnormal industrial control network flow, which can solve the problem that the semantic attack aiming at an SCADA system at present damages industrial equipment or industrial production.
The application provides the following technical scheme:
in a first aspect, a method for detecting an industrial control network traffic anomaly is provided, where the method includes:
matching message pairs consisting of request messages and response messages in industrial control network flow data;
converting the successfully matched message pair into a symbol sequence carrying a timestamp, wherein each symbol in the symbol sequence indicates a unique state event;
sequentially acquiring symbols in a symbol sequence carrying timestamps in sequence, and inputting a pre-constructed anomaly detection model for anomaly detection;
if the symbol in the symbol sequence obtained currently is detected to belong to a known symbol, inputting the symbol into a corresponding sub-period DFA model; the sub-period DFA model is obtained by establishing a DFA model for the symbol set in each sub-period according to a state transition relation after classifying the symbol sequence corresponding to the industrial control network traffic data to obtain the symbol sets corresponding to the plurality of sub-periods;
and determining an abnormal detection result according to the state transition result after the sub-period DFA model receives the symbol.
In a second aspect, an industrial control network traffic anomaly detection device is provided, which comprises
The message matching module is used for matching a message pair consisting of a request message and a response message in the industrial control network flow data;
the mapping module is used for converting the successfully matched message pair into a symbol sequence carrying a timestamp, and each symbol in the symbol sequence indicates a unique state event;
the symbol acquisition module is used for sequentially acquiring symbols in a symbol sequence carrying the time stamps in sequence and inputting a pre-constructed anomaly detection model for anomaly detection;
the judging module is used for inputting the symbols in the currently acquired symbol sequence into a corresponding sub-period DFA model if the symbols are detected to belong to known symbols; the sub-period DFA model is obtained by establishing a DFA model for the symbol set in each sub-period according to a state transition relation after classifying the symbol sequence corresponding to the industrial control network traffic data to obtain the symbol sets corresponding to the plurality of sub-periods;
and the result output module is used for determining an abnormal detection result according to the state transition result after the sub-period DFA model receives the symbol.
In a third aspect, an industrial control network traffic anomaly detection system is provided, where the system includes a processor and a memory; the memory stores a program, and the program is loaded and executed by the processor to implement the steps of the industrial control network traffic anomaly detection method according to the first aspect of the present application.
The beneficial effects of this application are as follows: the method builds an anomaly detection model, divides the traffic into sub-periods based on the Markov principle, and establishes a separate DFA model for each sub-period, so that abnormal traffic can be detected accurately and more complex semantic attacks can be detected. Compared with existing anomaly detection methods, the method can detect more types of semantic attacks, and the detection model has a lower false alarm rate and a lower missed detection rate.
The foregoing is only an overview of the technical solutions of the present application. To make the technical solutions clearer and to allow them to be implemented according to the content of the description, the following detailed description is given with reference to the preferred embodiments of the present application and the accompanying drawings.
Drawings
Fig. 1 is a state transition diagram provided by one embodiment of the present application;
Fig. 2 is a flowchart of an industrial control network traffic anomaly detection method according to an embodiment of the present application;
Fig. 3 is a block diagram of an anomaly detection model provided in one embodiment of the present application;
Fig. 4 is a state transition diagram provided by another embodiment of the present application;
Fig. 5 is a state transition diagram for a sub-cycle provided by one embodiment of the present application;
Fig. 6 is a block diagram of an industrial control network traffic anomaly detection device according to an embodiment of the present application;
Fig. 7 is a block diagram of an industrial control network traffic anomaly detection system according to an embodiment of the present application.
Detailed Description
The following detailed description of embodiments of the present application will be described in conjunction with the accompanying drawings and examples. The following examples are intended to illustrate the present application but are not intended to limit the scope of the present application.
Fig. 1 shows a state transition diagram corresponding to multi-cycle mixed industrial control network flow. Referring to fig. 1, the state transition diagram includes a polling cycle state sequence "abcdbedfabcdbedf … …" and a timing cycle state sequence "AAA … …", so it is a multi-cycle mixed state transition diagram.
As shown in fig. 1, for this state transition diagram, simple semantic attacks can be divided into two categories: sequence attacks and time-series attacks.
A sequence attack means that an attacker sends message instructions in an illegal, malicious order. For example, reversing the subsequence "ab" in the sequence "abcdbedfabcdbedf … …" forms the abnormal sequence "bacdbedfbacdbedf … …", which is a sequence attack. As an illustration, consider a sequence attack on a high-pressure gas delivery pipe whose pressure is controlled by two valves: by sending instructions to the PLC of the pipe, the attacker forces one valve fully open and the other fully closed, so that the pressure in the pipe becomes too high and the pipe stops working. Each of these instructions is legitimate when inspected individually, but sent in an illegitimate order they stop the system.
A timing attack means that an attacker sends message instructions at an illegal time. For example, changing the cycle time of the sequence "AAA … …" from 5 seconds to 2 seconds forms a timing attack. As an illustration, in a water delivery system an attacker sends a normal command sequence to the PLC at an abnormal frequency, so that the valves of the water delivery pipes open and close rapidly, creating a water hammer effect that breaks a large number of water delivery pipes.
More complex semantic attacks can be constructed if the attacker has deeper knowledge of the industrial production process: branch node attacks and sub-cycle replay attacks.
A branch node attack means that an attacker reverses the transmission order of the subsequences that leave a branch node. For the branch node b, the state transitions b → c and b → e are both legal, so the attacker can construct the branch node attack sequence "abedbcdfabedbcdf … …" by swapping the order of the subsequences "bcd" and "bed".
A sub-period replay attack means that an attacker repeatedly sends a subsequence. For the sub-period "AAA … …" in the state diagram shown in fig. 1, an attacker can send state A multiple times so that the whole production flow changes, interfering with industrial production.
To address the shortcomings of existing industrial control system anomaly detection methods in detecting semantic attacks on multi-period mixed industrial control network flow, the embodiment of the application provides an industrial control network flow abnormity detection method.
Fig. 2 is a flowchart of an industrial control network traffic anomaly detection method according to an embodiment of the present application, where the method includes the following steps:
S201: Acquire industrial control network flow data.
S202: Match the message pairs consisting of request messages and response messages in the industrial control network flow data.
Specifically, matching in this embodiment means checking whether the response message is a response to the request message. If so, the match succeeds; if not, the match fails.
S203: Judge whether the message pair is successfully matched; if not, go to S204; if so, go to S205.
S204: Detect a "lost" anomaly and return to S201.
S205: Convert the successfully matched message pair into a symbol sequence carrying timestamps.
In the interaction process of an industrial control system, the specific operations of industrial production are carried out through communication traffic. This embodiment converts the acquired flow data into state events and constructs a state transition diagram, so that the flow data becomes a symbol sequence in which the different states represented by the nodes of the state transition diagram are denoted by different symbols. The specific conversion is as follows:
Different industrial protocols contain different fields and define state events differently. To ensure that flow data is accurately converted into state events, the semantic features of the protocol need to be analyzed and appropriate feature fields selected. According to the selected feature fields, the original flow sequence can be converted into a symbol sequence carrying timestamps, in which each symbol represents a unique state event.
In this embodiment, taking the S7 protocol as an example, feature extraction and state conversion rules based on the S7 protocol are formulated: the Protocol Id, ROSCTR, Parameter Length, Data Length, Function Code, Item Count and Item data headers of the protocol are extracted to define a state event. The state event conversion rules defined by these features are as follows:
(1) When the S7 message function code is a read operation (0x04), request messages that read the same object address (the same field device information) are the same state event, and response messages whose return value parameters are the same (the same returned device information) are the same state event.
(2) When the S7 message function code is a write operation (0x05), request messages with the same object address and the same written value (the same value written to the same field device) are the same state event, and response messages whose return value parameters are the same (whether the write succeeded) are the same state event.
Any one of the S7 messages in the traffic data may be mapped to a unique state event according to the above-described transformation rules.
To facilitate construction of the anomaly detection model, this embodiment first takes out the values of the required parameter items in the S7 message and concatenates them, so that one S7 message becomes one symbol string. The symbol strings are then converted with the SHA-1 function into hash symbol strings of equal length, and the timestamp of each hash symbol string is kept. This completes the conversion from message pairs to the symbol sequence, in which each symbol represents a unique state event.
For convenience of presentation, the hash symbol strings may be denoted by "a, b, c … …". For example, symbol a may represent reading a certain sensor, and symbol b modifying a parameter of a certain controller.
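The conversion from selected message fields to equal-length hash symbols can be sketched as follows. This is a minimal illustration only: the function names, the "|" separator used for concatenation, and the example field values are assumptions and do not come from the application.

```python
import hashlib

def message_to_symbol(fields):
    """Concatenate the selected field values and hash them with SHA-1.

    The '|' separator is an assumption; the source only says the values
    are concatenated into one symbol string.
    """
    joined = "|".join(str(v) for v in fields)
    return hashlib.sha1(joined.encode("utf-8")).hexdigest()

def to_symbol_sequence(messages):
    """Map (timestamp, fields) tuples to (timestamp, symbol) tuples."""
    return [(ts, message_to_symbol(fields)) for ts, fields in messages]

# Hypothetical field tuples: two reads of the same address, then one write.
msgs = [
    (0.0, ("0x32", 1, "0x04", 1, "DB1.DBX0.0")),
    (5.0, ("0x32", 1, "0x04", 1, "DB1.DBX0.0")),
    (5.2, ("0x32", 1, "0x05", 1, "DB1.DBX0.0", 1)),
]
seq = to_symbol_sequence(msgs)
```

Identical field values always yield the same 40-character hexadecimal symbol, so the two read requests above map to the same state event while the write maps to a different one.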
S206: Acquire the symbols of the timestamped symbol sequence one by one in order, and input them into the pre-constructed anomaly detection model for anomaly detection.
Specifically, the anomaly detection model constructed in the embodiment includes at least two sub-period DFA models, a DFA selector, and an anomaly determination module.
In this embodiment, a state transition diagram is constructed from a symbol sequence carrying a timestamp, and symbols in different states represented by each node in the state transition diagram are classified to obtain a plurality of separated sub-periods. After separating a plurality of sub-periods, respectively and independently constructing a DFA model for each sub-period, wherein each sub-period DFA model is used for outputting a state transition result according to an input symbol.
Referring to fig. 3, this embodiment sets the DFA selector according to the periodic patterns of the sub-periods. When at least two period patterns exist in the channel, a DFA selector is added to send each symbol of the symbol sequence into the corresponding sub-period DFA model.
The DFA selector of this embodiment performs the selection by analyzing and comparing the symbol content and the timestamps in the channel. It is designed mainly for the following two cases:
(1) The symbol contents of the sub-period patterns are all different. The selection can be made directly on the symbol content, and the traffic symbol is sent to the corresponding DFA.
(2) The symbol contents of the sub-period patterns overlap. A comparison between the timestamp and the period value is added to the symbol content check in order to send the traffic symbol into the corresponding DFA model.
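The two selector cases can be sketched as below. The source does not spell out how the timestamp is compared with the period value, so the rule used here (choose the candidate sub-period whose period best matches the time elapsed since that sub-period last received a symbol) is an assumption, as are the function name and the data layout.

```python
def select_dfa(symbol, timestamp, subperiods, last_seen):
    """Return the index of the sub-period DFA that should receive the symbol.

    subperiods: list of dicts {"symbols": set, "period": float}
    last_seen:  dict mapping sub-period index to the timestamp of the last
                symbol routed to it (hypothetical bookkeeping structure)
    Returns None when no sub-period contains the symbol.
    """
    candidates = [i for i, sp in enumerate(subperiods) if symbol in sp["symbols"]]
    if not candidates:
        return None
    if len(candidates) == 1:
        # Case (1): the symbol content alone identifies the sub-period.
        return candidates[0]

    # Case (2): the symbol occurs in several sub-periods, so compare the
    # elapsed time with each candidate's period value (assumed rule).
    def deviation(i):
        elapsed = timestamp - last_seen.get(i, timestamp)
        return abs(elapsed - subperiods[i]["period"])

    return min(candidates, key=deviation)

subperiods = [
    {"symbols": {"a", "b"}, "period": 5.0},
    {"symbols": {"a", "c"}, "period": 2.0},
]
by_content = select_dfa("b", 1.0, subperiods, {})
by_time_slow = select_dfa("a", 12.0, subperiods, {0: 7.0, 1: 11.5})
by_time_fast = select_dfa("a", 13.5, subperiods, {0: 12.0, 1: 11.5})
```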
The anomaly determination module of this embodiment is configured to determine whether the input industrial control network traffic is abnormal according to the state transition result output by the sub-period DFA model. For a specific determination method, please refer to the following description.
S207: Judge whether the currently acquired symbol belongs to the known symbols; if not, go to S208; if so, go to S209.
Specifically, the input symbols are divided into two categories: known symbols and unknown symbols.
The known symbol set consists of all input symbols observed in the training phase of the anomaly detection model, each of which has a corresponding DFA state. All other symbols are unknown symbols, which can be directly detected as unknown attacks.
The present embodiment sets a state alphabet States, in which the elements are state types in the symbol sequence SymSeq.
The acquired symbol $s_i$ is sent into the DFA selector, which judges whether $s_i$ is in the state alphabet States. If $s_i$ is in States, it is a known symbol; otherwise, it is an unknown symbol.
S208: Detect the currently acquired symbol as an "unknown" anomaly, and return to S206.
S209: Input the symbol into the corresponding sub-period DFA model.
S210: After the currently acquired symbol is input, judge whether the state transitions to the expected position of the periodic sequence; if so, go to S214; otherwise, go to S211.
S211: Judge whether the state remains at the current state; if not, go to S212; if so, go to S213.
S212: Detect a "lost" anomaly, and return to S206.
S213: Detect a "retransmission" anomaly, and return to S206.
S214: Judge the state transition to be normal.
S215: Judge whether the difference between the timestamp carried by the current symbol and the average time interval of the current state is greater than the time interval deviation threshold; if so, go to S216; otherwise, go to S217.
S216: Detect a "timing" anomaly for the current symbol, and return to S206.
S217: Detect that both the timing and the order are normal.
In the DFA model, traffic behavior is described as three types of state transitions (output symbols):
(1) Normal: a "normal" state transition occurs when a known symbol is received that moves the state to the next state of the periodic sequence, i.e., $s_j = s_{i+1}$. As a result of a "normal" event, the DFA transitions to its next state $State_{i+1}$.
(2) Retransmission: a "retransmission" is the occurrence of a known symbol identical to the previous symbol, i.e., $s_j = s_i$. As a result of a "retransmission" event, the DFA remains in its current state $State_i$. If the pattern contains two consecutive identical symbols, the same symbol would have two different state transitions, namely a forward "normal" transition and a self-looping "retransmission" transition. This ambiguity is resolved at runtime by choosing the "normal" transition over the self-looping "retransmission" transition.
(3) Loss: a "loss" means that a known symbol $s_j$ is received where the symbol of $State_{i+1}$ was expected (it is not at the expected position of the state transition process), i.e., $s_j \neq s_{i+1}$. As a result of a "loss" event, the DFA transitions to $State_j$, the state corresponding to the received symbol $s_j$.
For steps S211-S217, the acquired symbol $s_i$ is fed into the sub-period DFA model $DFA_i$. If the current state of $DFA_i$ is $State_j$ and the state transitions to $State_{j+1}$ after $s_i$ is input, the transition is "normal".
If the state is still $State_j$ after $s_i$ is input, a "retransmission" anomaly is detected.
If the state transitions to $State_{j+k}$ with $k \geq 2$ after $s_i$ is input, a "lost" anomaly is detected.
A symbol detected as a normal transition requires a further timing check. Let $T$ be the timestamp of symbol $s_i$, let $T_{ij}^{Avg}$ be the average time interval associated with state $State_j$ of $DFA_i$, and let the duration threshold be the time interval deviation threshold. If $|T - T_{ij}^{Avg}| <$ duration threshold, $s_i$ is "normal" in both order and timing; otherwise $s_i$ is detected as a "timing" anomaly.
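The per-symbol decision described above can be sketched as a single step function. This is an illustration under stated assumptions rather than the application's implementation: the symbols of a sub-period are assumed to be unique, `elapsed` is assumed to be the time since the previous symbol of the same sub-period (the source compares the timestamp T with the average interval), and all names are hypothetical.

```python
def step(order, avg_interval, state, symbol, elapsed, threshold):
    """Classify one input symbol and return (label, next_state).

    order:        symbols of one sub-period in cyclic order (assumed unique)
    avg_interval: average time interval per state, indexed like `order`
    state:        index of the current state
    elapsed:      time since the previous symbol (assumed meaning of T)
    threshold:    time interval deviation threshold
    """
    if symbol not in order:
        return "unknown", state
    nxt = (state + 1) % len(order)
    if symbol == order[nxt]:
        # Expected position reached: check timing before declaring normal.
        if abs(elapsed - avg_interval[nxt]) < threshold:
            return "normal", nxt
        return "timing", nxt
    if symbol == order[state]:
        # Same symbol as before: the state does not change.
        return "retransmission", state
    # Known symbol at an unexpected position: jump to its state.
    return "lost", order.index(symbol)

order = ["e", "f", "g", "h"]
avg = [1.0, 1.0, 1.0, 1.0]
r_normal = step(order, avg, 0, "f", 1.02, 0.1)
r_retx = step(order, avg, 1, "f", 1.0, 0.1)
r_lost = step(order, avg, 1, "h", 1.0, 0.1)
r_timing = step(order, avg, 3, "e", 3.0, 0.1)
```

Checking the forward transition before the self-loop mirrors the rule that a "normal" transition takes precedence over a "retransmission" when both would match.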
Optionally, the present embodiment further includes a step of constructing an anomaly detection model, which is specifically as follows:
S301: Construct a state transition diagram.
According to S205, this embodiment converts the original traffic data into a symbol sequence carrying timestamps, which comprises a symbol sequence SymSeq and the corresponding time sequence TimeSeq.
In this embodiment, the state transition diagram is constructed from the timestamped symbol sequence, which yields the state transition relation of the events. The state transition relation is represented by a matrix adjStates with elements adjStates[i][j], where index i corresponds to the i-th state i' in States, index j corresponds to the j-th state j' in States, and adjStates[i][j] is the number of transitions from state i' to state j'.
After the matrix adjStates is constructed, its elements are divided by the total time interval of the time sequence TimeSeq to obtain a frequency matrix adjF. The frequency matrix adjF is the finally constructed state transition diagram matrix, whose element adjF[i][j] is the transition frequency from state i' to state j'. The specific construction process is as follows:
S1: Take the symbols out of SymSeq one by one, and judge whether the current symbol is a new state event; if so, add it to the state alphabet States.
S2: Update the transition relation between the current state and the previous state, i.e., add 1 to the corresponding element of the matrix adjStates.
S3: Judge whether all symbols in SymSeq have been taken; if not, return to S1; otherwise, go to S4.
S4: Divide each value in the matrix adjStates by the total time interval of the time sequence, and calculate the in-degree and out-degree frequency of each state node, to obtain the frequency matrix adjF.
S5: According to whether each value of the frequency matrix adjF is 0, count the in-degree and out-degree of each state node and the related incoming and outgoing state sets.
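Steps S1 to S5 can be sketched as follows. The function name and return layout are hypothetical, and the total time interval is assumed to be the difference between the last and first timestamps of TimeSeq (and to be non-zero).

```python
def build_transition_graph(sym_seq, time_seq):
    """Build States, adjStates, adjF and the in/out state sets (S1 to S5)."""
    states, index, counts = [], {}, {}
    prev = None
    for sym in sym_seq:
        if sym not in index:               # S1: new state event
            index[sym] = len(states)
            states.append(sym)
        if prev is not None:               # S2: count the transition prev -> sym
            key = (index[prev], index[sym])
            counts[key] = counts.get(key, 0) + 1
        prev = sym
    n = len(states)
    adj_states = [[0] * n for _ in range(n)]
    for (i, j), c in counts.items():
        adj_states[i][j] = c
    # S4: divide by the total time interval (assumed: last minus first timestamp).
    total = time_seq[-1] - time_seq[0]
    adj_f = [[c / total for c in row] for row in adj_states]
    # S5: non-zero entries give the incoming and outgoing state sets.
    in_sets = {s: {states[i] for i in range(n) if adj_f[i][k] > 0}
               for k, s in enumerate(states)}
    out_sets = {s: {states[j] for j in range(n) if adj_f[k][j] > 0}
                for k, s in enumerate(states)}
    return states, adj_states, adj_f, in_sets, out_sets

states, adj_states, adj_f, in_sets, out_sets = build_transition_graph(
    list("abcabc"), [0, 1, 2, 3, 4, 5])
```

The in-degree and out-degree of a node are then simply the sizes of its entries in `in_sets` and `out_sets`.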
The embodiment of the application classifies the symbols represented by the different states in the alphabet States into different symbol sets S according to their incoming and outgoing relations and frequencies, thereby separating the multiple sub-periods. The classification finally yields the sub-period set $C = \{S_1, S_2, \dots, S_i, \dots, S_n\}$, where $S_i$, $i \in [1, n]$, is the symbol set of the i-th sub-period.
Fig. 4 is a state transition diagram consisting of a sequence of timing cycles "ij … …" of cycle length 2, which consists of a request-response pair, i.e. the symbol "ij", and a polling sequence "ababcefghefghefghefghefghefghgh … …" of cycle length 50. The polling cycle sequence is composed of four request response pairs, namely symbols 'ab', 'cd', 'ef' and 'gh', the number in the state diagram node is the frequency of each state, wherein the input state set of the state 'a' is { b, h, j } and the output state set is { b }.
According to the state transition diagram access relationship and frequency, the present embodiment divides the symbols in fig. 4 into four sub-periods: { a, b }, { c, d }, { e, f, g, h } and { i, j }, successfully separating the polling period and the timing period.
S302: the order of symbols contained in each sub-period is determined.
The order of the symbols in each set is determined according to the incoming and outgoing relations of the symbols in the state transition diagram and the order of the original symbol sequence. The following operations are carried out on each sub-period symbol set $S_i$ in turn:
(1) First, the first symbol $s_{first}$ of the subset $S_i$ is determined from the original symbol sequence SymSeq.
(2) Then, the next symbol after $s_{first}$ is determined according to the incoming and outgoing relations of the state transition diagram, and this operation is repeated until the order of all symbols in the set $S_i$ is determined.
(3) The symbol following the last symbol $s_{last}$ is set to the first symbol $s_{first}$ of the sub-period, forming a complete cycle.
The above construction process is described in detail below by taking the corresponding separated subcycle { e, f, g, h } in the symbol sequence "ababcdefghefghefghefghefghefghefghefghefghefgh … …" in fig. 4 as an example:
First, the first symbol e of the subset is selected according to the original symbol sequence; the next symbol f of e is then obtained from the state transition diagram, giving the order e → f; continuing in the same way gives the order e → f → g → h. Since h is the last symbol in the symbol set and the sub-period is cyclic, the order h → e is determined, and the order relation shown in fig. 5 is finally constructed.
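The ordering procedure for one sub-period can be sketched as below. The names are hypothetical, and the tie-break used when a symbol has several unvisited successors inside the set (earliest first appearance in the original sequence) is an assumption, since the source does not specify one.

```python
def order_subperiod(symbols, sym_seq, out_sets):
    """Return the cyclic order of one sub-period's symbols.

    symbols:  set of symbols belonging to the sub-period
    sym_seq:  original symbol sequence (str or list)
    out_sets: dict mapping each symbol to its set of successor symbols
    """
    # (1) The first symbol is the one that appears earliest in the sequence.
    first = next(s for s in sym_seq if s in symbols)
    order = [first]
    # (2) Follow successors inside the set until every symbol is placed.
    while len(order) < len(symbols):
        successors = [s for s in out_sets[order[-1]]
                      if s in symbols and s not in order]
        if not successors:
            break  # graph does not chain all symbols; return partial order
        order.append(min(successors, key=sym_seq.index))  # assumed tie-break
    # (3) The successor of the last symbol is implicitly order[0] (cyclic).
    return order

out_sets = {"e": {"f"}, "f": {"g"}, "g": {"h"}, "h": {"e", "i"}}
cycle = order_subperiod({"e", "f", "g", "h"}, "ababcdefghefgh", out_sets)
```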
S303: a DFA model is constructed for each sub-period.
Optionally, the process of constructing the DFA model for each sub-period in this embodiment is as follows:
Specifically, a DFA is a five-tuple $(Q, \Sigma, \delta, q_0, F)$, where $Q$ is a non-empty finite set of states, $\Sigma$ is a non-empty finite input alphabet, $\delta$ is the transition function, $q_0$ ($q_0 \in Q$) is the initial state, and $F$ ($F \subseteq Q$) is the set of accepting states. A DFA model is constructed separately for each sub-period. To make the DFA model meet the actual modeling requirements, the DFA is modified in the following two respects:
(1) The final state set F is removed, because the input of the anomaly detection model in the detection phase is an endlessly repeating data stream, and the DFA model does not terminate unless the data stream ends.
(2) The start state is defined as the state corresponding to the first symbol in the periodic traffic pattern.
The DFA model of this embodiment is constructed from the sub-period symbol sequence whose order has been determined. $State_i$ denotes the current state, $s_i$ denotes the input symbol that causes a transition to $State_i$, and $s_j$ denotes the currently received input symbol, which will cause the state to transition to $State_j$.
For each sub-period, the DFA model construction process is as follows:
(1) First, the normal transition relation between sequences is constructed: if s_j = s_{i+1}, the current state State_i, upon receiving symbol s_j, transitions normally to State_{i+1}.
(2) Next, the retransmission transition relation between sequences is constructed: if s_j = s_i, the current state State_i, upon receiving symbol s_j, judges a retransmission, and the current state is unchanged.
(3) Finally, the loss transition relation between sequences is constructed: if s_j ≠ s_{i+1} (and s_j ≠ s_i, the case already handled in (2)), the current state State_i, upon receiving symbol s_j, judges a loss, and the current state is unchanged.
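The three transition relations above can be sketched as a small state machine. The class name `SubPeriodDFA` and the result labels are hypothetical, and the machine is assumed to start in the state of the first symbol of the sub-period, with no final state set.

```python
class SubPeriodDFA:
    """Cyclic DFA for one sub-period with normal, retransmission and loss transitions."""

    def __init__(self, order):
        if not order:
            raise ValueError("order must contain at least one symbol")
        self.order = list(order)
        self.index = 0  # start state: the state of the first symbol

    @property
    def state(self):
        return self.order[self.index]

    def step(self, symbol):
        nxt = (self.index + 1) % len(self.order)
        if symbol == self.order[nxt]:
            # s_j = s_{i+1}: normal transition to State_{i+1}
            self.index = nxt
            return "normal"
        if symbol == self.order[self.index]:
            # s_j = s_i: retransmission, state unchanged
            return "retransmission"
        # s_j != s_{i+1}: loss, state unchanged
        return "loss"


dfa = SubPeriodDFA("efgh")
results = [dfa.step(s) for s in "fgheegf"]
```

In this sketch a sub-period of length one treats every repetition as a normal transition, because the check for s_{i+1} is made first; whether that matches the intended behavior is an assumption.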
Optionally, in this embodiment, classifying the symbols and dividing the sub-periods is based on the principle of the Markov chain. The specific sub-period division process in this embodiment is as follows:
(1) Nodes whose in-degree and out-degree are both 1 are not branched in the state transition diagram and belong to only one sub-period set. Therefore, a node V whose in-degree and out-degree are both 1 is selected first and, according to its node frequency F_V, is assigned to a set S of similar frequency, where the set frequency is F_S. The symbol ≈ here indicates frequency similarity, i.e., F_S × (1 − F_T) ≤ F_V ≤ F_S × (1 + F_T), where F_T is the frequency threshold; in this experiment F_T is set to 0.05. If no set of similar frequency exists, a new set with frequency F_V is created and node V is added to the new set.
(2) remainV denotes the set of nodes in the state transition diagram that have not been fully allocated, i.e.
Figure BDA0003032637010000111
A node V with out-degree or in-degree of 1 is selected from remainV and, according to its node frequency F_V, is assigned to an existing set of similar frequency; node V is then removed from remainV.
(3) A node may be a member of multiple sets, which represents a symbol occurring in multiple cyclic patterns. V_in is an in-neighbor of node V that belongs to only one known set, and V_out is an out-neighbor of node V that belongs to only one known set. If the sum of the set frequencies of all V_in, or of all V_out, is similar to the frequency F_V of node V, node V is added to each of the relevant sets and removed from remainV. If those set frequencies are not similar to the node frequency, it is judged whether the sum of the set frequencies of all V_in and V_out together is similar to F_V; if so, node V is added to each of the relevant sets and removed from remainV.
(4) If at least one in- or out-neighbor of node V is in a known set and the in/out-degree of that neighbor is 1, node V is added to the set and the frequency of the node is updated to F_V = F_V − F_S; if F_V becomes 0, node V is removed from the remaining node set remainV.
(5) It is judged whether any nodes remain in remainV. If so, the node with the minimum frequency is found among the remaining nodes, a new set is created according to the frequency of that node, and the node is added to the set; the remaining nodes are then added to the set and the frequency of each node is updated. If the frequency of a node is less than the threshold F_T, node V is removed from remainV.
(6) Finally, the nodes still remaining in remainV are added to known sets of similar frequency.
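Step (1) of the division process and the frequency-similarity test can be sketched as follows. This is a minimal sketch only: `FreqSet`, `similar` and `group_unbranched_nodes` are hypothetical names, the set frequency F_S is assumed to be the frequency of the first node placed in the set, and steps (2) to (6) are not covered.

```python
from dataclasses import dataclass, field

F_T = 0.05  # frequency threshold used in the experiment described above


def similar(f_v, f_s, f_t=F_T):
    """Frequency similarity: F_S*(1-F_T) <= F_V <= F_S*(1+F_T)."""
    return f_s * (1 - f_t) <= f_v <= f_s * (1 + f_t)


@dataclass
class FreqSet:
    frequency: float  # F_S (assumed: frequency of the first member)
    nodes: list = field(default_factory=list)


def group_unbranched_nodes(freq, in_deg, out_deg, f_t=F_T):
    """Assign nodes with in-degree 1 and out-degree 1 to sets of similar frequency."""
    sets, remain = [], []
    for v, f_v in freq.items():
        if in_deg[v] == 1 and out_deg[v] == 1:
            target = next((s for s in sets if similar(f_v, s.frequency, f_t)), None)
            if target is None:
                # No set of similar frequency exists: create one with frequency F_V.
                target = FreqSet(frequency=f_v)
                sets.append(target)
            target.nodes.append(v)
        else:
            remain.append(v)  # left for the later steps (remainV)
    return sets, remain


freq = {"a": 100, "b": 102, "e": 400, "f": 398, "x": 500}
in_deg = {"a": 1, "b": 1, "e": 1, "f": 1, "x": 2}
out_deg = {"a": 1, "b": 1, "e": 1, "f": 1, "x": 1}
sets, remain = group_unbranched_nodes(freq, in_deg, out_deg)
```

Here the branched node x stays in `remain`, while a, b and e, f fall into two sets of similar frequency.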
Markov chains are common general knowledge in the art and are not described in further detail here.
Further optionally, the present embodiment fuses the separated sub-periods.
When traffic of multiple periods is mixed, one long-period traffic may also be decomposed into multiple sub-period traffic. For example, the polling cycle of length 50 in fig. 4 is decomposed by the classification algorithm into 3 sub-periods, "abab…", "cdcd…" and "efghefgh…". If an attacker who fully understands the production process continuously replays the "ab" sub-period, the DFA model cannot detect the attack, which may result in a serious production accident.
In order to improve the detection capability of the model for complex semantic attacks, the separated sub-periods are further fused on the basis. The method comprises the following specific steps:
(1) Obtain the set of all sub-period symbol sets C = {S_1, S_2, …, S_i, …, S_n}, where S_i, i ∈ [1, n], denotes the symbol set of the i-th sub-period; the model false alarm rate FPR_orig is obtained at the same time.
(2) For all symbol sets, create a matrix adjSet that records whether fusion of two sets has been tried. Since a set does not need to be fused with itself, initialize adjSet[i][j] = 1 for i = j and adjSet[i][j] = 0 for i ≠ j, with i, j ∈ [1, n].
(3) Search the matrix adjSet and select from C two sets S_i and S_j whose fusion has not been tried, i.e., adjSet[i][j] = 0. If no qualifying sets exist in C, execute (6); otherwise execute (4).
(4) For all symbols V_i ∈ S_i and V_j ∈ S_j in the two sets, if
Figure BDA0003032637010000122
or
Figure BDA0003032637010000123
that is, if symbols in S_i and S_j have an in-out relationship, S_i and S_j are temporarily fused to generate a new symbol set S_temp; S_i and S_j are removed from C, and S_temp is added to C.
Then the original symbol sequence SymSeq is classified according to each sub-period symbol set S_t in C in turn to generate new subsequences subSeq_t, i.e.
Figure BDA0003032637010000121
if V ∈ S_t, then subSeq_t ← V.
Finally, a temporary DFA model is constructed for each subSeq_t using an unsupervised learning method.
If no symbols in S_i and S_j have an in-out relationship, set adjSet[i][j] = 1 and repeat (3).
(5) Let the false alarm rate of the temporary DFA model be FPR_temp. If FPR_temp > FPR_orig, S_temp is restored to S_i and S_j, adjSet[i][j] is set to 1, and (3) is repeated. Otherwise, the fusion of S_i and S_j is confirmed, C is updated, and (2) and (3) are repeated.
(6) After the above steps, the fused set C is obtained. Normal delays between request packets and response packets can cause false alarms (i.e., a sequence q_1, r_1, q_2, r_2, … may become q_1, q_2, r_1, r_2, … because of normal network delay, which the model may misreport). Therefore, based on protocol parsing, this embodiment represents each matched request-response pair q_1, r_1, q_2, r_2, … in set C by its request q_1, q_2, …, generating a new symbol set C′. The corresponding subSeq_i is generated from each S_i in C′, and a DFA model is constructed for each subSeq_i using the unsupervised learning method, yielding the fused DFA model.
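Steps (1) to (5) of the fusion procedure can be sketched as a greedy loop. This sketch rests on several assumptions: the in-out relationship (given only as formula images in the source) is taken to mean that a symbol of one set is immediately adjacent to a symbol of the other set in SymSeq; the false alarm rate is supplied by a caller-provided function `fpr_of`, which is hypothetical; and the set `tried` plays the role of the adjSet entries equal to 1. Step (6), the collapsing of request-response pairs, is not covered.

```python
from itertools import combinations


def split_by_sets(sym_seq, sets):
    """subSeq_t receives every symbol V of SymSeq with V in S_t."""
    return [[v for v in sym_seq if v in s] for s in sets]


def has_in_out_relation(sym_seq, s_i, s_j):
    """Assumed meaning: some adjacent pair in SymSeq crosses between S_i and S_j."""
    return any((a in s_i and b in s_j) or (a in s_j and b in s_i)
               for a, b in zip(sym_seq, sym_seq[1:]))


def fuse_sub_periods(sym_seq, sets, fpr_of):
    """Greedy fusion; fpr_of(sub_sequences) returns a false alarm rate."""
    current = [frozenset(s) for s in sets]
    fpr_orig = fpr_of(split_by_sets(sym_seq, current))
    tried = set()  # pairs whose fusion has already been attempted
    while True:
        pair = next(((a, b) for a, b in combinations(current, 2)
                     if frozenset((a, b)) not in tried), None)
        if pair is None:
            return current  # no untried pair is left
        a, b = pair
        tried.add(frozenset((a, b)))
        if not has_in_out_relation(sym_seq, a, b):
            continue
        candidate = [s for s in current if s not in (a, b)] + [a | b]
        fpr_temp = fpr_of(split_by_sets(sym_seq, candidate))
        if fpr_temp <= fpr_orig:
            current = candidate  # confirm the fusion and update C
            tried.clear()        # re-initialise the record, as in step (2)
        # otherwise the temporary fusion is discarded and the pair stays marked


sym_seq = "ababcdcdefghefgh"
parts = ["ab", "cd", "efgh"]
always_ok = fuse_sub_periods(sym_seq, parts, lambda subs: 0.0)
never_ok = fuse_sub_periods(sym_seq, parts, lambda subs: 0.0 if len(subs) == 3 else 1.0)
```

With a rate function that never worsens, all three sub-periods merge into one set; with one that penalises any merge, the original three sets are returned unchanged.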
For the IP channels whose symbol sequence modeling is complete, after the DFA sequence model corresponding to each channel is obtained, a time mark is added to each node in the DFA model so that the model can detect timing attacks.
The original timestamped symbol sequence is input into the fused DFA model, and for each sub-DFA model DFA_i the timestamps of the symbols received in each state State_ij are recorded.
When all timestamped symbols have been input, the average time interval of each state is obtained from the timestamps recorded for that state: T_ij.avg = (T_ij.last − T_ij.first) / Length(T_ij), where T_ij denotes all timestamps of the j-th state of the i-th DFA. It should be noted that each node of a sub-DFA has its own T_ij.avg.
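The average-interval formula above can be sketched for one sub-DFA as follows. The function name `average_intervals` is hypothetical, and each state is assumed to correspond to exactly one symbol, so timestamps are grouped by symbol.

```python
from collections import defaultdict


def average_intervals(timed_symbols, order):
    """Return T_avg per state of one sub-DFA.

    timed_symbols: iterable of (timestamp, symbol) pairs in arrival order.
    order: the symbols (states) of the sub-DFA.
    """
    stamps = defaultdict(list)
    for t, s in timed_symbols:
        if s in order:
            stamps[s].append(t)  # record the timestamp at the state of s
    # T_avg = (T_last - T_first) / Length(T), as in the formula above
    return {s: (ts[-1] - ts[0]) / len(ts) for s, ts in stamps.items()}


timed = [(0.0, "e"), (1.0, "f"), (4.0, "e"), (5.0, "f"), (8.0, "e")]
avg = average_intervals(timed, "ef")
```

The division is by the number of recorded timestamps, following the formula as written, rather than by the number of gaps between them.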
Fig. 6 is a block diagram of an industrial control network traffic anomaly detection device according to an embodiment of the present application, where the device is applied to the anomaly detection model shown in fig. 3 in the present embodiment. The device at least comprises the following modules:
the message matching module is used for matching a message pair consisting of a request message and a response message in the industrial control network flow data;
the mapping module is used for converting the successfully matched message pair into a symbol sequence carrying a timestamp, and each symbol in the symbol sequence indicates a unique state event;
the symbol acquisition module is used for sequentially acquiring symbols in a symbol sequence carrying the time stamps in sequence and inputting a pre-constructed anomaly detection model for anomaly detection;
the judging module is used for inputting the symbols in the currently acquired symbol sequence into a corresponding sub-period DFA model if the symbols are detected to belong to known symbols; the sub-period DFA model is obtained by classifying symbol sequences corresponding to industrial control network traffic data to obtain symbol sets corresponding to a plurality of sub-periods and establishing a DFA model for the symbol sets in each sub-period according to a state transition relation;
and the result output module is used for determining an abnormal detection result according to the state transition result after the sub-period DFA model receives the symbol.
For relevant details reference is made to the above-described method embodiments.
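How the judging module and the result output module might cooperate can be sketched as follows. The function name `detect` and the result labels are hypothetical, each known symbol is assumed to belong to exactly one sub-period, and the first symbol seen in a sub-period is assumed to be accepted as normal; timestamp checks are omitted.

```python
def detect(symbols, sub_period_orders):
    """Route each symbol to its sub-period DFA and label the state transition."""
    # Map every known symbol to the index of its sub-period.
    owner = {s: k for k, order in enumerate(sub_period_orders) for s in order}
    position = [None] * len(sub_period_orders)  # current state index per sub-DFA
    results = []
    for s in symbols:
        if s not in owner:
            results.append((s, "unknown"))  # symbol outside all sub-periods
            continue
        k = owner[s]
        order = sub_period_orders[k]
        i = position[k]
        if i is None:
            # Assumption: the first symbol seen in a sub-period sets its state.
            position[k] = order.index(s)
            label = "normal"
        elif s == order[(i + 1) % len(order)]:
            position[k] = (i + 1) % len(order)
            label = "normal"
        elif s == order[i]:
            label = "retransmission"  # state unchanged
        else:
            label = "loss"  # state unchanged
        results.append((s, label))
    return results


labels = [label for _, label in detect("ababeffhz", ["ab", "efgh"])]
```

In this run the repeated f is labeled a retransmission, the h that skips g is labeled a loss, and z, which belongs to no sub-period, is labeled unknown.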
It should be noted that, when the industrial control network traffic anomaly detection device provided in the above embodiment performs industrial control network traffic anomaly detection, the division into the above functional modules is only used as an example. In practical applications, the above functions may be assigned to different functional modules as needed; that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above. In addition, the industrial control network traffic anomaly detection device and the industrial control network traffic anomaly detection method provided by the above embodiments belong to the same concept; the specific implementation process is detailed in the method embodiments and is not repeated here.
Fig. 7 is a block diagram of an industrial control network traffic anomaly detection system according to an embodiment of the present application, where the system may be: a smartphone, a tablet, a laptop, a desktop, or a server. The industrial control network traffic abnormality detection apparatus may also be referred to as a user equipment, a portable terminal, a laptop terminal, a desktop terminal, a control terminal, etc., which is not limited in this embodiment. The system includes at least a processor and a memory.
The processor may include one or more processing cores, such as: 4 core processors, 6 core processors, etc. The processor may be implemented in at least one hardware form of DSP (Digital Signal Processing), FPGA (Field-Programmable gate array), PLA (Programmable logic array). The processor may also include a main processor and a coprocessor, where the main processor is a processor for Processing data in an awake state, and is also called a Central Processing Unit (CPU); a coprocessor is a low power processor for processing data in a standby state. In some embodiments, the processor may be integrated with a GPU (Graphics processing unit), which is responsible for rendering and drawing the content that the display screen needs to display. In some embodiments, the processor may further include an AI (Artificial Intelligence) processor for processing computing operations related to machine learning.
The memory may include one or more computer-readable storage media, which may be non-transitory. The memory may also include high speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In some embodiments, a non-transitory computer-readable storage medium in a memory is configured to store at least one instruction for execution by a processor to implement a method for industrial control network traffic anomaly detection provided by method embodiments herein.
In some embodiments, the industrial control network traffic anomaly detection system may further include: a peripheral interface and at least one peripheral. The processor, memory and peripheral interface may be connected by bus or signal lines. Each peripheral may be connected to the peripheral interface via a bus, signal line, or circuit board. Illustratively, peripheral devices include, but are not limited to: radio frequency circuit, touch display screen, audio circuit, power supply, etc.
Of course, the industrial control network traffic anomaly detection system may further include fewer or more components, which is not limited in this embodiment.
Optionally, the present application further provides a computer-readable storage medium, where a program is stored in the computer-readable storage medium, and the program is loaded and executed by a processor to implement the industrial control network traffic anomaly detection method according to the foregoing method embodiment.
Optionally, the present application further provides a computer product, where the computer product includes a computer-readable storage medium, where a program is stored in the computer-readable storage medium, and the program is loaded and executed by a processor to implement the industrial control network traffic anomaly detection method according to the foregoing method embodiment.
The technical features of the embodiments described above may be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the embodiments described above are not described, but should be considered as being within the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments only express several embodiments of the present application, and the description thereof is more specific and detailed, but not construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, which falls within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (10)

1. A method for detecting abnormal flow of an industrial control network is characterized by comprising the following steps:
matching message pairs consisting of request messages and response messages in industrial control network flow data;
converting the successfully matched message pair into a symbol sequence carrying a timestamp, wherein each symbol in the symbol sequence indicates a unique state event;
sequentially acquiring symbols in a symbol sequence carrying timestamps in sequence, and inputting a pre-constructed anomaly detection model for anomaly detection;
if the symbol in the symbol sequence obtained currently is detected to belong to a known symbol, inputting the symbol into a corresponding sub-period DFA model; the sub-period DFA model is obtained by establishing a DFA model for the symbol set in each sub-period according to a state transition relation after classifying the symbol sequence corresponding to the industrial control network traffic data to obtain the symbol sets corresponding to the plurality of sub-periods;
and determining an abnormal detection result according to the state transition result after the sub-period DFA model receives the symbol.
2. The method of claim 1, wherein the determining the abnormal detection result according to the state transition result after receiving the symbol by the sub-period DFA model comprises:
after the sub-period DFA model receives the symbol, if the corresponding state is transferred to the next state of the periodic sequence from the current state, the state is judged to be normally transferred;
and under the condition of normal state transition, if the difference value between the timestamp carried by the current symbol and the average time interval in the current state is greater than a time interval deviation threshold value, detecting that the time sequence of the current symbol is abnormal.
3. The method of claim 1, wherein the determining the abnormal detection result according to the state transition result after receiving the symbol by the sub-period DFA model comprises:
and after the sub-period DFA model receives the symbol, if the occurred state transition does not appear at the expected position in the state transition process, detecting the abnormal state as 'loss'.
4. The method of claim 1, wherein the determining the abnormal detection result according to the state transition result after receiving the symbol by the sub-period DFA model comprises:
and after the sub-period DFA model receives the symbol, if the corresponding state is still the current state and the state transition does not occur, detecting that the state is abnormal for retransmission.
5. The method of claim 1, wherein an "unknown" anomaly is detected if a symbol in the currently acquired sequence of symbols belongs to an unknown symbol.
6. The method according to claim 1, wherein the matching of the message pair consisting of the request message and the response message in the industrial control network traffic data further comprises:
if the match fails, it is directly detected as a "missing exception".
7. The method of claim 1, wherein the anomaly detection model comprises:
at least two sub-period DFA models for outputting state transition results according to input symbols;
the DFA selector is used for sending the symbols corresponding to the input industrial control network flow into the corresponding sub-period DFA models according to the input symbol content and the corresponding time stamps;
and the abnormity judgment module is used for judging whether the input industrial control network flow is abnormal or not according to the state transition result output by the sub-period DFA model.
8. The method according to claim 6, wherein after the DFA model is established for the symbol set in each of the sub-periods according to the state transition relationship, the method further comprises performing fusion again for each sub-period according to the false alarm rate of the corresponding DFA model, so as to obtain a plurality of fused sub-period DFA models.
9. An industrial control network traffic anomaly detection device, characterized by comprising:
the message matching module is used for matching a message pair consisting of a request message and a response message in the industrial control network flow data;
the mapping module is used for converting the successfully matched message pair into a symbol sequence carrying a timestamp, and each symbol in the symbol sequence indicates a unique state event;
the symbol acquisition module is used for sequentially acquiring symbols in a symbol sequence carrying the time stamps in sequence and inputting a pre-constructed anomaly detection model for anomaly detection;
the judging module is used for inputting the symbols in the currently acquired symbol sequence into a corresponding sub-period DFA model if the symbols are detected to belong to known symbols; the sub-period DFA model is obtained by establishing a DFA model for the symbol set in each sub-period according to a state transition relation after classifying the symbol sequence corresponding to the industrial control network traffic data to obtain the symbol sets corresponding to the plurality of sub-periods;
and the result output module is used for determining an abnormal detection result according to the state transition result after the sub-period DFA model receives the symbol.
10. An industrial control network traffic anomaly detection system, characterized in that the system comprises a processor and a memory; the memory stores a program, and the program is loaded and executed by the processor to implement the steps of the industrial control network traffic anomaly detection method according to any one of claims 1-8.
CN202110434596.6A 2021-04-22 2021-04-22 Industrial control network flow abnormity detection method, device and system Active CN113132392B (en)

Publications (2)

Publication Number Publication Date
CN113132392A true CN113132392A (en) 2021-07-16
CN113132392B CN113132392B (en) 2022-05-06

Family

ID=76778853

Country Status (1)

Country Link
CN (1) CN113132392B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant