WO2020231334A1 - Modelling and black-box security testing of cyber-physical systems - Google Patents


Info

Publication number
WO2020231334A1
Authority
WO
WIPO (PCT)
Prior art keywords
model
time
location
cps
discrete
Prior art date
Application number
PCT/SG2020/050271
Other languages
French (fr)
Inventor
John Henry CASTELLANOS
Jianying Zhou
Original Assignee
Singapore University Of Technology And Design
Priority date
Filing date
Publication date
Application filed by Singapore University Of Technology And Design filed Critical Singapore University Of Technology And Design
Priority to SG11202112495XA priority Critical patent/SG11202112495XA/en
Publication of WO2020231334A1 publication Critical patent/WO2020231334A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning
    • G06N 20/10 Machine learning using kernel methods, e.g. support vector machines [SVM]
    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05B CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B 17/00 Systems involving the use of models or simulators of said systems
    • G05B 17/02 Systems involving the use of models or simulators of said systems electric
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F 21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F 21/57 Certifying or maintaining trusted computer platforms, e.g. secure boots or power-downs, version controls, system software checks, secure updates or assessing vulnerabilities
    • G06F 21/577 Assessing vulnerabilities and evaluating computer system security

Definitions

  • the present invention relates, in general terms, to methods and systems for training a model of a cyber-physical system (CPS), and to methods and systems of black-box security testing of CPS.
  • white-box In the white-box paradigm, detailed information about the CPS is made available to the security tester, including details of hardware and software components and their physical interactions and communication flows, enabling the security tester to generate an accurate and precise model of the system. Due to the detailed information required, a white-box approach is generally more appropriate for security testing in the design phase. White-box methods can also be very time-consuming, as they typically involve code inspection.
  • a black-box approach may be more appropriate.
  • the security tester is only able to observe the system from the outside, and thus needs to collect data from the system to build the model.
  • Known techniques for black-box modelling of CPS include system identification, which models a system based on considerations from control and system engineering disciplines, and automata learning, which models a system from experimental data.
  • the present disclosure relates to a method of training a model of a cyber-physical system (CPS), the CPS being represented as a hybrid system, the method being executed by at least one processor and including:
  • training, for each location l of the set of locations, a discrete-time classifier d_l; wherein features of the discrete-time classifier include, or are based on, the outputs and external messages; and wherein class labels of the discrete-time classifier include, or are based on, the respective destination locations.
  • the present disclosure also relates to a method of detecting an anomaly in a cyber-physical system (CPS), the CPS being represented as a hybrid system, the method being executed by at least one processor and including:
  • the model including, for each location l of the hybrid system, a continuous-time model c_l of sensor outputs of the hybrid system and a discrete-time classifier d_l that predicts an updated location l′ based on sensor outputs and external messages received by the CPS;
  • the present disclosure further relates to a system including at least one processor in communication with computer-readable storage, the computer-readable storage having stored thereon instructions for causing the at least one processor to perform a method as disclosed herein.
  • the present disclosure yet further relates to one or more non-transitory computer-readable storage media having stored thereon instructions for causing at least one processor to perform a method as disclosed herein.
  • At least some embodiments of the present invention can automatically identify transitions in actuator states, and can infer the control strategy programmed in the controllers. Additionally, embodiments automatically identify locations of the hybrid system representation of the CPS, thus making it easier for deployment in diverse CPS. Additionally, in at least some embodiments, the modular nature of the invention allows it to readily be scaled to model more complex systems.
  • Figure 1 depicts system models of (a) a control system and (b) a hybrid system;
  • Figure 2 is a flow diagram of an embodiment of a method for training a model of a cyber-physical system;
  • Figure 3 shows the architecture of a set of modules for implementing the method of Figure 2;
  • Figure 4 shows an example of a water tank filling system of an example testbed system for evaluation of models generated according to some embodiments;
  • Figure 5 shows (left) a finite state machine of events learned by an embodiment for an example testbed system and (right) a location list and description of control signals inferred for the example testbed system;
  • Figure 6 shows continuous-time models for events in the model generated for the testbed system;
  • Figure 7 shows (left) the top-2 most relevant features for location l_5 classification for the finite state machine of Figure 5, and boundaries for events l_5 → l_4 and l_5 → l_6, and (right) the extracted decision rule (control strategy) from l_5-classification analysis;
  • Figure 8 shows a system simulation for a test run of the model generated for the testbed system;
  • Figure 9 shows modelled level sensor behaviour for different locations for the testbed system;
  • Figure 10 depicts detection of an anomaly in a simulated attack on the testbed system;
  • Figure 11 depicts detection of an anomaly in another type of simulated attack on the testbed system;
  • Figure 12 depicts detection of an anomaly in a further type of simulated attack on the testbed system;
  • Figure 13 is a table showing a summary of the simulated attacks of Figures 10, 11 and 12;
  • Figure 14 is an example computer architecture of a system for training a model of a cyber-physical system, and/or for detecting an anomaly in a cyber-physical system.
  • the present disclosure relates to methods and systems for black-box modelling of Cyber-Physical Systems (CPS).
  • Embodiments make use of a hybrid automaton model as a framework to connect cyber and physical components under the same model, thereby providing a modular and scalable approach.
  • Some embodiments relate to a model-based monitoring system to detect cyberattacks in CPS.
  • Embodiments may be highly effective when applied to a specific subset of CPS called Industrial Control Systems (ICS).
  • raw data: historical logs, real-time readings and network traffic.
  • continuous-time and discrete-time aspects of the ICS may be modelled using different techniques.
  • a non-parametric regression algorithm can be used to model the behaviour of the continuous-time system
  • classification algorithms can be used to model the behaviour of the discrete-time system.
  • the method and system of embodiments of the present disclosure are able to automatically produce a model, from real data, that mimics with good accuracy the behaviour of a CPS.
  • the model can then be used to perform security monitoring of a CPS, based only on observations of data such as control signals, sensor readings and general network traffic.
  • the method of the present disclosure may be performed "online", such that the model can be built, and security of the CPS can be tested, in a real-time manner.
  • Embodiments are particularly advantageous when modelling and testing a particular subset of Cyber-Physical Systems called Industrial Control Systems, though it will be appreciated that the invention is more generally applicable, and can be used to model and test other types of CPS.
  • ML algorithms like Support Vector Machines and Gaussian Processes can be applied to model the CPS as a hybrid system in a modular manner. Once a model of the system is obtained, it can be used to simulate the normal behaviour of the system, such that cyber-attacks that aim to change the system functionality can be detected by a monitor that detects deviation from the simulated normal behaviour.
  • Hybrid systems are extended-state machines that are composed of inputs, outputs, state variables, locations, events and guards.
  • a cyber-physical system can be represented as a hybrid system, where continuous-time components interact with discrete-time components through Analog-to-Digital converters (ADC) and Digital-to-Analog converters (DAC).
  • the continuous-time system can be considered as the physical process (referred to in Figure 1(a) as Plant) or the physical domain in a CPS, operating according to physical laws.
  • the discrete-time system refers to the cyber domain in a CPS or Controllers in the context of an ICS, and DAC and ADC converters represent actuators and sensors, respectively.
  • a cyber-physical system can be divided into a set of subsystems, each subsystem comprising a controller, together with the sensors and actuators with which it is in communication.
  • critical states can be split using the same approach.
  • each subsystem has critical states related to component functionality that can be monitored by its particular controller. These are called local critical states, e.g. a tank reaching an overflow level.
  • the present disclosure uses Hybrid Automata as the framework to model cyber-physical systems.
  • a hybrid system H (see Figure 1(b)) can be defined as a five-tuple (L, X, E, Inv, Act), where the respective components are defined as follows. A set of discrete-time variables L called locations (or modes); the locations represent control modes of the hybrid system.
  • n ∈ ℕ is known as the dimension of the hybrid system (the number of continuous-time variables in X).
  • each event in E has additional parameters: an edge identifier e, Guard_e, and Jump_e.
  • Guard_e is a subset of X defined by a condition statement (such as x ≤ 100). When x reaches a value in this set, H triggers the transition l → l′.
  • Jump_e (also known as Reset) is a function that maps x → x′ from a state-space in Guard_e to a state-space that belongs to the new location l′.
  • Inv is known as the location invariant: while H is in location l, x must satisfy x ∈ Inv(l).
  • Act describes a set of first-order derivatives ẋ, also called flows, that describe how X evolves during continuous time t while H stays at l.
  • In each location l, there exist inputs u, state variables x and outputs y.
  • the system state holds over the interval [0, δ]; 0 ≤ t ≤ δ.
  • the parameter δ is also known as the sampling period, and represents a bridge that synchronises continuous and discrete time, t and k respectively. Continuous-time components are regulated by t and discrete-time components by k.
  • the control strategy (u⁺) is evaluated at each new step k+1 and depends on the current values of u[k], y[k] and z[k].
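As an illustration of the five-tuple definition above, a minimal hybrid-system structure with one sampling-period step can be sketched in Python. This is our own sketch, not from the disclosure: the names HybridSystem, Event and step, and the toy fill/drain tank values, are illustrative.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# The continuous state x is a dict of continuous-time variables X.
State = Dict[str, float]

@dataclass
class Event:
    source: str                       # location l
    dest: str                         # location l'
    guard: Callable[[State], bool]    # Guard_e: a subset of X expressed as a predicate
    jump: Callable[[State], State]    # Jump_e (Reset): maps x -> x' in the new location

@dataclass
class HybridSystem:
    locations: List[str]                            # L: control modes
    events: List[Event]                             # E
    invariant: Dict[str, Callable[[State], bool]]   # Inv(l): x must satisfy x in Inv(l)
    act: Dict[str, Callable[[State], State]]        # Act: flows (first-order derivatives) per l

    def step(self, loc: str, x: State, delta: float) -> Tuple[str, State]:
        """Advance one sampling period delta: Euler-integrate the flow of the
        current location, then fire the first event whose guard is satisfied."""
        flow = self.act[loc](x)
        x = {k: v + delta * flow.get(k, 0.0) for k, v in x.items()}
        for e in self.events:
            if e.source == loc and e.guard(x):
                return e.dest, e.jump(x)   # transition l -> l' with its Jump/Reset
        return loc, x

# Toy two-location tank: fill until the level guard fires, then drain.
tank = HybridSystem(
    locations=["fill", "drain"],
    events=[Event("fill", "drain", lambda x: x["level"] >= 100.0, lambda x: x),
            Event("drain", "fill", lambda x: x["level"] <= 50.0, lambda x: x)],
    invariant={"fill": lambda x: x["level"] <= 100.0,
               "drain": lambda x: x["level"] >= 50.0},
    act={"fill": lambda x: {"level": 10.0},     # level rises 10 units per time unit
         "drain": lambda x: {"level": -10.0}},  # level falls 10 units per time unit
)
loc, x = tank.step("fill", {"level": 95.0}, delta=1.0)
```

For brevity, step does not check the invariant after integration; a fuller simulator would also verify x ∈ Inv(l) at each step.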
  • Method 100 of training a model of a cyber-physical system. A method 100 of training a model of a cyber-physical system will now be described with reference to Figures 2 and 3.
  • the method may be performed by one or more processors in communication with computer-readable storage having stored thereon instructions for causing the one or more processors to perform a sequence of operations as described below.
  • the method 100 takes, as input, data from controllers of the cyber-physical system, where the data comprises inputs sent from the cyber domain (e.g. controllers) to the physical domain (e.g. signals sent to actuators), outputs received from the physical domain (e.g. sensor readings), and external network data (e.g. signals transmitted between a controller and another controller, or between a controller and a SCADA system).
  • the method 100 then separately models continuous-time and discrete-time parts of the CPS using the controller data, and learns parameters of the continuous-time model and the discrete time model. In some embodiments, a single model can be generated for the CPS.
  • a model can be generated separately for each controller, the respective models being synchronised in appropriate manner to generate an overall model of the CPS.
  • Algorithm 1 An example implementation of method 100 is shown as Algorithm 1 below and is referred to herein as HybLearner.
  • the method 100 receives raw data comprising historical data or real-time data from controllers of the CPS, and network traffic captures from the CPS.
  • the raw data may be at least partly received from a historian server that is in communication with the CPS.
  • the raw data may be received at least partly in real time from a supervisory computer, or even directly from lower level devices such as remote terminal units (RTUs) and/or programmable logic controllers (PLCs).
  • the method 100 may also take configuration data as input.
  • the raw data received at step 102 is classified into three categories: plant inputs, plant outputs and external messages (u, y, z).
  • Plant inputs are signals that travel from controllers to the plant (for example, actuator signals for ICS), and are labelled as set U in Figure 3.
  • Plant outputs are feedback signals, such as sensor readings, and are labelled as set Y in Figure 3.
  • plant inputs are associated with actuators
  • plant outputs are associated with sensors.
  • plant inputs can be extracted by identifying actuator labels in the raw data
  • plant outputs can be extracted by identifying sensor labels in the raw data.
  • the sensor labels and actuator labels may be stored in the configuration file, for example.
  • Sensor labels and actuator labels may also be associated with a controller ID (e.g., IP address or other identifier) in the configuration file.
  • the plant inputs and plant outputs may be identified by a variable extraction module 202 ( Figure 3).
  • External messages are extracted (for example, by external message extraction module 204 as shown in Figure 3) from analysis of network traffic in respective controllers (see Algorithm 2 below, for example) to generate the external message data, labelled as set Z in Figure 3.
  • raw network packets may be analysed to determine source and destination IP addresses, and other relevant information for a message transmitted from the source to the destination, such as the message type.
  • the IP addresses may then be matched against those in the configuration data to identify the interacting devices (controllers or otherwise). Reading and writing messages to/from other entities, like human machine interfaces (HMI) and supervisory systems, are also considered to be part of set Z.
  • locations are determined from the plant inputs ⁇ (for example, by a location identification module 206, see Figure 3).
  • a check is performed for a transition v_i: u_i → u_i′, where v_i is a 2-tuple (u_i, u_i′).
  • the graph may be stored in a graph database 1346 ( Figure 14), for example a Neo4j or other NoSQL database such as Oracle Spatial and Graph, or OrientDB.
  • the graph representation enables location definitions and valid jumps to be stored and readily retrieved to validate transitions between locations, for example during a security testing process that makes use of the model generated by process 100.
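As a rough sketch of the location-identification idea (our own minimal illustration, not the disclosure's algorithm): each distinct actuator configuration u becomes a location, and each observed change u_i → u_i′ becomes an edge of the event matrix.

```python
def identify_locations(u_stream):
    """u_stream: iterable of actuator-state tuples u[k], e.g. ("MV:ON", "P1:OFF").
    Returns (locations, events): a location id per distinct tuple, plus the set
    of observed transitions v_i = (u_i, u_i') as (loc, loc') edges."""
    locations = {}   # actuator tuple -> location id l
    events = set()   # observed edges (l, l')
    prev = None
    for u in u_stream:
        if u not in locations:
            locations[u] = f"l{len(locations) + 1}"
        if prev is not None and prev != u:
            events.add((locations[prev], locations[u]))
        prev = u
    return locations, events

# Toy trace: the valve stays open while a pump switches on, then the valve closes.
trace = [("MV:ON", "P1:OFF")] * 3 + [("MV:ON", "P1:ON")] * 2 + [("MV:OFF", "P1:ON")]
locs, evts = identify_locations(trace)
```

The resulting edge set is exactly what a graph database such as the one described above would store as valid jumps between locations.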
  • the continuous dynamics of the system can then be learned (for example, by a regression module 208). This includes resets (jumps) and derivatives (Act) in new modes, as discussed above.
  • the method 100 takes the data from each sensor associated with a controller and models it as a continuous-time variable. This can be treated as a regression problem in Machine Learning.
  • a non-parametric regression algorithm like Gaussian Processes can be used to learn the continuous dynamics because it can produce high order models.
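A minimal Gaussian-process regression sketch for one sensor, written in plain NumPy for illustration (an RBF kernel, zero prior mean after centring; the hyperparameters and toy data are made up, and in practice a library such as GPy or scikit-learn would be used):

```python
import numpy as np

def rbf(a, b, length=1.0, var=1.0):
    """Squared-exponential (RBF) kernel matrix between 1-D input arrays a and b."""
    d2 = (a.reshape(-1, 1) - b.reshape(1, -1)) ** 2
    return var * np.exp(-0.5 * d2 / length**2)

def gp_predict(x_train, y_train, x_test, noise=1e-3):
    """GP posterior mean at x_test given noisy observations (x_train, y_train)."""
    mu = y_train.mean()                                   # centre targets (zero-mean prior)
    K = rbf(x_train, x_train) + noise * np.eye(len(x_train))
    k_star = rbf(x_test, x_train)
    return mu + k_star @ np.linalg.solve(K, y_train - mu)

# Toy example: a tank level rising linearly with time within one location.
t = np.linspace(0.0, 4.0, 20)
level = 100.0 + 10.0 * t
pred = gp_predict(t, level, np.array([2.0]))  # interpolate the level at t = 2.0
```

The posterior mean interpolates the training data, so the prediction at t = 2.0 lands close to the true level of 120.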
  • control strategies are triggered by particular conditions, e.g. "close a valve once a tank has reached a certain level". Accordingly, it is proposed that such transitions can be modelled as a multi-class classification problem.
  • a classifier may be trained (for example, by a discrete-time classification module 210) for each location, with Y and Z as input features.
  • the categorical variables are the possible destination locations l′ to which an event might transition, and are derived from the transitions in the event matrix E (see Figure 3).
  • feature-enhancing and/or feature-selection steps may be performed (for example, for each location) to improve classifier accuracy.
  • the feature set can be extended by adding a set ΔY that captures the dynamic behaviour of Y (sensor readings) from previous values, e.g. y[k−5] → y[k].
  • a training set composed of Y, ΔY and Z contains features with different relevance in each location. For example, for a location where a valve is filling a tank, the level sensor y_1 is more relevant than a flow sensor y_2 that is connected to another part of the system. As such, it makes sense to use y_1 instead of y_2 as a feature in the training set. Accordingly, as a feature-selection step, the features can be ranked, for example using the Gini importance metric (extra-trees classifier), and the classifier then trained with the most relevant features for that particular location.
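The ranking step can be illustrated with a crude pure-Python stand-in for Gini importance: score each feature by the best single-threshold Gini impurity decrease and keep the top-ranked ones. (An extra-trees classifier averages such impurity decreases over many randomized trees; the toy data below is invented, with y1 predictive of the destination location and y2 unrelated.)

```python
def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for c in labels:
        counts[c] = counts.get(c, 0) + 1
    return 1.0 - sum((k / n) ** 2 for k in counts.values())

def feature_importance(rows, labels, feature_idx):
    """Best single-threshold Gini decrease for one feature: a crude stand-in
    for the per-feature Gini importance used in the feature-selection step."""
    parent = gini(labels)
    best = 0.0
    for thr in sorted({r[feature_idx] for r in rows}):
        left = [c for r, c in zip(rows, labels) if r[feature_idx] <= thr]
        right = [c for r, c in zip(rows, labels) if r[feature_idx] > thr]
        n = len(rows)
        drop = parent - (len(left) / n) * gini(left) - (len(right) / n) * gini(right)
        best = max(best, drop)
    return best

# Toy data: destination label depends on y1 (tank level) but not y2 (unrelated flow).
rows = [(940.0, 1.2), (960.0, 0.3), (990.0, 0.9), (1010.0, 1.1), (1020.0, 0.2)]
labels = ["l4", "l4", "l4", "l6", "l6"]
ranked = sorted(range(2), key=lambda j: -feature_importance(rows, labels, j))
```

Here the level feature (index 0) splits the classes perfectly and is ranked first, so a per-location classifier would be trained on it rather than on the irrelevant flow feature.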
  • the continuous-time model c_l trained at step 110 and the discrete-time classifier d_l trained at step 112 are output. The result is a hybrid model H_c for a particular controller c.
  • a system model is composed by combining the controller sub-models H_c for the r controllers of the system.
  • controllers may share information through a common data structure, and synchronise under the same discrete-time scale (k) .
  • method 100 may be performed separately for each controller of the CPS, with multiple instances of the method 100 executing simultaneously.
  • multithreading and/or multiprocessing may be used to execute multiple instances of method 100, with all instances writing variable states to, and reading variable states from, the same data structure.
  • Algorithm 3 An example method for evaluation of the model generated by method 100 is shown below as Algorithm 3.
  • the evaluation method is also referred to herein as HybTester.
  • the evaluation method takes the initial condition (init), the discrete-time classifier and the continuous-time model as input parameters, and produces an estimation of y⁺ and u⁺ for a number of steps ahead. From init, it extracts initial values for sets u, y and z. From the initial u_0, it deduces the initial location l and loads the continuous-time model c_l and discrete-time classifier d_l for that location. The method then predicts next-step values of sensor readings y⁺ based on the continuous-time predictor c_l. At each time step, the method, through classifier d_l, checks if there are conditions that trigger an event l → l′ from the current location. If d_l predicts that such an event occurs, the new continuous-time model c_l′ and discrete-time classifier d_l′ are loaded, and the process continues.
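The evaluation loop can be sketched as follows: a simplified stand-in for Algorithm 3 in which the per-location continuous-time models c_l and classifiers d_l are represented as plain callables (the toy fill/drain rules and thresholds are invented):

```python
def hyb_test(init_loc, y0, models, classifiers, steps):
    """Simulate `steps` steps ahead.
    models[l](y)      -> predicted next sensor reading y+ in location l (c_l)
    classifiers[l](y) -> predicted location: l itself, or l' if an event fires (d_l)
    Returns the trajectory of (location, y) pairs."""
    loc, y = init_loc, y0
    trace = []
    for _ in range(steps):
        y = models[loc](y)              # continuous-time prediction via c_l
        new_loc = classifiers[loc](y)   # discrete-time check via d_l for l -> l'
        if new_loc != loc:
            loc = new_loc               # load the new location's model/classifier
        trace.append((loc, y))
    return trace

# Toy system: the tank fills at +10/step in l1 and drains at -10/step in l2;
# the (assumed) rule switches to l2 above 100 and back to l1 below 60.
models = {"l1": lambda y: y + 10.0, "l2": lambda y: y - 10.0}
classifiers = {"l1": lambda y: "l2" if y >= 100.0 else "l1",
               "l2": lambda y: "l1" if y <= 60.0 else "l2"}
trace = hyb_test("l1", 75.0, models, classifiers, 8)
```

Running from level 75 in l1, the loop crosses into l2 on the third step and back to l1 on the last, mirroring how HybTester swaps models when d_l predicts an event.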
  • a critical state can be considered as a state in which the system operation cannot satisfy minimal safety conditions and threatens product or service quality, or human lives.
  • Distance to critical states has previously been proposed as a security measurement.
  • the importance of including the time dimension in resilience analysis of CPS has previously been highlighted.
  • There are different levels of critical states.
  • Local critical states are those that depend on particular components, e.g. in a water treatment plant, a pump being activated when there is no liquid flowing through it.
  • Others are global critical states, where the combination of multiple component states configures a critical state of the system even if individual components are in non-critical states, e.g. in transportation systems, a train approaching a station where another train is stationary, since each controller only monitors the local critical states of its own components.
  • t_q is expressed in terms of steps (k); the conversion to time units (t) depends on the sampling period (δ), as described above, which varies from CPS to CPS. For example, δ in water treatment plants is on the scale of seconds, while in smart grids it is on the scale of milliseconds.
  • a non-parametric cumulative sum algorithm (CUSUM) can be used to detect data-oriented attacks.
  • a CUSUM detector relies on two parameters (b and τ).
  • b represents the recovery ratio, and is inversely proportional to the time-to-detection parameter. If b is too small, it might increase the number of false positives; accordingly, b is chosen such that the CUSUM measurement stays close to zero during normal operation.
  • the CUSUM measurement S(k) in step k is computed as S(k) = max(0, S(k−1) + |y(k) − ŷ(k)| − b), with S(0) = 0, where y(k) is the value of variable y from the system and ŷ(k) is the predicted value of y, both evaluated at step k.
  • the optimal b can be chosen by HybTester during a period of "normal" operation. As will be appreciated by the skilled addressee, b can be selected based on a desired sensitivity of the anomaly detection process; as such, it may be set empirically based on observations of the system without attacks.
  • the parameter τ is the threshold at which a system's behaviour deviates enough to be considered abnormal, and it can trigger an attack detection. If the threshold increases, the probability of a false alarm decreases abruptly.
  • the detection rule is given by: an alarm is raised at step k if S(k) > τ.
  • a threshold τ can be automatically computed.
  • the parameter τ can be selected based on experimental data, taking into account a desired sensitivity of the detector.
  • a monitoring system will compute the difference between monitored values and predicted ones. If S(k) is greater than the computed τ, an alarm can be generated.
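A minimal sketch of such a CUSUM monitor, using the common non-parametric form S(k) = max(0, S(k−1) + |y(k) − ŷ(k)| − b). The parameter values and the simulated sensor-hold attack below are invented for illustration:

```python
def cusum_monitor(observed, predicted, b, tau):
    """Non-parametric CUSUM over residuals |y(k) - y_hat(k)|.
    Returns (S_values, first_alarm_step); first_alarm_step is None
    if S(k) never exceeds the threshold tau."""
    S, trace, alarm = 0.0, [], None
    for k, (y, y_hat) in enumerate(zip(observed, predicted)):
        S = max(0.0, S + abs(y - y_hat) - b)   # b: recovery ratio
        trace.append(S)
        if alarm is None and S > tau:
            alarm = k
    return trace, alarm

# Normal operation: small residuals. From step 5, a sensor-hold attack freezes
# the observed level at 150.0 while the model predicts it keeps rising.
predicted = [100.0 + 10.0 * k for k in range(10)]
observed = [p + 0.5 for p in predicted[:5]] + [150.0] * 5
trace, alarm = cusum_monitor(observed, predicted, b=2.0, tau=20.0)
```

During normal steps the statistic is pulled back to zero by b; once the held sensor diverges from the prediction, S(k) grows and crosses τ within a few steps.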
  • SWaT is a six-stage system built for research purposes in cyber-security of critical infrastructures (see A. P. Mathur and N . O. Tippenhauer, "SWaT: a water treatment testbed for research and training on ICS security," 2016 International Workshop on Cyber-physical Systems for Smart Water Networks (CySWater), Vienna, 2016, pp. 31-36, the entire contents of which are incorporated by reference).
  • Stage one takes raw water from external sources, stores it in a tank, and feeds other stages on-demand.
  • Stage two measures water properties like conductivity and adds chemicals (HCl, NaCl and NaOCl) according to a control strategy.
  • Stage three is the ultra-filtration (UF) process, which has a set of pumps to feed a UF membrane filter to remove water solids of a certain size.
  • Stage four is a de-Chlorination system, which uses Ultraviolet (UV) radiation to eliminate residual chlorine and chloramines, and complements the filtering process with a Sodium Bisulfite (NaHSO 3 ) dosing.
  • Stage five is the Reverse Osmosis (RO) process, which operates continuously, monitoring the conductivity of the water and executing a pre-configured sequence controlled by a PLC.
  • Stage six stores the resulting product from the previous stage in a tank; water from this stage is used to clean the RO membrane, and excess product is recycled.
  • FIG. 4 depicts a simplified version of Stage one of the testbed.
  • the goal of the controller is to keep enough water between high (H) and low (L) levels to guarantee that liquid can be provided for later stages. If the tank level is H, the controller closes the main valve MV. When the tank level hits L, the controller opens MV and starts filling the tank. The actions of pumps P1, P2 depend on the demand of later stages.
  • Stage one has two local critical states. q_1 is related to the risk of tank overflow (HH: 1000.0 mm). q_2 is associated with the risk of tank underflow, because running pump P1 or P2 without liquid might cause severe damage to either of them (LL: 50.0 mm). Stage one operation can be represented as follows:
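The prose rules above can be sketched as a one-step controller function. This is a hedged reconstruction: only HH = 1000.0 mm and LL = 50.0 mm are given in the text, so the H and L setpoints below are assumed values, and pump demand from later stages is treated as an external input.

```python
def stage_one_controller(level, mv, p1, demand, H=800.0, L=250.0):
    """One control step for the Figure 4 tank (illustrative sketch; the H and L
    setpoints are assumed, only HH=1000.0 and LL=50.0 are stated above).
    Returns the updated (mv, p1) actuator commands."""
    if level >= H:
        mv = "OFF"   # tank full enough: close the main valve MV
    elif level <= L:
        mv = "ON"    # tank low: open MV and start filling
    # between L and H the valve keeps its previous state (hysteresis)
    p1 = "ON" if demand else "OFF"   # pumps follow the demand of later stages
    return mv, p1

mv, p1 = stage_one_controller(level=820.0, mv="ON", p1="OFF", demand=True)
```

Between the two setpoints the valve command is left unchanged, which reproduces the distinct filling and non-filling locations that HybLearner later recovers from the data.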
  • Python 3.6 was used as the main programming language to implement method 100.
  • similar approaches can be developed using faster languages like C++ or Rust. These might be considered where an online modeller is needed, or for a system with a stricter sampling period δ (e.g. in power grids).
  • the main infrastructure components in this example are as follows:
  • Controller identification: IP address, process;
  • Learning period: number of steps.
  • LoadExtLabel program. As described in Algorithm 2, we programmed a network traffic parser using the Scapy library. The testbed network runs over the EtherNet/IP (CIP) protocol, so we filter TCP/44818 and UDP/2222 traffic, and validate read (0x4c) and write (0x4d) CIP services. The script returns the variable names of remote entities (other controllers or SCADA) that interact with the local controller.
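The filter itself can be illustrated independently of the capture library, as a predicate over already-parsed (protocol, destination port, CIP service) tuples. The tuple layout is our own simplification of what such a parser would extract from each packet:

```python
CIP_READ, CIP_WRITE = 0x4C, 0x4D   # CIP service codes validated by the parser
ENIP_TCP, IO_UDP = 44818, 2222     # EtherNet/IP explicit (TCP) and implicit (UDP) ports

def is_relevant_cip(proto, dport, service):
    """True if a parsed packet matches the traffic the parser keeps:
    TCP/44818 or UDP/2222 carrying a CIP read (0x4c) or write (0x4d) service."""
    port_ok = (proto == "TCP" and dport == ENIP_TCP) or \
              (proto == "UDP" and dport == IO_UDP)
    return port_ok and service in (CIP_READ, CIP_WRITE)

# Toy parsed capture: only the first and third packets should survive the filter.
packets = [("TCP", 44818, 0x4C), ("TCP", 44818, 0x01),
           ("UDP", 2222, 0x4D), ("TCP", 80, 0x4C)]
kept = [p for p in packets if is_relevant_cip(*p)]
```

In the real parser the surviving packets would then be matched against the configuration file's controller IDs to build the external-message set Z.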
  • HybLearner and HybTester programs (Algorithm 1 and Algorithm 3) use the neomodel and sqlalchemy libraries to connect to the databases. The SVM classifier and the Gaussian process regression were implemented using sklearn and GPy, respectively.
  • the main DB 1344 (see Figure 14) is a Postgres database that stores all records related to raw data, forecasts, continuous-time models and discrete-time classifiers.
  • a graph database 1346 (see Figure 14; e.g. Neo4j) stores locations, events, event matrices, control signal variables (U: names, values), and sensor signals (Y: names, min and max values).
  • HybLearner successfully identified all operational modes of Stage one of the testbed.
  • the framework learnt seven locations, and nine events.
  • Figure 5 (right hand side) shows all of the transitions learned by the method, and actions taken by actuators in each event.
  • HybLearner built nine different models, matching the nine events in Figure 5.
  • Figure 6 shows all behaviours of level sensor lit101 during "normal" operation. Projections of the set of derivatives ẏ (Act) are used below to compute the time-to-critical-state for the system's security evaluation.
  • the classification decision for location l 5 is depicted in Figure 7.
  • the two most relevant features are lit101 and lit301, on the x-axis and y-axis respectively. The plot describes conditions under which the controller triggers a different control strategy for the transition event to l_4 or l_6.
  • the event l_5 → l_4 is associated with the change P1: OFF→ON, and the event l_5 → l_6 with the transition MV: ON→OFF.
  • the decision rule at the right-hand side of Figure 7 can be inferred.
  • a strong relationship with the control strategy programmed in the controller's source code was found. This demonstrates that methods according to the present disclosure can automatically learn the control strategy of a CPS from experimental data.
  • Figure 8 shows how the system evolves during this period. HybTester successfully predicted locations, events and sensor readings.
  • the residual analysis shows that the difference between the predicted value ŷ and the ground truth y remains in the range −10 ≤ e ≤ 10, while the CUSUM value S[k] holds around 0 for the first 3.5k steps (~60 minutes). After 3.5k steps, ŷ deviates from y due to error propagation; this causes an avalanche effect in e, which makes S[k] exceed τ and triggers a false positive when the CUSUM detection mechanism is in place.
  • the error propagation issue can be addressed by periodic synchronisation between ŷ and y, e.g. every 30 minutes (1.8k steps) for this scenario.
  • Stage one has two local critical states associated with lit101.
  • with HybLearner, we automatically identify the nearest points to q_1 and q_2, x_q: {815.08, 120.62} respectively.
  • in Figure 9, we project all Acts (ẋ_l) for all locations l ∈ L. x_q is placed at the origin of the plot, and the boundaries Q − x_q (dashed lines) are shown for each critical state.
  • the attack vector holds the tank level sensor lit101 at 801.0, just above the H level; this should cause the controller to trigger a different control strategy, in which MV turns OFF and P1 turns ON.
  • the attacker's goal is to bring the system to a critical state q_2 (lit101 < 50.0); the attack lasts 437 steps (7 minutes 12 seconds).
  • the attack vector holds the tank level sensor lit101 at 244.0. It aims to mislead the controller into behaviour different from what is expected: the tank might keep filling up while the controller thinks the level is stuck at 244.0.
  • the attacker's goal is to bring the system to a critical state q_1 (tank overflow: lit101 > 1000.0).
  • the attack lasts 516 steps (8 minutes 36 seconds).
  • A3. Hold tank level and increase the level at a constant rate.
  • the attacker implements two attack vectors. The first holds the tank level sensor lit101 at the current value (574.03) for 92 steps. The second attack vector then increases lit101 at a constant rate of 1.0 mm/s. It aims to deceive the controller into thinking the tank fills faster than usual, triggering control strategies that will help to discharge the container.
  • the attacker's goal is similar to A1: bring the system to a critical state q_2 (lit101 < 50.0). Attack duration: 473 steps (7 minutes 53 seconds).
  • HybTester as a model-based attack detection mechanism
  • FIG 10 shows an example of a monitoring mechanism implemented with HybTester.
  • the attack corresponds to A3 described above.
  • the time-to-critical-state metrics at the detection point show the buffer of time an operator has to activate response strategies: more than 12 minutes (> 720 s) for these cases. At the point when the attack stops, the time-to-critical-state metrics give an approximation of the impact of the attack, providing information about how close an attacker could bring the system to a critical state q.
  • in Figure 14, an exemplary architecture of a system suitable for implementing methods of the present disclosure is shown.
  • FIG 14 shows an example computing device 400 that is capable of implementing the presently disclosed methods.
  • multiple computing devices 400 may be considered to be a single computing system.
  • each of a plurality of computing devices 400 may run a separate instance of a module that is used to implement part of the methods disclosed herein.
  • the components of the computing device 400 can be configured in a variety of ways. The components can be implemented entirely by software to be executed on standard computer server hardware, which may comprise one hardware unit or different computer hardware units distributed over various locations, which may communicate over a network. Some of the components or parts thereof may also be implemented by application specific integrated circuits (ASICs) or field programmable gate arrays.
  • the computing device 400 is a commercially available server computer system based on a 32-bit or a 64-bit Intel architecture, and the processes and/or methods executed or performed by the computing device 400 are implemented in the form of programming instructions of one or more software components or modules 1322 stored on non-volatile (e.g., hard disk) computer-readable storage 1324 associated with the computing device 400.
  • At least parts of the software modules 1322 could alternatively be implemented as one or more dedicated hardware components, such as application-specific integrated circuits (ASICs) and/or field programmable gate arrays (FPGAs).
  • the computing device 1300 comprises one or more of the following standard, commercially available computer components, all interconnected by a bus 1335:
  • random access memory (RAM);
  • a network interface connector (NIC) 1330, which connects the computing device 400 to a data communications network and/or to external devices.
  • the computing device 400 comprises a plurality of standard software modules, comprising:
  • an operating system (OS);
  • structured query language (SQL).
  • the database 1344 forms part of the computer readable data storage 1324.
  • the database 1344 is located remote from the computing device 400 shown in Figure 14.
  • graph database 1346 may be part of computer-readable storage 1324, or be located remote from computing device 400.
  • the boundaries between the modules and components in the software modules 1322 are exemplary, and alternative embodiments may merge modules or impose an alternative decomposition of functionality of modules.
  • the modules discussed herein may be decomposed into submodules to be executed as multiple computer processes, and, optionally, on multiple computers.
  • alternative embodiments may combine multiple instances of a particular module or submodule.
  • the operations may be combined or the functionality of the operations may be distributed in additional operations in accordance with the invention.
  • Such actions may be embodied in the structure of circuitry that implements such functionality, such as the micro-code of a complex instruction set computer (CISC), firmware programmed into programmable or erasable/programmable devices, the configuration of a field-programmable gate array (FPGA), the design of a gate array or full-custom application-specific integrated circuit (ASIC), or the like.
  • Each of the blocks of the flow diagrams of the methods disclosed herein may be executed by a module (of software modules 1322) or a portion of a module.
  • the processes may be embodied in a non-transient machine-readable and/or computer-readable medium for configuring a computer system to execute the method.
  • the software modules may be stored within and/or transmitted to a computer system memory to configure the computer system to perform the functions of the module.
  • the modules 1322 may comprise the modules 200 shown in Figure 3, such as a variable extraction module 202, an external message extraction module 204, a location identification module 206, a continuous-time model trainer 208, and a discrete-time model trainer 210.
  • the computing device 400 normally processes information according to a program (a list of internally stored instructions such as a particular application program and/or an operating system) and produces resultant output information via input/output (I/O) devices 1330.
  • a computer process typically comprises an executing (running) program or portion of a program, current program values and state information, and the resources used by the operating system to manage the execution of the process.
  • a parent process may spawn other, child processes to help perform the overall functionality of the parent process. Because the parent process specifically spawns the child processes to perform a portion of the overall functionality of the parent process, the functions performed by child processes (and grandchild processes, etc.) may sometimes be described as being performed by the parent process.

Abstract

A method of training a model of a cyber-physical system (CPS) includes: obtaining raw data relating to plant inputs, plant outputs, and external messages received by controllers of the CPS; automatically identifying, based on the inputs, a set of locations of a hybrid system, and respective destination locations to which respective members of the set of locations can transition; learning, for each location of the set of locations, based on the inputs and outputs, parameters of a continuous-time model of events of the hybrid system; and training, for each location of the set of locations, a discrete-time classifier; wherein features of the discrete-time classifier include, or are based on, the outputs and external messages; and wherein class labels of the discrete-time classifier include, or are based on, the respective destination locations.

Description

MODELLING AND BLACK-BOX SECURITY TESTING OF CYBER-PHYSICAL
SYSTEMS
Technical Field
The present invention relates, in general terms, to methods and systems for training a model of a cyber-physical system (CPS), and to methods and systems for black-box security testing of CPS.
Background
Cyber-security challenges in CPS have been broadly studied in recent years. Despite this, detection of cyber-attacks is still an open problem that needs to be addressed.
Running security tests on Cyber-Physical Systems (CPS) is very challenging, especially in the subset of CPS called Industrial Control Systems (ICS). ICS are mission-critical systems that provide essential products/services in modern societies, for example, water treatment and distribution, power grid and transportation infrastructures. It is risky to perform security tests on an operating ICS, so it is usual to perform tests only at the design phase or during a specific operation window, to minimise availability issues.
To address this challenge, various model-based testing approaches have previously been proposed. There are currently two paradigms for model-based security testing: white-box and black-box.
In the white-box paradigm, detailed information about the CPS is made available to the security tester, including details of hardware and software components and their physical interactions and communication flows, enabling the security tester to generate an accurate and precise model of the system. Due to the detailed information required, a white-box approach is generally more appropriate for security testing in the design phase. White-box methods can also be very time-consuming, as they typically involve code inspection.
For modelling of a CPS such as an ICS, a black-box approach may be more appropriate. In the black-box approach, the security tester is only able to observe the system from the outside, and thus needs to collect data from the system to build the model. Known techniques for black-box modelling of CPS include system identification, which models a system based on considerations from control and system engineering disciplines, and automata learning, which models a system from experimental data.
It would be desirable to overcome or alleviate at least one of the above-described problems, or at least to provide a useful alternative.
Summary
The present disclosure relates to a method of training a model of a cyber-physical system (CPS), the CPS being represented as a hybrid system, the method being executed by at least one processor and including:
obtaining historical data relating to inputs received at the hybrid system, outputs produced by the hybrid system, and external messages received by controllers of the hybrid system;
automatically identifying, based on the inputs, a set of locations of the hybrid system, and respective destination locations to which respective members of the set of locations can transition;
learning, for each location l of the set of locations, based on the inputs and outputs, parameters of a continuous-time model cl of events of the hybrid system; and
training, for each location l of the set of locations, a discrete-time classifier dl; wherein features of the discrete-time classifier include, or are based on, the outputs and external messages; and wherein class labels of the discrete-time classifier include, or are based on, the respective destination locations.
The present disclosure also relates to a method of detecting an anomaly in a cyber-physical system (CPS), the CPS being represented as a hybrid system, the method being executed by at least one processor and including:
obtaining a model of the CPS, the model including, for each location l of the hybrid system, a continuous-time model cl of sensor outputs of the hybrid system and a discrete-time classifier dl that predicts an updated location l′ based on sensor outputs and external messages received by the CPS;
determining a measured sensor output at a location j;
determining, using the continuous-time model cj for location j, a predicted sensor output; and
determining whether the measured sensor output differs from the predicted sensor output by more than a predetermined threshold.
The present disclosure further relates to a system including at least one processor in communication with computer-readable storage, the computer-readable storage having stored thereon instructions for causing the at least one processor to perform a method as disclosed herein.
The present disclosure yet further relates to one or more non-transitory computer-readable storage media having stored thereon instructions for causing at least one processor to perform a method as disclosed herein.
Advantageously, at least some embodiments of the present invention can automatically identify transitions in actuator states, and can infer the control strategy programmed in the controllers. Additionally, embodiments automatically identify locations of the hybrid system representation of the CPS, thus making it easier for deployment in diverse CPS. Additionally, in at least some embodiments, the modular nature of the invention allows it to be readily scaled to model more complex systems.
Brief description of the drawings
Figure 1 depicts system models of (a) a control system and (b) a hybrid system;
Figure 2 is a flow diagram of an embodiment of a method for training a model of a cyber-physical system;
Figure 3 shows the architecture of a set of modules for implementing the method of Figure 2;
Figure 4 shows an example of a water tank filling system of an example testbed system for evaluation of models generated according to some embodiments;
Figure 5 shows (left) a finite state machine of events learned by an embodiment for an example testbed system and (right) a location list and description of control signals inferred for the example testbed system;
Figure 6 shows continuous-time models for events in the model generated for the testbed system;
Figure 7 shows (left) the top-2 most relevant features for location l5 classification for the finite state machine of Figure 5, and boundaries for events l5 → l4 and l5 → l′, and (right) the extracted decision rule (control strategy) from the l5-classification analysis;
Figure 8 shows system simulation for a test run of the model generated for the testbed system;
Figure 9 shows modelled level sensor behaviour for different locations for the testbed system;
Figure 10 depicts detection of an anomaly in a simulated attack on the testbed system;
Figure 11 depicts detection of an anomaly in another type of simulated attack on the testbed system;
Figure 12 depicts detection of an anomaly in a further type of simulated attack on the testbed system;
Figure 13 is a table showing a summary of the simulated attacks of Figures 10, 11 and 12; and
Figure 14 is an example computer architecture of a system for training a model of a cyber-physical system, and/or for detecting an anomaly in a cyber-physical system.
Detailed description
The present disclosure relates to methods and systems for black-box modelling of Cyber-Physical Systems (CPS). Embodiments make use of a hybrid automaton model as a framework to connect cyber and physical components under the same model, thereby providing a modular and scalable approach. Some embodiments relate to a model-based monitoring system to detect cyberattacks in CPS.
Embodiments may be highly effective when applied to a specific subset of CPS called Industrial Control Systems (ICS). For example, raw data (historical logs, real-time readings and network traffic) can be used to build a hybrid model of an ICS. In this regard, continuous-time and discrete-time aspects of the ICS (or other CPS) may be modelled using different techniques. For example, a non-parametric regression algorithm can be used to model the behaviour of the continuous-time system, and classification algorithms can be used to model the behaviour of the discrete-time system. As will be demonstrated below, the method and system of embodiments of the present disclosure are able to automatically produce a model, from real data, that mimics with good accuracy the behaviour of a CPS. Advantageously, the model can then be used to perform security monitoring of a CPS, based only on observations of data such as control signals, sensor readings and general network traffic. In some embodiments, the method of the present disclosure may be performed "online", such that the model can be built, and security of the CPS can be tested, in a real-time manner.
Embodiments are particularly advantageous when modelling and testing a particular subset of Cyber-Physical Systems called Industrial Control Systems, though it will be appreciated that the invention is more generally applicable, and can be used to model and test other types of CPS. ML algorithms like Support Vector Machines and Gaussian Processes can be applied to model the CPS as a hybrid system in a modular manner. Once a model of the system is obtained, it can be used to simulate the normal behaviour of the system, such that cyber-attacks that aim to change the system functionality can be detected by a monitor that detects deviation from the simulated normal behaviour.
Certain concepts that are useful for the understanding of the present invention will now be described.
Hybrid systems
Hybrid systems are extended-state machines that are composed of inputs, outputs, state variables, locations, events and guards. As shown in Figure 1(a), a cyber-physical system can be represented as a hybrid system, where continuous-time components interact with discrete-time components through Analog-to-Digital converters (ADC) and Digital-to-Analog converters (DAC).
The continuous-time system can be considered as the physical process (referred to in Figure 1(a) as Plant) or the physical domain in a CPS, operating according to physical laws. The discrete-time system refers to the cyber domain in a CPS or Controllers in the context of an ICS, and DAC and ADC converters represent actuators and sensors, respectively.
A cyber-physical system can be divided into a set of subsystems, each subsystem comprising a controller, together with the sensors and actuators with which it is in communication. Similarly, critical states can be split using the same approach. The subsystems have their critical states related to component functionality and can be monitored by the particular controller. They are called local critical states, e.g. a tank reaching an overflow level. On the other hand, there are critical states that are related to interdependencies such as the interaction of two or more subsystems' states, and that can only be detected from a global monitor. These are called global critical states, e.g. a tank having a shallow level at the same time as another controller demanding liquid provision from it.
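By way of a non-limiting illustration, local and global critical states can be expressed as predicates over subsystem states. The function names below are hypothetical and not part of the disclosure; the thresholds mirror the HH/LL tank levels of the example testbed described later:

```python
# Illustrative sketch: predicate-style critical-state checks.
# Function names are hypothetical; thresholds mirror the example
# testbed's HH (1000.0 mm) and LL (50.0 mm) levels.

def local_critical(tank_level_mm):
    """Local critical state: a single tank nearing overflow,
    detectable by that tank's own controller."""
    return tank_level_mm >= 1000.0

def global_critical(tank_level_mm, downstream_demanding):
    """Global critical state: a shallow tank while another controller
    demands liquid from it -- visible only to a global monitor that
    observes both subsystems."""
    return tank_level_mm <= 50.0 and downstream_demanding
```

A local monitor evaluates only the first predicate; only a monitor with visibility over both subsystems can evaluate the second.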
A more formal definition of hybrid systems and the theoretical framework that underlies embodiments of the present disclosure will now be described.
Representation of CPS as one or more hybrid automata
The present disclosure uses Hybrid Automata as the framework to model cyber-physical systems.
A hybrid system H (see Figure 1(b)) can be defined as a five-tuple (L, X, E, Inv, Act), where the respective components are defined as follows.
• A set of discrete-time variables L called locations (or modes). The locations represent control modes of the hybrid system.
• An n-dimensional manifold X called states, with variables x = {x1, x2, ..., xn} in continuous time t, where n ∈ ℕ is known as the dimension of X.
• A set of discrete events E depicted as directed edges, each of which connects two locations {l, l′} to represent a transition between them, l → l′. (For example, in ICS, it represents a transition between different control strategies.) E has additional parameters such as an edge identifier e, Guard_e and Jump_e. Guard_e is a subset of X defined by a condition statement (such as x ≤ 100). When x reaches a value in this set, H triggers the transition l → l′. Jump_e (also known as Reset) is a function that maps x → x′ from a state-space in Guard_e to a state-space that belongs to the new location l′.
• Inv is known as the location invariant, with Inv(l) ⊆ X. When the system is in l, x must satisfy x ∈ Inv(l).
• Act describes a set of first-order derivatives ẋ, also called flows, that describes how X evolves during continuous time t while the system stays at l. In each location l, there exist inputs u = {u1, u2, ..., um}, state variables x and outputs y, where m ∈ ℕ is known as the dimension of the input set and n, as described above, is the dimension of X.
The output depends on the current state variable and the input, y = hy(x, u), and the state variable evolves according to its current state and the input, ẋ = fx(x, u). The system state holds over the interval [0, δ]; 0 ≤ t ≤ δ. The parameter δ is also known as the sampling period, and represents a bridge that syncs continuous and discrete time, t and k respectively. Continuous-time components are regulated by t and discrete-time components by k. The control strategy (u+) is evaluated in each new step k + 1 and depends on the current values of u[k], y[k] and z[k].
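As a concrete, non-limiting sketch of how the five-tuple (L, X, E, Inv, Act) could be held as a data structure (all names and the toy instance below are illustrative, not part of the disclosure):

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Edge:
    # e: edge identifier; guard: Guard_e, a predicate over states X;
    # jump: Jump_e (Reset), mapping x in Guard_e to a state of l'.
    e: str
    source: str
    dest: str
    guard: Callable[[List[float]], bool]
    jump: Callable[[List[float]], List[float]]

@dataclass
class HybridAutomaton:
    locations: List[str]                                   # L
    dim: int                                               # n = dim(X)
    edges: List[Edge]                                      # E
    inv: Dict[str, Callable[[List[float]], bool]]          # Inv(l)
    act: Dict[str, Callable[[List[float], List[float]], List[float]]]  # flows

    def fired_edge(self, loc, x):
        """Return the first outgoing edge of `loc` whose guard holds at x."""
        for edge in self.edges:
            if edge.source == loc and edge.guard(x):
                return edge
        return None

# Toy instance: location l0 (filling) jumps to l1 (draining) at x1 >= 100.
H = HybridAutomaton(
    locations=["l0", "l1"],
    dim=1,
    edges=[Edge("e0", "l0", "l1",
                guard=lambda x: x[0] >= 100.0,
                jump=lambda x: x)],
    inv={"l0": lambda x: x[0] <= 100.0, "l1": lambda x: x[0] >= 0.0},
    act={"l0": lambda x, u: [1.0], "l1": lambda x, u: [-1.0]},
)
```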
Method 100 of training a model of a cyber-physical system
A method 100 of training a model of a cyber-physical system will now be described with reference to Figures 2 and 3. The method may be performed by one or more processors in communication with computer-readable storage having stored thereon instructions for causing the one or more processors to perform a sequence of operations as described below.
In general, the method 100 takes, as input, data from controllers of the cyber-physical system, where the data comprises inputs sent from the cyber domain (e.g. controllers) to the physical domain (e.g. signals sent to actuators), outputs received from the physical domain (e.g. sensor readings), and external network data (e.g. signals transmitted between a controller and another controller, or between a controller and a SCADA system). The method 100 then separately models continuous-time and discrete-time parts of the CPS using the controller data, and learns parameters of the continuous-time model and the discrete-time model. In some embodiments, a single model can be generated for the CPS. In other embodiments, a model can be generated separately for each controller, the respective models being synchronised in an appropriate manner to generate an overall model of the CPS. By taking a modular approach and generating separate models for different controllers, it becomes easier to scale the modelling process to more complex systems.
An example implementation of method 100 is shown as Algorithm 1 below and is referred to herein as HybLearner.
Turning now to Figure 2, at step 102, the method 100 receives raw data comprising historical data or real-time data from controllers of the CPS, and network traffic captures from the CPS. For example, the raw data may be at least partly received from a historian server that is in communication with the CPS. In another example, the raw data may be received at least partly in real time from a supervisory computer, or even directly from lower level devices such as remote terminal units (RTUs) and/or programmable logic controllers (PLCs).
The method 100 may also take configuration data as input. The configuration data may comprise identification information for a controller, such as IP address and process, and may also comprise control labels like u = [u1, u2, ...], sensor signal labels, e.g. Y = [y1, y2, ...], and learning period (number of steps).
At step 104, the raw data received at step 102 is classified into three categories: plant inputs, plant outputs and external messages (u, y, z). Plant inputs are signals that travel from controllers to the plant (for example, actuator signals in an ICS), and are labelled as set Ū in Figure 3. Plant outputs are feedback signals, such as sensor readings, and are labelled as set Ῡ in Figure 3.
Typically, plant inputs are associated with actuators, and plant outputs are associated with sensors. As such, plant inputs can be extracted by identifying actuator labels in the raw data, and plant outputs can be extracted by identifying sensor labels in the raw data. The sensor labels and actuator labels may be stored in the configuration file, for example. Sensor labels and actuator labels may also be associated with a controller ID (e.g., IP address or other identifier) in the configuration file. The plant inputs and plant outputs may be identified by a variable extraction module 202 (Figure 3). External messages are extracted (for example, by external message extraction module 204 as shown in Figure 3) from analysis of network traffic in respective controllers (see Algorithm 2 below, for example) to generate the external message data, labelled as set Z in Figure 3. For example, raw network packets may be analysed to determine source and destination IP addresses, and other relevant information for a message transmitted from the source to the destination, such as the message type. The IP addresses may then be matched against those in the configuration data to identify the interacting devices (controllers or otherwise). Reading and writing messages to/from other entities, like human machine interfaces (HMI) and supervisory systems, are also considered to be part of set Z.
Next, at step 106, locations are determined from the plant inputs Ū (for example, by a location identification module 206, see Figure 3). For all plant inputs u ∈ ℤm, a check is performed for a transition vi : ui → ui′, where vi is a 2-tuple (ui, ui′). The location set L is a unique combination of the m sets of 2-tuples, L = {l1, l2, ..., lh}, where li = v1 ∥ v2 ∥ ... ∥ vm.
A graph representation of the hybrid system in which locations are nodes, and events are directed edges from the source location to the destination location, is generated at step 108. The graph may be stored in a graph database 1346 (Figure 14), for example a Neo4j or other NoSQL database such as Oracle Spatial and Graph, or OrientDB. Advantageously, the graph representation enables location definitions and valid jumps to be stored and readily retrieved to validate transitions between locations, for example during a security testing process that makes use of the model generated by process 100.
[Algorithm 1 (HybLearner) is reproduced as an image in the original document.]
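By way of illustration (this sketch is not the disclosed Algorithm 1 itself, and its names are hypothetical), the location-identification step 106 amounts to treating each distinct combination of actuator values as a location, and each observed change as a directed event edge:

```python
def identify_locations(u_trace):
    """u_trace: list of actuator-value tuples u[k] over discrete steps k.
    Returns (locations, events): the distinct actuator combinations,
    and the observed transitions l -> l' as directed edges."""
    locations = []
    events = set()
    prev = None
    for u in u_trace:
        if u not in locations:
            locations.append(u)
        if prev is not None and u != prev:
            events.add((prev, u))  # directed edge prev -> u
        prev = u
    return locations, events

# Two actuators (e.g. a valve and a pump), sampled once per step k:
trace = [(1, 0), (1, 0), (1, 1), (0, 1), (0, 1), (1, 0)]
locs, evts = identify_locations(trace)
```

The resulting `locs` and `evts` correspond to the nodes and directed edges of the graph representation generated at step 108.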
Next, at step 110, for each independent event, the continuous dynamics of the system can be learned (for example, by a regression module 208). This includes resets (jumps) and derivatives (Act) in new modes, as discussed above. For each location, the method 100 takes the data from each sensor associated with a controller and models it as a continuous-time variable. This can be treated as a regression problem in machine learning. For example, a non-parametric regression algorithm like Gaussian Processes can be used to learn the continuous dynamics, because it can produce high-order models.
Usually, control strategies are triggered by particular conditions, e.g. "close a valve once a tank has reached a certain level". Accordingly, it is proposed that such transitions can be modelled as a multi-class classification problem. To do so, at step 112, a classifier may be trained (for example, by a discrete-time classification module 210) for each location, with Ῡ and Ż as input features. The categorical variables are the possible l′ locations to which an event might transition, and are derived from the transitions in the event matrix E (see Figure 3).
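As a dependency-free stand-in for the Gaussian Process regressor named above (illustration only; names and values are hypothetical), the continuous dynamics in one location can be approximated by an ordinary least-squares fit of the sensor's rate of change:

```python
def fit_rate(y_series):
    """Least-squares slope of sensor readings over discrete steps k --
    a crude linear stand-in for the per-location continuous-time
    model c_l that the disclosure trains with Gaussian Processes."""
    n = len(y_series)
    ks = range(n)
    k_mean = sum(ks) / n
    y_mean = sum(y_series) / n
    num = sum((k - k_mean) * (y - y_mean) for k, y in zip(ks, y_series))
    den = sum((k - k_mean) ** 2 for k in ks)
    return num / den

def predict_next(y_now, rate):
    """One-step-ahead prediction y+ within the current location."""
    return y_now + rate

rate = fit_rate([100.0, 101.5, 103.0, 104.5])  # tank filling steadily
```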
In some embodiments, feature-enhancing and/or feature-selection steps may be performed (for example, for each location) to improve classifier accuracy.
For example, the feature set can be extended by adding a set ΔῩ that captures the dynamic behaviour of Ῡ (sensor readings) from previous values, e.g. y[k − 5] → y[k].
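The feature extension just described amounts to building lag windows over each sensor series; a minimal sketch (window length and values are illustrative):

```python
def lag_features(y_series, window=5):
    """Extend each sample y[k] with its previous `window` values,
    capturing the dynamic behaviour of the sensor readings.
    Returns rows [y[k-window], ..., y[k-1], y[k]] for k >= window."""
    rows = []
    for k in range(window, len(y_series)):
        rows.append(y_series[k - window:k + 1])
    return rows

y = [10.0, 10.5, 11.0, 11.6, 12.1, 12.7, 13.2]
X = lag_features(y, window=5)
```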
A training set composed of Ῡ, ΔῩ and Ż contains features that have different relevance in each location. For example, for a location where a valve is filling a tank, the level sensor y1 is more relevant than a flow sensor y2 that is connected to another part of the system. As such, it makes sense to use y1 instead of y2 as a feature in the training set. Accordingly, as a feature-selection step, the features can be ranked, for example using the Gini importance metric (extra-trees classifier), and the classifier then trained with the most relevant features for that particular location.
At step 114, the continuous-time model C̄ trained at step 110, and the discrete-time model trained at step 112, are output. The result is a hybrid model Hc for a particular controller c.
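The disclosure ranks features using the Gini importance of an extra-trees classifier; as a dependency-free stand-in for illustration only, one can score each feature by the spread of its per-class means — a crude proxy, not the Gini metric itself (all names and values below are hypothetical):

```python
def rank_features(rows, labels):
    """rows: list of feature vectors; labels: destination location per row.
    Scores each feature by the spread of its per-class means -- a crude
    proxy for the Gini importance used in the disclosure."""
    n_feat = len(rows[0])
    scores = []
    for j in range(n_feat):
        by_class = {}
        for row, lab in zip(rows, labels):
            by_class.setdefault(lab, []).append(row[j])
        means = [sum(v) / len(v) for v in by_class.values()]
        scores.append((max(means) - min(means), j))
    return [j for _, j in sorted(scores, reverse=True)]

# Feature 0 (e.g. a level sensor) separates the classes; feature 1
# (a flow sensor elsewhere in the plant) is uninformative noise:
rows = [[10.0, 5.0], [11.0, 5.1], [90.0, 5.0], [91.0, 4.9]]
labels = ["stay", "stay", "jump", "jump"]
ranked = rank_features(rows, labels)
```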
To generate the overall model for the system, a system model is composed by combining the controller sub-models, H = {H1, H2, ..., Hr}, for r controllers. To combine the individual controller models, all controllers may share information through a common data structure, and synchronise under the same discrete-time scale (k). As such, method 100 may be performed separately for each controller of the CPS, with multiple instances of the method 100 executing simultaneously. For example, multithreading and/or multiprocessing may be used to execute multiple instances of method 100, with all instances writing variable states to, and reading variable states from, the same data structure.
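Under the multithreading option mentioned above, the shared data structure through which controller sub-models exchange variable values can be sketched as follows (class and variable names are hypothetical):

```python
import threading

class SharedPlantState:
    """Common data structure through which controller sub-models share
    variable values, synchronised on the same discrete-time scale k."""
    def __init__(self):
        self._lock = threading.Lock()
        self._vars = {}

    def write(self, name, value):
        with self._lock:
            self._vars[name] = value

    def read(self, name):
        with self._lock:
            return self._vars.get(name)

state = SharedPlantState()
state.write("lit101", 574.0)   # one sub-model publishes a sensor reading
level = state.read("lit101")   # another sub-model consumes it
```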
[Algorithm 2 (external message extraction) is reproduced as an image in the original document.]
Model evaluation process
An example method for evaluation of the model generated by method 100 is shown below as Algorithm 3. The evaluation method is also referred to herein as HybTester.
[Algorithm 3 (HybTester) is reproduced as an image in the original document.]
The evaluation method takes the initial condition (init), the discrete-time classifiers D̄ and the continuous-time models C̄ as input parameters, and produces an estimation of y+ and u+ for a period of steps ahead. From init, it extracts initial values for the sets u, y and z. From the initial u0, it deduces the initial location l and loads the continuous-time model cl and discrete-time classifier dl for that location. The method then predicts next-step values of the sensor readings y+ based on the continuous-time predictor cl. On each time step, the method, through classifier dl, checks whether there are conditions that trigger an event l → l′ from the current location. If dl predicts that such an event occurs, a new continuous-time model cl′ and discrete-time classifier dl′ are loaded, and the process continues.
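A minimal sketch of this evaluation loop, assuming per-location models c_l and classifiers d_l are supplied as plain callables (a simplification for illustration; the disclosed HybTester's concrete interfaces are not reproduced here, and the toy models below are hypothetical):

```python
def simulate(l0, y0, z_trace, c, d, steps):
    """c[l](y) -> predicted next sensor value in location l;
    d[l](y, z) -> predicted (possibly unchanged) next location.
    Returns the simulated sensor trajectory and visited locations."""
    l, y = l0, y0
    ys, ls = [y], [l]
    for k in range(steps):
        y = c[l](y)              # continuous-time prediction y+
        l = d[l](y, z_trace[k])  # event check: l -> l' if triggered
        ys.append(y)
        ls.append(l)
    return ys, ls

# Toy models: 'fill' raises the level by 2 per step until it reaches 10,
# at which point the classifier jumps to 'drain'.
c = {"fill": lambda y: y + 2.0, "drain": lambda y: y - 1.0}
d = {"fill": lambda y, z: "drain" if y >= 10.0 else "fill",
     "drain": lambda y, z: "drain"}
ys, ls = simulate("fill", 0.0, [None] * 8, c, d, steps=8)
```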
A security metric: time-to-critical-state (tq)
A critical state can be considered as a state in which the system operation cannot satisfy minimal safety conditions and threatens product or service quality, or human lives. Distance to critical states has previously been proposed as a security measurement. In addition, the importance of including the time dimension in resilience analysis of CPS has previously been highlighted.
Along similar lines, we aim to answer two questions regarding security metrics in CPS: how far is the current system state from the nearest critical state, and how fast could the system reach the nearest critical state?
We propose to measure the 'distance' from the current state to the nearest critical states, considering not only where the critical states are, but how fast the system might get there. We compute this considering the fastest pace of system evolution according to the system's historical records (the worst-case scenario).
There are different levels of critical states. One is what we call local critical states, which depend on particular components, e.g. in a water treatment plant, a pump is activated when there is no liquid flowing through it. Others are global critical states, where the combination of multiple components' states constitutes a critical state of the system even if the local components are in non-critical states, e.g. in transportation systems, a train approaches a station where another train is stationary, as each controller only monitors local critical states for each component.
After the learning phase, we can identify the set of critical states Q and the points from 'normal' operation that are the nearest to them, denoted xq. In each location l, states evolve at a different rate (ẋ), e.g. the filling rate of a tank. We then compute the time-to-critical-state (tq) as the time that the system will take to reach q from xq under the fastest rate captured during the learning phase (the worst-case scenario).
tq = |q − xq| / max|ẋ|
tq is expressed in terms of steps (k); the conversion to time units (t) depends on the sampling period (δ), as described above, which varies from CPS to CPS. For example, δ in water treatment plants is on the scale of seconds, while in smart grids it is on the scale of milliseconds.
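Under this definition, tq reduces to the distance to the critical value divided by the fastest observed rate; a sketch (the numeric values below are illustrative):

```python
def time_to_critical_state(x_now, q, max_rate_per_step):
    """Worst-case number of steps k to reach critical value q from x_now,
    assuming the fastest evolution rate observed during the learning
    phase. Convert to time units by multiplying by the sampling
    period delta."""
    return abs(q - x_now) / max_rate_per_step

# Tank at 574.0 mm, underflow critical state at 50.0 mm,
# fastest observed drain rate 1.5 mm per step:
tq_steps = time_to_critical_state(574.0, 50.0, 1.5)
tq_seconds = tq_steps * 1.0  # delta = 1 s per step, as in water plants
```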
Method of detecting an anomaly in a CPS
As method 100 automatically learns a system's "normal" behaviour, this can be used to detect data-oriented attacks. For example, in some embodiments of a method of detecting anomalies in a CPS, a non-parametric cumulative sum algorithm (CUSUM) can be used to detect data-oriented attacks.
In some embodiments, a CUSUM detector relies on two parameters (β and τ). β represents the recovery ratio, and is inversely proportional to the time-to-detection parameter. If β is too small, it might increase the number of false positives. β is chosen such that
E[ |y(k) − ŷ(k)| ] < β under normal operation, so that S(k) drifts back towards zero in the absence of attacks.
The CUSUM measurement S(k) in step k is computed as:

S(k) = max(0, S(k − 1) + |y(k) − ŷ(k)| − β), with S(0) = 0,

where
y(k) is the value of variable y from the system and ŷ(k) is the predicted value of y, both evaluated at step k. The optimal β can be chosen by HybTester during a period of "normal" operation. As will be appreciated by the skilled addressee, β can be selected based on a desired sensitivity of the anomaly detection process. As such, it may be determined empirically, based on observations of the system without attacks.
The parameter τ is the threshold at which the system's behaviour deviates enough to be considered abnormal, and it can trigger an attack detection. If the threshold increases, the probability of a false alarm decreases abruptly. The detection rule is given by:

S(k) > τ ⇒ attack alarm at step k.
To this end, during the evaluation phase a threshold τ can be automatically computed. As for β, the parameter τ can be selected based on experimental data, taking into account a desired sensitivity of the detector. In the running phase, a monitoring system will compute the difference between monitored values and predicted ones. If S(k) is greater than the computed τ, an alarm can be generated.
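A sketch of a non-parametric CUSUM detector consistent with the description above; the β and τ values below are illustrative stand-ins for parameters that would be chosen from attack-free data:

```python
def cusum_detect(measured, predicted, beta, tau):
    """Non-parametric CUSUM over residuals |y(k) - yhat(k)|:
    S(k) = max(0, S(k-1) + |y(k) - yhat(k)| - beta), S(0) = 0.
    Returns the first step k at which S(k) > tau, or None."""
    s = 0.0
    for k, (y, yhat) in enumerate(zip(measured, predicted)):
        s = max(0.0, s + abs(y - yhat) - beta)
        if s > tau:
            return k
    return None

yhat = [100.0] * 6                                  # model predictions
normal = [100.1, 99.9, 100.2, 99.8, 100.0, 100.1]   # small residuals
attack = [100.1, 99.9, 103.0, 103.0, 103.0, 103.0]  # sustained deviation
```

On the `normal` trace the statistic never accumulates, while on the `attack` trace it crosses τ shortly after the deviation begins.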
Experimental results and evaluation
The method 100 of the present disclosure was used to generate a model of an exemplary ICS called SWaT. SWaT is a six-stage system built for research purposes in cyber-security of critical infrastructures (see A. P. Mathur and N . O. Tippenhauer, "SWaT: a water treatment testbed for research and training on ICS security," 2016 International Workshop on Cyber-physical Systems for Smart Water Networks (CySWater), Vienna, 2016, pp. 31-36, the entire contents of which are incorporated by reference).
In SWaT, a PLC controls each stage of the testbed. Stage one takes raw water from external sources, stores it in a tank, and feeds other stages on demand. Stage two measures water properties like conductivity and adds chemicals (HCl, NaCl and NaOCl) according to a control strategy. Stage three is the ultra-filtration (UF) process, which has a set of pumps feeding a UF membrane filter to remove water solids of a certain size. Stage four is a dechlorination system, which uses ultraviolet (UV) radiation to eliminate residual chlorine and chloramines, and complements the filtering process with a sodium bisulfite (NaHSO3) dosing. Stage five is the reverse osmosis (RO) process, which operates continuously, monitoring the conductivity of the water and executing a pre-configured sequence controlled by a PLC. Stage six stores the product of the previous stage in a tank; water from this stage is used to clean the RO membrane, and the excess product is recycled.
Figure 4 depicts a simplified version of Stage one of the testbed. In the subsystem shown in Figure 4, the goal of the controller is to keep enough water between the high (H) and low (L) levels to guarantee that liquid can be provided to later stages. If the tank level is H, the controller closes the main valve MV. When the tank level hits L, the controller opens MV and starts filling the tank. The actions of pumps P1, P2 depend on the demand of later stages.
Critical states (Q). Stage one has two local critical states. q1 is related to the risk of tank overflow (HH = 1000.0 mm). q2 is associated with the risk of tank underflow, because running either of the pumps P1 or P2 might cause severe damage to them (LL = 50.0 mm). Stage one operation can be represented as follows:
Figure imgf000020_0001
As the framework above was developed in a modular manner, it readily extends to more complex systems. We used Python 3.6 as the main programming language to implement method 100. Nevertheless, similar approaches can be developed using faster languages like C++ or Rust. These might be considered where an on-line modeller is needed, or for a system with a stricter sampling period δ (e.g. in power grids).
The main infrastructure components in this example are as follows:
Configuration file. It stores the minimal information that HybLearn needs to model a subsystem. It contains the controller's identification (IP address, process), control signal labels, i.e. Ū = [u1, u2, ...], sensor signal labels, i.e. Ȳ = [y1, y2, ...], and the learning period (number of steps).
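Such a configuration file might look like the sketch below, here written as a Python literal; the field names and values are illustrative assumptions, not the actual HybLearn file format:

```python
# Hypothetical subsystem configuration for HybLearn. Field names and the
# IP address are illustrative; only the kinds of information stored here
# match the description above.
CONFIG = {
    "controller": {"ip": "192.168.0.10", "process": "PLC1"},  # controller identification
    "control_labels": ["mv101", "p101", "p102"],  # U = [u1, u2, ...]
    "sensor_labels": ["lit101", "fit101"],        # Y = [y1, y2, ...]
    "learning_period": 25000,                     # learning period in steps
}
```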
LoadExtLabel program. As described in Algorithm 2, we programmed a network traffic parser using the Scapy library. The testbed network runs over the EtherNet/IP (CIP) protocol, so we filter TCP/44818 and UDP/2222 traffic, and validate the read (0x4c) and write (0x4d) CIP services. The script returns the variable names from remote entities (other controllers or SCADA) that interact with the local controller.
HybLearner and HybTester programs (Algorithm 1 and Algorithm 3). They use the neomodel and sqlalchemy libraries to connect to the databases. The SVM classifier and the Gaussian process regression were implemented using sklearn and GPy, respectively.
Databases. The main DB 1344 (see Figure 14) is a Postgres database that stores all records related to raw data, forecasts, continuous-time models C̄ and discrete-time classifiers D̄. A graph database 1346 (see Figure 14, e.g. neo4j) stores locations, events, event matrices, control signal variables (Ū: names, values), and sensor signals (Ȳ: names, min and max values).
We used a dataset collected from SWaT over 11 days. It contains sensor readings and actuator signals for all six stages of the system as described above. We fed HybLearner with 25k registers (~7 hours) of normal operation of the testbed. We manually set the initial configuration as Ȳ = {'lit101', 'fit101'} and Ū = {'mv101', 'p101', 'p102'}. The LoadExtLabel program processed a network capture with 300k packets. It extracted the Z̄ labels ({'fit201', 'fit301', 'lit301', 'mv201'}), which correspond to variables from Stages two and three of the testbed. We ran HybLearner for different stages of the testbed. HybLearner took from 18.32 s up to 138.77 s to produce a hybrid model M̄. The execution time varies depending on the number of unique locations and events.
Learning the Hybrid model M
HybLearner successfully identified all operational modes of Stage one of the testbed. The framework learnt seven locations and nine events. Figure 5 (right-hand side) shows all of the transitions learned by the method, and the actions taken by actuators in each event.
Continuous-time modeller C̄ and discrete-time classifier D̄
HybLearner built nine different continuous-time models, matching the nine events in Figure 5. Figure 6 shows all behaviours of the sensor level lit101 during "normal" operation. Projections of the set of derivatives ẏ (Act) are used below to compute the time-to-critical-state for the system's security evaluation.
As an example, the classification decision for location l5 is depicted in Figure 7. After feature selection, the two most relevant features are lit101 and lit301, on the x-axis and y-axis respectively. It describes the conditions under which the controller triggers a different control strategy for the transition event to l4 or l6. For example, from Figure 5 we know that the event l5 → l4 is associated with the change P1: OFF→ON, and the event l5 → l6 with the transition MV: ON→OFF. When considered together with the left-hand side of Figure 5, the decision rule at the right-hand side of Figure 5 can be inferred. By comparison against the operational manual of the testbed, a strong relationship with the control strategy programmed in the controller's source code was found. This demonstrates that methods according to the present disclosure can automatically learn the control strategy of a CPS from experimental data.
Simulating the system
We ran HybTester for 5k steps (1 hour 25 minutes in real time) with the initial conditions lit101 = 519.55; mv101 = ON; P101 = ON; P102 = OFF. Figure 8 shows how the system evolves during this period. HybTester successfully predicted locations, events and sensor readings.
The residual analysis shows that the difference between the predicted value (ŷ) and the ground truth (y) remains in the range −10 ≤ e ≤ 10, while the CUSUM value S[k] stays around 0 for the first 3.5k steps (~60 minutes). After 3.5k steps, ŷ deviates from y due to error propagation; this causes an avalanche effect in e that makes S[k] exceed τ and trigger a false positive when the CUSUM detection mechanism is in place. The error propagation issue can be addressed by a periodic synchronisation between ŷ and y, e.g. every 30 minutes (1.8k steps) for this scenario.
Security metrics: time-to-critical-states (tq)
As mentioned above, Stage one has two local critical states associated with lit101. One is linked to tank overflow (q1 = 1000.0) and the other to tank underflow (q2 = 50.0), which can cause P1, P2 malfunction. After running HybLearner, we automatically identify the nearest points to each q, xq: {815.08, 120.62} respectively. In Figure 9, we project all Acts (ẋ) for all locations l ∈ L. xq is placed at the origin of the plot, with the boundaries Q − xq (dashed lines) for each critical state. From this set, we choose max(ẏ): r1 = 0.481 mm/s at location l5, and min(ẏ): r2 = −0.474 mm/s at location l8. Using equation 1 we get tq = {384, 149}. We can interpret this result as the 'resilience' of Stage one, which means the system can resist any data-oriented attack for at least 149 seconds without the risk of reaching a critical state q. The physical properties of the system guarantee this.
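Numerically, equation 1 amounts to dividing the distance between the nearest observed point xq and the critical boundary q by the worst-case rate of change. A minimal sketch using the Stage one figures above (the function name is ours):

```python
def time_to_critical_state(x_q, q, rate):
    """Seconds needed to drive the level from the nearest observed point
    x_q (mm) to the critical boundary q (mm) at the worst-case rate (mm/s)."""
    return round(abs(q - x_q) / abs(rate))

# Stage one of SWaT: overflow boundary q1 = 1000.0 mm, underflow q2 = 50.0 mm.
t_overflow = time_to_critical_state(815.08, 1000.0, 0.481)    # fastest fill rate r1
t_underflow = time_to_critical_state(120.62, 50.0, -0.474)    # fastest drain rate r2
```

The smaller of the two values, 149 s for underflow, is the binding bound quoted above as the resilience of Stage one.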
Attack scenarios
We use three different attacks from the literature to test the viability of implementing an attack detection mechanism based on HybTester. The attack scenarios assume a malicious actor can inject data-oriented attacks via manipulation of the sensor readings ȳ. This can be achieved by attacking the sensor-actuator communication channel through a man-in-the-middle attack, or by compromising sensors directly.
A1. Set tank level above H
The initial conditions before the attack are the following: lit101 = 501.33; mv101 = ON; P101 = ON; P102 = OFF. The attack vector holds the tank level sensor lit101 at 801.0, just above the H label; this should cause the controller to trigger a different control strategy in which MV turns OFF and P1 turns ON. The attacker's goal is to bring the system to the critical state q2 (lit101 < 50.0). The attack lasts 437 steps (7 minutes 12 seconds).
A2. Set tank level close to L
The initial conditions before the attack begins are: lit101 = 501.33; mv101 = ON; P101 = OFF; P102 = OFF. The attack vector holds the tank level sensor lit101 at 244.0. It aims to mislead the controller into a different behaviour than expected: the tank might keep filling up while the controller thinks it is stuck at the 244.0 level. The attacker's goal is to bring the system to the critical state q1 (tank overflow: lit101 > 1000.0). The attack lasts 516 steps (8 minutes 36 seconds).
A3. Hold tank level and increase the level at a constant rate
Before the attack begins, the initial conditions are: lit101 = 573.87; mv101 = ON; P101 = OFF; P102 = OFF. The attacker implements two attack vectors. The first holds the tank level sensor lit101 at the current value (574.03) for 92 steps. The second then increases lit101 at a constant rate of 1.0 mm/s. It aims to deceive the controller into thinking the tank is filling faster than usual, triggering control strategies that help to discharge the container. The attacker's goal is similar to A1's: to bring the system to the critical state q2 (lit101 < 50.0). Attack duration: 473 steps (7 minutes 53 seconds).
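For illustration, the three spoofed sensor traces above can be written as simple functions of the attack step; this sketch only reproduces the reported values (the function names are ours, and in the testbed the values were injected at the network level rather than computed like this):

```python
def a1_hold_above_h(true_value, step):
    """A1: report the tank level pinned at 801.0, just above the H label."""
    return 801.0

def a2_hold_near_l(true_value, step):
    """A2: report the tank level stuck at 244.0 while the tank keeps filling."""
    return 244.0

def a3_hold_then_ramp(true_value, step, hold=574.03, hold_steps=92, rate=1.0):
    """A3: hold the reported level for hold_steps steps, then increase it
    at a constant rate (mm/s, one step per second)."""
    if step < hold_steps:
        return hold
    return hold + rate * (step - hold_steps)
```

In each case the controller reacts to the spoofed value while the physical level evolves according to the true dynamics, which is what drives the system towards a critical state.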
HybTester as a model-based attack detection mechanism
We combine HybTester's ability to predict "normal" system operation with CUSUM as described above. We set the CUSUM parameters as β = 10, τ = 20, as shown in Figure 8. This configuration offers a very low false positive rate (1/3.5k steps). With this configuration, the monitor will trigger an alarm only when the CUSUM value S[k] exceeds τ (black dotted line at 20).
Figure 10 shows an example of a monitoring mechanism implemented with HybTester. The attack corresponds to A3 described above. The attack starts at k = 510, labelled A3.0, and is detected when S > 20 at k = 543, which means the monitor takes a little more than 30 seconds to detect the attack. Our analysis of the time-to-critical-state tq shows that the attacker is far from causing severe damage to the system. With tq[k = 543]: {855, 1136} at detection point D, an operator has at least 855 s to implement a mitigation mechanism before the system reaches q1.
Examples of attacks A1 and A2, and the detection capabilities of HybTester, are depicted in Figures 11 and 12. Attack A1, which starts at step k = 48 (A1.0) and ends at step k = 481 (A1.1), is detected at step k = 49 (D). Attack A2, which starts at step k = 200 (A2.0) and ends at step k = 718 (A2.1), is detected at step k = 201 (D). Figure 13 summarises the results for all attacks (A1-A3). It shows when each attack starts (k0), how long the mechanism takes to detect the attack (kD), how long the attack lasts (k1), and the time-to-critical-state metrics (tq) for each of these events. This set of examples shows that the combination of HybTester + CUSUM is suitable for building attack detection mechanisms.
Additionally, the time-to-critical-state metrics at the detection point show the buffer of time an operator has to activate response strategies, more than 12 minutes (> 720 s) for these cases. At the point when the attack stops, the time-to-critical-state metrics give an approximate measure of the impact of the attack, providing information about how close an attacker could bring the system to a critical state q.
Turning now to Figure 14, an exemplary architecture of a system suitable for implementing methods of the present disclosure is shown.
Figure 14 shows an example computing device 400 that is capable of implementing the presently disclosed methods. In some embodiments, multiple computing devices 400 may be considered to be a single computing system. For example, each of a plurality of computing devices 400 may run a separate instance of a module that is used to implement part of the methods disclosed herein. The components of the computing device 400 can be configured in a variety of ways. The components can be implemented entirely by software to be executed on standard computer server hardware, which may comprise one hardware unit or different computer hardware units distributed over various locations, which may communicate over a network. Some of the components or parts thereof may also be implemented by application specific integrated circuits (ASICs) or field programmable gate arrays. In the example shown in Figure 14, the computing device 400 is a commercially available server computer system based on a 32 bit or a 64 bit Intel architecture, and the processes and/or methods executed or performed by the computing device 400 are implemented in the form of programming instructions of one or more software components or modules 1322 stored on non-volatile (e.g., hard disk) computer-readable storage 1324 associated with the computing device 400. At least parts of the software modules 1322 could alternatively be implemented as one or more dedicated hardware components, such as application-specific integrated circuits (ASICs) and/or field programmable gate arrays (FPGAs).
The computing device 1300 comprises at least one or more of the following standard, commercially available, computer components, all interconnected by a bus 1335:
(a) random access memory (RAM) 1326;
(b) at least one computer processor 1328, and
(c) a network interface connector (NIC) 1330 which connects the computer device 400 to a data communications network and/or to external devices. The computing device 400 comprises a plurality of standard software modules, comprising:
(a) an operating system (OS) 1336 (e.g., Linux or Microsoft Windows); and
(b) structured query language (SQL) modules 1342 (e.g., MySQL, available from http://www.mysql.com), which allow data to be stored in and
retrieved/accessed from an SQL database 1344.
Advantageously, the database 1344 forms part of the computer readable data storage 1324. Alternatively, the database 1344 is located remote from the computing device 400 shown in Figure 14. Similarly, graph database 1346 may be part of computer-readable storage 1324, or be located remote from computing device 400. The boundaries between the modules and components in the software modules 1322 are exemplary, and alternative embodiments may merge modules or impose an alternative decomposition of functionality of modules. For example, the modules discussed herein may be decomposed into submodules to be executed as multiple computer processes, and, optionally, on multiple computers. Moreover, alternative embodiments may combine multiple instances of a particular module or submodule. Furthermore, the operations may be combined or the functionality of the operations may be distributed in additional operations in accordance with the invention. Alternatively, such actions may be embodied in the structure of circuitry that implements such functionality, such as the micro-code of a complex instruction set computer (CISC), firmware programmed into programmable or erasable/programmable devices, the configuration of a field-programmable gate array (FPGA), the design of a gate array or full-custom application-specific integrated circuit (ASIC), or the like.
Each of the blocks of the flow diagrams of the methods disclosed herein may be executed by a module (of software modules 1322) or a portion of a module. The processes may be embodied in a non-transient machine-readable and/or computer-readable medium for configuring a computer system to execute the method. The software modules may be stored within and/or transmitted to a computer system memory to configure the computer system to perform the functions of the module. For example, the modules 1322 may comprise the modules 200 shown in Figure 3, such as a variable extraction module 202, an external message extraction module 204, a location identification module 206, a continuous-time model trainer 208, and a discrete-time model trainer 210. The computing device 400 normally processes information according to a program (a list of internally stored instructions such as a particular application program and/or an operating system) and produces resultant output information via input/output (I/O) devices 1330. A computer process typically comprises an executing (running) program or portion of a program, current program values and state information, and the resources used by the operating system to manage the execution of the process. A parent process may spawn other, child processes to help perform the overall functionality of the parent process. Because the parent process specifically spawns the child processes to perform a portion of the overall functionality of the parent process, the functions performed by child processes (and grandchild processes, etc.) may sometimes be described as being performed by the parent process.
It will be appreciated that many further modifications and permutations of various aspects of the described embodiments are possible. Accordingly, the described aspects are intended to embrace all such alterations, modifications, and variations that fall within the spirit and scope of the appended claims. Throughout this specification and the claims which follow, unless the context requires otherwise, the word "comprise", and variations such as "comprises" and "comprising", will be understood to imply the inclusion of a stated integer or step or group of integers or steps but not the exclusion of any other integer or step or group of integers or steps. The reference in this specification to any prior publication (or information derived from it), or to any matter which is known, is not, and should not be taken as an acknowledgment or admission or any form of suggestion that that prior publication (or information derived from it) or known matter forms part of the common general knowledge in the field of endeavour to which this specification relates.

Claims

1. A method of training a model of a cyber-physical system (CPS), the CPS being represented as a hybrid system, the method being executed by at least one processor and including:
obtaining raw data relating to plant inputs, plant outputs, and external messages received by controllers of the hybrid system;
automatically identifying, based on the plant inputs, a set of locations of the hybrid system, and respective destination locations to which respective members of the set of locations can transition;
learning, for each location l of the set of locations, based on the plant inputs and plant outputs, parameters of a continuous-time model cl of events of the hybrid system; and
training, for each location l of the set of locations, a discrete-time classifier dl; wherein input features of the discrete-time classifier include, or are based on, the plant outputs and external messages; and wherein class labels of the discrete-time classifier include, or are based on, the respective destination locations.
2. A method according to claim 1, comprising generating a plurality of hybrid models, each hybrid model comprising a continuous-time model and a discrete-time classifier for one of the controllers of the hybrid system.
3. A method according to claim 2, comprising synchronising the plurality of hybrid models using a common data structure containing variable states of the hybrid system.
4. A method according to claim 2 or claim 3, wherein each hybrid model is generated by a separate application instance executing on the at least one processor.
5. A method according to any one of claims 1 to 4, wherein the parameters of the continuous-time model are learned using a non-parametric regression algorithm.
6. A method according to any one of claims 1 to 5, wherein input features of the discrete-time classifier further include additional outputs that precede the plant outputs.
7. A method according to any one of claims 1 to 6, comprising applying at least one relevance criterion to the input features at each location.
8. A method of detecting an anomaly in a cyber-physical system (CPS), the CPS being represented as a hybrid system, the method being executed by at least one processor and including:
obtaining a model of the CPS, the model including, for each location l of the hybrid system, a continuous-time model cl of sensor outputs of the hybrid system and a discrete-time classifier dl that predicts an updated location l′ based on sensor outputs and external messages received by the CPS;
determining a measured sensor output at a location j;
determining, using the continuous-time model cj for location j, a predicted sensor output; and
determining whether the measured sensor output differs from the predicted sensor output by more than a predetermined threshold.
9. A method according to claim 8, including:
determining, using the discrete-time model dj for location j, an updated location j′; and
responsive to the updated location j′ differing from the location j:
updating the location to the updated location;
updating the continuous-time model cj to continuous-time model cj′.
10. A method according to claim 8 or claim 9, wherein the model is obtained by a method according to any one of claims 1 to 7.
11. A system including at least one processor in communication with computer- readable storage, the computer-readable storage having stored thereon instructions for causing the at least one processor to perform a method according to any one of claims 1 to 10.
12. One or more non-transitory computer-readable storage media having stored thereon instructions for causing at least one processor to perform a method according to any one of claims 1 to 10.
PCT/SG2020/050271 2019-05-10 2020-05-08 Modelling and black-box security testing of cyber-physical systems WO2020231334A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
SG11202112495XA SG11202112495XA (en) 2019-05-10 2020-05-08 Modelling and black-box security testing of cyber-physical systems

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
SG10201904214T 2019-05-10
SG10201904214T 2019-05-10

Publications (1)

Publication Number Publication Date
WO2020231334A1 true WO2020231334A1 (en) 2020-11-19

Family

ID=73290360

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/SG2020/050271 WO2020231334A1 (en) 2019-05-10 2020-05-08 Modelling and black-box security testing of cyber-physical systems

Country Status (2)

Country Link
SG (1) SG11202112495XA (en)
WO (1) WO2020231334A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022169420A1 (en) * 2021-02-05 2022-08-11 Singapore University Of Technology And Design Anomaly detection system and method for an industrial control system

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015102717A2 (en) * 2013-10-11 2015-07-09 Carnegie Mellon University Verified runtime validation of verified cyber-physical system models
CN109361648A (en) * 2018-08-31 2019-02-19 中国科学院信息工程研究所 The detection method and device of the concealed attack of industrial control system
CN109359331A (en) * 2018-09-07 2019-02-19 华东师范大学 CPS dynamic behavior modeling and emulation mode based on the specific modeling language xSHS in executable field


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
SHARMA A. B. ET AL.: "Modeling and Analytics for Cyber-Physical Systems in the Age of Big Data", PERFORMANCE EVALUATION REVIEW, vol. 41, no. 4, 30 March 2014 (2014-03-30), pages 74 - 77, XP058050199, [retrieved on 20200907], DOI: 10.1145/2627534.2627558 *
WINDMANN S. ET AL.: "MapReduce algorithms for efficient generation of CPS models from large historical data sets", 2015 IEEE 20TH CONFERENCE ON EMERGING TECHNOLOGIES & FACTORY AUTOMATION (ETFA), 26 October 2015 (2015-10-26), pages 1 - 4, XP032797498, DOI: 10.1109/ETFA.2015.7301574 *


Also Published As

Publication number Publication date
SG11202112495XA (en) 2021-12-30


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20805430

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 28.02.2022)

122 Ep: pct application non-entry in european phase

Ref document number: 20805430

Country of ref document: EP

Kind code of ref document: A1