GB2624712A - Monitoring system
- Publication number: GB2624712A (application GB2217840.4A)
- Authority
- GB
- United Kingdom
- Legal status: Pending
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1408—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
- H04L63/1416—Event detection, e.g. attack signature detection
- H04L63/1425—Traffic logging, e.g. anomaly detection
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/50—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
- G06F21/55—Detecting local intrusion or implementing counter-measures
- G06F21/552—Detecting local intrusion or implementing counter-measures involving long-term monitoring or reporting
- G06F21/57—Certifying or maintaining trusted computer platforms, e.g. secure boots or power-downs, version controls, system software checks, secure updates or assessing vulnerabilities
- G06F21/577—Assessing vulnerabilities and evaluating computer system security
Abstract
An analysis method 100 for detection of threats and anomalies in cyber and physical process behaviour defines series of time frames to monitor, within each time frame, at least one behavioural indicator, and establishes indicator strength values of the behavioural indicators to determine behavioural indicators to be used as included indicators. This is based on whether the strength value satisfies an indicator inclusion condition. The method assigns two or more included indicators to two or more analysis steps 112, each step comprising at least one included indicator 114, and defines a sequential analysis 110 of the two or more steps. Within an analysis step, an intra-step strength value of the included indicators of the analysis step is generated. The method comprises determining if the intra-step strength value exceeds an inter-step activation condition 118, and proceeding to a subsequent analysis step 120 when this condition is satisfied or exceeded. A sequence strength score is generated based on the strength values 122, and an output indicative of the strength score is generated. The method may allow behavioural indicators to be assessed based on context provided by indicators of varying strength.
Description
Monitoring System
Field of the Invention
The present invention relates to a monitoring system, more specifically to a security monitoring and threat detection system. In particular, the invention relates to a method and system, components and/or system architecture for carrying out a reconfigurable, sequential analysis for monitoring the presence of physical and/or cyber threats, specifically by using sequential analysis threat detection models, to identify with greater confidence the occurrence of attacks and/or faults in both computer behaviour and physical system activity.
Background
Ongoing digital transformation across several fields of modern society is driving ever greater inter-connectivity and automation between computer systems and physical machines. This leads to a convergence between different types of IT systems, including cloud computing platforms and networks at different levels (e.g., enterprise, public, domestic, mobile, and other types of networks). Network connectivity is increasingly becoming a standard means of communication, including in industry, healthcare, transport, defence and domestic services.
As will be appreciated, this connectivity creates an increased exposure to attacks and cybersecurity threats in general, and also an increased vulnerability to faults and malfunctions that may also occur in the absence of a malicious third party attack.
Today, proactive cybersecurity monitoring is an integral part of a defence strategy. However, there remains a problem in automating cybersecurity monitoring processes particularly for critical infrastructure, in that sensitivity to abnormal characteristics needs to be high in order to avoid missing a threat event by way of a false negative classification. On the other hand, too high a sensitivity to abnormal characteristics increases the risk of a false positive, which may have catastrophic consequences.
The present invention seeks to address or at least partially ameliorate the aforementioned problems.
Summary of the Invention
In accordance with a first aspect of the invention, there is provided an analysis method as defined in claim 1, for the detection of threats and anomalies in cyber and physical process behaviour, the method comprising: defining a series of time frames for the assessment of behavioural indicators, monitoring, within each time frame, for the existence of one or more behavioural indicators, analysing a plurality of behavioural indicators from within each time frame, each behavioural indicator being analysed to establish an indicator strength value based on a disposition of the behavioural indicator relative to a reference behaviour, determining whether or not the indicator strength value satisfies at least one indicator inclusion condition, wherein behavioural indicators satisfying at least one indicator inclusion condition are included indicators, assigning two or more included indicators to two or more analysis steps, each analysis step comprising at least one included indicator, defining a sequential analysis, analysis step by analysis step, of the two or more analysis steps, within an analysis step, generating an intra-step strength value of one or more included indicators of the analysis step, determining whether or not the intra-step strength value exceeds an inter-step activation condition, and proceeding to a subsequent analysis step of a sequential analysis when the inter-step activation condition is satisfied or exceeded, generating a sequence strength score based on the intra-step strength values, and generating an output indicative of the sequence strength score.
The method allows a number of different behavioural indicators to be processed according to multiple criteria. Specifically, the method allows behavioural indicators to be processed that originate from within a common time window, while applying additional rules based on other indicators in the same time frame window. By providing analysis steps, one or more (and typically at least two) behavioural indicators can be analysed either with direct reference to each other, and/or as a conditional sequence in which one or more behavioural indicators are assessed only after completion of an analysis step for one or more preceding behavioural indicators. This approach provides a departure from known analysis methods, because it allows indicators within an analysis step to be ignored, included, throttled, and/or amplified in their relevance over the course of different time frame windows. Both the selection of behavioural indicators and the relationship between selected behavioural indicators can, in this manner, be used to improve confidence in a threat detection model.
A behavioural indicator will be understood as an event, measurement, signal, or performance, including an absence thereof, that can be detected or measured and analysed in relation to a reference behaviour. The nature (value, frequency, and/or occurrence) of the event relative to a reference behaviour provides an indicator strength value. E.g., a behavioural indicator may be a number of network connections. As a reference behaviour, a number higher than 20 may be considered suspicious, whereas a number of no more than 20 connections may be considered normal. It will be appreciated that the threshold may be altered. By "disposition", it is meant that the behavioural indicator is classed as, for instance, suspicious or not suspicious. In this regard, the disposition may depend on other circumstances, such that in one scenario a detection of over 20 connections may be suspicious, and in another scenario the same number may not be suspicious.
By "included indicator", a behavioural indicator is meant that qualifies as an indicator sufficiently different or noteworthy in relation to a reference behaviour. The qualification criterion for an included indicator is the indicator inclusion condition, which can be considered an indicator breach threshold or indicator breach condition. To this end, a breach condition (or indicator inclusion condition) may be defined, for instance based on a breach threshold. The breach threshold may be static or adaptive, and may be changed or updated between sequential analyses. The breach threshold may be pre-defined together with a behavioural indicator. To provide an illustrative example, a behavioural indicator may be a temperature value. A temperature between 0°C and 75°C may not be considered suspicious. A temperature value above 75°C may be classed as suspicious. The breach condition may be a temperature value above 75°C. For the sequential analysis, the temperature value may be assessed within each time frame, and if a temperature value exceeds 75°C, it is included in the sequential analysis, i.e. the behavioural indicator "temperature" is used as an included indicator in an analysis step of the sequential analysis. In this manner, included indicators may be analysed in relation to another behavioural indicator, which may be the time of the day, number of network access requests, and/or the rate of temperature change (rise). For instance, another behavioural indicator may be a time of a day, the indicator inclusion condition being a time between 20:00 and 05:00. In this example, if the temperature measurement occurs between 05:00 and 20:00, the behavioural indicator "time of the day" is not included in the sequential analysis. If the temperature measurement occurs between 20:00 and 05:00, the behavioural indicator "time of the day" is included in the sequential analysis and may, in this case, amplify a risk score for the temperature measurement.
An analysis step, herein, is considered a rule set for the processing of one or more behavioural indicators. An analysis step can be considered as a subunit of a sequential analysis that is used as part of the analysis model to force a structured, step-by-step analysis of groups of behavioural indicators.
Within an analysis step, logical conditions may include applying logical operator rules (e.g., AND / OR) to sets of behavioural indicators breaching their defined thresholds. For example, an analysis step Amn may consist of three behavioural indicators i1, i2, i3, each behavioural indicator i being satisfied when exceeding a threshold α, such that i > α, where the logical condition may be defined as per an exemplary Equation 1:

Amn: (i1 > α1 AND i2 > α2) OR (i3 > α3) (Equ. 1)

Even though an individual indicator may breach its threshold (α) in an analysis step, in the given example, the logical condition of Equ. 1 requires that (i1 AND i2) OR i3 is also satisfied to consider the analysis step complete (e.g., the logical condition for the analysis step is "true"). Herein, an analysis step that satisfies its logical condition provides an inter-step activation condition, meaning that the sequential analysis may proceed to a subsequent analysis step. Note that there is no limit to the number of indicators that can be assigned to an analysis step.
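A minimal sketch of how such a logical condition might be evaluated is given below; the dictionary of breach results and the helper name evaluate_step are illustrative assumptions.

def evaluate_step(breached: dict) -> bool:
    # Logical condition of Equ. 1: (i1 AND i2) OR i3, where each entry
    # records whether that indicator exceeded its threshold alpha.
    return (breached["i1"] and breached["i2"]) or breached["i3"]

# i1 breaching alone does not complete the analysis step:
assert evaluate_step({"i1": True, "i2": False, "i3": False}) is False
# i3 alone satisfies the OR branch, so the step is complete:
assert evaluate_step({"i1": False, "i2": False, "i3": True}) is True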
Behavioural indicators represent an aspect that may be associated by a user or system configuration as indicative of a threat or anomaly. An important aspect underlying the invention is that a behavioural indicator is not necessarily deterministic of a threat or anomaly as such.
An indicator may be characterised by a behaviour, a metric, and a disposition. The behaviour of an indicator may be a numerical or descriptive performance, such as "20 network connections", "a download has started", or "the temperature is rising". The metric may be a scale, coordinate or reference, such as "the 20 connections are HTTP connections" or "the temperature is measured in degrees Celsius". The disposition is a qualification of the behaviour relative to a reference behaviour, such as "over 10 connections are unusual", "the temperature should increase from 70°C to 100°C, outside this range is unexpected", or others. The behavioural indicator may provide an explanatory contribution to overall model detection confidence, whereby a series of analysis steps provides a mechanism for systematically organising behavioural indicators according to a rule set, for instance according to their importance and/or prevalence within a detection model.
Behavioural indicators may be based on data extracted from a range of computer or physical system sources; a non-exhaustive list includes operating system logs, firmware/hardware event logs, API and application logs, network flow logs, packet captures, and physical sensor / actuator I/O data. Indicators may be computed from a univariate or multivariate data source and may apply a variety of methods to generate behavioural indicator metrics which best represent the context and state of a behavioural indicator.
For example, a behavioural indicator may be defined as a specific measurement of some defined metric from computer system, device, network or physical system activity, such as the count of unique Transmission Control Protocol (TCP) network ports a device is attempting to connect to within a certain time period, the deviation of a temperature sensor reading from a predefined set value point, or the classification or anomaly / novelty detection output for a single event or a sequence of events or behaviours that are observed in computer or physical system activity.
For the purposes of the present disclosure, a behavioural indicator may be generated and assessed using a range of methods such as logical rules, heuristics, statistical analysis and probability distributions, supervised or unsupervised machine and deep learning models, or even via the output of a sequential analysis detection model of the invention, in the manner of a recursive analysis and indicator definition. Consequently, an indicator itself may be generated by, and in some cases may be computed from, a combination of two or more behavioural indicators, and may be represented by multiple data types such as boolean, ordinal, categorical, continuous or probability metric values. In other words, an indicator may be an entity that is dichotomous, ordinal, finite, or infinite, and therefore its corresponding threshold may have to be dichotomous, ordinal, or continuous, as appropriate.
For a behavioural indicator, one or more thresholds may be defined. The threshold may be a single threshold or an upper/lower boundary. Different thresholds may be defined for the same behavioural indicator in different detection models. Within one detection model, i.e. within a sequential analysis, each behavioural indicator can only be used in one analysis step, but the same behavioural indicator may be used in different models.
The behavioural indicators can be defined to provide context (e.g., type of network protocol, number of supply channels, time of day) that may or may not amplify the relevance of another behavioural indicator. An advantage of the method disclosed herein is that the number of behavioural indicators can be large, e.g. it might amount to several hundred or even thousands of indicators. The method allows a large number of indicators to be considered, because only indicators whose thresholds have been breached are considered further as included indicators.
The scores of individual analysis steps, or intra-step strength values, are used in at least two ways: firstly, to determine if an analysis step satisfies an inter-step activation condition (e.g., if the score exceeds a threshold), and secondly, as will be set out in more detail below, the analysis step scores are aggregated into a detection score. While an inter-step activation condition and a detection score are based on the same input in the form of an analysis step score, different weighting algorithms may be applied. This may be appropriate specifically because, when used as an inter-step activation condition, the analysis step score is not combined with other scores. As such, a detection model may be understood as a rule set defining a selection of behavioural indicators, their thresholds and allocation to analysis steps, as well as rules for analysing behavioural indicators within analysis steps, and rules for aggregating scores from different analysis steps. Each detection model may be understood as defining an aspect of a threat or anomaly, and its context (see also examples 1-3 below).
Herein, a sequential analysis consists of one or more, and typically at least two, analysis steps, each analysis step being executable to contribute towards a detection score, e.g., by determining whether or not one or more model detection thresholds have been breached, and/or whether or not a detection event output is to be generated.
By "inter-step activation condition", or analysis step threshold, a condition is meant that is required by the model to be satisfied by a preceding analysis step, before the method proceeds to executing a subsequent analysis step. The inter-step activation condition may be the intra-step strength value, i.e. a score aggregated from included behavioural indicators in one analysis step. The effect of the inter-step activation condition is that analysis steps act as gatekeeper within a sequential analysis. In this manner, a sequential analysis can be stopped for a time frame early if behavioural analysis steps of the first or of some of the first analysis steps fail to satisfy an inter-step activation condition. It will be appreciated that the same or similar behavioural indicators may be processed (repeatedly) in subsequent time frames.
As such, the analysis may be "chained", by which it is meant that an output of one analysis step triggers and/or is used as an input for a subsequent analysis step, practically providing an inter-step activation condition, such that the analysis proceeds analysis step by analysis step, in a mono-linear manner, without branching pathways. Each analysis step within a model can only be processed sequentially once a previous step has been completed. In contrast, behavioural indicators within one analysis step may be processed in parallel, or without reference to each other, and may even be required to be processed in parallel, which may be necessary to determine logical relationships. The same behavioural indicator may be considered multiple times within the logic of an analysis step, but may only be considered in one analysis step of a sequential analysis. Thereby, it is avoided that scores based on a behavioural indicator are considered twice in a sequential analysis time frame.
The method may be used for the detection of threats and/or anomalies. Numerous methods have hitherto been proposed and implemented for the task of threat detection, specifically in the field of cybersecurity. Such methods include rule-based, heuristic, statistical or machine / deep learning models to identify threats or anomalous behaviour in data collected from a variety of different data sources derived from computer operating system, application, and network activity. In contrast, the present method provides a flexible, structured approach that allows different behavioural indicators to be defined, relating to different types of computer system activity under different conditions, and may include network or endpoint activity, behaviour in network, device, and user activity, as well as non-computer events.
Behavioural indicators are analysed from within a time frame, each time frame being part of a series of time frames. Each time frame may be characterised by an analysis window (i.e., length of a time frame) and an analysis interval (i.e., time between starting points of successive analysis windows). As will be appreciated, if an analysis interval is shorter than the corresponding analysis windows, time frames with successive trigger (start) times may overlap.
As such, a behavioural indicator may be used in analyses in multiple time frames. For example, if an analysis window contains one hour of indicator data for the time period 10:00am to 11:00am, and the analysis interval is one minute, then there is practically a sliding window moving to a new time frame including data from 10:01am to 11:01am. As will be appreciated, in this manner there may be old indicators removed from the analysis (from 10:00 to 10:01), new indicators added (from 11:00 to 11:01), and indicators that remain the same (from 10:01 to 11:00). As such, the analysis outcome may differ between different time frames despite a large overlap of identical indicators. For example, an indicator threshold may have been breached using analysis window data from one analysis interval, and may no longer be breached in an updated analysis window for a subsequent analysis interval.
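The sliding window behaviour described above may be sketched as follows; the generator name time_frames and the fixed frame count are illustrative assumptions.

from datetime import datetime, timedelta

def time_frames(start, window, interval, count):
    # Yield (frame_start, frame_end) pairs. With interval < window,
    # successive time frames overlap, as in the 10:00-11:00 /
    # 10:01-11:01 example above.
    for n in range(count):
        frame_start = start + n * interval
        yield frame_start, frame_start + window

for s, e in time_frames(datetime(2024, 1, 1, 10, 0),
                        window=timedelta(hours=1),
                        interval=timedelta(minutes=1),
                        count=3):
    print(s.time(), "-", e.time())   # 10:00-11:00, 10:01-11:01, 10:02-11:02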
Analysis windows and analysis intervals may be dynamically adjusted, to increase and/or decrease the time frame from within which the behavioural indicators are assessed. For instance, the analysis window may be increased to cover a longer period of time. As another example, a model may require as a condition a minimum number of behavioural indicators to be processed. If the number of indicators remains below a required minimum threshold, then the model may increase the analysis window length to allow it to capture more behavioural indicators in a single sequential analysis for the analysis window. This may be used to ensure that sufficient data is collected to establish a reliable threshold baseline, or, conversely, a window may be shortened when it is necessary to avoid a lag due to too large a number of included behavioural indicators.
The sequential analysis approach, repeated for subsequent time frames, ensures that the assessment of the presence of a threat or anomaly is continuously re-evaluated, and as such is based on a growing body of evidence. The sequential analysis allows a re-ordering of events within a time frame, e.g. an indicator that has become the most recent indicator within a time line may be processed as the first indicator in a first analysis step.
The indicator strength values from one or more analysis steps are aggregated into a sequence strength score. If the sequence strength score is returned after a single (i.e., the first) analysis step, it is likely to be low, although this is not impossible. If the sequence strength score exceeds a detection threshold, it may be classed as a detection event and may be assigned a detection ID for the detection model (i.e., for the sequential analysis) in which it was calculated.
A detection ID, herein, is an identifier for a specific detection event that a detection model generates when the conditions for detection event generation are satisfied. The detection ID can be used as an identifier or handle, for avoiding repeat alerts for the same detection event, for instance if the same detection event is measured in overlapping time frames.
The method may provide an output in the manner of a confidence score. Herein, a confidence score is a score combining the strength values (i.e., intra-step strength values) of all processed analysis steps. The confidence score is calculated using the aggregate strength of breached indicators across all analysis steps in a detection model, and it may be understood as representing an overall confidence of the model detection, according to all contributing indicators which have breached.
To calculate the model's confidence score Ds, all indicator strength values are combined to create a confidence score. Due to the iterative nature of the combination, the confidence score may increase exponentially. There may be several suitable methods for combining a series of scores. To provide illustrative examples, the aggregation may comprise a static or dynamic method, such as a multiplying function, a supervised or reinforcement machine / deep learning algorithm, and others, as well as combinations thereof. The confidence score represents multiple individual behavioural indicators, which can be taken as growing evidence of an event, herein referred to as a "detection event". An example aggregation method is based on a multiplier function. The aggregation method ensures that a confidence score continues to increase as breached indicators are added to the calculation from different analysis steps. To convert an exponential curve into a linear curve, a multiplier function may be used that takes the product of the inverses (complements) of the indicator strength values, and then generates an overall confidence score by inverting again, to generate an output of the multiplier that is normalised between 0 and 1, where 1 is equal to 100% confidence, as per Equation 2:

Ds = 1 - (1 - i1)(1 - i2) ... (1 - iN) (Equ. 2)

The multiplier function may be used in situations in which, without inverting the indicator confidence first, an aggregate confidence score might slowly decrease as indicators of varying strength are combined in an iterative manner. However, it will be understood that the manner of combining and/or aggregating data may differ, and that therefore algorithms other than Equation 2 may be used. Regardless of the manner in which the confidence score is calculated or aggregated, the confidence score is assessed against a confidence threshold score θ, i.e. the aggregated confidence score required to be breached for a detection model, e.g. by evaluating a test condition Ds > θ. The method may comprise the use of a suppression timer, which may be understood as a time period, or the application of a time period that is required to elapse, before a detection event is considered a new, different event. A suppression timer allows time frames with a high degree of overlap to be used that may otherwise each generate individual outputs indicative of a detection event that is the same detection event throughout a series of sequential analyses (series of time frames). A suppression timer may be combined with detection IDs, for example to be able to suppress the creation of a new detection event only for detection events with a matching detection ID, while allowing the generation of new detection events in addition to existing detection events with different detection IDs.
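The Equ. 2 aggregation and the suppression timer may be sketched as follows; the 30-minute timer value and the helper names are assumptions for illustration.

from math import prod
from datetime import datetime, timedelta

def confidence_score(strengths):
    # Equ. 2: invert each indicator strength, take the product, invert
    # again. The result stays in [0, 1] and grows as evidence is added.
    return 1.0 - prod(1.0 - s for s in strengths)

assert confidence_score([0.5]) == 0.5
assert confidence_score([0.5, 0.5]) == 0.75   # grows with more indicators
# A naive product would instead shrink: 0.5 * 0.5 = 0.25

last_alert = {}   # detection ID -> time of the last generated alert

def should_alert(detection_id, now, suppression=timedelta(minutes=30)):
    # Suppression timer keyed by detection ID: a repeat detection of the
    # same event within the timer window does not raise a new alert.
    last = last_alert.get(detection_id)
    if last is not None and now - last < suppression:
        return False
    last_alert[detection_id] = now
    return True

t0 = datetime(2024, 1, 1, 10, 0)
assert should_alert("model-7/event-1", t0) is True
assert should_alert("model-7/event-1", t0 + timedelta(minutes=5)) is False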
The present method is a departure from known approaches for threat detection, which employ specific algorithms for specific detection contexts (e.g., only network or only endpoint host activity), and provide a relatively rigid detection framework with limited ability to adapt. For example, a detection algorithm may be designed to work for detecting threats in network data only, limiting its applicability to computer system host data. Moreover, existing cybersecurity threat detection methods do not typically take into consideration non-malicious indicators, such as faults or malfunctions, from both computer systems and physical processes at the same time when detecting threats. Behavioural indicators may be defined to characterise a non-malicious abnormality or general context, such as the day of the week.
This allows the invention to be used not only in the identification of threats and anomalies observed in computer system behaviour, but also in the identification of these in physical machines and processes, and the combinations of these deployed across industrial and operational technology systems.
In some embodiments, which of the one or more behavioural indicators are to be considered in the analysis steps, and the order in which they are to be considered, are defined as an input from a predetermined quantity of behavioural indicators.
In that case, each of the behavioural indicators allocated within a time frame is processed.
The behavioural indicators may have been predefined by an operator or by a system configuration. The indicators may be predefined under supervision, or manually, i.e., by a security analyst or other professional. Being pre-defined, it will be appreciated that the analysis method will include each one of the pre-defined behavioural indicators in an analysis step, provided it qualifies as an included indicator based on the indicator strength value. As such, if a behavioural indicator is predefined and included, there is no mechanism to exclude it from the analysis step to which it has been assigned. Accordingly, if an analysis step is being executed, it will include the behavioural indicator in its analysis, and will calculate its score if it is an included behavioural indicator. Likewise, if a behavioural indicator is assigned to a subsequent analysis step, it cannot be considered before its preceding analysis steps have all satisfied their inter-step activation conditions. In this manner, it can be ensured that certain criteria are satisfied before other behavioural indicators are processed.
In this manner, it is possible to force an analysis of behavioural indicators in combination, by predefining their inclusion in an analysis step. This allows behavioural indicators to be combined that might otherwise not necessarily be grouped together by a machine learning algorithm.
Furthermore, this allows a combination to be predefined, e.g. by a user, to use, and to force the use of, behavioural indicators from both the cyber domain and the physical domain. This also allows behavioural indicators to be included that are not only indicative of an attack, but alternatively and/or in addition also of events that are indicative of a computer or physical system fault that is not a consequence of an attack, to determine whether or not such events pose a threat.
Conceptually, a distinction may be made between "cyber" and "physical" domains with regard to threats and the detection thereof. With regard to the "cyber" domain, existing cybersecurity detection tools tend to focus on specific monitoring domains in cybersecurity, such as monitoring host or network activity. With regard to the "physical" domain, a typical approach is that condition-based monitoring tools for production systems calibrate event detection on specific physical process variables in operational technology sensor data.
The inventors have appreciated that many real-world environments may combine both domains, cyber and physical. The ability to define behavioural indicators from both domains (cyber and physical) allows indicators to be linked in a common analysis step, and allows the method to carry out a (mandatory) assessment that follows a prescribed structure.
Indeed, physical indicators can often be the first or perhaps the only obvious sign of an ongoing cyberattack, or system fault, so they offer a crucial dimension of visibility that is highly relevant and valuable to cybersecurity assessment in detection models.
Rather than relying on a specific algorithm for detection models calibrated solely to computer system hosts or network data, the suggestion made in this disclosure is to employ a novel detection model framework that is malleable to a range of different detection scenarios.
By "malleable", herein, it is meant that the system is configured to use a framework for detecting cybersecurity threats, physical process anomalies, related to both attacks and system faults, through a combination or cyber, physical, and cyber-physical indicators, without requiring any change to its structural components, algorithms, and workflows. I.e., the same system can be used for processing input parameters from network behaviour and/or physical process parameters such as temperature sensors, and the like.
Malleability is achieved by providing a configuration for defining different behavioural indicators, and practically any number of behavioural indicators, including indicators that are helpful as context, to be analysed to establish a disposition in relation to a reference behaviour.
In some embodiments, the one or more behavioural indicators are provided as an input from a pool of behavioural indicators, the method comprising selecting from the pool of indicators two or more behavioural indicators. A further suggestion made in this disclosure is to employ a detection model framework that is adaptive to a range of different detection scenarios. This enables the formulation of detection models in a dynamic manner, i.e. detection models may dynamically self-configure.
In this manner, the method may be both malleable and adaptive.
By "adaptive", herein, it is meant that the system may be configured for detection models to automatically, and/or based on a rule set, add new cyber, physical, and cyber-physical threat indicators within the detection process. In addition, the system may be configured to allow it to dynamically add indicator analysis criteria that may not have been predefined, or prescribed by a manually predefined process.
In this embodiment, a distinction may be made between two types of analysis steps, based on how behavioural indicators are selected into an analysis step. A first type is a preconfigured, or manually defined, analysis step. A second type is a dynamic, or self-supervised, analysis step.
Predefined analysis steps include behavioural indicators and analysis conditions that are predetermined and organised according to varying logical conditions which are required to be satisfied to consider the analysis step as complete.
Dynamic/self-supervised analysis steps are understood herein as having been configured to select and include indicators from a list of available "pool" indicators. In other words, a dynamic analysis step provides an option to practically exclude (or to not include) behavioural indicators from a pool. In this manner, indicators may be dynamically added or removed across subsequent analysis intervals. The dynamic addition or removal may be dependent on conditions being met. A condition may be, for example, that two or more indicators are associated by similar meta-data, or, more generally, by context information. What constitutes context information may be predefined, e.g., by an operator.
To illustrate a distinction between pool indicators and predefined indicators, predefined indicators must be considered within an analysis step if they were classed as included indicators due to their indicator strength value. In contrast, pool indicators may be selected in a dynamic analysis step, if they are included indicators based on their indicator strength value. A dynamic analysis step may use some, none, or all of the pool indicators. In this manner, included indicators may be withheld from the analysis.
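A possible selection mechanism for a dynamic analysis step is sketched below; the representation of pool indicators as (name, context tags) pairs and the matching by shared tags are illustrative assumptions, not a prescribed data model.

def select_pool_indicators(pool, artefacts, used):
    # Pick unused pool indicators whose context meta-data matches any
    # existing detection model artefact. Unmatched pool indicators are
    # simply not included - the "withholding" described above.
    selected = []
    for name, tags in pool:
        if name in used:
            continue          # an indicator is used at most once per model
        if tags & artefacts:  # shared context, e.g. the HTTP protocol
            selected.append(name)
            used.add(name)
    return selected

pool = [("rare_http_content_length", {"HTTP"}),
        ("dns_query_rate", {"DNS"})]
print(select_pool_indicators(pool, artefacts={"HTTP"}, used=set()))
# -> ['rare_http_content_length']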
In some embodiments, the method comprises analysing context information related to a behavioural indicator, comparing the context information of two or more behavioural indicators, and grouping two or more behavioural indicators according to the similarity of their context information.
The context information may be constituted by at least one of: information stored in one or more network packets (such as a type of network protocol, a network payload, application headers, etc.), an operating system process (such as a process tree parent and/or children, memory allocation, kernel space or user space function usage, etc.), an operating system file (such as file type, file contents, extension, meta-data, hash), and physical process meta-data (such as data type, parent and/or child process(es), data source, data/tag ID, tag type, process unit name).
Herein, context information is information associated with a behavioural indicator, but not as such a property of the behavioural indicator. A typical example of context information is meta-data, and more generally detection artefacts. Detection artefacts may be extracted manually or dynamically from a model, based on various observable information related to behavioural indicators and/or their associated raw data.
For example, in the case of a model that is configured to detect abnormal domain name system (DNS) behaviour, a behavioural indicator may measure the number of DNS queries made within the analysis window time frame. In that case, the method may also extract, as context information or detection artefacts, the query types and responses. Depending on the type of context information, the context information may be manually predefined. However, some types of context information may be dynamically recognised and monitored without having been manually predefined.
Dynamically adding "pool" indicators to an analysis step may comprise matching with these indicators directly or indirectly with model detection artefacts. For example, a "pool" indicator may measure rare content-lengths in HTTP connection headers. Assuming another "rare content length" pool indicator exists and this indicator has not previously been used in any analysis step for a specific model's analysis interval, if the model contains detection model artefacts relevant to the HTTP connection, this indicator may be added dynamically to an analysis step for assessment.
The value of pool indicators is that behavioural indicators that may have a low indicator strength value, and/or indicators whose relevance may not have been recognised during a predefining step, can be added based on their context information, e.g. based on other behavioural indicators with, for instance, a similar process parent, network protocol, and the like. In this manner, the aggregate score of a dynamic analysis step may increase considerably depending on the number of behavioural indicators. The behavioural indicators may, in theory, be added without limit; however, the dynamic addition of behavioural indicators is expected to stop under two conditions. (i) No more associations are made with unused pool indicators based on the available context information. In that case, if the dynamic analysis step has reached an intra-step strength value that exceeds an inter-step activation condition, the sequential analysis proceeds to the next analysis step. (ii) The addition of pool indicators continues within an analysis step until the expiry of a time frame. In that case, the sequential analysis stops at the end of the time frame. In that case, however, it can be assumed that the confidence score of the detection model is relatively high due to the addition of a large number of indicator values, even if not all analysis steps were completed within a time frame.
For each "pool" indicator that is added to a dynamic Analysis Step, where an existing detection Model artefact matches the selection criteria of the "pool" Indicator, if the behavioural indicator breaches, then additional model artefacts may be extracted from the "pool" Indicator and added to list of existing model artefacts.
In this manner, if pool indicators that are added to a dynamic analysis step breach after analysis, their extracted artefacts may continuously update the set of model artefacts available to support further pool indicator selection, with new meta-data criteria that were not available prior to the assessment of the pool indicator.
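Continuing the sketch from above, the artefact feedback loop might look as follows; the breach results and extracted artefact sets are again illustrative assumptions.

def merge_breached_artefacts(selected, breached, artefacts, extracted):
    # Artefact feedback: when a selected pool indicator breaches, its own
    # extracted artefacts are merged into the model's artefact set,
    # enabling further pool lookups within the same analysis interval.
    for name in selected:
        if breached.get(name):
            artefacts |= extracted.get(name, set())
    return artefacts

artefacts = merge_breached_artefacts(
    ["rare_http_content_length"],
    breached={"rare_http_content_length": True},
    artefacts={"HTTP"},
    extracted={"rare_http_content_length": {"content-length:9731"}})
print(artefacts)   # the new meta-data can now match further pool indicators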
When pool indicators are added into a dynamic analysis step, their relation with other indicators may be defined at the same time. Behavioural indicators, whether dynamic or manual, can be chained with "AND" or "OR" conditions within an analysis step (see, e.g., Equ. 1 above). Such chaining conditions may be based on a predefined criterion. For instance, a pool indicator may have a defined "AND" condition with other pool indicators, if such pool indicators are present. Otherwise, a default "OR" chaining condition may be applied.
Dynamic analysis steps may initially be defined with no indicators at all. For instance, they may be preceded by an analysis step that contains one or more behavioural indicators, and their context information (e.g. detection artefact data) may cause initiation of a dynamic analysis step for pool indicator lookups. A dynamic analysis step may require only one indicator to breach its defined threshold to complete the analysis step.
The invention allows collecting, combining and tracking evidence of high risk or abnormal system activity to continuously reassess detection confidence, and to modify a detection confidence level to a higher level or lower level depending on the presence of high risk or abnormal system activity.
In some embodiments, the method comprises selecting a behavioural indicator from the predetermined quantity of behavioural indicators and allocating it to the pool of behavioural indicators.
The model may allow a predetermined behavioural indicator to be used in a pool of indicators. In this manner, it may be made available as a dynamic indicator (behavioural indicator used in a dynamic analysis step). This is relevant because even though a predetermined behavioural indicator must be processed as part of an analysis step, it may only have a relatively low value, leading to it not contributing to an overall threat score. By allowing the behavioural indicator to be used as a pool indicator, it may be combined with factors that amplify its relevance. In this manner, such indicators effectively allow a transition from a static threshold to a dynamic threshold assessment, once sufficient data is available.
This provides a further advantage, namely that an analysis model may operate ab initio using pre-determined behavioural indicators, with a provision to practically seamlessly transition to a dynamic performance once there is sufficient data for the system to create dynamic analysis processes. This allows behavioural indicators, and by extension models, to be immediately functional as soon as they are deployed, rather than requiring the system to wait an unspecified amount of time until sufficient data is available for assessment.
In some embodiments, the method comprises applying an obsolescence condition to determine whether or not an indicator strength value is an obsolete indicator value, and removing each obsolete indicator value from the combination of the two or more indicator strength values, to create a combination of retained indicator strength values, and generating a sequence strength score based on the combination of retained indicator strength values.
In this manner the method may exclude one or more obsolete indicator strength values.
In some embodiments, the method comprises applying a minimum indicator criterion defining a predetermined minimum number of included indicators to be used to generate a sequence strength score. In this manner, the method may combine a minimum indicator criterion with an indicator expiry (or obsolescence) criterion. The minimum indicator criterion may be a leaky bucket threshold. "Leaky bucket" is a threshold construct which checks the number of behavioural indicators which have breached their threshold within a sequential analysis against a predefined "bucket" threshold. If the number of indicators breached within the analysis window meets or exceeds the leaky bucket threshold, the classifier returns a true condition which contributes towards the detection model's event generation criteria. Leaky bucket behaviour is implicitly applied by a sliding analysis window, where indicators become obsolete when they fall outside a time frame. In this manner, a risk score based on older, likely outdated data can be reduced. Leaky bucket methods may be applied in addition to a sliding analysis window. Note that a model has a mechanism for maintaining an otherwise old indicator, by increasing the analysis window to cover a longer period of time, to reach further back in time.
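A sketch combining the leaky bucket criterion with the obsolescence condition is given below; the class name LeakyBucket and the example window and threshold values are assumptions for illustration.

from collections import deque
from datetime import datetime, timedelta

class LeakyBucket:
    def __init__(self, window, threshold):
        self.window = window        # sliding analysis window length
        self.threshold = threshold  # minimum number of breached indicators
        self.breaches = deque()     # timestamps of indicator breaches

    def record(self, when):
        self.breaches.append(when)

    def is_met(self, now):
        # Obsolescence condition: drop breaches outside the window,
        # then apply the leaky bucket (minimum indicator) criterion.
        while self.breaches and now - self.breaches[0] > self.window:
            self.breaches.popleft()
        return len(self.breaches) >= self.threshold

bucket = LeakyBucket(window=timedelta(hours=1), threshold=2)
bucket.record(datetime(2024, 1, 1, 9, 0))    # becomes obsolete below
bucket.record(datetime(2024, 1, 1, 10, 30))
print(bucket.is_met(datetime(2024, 1, 1, 10, 45)))   # False: one recent breach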
A leaky bucket criterion may be combined with the confidence score calculation of the earlier embodiments. In addition, either or both of the leaky bucket threshold and a confidence score threshold may be altered, typically after completion of a time frame, to thereby increase or decrease the sensitivity of model detection to both false positives and false negatives. Thresholds may be adjusted based on automatic or manual feedback from the system or human operators, respectively. For instance, in a simple configuration, a leaky bucket threshold may be 1, i.e. at least one behavioural indicator value, in which case the confidence score of any single analysis step will determine an output, i.e. the output is based on the confidence score alone. The leaky bucket and/or confidence score thresholds may be adjusted, for instance, when a detection model has been found to produce false positives or false negatives, and when it can be determined via a test analysis or comparative analysis that an adjustment of the thresholds will result in fewer false positives and/or fewer false negatives.
In some embodiments, the method comprises modulating the indicator inclusion condition and/or the predetermined minimum number of included indicators.
In this manner, a minimum number of included indicators within a sequential analysis, regardless of their score, may be altered to adjust the threshold for the generation of a detection event.
Typically, the method will apply a modulation of criteria only between sequential analyses, i.e. for different time frames. In embodiments, the method may provide that no threshold or condition is modulated during execution of a sequential analysis.
In some embodiments, the method comprises modulating the indicator inclusion condition.
The indicator inclusion condition (i.e. the threshold and/or ranges defining whether or not a behavioural indicator should be considered as an included indicator) may be modulated. In that case, the indicator inclusion condition may be provided by, or may be considered as being, an adaptive threshold. The modulation of an indicator inclusion condition may be based on sufficient data having been analysed, using statistical analysis of preceding sequential analyses and/or manual supervision. To provide an illustrative example, extreme value distribution methods may be used to re-define a threshold to be used as indicator inclusion condition. As will be appreciated, depending on the statistical method employed, and depending on a desired confidence level, the amount of data may differ that is required to be deemed sufficient.
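An adaptive re-definition of an inclusion threshold might be sketched as follows; the mean-plus-three-standard-deviations rule is a simple stand-in, assumed here in place of the extreme value distribution methods mentioned above.

import statistics

def adaptive_threshold(history, min_samples=100):
    # Re-derive an indicator inclusion threshold once sufficient data has
    # accumulated; until then, return None to signal that the static,
    # pre-defined threshold should be retained.
    if len(history) < min_samples:
        return None
    mu = statistics.fmean(history)
    sigma = statistics.pstdev(history)
    return mu + 3.0 * sigma

print(adaptive_threshold([20.0] * 50))         # None: keep the static threshold
print(adaptive_threshold([20.0, 22.0] * 60))   # 24.0: adaptive threshold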
As will be appreciated, by modulating the indicator inclusion condition, the number of indicators included in the analysis steps may change, and the number of available pool indicators may change.
In this manner, combining several embodiments of the invention results in several thresholds being applied throughout a sequential analysis model. The indicator strength value must meet an indicator inclusion condition. The combined indicator strength values, i.e. the intra-step strength value within an analysis step, must meet an inter-step activation condition. The intra-step strength values may be weighted during an aggregation step to generate a sequence strength score or confidence score. The (aggregated) confidence score of a sequential analysis must meet a confidence score threshold. In addition to the score values, the number of included indicators must satisfy (or exceed) a predetermined minimum number of included indicators in the manner of a leaky bucket threshold. The calculated scores and number of indicators must have been considered from within an analysis time frame window.
Each of the thresholds may be pre-defined, in a static or manual manner, to allow the method to operate without requiring considerable configuration, i.e. the method may operate with little lag time. Each of the thresholds may be altered either manually or dynamically, e.g. once there is sufficient data or confidence for an autonomous reconfiguration.
In some embodiments, the method comprises communicating with one or more data transfer interfaces to process, as input data, output data from the data transfer interface, and to process the input data as one or more behavioural indicators.
In some embodiments, the method comprises sending a request command to the one or more data transfer interfaces to request output data for use as input data.
In some embodiments, the method comprises storing the input data in a memory and brokering the input data according to broker filter conditions, before using the input data as a behavioural indicator in the sequential analysis.
Aspects and embodiments of the invention may be embodied in a software product configured to carry out, when executed in a memory of a processor, the method steps of any one or more of the preceding embodiments, and/or combinations thereof.
Aspects and embodiments of the invention may be embodied in a processor comprising memory and processing instructions to carry out the method according to any one of the preceding embodiments, and/or combinations thereof.
Any one or more of the embodiments described in relation to the first aspect may be combined with any one or more other embodiments.
An appreciation underlying the present invention was that the approaches used in the aforementioned embodiments allow so-called "drift" to be reduced or mitigated, by which is meant that gradual changes over time that should be interpreted as expected behaviour may be misinterpreted as deviation when compared to an obsolete starting point. Another aspect is that it may not always be appropriate or possible to rely on pre-defined reference behaviour: in the event of a change in dynamic and hybrid operational technology environments, there may be a lack of similar device profiles. Process changes such as automation may reduce user activity to a minimum.
Such changes may result in a legitimate behavioural change, and/or drift, in network and physical processes because of dynamic process updates. Decisions based on drifted reference behaviour may lead to erroneous classifications, including false positive classifications. The structured approach for the execution of behavioural indicators is believed to reduce the likelihood of drift occurring.
Description of the Figures
Exemplary embodiments of the invention will now be described with reference to the Figures, in which:
Figure 1 is an example of a sequential analysis with manual and dynamic analysis steps;
Figure 2 is an overview of the sequential analysis detection model;
Figure 3 is an overview of the indicator analysis workflow within an analysis step;
Figure 4 illustrates a sequential analysis detection model and handling of detection events, including detection, tracking, suppression, and update processes for detection events; and
Figure 5 provides an overview of an example detection model system architecture for cyber-physical threat detection.
Description
In statistical analysis, the concept of "sequential analysis" relates to a testing process for a predefined hypothesis, such as "A/B testing", in which the sample size is not predetermined in advance. In other words, data is evaluated as it is collected, with further sampling stopped when one or more conditions of a predefined stopping rule are satisfied.
The assessment is made for data points from within a time frame, which may be a fixed time period or a dynamically modulated period of time, and is repeated in intervals. Herein, it is proposed to use sequential analysis of the data points within a time frame as part of a threat detection model. To this end, embodiments disclosed herein apply continuous testing, and additionally employ a repetition of testing, in the form of analysis steps that are applied in a quasi-iterative manner, but to successive time frames. The time frames may overlap. The indicators may be statically assigned and/or dynamically assigned. This allows the present method to continuously collect data, to reuse collected data for multiple analysis steps, and to repeat analysis according to conditions that may be pre-determined or dynamically updated. However, it is also possible for a detection model not to be updated between sequential analyses. In that case, a change in the detection outcome is expected due to a change of behavioural indicators being detected in a different time frame.
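As a minimal illustrative sketch (the window and interval lengths are assumed values, not prescribed by the method), overlapping time frames triggered at a recurring analysis interval might be generated as follows:

```python
from datetime import datetime, timedelta

def analysis_windows(start, end, window=timedelta(seconds=60),
                     interval=timedelta(seconds=10)):
    """Illustrative sketch: yield overlapping analysis time frames, one per
    recurring analysis interval; window and interval lengths are assumed."""
    t = start
    while t + window <= end:
        yield (t, t + window)   # each frame feeds one sequential analysis
        t += interval           # interval < window => successive frames overlap

frames = list(analysis_windows(datetime(2022, 11, 28, 12, 0),
                               datetime(2022, 11, 28, 12, 2)))
print(len(frames))  # 7 overlapping one-minute frames at 10-second intervals
```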
Herein, a "detection model" is a process of carrying out a sequential analysis to establish the occurrence of a detection event, based on a structured analysis of a plurality of behavioural indicators. The indicators allow a detection event to be characterised as a threat and/or non-threat anomaly. The process allows taking into account relationships and recurrence patterns between multiple behavioural indicators, to thereby reduce the number of false positives, and/or reduce the number of false negatives, and/ or reduce the number of repeat threat alerts relating to the same event.
Below, a detection model 300 will be described with reference to Figure 4. For a better understanding of the detection model 300, Figures 2 and 3 illustrate a process for analysing behavioural indicators. Figure 1 illustrates a sequential analysis method 10 that is used in several embodiments of the invention.
Figure 1 illustrates a conceptual architecture of a sequential analysis model 10 that contains one or more manual analysis steps 20 (here: two manual analysis steps 20a, 20b) and/or one or more dynamic self-supervised analysis steps 30 (here, a single dynamic, or self-supervised, step 30).
A first manual analysis step 20a comprises a first indicator 22a and a second indicator 22b, which are to be processed according to a first logical condition 24a (here: an "AND" operator), constituting an inter-step activation condition and defining a logical condition "first indicator 22a AND second indicator 22b". The first manual analysis step 20a is chained to be followed by a second manual analysis step 20b if the first inter-step activation condition is satisfied. The second manual analysis step 20b comprises a third indicator 22c, a fourth indicator 22d, and a fifth indicator 22e, with a logical condition 24b being "22c AND (22d OR 22e)" and constituting a further inter-step activation condition. In this example, five indicators 22a-22e have been selected, but the three indicators 22c-22e will only be analysed if the first logical condition 24a has been satisfied. The first and second indicators 22a and 22b may be processed in parallel or in any order. In practice, all indicators may have been processed in advance; however, their contribution to a confidence score is evaluated according to the process of Figure 1.
Continuing with Figure 1, the second manual analysis step 20b is chained to be followed by a dynamic analysis step 30, which will only execute if or once the second logical condition 24b, or inter-step activation condition, is satisfied. The dynamic analysis step 30 (being the first dynamic analysis step in this example, and also the third analysis step) processes behavioural indicators 32a, 32b selected from a repertoire of pool indicators 40. The behavioural indicators 32a, 32b may have meta-data or other artefacts that define a common model artefact 34.
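To make the chaining concrete, the following is a hedged sketch of how the inter-step activation conditions of Figure 1 might be evaluated; the dictionary of breach flags is an assumed representation, not a prescribed data structure:

```python
# Illustrative sketch of the chained logical conditions of Figure 1.
# The breach flags record whether each indicator satisfied its inclusion condition.

def step_one_satisfied(breached):
    # First inter-step activation condition 24a: "22a AND 22b".
    return breached["22a"] and breached["22b"]

def step_two_satisfied(breached):
    # Second inter-step activation condition 24b: "22c AND (22d OR 22e)".
    return breached["22c"] and (breached["22d"] or breached["22e"])

breached = {"22a": True, "22b": True, "22c": True, "22d": False, "22e": True}
if step_one_satisfied(breached) and step_two_satisfied(breached):
    print("inter-step activation conditions met; dynamic analysis step 30 executes")
```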
The pool indicators 40 include a first pool indicator 42a, a second pool indicator 42b, a third pool indicator 42c, and an nth pool indicator 42n. The pool indicators 40 have been associated, in this example, by matching meta data, such that the first and second pool indicators 42a, 42b are grouped in a first pool indicator match 44a. The third pool indicator 42c, in this example, is not, or not yet, matched; however, it does comprise a meta data association 44c that could be used for matching decisions. In this example, a degree of dissimilarity between meta data association 44a and meta data association 44c re-affirms that the third pool indicator 42c remains unmatched with the first and second pool indicators 42a, 42b. As such, it is possible for pool indicators to remain isolated.
For instance, based on the model artefacts 34, the dynamic analysis may identify a matching pool meta data association 44a, to identify and/or ensure that other pool indicators from the meta data association 44a are also included in the dynamic analysis step 30.
The dynamic analysis step 30 may include an update 36 of the model artefacts 34 to create an updated model artefact definition 38. For instance, the pool indicators 42a, 42b may have used a similar network protocol, and may also be characterised by a common process tree parent. Once the process tree parent (in this example) is taken into account in the updated model artefact definition 38, this may result in the previously unrelated meta data association 44c becoming more relevant, resulting in an addition of the third pool indicator 42c as behavioural indicator 32c into the dynamic analysis step 30. As will be appreciated, the third pool indicator 42c may comprise additional context information such as meta-data that may lead to yet another update of the model artefact definition. In this manner the dynamic analysis step 30 may include any number of pool indicators from the repertoire 40, until either no further associations are made, or until an analysis time frame window expires.
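A minimal sketch of such meta-data-driven selection is given below; the representation of artefacts as tag sets, the field names, and the overlap criterion are all illustrative assumptions:

```python
def matching_pool_indicators(model_artefacts, pool, min_overlap=1):
    """Illustrative sketch: select pool indicators whose meta-data overlaps
    the current model artefact definition. min_overlap is an assumed value."""
    selected = []
    for indicator in pool:
        overlap = model_artefacts & indicator["meta"]  # shared meta-data tags
        if len(overlap) >= min_overlap:
            selected.append(indicator["name"])
    return selected

# Updated model artefact definition 38 now also records the process tree parent.
artefacts = {"protocol:modbus", "parent:svc_host"}
pool = [
    {"name": "42c", "meta": {"protocol:modbus", "parent:other"}},
    {"name": "42n", "meta": {"protocol:http"}},
]
print(matching_pool_indicators(artefacts, pool))  # ['42c'] becomes indicator 32c
```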
A dynamic analysis step may be initiated without any indicators, and may have a single indicator. If a dynamic analysis step determines a breach condition based on a single indicator, subsequent conditions, such as leaky bucket and confidence score thresholds, may still need to be satisfied for the detection model to generate detection output.
Manual analysis steps, such as steps 20a, 20b, are better suited for assessing behavioural indicators for specific threat behaviours in a deterministic manner. Dynamic, self-supervised analysis steps, such as step 30, and by extension "pool" indicators, are better suited for detecting novel or anomalous behaviours that do not necessarily conform to well-known or defined threat behaviours and suspicious activity.
As shown in Figure 1, both manual and dynamic analysis steps may be combined within the same sequential analysis, with either a manual or a self-supervised analysis step being the initial, an intermediate or the final analysis step of a sequential analysis (before initiation of another sequential analysis of a successive time frame). As a result, the two types of step can be utilised together to generate models which adapt well to changing threat actor tactics.
For instance, a typical threat actor tactic may include a type of concealing behaviour, e.g. identifying the easily recognisable footprints of common threat behaviour activity and avoiding them. By an appropriate combination of analysis steps, the absence of such easily recognisable footprints may be ignored as a contributor, such that the absence of footprints does not lower a threat score. Alternatively or in addition, it may be possible to define the absence of such footprints via one or more behavioural indicators, which may then be considered in the calculation of a threat confidence score.
Figure 1 has been simplified for a focused illustration of an exemplary analysis sequence. In practice, the behavioural indicators that are included in the analysis steps need to satisfy an indicator inclusion condition, which may be considered an indicator breach threshold or upper/lower boundary. An analysis step may comprise a large number of unbreached indicators that are not shown in Figure 1. The indicator inclusion condition may be based on an indicator strength value in relation to a reference behaviour, for instance indicating how much an indicator deviates from the reference behaviour. However, the indicator inclusion condition is not necessarily a score and may also be a Boolean value such as true or false (e.g. "Is today a weekday": yes or no).
The indicator breach threshold may be static or dynamic. A threshold algorithm may be used to determine if an indicator value breaches a minimum confidence condition for it to be considered as contributing to an analysis step. If the indicator fails to satisfy or breach the threshold, the sequential analysis is carried out as if the indicator was not present. This may result in a termination of the sequential analysis after the first analysis step. Dynamic thresholding for an indicator may be applied using a range of statistical probability distributions or unsupervised and supervised machine learning algorithms, whilst static thresholding may be applied using rules, heuristics, and likelihood / estimation algorithms. Other methods for setting a threshold may be used.
For indicators with Boolean threshold conditions, an interpolation weighting factor γ may be applied to alter (increase or reduce) the confidence of the indicator according to one or more of the following conditions: the number of model analysis steps that have been completed, and the existence of specific analysis step indicator breaches within a specific analysis interval; in both cases, the strength of the indicator's contribution to the model's confidence score is adjusted accordingly.
The method may be executed to provide an indicator confidence score, normalised within a range between 0 and 1, to indicate a degree by which an indicator value iv of a behavioural indicator deviates from a reference behaviour. The indicator confidence score may be calculated relative to a threshold θ. The threshold may be predefined (e.g., static) or may be defined dynamically. The threshold may be set to be identical to a reference behaviour. In that case, any deviation from expected reference behaviour contributes to a confidence score, although it will be appreciated that a larger deviation results in a larger confidence score value. The threshold may be different from the reference behaviour. In that case, a behavioural indicator value iv deviating from reference behaviour by a degree not exceeding the threshold will not result in a high confidence score. If or once the behavioural indicator value exceeds the threshold, it is provided with an indicator confidence score between 0 and 1. In this manner, the indicator confidence score is normalised regardless of the threshold value and regardless of the threshold type (dynamic or static).
It will be appreciated that such formulae are applied depending on the type of data and its format. In the case of Boolean or categorical indicators, for instance, the formula may not be applied. Various formulas may be applied to measure the strength of a threshold breach, which may include one or more rule-based heuristics or statistical distributions. To provide an illustrative example, below is provided an exemplary formula set (Equ. 3 and Equ. 4) which applies a weighted distance calculation with a shaping parameter based on an upper and lower bounded threshold:

δ = θ − iv (Equ. 3)

1 − (1 / (1 + p × (δ / (c + ε)))) (Equ. 4)

wherein p defines the shape of the curve for the "strength" of the indicator confidence, on the basis of a lower and upper threshold, δ is the difference of the indicator value iv from the lower or upper threshold θ (whichever is lower), c is a normalising factor set as (upper threshold − lower threshold), and ε is a small weighting factor to prevent division by zero. In this example, the shape of the curve dictates the strength of the threshold breach, and therefore the overall confidence of the indicator breach. Other weighting functions may be used.
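A minimal sketch of Equ. 3 and Equ. 4 follows; since the source notation is ambiguous, the interpretation of δ as the magnitude of the distance to the nearer breached threshold, and the parameter values, are assumptions:

```python
def breach_strength(iv, lower, upper, p=4.0, eps=1e-9):
    """Illustrative sketch of the weighted distance calculation of
    Equ. 3 and Equ. 4; p and eps are assumed values."""
    # Equ. 3: difference from the nearer (breached) threshold, as a magnitude.
    theta = lower if abs(iv - lower) <= abs(iv - upper) else upper
    delta = abs(theta - iv)
    c = upper - lower                    # normalising factor
    # Equ. 4: shaped, normalised strength of the threshold breach in [0, 1).
    return 1.0 - 1.0 / (1.0 + p * (delta / (c + eps)))

# Example: indicator value 23 against a lower/upper band of 5 to 20 unique ports.
print(round(breach_strength(23, lower=5, upper=20), 3))  # 0.444
```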
Referring now to Figure 2, a sequential analysis model 100 operates as a detection workflow within the time frame of a recurring analysis interval (see e.g. interval 316 in Figure 4). In step 102, a sequential analysis process is triggered by an analysis interval, to capture (or consider pre-calculated) behavioural indicators within an analysis window, to be evaluated according to a chained set of model analysis steps 110 (sequential analysis model 10 of Figure 1 being a simplified example). Each sequential analysis starts with an initial analysis step 112, such as step 20a (Figure 1). For each analysis step 112 that is executed, a series of one or more associated behavioural indicators 114, such as indicators 22a and 22b (Figure 1), is analysed, with the analysis step being limited to using behavioural indicators from within the same analysis window.
Within an analysis step, some or all behavioural indicators (e.g. indicators 22a and 22b) may be analysed in parallel (step 114). For each behavioural indicator, its raw indicator data is collected from a telemetry store (such as a database), which is then processed to compute the indicator value in a computation step 116, by fetching available indicator models for computation. Once the indicator value has been computed in step 116, an indicator assessment is conducted at step 114. If applicable, in step 116 a check for the availability of a dynamic threshold for the indicator is made. Note that in practice, the indicator value may be pre-computed for each behavioural indicator and stored in a memory or lookup table. In that case, step 116 may comprise or be constituted by determining an indicator strength value from memory or from a lookup table. Figure 3 provides a more detailed overview of steps 114/116.
Once all included behavioural indicators have been analysed in step 114, the condition threshold 118 is calculated to establish whether or not an analysis step 112 is considered complete. This is relevant because, if complete, the process may interpret this as inter-step activation for the next analysis step in the chain.
If in step 118 it is determined that a threshold is reached, this counts as inter-step activation condition. The method proceeds to step 120 to make a determination whether or not another analysis step N+1 exists in the sequence. If another analysis step exists, the method proceeds to repeat the step 112 for the next analysis step N+1.
If in step 118 it is determined that a threshold is not reached, or if in step 120 it is determined that all analysis steps have been completed, then the model proceeds to step 122, to calculate the confidence score of all analysis steps completed in this sequential analysis. In step 122, the indicator confidence scores of different analysis steps are aggregated to create an overall detection model confidence score. In step 124, a leaky bucket threshold is applied to remove obsolete confidence score values, and a determination is made whether or not the confidence score exceeds a confidence score threshold.
If, in step 124, the detection model's confidence values are below either the leaky bucket threshold or its confidence score threshold, or both, then this is not classified as a breach. In that case, no detection event is generated, and the method proceeds to step 136 in which it awaits the next analysis interval trigger for a new sequential analysis. Note that in the absence of a threat an analysis interval may be triggered with high frequency, e.g. several times per second, and in that case it would normally be expected that no detection event occurs over the course of several analysis intervals. Likewise, should the data used to generate an indicator no longer result in a breach of threshold criteria in a subsequent time frame, the behavioural indicator will effectively be "dropped" from the leaky bucket classification.
If, in step 124, the thresholds are exceeded, then a detection event is generated. The detection event will be allocated a detection ID. The method proceeds to decision step 130 in which an assessment is made whether or not the detection event already exists, based on a match with another detection ID, and/or whether or not it has expired.
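As a hedged sketch of the combined check of step 124 (the use of the mean as the aggregation rule, and the threshold values, which mirror Example 1 below, are assumptions):

```python
def detection_breach(included_scores, leaky_bucket_min=3, confidence_min=0.85):
    """Illustrative sketch of step 124: a detection event requires both a
    minimum number of current (non-obsolete) included indicators and a
    sufficient aggregate confidence score."""
    count = len(included_scores)   # leaky bucket: only current breaches count
    if count < leaky_bucket_min:
        return False
    confidence = sum(included_scores) / count  # one possible aggregation rule
    return confidence >= confidence_min

print(detection_breach([0.9, 0.88, 0.92]))  # True: 3 indicators, mean 0.90
```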
If, in step 130, no detection event exists with the same detection ID, or the suppression timer has expired, then the sequence proceeds to step 132 in which a new detection event is generated and/or raised. Otherwise, if a detection event exists with a non-expired suppression timer, then the sequence proceeds to step 134 in which the existing detection is updated with new artefacts and indicators according to the detection model context and grouping (e.g., change in detection duration, new indicator breaches etc.). After step 132 or step 134, as applicable, the detection model process is concluded at step 136, to await the next analysis interval for a new analysis sequence.
Figure 3 provides a detailed flow chart of the processes carried out as part of analysis step 114 (see Figure 2). In step 210, the indicator analysis is started. This is the case either after an interval trigger (see step 102 in Figure 2) or after the inter-step activation condition of a preceding analysis step was satisfied (see steps 118-120 in Figure 2). In step 212, the data and metrics relevant to behavioural indicators within the analysis step are collected. The data may be provided from an interface, memory or database 214, as may be appropriate. In step 216, an indicator strength value is calculated. The indicator strength value may take into account models and reference data 218. In step 220, indicator assessment criteria are applied to determine whether or not the indicator strength value exceeds an indicator inclusion condition. Note that steps 212 to 220 may be pre-computed for behavioural indicators prior to a sequential analysis; however, pre-computation need not always be the case. The indicator inclusion condition may be a pre-defined (static) threshold, or may be an adaptive (dynamic) threshold. It will be appreciated that an adaptive (dynamic) threshold is a threshold that has been updated between successive sequential analyses. In step 222, the method determines whether or not the indicator inclusion condition is a dynamic threshold. This is relevant because if there is a dynamic threshold, there is also a pre-defined static threshold. If a dynamic threshold exists, then the method proceeds to step 224 and determines, in step 226, whether or not the dynamic threshold was breached (thereby satisfying the indicator inclusion condition). If the dynamic threshold was breached, then the method proceeds to step 230 and provides as an output an indicator strength value for use in its analysis step. If step 222 yields that no dynamic threshold exists, or if step 226 yields that there is a dynamic threshold but it is not breached, then the method proceeds to step 228 as a catch-all provision to apply the static threshold. If, in step 228, there is no breach of the static threshold, then the behavioural indicator is not considered in its analysis step. If, in step 228, it is determined that there is a breach of the static threshold, then the method proceeds to step 230 to provide as an output an indicator strength value for use in its analysis step, even if the dynamic threshold was not breached. The method then proceeds within the sequential analysis either at step 112 (Figure 2) or via steps 118-120 (Figure 2).
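The dynamic-with-static-fallback logic of steps 222-230 might be sketched as follows; the function signature and the example values, which mirror the port-scanning illustration below, are assumptions:

```python
def indicator_included(value, static_rule, dynamic_threshold=None):
    """Illustrative sketch of steps 222-230: prefer a dynamic (adaptive)
    threshold when one exists, but always retain the pre-defined static
    rule as a catch-all assessor."""
    if dynamic_threshold is not None and value > dynamic_threshold:
        return True               # step 226: dynamic threshold breached
    return static_rule(value)     # step 228: catch-all static assessment

# 23 unique ports: escapes a learned dynamic threshold of 25, but is still
# caught by the static rule of more than 20 unique ports (assumed values).
print(indicator_included(23, static_rule=lambda v: v > 20, dynamic_threshold=25))
```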
The use of dynamic (adaptive) thresholds in combination with static thresholds has been found by the inventor to provide a robust and adaptive thresholding mechanism in adversarial or rapidly changing (drifting) environments. To this end, if a model indicator has transitioned to dynamic thresholding, and the system generating the indicator data source becomes adversarial (i.e., is manipulated by an attacker) or causes legitimate concept drift, then there is a risk that the learned threshold of the system for that indicator, in that model analysis step, may no longer be breached. To deal with such a scenario, the previous static thresholding (e.g., a rule-based heuristic) is still applied as a "catch-all" threshold assessor for an indicator when a dynamic threshold assessment does not breach, i.e., might result in a false negative assessment.
To provide an illustrative numeric example, consider network scanning behaviour assessed using a behavioural indicator describing an abnormal spread of connections to unique ports. A dynamic threshold may have learned that in a specific network it is abnormal to see connections to more than 25 unique ports for any particular asset in a 10 second time period, whereas a static threshold may always apply a threshold of 20 unique ports, because this may have been deemed by a human operator to closely resemble common threat behaviour. In that example, if 23 connections are detected, this would escape the dynamic threshold yet would be captured by the static threshold. Within the relevant analysis step, a different weighting may be applied depending on whether the breach was of a dynamic threshold or of a static threshold. In this manner, the likelihood is reduced of a behavioural indicator being omitted entirely from an analysis step.
Exceptions to this behaviour may be applied when a specific detection indicator is suppressed, or when static thresholding is automatically suppressed by the system based on the number of false positives generated. In this case, suppression may only apply to a static threshold as a catch-all rule for a specific set of detection IDs generating the false positive detection.
Figure 4 provides an overview of the sequential analysis detection model workflow process 300, in the form of an end-to-end process of a sequential analysis detection model 312 applied repeatedly to a series of time frames within a period 318, in which detection models are applied to a sliding analysis window 314 in recurring intervals 316.
In the example, no detection events are generated in intervals 316 (corresponding to step 124 connecting directly to step 136 in Figure 2) until, once the analysis is carried out for interval 317, a detection event D1 is generated (step 130 in Figure 2), i.e., whereby in a step 320a the sequential analysis generates a detection event, as illustrated in graph 320.
The graph 320 is effectively a visual representation of step 124 (Figure 2), the combined threshold application for leaky bucket and confidence scores. As will be appreciated, indicator breaches of included indicators are counted as leaky bucket events 340 and contribute their indicator confidence score 342. The method may apply two or more thresholds, such as a first threshold 320a separating a low score 352a, indicating a lack of evidence, from a medium score 352b, confirming a need to continue analysis, and a critical threshold 320b above which a high score 352c corresponds to a need to create or update a detection event.
The graph 320 illustrates four scenarios 344, 346, 348, 350 to highlight the effect of the leaky bucket threshold combined with a confidence score threshold. In a first scenario 344, both the leaky bucket events 340 and the confidence score 342 are below the first threshold 320a. In a second scenario 346, the confidence score exceeds a value 346a and therefore the first threshold 320a, whereas the number of leaky bucket events 340 (i.e. the number of current behavioural indicators) remains below the first threshold 320a. The second scenario 346 may still be interpreted as lacking detection evidence. In a third scenario 348, the confidence score has a value 348c exceeding the critical threshold 320b, whereas the number of leaky bucket events amounts to a value 348b exceeding the first threshold 320a, but remaining below the critical threshold 320b. In other words, the number of relevant behavioural indicators is too low to justify generation of a detection event. As such, the third scenario 348 may not result in the creation of a detection event. In a fourth scenario 350, the confidence score has a value 350c exceeding the critical threshold 320b, and the number of leaky bucket events 350b exceeds the critical threshold 320b. In the fourth scenario 350, both scores exceed the critical threshold 320b, leading to the creation of a detection event.
In this example, the exemplary scenarios 344, 346, 348 may correspond to analysis steps of the intervals 316 in which no detection event was raised. The exemplary fourth scenario 350 may correspond to the interval 317 in which a detection event D1 is generated.
Upon generation of a detection event, the event is assigned a detection ID that can be tracked in a step 320b according to a suppression timer, which may be based on the analysis window, the interval or some other arbitrary time period for the model. When a detection event is tracked, i.e. it already exists, its score may be updated in a step 322. In this manner, if the same detection event is generated in a subsequent interval D1u, the existing detection event is updated in a step 322 with the score of the new detection event. At interval D1e, indicated by step 324, assuming the suppression timer has expired, the detection event is generated as a new, separate or repeat detection event. At interval D2, step 326 indicates that a new detection event has been generated. At interval D2r, in step 328, the detection event has been resolved (i.e. closed by the system or a human operator). In that case, if a similar detection event happens again, it will be given a new detection ID.
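A hedged sketch of this detection ID tracking and suppression behaviour follows; the class shape and the default timer length are assumptions:

```python
import time

class DetectionTracker:
    """Illustrative sketch of detection event tracking: an existing detection
    ID is updated (and its suppression timer extended) until the timer
    expires, after which a repeat detection is raised as a new event."""

    def __init__(self, suppression_seconds=300.0):
        self.suppression_seconds = suppression_seconds
        self.active = {}  # detection ID -> timestamp of last create/update

    def report(self, detection_id, now=None):
        now = time.time() if now is None else now
        last = self.active.get(detection_id)
        self.active[detection_id] = now
        if last is not None and now - last < self.suppression_seconds:
            return "updated"   # step 322: existing event updated, timer extended
        return "created"       # step 132/324: new, separate or repeat event

tracker = DetectionTracker()
print(tracker.report("D1", now=0.0))     # created
print(tracker.report("D1", now=100.0))   # updated (within suppression window)
print(tracker.report("D1", now=500.0))   # created (timer expired)
```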
The method may be configured such that an update to a detection event also resets or extends the suppression timer, to reduce the likelihood of duplicate detection events being generated in the same detection context (e.g., for the same detection ID), while at the same time allowing an output to be provided indicating that threat behaviour or anomalous computer behaviour is ongoing. In this manner, the method allows a distinction to be made between ongoing threats or anomalies, and distinct contemporaneous threats or anomalies.
When a detection event is generated by the sequential analysis detection model, the detection event itself remains part of the sequential analysis model process. As such, it may be maintained and/or used to define or update, or to be considered for manual review of, one or more of the following parameters to be used in subsequent analysis intervals: the definition of behavioural indicators, behavioural indicator reference behaviour, indicator inclusion conditions (indicator breach thresholds), transition of indicator inclusion conditions from static to adaptive, dynamic analysis rules, allocation and/or transition of behavioural indicators as/to pool indicators, inter-step activation criteria (analysis step breach thresholds), rules for aggregation of sequence strength scores and rules for converting and/or normalising sequence strength scores as confidence scores, selection, addition or deletion of context information, the minimum indicator criterion (e.g. as leaky bucket threshold) and thresholds for the confidence score (confidence score breach threshold), and the analysis interval (frequency) and analysis window (length).
A detection event may be provided as an output for interpretation by an operator and/or as an input for a subsequent process such as for issuing a warning message or alarm condition.
Underlying embodiments of this invention is an appreciation by the inventors that it is a considerable challenge to detect and appropriately interpret certain anomalies or threats, for instance network scanning activity in hybrid operational networks. This is believed to be even more challenging in modern environments where system automation implements a high degree of auto-discovery functionality and self-configuration. In such environments, traditional approaches to detecting scanning events with simple rules or anomaly detectors often result in false positives. Therefore, when analysing scanning activity, a significant amount of investigation is required, in addition to an initial "scanning-like" behaviour, to determine whether the activity was legitimate or not.
The inventors have found that the sequential analysis detection model method proposed herein facilitates sequential evidence gathering. Within the proposed method, a network scanning detection model may take the following form, as set out in Example 1:

Example 1:
Title:
Model Thresholds: Leaky bucket: 3, Confidence Score: 0.85
Model Configuration:
Step 1) Manual Analysis Step 1 (Step Condition: OR)
1a) (Static + Dynamic) Unusual connection spread to unique network ports
1b) (Static + Dynamic) Unusual connection spread to unique network hosts
Step 2) Manual Analysis Step 2 (Step Condition: OR)
2a) (Static + Dynamic) Connection attempts to non-existent network hosts
2b) (Static + Dynamic) Multiple connections to unused network ports
Step 3) Dynamic Analysis Step 1 (Step Condition: OR)
3a) (Dynamic) Multiple successive connections to rare hosts
3b) (Dynamic) Large number of short connections reset by source
3c) (Dynamic) Unusual number of unique connections for time of day
3d) (Dynamic) Unusual connection pattern to unique ports in short time period
3e) ...
The model structure of Example 1 illustrates how different behavioural indicators, surfaced in their natural language representation, may be used to sequentially analyse suspicious network scanning activity and improve confidence as to whether or not that activity should be considered suspicious. In Example 1, the model definition requires at least three indicators, with a combined confidence score of over 85% (score 0.85), to raise a detection event, and contains two manual analysis steps and one dynamic analysis step.
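Purely as an illustrative encoding (the schema and all field names below are assumptions, not a format prescribed by the specification), the configuration of Example 1 might be expressed declaratively:

```python
# Hypothetical declarative encoding of the Example 1 model configuration.
SCANNING_MODEL = {
    "thresholds": {"leaky_bucket": 3, "confidence_score": 0.85},
    "steps": [
        {"type": "manual", "condition": "OR", "indicators": [
            ("static+dynamic", "Unusual connection spread to unique network ports"),
            ("static+dynamic", "Unusual connection spread to unique network hosts"),
        ]},
        {"type": "manual", "condition": "OR", "indicators": [
            ("static+dynamic", "Connection attempts to non-existent network hosts"),
            ("static+dynamic", "Multiple connections to unused network ports"),
        ]},
        {"type": "dynamic", "condition": "OR", "indicators": [
            ("dynamic", "Multiple successive connections to rare hosts"),
            ("dynamic", "Large number of short connections reset by source"),
            # ... further pool indicators selected dynamically at run time
        ]},
    ],
}
```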
As can be seen from the first analysis step, the initial indicators, if breached, indicate scanning activity in the form of an abnormal number of connections to unique network ports or hosts, with a step condition that requires only one of these indicators to breach before the next analysis step begins executing. Note that both indicators are specified as "static + dynamic", whereby they may initially be analysed with a static threshold, and transition to a dynamic threshold once sufficient data is available. This is believed to be particularly useful in the scanning case, because noisy scanning behaviour may be routinely caught by the static indicator threshold, whereas detecting stealthy, slower scanning activity may rely on a history of normal scanning behaviour in the network.
The second analysis step then builds upon the existence of initial scanning behaviour, adding further context and confidence with indicators that look at whether the scanned hosts or ports exist, or are used in the network.
In Example 1, three indicators are required for detection event creation, as defined by the leaky bucket threshold. In this case, if at least three of the first four indicators in analysis steps 1 and 2 have been breached, and assuming the confidence of these breaches generated an aggregate confidence score above the defined threshold (here: 0.85), this will lead to detection event creation/updating without requiring the process to proceed to the dynamic analysis step 3. However, the third step (here: a dynamic analysis step) employs a range of "pool" indicators which are dynamic-threshold-enabled only and, independently of the prior analysis steps, do not have a specific context relating to scanning; when combined in the third step, however, they contribute clearly to characterising the suspicious scanning activity.
Example 2 illustrates a scenario in which the sequential analysis detection model identifies an abnormal physical process behaviour. Example 2 illustrates the possibility to provide context and any cascading indicators of disruptive change associated with anomalous physical system behaviour. Example 2 may be observed in an industrial production line, where a rapidly changing temperature value for a heating oven sensor line may be intuitively described and explained by employing the sequential analysis model configuration as follows:

Example 2:
Model Thresholds: Leaky bucket: 1, Confidence Score: 0.95
Model Configuration:
Step 1) Manual Analysis Step 1 (Step Condition: OR)
1a) (Dynamic) Abnormally fast rate of increase in temperature sensor reading
1b) (Dynamic) Unusually high temperature sensor process variable reading
Step 2) Manual Analysis Step 2 (Step Condition: OR)
2a) (Dynamic) Sustained change in temperature sensor process variable reading

In Example 2, two analysis steps provide a clear indication of the physical behaviour of the changing temperature in a physical system / process. The initial leaky bucket threshold requires a single analysis step indicator to breach to raise a detection event, which is prudent with regards to safety measures for reporting potential physical system disruption. Over the model as a whole, the two analysis steps, and the corresponding indicators, should they breach, provide a clear breakdown of ongoing behavioural change in the physical process.
Firstly, if either indicator relating to a sudden increase in temperature or to an unusually high temperature (with respect to the normal baseline) breaches, then the second analysis step is activated to determine whether the changed temperature has been sustained for a certain amount of time, and therefore does not represent a merely transient change in the process.
In Example 2, in combination, should all indicators breach, it is easy to interpret the physical behaviour as it unfolds: a sudden increase in temperature, leading to an abnormally high temperature reading, which is then sustained at this high level. This degree of explainability in the sequential analysis model and analysis steps provides human operators with an insight into ongoing system activity for further decision making.
In Example 3, a model configuration for a cyber-physical detection is provided, where the sequential analysis model combines computer and physical system indicators to help discover subtle changes in cyber-physical system behaviour, via sequential analysis steps and indicators. Example 3 is intended to illustrate that certain events, when observed in isolation, may not appear to be particularly abnormal or suspicious; however, when combined within a sequential analysis detection model framework, the context of these events as cyber and physical indicators can reveal complex relationships that may appear abnormal and warrant further analysis.
The following shows a model configuration that links programmatic activity over an operational technology protocol to a Programmable Logic Controller (PLC) with corresponding activity in the physical input/output signals generated by the PLC:

Example 3:
Model Thresholds: Leaky bucket: 2, Confidence Score: 0.85
Model Configuration:
Step 1) Manual Analysis Step 1 (Step Condition: AND)
1a) (Static) New CIP connection Variable Write operation
1b) (Dynamic) Unexpected state change in I/O tag
Step 2) Dynamic Analysis Step 2 (Step Condition: OR)
2a) (Dynamic) New / uncommon network connection between Client device and PLC
2b) (Dynamic) New client device detected
2c) ...
In the cyber-physical detection example, the first Analysis Step includes a fairly innocuous static indicator for an event involving a new write variable operation command issued to a device over the Ethernet/IP protocol. However, when combined with a dynamic indicator measuring an unusual change in a PLC device's Input and Output tag values, alongside an "AND" step condition, the suspicious nature of the original "cyber" indicator immediately changes, suggesting a physical impact as a result of the prior CIP variable write command.
A dynamic Analysis Step provides further context as to the nature of the device generating the "cyber" indicators, and to the cyber-physical threat context of the detection.
The information of example 3 allows for unique and novel detections to be established, with causal effect and context surfaced for a detection event response.
Examples 1 to 3 use narrative parameters, and are intended to show that a detection model may present an explanatory aspect which, in the form in which it is presented, may also contribute to the confidence of a model's hypothesis test result. In other words, a behavioural indicator may be interpreted as evidence that a behavioural aspect of a model has been observed. Usually, indicators will be analysed as numerical metrics. However, as shown in the examples, they may be represented by natural language representations of their context in a manner that is intuitive and explainable to human operators.
In summary, the sequential analysis model framework provides an adaptable system for the definition of any desired combination of detection model configuration and architecture, whether based on computer or physical system indicators, analysis steps or a combination of two or more of these inputs. The model may further facilitate interpretation of detection models for threats whether cyber or physical in nature, or combined cyber-physical threats.
Within this specification, reference is made to behavioural indicators. These may be defined and captured in one or more of several ways. As an illustrative example, behavioural indicators may be defined and/or monitored for inclusion in an analysis by way of a detection context, a detection grouping, and/or a detection filter.
Detection context determines how a specific model is executed against specific assets / entities in a given system (e.g., computing devices, physical process tags etc.). The system may be configured to distinguish a plurality of context states. The system may use any number of context states, for instance three or at least three context states including the states: Global, Community, and Local. A Global context applies to all entities, a Community context applies to a specific group of entities, based on some attribute-based criteria (e.g., device type, process tag, process ID), and a Local context applies to a single entity. For example, if a model is defined with a Local context, then the model will be instantiated and executed for each specific asset / entity that exists (including any other matching criteria defined in the model), such that any learning components of the model utilise data collected for the specific asset / entity only.
Detection grouping determines how the data used to generate and assess indicators within model analysis steps is aggregated. In this context, an indicator value and context may be determined by grouping aggregation parameters. Detection grouping may aggregate indicator data by a specific entity attribute, such as connection source and/or connection destination, process type, device source, and/or tag type for physical process values, and/or combinations thereof. Detection grouping may be applied to an unspecified number of detection attributes to increase the desired specificity of data aggregation, by increasing the number of dimensions in which the grouping is applied.
For example, in the case of a model designed to detect abnormal network connection activity, "detection grouping by source" will aggregate all indicator data by the specific source, regardless of the different destinations involved in the grouped connections. To provide another example, "detection grouping by source, destination, and network protocol" relates to three parameters (here: source, destination, network protocol) as a 3-tuple parameter, and will in that case aggregate all indicator data by the specific 3-tuple parameter, such that there would be separate groupings (which would generate separate indicators) for different source, destination and network protocol combinations. To this end, detection grouping may be used to establish multiple detection contexts for different aggregations within a single model configuration, without needing to create a separate model configuration instance for different groupings.
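A minimal sketch of such n-tuple grouping follows; the record field names and example values are assumptions:

```python
from collections import defaultdict

def group_indicator_data(records, keys=("source", "destination", "protocol")):
    """Illustrative sketch of detection grouping: aggregate indicator data
    by an n-tuple of entity attributes; each group would generate its own
    indicator."""
    groups = defaultdict(list)
    for record in records:
        groups[tuple(record[k] for k in keys)].append(record)
    return groups

records = [
    {"source": "10.0.0.5", "destination": "10.0.0.9", "protocol": "cip"},
    {"source": "10.0.0.5", "destination": "10.0.0.9", "protocol": "cip"},
    {"source": "10.0.0.5", "destination": "10.0.0.7", "protocol": "cip"},
]
print(len(group_indicator_data(records)))                    # 2 groupings
print(len(group_indicator_data(records, keys=("source",))))  # 1 grouping
```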
Detection filter relates to a filtering process, i.e. the application of one or more inclusion rules to the data received, for data to be accepted as input into subsequent processing, and/or one or more exclusion rules to the data received, for data not to be used, with the outcome of the filter being used to generate a behavioural indicator in a model analysis step. A detection filter may apply inclusion rules and/or exclusion rules as conditional parameters by pre-fetching meta-data from the computer system telemetry or physical process telemetry that is used to generate an indicator.
A detection filter may be applied globally to all indicators within a model, to all indicators within a specific analysis step, or to an individual indicator only. Multiple detection filters with different rules may also be combined, or chained together to combine cascading filter conditions.
The concepts of detection context, detection grouping and detection filtering are to be understood as exemplary processes for the definition and capturing of behavioural indicators. Other methods may be used for defining, monitoring and/or capturing behavioural indicators.
The above-described arrangements and methods may be embodied in a hardware and/or system architecture using one or more of the following exemplary components or subsystems.
Turning to Figure 5, an example system architecture and implementation is described for an exemplary sequential analysis detection model framework.
An example system architecture 400 which houses and executes sequential analysis detection models may comprise, in a processor 420, one or more processor nodes 424 for task-specific node processes, a communication broker 422, and a hybrid database 430. Task nodes may include a collector node (not shown), a data ingestor node 428 and a detection processor node 426. A collector node acquires, via active or passive interfaces, computer and physical system telemetry data from a target network and its host systems. A data ingestor node 428 retrieves collector telemetry from a communication broker and performs parsing, transformation, normalisation, schema assignment, and data enrichment before depositing processed telemetry into a hybrid database for further analysis. A detection processor node 426 contains one or more sequential analysis models (see Figures 1 to 4) that retrieve data from a hybrid database 430, according to a specific model configuration for threat detection processing.
A communication broker 422 may be part of the architecture's communication bus. The purpose of a communication broker 422 is to act as a communication service bus between two or more, or all, of the aforementioned nodes. The communication broker 422 may coordinate, assign or revoke authenticated and node specific input and output namespaces and queues, for pushing and pulling data. A communication broker 422 may queue messages to thereby provide secure unidirectional communication pathways between nodes. In this manner, the communication broker 422 allows for scalable and distributed deployment of nodes outside of a purely centralised system deployment.
For example, a communication broker 422 may be deployed in a distributed cluster of brokers, which may be decentralised at a network level, and where task nodes communicating with brokers are also fully distributed and decentralised at a network level, without requiring a change in the communication or security architecture of the system. As long as nodes can communicate with a broker, fully distributed node-to-node communication is possible within the system.
The hybrid database may be cache-enabled. The same decentralised deployment architecture may also be enabled for node communication and data transfer with the hybrid database, which may itself be deployed as a decentralised cluster, including an in-memory cache system for high-speed detection processing and enrichment lookups, as well as a long-term data historian store for computer and physical system telemetry storage and retrieval.
Where required, at a policy and at a hardware level, brokers, nodes, and hybrid database instances may still be deployed in a traditional architecture, with a centralised processor 420 and one or more collection probes 410. A centralised processor may contain a broker service bus 422, a data ingestor node 428, a detection node 426 and a hybrid database instance 430. A collection probe 410 may include one or more collector nodes 412, 414, 416 designed for specific collection purposes, the output of which it forwards to a processor broker 422.
For example, a collector node may include one or more network capture interfaces 412 monitoring a network capture point 402, for instance for performing deep packet inspection on raw network packets received from a Switch Port Analyser (SPAN), Test Access Point (TAP), or replayed PCAP file to extract network flow and application logs, as well as physical sensor/actuator/input and output data contained in specific operational technology application protocol data payloads.
One or more physical capture nodes 414 may monitor server behaviour, and one or more log capture nodes 416 may monitor client behaviour. The probe 410 may also include one or more network interfaces which actively query a device publishing physical tag/process data, using publisher/subscriber, polling, or native protocol communication APIs, or which receive client connections containing structured or unstructured computer or physical system data payloads from a trusted data source.
The method proposed herein provides an adaptive and, possibly, holistic detection model framework which facilitates detection models for both threat patterns and anomalous activity, in both computer and physical system behaviour. The method can be applied regardless of the underlying system structure or purpose, whilst providing a high degree of reliability that detection models and associated detection indicators are intuitive and explainable by design according to the process of a sequential analysis assessment. In this manner, the proposed method also aims to facilitate efficient interpretation and actionability for human operators.
By way of the above-described variations, it will be understood that the embodiment described with reference to the Figures is an example within the scope of the appended claims, and that various modifications may be made to the invention as defined by the claims.
Claims (15)
- CLAIMS: 1. An analysis method for detection of threats and anomalies in cyber and physical process behaviour, the method comprising: defining a series of time frames for the assessment of behavioural indicators, monitoring, within each time frame, for the existence of one or more behavioural indicators, analysing a plurality of behavioural indicators from within each time frame, each behavioural indicator being analysed to establish an indicator strength value based on a disposition of the behavioural indicator relative to a reference behaviour, determining whether or not the indicator strength value satisfies at least one indicator inclusion condition, wherein behavioural indicators satisfying at least one indicator inclusion condition are included indicators, assigning two or more included indicators to two or more analysis steps, each analysis step comprising at least one included indicator, defining a sequential analysis, analysis step by analysis step, of the two or more analysis steps, within an analysis step, generating an intra-step strength value of one or more included indicators of the analysis step, determining whether or not the intra-step strength value exceeds an inter-step activation condition, and proceeding to a subsequent analysis step of a sequential analysis when the inter-step activation condition is satisfied or exceeded, generating a sequence strength score based on the intra-step strength values, and generating an output indicative of the sequence strength score.
- 2. The threat analysis method according to claim 1, wherein a definition of which of the one or more behavioural indicators, and the order in which they are to be considered in the analysis steps, are defined as an input from a predetermined quantity of behavioural indicators.
- 3. The threat analysis method according to claim 1 or 2, wherein the one or more behavioural indicators are provided as an input from a pool of behavioural indicators, the method comprising selecting from the pool of indicators two or more behavioural indicators.
- 4. The threat analysis method according to claim 3, comprising analysing context information related to the behavioural indicator information of two or more behavioural indicators and grouping two or more behavioural indicators according to similarity of their context information.
- 5. The threat analysis method according to claim 4, wherein the context information is constituted by at least one of a network protocol, a network payload, an operating system process, an operating system file, and physical process meta-data.
- 6. The threat analysis method according to any one of claims 3, 4 or 5, when depending from claim 2, comprising selecting a behavioural indicator from the predetermined quantity and allocating it to the pool of behavioural indicators.
- 7. The threat analysis method according to any one of the preceding claims, comprising applying an obsolescence condition to determine whether or not an indicator strength value is an obsolete indicator value, and removing each obsolete indicator value from the combination of the two or more indicator strength values, to create a combination of retained indicator strength values, and generating a sequence strength score based on the combination of retained indicator strength values.
- 8. The threat analysis method according to any one of the preceding claims, comprising applying a minimum indicator criterion defining a predetermined minimum number of included indicators to be used to generate a sequence strength score.
- 9. The threat analysis method according to claim 8, comprising modulating the minimum indicator criterion and/or the predetermined minimum number of included indicators.
- 10. The threat analysis method according to any one of the preceding claims, comprising modulating the indicator inclusion condition.
- 11. The threat analysis method according to any one of the preceding claims, comprising communicating with one or more data transfer interfaces to process as input data output data from the data transfer interface, and to process the input data as one or more behavioural indicators.
- 12. The threat analysis method according to claim 11, comprising sending a request command to the one or more data transfer interfaces to request output data for use as input data.
- 13. The threat analysis method according to claim 11 or 12, comprising storing the input data in a memory, brokering the input data according to broker filter conditions, before using the input data as behavioural indicator in the sequential analysis.
- 14. A software product configured to process, when executed in a memory of a processor, the method steps of any one of the preceding claims.
- 15. A processor comprising memory and processing instructions to carry out the method according to any one of the preceding claims.