WO2022161069A1 - Anomaly detection method, apparatus, and computer-readable medium for a dynamic control system - Google Patents

Anomaly detection method, apparatus, and computer-readable medium for a dynamic control system

Info

Publication number
WO2022161069A1
Authority
WO
WIPO (PCT)
Prior art keywords
network
control system
dynamic control
time point
current time
Prior art date
Application number
PCT/CN2021/141706
Other languages
English (en)
French (fr)
Inventor
冯程
王帆
李聪超
陈嘉雯
田鹏伟
Original Assignee
西门子股份公司
西门子(中国)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 西门子股份公司, 西门子(中国)有限公司
Priority to EP21922658.6A (EP4266209A4)
Priority to US18/262,630 (US20240045411A1)
Publication of WO2022161069A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05B CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B23/00 Testing or monitoring of control systems or parts thereof
    • G05B23/02 Electric testing or monitoring
    • G05B23/0205 Electric testing or monitoring by means of a monitoring system capable of detecting and responding to faults
    • G05B23/0218 Electric testing or monitoring by means of a monitoring system capable of detecting and responding to faults characterised by the fault detection method dealing with either existing or incipient faults
    • G05B23/0221 Preprocessing measurements, e.g. data collection rate adjustment; Standardization of measurements; Time series or signal analysis, e.g. frequency analysis or wavelets; Trustworthiness of measurements; Indexes therefor; Measurements using easily measured parameters to estimate parameters difficult to measure; Virtual sensor creation; De-noising; Sensor fusion; Unconventional preprocessing inherently present in specific fault detection methods like PCA-based methods
    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05B CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B23/00 Testing or monitoring of control systems or parts thereof
    • G05B23/02 Electric testing or monitoring
    • G05B23/0205 Electric testing or monitoring by means of a monitoring system capable of detecting and responding to faults
    • G05B23/0218 Electric testing or monitoring by means of a monitoring system capable of detecting and responding to faults characterised by the fault detection method dealing with either existing or incipient faults
    • G05B23/0224 Process history based detection method, e.g. whereby history implies the availability of large amounts of data
    • G05B23/024 Quantitative history assessment, e.g. mathematical relationships between available data; Functions therefor; Principal component analysis [PCA]; Partial least square [PLS]; Statistical classifiers, e.g. Bayesian networks, linear regression or correlation analysis; Neural networks
    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05B CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B23/00 Testing or monitoring of control systems or parts thereof
    • G05B23/02 Electric testing or monitoring
    • G05B23/0205 Electric testing or monitoring by means of a monitoring system capable of detecting and responding to faults
    • G05B23/0218 Electric testing or monitoring by means of a monitoring system capable of detecting and responding to faults characterised by the fault detection method dealing with either existing or incipient faults
    • G05B23/0243 Electric testing or monitoring by means of a monitoring system capable of detecting and responding to faults characterised by the fault detection method dealing with either existing or incipient faults model based detection method, e.g. first-principles knowledge model
    • G05B23/0254 Electric testing or monitoring by means of a monitoring system capable of detecting and responding to faults characterised by the fault detection method dealing with either existing or incipient faults model based detection method, e.g. first-principles knowledge model based on a quantitative model, e.g. mathematical relationships between inputs and outputs; functions: observer, Kalman filter, residual calculation, Neural Networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G06N3/0442 Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/0455 Auto-encoder networks; Encoder-decoder networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0495 Quantised networks; Sparse networks; Compressed networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/067 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using optical means
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00 Computing arrangements based on specific mathematical models
    • G06N7/01 Probabilistic graphical models, e.g. probabilistic networks

Definitions

  • Embodiments of the present invention relate to the technical field of abnormality detection, and in particular, to a method, device, and computer-readable medium for abnormality detection of a dynamic control system.
  • Active condition monitoring of dynamic control systems is critical to ensuring the safety and reliability of various industries such as discrete manufacturing, power generation, building asset management, and process industries.
  • anomaly detection systems are usually deployed to monitor the dynamic behavior of the control system, including the dynamic changes in the measured values of sensors and state values of triggers over time.
  • it is still very difficult to establish an effective anomaly detection model with high true positive rate and low false positive rate for dynamic control systems, because:
  • the amount of fault data is usually small, and the anomaly detection model must be able to detect unknown faults.
  • the anomaly detection model should be able to accurately capture the complex dynamic behavior of the system.
  • anomaly detection must accurately detect anomalies with unknown amounts of sensor noise and model errors at random points in time.
  • Current anomaly detection methods for dynamic control systems include residual-based anomaly detection methods, density-based anomaly detection methods, one-class classification-based anomaly detection methods, and rule-based anomaly detection methods.
  • Residual-based anomaly detection methods rely on either predictive models, such as neural network-based regression models (see “Long Short-Term Memory”, Hochreiter, Sepp, and Jürgen Schmidhuber, Neural Computation, pp. 1735-1780, 1997), which predict sensor measurements, or reconstruction models, such as autoencoders (see “A Fast Learning Algorithm for Deep Belief Nets”, Hinton, Geoffrey E., Simon Osindero, and Yee-Whye Teh, Neural Computation, pp. 1527-1554, 2006, and “Auto-Encoding Variational Bayes”, Kingma, Diederik P., and Max Welling, arXiv preprint, 2013), which compress sensor measurements into low-dimensional features and then reconstruct them.
  • the predicted or reconstructed measurements are then compared to measurements obtained from real-time monitoring to generate residuals.
  • An anomaly is considered to be detected if the residual exceeds a preset threshold.
  • strict thresholds between normal and abnormal sensor measurements are difficult to define due to the amount of sensor noise and unknown prediction or reconstruction errors at each point in time. Consequently, the performance of residual-based methods typically deteriorates when the sensor's measurements are disturbed by large noise or the errors in model predictions or reconstructions are unstable.
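The residual-based scheme described above can be illustrated with a minimal sketch. The function name `residual_anomalies`, the use of the Euclidean norm as the residual, and the toy data are assumptions made for illustration only; they are not taken from the cited methods.

```python
import numpy as np

def residual_anomalies(observed, predicted, threshold):
    """Flag time points whose residual norm exceeds a fixed, preset threshold."""
    # One residual per time point: Euclidean distance between the observed
    # and the predicted (or reconstructed) sensor measurement vectors.
    residuals = np.linalg.norm(observed - predicted, axis=1)
    return residuals > threshold

# Toy data: three time points, two sensors (illustrative values only).
observed = np.array([[1.0, 2.0], [1.1, 2.1], [5.0, 2.0]])
predicted = np.array([[1.0, 2.0], [1.0, 2.0], [1.0, 2.0]])
flags = residual_anomalies(observed, predicted, threshold=1.0)
```

Because the threshold is a single fixed number, the sketch also shows the weakness noted above: any change in sensor noise or model error shifts the residuals without shifting the threshold.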
  • the density-based anomaly detection method models the probability distribution of the measurement value of the sensor at each time point, and when the likelihood value of the real-time monitoring measurement value is lower than a preset threshold, it is considered that an anomaly is detected.
  • Density-based anomaly detection methods include Kalman filtering algorithms (see Kalman Filtering, C. K. Chui and G. Chen, Springer, 2017; and “The Unscented Kalman Filter for Nonlinear Estimation”, Wan, Eric A., and Rudolph van der Merwe, 2000, Proceedings of the IEEE 2000 Adaptive Systems for Signal Processing, Communications, and Control Symposium (No.
  • One-class classification-based anomaly detection methods include the one-class Support Vector Machine (SVM) (see “One-Class SVMs for Document Classification”, Manevitz, Larry M., and Malik Yousef, Journal of Machine Learning Research, pp. 139-154, 2001) and the isolation forest (see “Isolation Forest”, Liu, Fei Tony, Kai Ming Ting, and Zhi-Hua Zhou, Eighth IEEE International Conference on Data Mining, 2008), which can be naturally applied to anomaly detection of dynamic control systems and have good interpretability.
  • In rule-based anomaly detection methods, the state conditions that the system must maintain are obtained from prior knowledge. Any physical process value monitored in real time that violates these rules is classified as anomalous.
  • these rules are defined by domain experts during the system design phase, and manual processing is time-consuming and labor-intensive.
  • rule-based anomaly detection methods are often limited by the inability to discover enough rules.
  • Embodiments of the present invention provide an abnormality detection method, device, and computer-readable medium for a dynamic control system. First, system identification of the dynamic control system is performed using a specially designed neural network structure, so that the identification process is completed automatically through the training of the neural network.
  • The use of a neural network significantly improves the generality of system identification and can capture the highly nonlinear dynamic behavior of the dynamic control system, while also overcoming the curse of dimensionality that models with general expressive power commonly suffer from. Anomalies are then detected from the likelihood of the sensor measurements observed in real time, using Bayesian filtering to take into account the uncertainties arising from sensor noise and model errors.
  • a method for detecting anomalies in a dynamic control system is provided.
  • the method can be implemented by a computer program.
  • the g network is used to initialize the hidden state distribution of a dynamic control system;
  • the measured value of the sensor and the state value of the trigger in the dynamic control system at the current time point t, obtained by real-time monitoring, are received;
  • at least one first sampling point is input into the f network to predict at least one second sampling point, wherein the at least one first sampling point is used to represent the hidden state distribution of the dynamic control system at the adjacent time point t-1 before the current time point t, and the at least one second sampling point is used to represent the prior hidden state distribution of the dynamic control system at the current time point t;
  • using the h network to map the at least one second sampling point to the sensor measurement value space, to predict the probability distribution of the sensor measurement value of the dynamic control system at the current time point t;
  • the measured value obtained by real-time monitoring is compared with the predicted probability distribution to determine whether the dynamic control system is abnormal;
  • the g network, f network and h network are sub-networks of the neural network used to represent the dynamic distribution of the dynamic control system;
  • the g network is a feedforward network used to encode the measured value of the sensor into a low-dimensional hidden state vector;
  • the f network encodes the measured value of the sensor and the state value of the trigger in the sliding window into a vector, and uses the hidden state vector at the current time point encoded by the g network to predict the hidden state vector at the next time point;
  • the h network is a feedforward network that decodes the predicted hidden state vector at the next time point into the measured value of the sensor, and decodes the low-dimensional hidden state vector at the current time point into the measured value of the sensor;
  • the neural network is trained using the measured value of the sensor obtained under the normal operating conditions of the dynamic control system.
  • an abnormality detection device for a dynamic control system including:
  • an initialization module configured to initialize the hidden state distribution of a dynamic control system using the g-network
  • a data acquisition module configured to receive the measured value of the sensor in the dynamic control system and the state value of the trigger at the current time point t obtained by real-time monitoring;
  • a prediction module configured to: input at least one first sampling point into the f network to predict at least one second sampling point, wherein the at least one first sampling point is used to represent the hidden state distribution of the dynamic control system at the adjacent time point t-1 before the current time point t, and the at least one second sampling point is used to represent the prior hidden state distribution of the dynamic control system at the current time point t; and use the h network to map the at least one second sampling point to the sensor measurement value space, to predict the probability distribution of the sensor measurement values of the dynamic control system at the current time point t;
  • an abnormality judgment module configured to judge whether the dynamic control system is abnormal by comparing the measured value obtained by real-time monitoring with the predicted probability distribution
  • the g-network, f-network and h-network are sub-networks of the neural network used to represent the dynamic distribution of the dynamic control system; the g network is a feedforward network used to encode the measured value of the sensor into a low-dimensional hidden state vector; the f network encodes the measured value of the sensor and the state value of the trigger in the sliding window into a vector, and uses the hidden state vector at the current time point encoded by the g network to predict the hidden state vector at the next time point.
  • the h network is a feedforward network that decodes the predicted hidden state vector at the next time point into the measured value of the sensor, and decodes the low-dimensional hidden state vector at the current time point into the measured value of the sensor; the neural network is obtained by training with the measured values of the sensor obtained under normal operating conditions of the dynamic control system.
  • a third aspect provides an abnormality detection device for a dynamic control system, comprising: at least one memory configured to store computer-readable code; and at least one processor configured to invoke the computer-readable code to execute the steps provided in the first aspect.
  • a computer-readable medium is provided, storing computer-readable instructions which, when executed by a processor, cause the processor to execute the steps of the method provided in the first aspect.
  • the posterior hidden state distribution of the dynamic control system at the current time point t can also be updated, in order to obtain the first sampling point at the adjacent time point t+1 after the current time point t. Therefore, the uncertainty of the hidden state of the system can be tracked in real time, and the reliability of anomaly monitoring can be increased.
  • the loss function used in the neural network training minimizes the sum of the reconstruction error and the prediction error of the measurement values of the sensors at each time point used for training. This end-to-end training method makes our neural network very easy to implement in real-world applications.
  • the at least one first sampling point and the at least one second sampling point are both sigma sampling points. In this way, the probability distribution can be expressed efficiently with the least sampling points, and the operation efficiency of the method can be greatly improved.
  • FIG. 1 is a schematic structural diagram of a neural network used for system identification in an embodiment of the present invention.
  • FIG. 2 is a comparison diagram of the effect of anomaly detection using an embodiment of the present invention and an existing method.
  • FIG. 3 is a schematic structural diagram of an abnormality detection apparatus provided by an embodiment of the present invention.
  • FIG. 4 is a flowchart of an abnormality detection method provided by an embodiment of the present invention.
  • the term “including” and variations thereof represent open-ended terms meaning “including but not limited to”.
  • the term “based on” means “based at least in part on”.
  • the terms “one embodiment” and “an embodiment” mean “at least one embodiment.”
  • the term “another embodiment” means “at least one other embodiment.”
  • the terms “first”, “second”, etc. may refer to different or the same objects. Other definitions, whether explicit or implicit, may be included below. The definition of a term is consistent throughout the specification unless the context clearly dictates otherwise.
  • control system is divided into static control system and dynamic control system. The two have the following differences:
  • the state variables of a dynamic control system change significantly with time and are a function of time.
  • the state variables of the static control system change little with time, which is difficult to observe and measure.
  • a dynamic control system consists of a variety of variables or parameters, which are interrelated and in constant motion.
  • the output of a static control system at any time is only related to the input at that time, and has nothing to do with the input before or after that time.
  • the final state of a dynamic control system may be either an equilibrium state or a non-equilibrium state.
  • the final state of a static system is the equilibrium state.
  • the dynamic control system may also have the characteristics of highly nonlinear dynamics and unknown amount of sensor noise and model error, which is why the aforementioned current anomaly detection methods are difficult to apply to dynamic control systems.
  • System identification is to determine the mathematical model that describes the behavior of the system according to the input and output time functions of the system.
  • the purpose of establishing a mathematical model through system identification is to estimate the important parameters that characterize the behavior of the system, and to establish a model that can imitate the behavior of the real system.
  • A time series is an ordered series of data, usually sampled at equal time intervals. If the data are not equally spaced, the timestamp of each data point is typically labeled.
  • FIG. 1 shows the structure of a neural network 10 used for system identification in an embodiment of the present invention.
  • the dynamic control system includes some sensors and some triggers.
  • the following neural network structure is proposed here to obtain the dynamic changes of the time series of the dynamic control system.
  • the neural network 10 may include three sub-networks, called g-network, f-network and h-network, respectively.
  • The g network is a parameterized feed-forward network that takes the measurement value $x_{t-1}$ of the sensors at time point t-1 as input and encodes these measurement values into a low-dimensional hidden state vector $z_{t-1}$.
  • The f network, with its own parameters, takes the measurement values of the sensors and the state values of the triggers in a sliding window of length $l$ as input, and can use a Long Short-Term Memory (LSTM) neural network to encode them as a latent vector $h_{t-1}$. Further, with $h_{t-1}$ as the learned context of the time series, the f network also takes the hidden state vector $z_{t-1}$ as input, and then uses a feedforward network to predict the hidden state vector $z_t$ at the next time point.
  • The h network, also parameterized, is a feedforward network that takes a hidden state vector as input and decodes it into the corresponding sensor measurement value. It should be noted that the two h networks in Figure 1 can share the same set of weights.
  • the f network, the g network, and the h network may be a simulation model or a differential equation solver, etc. during implementation, and the specific implementation manner is not limited.
  • The entire neural network 10 takes the sensor measurement value $x_{t-1}$ at time point t-1, the sensor measurement values $x_{t-l:t-1}$ in the sliding window and the trigger state values $u_{t-l:t-1}$ in the sliding window as input, and outputs the sensor measurements decoded from the hidden state vectors $z_{t-1}$ and $z_t$.
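The composition of the three sub-networks can be written compactly as follows. This is a sketch assembled from the descriptions of the g, f and h networks given here; the hat notation for the decoded measurements is an assumption introduced for illustration.

```latex
\begin{aligned}
z_{t-1} &= g(x_{t-1}) \\
z_{t} &= f\left(z_{t-1},\, x_{t-l:t-1},\, u_{t-l:t-1}\right) \\
\hat{x}_{t-1} &= h(z_{t-1}), \qquad \hat{x}_{t} = h(z_{t})
\end{aligned}
```

Here $\hat{x}_{t-1}$ is the reconstruction of the current measurement and $\hat{x}_{t}$ is the prediction of the next one, both produced by the same h network.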
  • In the loss function, the first two terms are the reconstruction error and the prediction error of the sensor's measurement value, respectively, and the third term is a smoothing term that keeps two temporally consecutive hidden state vectors close to each other.
  • Three hyperparameters weight these three terms.
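A minimal sketch of a loss with this three-term structure is given below. The squared Euclidean norm, the batch averaging, the function name `identification_loss` and the weight names `w_recon`, `w_pred` and `w_smooth` are assumptions made for illustration; only the three-term structure comes from the description.

```python
import numpy as np

def identification_loss(x_prev, x_next, x_prev_hat, x_next_hat, z_prev, z_next,
                        w_recon=1.0, w_pred=1.0, w_smooth=0.1):
    """Weighted sum of reconstruction, prediction and smoothness terms.

    All arrays have shape (batch, features). The squared L2 norm is an
    assumed error measure, and the default weights are placeholders.
    """
    # Reconstruction error: h(g(x_{t-1})) against x_{t-1}.
    recon = np.mean(np.sum((x_prev - x_prev_hat) ** 2, axis=1))
    # Prediction error: h(f(...)) against x_t.
    pred = np.mean(np.sum((x_next - x_next_hat) ** 2, axis=1))
    # Smoothness: consecutive hidden state vectors should stay close.
    smooth = np.mean(np.sum((z_next - z_prev) ** 2, axis=1))
    return w_recon * recon + w_pred * pred + w_smooth * smooth

# Toy check with hand-computable values: 1*1 + 1*4 + 0.1*9.
loss = identification_loss(
    x_prev=np.array([[1.0, 0.0]]), x_next=np.array([[0.0, 2.0]]),
    x_prev_hat=np.array([[0.0, 0.0]]), x_next_hat=np.array([[0.0, 0.0]]),
    z_prev=np.array([[0.0]]), z_next=np.array([[3.0]]),
)
```

In an actual training loop the same expression would be written in an automatic-differentiation framework so that all three sub-networks are trained end to end.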
  • the dynamic behavior of the dynamic control system can be expressed as follows:
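One state-space form consistent with the surrounding definitions of the f and h networks and of the covariance matrices Q and R is sketched below. The additive zero-mean Gaussian noise terms and the symbols $w_t$ and $v_t$ are assumptions introduced for illustration.

```latex
\begin{aligned}
z_{t} &= f\left(z_{t-1},\, x_{t-l:t-1},\, u_{t-l:t-1}\right) + w_{t}, & w_{t} &\sim \mathcal{N}(0, Q) \\
x_{t} &= h(z_{t}) + v_{t}, & v_{t} &\sim \mathcal{N}(0, R)
\end{aligned}
```

Under this reading, the first line is the state transition in hidden-state space and the second line is the measurement function.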
  • Q is the covariance matrix of prediction errors, estimated from empirical values based on prediction errors on the validation dataset obtained from:
  • R is the covariance matrix of the reconstruction error, estimated from the reconstruction error based on the same validation dataset obtained from:
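A minimal sketch of how such covariance matrices could be estimated empirically is given below. It assumes that Q is taken over hidden-state prediction residuals (encoded state minus predicted state) and R over measurement reconstruction residuals, both on the validation set; the function name and the synthetic residuals are hypothetical.

```python
import numpy as np

def estimate_noise_covariances(z_encoded, z_predicted, x_observed, x_reconstructed):
    """Empirical covariances of prediction and reconstruction residuals.

    z_encoded:       g(x_t) on the validation set, shape (N, hidden_dim)
    z_predicted:     f(...) output for the same time points, shape (N, hidden_dim)
    x_observed:      validation measurements, shape (N, num_sensors)
    x_reconstructed: h(g(x_t)), shape (N, num_sensors)
    """
    pred_err = z_encoded - z_predicted
    recon_err = x_observed - x_reconstructed
    # rowvar=False: each column is a variable, each row an observation.
    Q = np.atleast_2d(np.cov(pred_err, rowvar=False))
    R = np.atleast_2d(np.cov(recon_err, rowvar=False))
    return Q, R

# Synthetic residuals stand in for real network outputs (illustrative only).
rng = np.random.default_rng(0)
z_enc = rng.normal(size=(200, 3))
x_obs = rng.normal(size=(200, 5))
Q, R = estimate_noise_covariances(z_enc, np.zeros((200, 3)), x_obs, np.zeros((200, 5)))
```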
  • Bayesian filtering can be used for anomaly detection to iteratively estimate the time-varying probability distribution of the hidden states of the dynamic control system.
  • $z_t$ and $P_t$ can be used to track the probability distribution of the hidden states of the dynamic control system (hereinafter referred to as “hidden state distribution”), where $z_t$ represents the mean vector and $P_t$ represents the covariance matrix of the hidden states at time point t.
  • the whole process is divided into an initial step, a prediction step, an update step, and an anomaly detection step.
  • the mean and covariance of the prior of the hidden state distribution at time point t are calculated.
  • The corresponding weights of these sigma points are $W_m$ and $W_c$.
  • The sigma function uses Van der Merwe's scaled sigma point algorithm (see “Sigma-Point Kalman Filters for Probabilistic Inference in Dynamic State-Space Models”, Van der Merwe, 2004).
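A minimal sketch of scaled sigma point generation in the style of Van der Merwe's algorithm is given below. The function name and the default values of `alpha`, `beta` and `kappa` are illustrative assumptions, not values taken from this embodiment.

```python
import numpy as np

def scaled_sigma_points(mean, cov, alpha=1e-3, beta=2.0, kappa=0.0):
    """Return 2n+1 sigma points with mean weights w_m and covariance weights w_c."""
    n = mean.shape[0]
    lam = alpha ** 2 * (n + kappa) - n
    # Matrix square root of (n + lam) * cov; its columns are the offsets.
    sqrt_cov = np.linalg.cholesky((n + lam) * cov)
    points = np.empty((2 * n + 1, n))
    points[0] = mean
    for i in range(n):
        points[i + 1] = mean + sqrt_cov[:, i]
        points[n + i + 1] = mean - sqrt_cov[:, i]
    # All non-central points share one weight; the central point differs.
    w_m = np.full(2 * n + 1, 1.0 / (2.0 * (n + lam)))
    w_c = w_m.copy()
    w_m[0] = lam / (n + lam)
    w_c[0] = lam / (n + lam) + (1.0 - alpha ** 2 + beta)
    return points, w_m, w_c

# Example: a two-dimensional hidden state distribution (illustrative values).
mean = np.array([1.0, -2.0])
cov = np.array([[2.0, 0.3], [0.3, 1.0]])
points, w_m, w_c = scaled_sigma_points(mean, cov)
```

For an n-dimensional hidden state this yields only 2n+1 points, which is why a small number of sigma points suffices to carry the mean and covariance through the networks.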
  • the sigma points are selected so that only a small number of sigma points can represent the hidden state distribution of the dynamic control system at time t-1. These selected sigma points can be passed through the f network to predict at least one second sample point (an example here is a sigma point) such that:
  • the mean and covariance of the prior hidden state distribution at time point t can be calculated by the unscented transform:
  • the mean and covariance (referred to as $z_t$ and $P_t$) of the posterior hidden state distribution of the dynamic control system at time point t are calculated.
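A minimal sketch of such a transform is given below, assuming the standard weighted-mean and weighted-covariance form with an additive noise covariance. The function name is hypothetical, and the linear map used in the example merely stands in for the f network.

```python
import numpy as np

def unscented_transform(points, w_m, w_c, noise_cov):
    """Recover a mean and covariance from propagated sigma points."""
    mean = w_m @ points
    diff = points - mean
    # Weighted outer products of the deviations, plus additive noise.
    cov = (w_c[:, None] * diff).T @ diff + noise_cov
    return mean, cov

# One-dimensional example: three sigma points for a zero-mean, unit-variance state.
sigma = np.array([[0.0], [1.0], [-1.0]])
weights = np.array([0.0, 0.5, 0.5])
propagated = 2.0 * sigma + 3.0  # stand-in for passing the points through f
prior_mean, prior_cov = unscented_transform(
    propagated, weights, weights, noise_cov=np.array([[0.5]])
)
```

For the linear stand-in map the result matches the closed form: mean 2*0+3 and covariance 2*1*2 plus the noise term.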
  • the h-network is used to map the sigma point Y of the hidden state distribution prior of the dynamic control system to the sensor measurement value space.
  • the mean and covariance of these measurement-space sigma points are calculated using the unscented transform.
  • the Kalman gain can be obtained by the following formula:
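A minimal sketch of the update step is given below, assuming the standard unscented Kalman filter form in which the gain is the cross-covariance between the hidden-state sigma points and the measurement sigma points multiplied by the inverse of the predicted measurement covariance. All function and argument names are hypothetical.

```python
import numpy as np

def ukf_update(state_points, meas_points, w_c, state_mean, state_cov,
               meas_mean, meas_cov, observation):
    """Kalman gain and posterior mean/covariance of the hidden state."""
    # Cross-covariance between hidden-state and measurement sigma points.
    cross_cov = (w_c[:, None] * (state_points - state_mean)).T @ (meas_points - meas_mean)
    gain = cross_cov @ np.linalg.inv(meas_cov)
    post_mean = state_mean + gain @ (observation - meas_mean)
    post_cov = state_cov - gain @ meas_cov @ gain.T
    return gain, post_mean, post_cov

# One-dimensional example with an identity measurement map and unit noise.
pts = np.array([[0.0], [1.0], [-1.0]])
w = np.array([0.0, 0.5, 0.5])
gain, post_mean, post_cov = ukf_update(
    state_points=pts, meas_points=pts, w_c=w,
    state_mean=np.array([0.0]), state_cov=np.array([[1.0]]),
    meas_mean=np.array([0.0]), meas_cov=np.array([[2.0]]),
    observation=np.array([2.0]),
)
```

In the example the prior variance and the measurement noise are equal, so the posterior mean lies halfway between the prior mean and the observation.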
  • anomalies can be detected by calculating the Mahalanobis distance between the measured value obtained by real-time monitoring and the predicted probability distribution:
  • If the Mahalanobis distance exceeds a preset threshold, it means that these real-time monitoring measurements are unlikely to occur even if the sensor noise and the prediction noise are taken into account, i.e. an anomaly is detected.
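The detection rule can be sketched as follows, assuming the usual definition of the Mahalanobis distance with respect to the predicted measurement mean and covariance. The function names and the threshold value are illustrative assumptions.

```python
import numpy as np

def mahalanobis_distance(observation, pred_mean, pred_cov):
    """Distance of an observation from a predicted Gaussian distribution."""
    delta = observation - pred_mean
    # Solve pred_cov @ y = delta instead of forming an explicit inverse.
    return float(np.sqrt(delta @ np.linalg.solve(pred_cov, delta)))

def is_anomalous(observation, pred_mean, pred_cov, threshold):
    return mahalanobis_distance(observation, pred_mean, pred_cov) > threshold

# Two sensors with different predicted variances (illustrative values).
pred_mean = np.array([0.0, 0.0])
pred_cov = np.diag([4.0, 1.0])
d_normal = mahalanobis_distance(np.array([2.0, 0.0]), pred_mean, pred_cov)
d_fault = mahalanobis_distance(np.array([0.0, 3.0]), pred_mean, pred_cov)
flags = [is_anomalous(x, pred_mean, pred_cov, threshold=2.0)
         for x in (np.array([2.0, 0.0]), np.array([0.0, 3.0]))]
```

Unlike a fixed residual threshold, the same deviation counts for less on a sensor whose predicted variance is large, which is how noise and model uncertainty enter the decision.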
  • the method provided by the embodiment of the present invention is used to detect the abnormality in the data set of the pump control system.
  • System data consists of measurements from 52 sensors sampled every minute over 5 months. The dataset contains 7 failures, each lasting from several consecutive hours to several days.
  • The dataset is divided into a training set, a validation set and a test set in a ratio of 3:1:1. All 7 failures occurred within the time span of the test set, which means that the training and validation sets contain only data recorded under normal conditions.
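A minimal sketch of a 3:1:1 split is given below. It assumes the split is chronological (earliest data for training, latest for testing), which is consistent with all failures falling inside the test set but is not stated explicitly; the function name and the placeholder array are hypothetical.

```python
import numpy as np

def chronological_split(series, ratios=(3, 1, 1)):
    """Split a time-ordered array into train/validation/test without shuffling."""
    total = sum(ratios)
    n = len(series)
    train_end = n * ratios[0] // total
    val_end = n * (ratios[0] + ratios[1]) // total
    return series[:train_end], series[train_end:val_end], series[val_end:]

# Placeholder data: 1000 time steps of 52 sensor channels.
data = np.arange(1000 * 52).reshape(1000, 52)
train, val, test = chronological_split(data)
```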
  • Fig. 2 compares the anomaly detection performance of the embodiment of the present invention with that of several other methods mentioned above (isolation forest, Bayesian estimation algorithm, and autoencoders including the sparse autoencoder, variational autoencoder and LSTM autoencoder). All baseline models are trained using the same dataset.
  • No. 1 corresponds to the embodiment of the present invention
  • No. 2 corresponds to the isolation forest
  • No. 3 corresponds to Seq2SeqLSTM
  • No. 4 corresponds to a dilated convolutional neural network (DilatedCNN)
  • No. 5 corresponds to a sparse auto-encoder
  • No. 6 corresponds to a variational auto-encoder
  • No. 7 corresponds to the LSTM autoencoder
  • No. 8 corresponds to the Bayesian estimation algorithm.
  • the abnormality detection apparatus 30 provided in the embodiment of the present invention may be implemented as a network of computer processors, so as to execute the abnormality detection method 400 of the dynamic control system in the embodiment of the present invention.
  • the anomaly detection device 30 may also be a single computer as shown in FIG. 3, including at least one memory 301 including a computer-readable medium such as random access memory (RAM).
  • Apparatus 30 also includes at least one processor 302 coupled with at least one memory 301 .
  • Computer-executable instructions are stored in at least one memory 301 and, when executed by at least one processor 302, can cause at least one processor 302 to perform the steps described herein.
  • the at least one processor 302 may include a microprocessor, an application specific integrated circuit (ASIC), a digital signal processor (DSP), a central processing unit (CPU), a graphics processing unit (GPU), a state machine, and the like.
  • Examples of computer-readable media include, but are not limited to, floppy disks, CD-ROMs, magnetic disks, memory chips, ROM, RAM, ASICs, configured processors, all optical media, all magnetic tapes or other magnetic media, or any other medium from which a computer processor can read instructions.
  • Various other forms of computer-readable media can transmit or carry instructions to a computer, including routers, private or public networks, or other wired and wireless transmission devices or channels. Instructions can include code in any computer programming language, including C, C++, C#, Visual Basic, Java, and JavaScript.
  • The at least one memory 301 shown in FIG. 3 may contain the abnormality detection program 31 of the dynamic control system, which causes the at least one processor 302 to execute the abnormality detection method 400 for the dynamic control system described in the embodiment of the present invention.
  • the abnormality detection program 31 of the dynamic control system may include:
  • an initialization module 311 configured to initialize the hidden state distribution of a dynamic control system using the g-network in the neural network 10 shown in FIG. 1;
  • a data acquisition module 312 configured to receive the measured value of the sensor and the state value of the trigger in the dynamic control system at the current time point t obtained by real-time monitoring;
  • a prediction module 313, configured to: input at least one first sampling point into the f network in the neural network 10 shown in FIG. 1 to obtain at least one second sampling point, wherein the at least one first sampling point is used to represent the hidden state distribution of the dynamic control system at the adjacent time point t-1 before the current time point t, and the at least one second sampling point is used to represent the prior hidden state distribution of the dynamic control system at the current time point t; and use the h network in the neural network 10 to map the at least one second sampling point to the sensor measurement value space, to predict the probability distribution of the sensor measurement values of the dynamic control system at the current time point t;
  • an abnormality judgment module 314 configured to judge whether there is an abnormality in the dynamic control system by comparing the measured value obtained by real-time monitoring with the predicted probability distribution;
  • the anomaly detection program 31 may further include an update module 315 configured to update the posterior hidden state distribution of the dynamic control system at the current time point t, in order to obtain the first sampling point at the adjacent time point t+1 after the current time point t.
  • the loss function used in the training of the neural network minimizes the sum of the reconstruction error and the prediction error of the measurement value of the sensor at each time point used for training.
  • the at least one first sampling point and the at least one second sampling point are both sigma sampling points.
  • the abnormality detection apparatus 30 may further include a communication module 303, which is connected to at least one processor 302 and at least one memory 301 through a bus, and is used for the abnormality detection apparatus 30 to communicate with external devices.
  • embodiments of the present invention may include apparatuses having architectures different from those shown in FIG. 3 .
  • the above architecture is only exemplary, and is used to explain the method 400 provided by the embodiment of the present invention.
  • the above modules can also be regarded as various functional modules implemented by hardware, which are used to realize various functions involved in the abnormality detection device 30 executing the abnormality detection method of the dynamic control system.
  • the control logic of each process involved in the method is burned in advance into chips such as a Field-Programmable Gate Array (FPGA) or a Complex Programmable Logic Device (CPLD), and these chips or devices execute the functions of the above modules; the specific implementation can be determined according to engineering practice.
  • the following describes an abnormality detection method 400 for a dynamic control system provided by an embodiment of the present invention with reference to FIG. 4 .
  • the method may include the following steps:
  • - S404 Use the h network in the neural network 10 to map at least one second sampling point to the sensor measurement value space, so as to predict the probability distribution of the sensor measurement values of the dynamic control system at the current time point t;
  • -S405 Determine whether the dynamic control system is abnormal by comparing the measured value obtained by real-time monitoring with the probability distribution obtained by prediction.
  • the method 400 may further include step S406: updating the posterior hidden state distribution of the dynamic control system at the current time point t, for obtaining the first sampling point at the adjacent time point t+1 after the current time point t.
  • the loss function used in the training of the neural network 10 minimizes the sum of the reconstruction error and the prediction error of the measurement values of the sensors at each time point used for training.
  • At least one first sampling point and at least one second sampling point are both sigma sampling points.
  • embodiments of the present invention further provide a computer-readable medium, where computer-readable instructions are stored on the computer-readable medium, and when the computer-readable instructions are executed by a processor, the processor executes the foregoing dynamic control system anomaly detection method.
  • Examples of computer-readable media include floppy disks, hard disks, magneto-optical disks, optical disks (e.g., CD-ROM, CD-R, CD-RW, DVD-ROM, DVD-RAM, DVD-RW, DVD+RW), magnetic tape, non-volatile memory cards and ROMs.
  • the computer readable instructions may be downloaded from a server computer or cloud over a communications network.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Automation & Control Theory (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Neurology (AREA)
  • Testing And Monitoring For Control Systems (AREA)

Abstract

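Embodiments of the present invention relate to an anomaly detection method and apparatus for a dynamic control system, and a computer-readable medium. The method comprises: initializing a hidden state distribution of the dynamic control system using a g network in a neural network; receiving measured values of sensors and state values of triggers at a current time point t obtained by real-time monitoring; inputting at least one first sampling point into an f network in the neural network to predict at least one second sampling point, the first sampling point representing the hidden state distribution of the dynamic control system at the adjacent time point t-1 before the current time point t, and the second sampling point representing the prior hidden state distribution of the dynamic control system at the current time point t; mapping the second sampling point to the sensor measurement space using an h network in the neural network, to predict a probability distribution of the sensor measurements of the dynamic control system at the current time point t; and determining whether the system is anomalous by comparing the measured values obtained by real-time monitoring with the predicted probability distribution.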

Description

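Anomaly detection method and apparatus for a dynamic control system, and computer-readable medium
Technical Field
Embodiments of the present invention relate to the technical field of anomaly detection, and in particular to an anomaly detection method and apparatus for a dynamic control system, and a computer-readable medium.
Background
Active condition monitoring of dynamic control systems is essential for ensuring safety and reliability in many industries (for example, discrete manufacturing, power generation, building asset management, and the process industries). To detect operational faults in advance as part of predictive maintenance, an anomaly detection system is usually deployed to monitor the dynamic behavior of the control system, including how sensor measurements and trigger state values change over time. In practice, however, building an effective anomaly detection model with a high true-positive rate and a low false-positive rate for a dynamic control system remains very difficult, for the following reasons:
First, fault data are usually scarce, so the anomaly detection model must be able to detect previously unseen faults.
Second, for control systems with highly nonlinear dynamics, the anomaly detection model must accurately capture the complex dynamic behavior of the system.
Third, anomalies must be detected accurately even though the amount of sensor noise and the model error at any given time point are unknown.
Existing anomaly detection methods for dynamic control systems include residual-based methods, density-based methods, methods based on one-class classification, and rule-based methods.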
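Residual-based anomaly detection methods rely either on a prediction model, such as a neural-network-based regression model (see "Long Short-Term Memory", Hochreiter, Sepp and Jürgen Schmidhuber, Neural Computation, 1997, pages 1735 to 1780), or on a reconstruction model, such as an autoencoder that compresses the sensor measurements into low-dimensional features and reconstructs them (see "Fast Learning Algorithm for Deep Belief Nets", Hinton, Geoffrey E., Simon Osindero and Yee-Whye Teh, Neural Computation, 2006, pages 1527 to 1554; and "Auto-Encoding Variational Bayes", Kingma, Diederik P. and Max Welling, arXiv preprint, 2013). The predicted or reconstructed measurements are then compared with the measurements obtained by real-time monitoring to produce residuals, and an anomaly is considered detected if a residual exceeds a preset threshold. In practice, because the amount of sensor noise and the prediction or reconstruction error at each time point are unknown, a strict threshold separating normal from anomalous sensor measurements is hard to define. The performance of residual-based methods therefore usually degrades when the sensor measurements are heavily corrupted by noise or when the model's prediction or reconstruction error is unstable.
Density-based anomaly detection methods model the probability distribution of the sensor measurements at each time point, and an anomaly is considered detected when the likelihood of the measurements observed in real time falls below a preset threshold. Density-based methods include Kalman filtering algorithms (see "Kalman Filter", C.K. Chui and G. Chen, Springer, 2017; and "Unscented Kalman Filter for Nonlinear Estimation", Wan, Eric A. and Rudolph Van Der Merwe, IEEE Adaptive Systems for Signal Processing, Communications, and Control Symposium, 2000 (Cat. No. 00EX373)) and Bayesian estimation algorithms (see "Bayes Sensor Estimation for Machine State Monitoring", Chao Yuan and Claus Neubauer, IEEE International Conference on Acoustics, Speech and Signal Processing, 2007, pages 517 to 520; and "Robust Sensor Estimation Using Time Information", Chao Yuan and Claus Neubauer, IEEE International Conference on Acoustics, Speech and Signal Processing, 2008, pages 2077 to 2080). Although density-based methods are generally more robust to sensor noise than residual-based methods, they still have limitations that restrict their practical use. For example, before Kalman filtering can be applied, a mathematical model of the physical dynamic process usually has to be built through system identification, which is difficult in practice. In addition, many density-based methods require substantial prior knowledge to model the physical dynamic process and/or the distribution of the sensor measurements, and their performance may degrade when the physical dynamic process is highly nonlinear.
Anomaly detection methods based on one-class classification, such as the one-class support vector machine (SVM) (see "One-Class Support Vector Machine for Document Classification", Manevitz, Larry M. and Malik Yousef, Journal of Machine Learning Research, 2001, pages 139 to 154) and the isolation forest (see "Isolation Forest", Liu, Fei Tony, Kai Ming Ting and Zhi-Hua Zhou, Eighth IEEE International Conference on Data Mining, 2008), can be applied naturally to anomaly detection in dynamic control systems and are comparatively interpretable. However, because of the curse of dimensionality and the highly nonlinear system dynamics, these methods are no longer suitable for today's dynamic control systems.
Rule-based anomaly detection methods capture, from prior knowledge, the state conditions that the system must always satisfy. Any physical process value observed in real time that violates these rules is classified as an anomaly. Typically the rules are defined by domain experts during the system design phase, and this manual work is time-consuming and laborious. Moreover, many latent rules are hard for humans to discover, particularly rules that span subsystems. Rule-based anomaly detection methods are therefore often limited by the inability to discover enough rules.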
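Summary of the Invention
Embodiments of the present invention provide an anomaly detection method and apparatus for a dynamic control system, and a computer-readable medium. First, a specially designed neural network structure is used to perform system identification of the dynamic control system, so that the system identification process is completed automatically by training the neural network. Using a neural network significantly improves the generality of system identification and makes it possible to capture the highly nonlinear dynamic behavior of the dynamic control system, while also avoiding the curse of dimensionality that commonly afflicts models with limited expressive power. Then, to account for the uncertainty arising from sensor noise and model error, Bayesian filtering is used to detect anomalies based on the likelihood of the sensor measurements observed in real time.
According to a first aspect, an anomaly detection method for a dynamic control system is provided, which may be implemented by a computer program. In the method: a hidden state distribution of a dynamic control system is initialized using a g network; measured values of sensors and state values of triggers in the dynamic control system at a current time point t, obtained by real-time monitoring, are received; at least one first sampling point is input into an f network to predict at least one second sampling point, wherein the at least one first sampling point represents the hidden state distribution of the dynamic control system at the adjacent time point t-1 before the current time point t, and the at least one second sampling point represents the prior hidden state distribution of the dynamic control system at the current time point t; the at least one second sampling point is mapped to the sensor measurement space using an h network, to predict a probability distribution of the sensor measurements of the dynamic control system at the current time point t; and whether the dynamic control system is anomalous is determined by comparing the measured values obtained by real-time monitoring with the predicted probability distribution. The g network, the f network and the h network are sub-networks of a neural network used to represent the dynamic distribution of the dynamic control system. The g network is a feed-forward network for encoding sensor measurements into a low-dimensional hidden state vector. The f network encodes the sensor measurements and trigger state values within a sliding window into a vector, and uses the hidden state vector of the current time point encoded by the g network to predict the hidden state vector at the next time point. The h network is a feed-forward network that decodes the predicted hidden state vector at the next time point into sensor measurements and decodes the low-dimensional hidden state vector at the current time point into sensor measurements. The neural network is trained using sensor measurements acquired under normal operating conditions of the dynamic control system.
According to a second aspect, an anomaly detection apparatus for a dynamic control system is provided, comprising:
- an initialization module, configured to initialize a hidden state distribution of a dynamic control system using a g network;
- a data acquisition module, configured to receive measured values of sensors and state values of triggers in the dynamic control system at a current time point t, obtained by real-time monitoring;
- a prediction module, configured to: input at least one first sampling point into an f network to predict at least one second sampling point, wherein the at least one first sampling point represents the hidden state distribution of the dynamic control system at the adjacent time point t-1 before the current time point t, and the at least one second sampling point represents the prior hidden state distribution of the dynamic control system at the current time point t; and map the at least one second sampling point to the sensor measurement space using an h network, to predict a probability distribution of the sensor measurements of the dynamic control system at the current time point t;
- an anomaly determination module, configured to determine whether the dynamic control system is anomalous by comparing the measured values obtained by real-time monitoring with the predicted probability distribution;
wherein the g network, the f network and the h network are sub-networks of a neural network used to represent the dynamic distribution of the dynamic control system; the g network is a feed-forward network for encoding sensor measurements into a low-dimensional hidden state vector; the f network encodes the sensor measurements and trigger state values within a sliding window into a vector, and uses the hidden state vector of the current time point encoded by the g network to predict the hidden state vector at the next time point; the h network is a feed-forward network that decodes the predicted hidden state vector at the next time point into sensor measurements and decodes the low-dimensional hidden state vector at the current time point into sensor measurements; and the neural network is trained using sensor measurements acquired under normal operating conditions of the dynamic control system.
According to a third aspect, an anomaly detection apparatus for a dynamic control system is provided, comprising: at least one memory configured to store computer-readable code; and at least one processor configured to invoke the computer-readable code to perform the steps provided in the first aspect.
According to a fourth aspect, a computer-readable medium is provided, on which computer-readable instructions are stored; when executed by a processor, the computer-readable instructions cause the processor to perform the steps provided in the first aspect.
For any of the above aspects, optionally, the posterior hidden state distribution of the dynamic control system at the current time point t may also be updated, for use in obtaining the first sampling point at the adjacent time point t+1 after the current time point t. The uncertainty of the hidden state of the system is thereby tracked in real time, which increases the reliability of anomaly monitoring.
For any of the above aspects, optionally, the loss function used in training the neural network minimizes the sum of the reconstruction error and the prediction error of the sensor measurements at each time point used for training. This end-to-end training approach makes the neural network very easy to deploy in real-world applications.
For any of the above aspects, optionally, both the at least one first sampling point and the at least one second sampling point are sigma sampling points. The probability distribution is thus represented efficiently with a minimal number of sampling points, which considerably improves the running efficiency of the method.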
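Brief Description of the Drawings
FIG. 1 is a schematic structural diagram of the neural network used for system identification in an embodiment of the present invention.
FIG. 2 compares the anomaly detection performance of an embodiment of the present invention with that of existing methods.
FIG. 3 is a schematic structural diagram of the anomaly detection apparatus provided by an embodiment of the present invention.
FIG. 4 is a flowchart of the anomaly detection method provided by an embodiment of the present invention.
List of reference signs: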
Figure PCTCN2021141706-appb-000001
Figure PCTCN2021141706-appb-000002
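Detailed Description
The subject matter described herein will now be discussed with reference to example embodiments. It should be understood that these embodiments are discussed only to enable those skilled in the art to better understand and thereby implement the subject matter described herein, and do not limit the scope of protection, applicability, or examples set forth in the claims. The function and arrangement of the elements discussed may be changed without departing from the scope of protection of the embodiments of the present invention. Various examples may omit, substitute, or add procedures or components as needed. For example, the described methods may be performed in an order different from that described, and steps may be added, omitted, or combined. In addition, features described with respect to some examples may also be combined in other examples.
As used herein, the term "comprising" and its variants are open-ended terms meaning "including but not limited to". The term "based on" means "based at least in part on". The terms "one embodiment" and "an embodiment" mean "at least one embodiment". The term "another embodiment" means "at least one other embodiment". The terms "first", "second", and so on may refer to different or identical objects. Other definitions, whether explicit or implicit, may be included below. Unless the context clearly indicates otherwise, the definition of a term is consistent throughout the specification.
To make the solutions provided by the embodiments of the present invention easier to understand, some of the concepts involved are explained here. These explanations should not be regarded as limiting the scope of protection of the claims of the present invention.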
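1. Dynamic control system
Control systems are divided into static control systems and dynamic control systems. The two differ as follows:
1) Variation
The state variables of a dynamic control system change markedly over time and are functions of time. The state variables of a static control system change little over time, and the changes are difficult to observe and measure.
2) Parameter coupling
A dynamic control system consists of multiple variables or parameters that are interrelated and constantly changing. The output of a static control system at any moment depends only on the input at that moment, and not on inputs before or after it.
3) Final state
The final state of a dynamic control system may be either an equilibrium state or a non-equilibrium state. The final state of a static system is an equilibrium state.
In addition, a dynamic control system may have highly nonlinear dynamics and unknown amounts of sensor noise and model error, which is why the existing anomaly detection methods described above are difficult to apply to dynamic control systems.
2. System identification
System identification determines a mathematical model describing the behavior of a system from the system's input and output time functions. The purpose of building a mathematical model through system identification is to estimate the important parameters that characterize the system's behavior and to build a model that can imitate the behavior of the real system.
3. Time series
A time series is an ordered sequence of data, usually sampled at equal time intervals. If the intervals are not equal, the timestamp of each data point is generally recorded.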
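Embodiments of the present invention are described in detail below with reference to the accompanying drawings.
First, with reference to FIG. 1 and FIG. 2, the neural network used for system identification of the dynamic control system and the process of anomaly detection using Bayesian filtering are described.
I. System identification
FIG. 1 shows the structure of the neural network 10 used for system identification in an embodiment of the present invention.
Without loss of generality, assume that the dynamic control system includes a number of sensors and a number of triggers. Let $x_t$ denote the measured values of the sensors at time point t, and $u_t$ the state values of the triggers at time point t. The following neural network structure is proposed to capture the dynamics of the time series of the dynamic control system.
The neural network 10 may include three sub-networks, called the g network, the f network, and the h network.
The g network, parameterized by $\omega$, is a feed-forward network that takes the sensor measurements $x_{t-1}$ at time point t-1 as input and encodes them into a low-dimensional hidden state vector $z_{t-1}$.
The f network, parameterized by $\theta$, takes the sensor measurements and trigger state values within a sliding window of length $l$ as input, and may use a long short-term memory (LSTM) network to encode them into a hidden vector $h_{t-1}$. Using $h_{t-1}$ as the learned context of the time series, the f network also takes the hidden state vector $z_{t-1}$ as input and then uses a feed-forward network to predict the hidden state vector $z_t$ at the next time point.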
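The h network is a feed-forward network whose parameters are denoted by the symbol shown in Figure PCTCN2021141706-appb-000003. It takes a hidden state vector as input and decodes it into the corresponding sensor measurements. Note that the two h networks in FIG. 1 may share the same set of weights.
Optionally, the f network, the g network, and the h network may be implemented as simulation models, differential equation solvers, or the like; the specific implementation is not limited.
In summary, the whole neural network 10 can be written as the expression shown in Figure PCTCN2021141706-appb-000004. It takes as input the sensor measurements $x_{t-1}$ at time point t-1, the sensor measurements $x_{t-l:t-1}$ within the sliding window, and the trigger state values $u_{t-l:t-1}$, and it outputs the sensor measurements decoded from the hidden state vectors, shown in Figure PCTCN2021141706-appb-000005 and Figure PCTCN2021141706-appb-000006.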
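The inputs described above can be illustrated with a short Python sketch that builds, for each time point t, the previous measurement $x_{t-1}$, the sliding window covering time points t-l to t-1, and the target measurement $x_t$. The function name `make_windows` and the choice to concatenate sensor and trigger columns into a single window array are assumptions made for illustration; the source does not specify how the window is laid out in memory.

```python
import numpy as np

def make_windows(x, u, l):
    """Build (x_prev, window, x_curr) triples for t = l, ..., T-1.

    x: array of shape (T, n_sensors), sensor measurements per time point.
    u: array of shape (T, n_triggers), trigger state values per time point.
    l: sliding-window length.
    """
    x = np.asarray(x, dtype=float)
    u = np.asarray(u, dtype=float)
    T = x.shape[0]
    x_prev = x[l - 1:T - 1]                  # x_{t-1}, input of the g network
    x_curr = x[l:T]                          # x_t, target of the prediction branch
    xu = np.concatenate([x, u], axis=1)      # assumed layout: sensors then triggers
    # Window for time t holds rows t-l, ..., t-1 (l rows in total).
    windows = np.stack([xu[t - l:t] for t in range(l, T)])
    return x_prev, windows, x_curr
```

With this layout, the last row of each window holds the same sensor values as the matching row of `x_prev`, which mirrors the statement that the network sees both $x_{t-1}$ and the window ending at t-1.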
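To train the model, a dataset recorded while the dynamic control system operates under normal conditions, without anomalies, is required. A gradient descent algorithm can then be used to solve the problem below; one example of such an algorithm is Adam (see "Adam: a method for stochastic optimization", Kingma, Diederik P. and Jimmy Ba, arXiv preprint arXiv:1412.6980, 2014).
Figure PCTCN2021141706-appb-000007
In the loss function above, the first two terms are the reconstruction error and the prediction error of the sensor measurements, respectively, and the third term is a smoothing term that pulls two temporally consecutive hidden state vectors closer together. $\alpha$, $\beta$ and $\gamma$ are hyperparameters that weight the three terms.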
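The exact loss function appears only as an equation image in the source, so the following Python sketch is one plausible reading rather than the patented formula. It assumes squared Euclidean norms for all three terms and assumes the smoothing term compares the encoded state $z_{t-1}$ with the predicted state $z_t$; the function name `three_term_loss` and the default weights are hypothetical.

```python
import numpy as np

def three_term_loss(x_prev, x_prev_rec, x_curr, x_curr_pred, z_prev, z_pred,
                    alpha=1.0, beta=1.0, gamma=0.1):
    """Weighted sum of reconstruction, prediction and smoothing terms (batch mean).

    All arguments are arrays with one row per training sample.
    Squared Euclidean norms are an assumption; the source shows the loss as an image.
    """
    rec = np.sum((x_prev - x_prev_rec) ** 2, axis=1)     # reconstruction error of x_{t-1}
    pred = np.sum((x_curr - x_curr_pred) ** 2, axis=1)   # prediction error of x_t
    smooth = np.sum((z_prev - z_pred) ** 2, axis=1)      # gap between consecutive hidden states
    return float(np.mean(alpha * rec + beta * pred + gamma * smooth))
```

In an actual training run this quantity would be computed inside an automatic-differentiation framework so that Adam can update the parameters of the g, f and h networks jointly; NumPy is used here only to make the arithmetic explicit.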
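Once the model has been trained, the dynamic behavior of the dynamic control system can be expressed as follows:
$z_t = f_\theta(z_{t-1};\, x_{t-l:t-1},\, u_{t-l:t-1}) + Q$
Figure PCTCN2021141706-appb-000008
where $Q$ is the covariance matrix of the prediction error, estimated empirically from the prediction errors on a validation dataset given by:
$g_\omega(x_t) - f_\theta(g_\omega(x_{t-1});\, x_{t-l:t-1},\, u_{t-l:t-1})$ for all $t > l$
and $R$ is the covariance matrix of the reconstruction error, estimated from the reconstruction errors on the same validation dataset given by:
Figure PCTCN2021141706-appb-000009
for all $t$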
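As a hedged illustration of how $Q$ and $R$ might be estimated, the sketch below computes an empirical covariance matrix from residual vectors collected on the validation set. The helper name `empirical_covariance` is hypothetical, and the use of the mean-centered sample covariance (`numpy.cov`) is an assumption, since the source only says the matrices are estimated from the errors.

```python
import numpy as np

def empirical_covariance(residuals):
    """Sample covariance of residual vectors.

    residuals: array of shape (n_time_points, dim), one residual per row.
    Returns a (dim, dim) matrix.
    """
    residuals = np.asarray(residuals, dtype=float)
    # rowvar=False treats each column as one variable and each row as one observation.
    return np.atleast_2d(np.cov(residuals, rowvar=False))

# Q would be built from hidden-space residuals g(x_t) - f(g(x_{t-1}); window),
# and R from measurement-space reconstruction residuals, both on validation data.
```

Both matrices are computed once after training and then held fixed while the filter runs.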
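II. Bayesian filtering for anomaly detection
Here, Bayesian filtering can be used for anomaly detection, iteratively estimating the probability distribution of the hidden state of the dynamic control system over time.
Specifically, $z_t$ and $P_t$ can be used to track the probability distribution of the hidden state of the dynamic control system (hereinafter the "hidden state distribution"), where $z_t$ is the mean vector and $P_t$ is the covariance matrix of the hidden state at time point t.
The whole process consists of an initialization step, a prediction step, an update step, and an anomaly detection step.
1. Initialization step
Set $t = 0$ and initialize $z_0 = g_\omega(x_0)$ and $P_0 = 0$ (all elements are 0).
The following three steps (prediction, update, and anomaly detection) are then applied iteratively to estimate $z_t$ and $P_t$ and to detect anomalies:
2. Prediction step
In this step, the mean and covariance of the prior hidden state distribution at time point t are computed. First, a sampling function (for example, a sigma function) generates a set of sampling points $Z$ for the hidden state distribution at time point t-1 (called the "first sampling points" here; if the sampling function is a sigma function, the sampling points are sigma points). The description below uses sampling with a sigma function as the example. The weights of these sigma points are $W^m$ and $W^c$. One example of a sigma function is Van der Merwe's scaled sigma point algorithm (see "Sigma-Point Kalman Filters for Probabilistic Inference in Dynamic State-Space Models", Van der Merwe, 2004).
$Z, W^m, W^c = \mathrm{sigmafunction}(z_{t-1}, P_{t-1})$  (1)
The sigma points are chosen so that a small number of them suffices to represent the hidden state distribution of the dynamic control system at time point t-1. The selected sigma points can be passed through the f network to predict at least one second sampling point (here, again sigma points), such that:
$Y = f_\theta(Z,\, x_{t-l:t-1},\, u_{t-l:t-1})$  (2)
The mean and covariance of the prior hidden state distribution at time point t can then be computed with the unscented transform:
Figure PCTCN2021141706-appb-000010
Figure PCTCN2021141706-appb-000011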
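The prediction step can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: it uses the textbook scaled sigma point construction with parameters `alpha`, `beta` and `kappa`, assumes the unscented transform adds the process noise covariance $Q$ to the weighted sample covariance (equations (3) and (4) appear only as images in the source), and adds a small `jitter` so that the Cholesky factorization also works for the all-zero $P_0$. The callable `transition` stands in for the trained f network with the sliding window held fixed.

```python
import numpy as np

def scaled_sigma_points(mean, cov, alpha=0.1, beta=2.0, kappa=0.0, jitter=1e-9):
    """Scaled sigma points and weights for a Gaussian N(mean, cov)."""
    n = mean.shape[0]
    lam = alpha ** 2 * (n + kappa) - n
    # Lower-triangular square root of (n + lam) * cov; the jitter is an assumption
    # that keeps the factorization valid when cov is all zeros.
    root = np.linalg.cholesky((n + lam) * cov + jitter * np.eye(n))
    points = np.vstack([mean, mean + root.T, mean - root.T])   # shape (2n + 1, n)
    w_m = np.full(2 * n + 1, 1.0 / (2.0 * (n + lam)))
    w_c = w_m.copy()
    w_m[0] = lam / (n + lam)
    w_c[0] = lam / (n + lam) + (1.0 - alpha ** 2 + beta)
    return points, w_m, w_c

def unscented_transform(points, w_m, w_c, noise_cov):
    """Weighted mean and covariance of transformed sigma points plus additive noise."""
    mean = w_m @ points
    dev = points - mean
    cov = (w_c[:, None] * dev).T @ dev + noise_cov
    return mean, cov

def predict_step(z_prev, P_prev, transition, Q):
    """Prior mean and covariance of the hidden state at time t."""
    Z, w_m, w_c = scaled_sigma_points(z_prev, P_prev)   # equation (1)
    Y = transition(Z)                                   # equation (2), one row per sigma point
    z_prior, P_prior = unscented_transform(Y, w_m, w_c, Q)
    return z_prior, P_prior, Y, w_m, w_c
```

For a linear `transition` the unscented transform reproduces the exact Kalman prediction up to the jitter, which gives a convenient sanity check before plugging in a neural network.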
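3. Update step
In this step, the mean and covariance of the posterior hidden state distribution of the dynamic control system at time point t (denoted $z_t$ and $P_t$) are computed. First, the h network maps the sigma points $Y$ of the prior hidden state distribution to the sensor measurement space.
$L = h(Y)$  (5)
The mean and covariance of these measurement sigma points are computed with the unscented transform.
Figure PCTCN2021141706-appb-000012
Figure PCTCN2021141706-appb-000013
The Kalman gain can be obtained from the following formula:
Figure PCTCN2021141706-appb-000014
The following update can then be performed:
Figure PCTCN2021141706-appb-000015
Figure PCTCN2021141706-appb-000016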
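Equations (6) to (10) are available only as images, so the following Python sketch uses the standard unscented Kalman filter update as an assumption about what they contain: the measurement covariance is the weighted sample covariance plus $R$, the gain is the cross-covariance times the inverse measurement covariance, and the posterior follows the usual Kalman form. The function name `update_step` and the callable `measure` (a stand-in for the trained h network) are hypothetical.

```python
import numpy as np

def update_step(Y, w_m, w_c, z_prior, P_prior, measure, R, x_obs):
    """Posterior mean and covariance of the hidden state at time t.

    Y: prior sigma points, shape (n_points, state_dim).
    measure: maps sigma points to measurement space, one row per point.
    """
    L = measure(Y)                                  # equation (5)
    mu = w_m @ L                                    # predicted measurement mean
    dL = L - mu
    S = (w_c[:, None] * dL).T @ dL + R              # predicted measurement covariance
    C = (w_c[:, None] * (Y - z_prior)).T @ dL       # state-measurement cross-covariance
    K = C @ np.linalg.inv(S)                        # Kalman gain
    z_post = z_prior + K @ (x_obs - mu)
    P_post = P_prior - K @ S @ K.T
    return z_post, P_post, mu, S
```

The predicted measurement mean `mu` and covariance `S` are returned as well, because the anomaly detection step scores the observed measurement against exactly this predicted distribution.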
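4. Anomaly detection step
In this step, anomalies can be detected by computing the Mahalanobis distance between the measurements obtained by real-time monitoring and the predicted probability distribution:
Figure PCTCN2021141706-appb-000017
When the Mahalanobis distance exceeds a preset threshold $\tau$, the measurements observed in real time are implausible even after sensor noise and prediction noise are taken into account, that is, an anomaly has been detected.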
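A minimal Python sketch of the scoring rule follows. It assumes the predicted distribution is summarized by a mean `mu` and covariance `S` in measurement space, and it uses the square-root form of the Mahalanobis distance; the source shows the formula only as an image, so whether the squared or unsquared distance is thresholded is an assumption, as are the function names.

```python
import numpy as np

def mahalanobis_distance(x_obs, mu, S):
    """Mahalanobis distance of an observation from N(mu, S)."""
    diff = np.asarray(x_obs, dtype=float) - np.asarray(mu, dtype=float)
    # Solve S y = diff instead of forming the inverse explicitly.
    return float(np.sqrt(diff @ np.linalg.solve(S, diff)))

def is_anomalous(x_obs, mu, S, tau):
    """Flag the observation when its distance exceeds the threshold tau."""
    return mahalanobis_distance(x_obs, mu, S) > tau
```

Because the distance is normalized by `S`, a noisy sensor with large predicted variance needs a larger deviation to trigger an alarm than a quiet one, which is the property that distinguishes this test from a fixed residual threshold.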
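III. Experiment
The method provided by the embodiment of the present invention was used to detect anomalies in a pump control system dataset. The system data consist of measurements from 52 sensors sampled once per minute over 5 months. The dataset contains 7 faults, each lasting from several hours to several days. The dataset was divided into a training set, a validation set, and a test set in a 3:1:1 ratio. All 7 faults occur within the period covered by the test set, which means the training and validation sets contain only data from normal operating conditions. The training set was used to train the neural network described above, and the validation set was used to tune the hyperparameters for the best validation performance. Anomaly detection performance was evaluated on the test set.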
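The 3:1:1 split can be sketched as below. The sketch assumes the split is chronological, which is consistent with the statement that all faults fall within the test period but is not spelled out in the source; the function name `chronological_split` is hypothetical.

```python
import numpy as np

def chronological_split(data, ratios=(3, 1, 1)):
    """Split time-ordered rows into train, validation and test segments."""
    n = len(data)
    total = sum(ratios)
    train_end = n * ratios[0] // total
    val_end = n * (ratios[0] + ratios[1]) // total
    # No shuffling: earlier rows go to training, the latest rows to testing.
    return data[:train_end], data[train_end:val_end], data[val_end:]
```

Keeping the segments contiguous also keeps every sliding window inside a single segment, apart from the first few rows of the validation and test segments.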
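FIG. 2 shows the anomaly detection performance of the embodiment of the present invention and of several of the other methods described above (isolation forest, the Bayesian estimation algorithm, and autoencoders including the sparse autoencoder, the variational autoencoder, and the LSTM autoencoder). All baseline models were trained on the same dataset.
Assuming a maximum acceptable false-positive rate (FPR) of 0.01 (one false alarm per 100 minutes), the methods are compared by the partial area under the ROC curve (AUC) up to an FPR of 0.01. The higher the AUC, the more anomalies the model detects at the same FPR. FIG. 2 shows that the method of the embodiment of the present invention clearly outperforms the other methods. In FIG. 2, number 1 corresponds to the embodiment of the present invention, number 2 to isolation forest, number 3 to Seq2SeqLSTM, number 4 to the dilated convolutional neural network (DilatedCNN), number 5 to the sparse autoencoder, number 6 to the variational autoencoder, number 7 to the LSTM autoencoder, and number 8 to the Bayesian estimation algorithm.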
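The partial AUC used for the comparison can be computed as in the following Python sketch. It is an illustration only: the source does not say which tool produced the numbers or whether the area was normalized, so this version returns the raw (unnormalized) area under the ROC curve between FPR 0 and `max_fpr`, whose maximum possible value equals `max_fpr`. The function name `partial_auc` is hypothetical, and both classes are assumed to be present in `labels`.

```python
import numpy as np

def partial_auc(labels, scores, max_fpr=0.01):
    """Raw area under the ROC curve for FPR in [0, max_fpr].

    labels: 1 for anomalous time points, 0 for normal ones.
    scores: anomaly scores, higher means more anomalous.
    """
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(-scores, kind="stable")
    labels, scores = labels[order], scores[order]
    tps = np.cumsum(labels)
    fps = np.cumsum(~labels)
    # Keep the last index of each run of tied scores so ties share one threshold.
    distinct = np.r_[np.nonzero(np.diff(scores))[0], labels.size - 1]
    tpr = np.r_[0.0, tps[distinct] / labels.sum()]
    fpr = np.r_[0.0, fps[distinct] / (~labels).sum()]
    stop = np.searchsorted(fpr, max_fpr, side="right")
    if stop < fpr.size:
        # Interpolate the TPR at max_fpr between the two bracketing ROC points.
        tpr_cut = np.interp(max_fpr, [fpr[stop - 1], fpr[stop]],
                            [tpr[stop - 1], tpr[stop]])
        fpr = np.r_[fpr[:stop], max_fpr]
        tpr = np.r_[tpr[:stop], tpr_cut]
    # Trapezoidal rule over the truncated curve.
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))
```

In the experiment described above, the Mahalanobis distance at each test time point would serve as the score and the fault intervals as the positive labels.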
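The principles of system identification of the dynamic control system and of anomaly detection using Bayesian filtering in the embodiments of the present invention have been described above. The apparatus 30 provided by the embodiments, which can implement the anomaly detection, is described below.
The anomaly detection apparatus 30 provided by an embodiment of the present invention may be implemented as a network of computer processors to perform the anomaly detection method 400 for a dynamic control system. The anomaly detection apparatus 30 may also be a single computer as shown in FIG. 3, including at least one memory 301 comprising a computer-readable medium, for example random access memory (RAM). The apparatus 30 further includes at least one processor 302 coupled to the at least one memory 301. Computer-executable instructions are stored in the at least one memory 301 and, when executed by the at least one processor 302, cause the at least one processor 302 to perform the steps described herein. The at least one processor 302 may include a microprocessor, an application-specific integrated circuit (ASIC), a digital signal processor (DSP), a central processing unit (CPU), a graphics processing unit (GPU), a state machine, and the like. Examples of computer-readable media include, but are not limited to, floppy disks, CD-ROMs, magnetic disks, memory chips, ROM, RAM, ASICs, configured processors, all-optical media, all magnetic tape or other magnetic media, or any other medium from which a computer processor can read instructions. In addition, various other forms of computer-readable media may transmit or carry instructions to a computer, including routers, private or public networks, or other wired and wireless transmission devices or channels. The instructions may include code in any computer programming language, including C, C++, Visual Basic, Java, and JavaScript.
The at least one memory 301 shown in FIG. 3 may contain an anomaly detection program 31 for a dynamic control system which, when executed by the at least one processor 302, causes the at least one processor 302 to perform the anomaly detection method 400 for a dynamic control system described in the embodiments of the present invention. The anomaly detection program 31 may include: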
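- an initialization module 311, configured to initialize the hidden state distribution of a dynamic control system using the g network in the neural network 10 shown in FIG. 1;
- a data acquisition module 312, configured to receive the measured values of sensors and the state values of triggers in the dynamic control system at the current time point t, obtained by real-time monitoring;
- a prediction module 313, configured to: input at least one first sampling point into the f network in the neural network 10 shown in FIG. 1 to predict at least one second sampling point, wherein the at least one first sampling point represents the hidden state distribution of the dynamic control system at the adjacent time point t-1 before the current time point t, and the at least one second sampling point represents the prior hidden state distribution of the dynamic control system at the current time point t; and map the at least one second sampling point to the sensor measurement space using the h network in the neural network 10, to predict the probability distribution of the sensor measurements of the dynamic control system at the current time point t;
- an anomaly determination module 314, configured to determine whether the dynamic control system is anomalous by comparing the measured values obtained by real-time monitoring with the predicted probability distribution.
Optionally, the anomaly detection program 31 may further include an update module 315, configured to update the posterior hidden state distribution of the dynamic control system at the current time point t, for use in obtaining the first sampling point at the adjacent time point t+1 after the current time point t.
Optionally, the loss function used in training the neural network minimizes the sum of the reconstruction error and the prediction error of the sensor measurements at each time point used for training.
Optionally, both the at least one first sampling point and the at least one second sampling point are sigma sampling points.
Optionally, the anomaly detection apparatus 30 may further include a communication module 303, connected to the at least one processor 302 and the at least one memory 301 via a bus, for communication between the anomaly detection apparatus 30 and external devices.
It should be noted that embodiments of the present invention may include apparatuses with architectures different from that shown in FIG. 3. The above architecture is merely exemplary and is used to explain the method 400 provided by the embodiments of the present invention.
In addition, the above modules may also be regarded as functional modules implemented in hardware, which realize the functions involved when the anomaly detection apparatus 30 performs the anomaly detection method for a dynamic control system. For example, the control logic of each process involved in the method may be burned in advance into a chip such as a field-programmable gate array (FPGA) or a complex programmable logic device (CPLD), and the chip or device then performs the functions of the above modules. The specific implementation may be determined by engineering practice.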
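The anomaly detection method 400 for a dynamic control system provided by an embodiment of the present invention is described below with reference to FIG. 4. As shown in FIG. 4, the method may include the following steps:
- S401: initialize the hidden state distribution of a dynamic control system using the g network in the neural network 10 shown in FIG. 1;
- S402: receive the measured values of sensors and the state values of triggers in the dynamic control system at the current time point t, obtained by real-time monitoring;
- S403: input at least one first sampling point into the f network in the neural network 10 to predict at least one second sampling point, wherein the at least one first sampling point represents the hidden state distribution of the dynamic control system at the adjacent time point t-1 before the current time point t, and the at least one second sampling point represents the prior hidden state distribution of the dynamic control system at the current time point t;
- S404: map the at least one second sampling point to the sensor measurement space using the h network in the neural network 10, to predict the probability distribution of the sensor measurements of the dynamic control system at the current time point t;
- S405: determine whether the dynamic control system is anomalous by comparing the measured values obtained by real-time monitoring with the predicted probability distribution.
Optionally, the method 400 may further include step S406: update the posterior hidden state distribution of the dynamic control system at the current time point t, for use in obtaining the first sampling point at the adjacent time point t+1 after the current time point t.
Optionally, the loss function used in training the neural network 10 minimizes the sum of the reconstruction error and the prediction error of the sensor measurements at each time point used for training.
Optionally, both the at least one first sampling point and the at least one second sampling point are sigma sampling points.
In addition, an embodiment of the present invention further provides a computer-readable medium on which computer-readable instructions are stored; when executed by a processor, the computer-readable instructions cause the processor to perform the foregoing anomaly detection method for a dynamic control system. Examples of computer-readable media include floppy disks, hard disks, magneto-optical disks, optical disks (such as CD-ROM, CD-R, CD-RW, DVD-ROM, DVD-RAM, DVD-RW, DVD+RW), magnetic tape, non-volatile memory cards, and ROM. Optionally, the computer-readable instructions may be downloaded from a server computer or from the cloud over a communication network.
It should be noted that not all steps and modules in the above flows and system structure diagrams are required; some steps or modules may be omitted according to actual needs. The order in which the steps are performed is not fixed and may be adjusted as needed. The system structures described in the above embodiments may be physical or logical structures; that is, some modules may be implemented by the same physical entity, some modules may be implemented by multiple separate physical entities, or some modules may be implemented jointly by certain components in multiple independent devices.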

Claims (10)

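  1. An anomaly detection method (400) for a dynamic control system, characterized by comprising:
    - initializing (S401) a hidden state distribution of a dynamic control system using a g network;
    - receiving (S402) measured values of sensors and state values of triggers in the dynamic control system at a current time point t, obtained by real-time monitoring;
    - inputting (S403) at least one first sampling point into an f network to predict at least one second sampling point, wherein the at least one first sampling point represents the hidden state distribution of the dynamic control system at the adjacent time point t-1 before the current time point t, and the at least one second sampling point represents the prior hidden state distribution of the dynamic control system at the current time point t;
    - mapping (S404) the at least one second sampling point to the sensor measurement space using an h network, to predict a probability distribution of the sensor measurements of the dynamic control system at the current time point t;
    - determining (S405) whether the dynamic control system is anomalous by comparing the measured values obtained by real-time monitoring with the predicted probability distribution;
    wherein the g network, the f network and the h network are sub-networks of a neural network used to represent the dynamic distribution of the dynamic control system; the g network is a feed-forward network for encoding sensor measurements into a low-dimensional hidden state vector; the f network encodes the sensor measurements and trigger state values within a sliding window into a vector, and uses the hidden state vector of the current time point encoded by the g network to predict the hidden state vector at the next time point; the h network is a feed-forward network that decodes the predicted hidden state vector at the next time point into sensor measurements and decodes the low-dimensional hidden state vector at the current time point into sensor measurements; and the neural network is trained using sensor measurements acquired under normal operating conditions of the dynamic control system.
  2. The method according to claim 1, characterized by further comprising:
    - updating (S406) the posterior hidden state distribution of the dynamic control system at the current time point t, for use in obtaining the first sampling point at the adjacent time point t+1 after the current time point t.
  3. The method according to claim 1, characterized in that the loss function used in training the neural network minimizes the sum of the reconstruction error and the prediction error of the sensor measurements at each time point used for training.
  4. The method according to claim 1, characterized in that both the at least one first sampling point and the at least one second sampling point are sigma sampling points.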
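  5. An anomaly detection apparatus (30) for a dynamic control system, comprising:
    - an initialization module (311), configured to initialize a hidden state distribution of a dynamic control system using a g network;
    - a data acquisition module (312), configured to receive measured values of sensors and state values of triggers in the dynamic control system at a current time point t, obtained by real-time monitoring;
    - a prediction module (313), configured to:
    - input at least one first sampling point into an f network to predict at least one second sampling point, wherein the at least one first sampling point represents the hidden state distribution of the dynamic control system at the adjacent time point t-1 before the current time point t, and the at least one second sampling point represents the prior hidden state distribution of the dynamic control system at the current time point t; and
    - map the at least one second sampling point to the sensor measurement space using an h network, to predict a probability distribution of the sensor measurements of the dynamic control system at the current time point t;
    - an anomaly determination module (314), configured to determine whether the dynamic control system is anomalous by comparing the measured values obtained by real-time monitoring with the predicted probability distribution;
    wherein the g network, the f network and the h network are sub-networks of a neural network used to represent the dynamic distribution of the dynamic control system; the g network is a feed-forward network for encoding sensor measurements into a low-dimensional hidden state vector; the f network encodes the sensor measurements and trigger state values within a sliding window into a vector, and uses the hidden state vector of the current time point encoded by the g network to predict the hidden state vector at the next time point; the h network is a feed-forward network that decodes the predicted hidden state vector at the next time point into sensor measurements and decodes the low-dimensional hidden state vector at the current time point into sensor measurements; and the neural network is trained using sensor measurements acquired under normal operating conditions of the dynamic control system.
  6. The apparatus according to claim 5, characterized by further comprising:
    - an update module (315), configured to update the posterior hidden state distribution of the dynamic control system at the current time point t, for use in obtaining the first sampling point at the adjacent time point t+1 after the current time point t.
  7. The apparatus according to claim 5, characterized in that the loss function used in training the neural network minimizes the sum of the reconstruction error and the prediction error of the sensor measurements at each time point used for training.
  8. The apparatus according to claim 5, characterized in that both the at least one first sampling point and the at least one second sampling point are sigma sampling points.
  9. An anomaly detection apparatus (30) for a dynamic control system, characterized by comprising:
    - at least one memory (301), configured to store computer-readable code;
    - at least one processor (302), configured to invoke the computer-readable code to perform the method according to any one of claims 1 to 4.
  10. A computer-readable medium, characterized in that computer-readable instructions are stored on the computer-readable medium, and the computer-readable instructions, when executed by a processor, cause the processor to perform the method according to any one of claims 1 to 4.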
PCT/CN2021/141706 2021-01-27 2021-12-27 Anomaly detection method and apparatus for dynamic control system, and computer-readable medium WO2022161069A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP21922658.6A EP4266209A4 (en) 2021-01-27 2021-12-27 ANOMALY DETECTION METHOD AND APPARATUS FOR DYNAMIC CONTROL SYSTEM AND COMPUTER-READABLE MEDIUM
US18/262,630 US20240045411A1 (en) 2021-01-27 2021-12-27 Anomaly Detection Method and Apparatus for Dynamic Control System, and Computer-Readable Medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110112274.X 2021-01-27
CN202110112274.XA CN114815763A (zh) 2021-01-27 2021-01-27 Anomaly detection method and apparatus for dynamic control system, and computer-readable medium

Publications (1)

Publication Number Publication Date
WO2022161069A1 true WO2022161069A1 (zh) 2022-08-04

Family

ID=82525196

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/141706 WO2022161069A1 (zh) 2021-01-27 2021-12-27 Anomaly detection method and apparatus for dynamic control system, and computer-readable medium

Country Status (4)

Country Link
US (1) US20240045411A1 (zh)
EP (1) EP4266209A4 (zh)
CN (1) CN114815763A (zh)
WO (1) WO2022161069A1 (zh)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5586221A (en) * 1994-07-01 1996-12-17 Syracuse University Predictive control of rolling mills using neural network gauge estimation
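CN104156422A (zh) * 2014-08-06 2014-11-19 辽宁工程技术大学 Real-time gas concentration prediction method based on a dynamic neural network
CN111666982A (zh) * 2020-05-19 2020-09-15 上海核工程研究设计院有限公司 Fault diagnosis method for electromechanical equipment based on a deep neural network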

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7567878B2 (en) * 2005-12-07 2009-07-28 Siemens Corporate Research, Inc. Evaluating anomaly for one class classifiers in machine condition monitoring
EP1914638A1 (en) * 2006-10-18 2008-04-23 Bp Oil International Limited Abnormal event detection using principal component analysis
US8606554B2 (en) * 2009-10-19 2013-12-10 Siemens Aktiengesellschaft Heat flow model for building fault detection and diagnosis
JP7204626B2 (ja) * 2019-10-01 2023-01-16 株式会社東芝 異常検知装置、異常検知方法および異常検知プログラム

Non-Patent Citations (12)

* Cited by examiner, † Cited by third party
Title
C.K. CHUI, G. CHEN: "Kalman Filter", 2017, SPRINGER
CHAO YUAN, CLAUS NEUBAUER: "Bayes Sensor Estimation for Machine State Monitoring", IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, 2007, pages 517 - 520
CHAO YUAN, CLAUS NEUBAUER: "Robust Sensor Estimation Using Time Information", IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, 2008, pages 2077 - 2080, XP031250992
HINTON, GEOFFREY E., SIMON OSINDERO, YEE-WHYE TEH: "Fast Learning Algorithm for Deep Belief Nets", NEURAL COMPUTATION, 2006, pages 1527 - 1554, XP055217715, DOI: 10.1162/neco.2006.18.7.1527
HOCHREITER, SEPP, JÜRGEN SCHMIDHUBER: "Long Short-Term Memory", NEURAL COMPUTATION, 1997, pages 1735 - 1780
KINGMA, DIEDERIK P., JIMMY BA: "Adam: a method for stochastic optimization", ARXIV:1412.6980, 2014
KINGMA, DIEDERIK P., MAX WELLING: "Auto-Encoding Variational Bayes", ARXIV, 2013
LIU, FEI TONY, KAI MING TING, ZHI-HUA ZHOU: "Isolation Forest", IEEE INTERNATIONAL CONFERENCE ON DATA MINING, 2008
MANEVITZ, LARRY M., MALIK YOUSEF: "One-Class Support Vector Machine for Document Classification", MACHINE LEARNING, 2001, pages 139 - 154
See also references of EP4266209A4
VAN DER MERWE: "Sigma-Point Kalman Filters for Probabilistic Inference in Dynamic State-Space Models", 2004
WAN, ERIC A., RUDOLPH VAN DER MERWE: "Unscented Kalman Filter for Nonlinear Estimation", SUMMARY OF IEEE WORKSHOP ON SIGNAL PROCESSING, COMMUNICATION AND ADAPTIVE CONTROL SYSTEM

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
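CN115294674A (zh) * 2022-10-09 2022-11-04 南京信息工程大学 Method for monitoring and evaluating the navigation state of an unmanned surface vessel
CN115294674B (zh) * 2022-10-09 2022-12-20 南京信息工程大学 Method for monitoring and evaluating the navigation state of an unmanned surface vessel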

Also Published As

Publication number Publication date
EP4266209A1 (en) 2023-10-25
CN114815763A (zh) 2022-07-29
EP4266209A4 (en) 2024-04-17
US20240045411A1 (en) 2024-02-08

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21922658

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 18262630

Country of ref document: US

ENP Entry into the national phase

Ref document number: 2021922658

Country of ref document: EP

Effective date: 20230719

NENP Non-entry into the national phase

Ref country code: DE