WO2023197617A1

WO2023197617A1 - Method for detecting and diagnosing production abnormality of industrial system on basis of multi-dimensional sensing data

Info

Publication number: WO2023197617A1
Application number: PCT/CN2022/135714
Authority: WO
Inventors: 吕明琪; 陈铁明; 朱添田
Original assignee: 浙江工业大学
Priority date: 2022-04-11
Filing date: 2022-11-30
Publication date: 2023-10-19
Also published as: CN114841250A

Abstract

A method for detecting and diagnosing production abnormalities of an industrial system on the basis of multi-dimensional sensing data. The method comprises: pre-processing a multi-dimensional sensing data sample, and dividing the pre-processed sample into several sub-samples by using a sliding window; training an autoencoder on normal sub-samples in an unsupervised manner to obtain an abnormality detection model; training a classification model according to the abnormality detection model; and performing real-time detection and diagnosis of production abnormalities of the industrial system on the basis of the abnormality detection model and the classification model. The method addresses the current difficulty of diagnosing abnormalities when abnormality detection on multi-dimensional sensing data is performed with a black-box model.

Description

Industrial system production anomaly detection and diagnosis method based on multi-dimensional sensing data

Technical field

The invention belongs to the technical field of data mining, and specifically relates to an industrial system production anomaly detection and diagnosis method based on multi-dimensional sensing data.

Background art

The Industrial Internet aims to achieve smarter and more efficient automated control and resource allocation in industrial manufacturing systems, while improving the production efficiency of smart factories. However, because the Industrial Internet breaks down the boundary between the online world and the physical world, industrial manufacturing systems are more vulnerable to external malicious behavior. In addition, production problems such as equipment failure, performance degradation, and quality defects are inevitable in industrial manufacturing systems. If abnormal situations such as intrusions and failures in industrial production cannot be detected in time, they may cause serious losses to the entire manufacturing system. Anomaly detection and diagnosis are therefore basic requirements of the Industrial Internet and are of great significance to intelligent manufacturing companies.

With the rapid development of the Industrial Internet, modern industrial manufacturing systems use sensors to perceive and record production operating status and processes, and have accumulated a large amount of industrial production data. Data-driven methods have become the mainstream means of anomaly detection, and in recent years deep learning has gradually become a mainstream technology for data-driven methods. However, because a deep learning model is highly complex and contains a large number of nonlinear transformations, it is generally a black box whose prediction results cannot be interpreted. In anomaly detection for industrial systems, interpreting the detection results is very important and is the basis for diagnosing them. For example, diagnosing an anomaly detection result can help locate the device in which, and the time period during which, an anomaly occurred.

Existing methods for interpreting deep learning models focus on supervised learning models, for example interpretability frameworks such as SHAP and LIME. However, owing to the complexity of industrial production data and the high cost of manual annotation, the industrial production data obtained are essentially unlabeled, so the anomaly detection model needs to be trained in an unsupervised manner. For deep unsupervised learning models such as autoencoders in particular, it is almost impossible for existing interpretability frameworks to learn the association between abnormal samples and semantic features, which makes it impossible to explain deep unsupervised learning models.

Summary of the invention

The purpose of the present invention is to provide an industrial system production anomaly detection and diagnosis method based on multi-dimensional sensing data that improves the accuracy of anomaly diagnosis.

To achieve the above purpose, the technical solution adopted by the present invention is as follows:

An industrial system production anomaly detection and diagnosis method based on multi-dimensional sensing data, the method including:

S1. Preprocess the multi-dimensional sensing data samples, and use a sliding window to divide the preprocessed multi-dimensional sensing data samples into several subsamples, where the subsamples include normal subsamples and abnormal subsamples; a given multi-dimensional sensing data sample s ∈ R^{N×T} is a two-dimensional matrix, where N is the feature dimension of s, that is, the number of devices included in the industrial system, and T is the data duration of s, that is, the number of sampling points of the sensor;

S2. Use an autoencoder to train an anomaly detection model on normal subsamples in an unsupervised manner;

S3. Train the classification model based on the anomaly detection model, including:

Step 31: Use the anomaly detection model to detect subsamples containing normal subsamples and abnormal subsamples, and add labels to the subsamples based on the detection results to obtain a labeled subsample set;

Step 32. Let F be the set of N features. Taking n features from F at a time, for n = 1, 2, ..., N, yields 2^N - 1 feature subsets S. For each feature subset, generate from the labeled subsample set a training subset containing only the features in that feature subset, and use the XGBoost classifier to train a classification model in a supervised manner on each training subset, giving 2^N - 1 classification models in total;

S4. Real-time detection and diagnosis of industrial system production anomalies based on anomaly detection models and classification models, including:

Obtain the real-time subsample to be detected. If the anomaly detection model determines that the real-time subsample is a normal subsample, the process ends; otherwise, the classification models are used to calculate, in turn, the feature confidence corresponding to each of the N feature dimensions based on the real-time subsample, and abnormal features are diagnosed based on the feature confidences, that is, the abnormal equipment in the industrial system is located.

Several optional features are also provided below. They do not impose additional limitations on the overall solution above; they are only further additions or preferences. Provided there is no technical or logical contradiction, each optional feature may be combined with the overall solution individually, and multiple optional features may also be combined with one another.

Preferably, the preprocessing of multi-dimensional sensing data samples includes:

Missing values in the multi-dimensional sensing data sample s are filled with the average of the preceding and following data points;

The multi-dimensional sensing data sample s is normalized so that the data lie in the range [0,1].

Preferably, the sliding window is used to divide the preprocessed multi-dimensional sensing data samples into several sub-samples, including:

The multi-dimensional sensing data sample s is divided using a sliding window of size W to obtain M consecutive subsamples ss ∈ R^{N×W}.

Preferably, the network structure of the autoencoder includes an input layer, an encoding layer, a semantic layer, a decoding layer and an output layer, where:

The input layer: the input is a subsample ss ∈ R^{N×W};

The encoding layer: uses a two-layer LSTM as the encoder. The N-dimensional feature vectors x_1, x_2, ..., x_W at the W time steps of the subsample ss are input in sequence into the units of the first LSTM layer, and the resulting W hidden vectors are then input in sequence into the units of the second LSTM layer, giving W hidden vectors h_1, h_2, ..., h_W;

The semantic layer: takes the hidden vector h_W as the encoded low-dimensional semantic vector;

The decoding layer: uses a two-layer LSTM as the decoder. The hidden vector h_W is repeated W times and the copies are input in sequence into the units of the first LSTM layer, and the resulting W hidden vectors are then input in sequence into the units of the second LSTM layer, giving W hidden vectors g_1, g_2, ..., g_W;

The output layer: uses a fully connected layer to convert the W hidden vectors g_1, g_2, ..., g_W into vectors y_1, y_2, ..., y_W whose dimension matches that of the subsample ss; the vectors y_1, y_2, ..., y_W form the output data rss.

Preferably, in the training of the anomaly detection model, the mean square error between the output data rss and the subsample ss is used as the loss function, and gradient descent is used for iterative optimization.

Preferably, the feature confidence corresponding to each of the N feature dimensions is calculated sequentially based on real-time subsamples, including:

For feature k, the feature confidence is calculated as follows:

$$\phi_k = \sum_{S \subseteq F \setminus \{k\}} \frac{|S|!\,(N-|S|-1)!}{N!}\,\Big[ CM_{S\cup\{k\}}\big(x_{S\cup\{k\}}\big) - CM_S\big(x_S\big) \Big]$$

In the formula, φ_k is the feature confidence of feature k, k = 1, 2, ..., N; CM_S(x_S) is the output result (0 or 1) on the subsample x_S of the classification model CM_S trained on the training subset corresponding to the feature subset S, which does not contain feature k; the subsample x_S is the data extracted from the real-time subsample for the features contained in the feature subset S; CM_{S∪{k}}(x_{S∪{k}}) is the output result (0 or 1) on the subsample x_{S∪{k}} of the classification model CM_{S∪{k}} trained on the training subset corresponding to the feature subset S∪{k}, which contains feature k; the subsample x_{S∪{k}} is the data extracted from the real-time subsample for the features contained in the feature subset S∪{k}; and S ⊆ F∖{k} denotes a feature subset S that does not contain feature k.

Preferably, diagnosing abnormal features based on feature confidence includes:

The Sigmoid function is first used to normalize the feature confidences of all features into weight scores; the absolute value of a weight score indicates the impact of the feature on the final detection result. Based on these impact values, the SHAP explanation model is then used to explain the detection results.

The industrial system production anomaly detection and diagnosis method based on multi-dimensional sensing data provided by the present invention uses an autoencoder to train an anomaly detection model in an unsupervised manner, without requiring labeled abnormal samples. The output of the anomaly detection model is then used to train supervised classification models, on the basis of which the anomaly detection results are interpreted and diagnosed. This solves the problem that anomaly diagnosis is difficult when a black-box model is used for anomaly detection on multi-dimensional sensing data.

Description of the drawings

Figure 1 is a flow chart of the industrial system production anomaly detection and diagnosis method based on multi-dimensional sensing data according to the present invention;

Figure 2 is a network structure diagram of the autoencoder of the present invention;

Figure 3 is a schematic diagram of the parameter settings of each layer of the autoencoder of the present invention;

Figure 4 is an explanation diagram of abnormality detection for abnormal subsample output according to the present invention;

Figure 5 is an explanation diagram of abnormality detection for normal subsample output according to the present invention.

Detailed description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative efforts fall within the scope of protection of the present invention.

Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the technical field to which the invention belongs. The terminology used herein in the description of the present invention is for the purpose of describing specific embodiments only and is not intended to limit the present invention.

In order to solve the problem in the prior art that it is difficult to perform abnormal diagnosis when detecting multi-dimensional sensing data anomalies using a black box model, this embodiment provides a method for industrial system production anomaly detection and diagnosis based on multi-dimensional sensing data.

As shown in Figure 1, this embodiment proposes an industrial system production anomaly detection and diagnosis method based on multi-dimensional sensing data, which includes the following steps:

S1. Preprocess the multi-dimensional sensing data samples, and use a sliding window to divide the pre-processed multi-dimensional sensing data samples into several sub-samples, where the sub-samples include normal sub-samples and abnormal sub-samples.

In this embodiment, a multi-dimensional sensing data sample s ∈ R^{N×T} is given; s is a two-dimensional matrix, where N is the feature dimension of s, that is, the number of devices included in the industrial system, and T is the data duration of s, that is, the number of sampling points of the sensor. The preprocessing operations on the sample data in this embodiment are as follows:

1) Data cleaning: Missing values in the multi-dimensional sensing data sample s are filled with the average of the preceding and following data points.

2) Data normalization: The multi-dimensional sensing data sample s is normalized so that the data lie within the range [0,1].
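The two preprocessing operations above can be sketched as follows. This is a minimal illustration rather than the embodiment's actual implementation: the function name `preprocess`, the use of NaN to mark missing values, and the choice to scale each device row separately by its own minimum and maximum are assumptions not stated in the text.

```python
import numpy as np

def preprocess(s: np.ndarray) -> np.ndarray:
    """Fill missing values and scale a sample s of shape (N, T) into [0, 1].

    Assumptions (not from the source): missing values are encoded as NaN,
    and each device row is min-max scaled independently.
    """
    s = s.astype(float).copy()
    for i in range(s.shape[0]):
        row = s[i]                      # view into s, so writes modify s
        missing = np.isnan(row)
        if missing.all():
            raise ValueError(f"device {i} has no valid readings")
        valid_idx = np.flatnonzero(~missing)
        for t in np.flatnonzero(missing):
            before = valid_idx[valid_idx < t]
            after = valid_idx[valid_idx > t]
            neighbours = []
            if before.size:
                neighbours.append(row[before[-1]])   # nearest valid value before t
            if after.size:
                neighbours.append(row[after[0]])     # nearest valid value after t
            row[t] = np.mean(neighbours)             # average of the neighbours found
    lo = s.min(axis=1, keepdims=True)
    hi = s.max(axis=1, keepdims=True)
    span = np.where(hi > lo, hi - lo, 1.0)           # constant rows map to 0
    return (s - lo) / span
```

A missing value at the start or end of a row has only one valid neighbour, so the sketch falls back to that single value.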

When dividing the data in this embodiment, a sliding window of size W is used to divide the multi-dimensional sensing data sample s, and M consecutive subsamples ss ∈ R^{N×W} are obtained.
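The window division can be illustrated with the short sketch below. The text does not state the step between consecutive windows, so the `stride` parameter (defaulting to 1) and the function name `sliding_windows` are assumptions made for illustration.

```python
import numpy as np

def sliding_windows(s: np.ndarray, window: int, stride: int = 1) -> np.ndarray:
    """Split s of shape (N, T) into M subsamples, returned with shape (M, N, W).

    The stride is a hypothetical parameter; the source only fixes the window size W.
    """
    n_devices, n_points = s.shape
    if not 1 <= window <= n_points:
        raise ValueError("window must be between 1 and T")
    starts = range(0, n_points - window + 1, stride)
    return np.stack([s[:, t:t + window] for t in starts])
```

With these assumptions, the number of subsamples is M = floor((T - W) / stride) + 1.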

It should be noted that training the anomaly detection model in this embodiment requires normal subsamples. The subsamples in this embodiment include both normal subsamples and abnormal subsamples, but the subsamples obtained in step S1 are not labeled; whether a subsample is normal or abnormal follows from whether the original data from which it was taken is normal or abnormal.

S2. Use an autoencoder to train an anomaly detection model on normal subsamples in an unsupervised manner.

In this embodiment, an autoencoder is used to train the anomaly detection model AM. The input of the autoencoder is the original subsample ss. The encoder first maps the original subsample into a low-dimensional feature space, and the decoder then maps the low-dimensional features back to a reconstructed subsample rss; the training goal is to make ss and rss as close as possible. Referring to Figure 2, the network structure of the autoencoder is as follows:

Input layer: The input is a subsample ss ∈ R^{N×W}.

Encoding layer: A two-layer LSTM is used as the encoder. The feature vectors x_1, x_2, ..., x_W of the subsample ss at each of the W time steps (each x_t is an N-dimensional feature vector) are input in sequence into the units of the first LSTM layer, and the resulting W hidden vectors are input in sequence into the units of the second LSTM layer, giving W hidden vectors h_1, h_2, ..., h_W.

Semantic layer: The hidden vector h_W is taken as the encoded low-dimensional semantic vector.

Decoding layer: A two-layer LSTM is used as the decoder. The hidden vector h_W is repeated W times and the copies are input in sequence into the units of the first LSTM layer, and the resulting W hidden vectors are then input in sequence into the units of the second LSTM layer, giving W hidden vectors g_1, g_2, ..., g_W.

Output layer: A fully connected layer converts the W hidden vectors g_1, g_2, ..., g_W into vectors y_1, y_2, ..., y_W whose dimension matches that of the subsample ss; the vectors y_1, y_2, ..., y_W form the output data rss.
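The data flow through the five layers can be summarized by the recurrences below. This is a notational sketch only: the superscript (1) for first-layer hidden vectors, the symbol z for the semantic vector, and the fully connected parameters A and b are symbols introduced here for illustration, and the LSTM cell states are omitted for brevity.

```latex
\begin{aligned}
h^{(1)}_t &= \mathrm{LSTM}^{(1)}_{\mathrm{enc}}\big(x_t,\; h^{(1)}_{t-1}\big), & t &= 1,\dots,W,\\
h_t &= \mathrm{LSTM}^{(2)}_{\mathrm{enc}}\big(h^{(1)}_t,\; h_{t-1}\big), & t &= 1,\dots,W,\\
z &= h_W, \\
g^{(1)}_t &= \mathrm{LSTM}^{(1)}_{\mathrm{dec}}\big(z,\; g^{(1)}_{t-1}\big), & t &= 1,\dots,W,\\
g_t &= \mathrm{LSTM}^{(2)}_{\mathrm{dec}}\big(g^{(1)}_t,\; g_{t-1}\big), & t &= 1,\dots,W,\\
y_t &= A\, g_t + b, & t &= 1,\dots,W,\\
rss &= [\, y_1, y_2, \dots, y_W \,] \in \mathbb{R}^{N \times W}.
\end{aligned}
```

Because the same vector z is fed to the decoder at every time step, all information about the window must pass through the single hidden vector h_W, which is what makes it a compressed semantic representation.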

During training of the anomaly detection model AM, on the one hand, to minimize the difference between ss and rss, the mean square error between the output data rss and the subsample ss is used as the loss function, and the model is optimized by gradient descent; on the other hand, so that the model learns the pattern of normal subsamples, all normal subsamples are used for training. The parameter settings of each layer of the autoencoder used in this embodiment are shown in Figure 3.
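Written out for a single subsample, the loss and the update step might take the following form. The averaging over all N·W entries, the parameter vector θ, and the learning rate η are assumptions introduced for illustration; the text specifies only that the mean square error and gradient descent are used.

```latex
\mathcal{L}(\theta) = \frac{1}{N W} \sum_{t=1}^{W} \lVert x_t - y_t \rVert_2^2,
\qquad
\theta \leftarrow \theta - \eta\, \nabla_\theta \mathcal{L}(\theta)
```

Here x_t and y_t are the t-th columns of ss and rss respectively, so the loss is zero exactly when the reconstruction matches the input.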

After training is completed, a given real-time subsample ss is input into the trained autoencoder (i.e., the anomaly detection model AM) to obtain the reconstructed subsample rss. The mean square error between ss and rss is calculated. If the mean square error is greater than a predefined threshold, the subsample is determined to be abnormal; otherwise it is determined to be normal.
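The threshold test can be sketched as below. The trained autoencoder is represented by a hypothetical callable `reconstruct`, and the function name `detect` and the returned (label, error) pair are illustrative choices; the label values 1 for abnormal and 0 for normal follow the labeling convention used in Step 31.

```python
import numpy as np
from typing import Callable, Tuple

def detect(ss: np.ndarray,
           reconstruct: Callable[[np.ndarray], np.ndarray],
           threshold: float) -> Tuple[int, float]:
    """Return (label, mse) for one subsample ss of shape (N, W).

    `reconstruct` stands in for the trained anomaly detection model AM;
    label 1 means abnormal and 0 means normal.
    """
    rss = reconstruct(ss)
    mse = float(np.mean((ss - rss) ** 2))   # reconstruction error over all N*W entries
    return (1 if mse > threshold else 0), mse
```

The text does not say how the threshold is chosen; one common option, offered here only as a suggestion, is a high percentile of the reconstruction errors observed on normal subsamples.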

S3. Train the classification model based on the anomaly detection model, including:

Step 31. Construction of the labeled subsample set: Use the anomaly detection model to detect the subsamples, which contain both normal subsamples and abnormal subsamples, and add labels to the subsamples based on the detection results. Subsamples detected as abnormal are labeled 1 and normal subsamples are labeled 0, giving the labeled subsample set LSS.

Step 32. Classification model construction: Let F be the set of N features. Taking n features from F at a time, for n = 1, 2, ..., N, yields 2^N - 1 feature subsets S. For each feature subset, a training subset containing only the features in that feature subset is generated from the labeled subsample set LSS, and an XGBoost classifier is trained on each training subset in a supervised manner, giving 2^N - 1 classification models in total.

The number of subsamples in each training subset is the same as the number of subsamples in the labeled subsample set LSS, and the labeling of each subsample remains unchanged.
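The enumeration of feature subsets and the per-subset training loop can be sketched as follows. The names `feature_subsets` and `train_all`, the (M, N, W) array layout, the flattening of each restricted subsample into one row, and the generic `fit` callable are all assumptions; in practice `fit` could be a thin wrapper around an XGBoost classifier, which is not imported here so that the sketch stays self-contained.

```python
from itertools import combinations
from typing import Any, Callable, Dict, FrozenSet, Iterator, Tuple
import numpy as np

def feature_subsets(n_features: int) -> Iterator[Tuple[int, ...]]:
    """Yield all 2^N - 1 non-empty subsets of feature indices 0..N-1."""
    for n in range(1, n_features + 1):
        yield from combinations(range(n_features), n)

def train_all(lss_x: np.ndarray, lss_y: np.ndarray,
              fit: Callable[[np.ndarray, np.ndarray], Any]) -> Dict[FrozenSet[int], Any]:
    """Train one model per feature subset.

    lss_x has shape (M, N, W) and lss_y has shape (M,). `fit` is a
    hypothetical training function that returns a fitted model.
    """
    models: Dict[FrozenSet[int], Any] = {}
    for subset in feature_subsets(lss_x.shape[1]):
        x_sub = lss_x[:, list(subset), :]            # keep only the subset's devices
        flat = x_sub.reshape(len(x_sub), -1)         # one row per subsample (assumed layout)
        models[frozenset(subset)] = fit(flat, lss_y) # labels are unchanged
    return models
```

The number of models grows exponentially with the number of devices (for example, N = 10 gives 1023 models), so the approach as described is practical mainly for systems with a modest number of devices.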

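S4. Perform real-time detection and diagnosis of industrial system production anomalies based on the anomaly detection model and the classification models, including:

S41. Obtain the real-time subsample to be detected. If the anomaly detection model determines that the real-time subsample is a normal subsample, the process ends; otherwise, proceed to the next step;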

S42. Use the classification model to sequentially calculate the feature confidence corresponding to each of the N feature dimensions based on the real-time subsamples.

This embodiment calculates the feature confidence of each feature when the real-time subsample is abnormal. Taking feature k (one of the N features) as an example, the confidence φ_k of feature k is evaluated by calculating the difference between the outputs of the classification models that use feature k and those that do not. The greater the confidence φ_k, the more important feature k is.

For feature k, the feature confidence is calculated as follows:

$$\phi_k = \sum_{S \subseteq F \setminus \{k\}} \frac{|S|!\,(N-|S|-1)!}{N!}\,\Big[ CM_{S\cup\{k\}}\big(x_{S\cup\{k\}}\big) - CM_S\big(x_S\big) \Big]$$

In the formula, φ_k is the feature confidence of feature k, k = 1, 2, ..., N; CM_S(x_S) is the output result (0 or 1) on the subsample x_S of the classification model CM_S trained on the training subset corresponding to the feature subset S, which does not contain feature k; the subsample x_S is the data extracted from the real-time subsample for the features contained in the feature subset S; CM_{S∪{k}}(x_{S∪{k}}) is the output result (0 or 1) on the subsample x_{S∪{k}} of the classification model CM_{S∪{k}} trained on the training subset corresponding to the feature subset S∪{k}, which contains feature k; the subsample x_{S∪{k}} is the data extracted from the real-time subsample for the features contained in the feature subset S∪{k}; and S ⊆ F∖{k} denotes a feature subset S that does not contain feature k.
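The confidence calculation can be sketched as a Shapley-style weighted sum over the feature subsets that exclude k. The weighting |S|!(N-|S|-1)!/N! is assumed here on the basis of the description's statement that the calculation is equivalent to computing the Shapley value. Because only the 2^N - 1 non-empty subsets have trained models, the output for the empty subset is not defined by the text; the `empty_output` baseline of 0 (normal) is a labeled assumption, as are the function name and the dictionary of model callables.

```python
from itertools import combinations
from math import factorial
from typing import Callable, Dict, FrozenSet
import numpy as np

def feature_confidence(k: int, x: np.ndarray,
                       models: Dict[FrozenSet[int], Callable[[np.ndarray], int]],
                       empty_output: float = 0.0) -> float:
    """Shapley-style confidence of feature k for a real-time subsample x of shape (N, W).

    `models` maps each non-empty feature subset to a callable returning 0 or 1.
    `empty_output` is an assumed baseline for the empty subset, which has no model.
    """
    n = x.shape[0]
    others = [f for f in range(n) if f != k]
    phi = 0.0
    for size in range(n):                            # |S| = 0 .. N-1
        weight = factorial(size) * factorial(n - size - 1) / factorial(n)
        for subset in combinations(others, size):
            s = frozenset(subset)
            with_k = s | {k}
            out_with = models[with_k](x[sorted(with_k)])              # model that uses k
            out_without = models[s](x[sorted(s)]) if s else empty_output  # model without k
            phi += weight * (out_with - out_without)
    return phi
```

As a sanity check, if every model that sees feature 0 outputs 1 and every other model outputs 0, the sketch attributes a confidence of 1 to feature 0 and 0 to all other features.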

S43. Diagnose abnormal features based on feature confidence, that is, locate abnormal equipment in the industrial system.

The abnormality diagnosis in this application is based on the SHAP explanation model, which is built on the Shapley value; the calculation of the feature confidence in this embodiment is therefore equivalent to the calculation of the Shapley value. Abnormal features are identified from the final impact values (for example, features whose impact value is higher than a set threshold). Since the features correspond to devices, the devices that may have caused the anomaly in the industrial system can be located directly.
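The selection of abnormal features can be sketched as follows, combining the Sigmoid normalization of the feature confidences into weight scores with the threshold example mentioned above. The function name `diagnose` and the default threshold of 0.6 are hypothetical; the text does not give a threshold value.

```python
import numpy as np
from typing import List, Sequence, Tuple

def diagnose(confidences: Sequence[float], threshold: float = 0.6) -> List[Tuple[int, float]]:
    """Return (feature index, weight score) pairs above the threshold, highest first.

    The threshold value is an illustrative assumption, not taken from the source.
    """
    phi = np.asarray(confidences, dtype=float)
    scores = 1.0 / (1.0 + np.exp(-phi))       # Sigmoid normalization into (0, 1)
    order = np.argsort(-scores)               # descending by weight score
    return [(int(k), float(scores[k])) for k in order if scores[k] > threshold]
```

Since each feature index corresponds to a device, the returned indices can be read directly as the candidate abnormal devices.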

To facilitate observation, this embodiment further visualizes the diagnosis results; see Figures 4 and 5. In the figures, f(x) represents the probability that the output of the classification model is an abnormal result. The region to the left of f(x) indicates a positive correlation with the abnormal detection result, and the region to the right indicates a negative correlation; the wider the region of a feature, the higher the weight score of that feature, which allows the cause of the anomaly to be diagnosed. For example, in Figure 4, features such as f4, f6, and f1 (the numbers correspond to device numbers) have relatively high impact values, indicating that the anomaly was most likely caused by the devices corresponding to these features.

The technical features of the above-described embodiments can be combined in any way. To keep the description concise, not all possible combinations of these technical features are described. However, as long as a combination of these technical features involves no contradiction, it should be considered to be within the scope of this specification.

The above-described embodiments only express several implementation modes of the present invention. The descriptions are relatively specific and detailed, but should not be construed as limiting the scope of the invention. It should be noted that, for those of ordinary skill in the art, several modifications and improvements can be made without departing from the concept of the present invention, and these all belong to the protection scope of the present invention. Therefore, the protection scope of the present invention should be determined by the appended claims.

Claims

1. An industrial system production anomaly detection and diagnosis method based on multi-dimensional sensing data, characterized in that the method includes:

S1. Preprocess the multi-dimensional sensing data samples, and use a sliding window to divide the preprocessed multi-dimensional sensing data samples into several subsamples, where the subsamples include normal subsamples and abnormal subsamples; a given multi-dimensional sensing data sample s ∈ R^{N×T} is a two-dimensional matrix, where N is the feature dimension of s, that is, the number of devices included in the industrial system, and T is the data duration of s, that is, the number of sampling points of the sensor;

S2. Use an autoencoder to train an anomaly detection model on normal subsamples in an unsupervised manner;

S3. Train the classification model based on the anomaly detection model, including:

Step 31: Use the anomaly detection model to detect subsamples containing normal subsamples and abnormal subsamples, and add labels to the subsamples based on the detection results to obtain a labeled subsample set;

Step 32. Let F be the set of N features. Taking n features from F at a time, for n = 1, 2, ..., N, yields 2^N - 1 feature subsets S. For each feature subset, generate from the labeled subsample set a training subset containing only the features in that feature subset, and use the XGBoost classifier to train a classification model in a supervised manner on each training subset, giving 2^N - 1 classification models in total;

S4. Real-time detection and diagnosis of industrial system production anomalies based on anomaly detection models and classification models, including:

Obtain the real-time subsample to be detected. If the anomaly detection model determines that the real-time subsample is a normal subsample, the process ends; otherwise, the classification models are used to calculate, in turn, the feature confidence corresponding to each of the N feature dimensions based on the real-time subsample, and abnormal features are diagnosed based on the feature confidences, that is, the abnormal equipment in the industrial system is located.
2. The industrial system production anomaly detection and diagnosis method based on multi-dimensional sensing data according to claim 1, characterized in that the preprocessing of multi-dimensional sensing data samples includes:

Missing values in the multi-dimensional sensing data sample s are filled with the average of the preceding and following data points;

The multi-dimensional sensing data sample s is normalized so that the data lie in the range [0,1].
3. The industrial system production anomaly detection and diagnosis method based on multi-dimensional sensing data according to claim 1, characterized in that the sliding window is used to divide the preprocessed multi-dimensional sensing data samples into several subsamples, including:

The multi-dimensional sensing data sample s is divided using a sliding window of size W to obtain M consecutive subsamples ss ∈ R^{N×W}.
4. The industrial system production anomaly detection and diagnosis method based on multi-dimensional sensing data according to claim 3, characterized in that the network structure of the autoencoder includes an input layer, an encoding layer, a semantic layer, a decoding layer and an output layer, wherein:

The input layer: the input is a subsample ss ∈ R^{N×W};

The encoding layer: uses a two-layer LSTM as the encoder. The N-dimensional feature vectors x_1, x_2, ..., x_W at the W time steps of the subsample ss are input in sequence into the units of the first LSTM layer, and the resulting W hidden vectors are then input in sequence into the units of the second LSTM layer, giving W hidden vectors h_1, h_2, ..., h_W;

The semantic layer: takes the hidden vector h_W as the encoded low-dimensional semantic vector;

The decoding layer: uses a two-layer LSTM as the decoder. The hidden vector h_W is repeated W times and the copies are input in sequence into the units of the first LSTM layer, and the resulting W hidden vectors are then input in sequence into the units of the second LSTM layer, giving W hidden vectors g_1, g_2, ..., g_W;

The output layer: uses a fully connected layer to convert the W hidden vectors g_1, g_2, ..., g_W into vectors y_1, y_2, ..., y_W whose dimension matches that of the subsample ss; the vectors y_1, y_2, ..., y_W form the output data rss.
5. The industrial system production anomaly detection and diagnosis method based on multi-dimensional sensing data according to claim 4, characterized in that, in the training of the anomaly detection model, the mean square error between the output data rss and the subsample ss is used as the loss function, and gradient descent is used for iterative optimization.
6. The industrial system production anomaly detection and diagnosis method based on multi-dimensional sensing data according to claim 1, characterized in that the feature confidence corresponding to each of the N feature dimensions is calculated in turn based on the real-time subsample, including:

For feature k, the feature confidence is calculated as follows:

$$\phi_k = \sum_{S \subseteq F \setminus \{k\}} \frac{|S|!\,(N-|S|-1)!}{N!}\,\Big[ CM_{S\cup\{k\}}\big(x_{S\cup\{k\}}\big) - CM_S\big(x_S\big) \Big]$$

In the formula, φ_k is the feature confidence of feature k, k = 1, 2, ..., N; CM_S(x_S) is the output result (0 or 1) on the subsample x_S of the classification model CM_S trained on the training subset corresponding to the feature subset S, which does not contain feature k; the subsample x_S is the data extracted from the real-time subsample for the features contained in the feature subset S; CM_{S∪{k}}(x_{S∪{k}}) is the output result (0 or 1) on the subsample x_{S∪{k}} of the classification model CM_{S∪{k}} trained on the training subset corresponding to the feature subset S∪{k}, which contains feature k; the subsample x_{S∪{k}} is the data extracted from the real-time subsample for the features contained in the feature subset S∪{k}; and S ⊆ F∖{k} denotes a feature subset S that does not contain feature k.
7. The industrial system production anomaly detection and diagnosis method based on multi-dimensional sensing data according to claim 1, characterized in that the diagnosis of abnormal features based on feature confidence includes:

The Sigmoid function is first used to normalize the feature confidences of all features into weight scores; the absolute value of a weight score indicates the impact of the feature on the final detection result. Based on these impact values, the SHAP explanation model is then used to explain the detection results.