CN115329219A - Complex equipment abnormity detection method based on prediction - Google Patents

Complex equipment abnormity detection method based on prediction

Info

Publication number
CN115329219A
Authority
CN
China
Prior art keywords
data
prediction
distribution
value
complex equipment
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210814189.2A
Other languages
Chinese (zh)
Inventor
张超祺
崔朗福
张庆振
陈娟
宋子雄
王钧乐
王明贤
王贺
孙宏波
胡德友
李继光
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beihang University
Capital Aerospace Machinery Co Ltd
Original Assignee
Beihang University
Capital Aerospace Machinery Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beihang University, Capital Aerospace Machinery Co Ltd
Priority to CN202210814189.2A
Publication of CN115329219A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9537 Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a prediction-based anomaly detection method for complex equipment. The method comprises: obtaining parameter combinations with strong correlation; establishing a multi-input multi-output LSTM prediction model; training the LSTM prediction model and obtaining its prediction errors on the training data; determining the statistical distribution that the error data follow; inputting the latest historical data of the complex equipment into the LSTM prediction model to obtain a predicted value, and calculating the deviation between the predicted value and the actual value when new data arrive; performing a significance test on that deviation to judge whether it is consistent with the statistical distribution determined from the training errors; and classifying the actual data as normal or abnormal, wherein data judged normal are added to the historical data and input into the LSTM prediction model for the next prediction. The method improves the accuracy and speed of anomaly detection for complex equipment and helps ensure its stable and efficient operation.

Description

Complex equipment abnormity detection method based on prediction
Technical Field
The invention relates to the technical field of complex equipment abnormity detection, in particular to a complex equipment abnormity detection method based on prediction.
Background
Complex equipment has many test parameters that influence and constrain one another. It is an integrated software and hardware system that tightly couples mechanical, hydraulic, electrical, magnetic and thermal subsystems, and it produces abundant and varied operating data. Accurate and timely anomaly detection therefore does much to safeguard the operating efficiency of such equipment.
Anomaly detection for complex equipment has long relied on manual interpretation, simple threshold checks and expert experience. The data generated by complex equipment today are large in volume, high in dimensionality, and related in complex ways across channels. Against this background, the traditional methods are inefficient, are bottlenecked by expert knowledge, cannot make effective use of massive data, are powerless against abnormal data that do not exceed a threshold, and cannot discover other, previously unrecognized anomalies.
Therefore, how to provide an anomaly detection method that improves the accuracy and speed of anomaly detection for complex equipment has become a technical problem that those skilled in the art urgently need to solve.
Disclosure of Invention
The invention aims to provide a prediction-based anomaly detection method for complex equipment so as to solve the above problems.
The technical scheme adopted by the invention to solve the technical problem is as follows:
A prediction-based anomaly detection method for complex equipment comprises the following steps:
S1, accumulating time-series data of the normal state from the historical operation of the complex equipment;
S2, performing correlation analysis on the multi-dimensional parameters of the complex equipment to obtain parameter combinations with strong correlation;
S3, establishing a multi-input multi-output LSTM prediction model, wherein the input variables and output variables are the strongly correlated parameter combinations obtained in step S2, and the prediction model uses single-step prediction;
S4, training the LSTM prediction model on the accumulated historical data of the complex equipment, and obtaining the prediction errors of the model on the training data;
S5, performing a KS (Kolmogorov-Smirnov) hypothesis test on the distribution of the training prediction errors to determine the statistical distribution that the error data follow;
S6, inputting the latest historical data of the complex equipment into the LSTM prediction model to obtain a predicted value, and calculating the deviation between the predicted value and the actual value when new data arrive;
S7, performing a significance test on the deviation between the predicted value and the actual value to judge whether the deviation is consistent with the statistical distribution obtained in step S5;
S8, classifying the actual data according to the test result of step S7: if the calculated deviation differs significantly from the statistical distribution of step S5, the data are abnormal; otherwise the data are normal;
S9, if the result of step S8 is normal, adding the actual data to the historical data and inputting them into the LSTM prediction model for the next prediction; if the result of step S8 is abnormal, adding the predicted value to the historical data and inputting them into the LSTM prediction model for rolling forward prediction; when new data arrive, steps S6, S7 and S8 are executed iteratively.
Further, the step S2 specifically includes:
preprocessing data, including missing value filling and normalization processing;
selecting a target variable and calculating, in turn, the mutual information between the target variable and each of the other variables; for two time series X and Y, the mutual information is:
$$I(X;Y) = \sum_{x \in X} \sum_{y \in Y} p(x,y) \log \frac{p(x,y)}{p(x)\,p(y)}$$
wherein p(x, y) is the joint probability distribution function of X and Y, and p(x) and p(y) are the marginal probability distribution functions of X and Y, respectively;
setting a correlation threshold; when the mutual information between two variables is greater than the threshold, the two variables are considered strongly correlated, which yields the group of variables strongly correlated with the target variable;
iteratively executing the above steps to obtain the strongly correlated parameter combinations among all parameters of the complex equipment.
Further, the step S3 specifically includes:
both the input and the output of the network comprise a combination of parameters that are strongly correlated with one another;
the input time step is m and the output time step is 1, i.e. the data of the next step are predicted from the monitoring data of the previous m steps;
the hidden layers of the network consist of alternating LSTM and dropout layers.
Further, the step S5 specifically includes:
plotting a statistical chart of the error data and making a preliminary judgment of the statistical distribution they may follow;
proposing the null hypothesis H0: the distribution F_n(x) of the prediction error data follows a given statistical distribution F(x);
calculating the absolute difference between the cumulative frequency of the sample and the cumulative probability of the theoretical distribution, and denoting the maximum absolute difference as D_n:
$$D_n = \max_x |F_n(x) - F(x)|$$
finding the critical value D_n(α) from the sample size n and the significance level α;
if D_n < D_n(α), the error data are considered to follow the distribution and the null hypothesis is accepted; otherwise the null hypothesis is rejected.
Further, the step S7 specifically includes:
proposing the null hypothesis H0: the prediction deviation is consistent with the statistical distribution determined in step S5;
constructing the test statistic: if the hypothesized distribution is a normal distribution, the test statistic is
$$Z = \frac{E - \mu_0}{\sigma}$$
if the hypothesized distribution is a t distribution, the test statistic is
$$t = \frac{E - \mu_0}{S}$$
wherein E is the prediction deviation, σ is the standard deviation of the normal distribution, S is the standard deviation of the training error data, and μ_0 is the mean of the training error data;
setting a significance level α and looking up the corresponding distribution table to obtain the critical value of the two-sided test;
comparing the value of the test statistic with the critical value at level α and making a decision: if the rejection region of the null hypothesis is met, i.e. |statistic| > critical value, the null hypothesis is rejected; otherwise the null hypothesis is accepted.
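As a worked illustration of this decision rule for the normal case, assume the standardized form Z = (E - μ_0)/σ and purely illustrative values μ_0 = 0, σ = 0.2, E = 0.5 and α = 0.05 (none of these numbers come from the patent):

```latex
Z = \frac{E - \mu_0}{\sigma} = \frac{0.5 - 0}{0.2} = 2.5,
\qquad z_{\alpha/2} = z_{0.025} \approx 1.96,
\qquad |Z| = 2.5 > 1.96 \;\Rightarrow\; \text{reject } H_0 .
```

Under these assumed values the deviation falls in the rejection region, so the data point would be treated as abnormal in step S8.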
Further, the step S8 specifically includes:
classifying the actual data according to the significance test result of step S7: if the null hypothesis is accepted, the calculated deviation is consistent with the statistical distribution obtained in step S5 and the data are considered normal; if the null hypothesis is rejected, the calculated deviation differs significantly from the statistical distribution obtained in step S5 and the data are considered abnormal.
Further, the step S9 specifically includes:
if the detection result of the step S8 is normal, adding the actual data of the step into historical data to form data to be predicted, and inputting the data to be predicted into an LSTM network for next prediction;
if the detection result of the step S8 is abnormal, adding the predicted value of the step into historical data to form data to be predicted, and inputting the data to be predicted into an LSTM network for rolling forward prediction;
when new data arrives, step S6, step S7 and step S8 are executed iteratively.
Beneficial effects:
The invention discloses a prediction-based anomaly detection method for complex equipment that uses an LSTM network to predict the multi-dimensional time-series data of the equipment. Based on hypothesis testing theory, it judges whether the deviation between the predicted value and the actual value lies within the normal range, and thereby decides whether the current data are abnormal, realizing real-time anomaly detection for complex equipment.
Drawings
Fig. 1 is a flowchart of a prediction-based complex equipment anomaly detection method provided by the present invention.
FIG. 2 is a diagram of the LSTM-based prediction model architecture of the present invention.
FIG. 3 is a flow chart of the training of the predictive model of the present invention.
FIG. 4 is a schematic diagram of the significance level of the two-sided test of the invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example 1
The invention discloses a complex equipment abnormity detection method based on prediction, which comprises the following steps:
S1, accumulating time-series data of the normal state from the historical operation of the complex equipment;
S2, performing correlation analysis on the multi-dimensional parameters of the complex equipment to obtain parameter combinations with strong correlation;
S3, establishing a multi-input multi-output LSTM prediction model, wherein the input variables and output variables are the strongly correlated parameter combinations obtained in step S2, and the prediction model uses single-step prediction;
S4, training the LSTM prediction model on the accumulated historical data of the complex equipment, and obtaining the prediction errors of the model on the training data;
S5, performing a KS hypothesis test on the distribution of the training prediction errors to determine the statistical distribution that the error data follow;
S6, inputting the latest historical data of the complex equipment into the LSTM prediction model to obtain a predicted value, and calculating the deviation between the predicted value and the actual value when new data arrive;
S7, performing a significance test on the deviation between the predicted value and the actual value to judge whether the deviation is consistent with the statistical distribution obtained in step S5;
S8, classifying the actual data according to the test result of step S7: if the calculated deviation differs significantly from the statistical distribution of step S5, the data are abnormal; otherwise the data are normal;
S9, if the result of step S8 is normal, adding the actual data to the historical data and inputting them into the LSTM prediction model for the next prediction; if the result of step S8 is abnormal, adding the predicted value to the historical data and inputting them into the LSTM prediction model for rolling forward prediction; when new data arrive, steps S6, S7 and S8 are executed iteratively.
In this embodiment, the step S2 specifically includes:
preprocessing data, including missing value filling and normalization processing;
selecting a target variable and calculating, in turn, the mutual information between the target variable and each of the other variables; for two time series X and Y, the mutual information is:
$$I(X;Y) = \sum_{x \in X} \sum_{y \in Y} p(x,y) \log \frac{p(x,y)}{p(x)\,p(y)}$$
wherein p(x, y) is the joint probability distribution function of X and Y, and p(x) and p(y) are the marginal probability distribution functions of X and Y, respectively;
setting a correlation threshold; when the mutual information between two variables is greater than the threshold, the two variables are considered strongly correlated, which yields the group of variables strongly correlated with the target variable;
iteratively executing the above steps to obtain the strongly correlated parameter combinations among all parameters of the complex equipment.
In this embodiment, the step S3 specifically includes:
both the input and the output of the network comprise a combination of parameters that are strongly correlated with one another;
the input time step is m and the output time step is 1, i.e. the data of the next step are predicted from the monitoring data of the previous m steps;
the hidden layers of the network consist of alternating LSTM and dropout layers.
In this embodiment, the step S5 specifically includes:
plotting a statistical chart of the error data and making a preliminary judgment of the statistical distribution they may follow;
proposing the null hypothesis H0: the distribution F_n(x) of the prediction error data follows a given statistical distribution F(x);
calculating the absolute difference between the cumulative frequency of the sample and the cumulative probability of the theoretical distribution, and denoting the maximum absolute difference as D_n:
$$D_n = \max_x |F_n(x) - F(x)|$$
finding the critical value D_n(α) from the sample size n and the significance level α;
if D_n < D_n(α), the error data are considered to follow the distribution and the null hypothesis is accepted; otherwise the null hypothesis is rejected.
In this embodiment, the step S7 specifically includes:
proposing the null hypothesis H0: the prediction deviation is consistent with the statistical distribution determined in step S5;
constructing the test statistic: if the hypothesized distribution is a normal distribution, the test statistic is
$$Z = \frac{E - \mu_0}{\sigma}$$
if the hypothesized distribution is a t distribution, the test statistic is
$$t = \frac{E - \mu_0}{S}$$
wherein E is the prediction deviation, σ is the standard deviation of the normal distribution, S is the standard deviation of the training error data, and μ_0 is the mean of the training error data;
setting a significance level α and looking up the corresponding distribution table to obtain the critical value of the two-sided test;
comparing the value of the test statistic with the critical value at level α and making a decision: if the rejection region of the null hypothesis is met, i.e. |statistic| > critical value, the null hypothesis is rejected; otherwise the null hypothesis is accepted.
In this embodiment, the step S8 specifically includes:
classifying the actual data according to the significance test result of step S7: if the null hypothesis is accepted, the calculated deviation is consistent with the statistical distribution obtained in step S5 and the data are considered normal; if the null hypothesis is rejected, the calculated deviation differs significantly from the statistical distribution obtained in step S5 and the data are considered abnormal.
In this embodiment, the step S9 specifically includes:
if the detection result of the step S8 is normal, adding the actual data of the step into historical data to form data to be predicted, and inputting the data to be predicted into an LSTM network for next prediction;
if the detection result of the step S8 is abnormal, adding the predicted value of the step into historical data to form data to be predicted, and inputting the data to be predicted into an LSTM network for rolling forward prediction;
when new data arrives, step S6, step S7 and step S8 are executed in an iterative manner.
Example 2
As shown in fig. 1, the present invention provides a complex equipment anomaly detection method based on prediction, which includes the following steps:
Step 1, accumulating operating-state data of the complex equipment from its historical operation, and collecting real-time monitoring data of the complex equipment. The monitoring data mainly comprise information on the operating state of the equipment, such as the system's current, voltage, attitude angle, accelerometer output, frame angle, speed, position and acceleration.
Step 2, performing correlation analysis on the multi-dimensional parameters of the complex equipment using a mutual-information-based method, mining the coupling relations among the multi-dimensional data, and obtaining parameter combinations with strong correlation, which lays the foundation for building a more accurate prediction model.
Specifically, the steps of the correlation analysis are described in detail as follows:
Step 2.1, preprocessing the data, including missing-value filling and normalization:
(1) Missing-value filling: each missing value is filled with the average of the two neighboring elements closest to it;
(2) Data normalization:
$$T_i' = \frac{T_i - \min(T)}{\max(T) - \min(T)}$$
wherein T is a time series, T_i is the i-th data point in T, T_i' is its normalized value, min(T) is the minimum value in T, and max(T) is the maximum value in T.
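The preprocessing of step 2.1 can be sketched in a few lines of Python. This is a minimal illustration, not the patent's implementation: the function names `fill_missing` and `min_max_normalize` are hypothetical, missing values are represented as `None`, a gap at the start or end of the series falls back to its single nearest neighbor, and a constant series is mapped to 0, all of which are assumptions.

```python
from typing import List, Optional


def fill_missing(series: List[Optional[float]]) -> List[float]:
    """Fill each None with the mean of the nearest known value on each side."""
    filled = []
    for i, value in enumerate(series):
        if value is not None:
            filled.append(value)
            continue
        # Nearest known neighbors to the left and right of position i.
        left = next(
            (series[j] for j in range(i - 1, -1, -1) if series[j] is not None),
            None,
        )
        right = next(
            (series[j] for j in range(i + 1, len(series)) if series[j] is not None),
            None,
        )
        neighbours = [v for v in (left, right) if v is not None]
        if not neighbours:
            raise ValueError("series contains no known values")
        # Assumption: at the boundaries only one neighbor exists and is used alone.
        filled.append(sum(neighbours) / len(neighbours))
    return filled


def min_max_normalize(series: List[float]) -> List[float]:
    """Scale values to [0, 1] with (T_i - min(T)) / (max(T) - min(T))."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0.0 for _ in series]  # assumption: a constant series maps to 0
    return [(x - lo) / (hi - lo) for x in series]


raw = [2.0, None, 4.0, 6.0, None, None, 10.0]
clean = fill_missing(raw)
scaled = min_max_normalize(clean)
```

Here the filling uses only originally observed values, so two consecutive gaps receive the same fill value.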
Step 2.2, selecting a target variable and calculating, in turn, the mutual information between the target variable and each of the other variables. For two discrete time series X and Y, the mutual information is:
$$I(X;Y) = \sum_{x \in X} \sum_{y \in Y} p(x,y) \log \frac{p(x,y)}{p(x)\,p(y)}$$
wherein p(x, y) is the joint probability distribution function of X and Y, and p(x) and p(y) are the marginal probability distribution functions of X and Y, respectively;
the stronger the dependence between the two time series, the larger the mutual information.
Step 2.3, setting a correlation threshold; when the mutual information between two variables is greater than the threshold, the two variables are considered strongly correlated, which yields the group of variables strongly correlated with the target variable.
Step 2.4, iteratively executing steps 2.2 and 2.3 to obtain the strongly correlated parameter combinations among all parameters of the complex equipment.
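Steps 2.2 and 2.3 can be sketched as follows. The patent does not say how the probabilities are estimated, so this sketch assumes equal-width histogram binning with 8 bins and the natural logarithm; the helper names, the channel names (`voltage`, `current`, `noise`) and the threshold of 1.0 are all hypothetical.

```python
import math
import random
from collections import Counter
from typing import Dict, List, Sequence


def discretize(series: Sequence[float], bins: int = 8) -> List[int]:
    """Map each value to the index of an equal-width bin (assumed estimator)."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0] * len(series)
    width = (hi - lo) / bins
    return [min(int((x - lo) / width), bins - 1) for x in series]


def mutual_information(x: Sequence[float], y: Sequence[float], bins: int = 8) -> float:
    """Estimate I(X;Y) = sum p(x,y) * log(p(x,y) / (p(x) p(y))) in nats."""
    bx, by = discretize(x, bins), discretize(y, bins)
    n = len(bx)
    joint = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        mi += p_ab * math.log(p_ab / ((px[a] / n) * (py[b] / n)))
    return mi


def strongly_related(
    target: str, data: Dict[str, Sequence[float]], threshold: float
) -> List[str]:
    """Return the variables whose mutual information with the target exceeds the threshold."""
    return [
        name
        for name, series in data.items()
        if name != target and mutual_information(data[target], series) > threshold
    ]


rng = random.Random(0)
t = [i / 10 for i in range(200)]
data = {
    "voltage": [math.sin(v) for v in t],
    "current": [2 * math.sin(v) + 0.1 for v in t],  # deterministic function of voltage
    "noise": [rng.random() for _ in t],             # unrelated channel
}
group = strongly_related("voltage", data, threshold=1.0)
```

Repeating the call with each parameter as the target corresponds to the iteration of step 2.4. Histogram estimates are biased upward for small samples, so in practice the threshold would need to be chosen with the sample size and bin count in mind.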
Step 3, establishing a multi-input multi-output LSTM prediction model from the strongly correlated parameter combinations obtained in step 2, wherein the prediction model uses single-step prediction.
Specifically, as shown in fig. 2, the structure of the LSTM prediction model is as follows:
Step 3.1, the input and the output of the network both comprise N-dimensional parameters that are strongly correlated with one another, which improves prediction accuracy and reduces computational redundancy.
Step 3.2, the input time step is m and the output time step is 1, i.e. the data of the next step are predicted from the monitoring data of the previous m steps.
Step 3.3, the hidden layers of the network consist of alternating LSTM and dropout layers; the long short-term memory mechanism of the LSTM layers effectively extracts the temporal information in the data, and the dropout layers prevent overfitting.
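The input/output shape described in step 3.2 can be illustrated by the way training samples are cut from a multivariate series. The sketch below only builds the (m steps in, 1 step out) pairs; it does not define the LSTM itself, and the name `make_windows` and the toy data are hypothetical.

```python
from typing import List, Sequence, Tuple

Row = Sequence[float]  # one time step: the values of the N correlated parameters


def make_windows(rows: Sequence[Row], m: int) -> Tuple[List[List[Row]], List[Row]]:
    """Pair every m consecutive time steps with the single step that follows them."""
    if len(rows) <= m:
        raise ValueError("need more than m time steps")
    inputs = [list(rows[i:i + m]) for i in range(len(rows) - m)]
    targets = [rows[i + m] for i in range(len(rows) - m)]
    return inputs, targets


# Toy series with N = 2 parameters and 6 time steps.
rows = [(float(t), float(10 * t)) for t in range(6)]
inputs, targets = make_windows(rows, m=3)
```

Each element of `inputs` has shape (m, N) and each element of `targets` has shape (N,), which matches a multi-input multi-output, single-step prediction setup.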
Step 4, training the LSTM prediction model on the accumulated historical data of the complex equipment, and obtaining the prediction errors e_1, e_2, e_3, ..., e_k of the model on the training data, where k is the number of training samples.
Specifically, as shown in fig. 3, the detailed flow of the LSTM network model training is as follows:
Step 4.1, constructing training samples and test samples from the time-series data according to the network structure;
Step 4.2, initializing the LSTM network parameters, and setting the loss function, the parameter optimization method and the training termination conditions;
Step 4.3, starting training: inputting the training samples and performing the forward computation;
Step 4.4, calculating the prediction error function (loss function) of the model and judging whether it meets the requirement;
Step 4.5, if the error function meets the requirement, ending the training; otherwise, back-propagating the error, optimizing the model parameters, and repeating steps 4.3 and 4.4 until the error function meets the requirement or a training termination condition is met.
Step 5, performing a KS hypothesis test on the distribution of the training prediction errors to determine the statistical distribution that the error data follow.
Specifically, the detailed flow of the KS test is as follows:
Step 5.1, plotting a statistical chart of the error data and making a preliminary judgment of the statistical distribution they may follow; the preliminary result is usually a normal distribution or a t distribution;
Step 5.2, proposing the null hypothesis H0: the distribution F_n(x) of the prediction error data follows a given statistical distribution F(x);
Step 5.3, calculating the absolute difference between the cumulative frequency of the sample and the cumulative probability of the theoretical distribution, and denoting the maximum absolute difference as D_n:
$$D_n = \max_x |F_n(x) - F(x)|$$
Step 5.4, finding the critical value D_n(α) from the sample size n and the significance level α;
Step 5.5, if D_n < D_n(α), the error data are considered to follow the distribution and the null hypothesis is accepted; otherwise the null hypothesis is rejected. If the null hypothesis is accepted, the determination is complete and this step ends; otherwise, return to step 5.2 and change the hypothesized statistical distribution.
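The KS procedure of steps 5.2 to 5.5 can be sketched for the normal case with the Python standard library. Two assumptions are made here: instead of a table lookup, the critical value uses the large-sample approximation D_n(α) ≈ sqrt(-ln(α/2)/2) / sqrt(n), and the hypothesized normal distribution is fitted to the same errors, which makes the test conservative. The function names and the synthetic error samples are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, stdev
from typing import Sequence


def ks_statistic(errors: Sequence[float], dist: NormalDist) -> float:
    """D_n = max |F_n(x) - F(x)| over the sample points."""
    xs = sorted(errors)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = dist.cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, abs((i + 1) / n - f), abs(i / n - f))
    return d


def ks_accepts(errors: Sequence[float], dist: NormalDist, alpha: float = 0.05) -> bool:
    """Accept H0 when D_n is below the (asymptotic, assumed) critical value."""
    critical = math.sqrt(-0.5 * math.log(alpha / 2)) / math.sqrt(len(errors))
    return ks_statistic(errors, dist) < critical


rng = random.Random(42)

# Synthetic "training errors" that really are normal.
errors = [rng.gauss(0.0, 0.05) for _ in range(500)]
normal_ok = ks_accepts(errors, NormalDist(mean(errors), stdev(errors)))

# Synthetic skewed errors, for which the normal hypothesis should fail.
skewed = [rng.expovariate(1.0) for _ in range(500)]
skewed_ok = ks_accepts(skewed, NormalDist(mean(skewed), stdev(skewed)))
```

When the hypothesis is rejected, as for the skewed sample, step 5.5 calls for returning to step 5.2 with a different candidate distribution.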
Step 6, inputting the latest monitoring data of the complex equipment, covering m time steps, into the prediction network model to obtain the predicted value of each parameter at the next moment, and calculating the deviation between the predicted value and the actual value when new data arrive.
Step 7, performing a two-sided significance test on the deviation between the predicted value and the actual value to judge whether the deviation is consistent with the statistical distribution obtained in step 5; a schematic diagram of the significance level of the two-sided test is shown in fig. 4.
Specifically, the detailed flow of the significance test for the deviation is as follows:
Step 7.1, proposing the null hypothesis H0: the prediction deviation is consistent with the statistical distribution determined in step 5;
Step 7.2, constructing the test statistic: if the hypothesized distribution is a normal distribution, the test statistic is
$$Z = \frac{E - \mu_0}{\sigma}$$
if the hypothesized distribution is a t distribution, the test statistic is
$$t = \frac{E - \mu_0}{S}$$
wherein E is the prediction deviation, σ is the standard deviation of the normal distribution, S is the standard deviation of the training error data, and μ_0 is the mean of the training error data;
Step 7.3, setting a significance level α, typically 0.05, 0.01 or 0.001, and looking up the corresponding distribution table to obtain the critical value of the two-sided test;
Step 7.4, comparing the value of the test statistic with the critical value at level α and making a decision: if the rejection region of the null hypothesis is met, i.e. |statistic| > critical value, the null hypothesis is rejected; otherwise the null hypothesis is accepted.
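For the normal case, the two-sided test of steps 7.1 to 7.4 can be sketched as below. The sketch assumes the standardized statistic Z = (E - μ_0)/σ and obtains the critical value from `statistics.NormalDist.inv_cdf` rather than a printed table; the function name `is_abnormal` and the example numbers are hypothetical. The t-distribution case is not covered, because the standard library has no t quantile function.

```python
from statistics import NormalDist


def is_abnormal(deviation: float, mu0: float, sigma: float, alpha: float = 0.05) -> bool:
    """Reject H0 (flag the point) when |Z| exceeds the two-sided critical value."""
    z = (deviation - mu0) / sigma                   # assumed form of the test statistic
    critical = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    return abs(z) > critical


# Training errors assumed to have mean 0 and standard deviation 0.05.
small = is_abnormal(0.03, mu0=0.0, sigma=0.05)  # Z = 0.6
large = is_abnormal(0.20, mu0=0.0, sigma=0.05)  # Z = 4.0
```

A smaller α widens the acceptance region, so fewer points are flagged at the cost of missing milder anomalies.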
Step 8, classifying the actual data according to the significance test result of step 7: if the null hypothesis is accepted, the calculated deviation is consistent with the statistical distribution obtained in step 5 and the data are considered normal; if the null hypothesis is rejected, the calculated deviation differs significantly from the statistical distribution obtained in step 5 and the data are considered abnormal.
Step 9, if the result of step 8 is normal, adding the actual data of this step to the historical data to form the data to be predicted, and inputting them into the LSTM network for the next prediction; if the result of step 8 is abnormal, adding the predicted value of this step to the historical data to form the data to be predicted, and inputting them into the LSTM network for rolling forward prediction. When new data arrive, steps 6, 7 and 8 are executed iteratively.
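The online loop of steps 6 to 9 can be sketched as follows. To keep the example self-contained, a hypothetical `predict` callable stands in for the trained LSTM (here simply the mean of the window), the series is univariate, the deviation is taken as actual minus predicted, and the error distribution is assumed normal with known μ_0 and σ. None of these choices come from the patent.

```python
from collections import deque
from statistics import NormalDist, mean
from typing import Callable, Deque, Iterable, List, Sequence, Tuple


def detect_stream(
    history: Sequence[float],
    stream: Iterable[float],
    predict: Callable[[Sequence[float]], float],
    mu0: float,
    sigma: float,
    m: int,
    alpha: float = 0.05,
) -> List[Tuple[float, bool]]:
    """Return (actual value, abnormal flag) for every incoming data point."""
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    window: Deque[float] = deque(history[-m:], maxlen=m)  # the latest m steps
    results = []
    for actual in stream:
        predicted = predict(list(window))                     # step 6: one-step prediction
        deviation = actual - predicted                        # assumed sign convention
        abnormal = abs((deviation - mu0) / sigma) > critical  # steps 7 and 8
        # Step 9: keep the actual value if normal, otherwise roll forward
        # with the predicted value so the anomaly does not enter the history.
        window.append(predicted if abnormal else actual)
        results.append((actual, abnormal))
    return results


results = detect_stream(
    history=[1.0, 1.0, 1.0, 1.0],
    stream=[1.02, 0.99, 5.0, 1.01],
    predict=mean,  # stand-in predictor, not an LSTM
    mu0=0.0,
    sigma=0.05,
    m=4,
)
flags = [abnormal for _, abnormal in results]
```

Because the abnormal reading 5.0 is replaced by its prediction in the window, the following point 1.01 is still judged against an uncontaminated history.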
The invention discloses a prediction-based anomaly detection method for complex equipment that uses an LSTM network to predict the multi-dimensional time-series data of the equipment. Based on hypothesis testing theory, it judges whether the deviation between the predicted value and the actual value lies within the normal range, and thereby decides whether the current data are abnormal, realizing real-time anomaly detection for complex equipment. By combining deep learning prediction with statistical analysis, the method uncovers potential, previously unknown abnormal operating states of the complex equipment, improves the accuracy and speed of anomaly detection, and helps ensure stable and efficient operation of the equipment.
To address the shortcomings of existing methods, the invention provides an anomaly detection method based on deep learning prediction and statistical analysis. After correlation analysis of the multi-dimensional data of the complex equipment, the multi-dimensional time-series data generated by the equipment are predicted with a long short-term memory (LSTM) neural network, and the deviation between the actual value and the predicted value is calculated when new data arrive. Based on hypothesis testing theory and the errors of the prediction model on the training data, the method judges whether the deviation lies within the normal range, and thereby decides whether the current data are abnormal, realizing real-time anomaly detection for complex equipment.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (7)

1. A complex equipment abnormity detection method based on prediction is characterized by comprising the following steps:
S1, accumulating time-series data of the normal state during the historical operation of the complex equipment;
S2, performing correlation analysis on the multi-dimensional parameters of the complex equipment to obtain parameter combinations with strong correlation;
S3, establishing an LSTM multi-input multi-output prediction model, wherein the input variables and output variables are the strongly correlated parameter combinations obtained in step S2, and the prediction model adopts a single-step prediction mode;
S4, training the LSTM prediction model on the accumulated historical data of the complex equipment, and obtaining the prediction error of the model on the training data;
S5, performing a KS hypothesis test on the distribution of the prediction errors on the training data, and determining the statistical distribution that the error data follows;
S6, inputting the latest historical data of the complex equipment into the LSTM prediction model to obtain a predicted value, and calculating the deviation between the predicted value and the actual value when new data arrives;
S7, performing a significance test on the deviation between the predicted value and the actual value, and judging whether the deviation is consistent with the statistical distribution obtained in step S5;
S8, judging whether the actual data is abnormal according to the test result of step S7: if the calculated deviation differs significantly from the statistical distribution of step S5, the data is determined to be abnormal; otherwise, the data is determined to be normal;
S9, if the detection result of step S8 is normal, adding the actual data to the historical data and inputting the historical data into the LSTM prediction model for the next prediction; if the detection result of step S8 is abnormal, adding the predicted value to the historical data and inputting the historical data into the LSTM prediction model for rolling forward prediction; when new data arrives, steps S6, S7 and S8 are executed iteratively.
2. The method according to claim 1, wherein the step S2 specifically comprises:
preprocessing data, including missing value filling and normalization processing;
selecting a target variable, and calculating in turn the mutual information between the target variable and each of the other variables; for two time series X and Y, the formula for the mutual information is:
Figure FDA0003740419780000011
wherein p(X, Y) is the joint probability distribution function of X and Y, and p(X) and p(Y) are the marginal probability distribution functions of X and Y, respectively;
setting a relevance evaluation threshold, and when mutual information between two variables is larger than the threshold, considering that the two variables have strong relevance to obtain a variable group with strong relevance to the target variable;
iteratively executing the above steps to obtain the strongly correlated parameter combinations among all parameters of the complex equipment.
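The formula image in this claim is not reproduced in the text. The sketch below assumes the standard discrete definition, I(X;Y) = Σ p(x,y) log( p(x,y) / (p(x) p(y)) ), which matches the symbols defined above, and estimates it from equal-width histograms. The function names, the bin count and the threshold value are illustrative assumptions, not part of the claims.

```python
import math
from collections import Counter


def mutual_information(x, y, bins=8):
    """Histogram estimate of I(X;Y) in nats for two equal-length series."""
    if len(x) != len(y) or not x:
        raise ValueError("series must be non-empty and of equal length")

    def discretize(values):
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins or 1.0  # a constant series falls into one bin
        return [min(int((v - lo) / width), bins - 1) for v in values]

    bx, by = discretize(x), discretize(y)
    n = len(x)
    joint = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    # I(X;Y) = sum over occupied cells of p(x,y) * log(p(x,y) / (p(x) p(y)))
    return sum(
        (c / n) * math.log((c / n) / ((px[i] / n) * (py[j] / n)))
        for (i, j), c in joint.items()
    )


def strongly_correlated(target, others, threshold):
    """Names of the series whose mutual information with the target exceeds the threshold."""
    return [name for name, series in others.items()
            if mutual_information(target, series) > threshold]
```

Repeating `strongly_correlated` with each parameter as the target in turn would give the parameter groups described in the last step of this claim. The histogram estimator is only one possible choice; the claims do not say how the probabilities are estimated.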
3. The method according to claim 2, wherein the step S3 specifically comprises:
both the input and output of the network comprise a combination of parameters, which have strong correlation;
the input time step is m and the output time step is 1, that is, the data of the next 1 step is predicted from the monitoring data of the previous m steps;
the hidden layers of the network consist of alternately connected LSTM layers and dropout layers.
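The m-in, 1-out arrangement can be illustrated by how training samples would be cut from the historical series. This is a minimal sketch of the windowing only; the function name `make_windows` is hypothetical, and the network itself (LSTM and dropout layers) is not shown because the claims do not fix a framework or layer sizes.

```python
def make_windows(series, m):
    """Split a multivariate series into (input, target) pairs.

    series: list of time steps, each a list of parameter values.
    m: number of past steps fed to the model (the input time step).
    Each input holds m consecutive steps; the target is the single next step.
    """
    if m < 1 or len(series) <= m:
        raise ValueError("need m >= 1 and more than m time steps")
    inputs = [series[t - m:t] for t in range(m, len(series))]
    targets = [series[t] for t in range(m, len(series))]
    return inputs, targets
```

Because the same parameter combination appears in both the input and the target, each target row has the same width as one input step, which is what makes the model multi-input multi-output.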
4. The method according to claim 3, wherein the step S5 specifically comprises:
drawing a statistical chart of the error data, and preliminarily judging which statistical distribution the data may follow;
proposing the original hypothesis H0: the distribution $F_n(x)$ of the prediction error data follows a certain statistical distribution $F(x)$;
calculating the absolute difference between the sample cumulative frequency and the theoretical cumulative probability, and denoting the maximum absolute difference by $D_n$:
$D_n = \max |F_n(x) - F(x)|$
finding the critical value $D_n(\alpha)$ from the sample size n and the significance level α;
if $D_n < D_n(\alpha)$, the error data is considered to conform to the distribution and the original hypothesis is accepted; otherwise, the original hypothesis is rejected.
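The KS procedure of this claim can be sketched as follows for the case where the candidate distribution is normal. Two points are assumptions rather than part of the claims: the critical value is approximated by the large-sample form c/√n with c = 1.36 for α = 0.05, where the claim instead looks it up from n and α; and the normal parameters are estimated from the same sample, which makes the test more lenient than the tabulated critical value suggests.

```python
import math
from statistics import NormalDist, mean, stdev


def ks_statistic(errors, cdf):
    """D_n = max |F_n(x) - F(x)|, checked on both sides of each step of F_n."""
    xs = sorted(errors)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x.
        d = max(d, abs((i + 1) / n - f), abs(f - i / n))
    return d


def ks_accepts_normal(errors, c_alpha=1.36):
    """Accept H0 (errors follow a normal distribution) when D_n < D_n(alpha).

    Assumption: D_n(alpha) is approximated by c_alpha / sqrt(n), the
    large-sample critical value; c_alpha = 1.36 corresponds to alpha = 0.05.
    """
    fitted = NormalDist(mean(errors), stdev(errors))
    d_n = ks_statistic(errors, fitted.cdf)
    return d_n < c_alpha / math.sqrt(len(errors)), d_n
```

For small training sets a tabulated critical value would be more appropriate than the asymptotic approximation used here.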
5. The method for detecting the abnormality of the complex equipment based on the prediction as claimed in claim 4, wherein the step S7 includes the following specific steps:
the original hypothesis H0 is proposed: the predicted deviation value is consistent with the statistical distribution determined in the step S5;
constructing test statistics: if the original assumed distribution is normal distribution, the test statistic is
Figure FDA0003740419780000021
If the distribution of the original hypothesis is t distribution, the test statistic is
Figure FDA0003740419780000022
wherein E is the prediction deviation value, σ is the standard deviation of the normal distribution, S is the standard deviation of the training error data, and $\mu_0$ is the mean of the training error data;
setting a significance level α, and looking up the corresponding distribution table to obtain the critical value of the two-sided test;
comparing the value of the test statistic with the critical value at level α and making a decision: if the statistic falls within the rejection region of the original hypothesis, the original hypothesis is rejected; otherwise, the original hypothesis is accepted.
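The test-statistic formulas in this claim are images that are not reproduced in the text. The sketch below covers the normal-distribution case only and assumes the statistic has the form (E − μ₀)/σ for a single deviation E, with μ₀ and σ estimated from the training errors. The function name and the default α are illustrative.

```python
from statistics import NormalDist, mean, stdev


def deviation_is_abnormal(deviation, training_errors, alpha=0.05):
    """Two-sided test of one prediction deviation E against normal training errors.

    Assumption: the statistic is (E - mu_0) / sigma, with mu_0 and sigma
    estimated from the training errors; the claim's own formula is an image
    that is not available in the text.
    """
    mu_0 = mean(training_errors)
    sigma = stdev(training_errors)
    z = (deviation - mu_0) / sigma
    critical = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    return abs(z) > critical, z
```

For the t-distribution case, the same structure would apply with S in place of σ and a critical value taken from the t table for the appropriate degrees of freedom.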
6. The method according to claim 5, wherein the step S8 specifically comprises:
according to the significance test result of step S7, judging whether the actual data is abnormal: if the original hypothesis is accepted, the calculated deviation is consistent with the statistical distribution obtained in step S5, and the data is considered normal; if the original hypothesis is rejected, the calculated deviation differs significantly from the statistical distribution obtained in step S5, and the data is considered abnormal.
7. The method according to claim 6, wherein the step S9 specifically comprises:
if the detection result of the step S8 is normal, adding the actual data of the step into historical data to form data to be predicted, and inputting the data to be predicted into an LSTM network for next prediction;
if the detection result of the step S8 is abnormal, adding the predicted value of the step into historical data to form data to be predicted, and inputting the data to be predicted into an LSTM network for rolling forward prediction;
when new data arrives, step S6, step S7 and step S8 are executed in an iterative manner.
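The history-update rule of steps S6 to S9 can be sketched as a loop over incoming observations. To stay self-contained, the sketch uses a single scalar parameter and takes the predictor and the significance test as callables; in the claimed method these would be the trained LSTM model and the test of step S7. All names are hypothetical.

```python
def rolling_detect(history, stream, predict, is_abnormal, m):
    """Run steps S6 to S9 over a stream of new observations.

    history: list of past values (at least m of them).
    predict: callable mapping the last m values to a one-step prediction.
    is_abnormal: callable mapping a deviation to True or False.
    """
    history = list(history)  # work on a copy
    flags = []
    for actual in stream:
        predicted = predict(history[-m:])           # S6: one-step prediction
        abnormal = is_abnormal(actual - predicted)  # S7 and S8: test the deviation
        flags.append(abnormal)
        # S9: keep the actual value if normal, otherwise substitute the prediction
        history.append(predicted if abnormal else actual)
    return flags, history
```

Substituting the predicted value after an abnormal reading keeps the anomaly out of the model input, so later predictions continue to reflect normal behaviour.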
CN202210814189.2A 2022-07-11 2022-07-11 Complex equipment abnormity detection method based on prediction Pending CN115329219A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210814189.2A CN115329219A (en) 2022-07-11 2022-07-11 Complex equipment abnormity detection method based on prediction

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210814189.2A CN115329219A (en) 2022-07-11 2022-07-11 Complex equipment abnormity detection method based on prediction

Publications (1)

Publication Number Publication Date
CN115329219A true CN115329219A (en) 2022-11-11

Family

ID=83916781

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210814189.2A Pending CN115329219A (en) 2022-07-11 2022-07-11 Complex equipment abnormity detection method based on prediction

Country Status (1)

Country Link
CN (1) CN115329219A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117370870A (en) * 2023-12-05 2024-01-09 浙江大学 Knowledge and data compound driven equipment multi-working condition identification and performance prediction method
CN117370870B (en) * 2023-12-05 2024-02-20 浙江大学 Knowledge and data compound driven equipment multi-working condition identification and performance prediction method

Similar Documents

Publication Publication Date Title
CN112637132B (en) Network anomaly detection method and device, electronic equipment and storage medium
CN112529341B (en) Drilling well leakage probability prediction method based on naive Bayesian algorithm
CN113762329A (en) Method and system for constructing state prediction model of large rolling mill
CN110472671B (en) Multi-stage-based fault data preprocessing method for oil immersed transformer
CN110493221B (en) Network anomaly detection method based on clustering contour
CN113010504B (en) Electric power data anomaly detection method and system based on LSTM and improved K-means algorithm
CN113076700A (en) SVM-LDA rock burst machine learning prediction model method based on data analysis principle
CN115329219A (en) Complex equipment abnormity detection method based on prediction
CN118041661A (en) Abnormal network flow monitoring method, device and equipment based on deep learning and readable storage medium
CN108446714A (en) A kind of non-Markovian degeneration system method for predicting residual useful life under multi-state
CN112305441A (en) Power battery health state assessment method under integrated clustering
CN114036647A (en) Power battery safety risk assessment method based on real vehicle data
CN116821832A (en) Abnormal data identification and correction method for high-voltage industrial and commercial user power load
CN117369455A (en) Autonomous exploration method and system for robot based on generation of countermeasure network
CN116861214A (en) Health state identification method and system based on convolution long short-time memory network
CN116365519B (en) Power load prediction method, system, storage medium and equipment
CN113033898A (en) Electrical load prediction method and system based on K-means clustering and BI-LSTM neural network
CN115878987A (en) Fault positioning method based on contribution value and causal graph
CN115935285A (en) Multi-element time series anomaly detection method and system based on mask map neural network model
CN115883182A (en) Method and system for improving network security situation element identification efficiency
CN115834424A (en) Method for identifying and correcting abnormal data of line loss of power distribution network
CN112968740B (en) Satellite spectrum sensing method based on machine learning
CN115544886A (en) Method, system, apparatus and medium for predicting failure time node of high-speed elevator
CN114943281A (en) Intelligent decision-making method and system for heat pipe cooling reactor
CN114897193A (en) Airplane structure maintenance decision method and decision system based on man-in-the-loop

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination