A data-driven method for finding the root cause of alarms
Technical field
The invention belongs to the field of safety monitoring technology, and more particularly relates to a data-driven method for finding the root cause of alarms.
Background technology
As the safety and reliability requirements placed on industrial systems keep rising, online, real-time monitoring of system operation has become an essential part of modern industrial systems. Because an accurate mathematical model and prior knowledge of the system are difficult to obtain, while industrial systems produce large amounts of historical data, data-driven process monitoring has become the mainstream technology for industrial safety monitoring. Raising an alarm after a fault occurs helps operators assess the state of the system in time, but alarming alone cannot determine why the alarm occurred. An alarm root cause finding method identifies the cause at the moment the alarm occurs, and has therefore received considerable attention.
An alarm root cause finding method locates the fault accurately through a series of measures and helps operators isolate and eliminate it in time. After years of development, a variety of root cause finding techniques have been proposed; they fall into three broad categories:
1) the signed directed graph method, which depends on a physical model and prior knowledge of the system;
2) the Granger causality analysis method, which defines causality in terms of prediction;
3) the transfer entropy (TE) method.
The first two methods are applicable only to linear systems, obtain the relationships between variables by building a model, and are unsuitable for large-scale complex systems. The last method obtains the causal relationships between variables mainly by computing the probability density functions of the process variables; it can be applied to complex nonlinear systems and is more practical. Its drawback is that it requires a large amount of modeling data, but the large volumes of data produced by modern industrial systems make up for exactly this drawback.
The inventors therefore use the transfer entropy method to determine the causal relationships between variables, laying a solid foundation for finding the alarm root cause.
Summary of the invention
The object of the invention is to provide a data-driven alarm root cause finding method that does not rely on a physical model or prior knowledge of the system. Causal relationships are obtained from process measurements alone, so the root cause can be sought as soon as an alarm is raised. The fault can then be isolated and removed in time, reducing or even avoiding accidents and improving the safety and reliability of system operation.
To achieve this object, the invention adopts the following technical scheme:
A data-driven alarm root cause finding method, comprising the following steps:
Step one: collect operating data of the industrial system to obtain the observed variables, store the d observed variables in a data matrix X, check the time stationarity of the data, and preprocess the data;
the operating data comprise parameters that reflect the operating condition of the system;
Step two: initialize the model parameters and optimize them using the Cao criterion or the Ragwitz criterion;
Step three: calculate the transfer entropy matrix P, including:
A. Select variables: take any two variables from the data matrix X and label them x and y, giving d(d-1)/2 combinations in total;
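B. Calculate the transfer entropy between the two variables:
$$T_{x\to y}=\int f\left(y_{i+h_1},\mathbf{y}_i^{(k_1)},\mathbf{x}_i^{(l_1)}\right)\log\frac{f\left(y_{i+h_1}\mid\mathbf{y}_i^{(k_1)},\mathbf{x}_i^{(l_1)}\right)}{f\left(y_{i+h_1}\mid\mathbf{y}_i^{(k_1)}\right)}\,dw$$
where $f\left(y_{i+h_1},\mathbf{y}_i^{(k_1)},\mathbf{x}_i^{(l_1)}\right)$ is the joint probability density function, $f(\cdot\mid\cdot)$ is a conditional probability density function, and $w$ is the random vector $\left[y_{i+h_1},\mathbf{y}_i^{(k_1)},\mathbf{x}_i^{(l_1)}\right]$; assuming the elements of $w$ are $w_1,w_2,\dots,w_s$, $\int(\cdot)\,dw$ denotes the $s$-fold integral $\int\cdots\int(\cdot)\,dw_1\cdots dw_s$; $\mathbf{x}_i^{(l_1)}=\left[x_i,x_{i-\tau_1},\dots,x_{i-(l_1-1)\tau_1}\right]$ and $\mathbf{y}_i^{(k_1)}=\left[y_i,y_{i-\tau_1},\dots,y_{i-(k_1-1)\tau_1}\right]$ are the embedding vectors of the past measurements of x and y, respectively; $k_1$ and $l_1$ are the embedding dimensions of y and x, respectively; $\tau_1$ is the time interval; and $h_1$ is the prediction horizon;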
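C. Calculate the normalized transfer entropy $NTE_{x\to y}$:
[equation not recoverable from the source text]
where $H$ denotes entropy and $H(\cdot\mid\cdot)$ denotes conditional entropy; in general $T_{x\to y}\neq T_{y\to x}$;
if $NTE_{x\to y}$ exceeds a specified threshold, the two variables x and y are judged to have a causal relationship;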
D. Repeat steps B and C for the d(d-1)/2 variable combinations until all d(d-1) normalized transfer entropy values have been calculated, store them in the matrix P, and then represent the variables that have causal relationships in an information flow diagram;
Step four: calculate the normalized direct transfer entropy based on the causal relationships between variables in the information flow diagram:
take any three causally related variables x, y, and z from the matrix P, where z is the intermediate variable, and determine whether x and y have a direct causal relationship, including:
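1) Calculate the direct transfer entropy:
$$DTE_{x\to y}=\int f\left(y_{i+h},\mathbf{y}_i^{(k)},\mathbf{z}_{i+h-h_3}^{(m_2)},\mathbf{x}_{i+h-h_1}^{(l_1)}\right)\log\frac{f\left(y_{i+h}\mid\mathbf{y}_i^{(k)},\mathbf{z}_{i+h-h_3}^{(m_2)},\mathbf{x}_{i+h-h_1}^{(l_1)}\right)}{f\left(y_{i+h}\mid\mathbf{y}_i^{(k)},\mathbf{z}_{i+h-h_3}^{(m_2)}\right)}\,dv$$
where $v$ denotes the random vector $\left[y_{i+h},\mathbf{y}_i^{(k)},\mathbf{z}_{i+h-h_3}^{(m_2)},\mathbf{x}_{i+h-h_1}^{(l_1)}\right]$; the prediction horizon is $h=\max(h_1,h_3)$; the embedding vector $\mathbf{z}_{i+h-h_3}^{(m_2)}$ holds the past values of z that provide useful information for predicting y at time $i+h$; $\mathbf{x}_{i+h-h_1}^{(l_1)}$ holds the past values of x; if $h=h_1$, then $\mathbf{y}_i^{(k)}=\mathbf{y}_i^{(k_1)}$, and if $h=h_3$, then $\mathbf{y}_i^{(k)}=\mathbf{y}_i^{(k_2)}$; when calculating $T_{x\to z}$, $l_2$ and $m_1$ are the embedding dimensions of x and z, $h_2$ is the prediction horizon, and $\tau_2$ is the time interval; when calculating $T_{z\to y}$, $k_2$ and $m_2$ are the embedding dimensions of y and z, $h_3$ is the prediction horizon, and $\tau_3$ is the time interval;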
2) Calculate the normalized direct transfer entropy $NDTE_{x\to y}$:
[equation not recoverable from the source text]
if $NDTE_{x\to y}$ exceeds a specified threshold, x and y have a direct causal relationship;
perform calculations 1) and 2) above for the variables in the information flow diagram of step three to verify whether each causal relationship is genuine;
Step five: establish the direct causality diagram of the variables according to the result of step four.
In step one, the time stationarity of the data is checked with the augmented Dickey-Fuller (ADF) test.
In step one, preprocessing the data includes handling data noise with methods such as filtering.
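As a rough illustration of the stationarity check in step one, the sketch below computes the statistic of a plain Dickey-Fuller regression (constant term, no lagged differences). This is a simplification of the augmented test, not the augmented test itself. The function name `df_statistic` and the example series are hypothetical, and -2.86 is the commonly tabulated asymptotic 5% critical value for the constant-only case. In practice a library implementation of the augmented test with lag selection would normally be preferred.

```python
import math
import random

def df_statistic(series):
    """t-statistic of rho in the regression dy_t = a + rho * y_{t-1} + e_t."""
    y_lag = series[:-1]
    dy = [b - a for a, b in zip(series[:-1], series[1:])]
    n = len(dy)
    mean_lag = sum(y_lag) / n
    mean_dy = sum(dy) / n
    sxx = sum((v - mean_lag) ** 2 for v in y_lag)
    sxy = sum((v - mean_lag) * (d - mean_dy) for v, d in zip(y_lag, dy))
    rho = sxy / sxx
    intercept = mean_dy - rho * mean_lag
    # Residual sum of squares of the fitted regression.
    rss = sum((d - intercept - rho * v) ** 2 for v, d in zip(y_lag, dy))
    se_rho = math.sqrt(rss / (n - 2) / sxx)
    return rho / se_rho

CRITICAL_5PCT = -2.86  # asymptotic 5% critical value, constant-only case

random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(500)]  # hypothetical stationary record
stat = df_statistic(noise)
# A statistic below the critical value rejects the unit-root hypothesis.
is_stationary = stat < CRITICAL_5PCT
```

A series that fails the check would typically be differenced or detrended before the transfer entropy calculation; that choice is not specified in the text.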
In step four, when three causally related variables x, y, and z are taken from the matrix P, the variable z may be empty, in which case x and y are adjacent variables.
With the above scheme, the invention has the following advantages: the model is built solely from the large amount of data reflecting system operation and does not depend on a physical model or prior knowledge of the system, so there are few restrictions and the method is widely applicable; in addition, the fault is located at the initial stage of the alarm, so it can be eliminated quickly, which reduces major accidents, improves the safety and reliability of the system, and improves economic benefit.
The invention is further described below with reference to the accompanying drawings.
Brief description of the drawings
Fig. 1 is a flow chart of the data-driven alarm root cause finding method of the invention.
Fig. 2 shows the relationships among variables x, y, and z.
Fig. 3 is the information flow diagram of variables x, y, and z in which there is a direct causal relationship from z to y.
Fig. 4 is the information flow diagram of variables x, y, and z in which there is no causal relationship from z to y.
Fig. 5 is the information flow diagram based on the normalized transfer entropy.
Fig. 6 shows the calculation steps of the normalized direct transfer entropy.
Fig. 7 is the information flow diagram based on the normalized direct transfer entropy.
Detailed description of the embodiments
Embodiment one
This embodiment is described with reference to Fig. 1. The data-driven alarm root cause finding method disclosed in embodiment one of the invention is carried out in the following steps:
Step one: collect operating data of the industrial system to obtain the observed variables, and store the d observed variables in a data matrix X; this embodiment uses the augmented Dickey-Fuller (ADF) test to check the time stationarity of the data and preprocesses the data, the preprocessing handling data noise with methods such as filtering; the operating data comprise parameters that reflect the operating condition of the system, such as temperature, pressure, and water level;
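The preprocessing in step one can be sketched as follows. The text only says "methods such as filtering", so the centered moving average used here is one possible choice, not the filter of the embodiment. The function name, the window length, and the sample readings are all hypothetical.

```python
def moving_average(signal, window=5):
    """Centered moving average; the window shrinks near both ends of the record."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        lo = max(0, i - half)
        hi = min(len(signal), i + half + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out

# Hypothetical raw measurements: one list per observed variable.
raw = {
    "temperature": [20.0, 20.4, 19.8, 20.9, 20.1, 20.3],
    "pressure": [1.00, 1.02, 0.97, 1.05, 1.01, 0.99],
    "water_level": [3.0, 3.1, 2.9, 3.2, 3.0, 3.1],
}
names = list(raw)
filtered = [moving_average(raw[name], window=3) for name in names]

# Data matrix X: one row per sample, one column per observed variable (N x d).
X = [list(row) for row in zip(*filtered)]
```

Storing the variables as columns keeps each row aligned in time, which the embedding vectors of the later steps rely on.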
Step two: initialize the model parameters and optimize them using the Cao criterion; here the model is the model established for the causal relationships between variables, the model parameters are the settings required to build that model, and the preprocessed operating data are the model input;
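To make the role of the model parameters concrete, the sketch below builds the delay-embedding vectors that the transfer entropy calculation consumes. It assumes the parameters are the embedding dimensions (`k`, `l`), the time interval (`tau`), and the prediction horizon (`h`), and fixes them by hand; selecting them with the Cao or Ragwitz criterion is not shown. The function names and the toy series are hypothetical.

```python
def embedding_vector(series, i, dim, tau):
    """Return [s_i, s_{i-tau}, ..., s_{i-(dim-1)*tau}]."""
    return [series[i - j * tau] for j in range(dim)]

def te_samples(x, y, k, l, tau, h):
    """Collect (y_{i+h}, y_i^(k), x_i^(l)) for every index i that has a full history.

    x and y are assumed to have the same length.
    """
    start = (max(k, l) - 1) * tau  # first index with enough past values
    samples = []
    for i in range(start, len(y) - h):
        samples.append(
            (y[i + h], embedding_vector(y, i, k, tau), embedding_vector(x, i, l, tau))
        )
    return samples

# Toy series chosen so the indices are easy to follow.
x = [0, 1, 2, 3, 4, 5, 6, 7]
y = [10, 11, 12, 13, 14, 15, 16, 17]
samples = te_samples(x, y, k=2, l=3, tau=2, h=1)
```

Larger dimensions and intervals shorten the usable record, since the first `(max(k, l) - 1) * tau` samples have no complete history.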
Step three: calculate the transfer entropy matrix P:
A. Take three variables from the data matrix X, label them x, y, and z, and calculate the normalized transfer entropy (normalized TE, NTE) between any two variables; over the whole matrix there are d(d-1)/2 combinations;
these three variables are used as an example to describe how the normalized direct transfer entropy is calculated;
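B. Calculate the TE value from x to y:
$$T_{x\to y}=\int f\left(y_{i+h_1},\mathbf{y}_i^{(k_1)},\mathbf{x}_i^{(l_1)}\right)\log\frac{f\left(y_{i+h_1}\mid\mathbf{y}_i^{(k_1)},\mathbf{x}_i^{(l_1)}\right)}{f\left(y_{i+h_1}\mid\mathbf{y}_i^{(k_1)}\right)}\,dw$$
where $f\left(y_{i+h_1},\mathbf{y}_i^{(k_1)},\mathbf{x}_i^{(l_1)}\right)$ is the joint probability density function, $f(\cdot\mid\cdot)$ is a conditional probability density function, and $w$ is the random vector $\left[y_{i+h_1},\mathbf{y}_i^{(k_1)},\mathbf{x}_i^{(l_1)}\right]$; assuming the elements of $w$ are $w_1,w_2,\dots,w_s$, $\int(\cdot)\,dw$ denotes $\int\cdots\int(\cdot)\,dw_1\cdots dw_s$; $\mathbf{x}_i^{(l_1)}=\left[x_i,x_{i-\tau_1},\dots,x_{i-(l_1-1)\tau_1}\right]$ and $\mathbf{y}_i^{(k_1)}=\left[y_i,y_{i-\tau_1},\dots,y_{i-(k_1-1)\tau_1}\right]$ are the embedding vectors of the past measurements of x and y, respectively; $k_1$ and $l_1$ are the embedding dimensions of y and x, respectively; $\tau_1$ is the time interval; and $h_1$ is the prediction horizon;
if $T_{x\to y}=0$, x and y have no causal relationship;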
C. Calculate the NTE value from x to y:
[equation not recoverable from the source text]
where $H(\cdot\mid\cdot)$ is the conditional entropy and $H$ denotes entropy;
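D. Calculate the TE value from x to z:
$$T_{x\to z}=\int f\left(z_{i+h_2},\mathbf{z}_i^{(m_1)},\mathbf{x}_i^{(l_2)}\right)\log\frac{f\left(z_{i+h_2}\mid\mathbf{z}_i^{(m_1)},\mathbf{x}_i^{(l_2)}\right)}{f\left(z_{i+h_2}\mid\mathbf{z}_i^{(m_1)}\right)}\,d\eta$$
where $\mathbf{z}_i^{(m_1)}=\left[z_i,z_{i-\tau_2},\dots,z_{i-(m_1-1)\tau_2}\right]$ and $\mathbf{x}_i^{(l_2)}=\left[x_i,x_{i-\tau_2},\dots,x_{i-(l_2-1)\tau_2}\right]$ are embedding vectors with time interval $\tau_2$, $\eta$ is the random vector $\left[z_{i+h_2},\mathbf{z}_i^{(m_1)},\mathbf{x}_i^{(l_2)}\right]$, and $h_2$ is the prediction horizon;
if $T_{x\to z}=0$, x and z have no causal relationship;
E. Calculate the NTE value from x to z in the same way as in step C;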
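F. Calculate the TE value from z to y:
$$T_{z\to y}=\int f\left(y_{i+h_3},\mathbf{y}_i^{(k_2)},\mathbf{z}_i^{(m_2)}\right)\log\frac{f\left(y_{i+h_3}\mid\mathbf{y}_i^{(k_2)},\mathbf{z}_i^{(m_2)}\right)}{f\left(y_{i+h_3}\mid\mathbf{y}_i^{(k_2)}\right)}\,d\zeta$$
where $\mathbf{y}_i^{(k_2)}=\left[y_i,y_{i-\tau_3},\dots,y_{i-(k_2-1)\tau_3}\right]$ and $\mathbf{z}_i^{(m_2)}=\left[z_i,z_{i-\tau_3},\dots,z_{i-(m_2-1)\tau_3}\right]$ are embedding vectors with time interval $\tau_3$, $\zeta$ is the random vector $\left[y_{i+h_3},\mathbf{y}_i^{(k_2)},\mathbf{z}_i^{(m_2)}\right]$, and $h_3$ is the prediction horizon;
if $T_{z\to y}=0$, z and y have no causal relationship;
G. Calculate the NTE value from z to y in the same way as in step C;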
The TE value is calculated for every pair of variables in the data matrix X, and the TE values are stored in the d × d matrix P;
the diagonal elements of P would be the transfer entropy from a variable to itself, and their value is NA;
when an NTE value exceeds the specified threshold, the two variables are judged to have a causal relationship, and the NTE-based information flow diagram is drawn accordingly;
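The thresholding rule just described can be sketched as below. The matrix values are invented for illustration and are not results from the text; the threshold 0.02 is borrowed from the worked example later in the description. The function name `causal_edges` is hypothetical, and `None` stands in for the NA diagonal.

```python
THRESHOLD = 0.02  # value used in the worked example; application-specific in general

names = ["y1", "y2", "y3"]
# Hypothetical NTE matrix: P[i][j] is the NTE from names[i] to names[j].
P = [
    [None, 0.15, 0.08],
    [0.01, None, 0.12],
    [0.00, 0.005, None],
]

def causal_edges(matrix, labels, threshold):
    """Return a directed edge (source, target) for every NTE above the threshold."""
    edges = []
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if i != j and value is not None and value > threshold:
                edges.append((labels[i], labels[j]))
    return edges

# Edge list of the NTE-based information flow diagram.
edges = causal_edges(P, names, THRESHOLD)
```

Because the matrix is not symmetric, each direction of a pair is tested separately, so an edge may exist from y1 to y2 without one from y2 to y1.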
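It should be noted that the probability density functions are generally estimated with a Gaussian kernel function.
The univariate probability density function can be calculated with the following equation:
$$\hat{f}(x)=\frac{1}{N\gamma}\sum_{i=1}^{N}K\!\left(\frac{x-x_i}{\gamma}\right),\qquad K(u)=\frac{1}{\sqrt{2\pi}}e^{-u^{2}/2}$$
where $N$ is the number of samples and $\gamma$ is the bandwidth, chosen to reduce the error of the PDF estimate: $\gamma=c\,\sigma N^{-1/5}$, where $\sigma$ is the standard deviation of the samples and $c=(4/3)^{1/5}\approx 1.06$;
for the d-dimensional multivariable case, the PDF estimate can be calculated with the following equation:
$$\hat{f}(x_1,\dots,x_d)=\frac{1}{N\gamma_1\cdots\gamma_d}\sum_{i=1}^{N}\prod_{s=1}^{d}K\!\left(\frac{x_s-x_{i,s}}{\gamma_s}\right)$$
where $\gamma_s=c\,\sigma_s N^{-1/(4+d)}$, $\sigma_s$ is the standard deviation of the $s$-th component of the samples, and $s=1,\dots,d$;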
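A minimal sketch of how a Gaussian kernel density estimate can feed a transfer entropy estimate is given below. It makes two simplifying assumptions that are not from the text: the embedding dimensions are fixed at one with a unit prediction horizon, and the integral is replaced by an average of the log ratio over the observed samples rather than evaluated numerically. The bandwidth follows the rule of thumb $c\,\sigma_s N^{-1/(4+d)}$. All names and the synthetic data are hypothetical.

```python
import math
import random

C = (4.0 / 3.0) ** 0.2  # about 1.06

def bandwidths(points):
    """Per-dimension bandwidth c * sigma_s * N^(-1/(4+d))."""
    n, d = len(points), len(points[0])
    gammas = []
    for s in range(d):
        col = [p[s] for p in points]
        mean = sum(col) / n
        sigma = math.sqrt(sum((v - mean) ** 2 for v in col) / (n - 1))
        gammas.append(C * sigma * n ** (-1.0 / (4 + d)))
    return gammas

def kde(points):
    """Return a function evaluating the product-Gaussian-kernel density estimate."""
    n, d = len(points), len(points[0])
    gammas = bandwidths(points)
    norm = n * math.prod(gammas) * (2.0 * math.pi) ** (d / 2.0)

    def density(q):
        total = 0.0
        for p in points:
            total += math.exp(-0.5 * sum(((q[s] - p[s]) / gammas[s]) ** 2 for s in range(d)))
        return total / norm

    return density

def transfer_entropy(source, target, h=1):
    """Sample-average estimate of TE from source to target with k = l = 1."""
    trip = [(target[i + h], target[i], source[i]) for i in range(len(target) - h)]
    f_full = kde(trip)                                # f(t_{i+h}, t_i, s_i)
    f_past = kde([(b, c) for _, b, c in trip])        # f(t_i, s_i)
    f_self = kde([(a, b) for a, b, _ in trip])        # f(t_{i+h}, t_i)
    f_own = kde([(b,) for _, b, _ in trip])           # f(t_i)
    total = 0.0
    for a, b, c in trip:
        # Ratio of conditional densities rewritten in terms of joint densities.
        total += math.log(f_full((a, b, c)) * f_own((b,)) / (f_past((b, c)) * f_self((a, b))))
    return total / len(trip)

# Synthetic data in which x drives y with a one-sample delay.
random.seed(1)
n = 300
x = [random.gauss(0.0, 1.0) for _ in range(n)]
y = [0.0] + [0.9 * x[i] + 0.3 * random.gauss(0.0, 1.0) for i in range(n - 1)]
te_xy = transfer_entropy(x, y)
te_yx = transfer_entropy(y, x)
```

On data of this kind the estimate from x to y should come out clearly larger than the estimate from y to x, which reflects the asymmetry that the method relies on. The double loop costs on the order of N² kernel evaluations per density, so a record of several thousand samples would call for a vectorized or tree-based implementation.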
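Step four: calculate the normalized direct transfer entropy (normalized direct TE, NDTE):
A. Calculate the direct transfer entropy (direct TE, DTE):
As shown in Fig. 2, x causes changes in both z and y; to judge whether x and y have a direct causal relationship, the DTE is defined as:
$$DTE_{x\to y}=\int f\left(y_{i+h},\mathbf{y}_i^{(k)},\mathbf{z}_{i+h-h_3}^{(m_2)},\mathbf{x}_{i+h-h_1}^{(l_1)}\right)\log\frac{f\left(y_{i+h}\mid\mathbf{y}_i^{(k)},\mathbf{z}_{i+h-h_3}^{(m_2)},\mathbf{x}_{i+h-h_1}^{(l_1)}\right)}{f\left(y_{i+h}\mid\mathbf{y}_i^{(k)},\mathbf{z}_{i+h-h_3}^{(m_2)}\right)}\,dv$$
where $v$ denotes the random vector $\left[y_{i+h},\mathbf{y}_i^{(k)},\mathbf{z}_{i+h-h_3}^{(m_2)},\mathbf{x}_{i+h-h_1}^{(l_1)}\right]$; the prediction horizon is $h=\max(h_1,h_3)$; the embedding vector $\mathbf{z}_{i+h-h_3}^{(m_2)}$ holds the past values of z that provide useful information for predicting y at time $i+h$; $\mathbf{x}_{i+h-h_1}^{(l_1)}$ holds the past values of x; if $h=h_1$, then $\mathbf{y}_i^{(k)}=\mathbf{y}_i^{(k_1)}$, and if $h=h_3$, then $\mathbf{y}_i^{(k)}=\mathbf{y}_i^{(k_2)}$;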
B. Calculate the NDTE:
[equation not recoverable from the source text]
if $DTE_{x\to y}=0$, x and y have no direct causal relationship;
if $NDTE_{x\to y}$ exceeds the specified threshold, x and y have a direct causal relationship;
next, it must be judged whether the causal relationship from z to y is genuine;
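The DTE value from z to y is calculated as:
$$DTE_{z\to y}=\int f\left(y_{i+h},\mathbf{y}_i^{(k)},\mathbf{z}_{i+h-h_3}^{(m_2)},\mathbf{x}_{i+h-h_1}^{(l_1)}\right)\log\frac{f\left(y_{i+h}\mid\mathbf{y}_i^{(k)},\mathbf{z}_{i+h-h_3}^{(m_2)},\mathbf{x}_{i+h-h_1}^{(l_1)}\right)}{f\left(y_{i+h}\mid\mathbf{y}_i^{(k)},\mathbf{x}_{i+h-h_1}^{(l_1)}\right)}\,dv$$
where $v$ denotes the random vector $\left[y_{i+h},\mathbf{y}_i^{(k)},\mathbf{z}_{i+h-h_3}^{(m_2)},\mathbf{x}_{i+h-h_1}^{(l_1)}\right]$; the prediction horizon is $h=\max(h_1,h_3)$; the embedding vector $\mathbf{z}_{i+h-h_3}^{(m_2)}$ holds the past values of z that provide useful information for predicting y at time $i+h$; $\mathbf{x}_{i+h-h_1}^{(l_1)}$ holds the past values of x; if $h=h_1$, then $\mathbf{y}_i^{(k)}=\mathbf{y}_i^{(k_1)}$, and if $h=h_3$, then $\mathbf{y}_i^{(k)}=\mathbf{y}_i^{(k_2)}$;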
The NDTE value from z to y is calculated as:
[equation not recoverable from the source text]
if $NDTE_{z\to y}$ exceeds the specified threshold, there is a causal relationship from z to y, as shown in Fig. 3; otherwise, there is no causal relationship from z to y, as shown in Fig. 4;
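A standard identity, stated here as background rather than taken from the source, helps in reading these quantities. A transfer entropy is a conditional mutual information, so it can be written as a difference of two conditional entropies. Writing $\mathbf{y}_i^{(k)}$, $\mathbf{z}_{i+h-h_3}^{(m_2)}$, and $\mathbf{x}_{i+h-h_1}^{(l_1)}$ for the embedding vectors of past values of y, z, and x:

$$\begin{aligned}
T_{x\to y} &= H\left(y_{i+h_1}\mid\mathbf{y}_i^{(k_1)}\right)-H\left(y_{i+h_1}\mid\mathbf{y}_i^{(k_1)},\mathbf{x}_i^{(l_1)}\right),\\
DTE_{x\to y} &= H\left(y_{i+h}\mid\mathbf{y}_i^{(k)},\mathbf{z}_{i+h-h_3}^{(m_2)}\right)-H\left(y_{i+h}\mid\mathbf{y}_i^{(k)},\mathbf{z}_{i+h-h_3}^{(m_2)},\mathbf{x}_{i+h-h_1}^{(l_1)}\right).
\end{aligned}$$

Each quantity is zero exactly when the history of x adds no information about the future of y beyond what the conditioning variables already provide, which is why a zero value is read as the absence of a causal relationship (for TE) or of a direct causal relationship (for DTE).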
Step five: calculate the NDTE for the variables confirmed to have causal relationships, verify the direct causal relationships between pairs of variables, and retain the variables that are directly causally related, so as to establish the direct causality diagram, that is, the final information flow diagram.
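Step five can be sketched as a pruning pass over the NTE-based edges. The edge list, the NDTE values, and the threshold below are invented for illustration. Reading the variables with no incoming edge as root cause candidates is an interpretation assumed here, not a rule stated in the text. The function names are hypothetical.

```python
NDTE_THRESHOLD = 0.02  # hypothetical threshold

# Hypothetical edges that passed the NTE test (x -> z -> y plus x -> y).
nte_edges = [("x", "z"), ("z", "y"), ("x", "y")]
# Hypothetical NDTE values for the edges that have an intermediate variable.
ndte = {("x", "y"): 0.004, ("z", "y"): 0.21}

def direct_edges(edges, ndte_values, threshold):
    """Drop an edge only if its NDTE was computed and does not exceed the threshold."""
    kept = []
    for edge in edges:
        value = ndte_values.get(edge)
        # Edges between adjacent variables (no intermediate variable) have no
        # NDTE entry and are kept as they are.
        if value is None or value > threshold:
            kept.append(edge)
    return kept

def root_candidates(edges):
    """Variables that influence others but have no incoming edge."""
    sources = {a for a, _ in edges}
    targets = {b for _, b in edges}
    return sorted(sources - targets)

direct = direct_edges(nte_edges, ndte, NDTE_THRESHOLD)
roots = root_candidates(direct)
```

In this toy case the edge from x to y is removed as indirect, leaving the chain x, z, y with x as the only candidate root.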
Embodiment two: this embodiment differs from embodiment one in that step two optimizes the parameters using the Ragwitz criterion.
Specific example: the data-driven alarm root cause finding method of this embodiment is applied in a simulation to the causal relationships among the variables of the flue gas desulfurization (FGD) process of an oil company, with the following steps:
Step one: taking the FGD process as an example, the liquid levels of the reaction tank, tank 1, and tank 2 and the flow rates of pumps 2 and 3 are chosen as variables, denoted $y_1$, $y_2$, $y_3$, $y_4$, and $y_5$ respectively; 3544 sets of data are collected; the data are time-stationary and are preprocessed;
Step two: initialize the model parameters and optimize them using the Cao criterion;
Step three: calculate the TE and NTE values between the variables, as listed in Table 1;
Table 1
With 0.02 chosen as the threshold, the information flow paths based on the normalized transfer entropy are as shown in Fig. 5;
Step four: calculate the DTE and NDTE values for part of the FGD process, as listed in Table 2;
Table 2
If an NDTE value is too small, the variables are judged to have no direct causal relationship; the steps for obtaining the information flow diagram from the direct transfer entropy results are shown in Fig. 6;
Step five: the resulting information flow diagram of the FGD process is shown in Fig. 7.
The above description shows and describes preferred embodiments of the invention. It should be understood that the invention is not limited to the forms disclosed herein; these should not be regarded as excluding other embodiments, and the invention may be used in various other combinations, modifications, and environments, and may be modified within the scope of the inventive concept described herein through the above teachings or the skill or knowledge of the related art. Modifications and changes made by those skilled in the art that do not depart from the spirit and scope of the invention shall all fall within the protection scope of the appended claims.