CN111680454A

CN111680454A - Fan blade icing fault prediction method based on double attention mechanism

Info

Publication number: CN111680454A
Application number: CN202010551118.9A
Authority: CN
Inventors: 朱玉婷; 于海阳; 杨震; 郑忠斌; 王朝栋
Original assignee: Beijing University of Technology
Current assignee: Beijing University of Technology
Priority date: 2020-06-16
Filing date: 2020-06-16
Publication date: 2020-09-18

Abstract

The invention relates to a fan blade icing fault prediction method based on a double attention mechanism, which is used for solving the problem that the icing state of a fan is difficult to predict in the prior art by combining a CNN + LSTM network with the double attention mechanism, and specifically comprises the following steps: acquiring an icing original data set of a fan; preprocessing an original data set to obtain a training set and a test set; the CNN network combines an attention mechanism to extract features; further extracting time information and predicting the icing state of the fan by using an LSTM network in combination with an attention mechanism; and predicting the icing of the fan blade by using the trained model. The method of the invention predicts the icing phenomenon of the fan blade by utilizing a large amount of time sequence detection variables acquired by the SCADA system, can predict when the blade will be iced before the blade is iced, takes protective measures in advance to prevent the blade from being iced, and plays a greater role in the safe, reliable and continuous operation of the wind power generation system.

Description

Fan blade icing fault prediction method based on double attention mechanism

Technical Field

The invention relates to a fan blade icing fault prediction method based on a double attention mechanism, and belongs to the field of industrial system fault prediction.

Background

Wind power generation is currently the most mature and potentially renewable energy technology. The development of wind power in China draws attention. However, the particularity of wind energy acquisition determines that a large number of fans need to be arranged in cold areas with high latitude and high altitude. The fan working in cold areas is affected by weather conditions such as frost ice, rime, wet snow and the like, so that the blade icing phenomenon is easy to occur, and a series of consequences are caused. The wind energy capture capacity is reduced, the power generation power is lost, the fan blade is broken, and even safety accidents are caused.

Therefore, the blade icing fault can be predicted and prevented in time, and the method has important significance for prolonging the service life of the wind power equipment and preventing major safety accidents. In actual operation, severe icing can generally be easily detected and automatically deiced by the fan deicing system. However, severe icing, once formed, has an irreversible effect on blade performance. At present, one of the main methods is to use a sensor to acquire blade temperature data to judge whether the blade has an early icing phenomenon or not, and then to process the blade through a deicing system, but the method has two obvious disadvantages: firstly, need additionally to arrange the sensor and lead to the cost to fly to rise, secondly often cause the damage to the blade when detecting the blade and freeze, can extend blade life-span to a certain extent but the effect is limited. In addition, a data-driven method is adopted, machine learning and deep learning algorithm models are applied, the complex nonlinear relation among signals of the industrial equipment is usually extracted by using industrial equipment operation data acquired by an SCADA system, abnormal measurement values are searched for diagnosing the frozen fan blade, the data-based diagnosis method can effectively detect the frozen blade, and the cost of a wind field is greatly reduced. Although the early icing phenomenon of the fan blade can be accurately detected so as to maintain the blade, the icing still influences the physical parameters of the fan blade and causes certain influence on fan power generation equipment. The method of the invention predicts the icing phenomenon of the fan blade by utilizing a large amount of time sequence detection variables collected by the SCADA system and combining a double attention mechanism based on the CNN + LSTM network, can predict when the blade will be iced before the blade is iced, takes protective measures in advance to prevent the blade from being iced, and plays a greater role in the safe, reliable and continuous operation of the wind power generation system.

Disclosure of Invention

The invention aims to provide a fan blade icing fault prediction method based on a double attention mechanism, which has the innovation points that two different attention mechanisms are used for respectively extracting the characteristics of time and channel dimensions in data, and a fault prediction model with higher accuracy is provided for solving the problem that in a common blade icing diagnosis method based on data driving, the detection can be carried out only after blades are iced, the blade icing cannot be prevented in advance, so that the physical properties of the blades and a wind driven generator are influenced. The method includes the steps that firstly, preprocessed data are input into a CNN network, local features are learned through the CNN, an attention layer is added to help the CNN network focus on the importance degree of local information in the data, high-dimensional features output by the CNN network pass through an LSTM network, the dependency relationship among time of sequence data is learned through an LSTM neural network, the importance of the data in a corresponding time period in icing forecasting is measured through the attention layer, and finally whether a fan blade in the next time period is iced or not is forecasted through data information in historical time. The method can adaptively extract fault characteristics and time sequence information from the original data acquired by the SCADA system, can adaptively focus on characteristics of time and channel dimensions in the fan blade data, and can predict the icing state of the fan blade more accurately.

In order to achieve the purpose, the technical scheme adopted by the invention is a fan blade icing fault prediction method based on a double attention mechanism: as shown in fig. 1, the method is implemented as follows:

step (1): acquiring an icing original data set of a fan

The original data set is from a fan icing data set of 'the first China industry big data competition', the data is acquired from an industrial SCADA system, the total length is 2 months, the data comprises about 58 ten thousand pieces of data, each piece of data comprises 28 dimensions, the characteristic dimensions comprise wind speed, generator rotating speed, net side active power, wind direction angle, each blade angle, variable pitch motor temperature and the like, and the data are subjected to standardization processing. The time in the data set is in minutes as the minimum time unit, and each minute contains about 6-10 measurement sets.

Step (2): preprocessing a data set

Dividing original data into normal data, namely positive samples, labels are normal, fault data, namely negative samples, labels are frozen, and invalid data, namely label-free data, according to the time periods of freezing and non-freezing in the data; when constructing the training set, the non-labeled invalid data is first deleted. Taking several groups of measured values in the same unit time by taking each minute as a unit time, expressing the characteristics by using a plurality of statistics, and performing characteristic expansion by using m statistical representations for each characteristic quantity. Obtaining data of (m, n) -dimension for each group of unit time intervals, i.e.

M represents the number of statistical representations, and n represents the number of features. The data after pretreatment are divided into a training set, a verification set and a test set according to the proportion of 0.7: 0.1: 0.2.

And (3): feature extraction is performed using CNN, convolution layers are convolved with training set data through convolution kernels and generate features, and then pooling layers extract the most important local features. And then the importance of different features in the extracted features is concerned through an attention layer. Here, Location-based Attention is used, focusing on the concealment vector itself.

Input data is divided into x ═ x per unit time₁，x₂，......，x_N]And N represents the sequence length. The convolution kernel of the convolution operation is defined as w. Feature map of convolutional layer when inputting data x

Can be expressed as:

wherein, b represents an offset amount,

it is shown that the activation function is,

can be seen as the convolution kernel w at the corresponding input x_iThe feature vector of the upper learning, l, indicates the l-th network. The ReLU is used as the activation function, the advantages that the output of a part of neurons is 0, the sparsity of the network is improved, the interdependence relation of parameters is reduced, and overfitting is relieved. The ReLU formula is expressed as follows:

the pooling layer is used to perform downsampling, which in the present method is in the form of max pooling:

down (-) represents the downsampling function of the pooling layer,

an output feature vector representing the pooling layer,

is the feature vector of the previous convolution layer and s is the pool size.

Attention in CNN focuses on the feature vector itself, and the designed attention model is as follows:

tanh (-) represents the activation function,

represents the weight of the attention layer, p_iRepresenting input data x_iHigh-dimensional representation after feature extraction by convolution pooling, b_attIndicating the offset α_iThe attention weight is expressed and can be considered as the importance of the corresponding feature vector in all features, and c represents the feature dimension. The attentions are then distributed with a fully connected layer.

v_i＝∑_i＝1α_ip_i(6)

Where v is_iRepresents x after the original data is subjected to feature extraction by using CNN + attribute_iIs represented in high dimension.

And (4): and (4) continuing the network construction in the step (3) by using the LSTM, wherein the structure of the LSTM is mainly divided into three main stages: forgetting of information, long-term status update, and short-term status update. In conjunction with localization-based Attention, context-related information is captured to help predict the future.

Firstly, an information forgetting stage: forget door f_tReading the previous short-term state h_t-1And the current state x_t(where v is represented by the high dimension obtained in step 3_t) And goes one unit c upwards through sigmoid function_t-1Outputting a value between 0 and 1.

f_t＝σ(w_f·h_t-1+w_f·v_t+b_f) (7)

Wherein (w)_f，b_f) Is the weight vector and offset value of the forgetting gate, σ is the sigmoid function, v_tRepresenting the raw data x at time t obtained by step 3_iHigh dimensional feature v of (i ═ t)_i(i＝t)。

Long-term state (Long-term state) update phase: it is determined which new information is stored in a long-term cell state. Input door i_tDetermining updatesValue of (1), input unit g_tInputting the current state:

i_t＝σ(w_i·h_t-1+w_i·v_t+b_i) (8)

g_t＝tanh(w_g·h_t-1+w_g·v_t+b_g) (9)

a new cell state c can then be obtained_t.

Short-term state (Short-term state) update phase: through an output gate o_tPerforming filter on the long-term state to obtain a short-term state h_t。

o_t＝σ(w_o·h_t-i+w_o·v_t+b_o) (11)

In the failure prediction task, the final goal is to be according to x₁To x_tFan state prediction x_t+1State of (i) y_t. An attention mechanism is utilized to capture context-related information to help predict the future. First, the current state h is calculated_tWeights with all previous states, wherein the current state h_tAnd the previous ith state h_iWeight α of_tiThe specific calculation is as follows:

wherein, theta_αIs the parameter to be learned, w_αIs a weight matrix.

And then α as described above_tiGet attention vector α_tThe specific calculation formula is as follows:

α_i＝Soft max([α_t1，α_t2，...，α_ti]) (14)

finally α according to the attention vector_tAnd hidden state h₁To h_t-1Calculating a vector v containing the context attention information_t：

A prediction stage: binding context vector v_tAnd a current hidden state h_tInformation of two vectors, generating one vector

This vector contains both historical information passed over the LSTM network and information on the importance of the data for the corresponding time period in predicting icing.

w_cIs a weight matrix. Attention vector

The fan state (icing or not) for the t +1 time period is generated by softmax layer input, and is defined as:

w_sis a weight matrix, b_sIs the offset.

An objective function: calculating the true state y_tAnd predicted state

Binary cross entropy between.

The network continuously modifies and updates the parameter values and the connection weights of neurons in each layer through forward propagation and backward propagation, and the verification set is used for determining the optimal values of the parameters to minimize the error value until an iteration stop condition is met, so that a trained model is obtained.

And (3) testing the model by using the test set subjected to pretreatment in the step (2), and selecting Accuracy (Accuracy) as an evaluation criterion to verify the prediction Accuracy of the model.

And (5) predicting the fault by using the trained model.

And acquiring a data set to be predicted from the SCADA system, inputting the data set to be predicted into a model, and predicting whether the fan is frozen in a future time period according to historical fan state information.

The invention has the beneficial effects that:

the method utilizes a large amount of detection variables collected by the SCADA system to predict the icing of the fan blade by combining a convolutional neural network and a long-short term memory network with a double attention mechanism, and predicts when the icing occurs before the blade is iced, so that protection measures are taken in advance to prevent the blade from icing, and a greater effect is achieved on the safe, reliable and continuous operation of a wind power generation system. The method includes the steps that firstly, preprocessed data are input into a CNN network, local features are learned through the CNN, an attention layer is added to help the CNN network focus on the importance degree of local information in the data, high-dimensional features output by the CNN network pass through an LSTM network, the dependency relationship among time of sequence data is learned through an LSTM neural network, the importance of the data in a corresponding time period in icing forecasting is measured through the attention layer, and finally whether a fan blade in the next time period is iced or not is forecasted through data information in historical time. The method can adaptively extract fault characteristics and time sequence information from the original data acquired by the SCADA system, can adaptively focus on characteristics of time and channel dimensions in the fan blade data, and can predict the icing state of the fan blade more accurately.

Drawings

Fig. 1 is an overall flowchart.

Fig. 2 is a flow chart of the training process.

FIG. 3 is a view showing a model structure

Detailed Description

The method adopts a fan blade icing prediction method based on a double-attention machine system, and the implementation process of the method is as follows:

step (1): acquiring an icing original data set of a fan

Step (2): preprocessing a data set

Dividing original data into normal data, namely positive samples, labels are normal, fault data, namely negative samples, labels are frozen, and invalid data, namely label-free data, according to the time periods of freezing and non-freezing in the data; when constructing the training set, the non-labeled invalid data is first deleted. Taking several groups of measured values in the same unit time as a unit time every minute, replacing original measured data with a Mean value (Mean), a Variance (Sample Variance), a Standard Deviation (Standard development), a Coefficient of Variation (CV), a Standard error (Standard error), a Skewness (Skewness), a Kurtosis (Kurtosis), a median (Mean) and a Quartile (Quartile) for the several groups of measured values in each unit time, and performing feature expansion by using m statistical representations for each feature quantity. Obtaining data of (m, n) -dimension for each group of unit time intervals, i.e.

m represents the number of statistical representations (here m is 9), and n represents the number of features. The data after pretreatment are divided into a training set, a verification set and a test set according to the proportion of 0.7: 0.1: 0.2.

Can be expressed as:

wherein, b represents an offset amount,

it is shown that the activation function is,

down (-) represents the downsampling function of the pooling layer,

an output feature vector representing the pooling layer,

is the feature vector of the previous convolution layer and s is the pool size.

tanh (-) represents the activation function,

v_i＝∑_i＝1α_ip_i(6)

Firstly, an information forgetting stage: forget door f_tReading the previous short-term state h_t-1And the current state x_t(obtained here by step 3)High-dimensional representation of v_t) And goes one unit c upwards through sigmoid function_t-1Outputting a value between 0 and 1.

f_t＝σ(w_f·h_t-1+w_f·v_t+b_f) (7)

Long-term state (Long-term state) update phase: it is determined which new information is stored in a long-term cell state. Input door i_tDetermining updated values, input unit g_tInputting the current state:

i_t＝σ(w_i·h_t-1+w_i·v_t+b_i) (8)

g_t＝tanh(w_g·h_t-1+w_g·v_t+b_g) (9)

a new cell state c can then be obtained_t.

o_t＝σ(w_o·h_t-1+w_o·v_t+b_o) (11)

In the failure prediction task, the final goal is to be according to x₁To x_tFan state prediction x_t+1State of (i) y_t. An attention mechanism is utilized to capture context-related information to help predict the future. In the task of the failure prediction,the final goal is according to x₁To x_tFan state prediction x_t+1State of (i) y_t. An attention mechanism is utilized to capture context-related information to help predict the future. First, the current state h is calculated_tWeights with all previous states, wherein the current state h_tAnd the previous ith state h_iWeight α of_tiThe specific calculation is as follows:

wherein, theta_αIs the parameter to be learned, w_αIs a weight matrix.

α_i＝Soft max([α_t1，α_t2，...，α_ti]) (14)

w_cIs a weight matrix, attention vector

w_sis a weight matrix, b_sIs the offset.

An objective function: calculating the true state y_tAnd predicted state

Binary cross entropy between.

The technical scheme of the invention is described in the aspect of network structure as follows: the input of the convolution network is a matrix of 26x9, and the convolution kernel is set to be 2x9 through convolution → pooling → convolution → pooling, and then the LSTM network is set to have an LSTM hidden layer characteristic number of 128 and an output layer of 2. The network continuously modifies and updates the parameter values and the connection weights of the neurons in each layer through forward propagation and backward propagation. And the verification set is used for determining the optimal value of the parameter, the iteration times are set to be 100 times, the learning rate is set to be 0.001, and the error value is minimized until the iteration stop condition is met to obtain the trained model.

And (5) detecting early icing faults of the blades by using the trained model.

The data set to be predicted is collected from an SCADA system of an actual wind force field, wherein the data set to be predicted comprises a timestamp, a wind speed, a generator rotating speed, net side active power (kw), a wind angle (°), a 25-second average wind direction angle, a yaw position, a yaw speed, a blade 1 angle, a blade 2 angle, a blade 3 angle, a blade 1 speed, a blade 2 speed, a blade 3 speed, a pitch motor 1 temperature, a pitch motor 2 temperature, a pitch motor 3 temperature, an x-direction acceleration, a y-direction acceleration, an environment temperature, a cabin temperature, an ng 51 temperature, an ng 52 temperature, an ng 53 temperature, an ng 51 charger direct current, an ng 52 charger direct current and an ng 53 charger direct current.

And inputting the data set to be predicted into the model, and predicting whether the fan is frozen in the future time period according to historical fan state information.

Claims

1. The method for predicting the icing fault of the fan blade based on the double attention mechanism is characterized by comprising the following steps of:

step (1): acquiring an icing original data set of a fan;

step (2): preprocessing an original data set to obtain a training set and a test set;

and (3): performing local feature extraction on the preprocessed data by using the CNN; then, focusing attention on the importance of different features in the extracted local features through an attention layer to obtain a local feature vector containing attention weight;

and (4): inputting the feature vector containing attention weight obtained in the step (3), and extracting the feature h of the time dimension through LSTM_tCapturing context related information through an attention mechanism, paying attention to feature importance of different time, and finally forming a prediction model;

and (5): and predicting the icing fault of the fan blade.

2. The dual attention mechanism-based wind turbine blade icing fault prediction method of claim 1, wherein: the freezing original data of the fan comprises but is not limited to wind speed, rotating speed of a generator, net side active power, wind direction angle, angle of each blade and temperature of a variable pitch motor.

3. The dual attention mechanism-based wind turbine blade icing fault prediction method of claim 1, wherein: the pretreatment comprises the following steps: dividing original data into normal data, namely positive samples, labels are normal, fault data, namely negative samples, labels are frozen, and invalid data, namely label-free data, according to the time periods of freezing and non-freezing in the data; when constructing the training set, the non-labeled invalid data is first deleted. Taking several groups of measured values in the same unit time by taking each minute as a unit time, expressing the characteristics by using a plurality of statistics, and performing characteristic expansion by using m statistical representations for each characteristic quantity. Obtaining data of (m, n) -dimension for each group of unit time intervals, i.e.

m represents the number of statistical representations and n represents the number of features. And dividing the data after preprocessing into a training set, a verification set and a test set.

4. The dual attention mechanism-based wind turbine blade icing fault prediction method of claim 1, wherein: the attention model in the step (3) is as follows:

tanh (-) represents the activation function,

represents the weight of the attention layer, p_iRepresenting input data x_iHigh-dimensional representation after feature extraction by CNN, b_attDenotes an offset amount, α_iExpressing attention weight, expressing the importance degree of the corresponding feature vector in all features, and c expressing feature dimension;

and then the attentions are distributed by using a full connection layer,

v_i＝∑_i＝1α_ip_i(6)

5. The method for predicting the icing fault of the fan blade based on the dual attention mechanism according to claim 1, wherein the attention mechanism in the step 4 is specifically as follows:

first, the current state h is calculated_tWeights with all previous states, wherein the current state h_tAnd the previous ith state h_iWeight α of_tiThe specific calculation is as follows:

wherein, theta_αIs the parameter to be learned, w_αIs a weight matrix.

α_i＝Softmax([α_t1，α_t2，...，α_ti]) (14)

6. The method for predicting the icing fault of the fan blade based on the dual attention mechanism according to claim 1, wherein the prediction model in the step 4 is specifically as follows:

binding context vector v_tAnd a current hidden state h_tInformation of two vectors, generating one vector

This vector contains both historical information passed over the LSTM network, and information on the importance of the data for the corresponding time period in predicting icing,

w_cis a weight matrix, vector

The input through softmax layer, the state of the fan generating t +1 time period, i.e. whether the fan is frozen or not, is defined as:

w_sis a weight matrix, b_sIs the offset.

7. The dual attention mechanism-based wind turbine blade icing fault prediction method according to claim 1, wherein the objective function of the prediction model in step 4 is:

calculating the true state y_tAnd predicted state

The binary cross-entropy between the two is,