WO2023159336A1 - Deep autoregressive network based prediction method for stalling and surging of axial-flow compressor - Google Patents

Deep autoregressive network based prediction method for stalling and surging of axial-flow compressor

Info

Publication number
WO2023159336A1
Authority
WO
WIPO (PCT)
Prior art keywords
model
data
prediction
layer
deepar
Prior art date
Application number
PCT/CN2022/077168
Other languages
French (fr)
Chinese (zh)
Inventor
李英顺
弓子勤
孙希明
全福祥
Original Assignee
Dalian University of Technology (大连理工大学)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dalian University of Technology (大连理工大学)
Priority to PCT/CN2022/077168 priority Critical patent/WO2023159336A1/en
Publication of WO2023159336A1 publication Critical patent/WO2023159336A1/en

Classifications

    • F: MECHANICAL ENGINEERING; LIGHTING; HEATING; WEAPONS; BLASTING
    • F04: POSITIVE-DISPLACEMENT MACHINES FOR LIQUIDS; PUMPS FOR LIQUIDS OR ELASTIC FLUIDS
    • F04D: NON-POSITIVE-DISPLACEMENT PUMPS
    • F04D27/00: Control, e.g. regulation, of pumps, pumping installations or pumping systems specially adapted for elastic fluids
    • F04D27/001: Testing thereof; Determination or simulation of flow characteristics; Stall or surge detection, e.g. condition monitoring
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/047: Probabilistic or stochastic networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00: Administration; Management
    • G06Q10/04: Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T90/00: Enabling technologies or technologies with a potential or indirect contribution to GHG emissions mitigation


Abstract

The present invention relates to the technical field of aeroengine modeling and simulation, and provides a deep autoregressive network based prediction method for stall and surge of an axial-flow compressor. The method comprises: first selecting and preprocessing surge experiment data of a certain type of aeroengine and dividing the data into a training set and a test set; then building and training a deep autoregressive network model based on an attention mechanism, evaluating the trained model and reporting the model loss and evaluation indexes; and finally performing real-time prediction on the test data with the trained prediction model and giving, in chronological order, the trend of the surge probability over time. The attention mechanism is used to effectively capture the characteristics of the experimental data and accurately predict the surge probability, so prediction stability and accuracy are improved, the active control performance of the engine can be enhanced, and a certain degree of generality is achieved.

Description

A Deep Autoregressive Network Based Axial-Flow Compressor Stall and Surge Prediction Method

Technical Field

The invention relates to a method for predicting the stall and surge probability of an axial-flow compressor using a deep autoregressive network based on an attention mechanism, and belongs to the technical field of aeroengine modeling and simulation.

Background Art

Aeroengines are the "jewel in the crown" of human industry and reflect the highest level of a country's technology. The compressor is a key component of a high-performance aeroengine: it raises the air pressure through the high-speed rotation of its blades and, while providing a high pressure ratio, also limits the stable operating range of the engine, so it plays a vital role in the stability and safety of the aeroengine. Surge and rotating stall are two important manifestations of unstable gas flow in the compressor.

The main feature of compressor surge is interruption of the airflow: the flow oscillates along the compressor axis at low frequency (several hertz to more than ten hertz) and high amplitude, and in severe cases flow blockage or even reverse flow occurs. Once surge occurs, it causes very serious damage to the aeroengine. Rotating stall is an unstable flow phenomenon that significantly degrades aeroengine performance. A large number of studies have shown that rotating stall is the precursor of surge and that surge is the consequence of the extreme development of rotating stall; fast and accurate prediction of rotating stall has therefore become an urgent problem in the aeroengine field.

At present, there are two kinds of methods, at home and abroad, for detecting and identifying compressor rotating stall faults. The first builds a model and actively controls the compressor, suppressing the disturbance from developing further when a surge precursor appears so that the compressor does not enter a surge state. The second studies surge prediction algorithms based on the time-domain or frequency-domain characteristics of the compressor pressure signal. The traditional algorithms based on the time-domain characteristics of the pressure signal mainly include the short-time energy method, the autocorrelation function method, the analysis-of-variance method, the rate-of-change method, the pressure-difference method and the statistical-feature method; the traditional surge detection algorithms based on the frequency-domain characteristics of the pressure signal mainly include spectrum analysis, wavelet analysis and the frequency-domain amplitude method.
Summary of the Invention

Aiming at the problems of low accuracy and poor reliability in the prior art, the present invention provides a method for predicting the stall and surge probability of an axial-flow compressor based on a deep autoregressive network with temporal pattern attention (TPA-DeepAR, Temporal Pattern Attention Deep Autoregressive Recurrent Network).

In order to achieve the above object, the technical scheme adopted by the present invention is as follows:

A deep autoregressive network based method for predicting stall and surge of an axial-flow compressor, specifically a method based on a deep autoregressive network with an attention mechanism, comprising the following steps:

S1. Preprocess the aeroengine surge data, comprising the following steps:

S1.1 Acquire surge experiment data of a certain type of aeroengine and remove the invalid data caused by sensor failures;

S1.2 Apply down-sampling and then filtering to the remaining valid data;

S1.3 Normalize and smooth the filtered data;

S1.4 To ensure the objectivity of the test results, divide the experimental data into a test data set and a training data set;

S1.5 Segment the training data set with a time window, the data points covered by each window forming one sample, and divide the training data set into a training set and a validation set at a ratio of 4:1;

S2. Construct the attention-based deep autoregressive network model, i.e. the TPA-DeepAR model, comprising the following steps:

S2.1 Reshape each sample to dimension (w, 1) as the input of the TPA-DeepAR model, where w is the time-window length;

S2.2 Build an embedding layer that converts the input sample from dimension (w, 1) to (w, m), where m is a specified dimension, spreading the sample features from one dimension over m dimensions;

S2.3 Build an LSTM layer that takes the output of the embedding layer as input and outputs w hidden vectors {h_{t-w+1}, h_{t-w+2}, ..., h_t}, each of dimension m.
S2.4 Build the attention layer. The w hidden vectors {h_{t-w+1}, h_{t-w+2}, ..., h_t} output by the LSTM layer are the input of the attention layer, which weights the relevant dimensions and finally outputs a single hidden vector h'_t;

S2.5 Build the Gaussian layer, which consists of two fully connected layers. The hidden vector h'_t output by the attention layer is the input of the Gaussian layer, and the two fully connected layers output the parameter μ and the parameter σ respectively, so the output of the Gaussian layer determines a Gaussian distribution; in this way the model fits a Gaussian distribution;

S2.6 Draw repeated random samples from the fitted Gaussian distribution to obtain data for the prediction point, and derive different quantiles of the prediction point from these samples to realize probabilistic prediction (a minimal sampling sketch is given below);
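As a purely illustrative sketch of this sampling step (assuming the Gaussian layer has already produced μ and σ for one prediction point; the numeric values, the quantile levels, the sample count and the clipping to [0, 1] are assumptions, not taken from the patent), the quantiles can be obtained as follows:

```python
# Minimal sketch of step S2.6: Monte Carlo sampling from the fitted Gaussian
# N(mu, sigma^2) and reading off quantiles of the prediction point.
import numpy as np

def predict_quantiles(mu, sigma, levels=(0.1, 0.5, 0.9), n_samples=200, clip=True):
    """Draw n_samples from N(mu, sigma^2) and return the requested quantiles."""
    rng = np.random.default_rng(0)
    samples = rng.normal(loc=mu, scale=sigma, size=n_samples)
    if clip:                      # a probability-like output can be clipped to [0, 1]
        samples = np.clip(samples, 0.0, 1.0)
    return {q: float(np.quantile(samples, q)) for q in levels}

quantiles = predict_quantiles(mu=0.62, sigma=0.08)   # mu, sigma: illustrative values
surge_probability = quantiles[0.5]                   # the 0.5 quantile is used as the surge probability
print(quantiles, surge_probability)
```

The 0.5 quantile of the drawn samples is the value that the embodiment below uses as the output surge probability.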
S3. Construct the attention layer mentioned in S2:

S3.1 The input of the attention layer is the output of the LSTM layer, {h_{t-w+1}, h_{t-w+2}, ..., h_t}, with dimension (w, m). Except for the last hidden vector h_t, the other w-1 hidden vectors form the hidden-state matrix H = {h_{t-w+1}, h_{t-w+2}, ..., h_{t-1}};

S3.2 Use k convolution kernels to capture the signal patterns of H and obtain the matrix H^C, enhancing the model's ability to learn features.

S3.3 Compute the similarity between the hidden vector h_t and the matrix H^C with a scoring function to obtain the attention weights α_i, and use the weights α_i to form a weighted sum of the rows of H^C, obtaining the vector v_t;

S3.4 Finally, concatenate h_t and v_t and feed them into a fully connected layer to obtain a new hidden vector h'_t as output;
S4. Loss function and evaluation indexes of the TPA-DeepAR model:

S4.1 During forward propagation the TPA-DeepAR model outputs the parameters μ and σ of the predicted Gaussian distribution. A traditional regression loss function cannot handle the relationship among μ, σ and y_true (the true label of the sample), so the following loss function is adopted:

Assuming the sample obeys the Gaussian distribution y_true ~ N(μ, σ²), its likelihood function is:

L(\mu,\sigma^2) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(y_{\mathrm{true},i}-\mu)^2}{2\sigma^2}\right)

Its log-likelihood function is:

\ln L(\mu,\sigma^2) = -\frac{n}{2}\ln(2\pi) - \frac{n}{2}\ln\sigma^2 - \frac{1}{2\sigma^2}\sum_{i=1}^{n}\left(y_{\mathrm{true},i}-\mu\right)^2

where n is the number of samples, y_true is known and denotes the true label of the sample, and μ and σ are the parameters of the Gaussian distribution predicted by the model; the likelihood function describes, for the distribution formed by the parameters μ and σ, how probable the sample point y_true is.

The network parameters are therefore learned by maximizing the log-likelihood function, i.e. the distribution formed by the parameters μ and σ should make the sample point y_true as probable as possible, and the loss function for model training is accordingly taken as -ln L(μ, σ²).
S4.2 Based on this loss function, update the weights of the TPA-DeepAR model on the training set obtained in step S1, finally generating a preliminary prediction model.

S4.3 Test the preliminary prediction model on the validation set obtained in step S1 and compute the F2 evaluation index; adjust the parameters of the TPA-DeepAR model according to the F2 index, the confusion matrix and the ROC curve to obtain better performance, and save the TPA-DeepAR prediction model that performs best on the evaluation indexes;

The F2 index is:

F_2 = \frac{(1+2^2)\,P\,R}{2^2\,P+R} = \frac{5PR}{4P+R}

where P is the precision, i.e. the proportion of the samples classified as positive that are actually positive:

P = \frac{TP}{TP+FP}

where TP is the number of true positives and FP is the number of false positives, and R is the recall, i.e. the proportion of all actually positive samples that are correctly judged to be positive:

R = \frac{TP}{TP+FN}

where FN is the number of false negatives.

Presenting the four quantities TP, FP, TN and FN together in a 2×2 table gives the confusion matrix; the first to fourth quadrants of the table are TP, FP, FN and TN respectively, where TN is the number of true negatives.

After the confusion matrix is obtained, the larger the counts of correct predictions (TP and TN) the better and, conversely, the smaller the counts of incorrect predictions (FP and FN) the better.

Among all samples that are actually negative, the proportion incorrectly judged to be positive is the false positive rate FPR: FPR = FP/(FP+TN). Taking FPR as the horizontal axis and R as the vertical axis gives the ROC curve. The closer the ROC curve is to the upper-left corner, the higher the recall of the TPA-DeepAR model, the smaller the total number of false positives and false negatives, and the better the prediction performance.
S5. Use the final TPA-DeepAR prediction model to perform real-time prediction on the test set:

S5.1 Preprocess the test-set data following the preprocessing steps, adjust its dimensions, and feed it into the trained TPA-DeepAR model for testing;

S5.2 In chronological order, use the TPA-DeepAR prediction model to give the surge prediction probability of each test-set sample, obtaining the real-time surge probability of the test-set samples.

The beneficial effects of the present invention are:

The prediction method provided by the present invention learns the temporal-correlation characteristics of the compressor dynamic-pressure experimental data, captures the small stall-precursor signals, calculates and outputs the predicted surge probability, and gives a timely warning signal of whether surge is occurring. Compared with traditional methods, the prediction method uses the attention mechanism to select relevant dimensions for attention weighting, which effectively captures the characteristics of the experimental data, yields an accurate prediction of the surge probability, and improves prediction stability and accuracy. At the same time, the method outputs multiple quantiles of the predicted probability, which makes it convenient for the system to issue warnings at different quantile levels. The method can judge from the surge probability output in real time whether surge is occurring and feed this back to the engine control system in time, so that the engine operating state can be adjusted and time can be gained for active compressor control.
Brief Description of the Drawings

Figure 1 is a flow chart of the axial-flow compressor stall and surge prediction method based on the attention-mechanism deep autoregressive network;

Figure 2 is a flow chart of the data preprocessing;

Figure 3 is a structural diagram of the TPA-DeepAR model;

Figure 4 is a structural diagram of the attention mechanism;

Figure 5 shows the prediction results of the TPA-DeepAR model on the test data, where (a) is the dynamic pressure p2 at the tip of the second-stage stator as a function of time, (b) is the surge prediction probability given by the TPA-DeepAR model as a function of time, and (c) is the early-warning signal given by the TPA-DeepAR model;

Detailed Description of the Embodiments
The present invention is further described below in conjunction with the accompanying drawings. The invention is based on the surge experiment data of a certain type of aeroengine; the flow of the attention-based deep autoregressive network method for predicting stall and surge of an axial-flow compressor is shown in Figure 1.

Figure 2 is the flow chart of the data preprocessing, whose steps are as follows:

S1. Preprocess the aeroengine surge data.

S1.1 Acquire surge experiment data of a certain type of aeroengine and remove the invalid data caused by sensor failures. There are 16 groups of experimental data; each group contains the dynamic pressure values measured at 10 measurement points over 10 s, from normal operation to surge, at a sensor sampling frequency of 6 kHz. The 10 measurement points are located at the inlet-guide-vane stator tip, the zero-stage stator tip, the first-stage stator tip (three circumferential positions), the second-stage stator tip, the third-stage stator tip, the fourth-stage stator tip, the fifth-stage stator tip, and the outlet wall;

S1.2 Apply down-sampling and then filtering to the remaining valid data;

S1.3 Normalize and smooth the filtered data;

S1.4 To ensure the objectivity of the test results, divide the experimental data into a test data set and a training data set;

S1.5 Segment the training data set with a time window, the data points covered by each window forming one sample, and divide the training data set into a training set and a validation set at a ratio of 4:1 (a preprocessing sketch is given below);
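A minimal preprocessing sketch of step S1 is shown below. It assumes plain NumPy; the stride-based down-sampler, the moving-average filter and smoother, the window length w = 64 and the placeholder random signal are illustrative assumptions, while the sliding time window and the 4:1 training/validation split follow the text.

```python
# Illustrative preprocessing pipeline for step S1 (a sketch under stated
# assumptions, not the patented implementation).
import numpy as np

def preprocess(pressure, downsample_factor=6, filter_width=5, w=64, stride=1):
    x = pressure[::downsample_factor]                 # S1.2: down-sampling
    kernel = np.ones(filter_width) / filter_width
    x = np.convolve(x, kernel, mode="same")           # S1.2: filtering (moving average)
    x = (x - x.mean()) / (x.std() + 1e-8)             # S1.3: normalization
    x = np.convolve(x, kernel, mode="same")           # S1.3: smoothing
    # S1.5: time-window segmentation, each window of w points forms one sample
    return np.stack([x[i:i + w] for i in range(0, len(x) - w + 1, stride)])

signal = np.random.randn(60_000)        # placeholder for one 10 s pressure record at 6 kHz
samples = preprocess(signal)
split = int(0.8 * len(samples))         # S1.5: 4:1 split into training and validation sets
train_set, val_set = samples[:split], samples[split:]
print(train_set.shape, val_set.shape)
```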
Figure 3 is the structural diagram of the TPA-DeepAR model.

S2. The steps to construct the TPA-DeepAR model are as follows:

S2.1 Reshape each sample to dimension (w, 1) as the input of the TPA-DeepAR model, where w is the time-window length;

S2.2 Build an embedding layer that converts the input sample from dimension (w, 1) to (w, m), where m is a specified dimension, spreading the sample features from one dimension over m dimensions;

S2.3 Build an LSTM layer that takes the output of the embedding layer as input and outputs w hidden vectors {h_{t-w+1}, h_{t-w+2}, ..., h_t}, each of dimension m;

S2.4 After the hidden vector h_t of the last time step is output, add the attention layer. The w hidden vectors {h_{t-w+1}, h_{t-w+2}, ..., h_t} output by the LSTM layer are the input of the attention layer, which applies attention over the m dimensions of these hidden vectors and weights the relevant dimensions so as to better capture their characteristics, finally outputting a new hidden vector h'_t;

S2.5 Build the Gaussian layer, which consists of two fully connected layers. The hidden vector h'_t is the input of the Gaussian layer; the two fully connected layers output the parameter μ and the parameter σ respectively, and the output of the Gaussian layer determines a Gaussian distribution, so that the model fits a Gaussian distribution;

S2.6 Draw repeated random samples from the fitted Gaussian distribution to obtain data for the prediction point, from which different quantiles of the prediction point can be obtained to realize probabilistic prediction; the present invention uses the 0.5 quantile of the prediction point as the output surge probability (a model sketch is given below);
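The following PyTorch sketch illustrates the skeleton of steps S2.1 to S2.6 under stated assumptions: the layer sizes, the Softplus used to keep σ positive and the batch of random windows are illustrative choices rather than values from the patent, and the attention layer is replaced here by a trivial stub that simply returns h_t; a temporal-pattern-attention sketch that can be plugged in instead is given after step S3.4.

```python
# Illustrative TPA-DeepAR skeleton for step S2 (embedding -> LSTM -> attention ->
# Gaussian layer); a sketch, not the patented implementation.
import torch
import torch.nn as nn

class LastStepAttention(nn.Module):
    """Placeholder for the attention layer of S2.4: simply returns h_t."""
    def forward(self, hidden_seq):            # hidden_seq: (batch, w, m)
        return hidden_seq[:, -1, :]

class TPADeepAR(nn.Module):
    def __init__(self, m=32, attention=None):
        super().__init__()
        self.embedding = nn.Linear(1, m)                     # S2.2: (w, 1) -> (w, m)
        self.lstm = nn.LSTM(m, m, batch_first=True)          # S2.3: w hidden vectors of size m
        self.attention = attention or LastStepAttention()    # S2.4
        self.mu_head = nn.Linear(m, 1)                       # S2.5: fully connected layer for mu
        self.sigma_head = nn.Linear(m, 1)                    # S2.5: fully connected layer for sigma
        self.softplus = nn.Softplus()                        # keeps sigma positive

    def forward(self, x):                     # x: (batch, w, 1)
        h = self.embedding(x)                 # (batch, w, m)
        h, _ = self.lstm(h)                   # (batch, w, m)
        h_prime = self.attention(h)           # (batch, m)
        mu = self.mu_head(h_prime).squeeze(-1)
        sigma = self.softplus(self.sigma_head(h_prime)).squeeze(-1)
        return mu, sigma                      # parameters of the predicted Gaussian (S2.5)

# S2.6: sample the fitted Gaussian and take quantiles (0.5 quantile = surge probability)
model = TPADeepAR()
x = torch.randn(8, 64, 1)                     # 8 windows of length w = 64 (illustrative)
mu, sigma = model(x)
samples = torch.distributions.Normal(mu, sigma).sample((200,))   # (200, batch)
print(mu.shape, sigma.shape, samples.quantile(0.5, dim=0).shape)
```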
Figure 4 is the structural diagram of the attention layer.

S3. The steps to construct the attention layer are as follows:

S3.1 After the original sequence has been processed by the embedding layer and the LSTM layer, the hidden vectors {h_{t-w+1}, h_{t-w+2}, ..., h_t} of the sample at each time step are obtained, each of dimension m. Except for the last hidden vector h_t, the other w-1 hidden vectors form the hidden-state matrix H = {h_{t-w+1}, h_{t-w+2}, ..., h_{t-1}};

A row vector of the hidden-state matrix represents the state of a single dimension over all time steps, i.e. the vector formed by all time steps of the same dimension.

A column vector of the hidden-state matrix represents the state of a single time step, i.e. the vector formed by all dimensions at the same time step.
S3.2 Use convolution to capture the variable signal patterns and form the matrix H^C:

H^C_{i,j} = \sum_{l=1}^{T} H_{i,l}\, C_{j,l}

The convolution is configured with k kernels; w is the time-window length and each kernel has size 1×T, where T denotes the range covered by the attention and is set to T = w-1. The kernels are convolved along the row vectors of the hidden-state matrix H to extract the temporal-pattern matrix of each variable within the kernel range, and H^C_{i,j} denotes the result of the i-th row vector of H acted on by the j-th convolution kernel C_j.

S3.3 The similarity between the hidden vector h_t and the matrix H^C is computed with a scoring function to obtain the attention weights α_i. The scoring function is chosen as:

f(H^C_i, h_t) = (H^C_i)^{\top} W_a\, h_t

where H^C_i is the i-th row of H^C and W_a is a weight matrix.

Sigmoid normalization is then applied to obtain the attention weights α_i, which makes it easy to select several dimensions at once:

\alpha_i = \mathrm{sigmoid}\left(f(H^C_i, h_t)\right)

The attention weights α_i are used to form a weighted sum of the rows of H^C, giving the vector v_t:

v_t = \sum_{i=1}^{m} \alpha_i\, H^C_i

S3.4 Finally, h_t and v_t are concatenated and fed into a fully connected layer to obtain a new hidden vector h'_t as the output (a sketch of this attention layer is given below):

h'_t = W_h h_t + W_v v_t

where W_h and W_v are weight matrices.
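A compact PyTorch sketch of this attention layer is given below, under the same assumptions as the model sketch above (batch-first tensors and illustrative sizes m = 32, w = 64, k = 8). Because each kernel of S3.2 has length T = w-1 and therefore covers a whole row of H, the convolution is written as a single matrix product. The module can be passed to the TPADeepAR sketch from step S2 in place of the stub attention.

```python
# Illustrative temporal pattern attention layer for step S3 (a sketch under
# stated assumptions, not the patented implementation).
import torch
import torch.nn as nn

class TemporalPatternAttention(nn.Module):
    def __init__(self, m, w, k=8):
        super().__init__()
        T = w - 1                                           # kernel size 1 x T with T = w-1
        self.C = nn.Parameter(torch.randn(k, T) * 0.01)     # S3.2: k convolution kernels
        self.W_a = nn.Parameter(torch.randn(k, m) * 0.01)   # S3.3: scoring weight
        self.W_h = nn.Linear(m, m, bias=False)              # S3.4: weight on h_t
        self.W_v = nn.Linear(k, m, bias=False)              # S3.4: weight on v_t

    def forward(self, hidden_seq):                    # hidden_seq: (batch, w, m) from the LSTM
        h_t = hidden_seq[:, -1, :]                    # last hidden vector, (batch, m)
        H = hidden_seq[:, :-1, :].transpose(1, 2)     # S3.1: (batch, m, w-1); rows = dimensions
        HC = torch.einsum("bit,kt->bik", H, self.C)   # S3.2: H^C_{i,j} = sum_l H_{i,l} C_{j,l}
        scores = torch.einsum("bik,kj,bj->bi", HC, self.W_a, h_t)   # S3.3: (H^C_i)^T W_a h_t
        alpha = torch.sigmoid(scores)                 # S3.3: sigmoid normalization
        v_t = torch.einsum("bi,bik->bk", alpha, HC)   # S3.3: weighted sum of the rows of H^C
        return self.W_h(h_t) + self.W_v(v_t)          # S3.4: h'_t = W_h h_t + W_v v_t

lstm_out = torch.randn(8, 64, 32)                     # 8 samples, w = 64 time steps, m = 32
tpa = TemporalPatternAttention(m=32, w=64, k=8)
print(tpa(lstm_out).shape)                            # torch.Size([8, 32])
```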
S4. Loss function and evaluation indexes of the TPA-DeepAR model:

S4.1 During forward propagation the TPA-DeepAR model outputs the parameters μ and σ of the predicted Gaussian distribution. A traditional regression loss function cannot handle the relationship among μ, σ and y_true (the true label of the sample), so the following loss function is adopted:

Assuming the sample obeys the Gaussian distribution y_true ~ N(μ, σ²), its likelihood function is:

L(\mu,\sigma^2) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(y_{\mathrm{true},i}-\mu)^2}{2\sigma^2}\right)

Its log-likelihood function is:

\ln L(\mu,\sigma^2) = -\frac{n}{2}\ln(2\pi) - \frac{n}{2}\ln\sigma^2 - \frac{1}{2\sigma^2}\sum_{i=1}^{n}\left(y_{\mathrm{true},i}-\mu\right)^2

where n is the number of samples, y_true is known and denotes the true label of the sample, and μ and σ are the parameters of the Gaussian distribution predicted by the model; the likelihood function describes, for the distribution formed by the parameters μ and σ, how probable the sample point y_true is.

The network parameters are therefore learned by maximizing the log-likelihood function, i.e. the distribution formed by the parameters μ and σ should make the sample point y_true as probable as possible, and the loss function for model training is accordingly taken as -ln L(μ, σ²).
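As an illustrative PyTorch sketch of this loss (the tensors below are assumptions; torch.distributions.Normal(mu, sigma).log_prob(y_true) would give the same per-sample term), the training loss -ln L(μ, σ²) of step S4.1 can be written as:

```python
# Negative Gaussian log-likelihood loss for step S4.1 (illustrative sketch).
import math
import torch

def gaussian_nll(mu, sigma, y_true):
    """-ln L(mu, sigma^2) summed over the n samples of the batch."""
    var = sigma ** 2
    log_likelihood = -0.5 * torch.log(2 * math.pi * var) - (y_true - mu) ** 2 / (2 * var)
    return -log_likelihood.sum()

mu = torch.tensor([0.2, 0.7], requires_grad=True)     # illustrative model outputs
sigma = torch.tensor([0.1, 0.2], requires_grad=True)
y_true = torch.tensor([0.0, 1.0])                     # illustrative true labels
loss = gaussian_nll(mu, sigma, y_true)
loss.backward()                                       # gradients for the weight update of step S4.2
print(loss.item())
```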
S4.2 Based on this loss function, update the weights of the TPA-DeepAR model on the training set obtained in step S1, finally generating a preliminary prediction model.

S4.3 Test the preliminary prediction model on the validation set obtained in step S1 and compute the F2 evaluation index; adjust the parameters of the TPA-DeepAR model according to the F2 index, the confusion matrix and the ROC curve to obtain better performance, and save the TPA-DeepAR prediction model that performs best on the evaluation indexes;

The F2 index is:

F_2 = \frac{(1+2^2)\,P\,R}{2^2\,P+R} = \frac{5PR}{4P+R}

where P is the precision, i.e. the proportion of the samples classified as positive that are actually positive:

P = \frac{TP}{TP+FP}

where TP is the number of true positives and FP is the number of false positives, and R is the recall, i.e. the proportion of all actually positive samples that are correctly judged to be positive:

R = \frac{TP}{TP+FN}

where FN is the number of false negatives.

Presenting the four quantities TP, FP, TN and FN together in a 2×2 table gives the confusion matrix; the first to fourth quadrants of the table are TP, FP, FN and TN respectively, where TN is the number of true negatives.

After the confusion matrix is obtained, the larger the counts of correct predictions (TP and TN) the better and, conversely, the smaller the counts of incorrect predictions (FP and FN) the better.

Among all samples that are actually negative, the proportion incorrectly judged to be positive is the false positive rate FPR: FPR = FP/(FP+TN). Taking FPR as the horizontal axis and R as the vertical axis gives the ROC curve. The closer the ROC curve is to the upper-left corner, the higher the recall of the TPA-DeepAR model, the smaller the total number of false positives and false negatives, and the better the prediction performance.
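A small evaluation sketch for step S4.3 follows (assuming scikit-learn and NumPy; the label and probability arrays and the 0.5 binarization threshold are illustrative, not data from the patent):

```python
# Illustrative computation of the F2 index, confusion matrix and ROC curve (step S4.3).
import numpy as np
from sklearn.metrics import auc, confusion_matrix, fbeta_score, roc_curve

y_true = np.array([0, 0, 0, 1, 1, 1, 0, 1])                  # 1 = surge (positive class)
y_prob = np.array([0.1, 0.4, 0.2, 0.8, 0.6, 0.9, 0.7, 0.3])  # predicted surge probabilities
y_pred = (y_prob >= 0.5).astype(int)                         # illustrative decision threshold

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
precision = tp / (tp + fp)                                   # P = TP / (TP + FP)
recall = tp / (tp + fn)                                      # R = TP / (TP + FN)
f2_manual = 5 * precision * recall / (4 * precision + recall)
f2 = fbeta_score(y_true, y_pred, beta=2)                     # same F2 index via scikit-learn
fpr, tpr, thresholds = roc_curve(y_true, y_prob)             # ROC: FPR on the x axis, recall on the y axis
print(f2_manual, f2, auc(fpr, tpr))
```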
S5. Use the final TPA-DeepAR prediction model to perform real-time prediction on the test set. Figure 5 shows the prediction results of the TPA-DeepAR prediction model on the test data, where (a) is the dynamic pressure p2 at the tip of the second-stage stator as a function of time, (b) is the surge prediction probability over time given by the TPA-DeepAR prediction model, and (c) is the early-warning signal given by the TPA-DeepAR prediction model according to the prediction probability. The steps for real-time prediction on the test data are as follows:

S5.1 Preprocess the test-set data following the preprocessing steps, adjust its dimensions, and feed it into the trained TPA-DeepAR model. The test-set data is the dynamic pressure at the tip position of the second-stage stator. As can be seen in panel (a), a downward-developing spike appears at 7.48 s, corresponding to the initial stall-disturbance stage; as the stall disturbance develops, violent fluctuations begin at 7.826 s and the flow fully develops into stall and surge.

S5.2 In chronological order, the TPA-DeepAR prediction model gives the surge prediction probability for each test-set sample. As can be seen in panel (b), the prediction probability curve identifies the initial disturbance at about 7.488 s and the surge probability rises rapidly; it then remains high until the raw dynamic-pressure data returns to a steady state at about 7.68 s, at which point the surge probability curve also falls back rapidly, and it rises again with the subsequent fluctuations of the raw dynamic-pressure data. Once the initial disturbance has occurred, rotating stall and surge are very likely to follow and, once they occur, the consequences are severe; a threshold is therefore set on the surge probability prediction curve and a warning signal is issued when the threshold is exceeded, so that the warning is given already in the initial disturbance stage. The TPA-DeepAR prediction model can thus react in time to the small changes of the initial disturbance stage and output the surge probability value as the disturbance develops.
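The thresholding step can be sketched as follows (NumPy only; the placeholder predict_fn, the toy signal and the 0.7 warning threshold are assumptions standing in for the trained TPA-DeepAR model and the threshold actually chosen for the prediction curve):

```python
# Illustrative real-time surge warning loop for step S5 (a sketch under stated assumptions).
import numpy as np

def realtime_surge_warning(pressure_stream, predict_fn, w=64, threshold=0.7):
    """Slide a window over the incoming pressure signal and flag surge warnings."""
    probabilities, warnings = [], []
    for end in range(w, len(pressure_stream) + 1):
        window = pressure_stream[end - w:end]        # latest w points, in chronological order
        p_surge = float(predict_fn(window))          # surge probability for this window
        probabilities.append(p_surge)
        warnings.append(p_surge >= threshold)        # warn already in the initial disturbance stage
    return np.array(probabilities), np.array(warnings)

# Placeholder model: reacts to the window variance (stands in for the trained TPA-DeepAR model)
predict_fn = lambda window: 1.0 - np.exp(-np.var(window))
stream = np.concatenate([np.random.randn(500) * 0.1, np.random.randn(200) * 1.5])
probs, warns = realtime_surge_warning(stream, predict_fn)
print(probs[-1], bool(warns.any()), int(np.argmax(warns)))
```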
The above embodiment only expresses one implementation of the present invention and should not therefore be construed as limiting the scope of the patent. It should be pointed out that those skilled in the art can make several modifications and improvements without departing from the concept of the present invention, and these all fall within the protection scope of the present invention.

Claims (3)

  1. A deep autoregressive network based method for predicting stall and surge of an axial-flow compressor, characterized by comprising the following steps:

    S1. Preprocess the aeroengine surge data; divide the experimental data into a test data set and a training data set, and then divide the training data set proportionally into a training set and a validation set;

    S2. Construct the attention-based deep autoregressive network model, i.e. the TPA-DeepAR model, comprising the following steps:

    S2.1 Reshape each sample to dimension (w, 1) as the input of the TPA-DeepAR model, where w is the time-window length;

    S2.2 Build an embedding layer that converts the input sample from dimension (w, 1) to (w, m), where m is a specified dimension, spreading the sample features from one dimension over m dimensions;

    S2.3 Build an LSTM layer that takes the output of the embedding layer as input and outputs w hidden vectors {h_{t-w+1}, h_{t-w+2}, ..., h_t}, each of dimension m;

    S2.4 Build the attention layer. The w hidden vectors {h_{t-w+1}, h_{t-w+2}, ..., h_t} output by the LSTM layer are the input of the attention layer, which weights the relevant dimensions and finally outputs a single hidden vector h'_t;
    S2.5 Build the Gaussian layer, which consists of two fully connected layers. The hidden vector h'_t output by the attention layer is the input of the Gaussian layer, and the two fully connected layers output the parameter μ and the parameter σ respectively, so the output of the Gaussian layer determines a Gaussian distribution and the model can thereby fit a Gaussian distribution;

    S2.6 Draw repeated random samples from the fitted Gaussian distribution to obtain data for the prediction point, and derive different quantiles of the prediction point from these samples to realize probabilistic prediction;

    S3. Construct the attention layer described in S2:

    S3.1 The input of the attention layer is the output of the LSTM layer, {h_{t-w+1}, h_{t-w+2}, ..., h_t}, with dimension (w, m). Except for the last hidden vector h_t, the other w-1 hidden vectors form the hidden-state matrix H = {h_{t-w+1}, h_{t-w+2}, ..., h_{t-1}};

    S3.2 Use k convolution kernels to capture the signal patterns of H and obtain the matrix H^C, enhancing the model's ability to learn features;

    S3.3 Compute the similarity between the hidden vector h_t and the matrix H^C with a scoring function to obtain the attention weights α_i, and use the weights α_i to form a weighted sum of the rows of H^C, obtaining the vector v_t;

    S3.4 Finally, concatenate h_t and v_t and feed them into a fully connected layer to obtain a new hidden vector h'_t as output;
    S4. Loss function and evaluation indexes of the TPA-DeepAR model:

    S4.1 During forward propagation the TPA-DeepAR model outputs the parameters μ and σ of the predicted Gaussian distribution, and the loss function adopted is as follows:

    Assuming the sample obeys the Gaussian distribution y_true ~ N(μ, σ²), its likelihood function is:

    L(\mu,\sigma^2) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(y_{\mathrm{true},i}-\mu)^2}{2\sigma^2}\right)

    Its log-likelihood function is:

    \ln L(\mu,\sigma^2) = -\frac{n}{2}\ln(2\pi) - \frac{n}{2}\ln\sigma^2 - \frac{1}{2\sigma^2}\sum_{i=1}^{n}\left(y_{\mathrm{true},i}-\mu\right)^2

    where n is the number of samples, y_true is known and denotes the true label of the sample, and μ and σ are the parameters of the Gaussian distribution predicted by the model; the likelihood function describes, for the distribution formed by the parameters μ and σ, how probable the sample point y_true is;

    the network parameters are therefore learned by maximizing the log-likelihood function, i.e. the distribution formed by the parameters μ and σ should make the sample point y_true as probable as possible, and the loss function for model training is accordingly taken as -ln L(μ, σ²);

    S4.2 Based on this loss function, update the weights of the TPA-DeepAR model on the training set obtained in step S1, finally generating a preliminary prediction model;

    S4.3 Test the preliminary prediction model on the validation set obtained in step S1 and compute the F2 evaluation index; adjust the parameters of the TPA-DeepAR model according to the F2 index, the confusion matrix and the ROC curve to obtain better performance, and save the TPA-DeepAR prediction model that performs best on the evaluation indexes;

    S5. Use the final TPA-DeepAR prediction model to perform real-time prediction on the test set:

    S5.1 Preprocess the test-set data following the preprocessing steps, adjust its dimensions, and feed it into the trained TPA-DeepAR model for testing;

    S5.2 In chronological order, use the TPA-DeepAR prediction model to give the surge prediction probability of each test-set sample, obtaining the real-time surge probability of the test-set samples.
  2. The deep autoregressive network based method for predicting stall and surge of an axial-flow compressor according to claim 1, characterized in that the preprocessing of the aeroengine surge data in step S1 is as follows:

    S1.1 Acquire surge experiment data of a certain type of aeroengine and remove the invalid data caused by sensor failures;

    S1.2 Apply down-sampling and then filtering to the remaining valid data;

    S1.3 Normalize and smooth the filtered data;

    S1.4 To ensure the objectivity of the test results, divide the experimental data into a test data set and a training data set;

    S1.5 Segment the training data set with a time window, the data points covered by each window forming one sample, and divide the training data set into a training set and a validation set at a ratio of 4:1.
  3. The deep autoregressive network based method for predicting stall and surge of an axial-flow compressor according to claim 2, characterized in that, in step S4.3:

    the F2 index is:

    F_2 = \frac{(1+2^2)\,P\,R}{2^2\,P+R} = \frac{5PR}{4P+R}

    where P is the precision, i.e. the proportion of the samples classified as positive that are actually positive:

    P = \frac{TP}{TP+FP}

    where TP is the number of true positives and FP is the number of false positives, and R is the recall, i.e. the proportion of all actually positive samples that are correctly judged to be positive:

    R = \frac{TP}{TP+FN}

    where FN is the number of false negatives;

    presenting the four quantities TP, FP, TN and FN together in a 2×2 table gives the confusion matrix; the first to fourth quadrants of the table are TP, FP, FN and TN respectively, where TN is the number of true negatives;

    after the confusion matrix is obtained, the larger the counts of correct predictions (TP and TN) the better and, conversely, the smaller the counts of incorrect predictions (FP and FN) the better;

    among all samples that are actually negative, the proportion incorrectly judged to be positive is the false positive rate FPR: FPR = FP/(FP+TN); taking FPR as the horizontal axis and R as the vertical axis gives the ROC curve; the closer the ROC curve is to the upper-left corner, the higher the recall of the TPA-DeepAR model, the smaller the total number of false positives and false negatives, and the better the prediction performance.
PCT/CN2022/077168 2022-02-22 2022-02-22 Deep autoregressive network based prediction method for stalling and surging of axial-flow compressor WO2023159336A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2022/077168 WO2023159336A1 (en) 2022-02-22 2022-02-22 Deep autoregressive network based prediction method for stalling and surging of axial-flow compressor

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2022/077168 WO2023159336A1 (en) 2022-02-22 2022-02-22 Deep autoregressive network based prediction method for stalling and surging of axial-flow compressor

Publications (1)

Publication Number Publication Date
WO2023159336A1 true WO2023159336A1 (en) 2023-08-31

Family

ID=87764293

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/077168 WO2023159336A1 (en) 2022-02-22 2022-02-22 Deep autoregressive network based prediction method for stalling and surging of axial-flow compressor

Country Status (1)

Country Link
WO (1) WO2023159336A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117077544A (en) * 2023-10-13 2023-11-17 北京宝隆泓瑞科技有限公司 Oil-gas separator outlet pressure prediction method and device and electronic equipment
CN117575046A (en) * 2024-01-15 2024-02-20 中煤科工开采研究院有限公司 Multi-hydraulic support load model training and multi-hydraulic support load prediction method

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111737910A (en) * 2020-06-10 2020-10-02 大连理工大学 Axial flow compressor stall surge prediction method based on deep learning
CN112580267A (en) * 2021-01-13 2021-03-30 南京航空航天大学 Aero-engine surge prediction method based on multi-branch feature fusion network
CN113125161A (en) * 2021-04-13 2021-07-16 浙江大学 Gas turbine radial air inlet complex distortion digital twin test device and control method
US20210294818A1 (en) * 2020-03-19 2021-09-23 Cisco Technology, Inc. Extraction of prototypical trajectories for automatic classification of network kpi predictions
CN113836817A (en) * 2021-10-09 2021-12-24 大连理工大学 Axial flow compressor rotating stall prediction method based on stacked long-short term memory network

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210294818A1 (en) * 2020-03-19 2021-09-23 Cisco Technology, Inc. Extraction of prototypical trajectories for automatic classification of network kpi predictions
CN111737910A (en) * 2020-06-10 2020-10-02 大连理工大学 Axial flow compressor stall surge prediction method based on deep learning
CN112580267A (en) * 2021-01-13 2021-03-30 南京航空航天大学 Aero-engine surge prediction method based on multi-branch feature fusion network
CN113125161A (en) * 2021-04-13 2021-07-16 浙江大学 Gas turbine radial air inlet complex distortion digital twin test device and control method
CN113836817A (en) * 2021-10-09 2021-12-24 大连理工大学 Axial flow compressor rotating stall prediction method based on stacked long-short term memory network

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117077544A (en) * 2023-10-13 2023-11-17 北京宝隆泓瑞科技有限公司 Oil-gas separator outlet pressure prediction method and device and electronic equipment
CN117077544B (en) * 2023-10-13 2024-01-05 北京宝隆泓瑞科技有限公司 Oil-gas separator outlet pressure prediction method and device and electronic equipment
CN117575046A (en) * 2024-01-15 2024-02-20 中煤科工开采研究院有限公司 Multi-hydraulic support load model training and multi-hydraulic support load prediction method

Similar Documents

Publication Publication Date Title
WO2021248746A1 (en) Axial flow compressor stall and surge prediction method based on deep learning
WO2021135630A1 (en) Rolling bearing fault diagnosis method based on grcmse and manifold learning
WO2023159336A1 (en) Deep autoregressive network based prediction method for stalling and surging of axial-flow compressor
WO2023056614A1 (en) Method for predicting rotating stall of axial flow compressor on the basis of stacked long short-term memory network
WO2023010658A1 (en) Time dilated convolutional network-based method for alerting of rotating stall in compressor
WO2020181998A1 (en) Method for detecting mixed sound event on basis of factor decomposition of supervised variational encoder
CN113836817B (en) Axial flow compressor rotating stall prediction method based on stacked long-term and short-term memory network
CN112629854B (en) Bearing fault classification method based on neural network attention mechanism
CN113255848A (en) Water turbine cavitation sound signal identification method based on big data learning
CN102937320B (en) Health protection method used for intelligent air conditioner
CN112052871B (en) Rocket engine health diagnosis method and system based on support vector machine
CN111680875A (en) Unmanned aerial vehicle state risk fuzzy comprehensive evaluation method based on probability baseline model
CN114062850B (en) Double-threshold power grid early fault detection method
CN110701087A (en) Axial flow compressor pneumatic instability detection method based on single-classification overrun learning machine
CN114548555B (en) Axial flow compressor stall surge prediction method based on deep autoregressive network
Zhang et al. Optimization of hmm based on adaptive GAPSO and its application in fault diagnosis of rolling bearing
Tong et al. Fault prediction of marine diesel engine based on time series and support vector machine
CN114708885A (en) Fan fault early warning method based on sound signals
CN110826587A (en) Improved weighted support vector machine-based turboshaft engine fault detection method
US20240133391A1 (en) Prediction method for stall and surging of axial-flow compressor based on deep autoregressive network
CN110232221A (en) Dam Crack influence factor dynamic Contribution Rate method
Luo et al. Rolling bearing diagnosis based on adaptive probabilistic PCA and the enhanced morphological filter
CN116010884A (en) Fault diagnosis method of SSA-LightGBM oil-immersed transformer based on principal component analysis
CN115659323A (en) Intrusion detection method based on information entropy theory and convolution neural network
Mi et al. A nonparametric cumulative sum-based fault detection method for rolling bearings using high-level extended isolated forest

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 18014573

Country of ref document: US

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22927633

Country of ref document: EP

Kind code of ref document: A1