CN112766303A - CNN-based aeroengine fault diagnosis method - Google Patents


Info

Publication number
CN112766303A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011535827.4A
Other languages
Chinese (zh)
Other versions
CN112766303B (en)
Inventor
全哲
高晋峰
肖桐
郭燕
李磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hunan University
Original Assignee
Hunan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hunan University
Priority to CN202011535827.4A
Publication of CN112766303A
Application granted
Publication of CN112766303B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/10 Pre-processing; Data cleansing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/047 Probabilistic or stochastic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T90/00 Enabling technologies or technologies with a potential or indirect contribution to GHG emissions mitigation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Molecular Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a CNN-based aeroengine fault diagnosis method. The data set used for prediction consists of gas path parameters collected by aeroengine sensors in time order, covering both the various fault conditions and normal operation. The method exploits the ability of a convolutional neural network to mine how the gas path parameters change from one moment to the next. Compared with traditional methods that model discrete data (analyzing the data at a single moment), it considers not only the specific values of the gas path parameters at different moments but also the trend of the parameters over consecutive moments and the relation between earlier and later values. Because a CNN has a degree of translation invariance, it generalizes better and can extract more comprehensive, higher-level and more complex features. A novel loss function is then proposed and used to evaluate the classification result of the model, thereby diagnosing the fault.

Description

CNN-based aeroengine fault diagnosis method
Technical Field
The invention belongs to the field of engines, and particularly relates to a CNN-based aircraft engine fault diagnosis method.
Background
The aircraft engine is one of the most central components of an aircraft and a highly complex system; its health is an important prerequisite for flight safety. Relevant data show that more than 50% of flight accidents in the last decade were caused by aircraft engine failures, and that engine maintenance accounts for up to 40% of spending in the global aircraft maintenance industry. Ensuring reliable and stable engine operation is therefore of great significance for reducing the maintenance costs of airlines and manufacturers, shortening maintenance periods and engine downtime, and improving engine operating efficiency. Fault detection is one of the most important core technologies for aircraft engines. At present, mainstream intelligent fault detection algorithms are mainly based on neural networks and support vector regression: the measured gas path parameters of the aeroengine are converted to a standard state, their difference from the corresponding engine performance baseline (or reference value) is computed as an offset, and fault diagnosis and performance prediction are performed from the offset and its trend. In addition, extracting features from changes in the limited gas path measurement data with artificial intelligence techniques has become a new means of diagnosing aeroengine faults.
Current common practice falls mainly into the following approaches:
1. Baseline modeling based on neural networks. With the rapid development of artificial intelligence, neural networks make it possible to handle the uncertain input-output description found in engine baseline modeling. One approach builds a baseline library of aircraft engine performance parameters by applying nonlinear regression analysis to the performance parameters of a factory monitoring system, and predicts engine gas path state parameters with a Process Neural Network (PNN). Another uses the NeuroSolution6 software to implement a Radial Basis Function (RBF) neural network and establish healthy baselines for EGT, FF and N2. A third establishes a baseline model of the aeroengine gas path parameters (EGT, FF and N2) with a Back Propagation (BP) neural network optimized by a genetic algorithm. Although neural networks have strong nonlinear fitting capability, they tend to diverge when the training sample set is small.
2. Baseline modeling based on Support Vector Regression (SVR). In recent years many scholars have studied support vector regression as a data mining method, and it avoids the above shortcoming of neural networks well. On nonlinear regression problems the SVR algorithm is fast and accurate, and it supports both multi-parameter and single-parameter regression analysis. However, SVR-based algorithms remain sensitive to the choice of model parameters and kernel function, and perform only moderately on multi-class problems.
3. Deep learning. As a hotspot of machine learning, deep learning has been successfully applied to fault diagnosis thanks to its excellent feature extraction capability. Its strong feature learning ability is used to extract features from the real-time monitoring data and historical data of the engine, and a classifier then classifies those features, so that engine faults can be classified and diagnosed better; a feature extractor and fault classification method based on a deep belief network therefore has stronger generalization and practicability. However, a network with too many layers overfits easily and must be trained on an extremely large data set, while a traditional shallow network is prone to local minima and overfitting, which harms the generalization of the system.
In summary, each method has certain limitations. The third scheme is the closest to the present invention, but because it uses a traditional neural network it easily overfits during training and generalizes poorly. The method proposed by the invention uses a one-dimensional convolutional neural network to extract features from the time-series gas path monitoring data of the aircraft engine, integrates the trends of multiple monitored parameters, and can obtain better features for multi-class fault classification. A shallow one-dimensional convolutional neural network extracts the features, and an SVM and a softmax classifier are fused to learn from them, yielding a fault detection technique with better generalization and higher accuracy while avoiding the high computational cost of a deep neural network.
Explanation of terms:
One hot: one-hot encoding, also called one-bit effective encoding, is a common label encoding in multi-class classification. An N-dimensional vector represents N states, where N is the total number of classes; when the class label is i, the element at index i of the N-dimensional vector is set to 1 and all other elements are 0.
ReLU activation function: the Rectified Linear Unit, which has the form
$$f(x) = \max(0, x)$$
It is a nonlinear activation function that sets the output of some neurons to 0, which makes the network sparse, reduces the interdependence of parameters, and alleviates overfitting.
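The two terms above can be illustrated with a minimal Python sketch. The function names `one_hot` and `relu` are chosen here for illustration and do not come from the patent; labels are assumed to be zero-based indices.

```python
def one_hot(label, num_classes):
    """Return an N-dimensional vector with a 1 at index `label` and 0 elsewhere."""
    vector = [0] * num_classes
    vector[label] = 1
    return vector


def relu(x):
    """Rectified linear unit: negative inputs become 0, others pass through."""
    return max(0.0, x)
```

For example, with four fault categories the label 2 is encoded as `[0, 0, 1, 0]`.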
Disclosure of Invention
The invention aims to overcome the defects of the prior art by providing a CNN-based aeroengine fault diagnosis method. The method first uses a one-dimensional convolutional neural network to extract features automatically from the time-series aeroengine gas path parameters, replacing traditional manual feature design; it has better generalization and stability, and because the convolutional network is shallow its computational cost is low. The convolution output is then pooled to extract features further and obtain features of a fixed size, so the technique can be applied to sequence data of different lengths and is flexible. Finally, a classifier combined with the back propagation algorithm classifies the sequence features, so that the fault mode is diagnosed accurately.
The purpose of the invention can be achieved by the following technical scheme:
A CNN-based aeroengine fault diagnosis method comprises the following steps:
step one, collecting operation data from aircraft engine failures, classifying the fault categories and labeling the data to construct a data set, and splitting the data set into a training set and a test set;
step two, preprocessing the training set data to complete data cleaning, and applying min-max normalization to make the data dimensionless;
step three, constructing a multi-classification model, which comprises a shallow convolutional layer and a pooling layer and is fused with a classifier;
step four, taking the sampled gas path parameters as input, passing them through the convolution, pooling and output layers, passing the output value to the classifier, optimizing a cross entropy loss function, and continuing training;
step five, iterating repeatedly up to a preset number of times to obtain a trained model;
step six, applying the same data preprocessing to the test set and testing;
step seven, inputting the operation data of the engine into the trained model in real time to obtain a diagnosis result when the engine fails.
In a further improvement, in step one, data in the data set that contain null values or abnormal values are removed as interference data; the data are then split so that the training set and the test set follow the same distribution, with 80% of the data used as the training set and 20% as the test set.
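One way to keep the training set and the test set consistently distributed is a stratified split, sketched below in plain Python. The patent does not say how consistency is achieved, so per-class shuffling is an assumption, and `stratified_split` with its parameters is a hypothetical helper.

```python
import random
from collections import defaultdict


def stratified_split(samples, labels, train_fraction=0.8, seed=0):
    """Shuffle within each fault class, then split, so both sets share the class mix."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        cut = int(round(train_fraction * len(indices)))
        train_idx.extend(indices[:cut])
        test_idx.extend(indices[cut:])

    train = [(samples[i], labels[i]) for i in train_idx]
    test = [(samples[i], labels[i]) for i in test_idx]
    return train, test
```

With 10 samples in each of two fault classes, this yields 8 training and 2 test samples per class.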
In a further improvement, step two includes the following steps:
2.1 compiling statistics on the engine parameters and collating practical data for each engine parameter to obtain a statistical result; the practical data comprise the actual range interval and the frequency of occurrence; the engine parameters comprise torque, inter-turbine temperature, low-pressure turbine compressor speed, high-pressure turbine compressor speed, propeller speed, high-pressure compressor outlet pressure, fuel flow, take-off altitude, flight Mach number and flight altitude;
2.2 identifying outliers or dirty data from the statistical result using the box plot principle;
2.3 removing the outliers or dirty data to obtain a data set;
2.4 normalizing the data in the data set;
2.5 applying min-max (dispersion) normalization, a linear transformation of the original data that maps the result into [0, 1]; the conversion function is as follows:
$$x^{*} = \frac{x - \min}{\max - \min}$$
where x* denotes the data after normalization, x the data before normalization, max the maximum value of the sample data, and min the minimum value of the sample data;
2.6 further arranging the normalized data set into an engine parameter matrix:
$$\begin{bmatrix} X_0^0 & X_1^0 & \cdots & X_m^0 \\ X_0^1 & X_1^1 & \cdots & X_m^1 \\ \vdots & \vdots & \ddots & \vdots \\ X_0^n & X_1^n & \cdots & X_m^n \end{bmatrix} \rightarrow \text{fault label}$$
where the fault label denotes the fault type, X_m^0 denotes the state value of the m-th variable at time 0, and n denotes the n-th time.
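Steps 2.2 to 2.5 can be sketched as follows. The patent only names the "box plot principle", so the usual 1.5 × IQR fences are an assumption here, as are the helper names; each row is one time step and each column one engine parameter.

```python
import statistics


def iqr_bounds(values, whisker=1.5):
    """Box-plot fences: values outside [Q1 - w*IQR, Q3 + w*IQR] count as outliers."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - whisker * iqr, q3 + whisker * iqr


def drop_outlier_rows(rows):
    """Remove every row in which any parameter falls outside its column's fences."""
    bounds = [iqr_bounds(column) for column in zip(*rows)]
    return [row for row in rows
            if all(low <= value <= high
                   for value, (low, high) in zip(row, bounds))]


def min_max_normalize(rows):
    """Map each column to [0, 1] with x* = (x - min) / (max - min)."""
    columns = list(zip(*rows))
    mins = [min(column) for column in columns]
    maxs = [max(column) for column in columns]
    return [[(value - low) / (high - low) if high > low else 0.0
             for value, low, high in zip(row, mins, maxs)]
            for row in rows]
```

A constant column would make max equal to min, so the sketch maps it to 0.0 rather than dividing by zero; that guard is a design choice of this example, not something the patent specifies.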
In a further improvement, in step three, building the multi-classification model includes the following steps:
3.1 data at n consecutive time points are sampled from the engine gas path parameter matrix as the input of the model;
3.2 the true value of the model output y is one-hot encoded, where y is the fault label corresponding to the input matrix; the dimension of the true value of y is consistent with the actual labels, each element corresponds to the probability of one fault category, and all the probabilities sum to 1;
3.3 the model uses a one-dimensional convolutional neural network for feature extraction; one-dimensional convolution has two basic characteristics: first, the data form a one-dimensional sequence; second, the rows are arranged in time order and are related to the rows before and after them; the calculation formula of one-dimensional convolution is as follows:
$$\mathrm{Conv1D}(u) = \sigma\Big(\sum_{i}\sum_{j} f(i,j)\,u(i,j) + b\Big)$$
where u is one-dimensional data of sequence length s and is the input of the model; each element of the one-dimensional data is a vector of fixed size; f(i, j) denotes the parameter of the one-dimensional convolution kernel at row index i and column index j; u(i, j) denotes the element of the input u at row index i and column index j; b denotes the bias parameter; Conv1D(u) denotes the output of the input u after the one-dimensional convolution operation; σ denotes the ReLU activation function, which increases the nonlinear fitting capability of the neural network, overcomes the vanishing gradient problem and speeds up training;
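The one-dimensional convolution described in 3.3 can be sketched in plain Python for a single kernel. The kernel spans all m columns and slides only along the time axis; the `stride` parameter and the function names are illustrative assumptions.

```python
def relu(x):
    """ReLU activation, playing the role of sigma in the description."""
    return x if x > 0.0 else 0.0


def conv1d(u, kernel, bias=0.0, stride=1):
    """Slide a (k, m) kernel down the time axis of an (s, m) input.

    Each output value is relu(sum_i sum_j kernel[i][j] * window[i][j] + bias).
    """
    k = len(kernel)
    outputs = []
    for start in range(0, len(u) - k + 1, stride):
        window = u[start:start + k]
        total = sum(f * x
                    for kernel_row, input_row in zip(kernel, window)
                    for f, x in zip(kernel_row, input_row))
        outputs.append(relu(total + bias))
    return outputs
```

An input of s time steps and a kernel of height k produce s - k + 1 outputs at stride 1, which is how the convolution compresses the time dimension.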
3.4 the formula of the SVM classifier is as follows:
$$L_i = \sum_{j \neq y_i} \max\left(0,\; s_j - s_{y_i} + \Delta\right)$$
where L_i denotes the value of the loss function obtained after the i-th input matrix passes through the model, y_i denotes the actual correct label, s_j denotes the probability value of class j in the actual output of the model, s_{y_i} denotes the probability value of class y_i in the actual prediction output of the model, and Δ denotes a threshold; if the difference s_{y_i} - s_j is equal to or higher than the threshold, the correct class and the compared class are judged to be well distinguished and a loss of 0 is given; if it is less than the threshold, the model distinguishes the correct class from the compared class poorly, and the difference between the class scores plus the threshold Δ is taken as the loss;
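The thresholded loss described in 3.4 corresponds to a multi-class hinge loss, which can be sketched as below. The default `delta=1.0` is an assumed value for the threshold; the patent does not state one.

```python
def multiclass_hinge_loss(scores, correct, delta=1.0):
    """Sum max(0, s_j - s_correct + delta) over every class j other than the correct one."""
    return sum(max(0.0, s_j - scores[correct] + delta)
               for j, s_j in enumerate(scores)
               if j != correct)
```

For scores `[3.0, 1.0, 2.5]` with correct class 0, class 1 is separated by more than the threshold and contributes 0, while class 2 is only 0.5 below the correct score and contributes 0.5.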
3.5 Softmax classifier:
$$P(s)_k = \frac{e^{s_k}}{\sum_{j=1}^{M} e^{s_j}}$$
first, the softmax classifier normalizes the actual output of the model with the normalization function above, ensuring that each value is positive and that the values of all classes sum to 1; after normalization, the larger and closer to 1 a class value is, the more likely the model considers that class, and conversely, the closer to 0 the probability the model assigns to the correct class, the worse the model; accordingly, the loss is taken as the -log of the probability of the correct class, so that the smaller the correct class probability, the larger the loss, as shown in the following formula:
$$L = -\frac{1}{N}\sum_{i=1}^{N}\sum_{c=1}^{M} y_{ic}\,\log(p_{ic})$$
where P(s) denotes the probabilities obtained by softmax normalization of the vector actually predicted by the model, s denotes the vector actually output by the model, k denotes the k-th fault category, e^s denotes exponentiation of the predicted value for a class, M denotes the number of classes, y_ic is 0 or 1 (1 if the actual class of sample i is c, otherwise 0), p_ic denotes the probability that the model predicts class c for sample i, and N denotes the number of samples;
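The softmax normalization and the -log loss of 3.5 can be sketched as follows. Subtracting the maximum score before exponentiation is a standard numerical-stability step added in this example; it does not change the result and is not mentioned in the patent.

```python
import math


def softmax(scores):
    """Normalize raw scores into positive values that sum to 1."""
    shift = max(scores)  # avoids overflow in exp; the ratios are unchanged
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(batch_scores, batch_labels):
    """Mean of -log(probability assigned to the correct class) over the batch."""
    losses = [-math.log(softmax(scores)[label])
              for scores, label in zip(batch_scores, batch_labels)]
    return sum(losses) / len(losses)
```

Because the true label is one-hot, the inner sum over classes reduces to the single term for the correct class, which is what `cross_entropy` computes.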
3.6 combined loss function:
Figure BDA0002853413740000065
where i denotes the i-th sample, N denotes the total number of samples, and α denotes the proportion factor of the L_svm loss function.
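The exact form of the combined loss is not recoverable from the text, so the sketch below assumes a convex combination: α weights the SVM (hinge) term and 1 - α weights the softmax term, averaged over the N samples. Both the weighting scheme and the default values of `alpha` and `delta` are assumptions for illustration only.

```python
import math


def hinge(scores, correct, delta=1.0):
    """Multi-class hinge loss for one sample."""
    return sum(max(0.0, s - scores[correct] + delta)
               for j, s in enumerate(scores) if j != correct)


def softmax_nll(scores, correct):
    """-log of the softmax probability of the correct class, computed stably."""
    shift = max(scores)
    log_sum = shift + math.log(sum(math.exp(s - shift) for s in scores))
    return log_sum - scores[correct]


def combined_loss(batch_scores, batch_labels, alpha=0.5, delta=1.0):
    """ASSUMED form: mean over samples of alpha * L_svm + (1 - alpha) * L_softmax."""
    total = 0.0
    for scores, label in zip(batch_scores, batch_labels):
        total += (alpha * hinge(scores, label, delta)
                  + (1.0 - alpha) * softmax_nll(scores, label))
    return total / len(batch_scores)
```

Setting `alpha=1.0` recovers the pure hinge loss and `alpha=0.0` the pure softmax loss, which makes the effect of the proportion factor easy to check.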
In a further improvement, step four includes the following steps:
4.1 one-dimensional convolution process:
4.1.1 sliding a window of the same size as the convolution kernel over the input features;
4.1.3 computing the dot product of the convolution kernel and the feature matrix window from the previous step;
4.1.4 traversing the whole feature matrix and summing the dot product results;
4.1.5 feeding the summed result into the ReLU activation function to increase the model's ability to fit nonlinear characteristics;
4.1.6 after summation, the original feature matrix is reduced in dimension and distilled into high-level features that relate earlier and later points of the time series;
4.1.7 returning the extracted high-level features;
4.2 pooling process:
a pooling layer is added after the convolutional layer to reduce the complexity of the data and prevent the model from overfitting; max pooling or average pooling is selected according to the final requirements;
4.2.1 global max pooling selects the largest feature value among the features as the output of the max pooling layer;
4.2.2 global average pooling takes the average of the feature values as the output of the average pooling layer;
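Global pooling reduces each convolution output sequence to a single number, which is why the pooled feature size is fixed regardless of sequence length. A minimal sketch, assuming `feature_maps` holds one list of activations per convolution kernel:

```python
def global_max_pool(feature_maps):
    """Keep the largest activation of each feature map."""
    return [max(channel) for channel in feature_maps]


def global_average_pool(feature_maps):
    """Keep the mean activation of each feature map."""
    return [sum(channel) / len(channel) for channel in feature_maps]
```

Two feature maps of different lengths still pool to a vector of length two, one value per kernel.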
4.3 classifier: all extracted features are concatenated and the output value is passed to the classifier;
4.3.1 converting the output value of the network into a vector;
4.3.2 replacing the fully connected layer with global average pooling; or using a fully connected layer together with a dropout layer;
4.4 dropout layer:
the dropout layer randomly discards part of the input during training, and the parameters corresponding to the discarded input are not updated in that pass; this mitigates overfitting and reduces complex co-adaptation among neurons;
4.4.1 first, randomly deleting half of the hidden neurons in the network while keeping the input and output neurons unchanged;
4.4.2 then propagating the input x forward through the modified network and propagating the resulting loss backward through the modified network; after a mini-batch of training samples has been processed, updating the parameters of the undeleted neurons by stochastic gradient descent;
4.4.3 repeating step 4.4.1 and step 4.4.2.
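The random deletion in 4.4.1 can be sketched as a mask applied to the hidden activations. Rescaling the surviving activations by 1 / keep probability ("inverted dropout") is a common convention assumed here so that inference needs no adjustment; the patent does not describe any rescaling.

```python
import random


def dropout(activations, drop_prob=0.5, training=True, rng=random):
    """Zero each activation with probability drop_prob during training.

    Returns the masked activations and the 0/1 mask; gradients for masked
    units are zero, so their parameters are not updated in that pass.
    """
    if not training or drop_prob == 0.0:
        return list(activations), [1] * len(activations)
    keep_prob = 1.0 - drop_prob
    mask = [1 if rng.random() < keep_prob else 0 for _ in activations]
    dropped = [a * m / keep_prob for a, m in zip(activations, mask)]
    return dropped, mask
```

A fresh mask is drawn on every call, which matches repeating steps 4.4.1 and 4.4.2 for each mini-batch.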
In a further improvement, step five includes the following steps:
5.1 determining the range of the batch size hyperparameter by experiment;
5.2 using convolution kernels of different sizes in the convolution process;
5.3 performing a grid search or other hyperparameter search over the number, and also the size, of the convolution kernels within a preset range according to a preset rule.
Because the number m of engine parameter states is fixed at each time, the convolution kernel sizes can be chosen from combinations such as (3, m), (5, m), (7, m) and (9, m), and the number of convolution kernels can be 64, 128, 256, and so on. Models are trained for the different combinations in order to select the best one. The number of training epochs is set to 400; during each epoch the batch size and the convolution kernel combination are fixed, each batch updates the model parameters by back propagation with a gradient descent algorithm, and the iteration is repeated until model training is complete. The optimal parameters are thereby obtained and the parameter search is complete.
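The grid search over kernel heights and kernel counts can be sketched as below. `train_and_score` is a hypothetical callback that trains one model and returns a validation score (higher is better); the candidate batch sizes are assumed values, since the patent only says the range is found by experiment.

```python
from itertools import product


def grid_search(train_and_score,
                kernel_heights=(3, 5, 7, 9),
                kernel_counts=(64, 128, 256),
                batch_sizes=(32, 64),
                epochs=400):
    """Try every combination and return (best_score, best_config)."""
    best = None
    for height, count, batch in product(kernel_heights, kernel_counts, batch_sizes):
        config = {"kernel_height": height, "kernel_count": count,
                  "batch_size": batch, "epochs": epochs}
        score = train_and_score(**config)
        if best is None or score > best[0]:
            best = (score, config)
    return best
```

The kernel width is not searched because it always equals m, the fixed number of engine parameters per time step.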
Compared with the prior art, the invention has the following advantages and characteristics:
1. The data are preprocessed more reasonably: a statistical method determines the valid range interval of each aircraft engine gas path parameter, a box plot is used to analyze outliers, and noise is rejected at the same time, so the data are more reasonable; min-max normalization is also applied to remove the dimensions of the different variables, and the normalized data speed up model training and convergence.
2. A more reasonable model architecture is constructed: a convolutional neural network extracts the gas path parameter features, so that features which traditional manual methods cannot extract are obtained and the heavy workload of manual feature selection is avoided; feature extraction is automatic and relies entirely on the extraction capability of the model, and a pooling layer further selects among the extracted features, making the model simpler and lighter. Meanwhile, the dropout layer further improves the generalization capability of the model and reduces the model to a great extent
Drawings
FIG. 1 is a process flow diagram of the present invention;
FIG. 2 is a structural view of feature extraction.
Detailed Description
The present invention will be described in detail below with reference to the accompanying drawings and specific embodiments.
The method first uses a one-dimensional convolutional neural network to extract features automatically from the time-series aeroengine gas path parameters, replacing traditional manual feature design; it has better generalization and stability, and because the convolutional network is shallow its computational cost is low. The convolution output is then pooled to extract features further and obtain features of a fixed size, so the technique can be applied to sequence data of different lengths and is flexible. Finally, a classifier combined with the back propagation algorithm classifies the sequence features, so that the fault mode is diagnosed accurately. The specific technical scheme of the invention is as follows:
Step one, constructing the data set:
1.1 the data set is shuffled and split into an 80% training set and a 20% test set, the latter used to verify the model's performance.
1.2 the data set is classified and labeled by aircraft engine fault category, and data containing null values or other abnormal values are removed; eliminating such interference data can noticeably improve accuracy.
1.3 the data are split so that the training set and the test set follow the same distribution.
Step two, preprocessing the training set data to complete data cleaning, and applying min-max normalization to make the data dimensionless:
2.1 compiling statistics on parameters such as torque, inter-turbine temperature, low-pressure turbine compressor speed, high-pressure turbine compressor speed, propeller speed, high-pressure compressor outlet pressure, fuel flow, take-off altitude, flight Mach number and flight altitude, and collating practical data such as the actual range interval and frequency of occurrence of each parameter;
2.2 identifying outliers or dirty data from the statistical result using the box plot principle;
2.3 rejecting the data that do not meet the requirements.
2.4 different evaluation indexes (parameters) often have different dimensions and units, which affects the result of data analysis; to eliminate the influence of dimension among the indexes, the data must be standardized.
2.5 applying min-max (dispersion) normalization, a linear transformation of the original data that maps the result into [0, 1]; the conversion function is shown in FIG. 2 as follows:
$$x^{*} = \frac{x - \min}{\max - \min}$$
FIG. 2 Min-Max Normalization
where max is the maximum value of the sample data and min is the minimum value of the sample data; reasonable values can also be set from experience.
2.6 the normalized data set is further arranged as shown below:
$$\begin{bmatrix} X_0^0 & X_1^0 & \cdots & X_m^0 \\ X_0^1 & X_1^1 & \cdots & X_m^1 \\ \vdots & \vdots & \ddots & \vdots \\ X_0^n & X_1^n & \cdots & X_m^n \end{bmatrix}$$
The first row of the matrix, X_0^0 to X_m^0, gives the state values of the m variables at time 0, and rows 0 to n hold the variable states along the time series from time 0 to time n. The label of the whole matrix is a particular fault type.
Step three, constructing a classification model (multi-class), which comprises a feature extractor and a classifier:
3.1 the matrix shown above is the input variable, and the class label in one-hot form is the supervision signal;
3.2 the true value of the model output y is one-hot encoded; its dimension should be consistent with the actual labels, each element corresponds to the probability of one fault category, and all the probabilities sum to 1;
3.3 the model uses a one-dimensional convolutional neural network; one-dimensional convolution has two basic characteristics: first, the data form a one-dimensional sequence that looks like a 2-dimensional matrix, with each row treated as a whole; second, the rows are arranged in time order and have a certain relation to the rows before and after them; the calculation formula of one-dimensional convolution is given below, where u is the input data, a sequence of a certain length:
$$\mathrm{Conv1D}(u) = \sigma\Big(\sum_{i}\sum_{j} f(i,j)\,u(i,j) + b\Big)$$
u: one-dimensional data of sequence length s, where each element is also a vector of fixed size;
f(i, j): one-dimensional convolution kernel parameters;
σ: the ReLU activation function, which increases the nonlinear fitting capability of the neural network for a better fit, overcomes the vanishing gradient problem and speeds up training.
The formula computes the dot product of the convolution kernel parameters and the corresponding window of the state parameter matrix at a fixed convolution stride to obtain the corresponding convolution feature, which also compresses the original dimensionality. The convolution kernel moves strictly by the stride in a sliding-window manner until the state matrix has been completely traversed, yielding the feature parameters of the input matrix, that is, a feature set.
3.4SVM classifier: the formula is shown below
Figure BDA0002853413740000111
where L_i represents the value of the loss function obtained after the i-th input matrix is processed by the model, y_i indicates the actual correct label, and s_j is the score of class j actually output by the model.
Δ denotes a threshold. If the score of the correct class exceeds the score of another class by at least the threshold, the correct class is considered well distinguished from that class and a penalty of 0 is given for that pair. Otherwise, the model distinguishes the two classes poorly, and the difference between the class scores plus the threshold is taken as the penalty.
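As a hedged illustration of the penalty rule just described, the sketch below assumes the standard multiclass hinge form, in which each wrong class j contributes max(0, s_j - s_{y_i} + Δ). The formula image is not reproduced in this text, so this form is an assumption; the function name `svm_hinge_loss` and the scores are hypothetical.

```python
def svm_hinge_loss(scores, correct, delta=1.0):
    """Multiclass hinge loss for one sample.

    scores  : list of class scores s_j output by the model
    correct : index y_i of the correct class
    delta   : threshold (margin) between correct and wrong classes
    """
    loss = 0.0
    for j, s_j in enumerate(scores):
        if j == correct:
            continue
        # Zero penalty when the correct class leads by at least delta,
        # otherwise the score difference plus delta.
        loss += max(0.0, s_j - scores[correct] + delta)
    return loss


well_separated = svm_hinge_loss([5.0, 1.0, 2.0], correct=0)    # -> 0.0
poorly_separated = svm_hinge_loss([3.0, 1.0, 2.5], correct=0)  # -> 0.5
```

In the second call, class 2 trails the correct class by only 0.5, which is less than the threshold of 1.0, so it contributes 2.5 - 3.0 + 1.0 = 0.5 to the loss.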
3.5 Softmax classifier:
Figure BDA0002853413740000112
First, the softmax classifier normalizes the actual model outputs (which may be positive or negative) with the normalization function in the above equation, ensuring that each value is positive and that the values over all classes sum to 1. After normalization, the class whose value is largest and closest to 1 is the class the model judges most likely. Conversely, the closer the probability assigned to the correct class is to 0, the worse the model. Accordingly, the loss is taken as the negative logarithm of the correct-class probability (the smaller the correct-class probability, the larger the loss), as in the following formula:
Figure BDA0002853413740000121
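The normalization and the negative-log loss described in 3.5, together with the one-hot labels of 3.2, can be sketched as follows. This is a minimal illustration under the usual definitions of softmax and cross entropy; subtracting the maximum score before exponentiation is a common numerical-stability step added here as an assumption, and the function names are hypothetical.

```python
import math


def softmax(scores):
    """Map raw scores (positive or negative) to positive values summing to 1."""
    m = max(scores)  # subtracting the maximum does not change the result
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def one_hot(label, num_classes):
    """One-hot vector for a fault label, with the same dimension as the output."""
    return [1.0 if c == label else 0.0 for c in range(num_classes)]


def cross_entropy(probs, target):
    """Negative log of the probability assigned to the correct class."""
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0.0)


probs = softmax([2.0, -1.0, 0.5])
loss_if_class_0 = cross_entropy(probs, one_hot(0, 3))  # high probability, small loss
loss_if_class_1 = cross_entropy(probs, one_hot(1, 3))  # low probability, large loss
```

Because class 0 receives the largest probability, the loss is smallest when class 0 is the true label and grows as the probability of the true class approaches 0.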
Step four: the sampled gas path parameters are taken as input; after the convolution and pooling operations, the output value is passed to the classifier, the loss function is optimized, and training continues. The specific process is shown in fig. 2:
4.1 one-dimensional convolution process:
4.1.1 traversing the input features with a sliding window of the same size as the convolution kernel;
4.1.3 performing a dot product operation between the convolution kernel and the feature matrix window from the previous step;
4.1.4 traversing the whole feature matrix and summing the dot product results;
4.1.5 feeding the summed result into the ReLU activation function to increase the model's capability to fit nonlinear characteristics;
4.1.6 after summing, the dimension of the original feature matrix is reduced and it is extracted into higher-level features that capture the before-and-after time series relations;
4.1.7 returning the extracted high-level features.
4.2 the pooling process:
Usually, a pooling layer is added after the convolutional layer to reduce the complexity of the data and prevent overfitting of the model; max pooling or average pooling can be selected according to the final requirement:
4.2.1 max pooling selects the maximum feature value among adjacent feature values, according to the size of the pooling kernel, as the output of the max pooling layer;
4.2.2 average pooling, instead of taking the maximum feature value, takes the average of the feature values as the output of the average pooling layer.
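The two pooling variants can be compared with a short sketch. It assumes non-overlapping windows (stride equal to the pooling kernel size), which the text does not specify; the function name `pool1d` and the sample values are hypothetical.

```python
def pool1d(features, size, mode="max"):
    """Pool a 1-D feature sequence over non-overlapping windows.

    mode="max" keeps the largest value in each window,
    mode="avg" keeps the mean of each window.
    """
    if mode not in ("max", "avg"):
        raise ValueError("mode must be 'max' or 'avg'")
    pooled = []
    for start in range(0, len(features) - size + 1, size):
        window = features[start:start + size]
        pooled.append(max(window) if mode == "max" else sum(window) / size)
    return pooled


features = [1.0, 3.0, 2.0, 8.0, 5.0, 4.0]
max_pooled = pool1d(features, size=2, mode="max")  # -> [3.0, 8.0, 5.0]
avg_pooled = pool1d(features, size=2, mode="avg")  # -> [2.0, 5.0, 4.5]
```

Either variant halves the sequence length here, which is how pooling reduces the complexity of the data passed to later layers.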
4.3 classifier:
All extracted features are connected and the output values are sent to a classifier:
4.3.1 converting the output values of the network into a vector;
4.3.2 a global average pooling technique can be used instead of a fully connected layer;
4.3.3 if a fully connected layer is used, it needs to be combined with a dropout layer, because the fully connected layer is scale-sensitive.
4.4 dropout layer:
The dropout layer randomly discards a part of the inputs during training; the parameters corresponding to the discarded inputs are not updated at that time, which alleviates the overfitting problem to a great extent and reduces complex co-adaptation among neurons.
4.4.1 First, half of the hidden neurons in the network are randomly and temporarily deleted (according to a certain probability), while the input and output neurons are kept unchanged.
4.4.2 Then the input x is propagated forward through the modified network, and the resulting loss is propagated backward through the modified network. After a mini-batch of training samples has completed this process, the corresponding parameters of the neurons that were not deleted are updated by stochastic gradient descent.
4.4.3 this process is repeated continuously.
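The random deletion in 4.4.1 can be sketched with a mask over the hidden activations. This is an illustration only: the rescaling of kept activations by 1 / (1 - drop probability), often called inverted dropout, is a common practice assumed here and is not stated in the text, and the function names are hypothetical.

```python
import random


def dropout_mask(n, drop_prob, rng):
    """Draw a fresh 0/1 mask: each hidden neuron is dropped with drop_prob."""
    return [0.0 if rng.random() < drop_prob else 1.0 for _ in range(n)]


def apply_dropout(activations, mask, drop_prob):
    """Zero the dropped neurons and rescale the kept ones (assumed convention)."""
    keep_prob = 1.0 - drop_prob
    return [a * m / keep_prob for a, m in zip(activations, mask)]


rng = random.Random(0)                 # fixed seed so the sketch is repeatable
mask = dropout_mask(1000, 0.5, rng)    # roughly half of the entries are 0.0
hidden = apply_dropout([2.0, 2.0, 2.0, 2.0], [1.0, 0.0, 1.0, 0.0], 0.5)
# hidden -> [4.0, 0.0, 4.0, 0.0]
```

A new mask is drawn for every mini-batch, so a dropped neuron contributes nothing to the forward pass and receives no parameter update for that batch, which matches the repeat step in 4.4.3.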
Step five: a trained model is obtained through multiple iterative computations:
5.1 The batch size is an important hyperparameter in stochastic gradient descent training, and it has a great influence on the training of the whole model. A larger batch size yields a more accurate gradient estimate, because the more data is used per parameter update, the better the gradient represents the global loss function; however, the network may fall into a local minimum. If the batch size is too large, loading too much data into GPU memory at one time can become a bottleneck; if the batch size is too small, the model may fail to converge. The batch size therefore needs to be increased only within a reasonable range.
5.2 The convolution process can combine convolution kernels of different sizes. The larger the convolution kernel, the larger the corresponding receptive field and the more features can be learned, so the learning capability is stronger; however, the number of model parameters and the training difficulty also increase. Conversely, if the convolution kernel is too small, the receptive field is too small and the learning capability of the model may be limited. Combining convolution kernels of different sizes therefore covers different receptive fields and increases the learning capability of the model without greatly increasing the number of model parameters.
5.3 The number of one-dimensional convolution kernels is also an important hyperparameter. Because the model finally adopts a global average pooling layer, each convolution kernel corresponds to one high-level feature. If there are too few convolution kernels, too few features are finally extracted and the learning capability of the model may be too poor; if there are too many, the model becomes too complex and the training time becomes too long.
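The trade-offs in 5.1 to 5.3 can be made concrete with a small sketch that counts the parameters of a one-dimensional convolutional layer and enumerates candidate hyperparameter settings. The candidate values, the assumption of 10 input parameters per time step, and the names `conv1d_param_count` and `candidate_configs` are illustrative and not taken from the patent.

```python
from itertools import product


def conv1d_param_count(kernel_size, in_channels, num_kernels):
    """Weights of all kernels plus one bias per kernel."""
    return kernel_size * in_channels * num_kernels + num_kernels


def candidate_configs(batch_sizes, kernel_size_sets, kernel_counts):
    """Yield every combination of the candidate hyperparameter values."""
    for batch, sizes, count in product(batch_sizes, kernel_size_sets, kernel_counts):
        yield {"batch_size": batch, "kernel_sizes": sizes, "num_kernels": count}


# A larger kernel or more kernels increases the parameter count.
small = conv1d_param_count(kernel_size=3, in_channels=10, num_kernels=16)  # -> 496
large = conv1d_param_count(kernel_size=5, in_channels=10, num_kernels=16)  # -> 816

# Hypothetical search space: 2 batch sizes x 2 kernel-size sets x 2 kernel counts.
configs = list(candidate_configs([32, 64], [(3,), (3, 5)], [16, 32]))
```

Each configuration would then be trained and scored on held-out data; the scoring step is omitted because the text does not describe it.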
Step six: the open test set is subjected to the same preprocessing and then used for testing.
The data set used by the invention consists of gas path parameters collected by aeroengine sensors, including gas path parameters when various faults occur and gas path parameters under normal conditions; the data is collected as a time series. The invention exploits the ability of the convolutional neural network to fully mine the before-and-after changes among the gas path parameters. Compared with the traditional approach of modeling discrete data (analyzing the data at a specific moment), the invention considers not only the change of the specific values of the gas path parameters at different moments, but also the trend of the parameter changes over continuous moments and their before-and-after relation. Because the CNN has a certain translation invariance, its generalization capability is better, and more comprehensive, higher-level and more complex features can be obtained. The invention further provides a novel loss function, which is used to evaluate the classification result of the model so as to realize fault diagnosis.
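One plausible reading of the combined loss named in item 3.6 of the claims is sketched below: the softmax cross-entropy loss plus α times the SVM loss, averaged over N samples. The exact formula is given only as an image that is not reproduced here, so this form, the default values of `alpha` and `delta`, and all function names are assumptions.

```python
import math


def softmax_cross_entropy(scores, correct):
    """Negative log of the softmax probability of the correct class."""
    m = max(scores)
    log_sum = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_sum - scores[correct]


def hinge_loss(scores, correct, delta=1.0):
    """Multiclass hinge loss with threshold delta (assumed standard form)."""
    return sum(max(0.0, s - scores[correct] + delta)
               for j, s in enumerate(scores) if j != correct)


def combined_loss(batch_scores, labels, alpha=0.5, delta=1.0):
    """Average over samples of cross entropy plus alpha times hinge loss."""
    total = 0.0
    for scores, y in zip(batch_scores, labels):
        total += softmax_cross_entropy(scores, y) + alpha * hinge_loss(scores, y, delta)
    return total / len(labels)


undecided = combined_loss([[0.0, 0.0]], [0])   # equals log(2) + 0.5
confident = combined_loss([[10.0, 0.0]], [0])  # close to 0
```

When the model cannot separate the two classes, both terms contribute; when the correct class leads by more than the threshold, the hinge term vanishes and only a small cross-entropy term remains.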

Claims (6)

1. A CNN-based aeroengine fault diagnosis method is characterized by comprising the following steps:
step one, collecting operation data when an aircraft engine fails, classifying the fault categories, labeling the data to construct a data set, and splitting the data set into a training set and a test set;
step two, preprocessing the data of the training set to complete data cleaning, and applying min-max normalization to make the data dimensionless;
step three, constructing a multi-classification model, wherein the multi-classification model comprises a shallow convolutional layer and a pooling layer and is fused with a classifier;
step four, taking the sampled gas path parameters as input, passing them through the convolution, pooling and output layers, transmitting the output value to the classifier, optimizing a cross entropy loss function, and then continuing training;
step five, carrying out repeated iterative computation to preset times to obtain a trained model;
step six, carrying out the same data preprocessing on the test set for testing;
and step seven, inputting the running data of the engine into the trained model in real time to obtain the diagnosis result when the engine fails.
2. The CNN-based aeroengine fault diagnosis method of claim 1, wherein in the first step, data including null values or abnormal values in the data set are removed, and interference data are removed; then, the data are segmented to ensure that the training set and the test set are distributed consistently; with 80% of the data as the training set and 20% as the test set.
3. The CNN-based aircraft engine fault diagnosis method according to claim 1, wherein the second step comprises the following steps:
2.1 counting the engine parameters, and sorting out the practical data of each engine parameter to obtain a statistical result; the practical data comprises an actual range interval and an occurrence frequency; the engine parameters comprise torque, inter-turbine temperature, rotating speed of a low-pressure turbine compressor, rotating speed of a high-pressure turbine compressor, rotating speed of a propeller, outlet pressure of the high-pressure compressor, fuel flow, take-off height, flight speed Mach number and flight height;
2.2 counting outliers or dirty data by utilizing a box plot principle according to the statistical result;
2.3 removing outliers or dirty data to obtain a data set;
2.4, carrying out normalization processing on the data in the data set;
2.5 the data is subjected to a dispersion normalization process, and the original data is subjected to a linear transformation, so that the result is mapped between [0, 1], and the conversion function is as follows:
Figure FDA0002853413730000021
wherein x* represents the data after normalization, x represents the data before normalization, max is the maximum value of the sample data, and min is the minimum value of the sample data;
2.6 further sorting the normalized data set to form an engine parameter matrix:
Figure FDA0002853413730000022
wherein the fault label represents the fault type, X^0_m represents the state value of the m-th variable at time 0, and n represents the n-th time.
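The outlier removal of steps 2.2 and 2.3 and the normalization of step 2.5 can be sketched for a single engine parameter as follows. The 1.5 × IQR fences are the conventional box plot rule and are assumed here, since the claim does not give the factor; the min-max mapping follows the definitions of x*, x, max and min above; the function names and sample values are hypothetical.

```python
import statistics


def remove_outliers_iqr(values, k=1.5):
    """Drop values outside the box plot fences [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if lower <= v <= upper]


def min_max_normalize(values):
    """Linearly map values to [0, 1] using x* = (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


raw = [1.0, 2.0, 3.0, 4.0, 5.0, 100.0]   # illustrative readings with one outlier
cleaned = remove_outliers_iqr(raw)        # -> [1.0, 2.0, 3.0, 4.0, 5.0]
scaled = min_max_normalize(cleaned)       # -> [0.0, 0.25, 0.5, 0.75, 1.0]
```

Removing the outlier first matters because a single extreme reading would otherwise set max and squeeze all normal readings toward 0.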
4. The CNN-based aircraft engine fault diagnosis method according to claim 3, wherein in the third step, the building of the multi-classification model comprises the following steps:
3.1 the data of n continuous time points are sampled in the engine gas path parameter matrix as the input of the model.
3.2 the true value of the output y of the model is coded by one hot, y represents the fault label corresponding to the input matrix, the true value dimension of y is consistent with the actual label, each element corresponds to the possible probability of one fault category, and the sum of all the probabilities is 1;
3.3 the model uses a one-dimensional convolution neural network for feature extraction, the one-dimensional convolution has two basic features: firstly, the data is a one-dimensional matrix; secondly, each line is arranged according to a time sequence and has a front-back incidence relation; the calculation formula of the one-dimensional convolution is as follows:
Figure FDA0002853413730000031
wherein u is one-dimensional data with sequence length s, used as the input of the model; each element of the one-dimensional data is a vector of fixed size; f(i, j) represents the parameter of the one-dimensional convolution kernel at row index i and column index j; u(i, j) represents the element of the input u at row index i and column index j; b represents a bias parameter; Conv1D(u) represents the output of the input u after the one-dimensional convolution operation; σ represents the ReLU activation function, which is used to increase the nonlinear fitting capability of the neural network, mitigate the vanishing-gradient problem, and speed up training;
3.4 the formula of the SVM classifier is as follows:
Figure FDA0002853413730000032
wherein L_i represents the value of the loss function obtained after the i-th input matrix is processed by the model, y_i indicates the actual correct label, and s_j represents the probability value of class j actually output by the model; s_{y_i} represents the probability value of class y_i actually predicted by the model; Δ represents a threshold; if s_{y_i} - s_j is equal to or higher than the threshold, the correct class is judged to be well distinguished from the compared class, and a loss value of 0 is given; if it is less than the threshold, the model distinguishes the correct class from the compared class poorly, and the difference between the class scores plus the threshold Δ is taken as the loss;
3.5 Softmax classifier:
Figure FDA0002853413730000033
firstly, the softmax classifier normalizes the actual output of the model through the normalization function of the formula to ensure that each is positive and the sum of all classes is 1; after normalization, if a certain class value is larger and closer to 1, the model judges that the most possible class is the corresponding class, conversely, if the model judges that the correct class probability value is closer to 0, the model is worse, according to the characteristic, the loss value is taken as the-log value of the correct class, and the smaller the correct class probability is, the larger the loss is, as shown in the following formula:
Figure FDA0002853413730000041
wherein P(s) represents the probability of the vector actually predicted by the model after softmax normalization, s represents the vector actually output by the model, k represents the k-th fault category, e^s indicates that the exponential operation is applied to the predicted value of a class, M indicates the number of classes, y_ic is 0 or 1 (1 if the actual class of sample i is c, otherwise 0), p_ic represents the probability that the model predicts sample i to be of class c, and N represents the number of samples;
3.6 combining loss functions
Figure FDA0002853413730000044
where i denotes the i-th sample, N denotes the total number of samples, and α denotes the weighting factor of the L_svm loss function.
5. The CNN-based aircraft engine fault diagnosis method according to claim 1, wherein the fourth step comprises the steps of:
4.1 one-dimensional convolution process:
4.1.1 traversing a sliding window with the same size as the convolution kernel in the input features;
4.1.3 performing dot product operation on the convolution kernel and the corresponding characteristic matrix window in the previous step;
4.1.4 traversing the whole feature matrix to calculate the result of point multiplication for summation;
4.1.5 sending the result after summation into Relu activation function to increase the fitting capability of the nonlinear characteristic of the model;
4.1.6 after summing, reducing the dimension of the original feature matrix and extracting the original feature matrix into high-level features related to the front and back time series;
4.1.7 returning the extracted high-level features;
4.2 the pooling process:
a pooling layer is added after the convolutional layer, so that the complexity of data is reduced, overfitting of the model is prevented, and maximum pooling or average pooling is selected according to final needs;
4.2.1 the global maximum pooling is to select the maximum eigenvalue from the eigenvalues as the output of the maximum pooling according to the size of the eigenvalue;
4.2.2 Global average pooling refers to selecting an average value group in the features as the output of the final average pooling layer according to the size of the feature value;
4.3 classifier: connecting all extracted features and sending output values to a classifier;
4.3.1 converting the output value of the network into a vector;
4.3.2 replacing a full connection layer by adopting a global average pooling technology; or a full connection layer technology is used, and a dropout layer is used in a matching way;
4.4 dropout layer:
the Dropout layer randomly discards a part of input in the training process, and parameters corresponding to the input of the lost part cannot be updated at the moment so as to solve the problem of overfitting and reduce the problem of complex adaptation among neurons;
4.4.1 at first, randomly deleting half of hidden neurons in the network, and keeping the input and output neurons unchanged;
4.4.2 then propagating the input x forward through the modified network and then propagating the resulting loss result backward through the modified network; after a mini-batch of training samples has been processed, updating the corresponding parameters of the undeleted neurons according to a stochastic gradient descent method;
4.4.3 repeat step 4.4.1 and step 4.4.2.
6. The CNN-based aircraft engine fault diagnosis method according to claim 1, wherein the fifth step includes the steps of:
5.1 determining the hyperparameter batch size, namely the range of the batch size, through experiments;
5.2 using convolution kernels with different sizes in the convolution process;
5.3 carrying out a grid search or hyperparameter search of the network model over the number of convolution kernels, and also the size of the convolution kernels, according to a preset rule within a preset range.
CN202011535827.4A 2020-12-23 2020-12-23 CNN-based aeroengine fault diagnosis method Active CN112766303B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011535827.4A CN112766303B (en) 2020-12-23 2020-12-23 CNN-based aeroengine fault diagnosis method

Publications (2)

Publication Number Publication Date
CN112766303A true CN112766303A (en) 2021-05-07
CN112766303B CN112766303B (en) 2024-03-29

Family

ID=75695321

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011535827.4A Active CN112766303B (en) 2020-12-23 2020-12-23 CNN-based aeroengine fault diagnosis method

Country Status (1)

Country Link
CN (1) CN112766303B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant