CN114778112A

CN114778112A - Audio identification and fault diagnosis method for mechanical fault of wind turbine generator system

Info

Publication number: CN114778112A
Application number: CN202110413071.4A
Authority: CN
Inventors: 徐春; 岳永军; 丛智慧; 马亮; 于萍
Original assignee: Zhongke Innovation Beijing Technology Co ltd; Datang Chifeng New Energy Co ltd
Current assignee: Zhongke Innovation Beijing Technology Co ltd; Datang Chifeng New Energy Co ltd
Priority date: 2021-04-16
Filing date: 2021-04-16
Publication date: 2022-07-22

Abstract

The invention belongs to the technical field of wind power generator fault diagnosis, and particularly relates to a mechanical fault audio identification and fault diagnosis method for a wind power generator set. The invention provides a method flow for intelligent bearing fault diagnosis based on vibration fault signal monitoring and a convolutional neural network model. Vibration signals are obtained by an acceleration sensor; historical data are sampled and subjected to a 1D-to-2D signal processing transformation; bearing signal samples in various fault states are divided into a training set and a test set; the training set is fed into the established deep convolutional neural network for model learning; and after model learning is completed, the generalization capability of the model, namely the test accuracy, is verified using the test set.

Description

Audio identification and fault diagnosis method for mechanical fault of wind turbine generator system

Technical Field

The invention relates to the technical field of wind power generator fault diagnosis, in particular to a wind power generator set mechanical fault audio frequency identification and fault diagnosis method.

Background

With the continuous progress of science and technology and continued economic growth, living standards have improved to an unprecedented degree. At the same time, rapid changes in industrial manufacturing and production have placed energy supply under unprecedented pressure. Against this background of energy shortage, the development of clean renewable energy receives more and more attention. Wind energy, as a safe and clean renewable energy source, has great development potential and is an important alternative to traditional fossil fuels. With the continuous development and maturing of wind power technology, wind power generation has entered a period of rapid growth, and wind energy is highly favored both in China and worldwide.

As the scale of wind power generation continues to expand, the operation and maintenance problems of wind turbine generators have become increasingly prominent. In particular, faults and repairs lead to high operation and maintenance costs, which can account for 10 to 25 percent of the total investment. Because of the geographical distribution of wind energy resources, existing large wind farms in China are generally built in remote areas such as the northwest, plateaus and coastal regions. The operating environment of the wind turbine generators is harsh and transportation is inconvenient, which brings many difficulties to operation and maintenance. The wind turbine generators are also exposed to extreme weather such as heavy rainfall, snowfall, storms and lightning. As operating time increases, various abnormalities and faults tend to occur in the blades, the gearbox, the generator and other components, causing the wind turbine generator to operate abnormally or even shut down. The gearbox is an important transmission component of the generator set and has a high failure rate in actual operation: statistically, about 20 percent of downtime is caused by gearbox failure, and most of these failures occur in the bearings. Bearing failure is therefore a main cause of gearbox failure in wind turbines.

Health condition monitoring is indispensable for ensuring that a wind turbine operates safely and stably, and an effective fault diagnosis technology can detect faults of the wind turbine at an early stage. As the installed capacity of wind turbines continues to grow, the traditional method of regular maintenance can no longer meet the actual operation and maintenance requirements of a wind farm. Condition monitoring technology was developed subsequently; it acquires real-time operating data of the wind turbine through sensors to achieve online monitoring and fault diagnosis. However, such systems depend on expert knowledge and working experience and require a large amount of manpower and time. As system complexity, fault diversity and data volume increase, and as online diagnosis is required, effective fault diagnosis by traditional methods becomes difficult, and an effective intelligent diagnosis technology is urgently needed. Therefore, an audio identification and fault diagnosis method for mechanical faults of a wind generating set is provided.

Disclosure of Invention

(I) Objects of the invention

In order to solve the technical problems in the background art, the invention provides a mechanical fault audio frequency identification and fault diagnosis method for a wind generating set, which has the characteristic of intelligent detection.

(II) Technical scheme

In order to solve the technical problem, the invention provides a mechanical fault audio identification and fault diagnosis method for a wind generating set, which comprises fault signal detection and a convolutional neural network. The fault signal detection comprises vibration signal detection, acoustic emission signal detection, strain signal detection, temperature signal detection, oil parameter detection and electrical signal detection. The intelligent diagnosis of bearing faults based on the convolutional neural network comprises data acquisition, feature extraction and feature classification. By relying on the automatic feature extraction capability of a deep learning model, the fault diagnosis problem is converted into a task similar to image recognition and classification: the signal data are preprocessed with overlapped sampling, a model structure is then built, trained and optimized, and whether the bearing state is healthy is diagnosed from the two-dimensional representation of the vibration data.

The model structure is designed as a network of 3 convolution-pooling pairs followed by 2 fully-connected layers. After the samples are obtained and preprocessed, training and optimization are carried out. Each two-dimensional image and its corresponding health state label form one sample pair. A part of the whole sample set is randomly selected according to a given proportion as the training set, and the remaining samples form the test set. The training set is used for model training, i.e., supervised learning on the two-dimensional images and their health state labels. After model training is finished, the test set data are fed to the model without their labels to obtain classification predictions; comparing the predictions with the true results evaluates the generalization capability of the model and reduces the contingency of the experimental results.
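The random division into training and test sets and the comparison of predictions with true labels can be sketched as follows. This is only an illustrative sketch: the function names `split_samples` and `accuracy`, the fixed random seed and the 80/20 ratio used in the usage note are assumptions, not values given in the description.

```python
import random

def split_samples(samples, train_ratio, seed=0):
    """Randomly split (image, label) pairs into a training set and a test set."""
    if not 0.0 < train_ratio < 1.0:
        raise ValueError("train_ratio must be between 0 and 1")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)   # seed is an assumption, for repeatability
    n_train = int(round(train_ratio * len(shuffled)))
    return shuffled[:n_train], shuffled[n_train:]

def accuracy(predicted_labels, true_labels):
    """Fraction of test samples whose predicted label equals the true label."""
    correct = sum(p == t for p, t in zip(predicted_labels, true_labels))
    return correct / len(true_labels)
```

For example, `split_samples(pairs, 0.8)` would keep 80 percent of the sample pairs for training; the labels of the remaining pairs are withheld from the model and used only inside `accuracy`.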

The bearing faults comprise inner ring faults, outer ring faults, rolling body faults and retainer faults, damage points generated by the faults repeatedly collide with the surfaces of other elements to generate periodic fault impact, the periodic fault impact is a low-frequency vibration signal, the larger the fault degree is, the larger the impact amplitude is, and the calculation formula of the fault characteristic frequency of each part of the rolling bearing is as follows:

inner ring failure frequency:

outer ring fault frequency:

frequency of rolling element failure:

cage failure frequency:

In the above formulas, N is the number of rolling elements, D is the bearing diameter, D_b is the diameter of the rolling element, α is the contact angle, and f_r is the rotational frequency of the shaft.
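As an illustration of how these quantities combine, the sketch below evaluates the four characteristic frequencies using the standard textbook kinematic relations for a rolling bearing with a stationary outer ring. It is assumed, not confirmed by the text, that these match the formulas of the invention; in particular, some references define the rolling element fault frequency as twice the value used here. The function name and the numerical values in the usage note are hypothetical.

```python
import math

def bearing_fault_frequencies(n, d, d_b, alpha_deg, f_r):
    """Characteristic fault frequencies (Hz) from the standard kinematic relations.

    n: number of rolling elements, d: bearing (pitch) diameter,
    d_b: rolling element diameter, alpha_deg: contact angle in degrees,
    f_r: shaft rotational frequency in Hz.
    """
    ratio = (d_b / d) * math.cos(math.radians(alpha_deg))
    return {
        "inner_ring": 0.5 * n * (1.0 + ratio) * f_r,
        "outer_ring": 0.5 * n * (1.0 - ratio) * f_r,
        # Textbook ball-spin form; an assumption about the omitted formula.
        "rolling_element": d / (2.0 * d_b) * (1.0 - ratio ** 2) * f_r,
        "cage": 0.5 * (1.0 - ratio) * f_r,
    }
```

Calling `bearing_fault_frequencies(9, 39.0, 7.8, 0.0, 30.0)` returns a dictionary of the four frequencies in hertz; the inner ring and outer ring values always sum to N times f_r, which is a quick sanity check on measured spectra.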

The convolutional neural network learning module consists of an input layer, convolutional layers, pooling layers, fully-connected layers and an output layer. Back propagation is the key step of neural network parameter optimization: the partial derivatives of the objective function with respect to each weight and bias parameter are computed, and the parameters are then updated by an optimization algorithm.

The fault signal detection also comprises composite fault diagnosis. A composite fault is regarded as belonging to several single faults at once, i.e., it must be identified as several fault modes simultaneously, so the model becomes a single-input multi-output model. Each composite fault in the set must be identified as its corresponding single fault modes, and the problem can be treated as a multi-label classification problem in pattern recognition.
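One way to picture the multi-label formulation is to encode each sample as a 0/1 vector with one position per single fault mode, so that a composite fault sets several positions at once. The sketch below is only an assumption about such an encoding: the fault names, the all-zero vector for a healthy bearing and the 0.5 decision threshold are illustrative choices, not details stated in the description.

```python
# Hypothetical ordering of the single fault modes.
SINGLE_FAULTS = ["inner_ring", "outer_ring", "rolling_element", "cage"]

def multi_label_target(active_faults):
    """Encode the set of simultaneously present single faults as a 0/1 vector."""
    unknown = set(active_faults) - set(SINGLE_FAULTS)
    if unknown:
        raise ValueError(f"unknown fault names: {sorted(unknown)}")
    return [1 if name in active_faults else 0 for name in SINGLE_FAULTS]

def decode_multi_label(scores, threshold=0.5):
    """Turn per-fault scores in [0, 1] into the list of detected single faults."""
    return [name for name, s in zip(SINGLE_FAULTS, scores) if s >= threshold]
```

With this encoding, a combined inner ring and outer ring fault becomes `[1, 1, 0, 0]`, and a model with one output per fault mode can report both components at the same time.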

Preferably, the convolutional layer is the key to the image recognition capability and features sparse connectivity and weight sharing. The input layer receives an image, either a single-channel grayscale image or a three-channel color image. The output layer is a Softmax classifier that outputs the image classification result; special output layer structures exist for target detection or image segmentation. The convolutional layer is used for feature learning, the pooling layer for feature selection, and the classifier performs classification using the learned deep features. The network parameters of all layers are optimized simultaneously during training.

Preferably, the data preprocessing rearranges the sampled values of the one-dimensional signal, in order, into a two-dimensional matrix, which is regarded as a vibration grayscale image; when constructing the samples, overlapped sampling with a fixed segment length is used to increase the number of samples.
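The preprocessing described here can be sketched in a few lines. The segment length, the step between segment starts and the scaling of each segment to the range [0, 1] are assumptions made for illustration; the function names are hypothetical.

```python
def overlapped_segments(signal, segment_length, step):
    """Cut a 1-D signal into fixed-length segments whose starts are `step` apart.

    Choosing step < segment_length makes neighbouring segments overlap,
    which yields more samples from the same recording.
    """
    if step <= 0 or segment_length <= 0:
        raise ValueError("segment_length and step must be positive")
    last_start = len(signal) - segment_length
    return [signal[i:i + segment_length] for i in range(0, last_start + 1, step)]

def to_gray_image(segment, side):
    """Rearrange side*side samples row by row and min-max scale them to [0, 1]."""
    if len(segment) != side * side:
        raise ValueError("segment length must equal side * side")
    lo, hi = min(segment), max(segment)
    span = (hi - lo) or 1.0  # avoid division by zero for a constant segment
    return [[(segment[r * side + c] - lo) / span for c in range(side)]
            for r in range(side)]
```

For instance, a segment of 4096 samples would be rearranged into a 64 × 64 image; those numbers are only an example and are not taken from the description.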

Preferably, the fault signal detection is carried out on the part to be detected through a sensor for data acquisition and monitoring control.

Preferably, the convolutional neural network verifies the validity through the simulation signal and the test bed signal.

Preferably, the fault signal detection is performed with convolutional neural network-based simulation training under noise interference, so that the noise immunity of the model is increased.
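A common way to simulate noise interference is to add white Gaussian noise at a chosen signal-to-noise ratio before the signal is converted into training samples. The description does not state which noise model or noise levels are used, so the Gaussian model, the SNR parameter and the function name below are all assumptions.

```python
import math
import random

def add_white_noise(signal, snr_db, seed=0):
    """Return the signal plus white Gaussian noise at the requested SNR in dB."""
    rng = random.Random(seed)
    signal_power = sum(v * v for v in signal) / len(signal)
    # SNR(dB) = 10 * log10(signal_power / noise_power)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [v + rng.gauss(0.0, sigma) for v in signal]
```

Training on copies of the same signal at several SNR values is one way such a routine could be used to make the model less sensitive to noise.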

Preferably, after the model structure is built, the objective function design and the model training are carried out.

Preferably, the vibration signal monitoring and convolutional neural network model is used for identifying different fault types and different fault degrees, detecting fault signals under noise interference and effectively identifying composite faults.

The technical scheme of the invention has the following beneficial technical effects:

1. The invention provides a method flow for intelligent bearing fault diagnosis based on vibration fault signal monitoring and a convolutional neural network model. Vibration signals are obtained by an acceleration sensor; historical data are sampled and subjected to a 1D-to-2D signal processing transformation; bearing signal samples in various fault states are divided into a training set and a test set; the training set is fed into the established deep convolutional neural network for model learning; and after model learning is completed, the generalization capability of the model, namely the test accuracy, is verified using the test set. The tested convolutional neural network model can further be deployed in a wind turbine monitoring and control system to realize online real-time condition monitoring and fault diagnosis.

2. The effectiveness of the fault diagnosis method based on the convolutional neural network is verified with both simulation signals and test bed signals. The simulation uses a simplified mathematical model of bearing vibration, which simplifies the problem to a certain extent and allows the effectiveness of the method to be verified quickly. The test bed data set consists of data acquired from real faulty bearings in operation; tests are performed with data covering various fault types and fault severities, so that the effectiveness and superiority of the intelligent diagnosis algorithm are analyzed and evaluated comprehensively.

3. Experiments show that, provided composite fault samples are available, satisfactory fault identification accuracy is obtained whether the composite fault is treated as a separate class or the model is trained with a multi-label method. For the case where real composite fault data are lacking, a method is studied in which pseudo composite fault signal samples are constructed from the intrinsic mode functions obtained by empirical mode decomposition of single fault signals and added to model training. The multi-label classification convolutional neural network model thereby gains an initial capability to diagnose and identify real composite faults, and compared with a model trained only on single fault samples, the accuracy of identifying a composite fault as simultaneously containing several single fault components is greatly improved.

Drawings

FIG. 1 is a schematic view of a bearing structure according to the present invention;

FIG. 2 is a schematic diagram of the convolution operation after adjusting the moving step length according to the present invention;

FIG. 3 is a schematic diagram of the convolution operation after padding is added in accordance with the present invention;

FIG. 4 is a schematic diagram of the three-dimensional convolution operation of the present invention;

FIG. 5 is a graph of a conventional activation function of the present invention;

FIG. 6 is a schematic diagram of the maximum pooling and average pooling operation of the present invention;

FIG. 7 is a schematic diagram of a fully connected layer and an output layer according to the present invention;

FIG. 8 is a CNN-based bearing fault diagnosis flowchart of the present invention;

FIG. 9 is a schematic diagram of overlapped sampling in signal preprocessing according to the present invention;

FIG. 10 is a schematic representation of two-dimensional representation of signal pre-processing vibration data in accordance with the present invention;

FIG. 11 is a schematic diagram of a convolutional neural network structure according to the present invention;

FIG. 12 is a time domain waveform of two fault signals of the present invention;

FIG. 13 is a signal diagram illustrating simulation of normal and fault conditions according to the present invention;

FIG. 14 is a visualization graph of t-SNE characteristics of simulation test results of the present invention;

FIG. 15 is a schematic diagram of a confusion matrix of simulation test results according to the present invention;

FIG. 16 is a waveform of a vibration signal containing noise interference according to the present invention;

FIG. 17 is a graph of the results of a diagnostic test of a noisy interference signal in accordance with the present invention;

FIG. 18 is a schematic diagram of single-label fault classification according to the present invention;

FIG. 19 is a fault classification multi-label diagram of the present invention;

FIG. 20 is a diagram of a pseudo-composite fault signal construction process in accordance with the present invention.

Fig. 21 simulation test results: (a) visualization of t-SNE characteristics; (b) test result confusion matrix

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the accompanying drawings in conjunction with the following detailed description. It should be understood that the description is intended to be exemplary only, and is not intended to limit the scope of the present invention. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present invention.

As shown in FIG. 1, the mechanical fault audio identification and fault diagnosis method for a wind generating set provided by the invention comprises fault signal detection and a convolutional neural network. The fault signal detection comprises vibration signal detection, acoustic emission signal detection, strain signal detection, temperature signal detection, oil parameter detection and electrical signal detection. The intelligent diagnosis of bearing faults based on the convolutional neural network comprises data acquisition, feature extraction and feature classification. By relying on the automatic feature extraction capability of a deep learning model, the fault diagnosis problem is converted into a task similar to image recognition and classification: the signal data are preprocessed with overlapped sampling, a model structure is then built, trained and optimized, and whether the bearing state is healthy is diagnosed from the two-dimensional representation of the vibration data.

The bearing faults comprise an inner ring fault, an outer ring fault, a rolling body fault and a retainer fault, damage points generated by the faults repeatedly collide with the surfaces of other elements to generate periodic fault impact, the periodic fault impact is a low-frequency vibration signal, the larger the fault degree is, the larger the impact amplitude is, and the calculation formula of the fault characteristic frequency of each part of the rolling bearing is as follows:

inner ring failure frequency:

outer ring fault frequency:

frequency of rolling element failure:

cage failure frequency:

When a certain type of fault occurs, a spectral peak appears at the corresponding fault frequency in the bearing vibration spectrum. Owing to manufacturing, machining and installation errors, elastic deformation under load, signal noise and similar factors, the fault frequency components actually observed may not coincide exactly with the values calculated from the formulas, but they are generally close to the theoretical values, so under ideal conditions the fault type can be preliminarily judged from them.

In addition, after fault impact, the impact pulse can also cause high-frequency vibration of the bearing, the vibration frequency of the high-frequency vibration is the natural vibration frequency of the bearing, the natural frequency of the inner ring and the outer ring of the bearing can reach thousands of hertz generally, the natural frequency of the rolling body can be higher and can reach hundreds of kilohertz, the vibration performance of the inner ring and the outer ring in the natural vibration is most obvious, and the natural frequency of the inner ring and the outer ring is as follows:

where n is the vibration order (deformation coefficient), E is the elastic modulus of the material in kg/m², I is the moment of inertia of the ring cross-section, D is the diameter of the neutral axis, and M is the mass per unit length. For steel balls, the natural frequency can be expressed by equation (2-6), where d is the ball diameter and ρ is the material density.

The natural frequency vibration caused by the short-time pulse impact emitted when the bearing is impacted by local faults can be measured by a sensor and is represented as sine vibration which is attenuated exponentially, the larger the natural frequency and the attenuation exponent are, the shorter the vibration existing time is, and the high-frequency vibration is caused by the impact generated by each fault.
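The behaviour described in this paragraph, an exponentially decaying sinusoid excited again by every fault impact, can be reproduced with a very small simulation. This is a simplified sketch and not the simulation model of the invention; the function names and all numerical values in the usage note are assumptions.

```python
import math

def impulse_response(t, natural_freq_hz, decay_rate, amplitude=1.0):
    """Exponentially decaying sine excited by one impact at t = 0."""
    if t < 0:
        return 0.0
    return (amplitude * math.exp(-decay_rate * t)
            * math.sin(2.0 * math.pi * natural_freq_hz * t))

def simulated_fault_signal(duration_s, sample_rate_hz, fault_freq_hz,
                           natural_freq_hz, decay_rate):
    """Sum the responses to impacts repeating at the fault characteristic frequency."""
    period = 1.0 / fault_freq_hz
    n_samples = round(duration_s * sample_rate_hz)
    n_impacts = int(duration_s * fault_freq_hz) + 1
    signal = []
    for k in range(n_samples):
        t = k / sample_rate_hz
        signal.append(sum(impulse_response(t - m * period, natural_freq_hz, decay_rate)
                          for m in range(n_impacts)))
    return signal
```

For example, `simulated_fault_signal(0.1, 12000, 100.0, 3000.0, 800.0)` produces 0.1 s of signal with an impact every 10 ms; increasing `decay_rate` shortens the time each burst of high-frequency vibration remains visible.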

The convolutional neural network learning module is composed of an input layer, convolutional layers, pooling layers, fully-connected layers and an output layer. Back propagation is the key step of neural network parameter optimization: the partial derivatives of the objective function with respect to each weight and bias parameter are computed, and the parameters are then updated by an optimization algorithm. For a computer, the input image is a matrix of pixel values (each between 0 and 255), and min-max normalization is usually applied before the image is fed into the neural network.

The convolutional layer performs convolution on the input matrix of pixel values in order to extract features from the image and output a feature map. In a network with several convolutional layers, each subsequent layer convolves the feature map produced by the previous layer to extract deeper features. A convolutional layer consists of a series of convolution kernels (or filters); typical kernel sizes are 3 × 3 and 5 × 5. Besides length and width, a kernel also has a depth, or number of channels, which equals the number of input channels. The larger the length and width of the kernel, the larger the image area covered by a single convolution operation. The convolution operation preserves the spatial relationship between pixels. All kernels in one convolutional layer have the same size and dimensions, and the number of kernels determines the number of channels of the output feature map.

The convolutional layer also has two hyper-parameters to set: the stride and the padding. The stride is the distance the convolution kernel moves at each step; for example, a stride of 1 means the kernel slides pixel by pixel along one direction, and a stride of 2 means it moves by 2 pixels each time. The larger the stride, the less the regions covered by successive convolution operations overlap, which reduces redundant computation but may also lead to insufficient feature extraction. In practical applications the stride is mostly 1 or 2, and larger strides are rare. If n denotes the input side length, f the kernel side length, and s the stride, the output size is given by equation (2-1):

$$\left\lfloor \frac{n - f}{s} \right\rfloor + 1 \tag{2-1}$$

In equation (2-1), floor denotes rounding down to an integer. Since f and s are both at least 1, the output size is always smaller than or equal to the input size; if the input and output dimensions are to be kept the same, a padding operation must be added. In addition, pixels at the edge of the input take part in fewer convolution calculations than those in the middle, so padding around the input matrix is also needed to reduce the effect of insufficient edge feature extraction. Common padding modes are: (1) Valid, i.e., no padding, where the operation is performed only when the kernel completely covers image pixels; and (2) Same, where the input is padded around its border so that the operation is performed whenever the kernel center coincides with an image pixel, and the side length of the feature map remains unchanged after convolution when the stride is 1, as shown in equation (2-2). If the padding size is denoted by p, the output size after the stride and padding operations is given by equation (2-3):

$$\left\lfloor \frac{n + 2p - f}{s} \right\rfloor + 1 \tag{2-3}$$
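The size bookkeeping can be checked with a short helper based on the standard relation floor((n + 2p − f) / s) + 1, which reduces to the no-padding case when p = 0. The function names are hypothetical, and the Same-padding helper assumes an odd kernel size so that the padding is symmetric.

```python
def conv_output_size(n, f, s=1, p=0):
    """Side length after convolving an n x n input with an f x f kernel."""
    if f > n + 2 * p:
        raise ValueError("kernel does not fit inside the padded input")
    return (n + 2 * p - f) // s + 1

def same_padding(f):
    """Padding that keeps the side length unchanged at stride 1 (odd f only)."""
    if f % 2 == 0:
        raise ValueError("symmetric Same padding needs an odd kernel size")
    return (f - 1) // 2
```

For a 28 × 28 input and a 5 × 5 kernel, `conv_output_size(28, 5)` gives 24 without padding, while `conv_output_size(28, 5, p=same_padding(5))` gives 28 again.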

By choosing the stride s and the padding p together, the size of the feature map can be kept unchanged after the convolution operation. The above takes a square input of equal length and width as an example; the same principle applies to inputs with different length and width. The convolution operation after adjusting the stride is shown in FIG. 2, and the convolution operation after adding padding is shown in FIG. 3.

When the input image has several channels, for example a color image with the three RGB color channels, the convolution kernel must have the same number of channels. Each channel of the kernel is convolved with the corresponding channel of the input in the manner described above, and the per-channel results are then added element-wise to obtain the final feature map. FIG. 4 is a schematic diagram of the three-dimensional convolution operation for a three-channel input.
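A direct, unoptimized sketch of this multi-channel operation for a single kernel is given below. It uses nested Python lists, no padding and no bias, and implements cross-correlation (the kernel is not flipped), as is usual in convolutional networks; the function name is hypothetical.

```python
def conv2d_multichannel(image, kernel, stride=1):
    """Cross-correlate a C x H x W input with one C x f x f kernel (no padding)."""
    channels, height, width = len(image), len(image[0]), len(image[0][0])
    f = len(kernel[0])
    out_h = (height - f) // stride + 1
    out_w = (width - f) // stride + 1
    output = [[0.0] * out_w for _ in range(out_h)]
    for i in range(out_h):
        for j in range(out_w):
            total = 0.0
            for c in range(channels):  # per-channel results are summed
                for u in range(f):
                    for v in range(f):
                        total += (image[c][i * stride + u][j * stride + v]
                                  * kernel[c][u][v])
            output[i][j] = total
    return output
```

One kernel yields one output channel regardless of the number of input channels; a layer with several kernels simply repeats this computation once per kernel.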

In addition, the result of the convolution operation usually has to pass through a nonlinear mapping, i.e., an activation layer. The problems a convolutional neural network is expected to solve in practice are often nonlinear, whereas convolution and summation are linear operations, so the activation layer is introduced to add a nonlinear factor. The activation layer applies an activation function to the convolution result. Commonly used activation functions include the Sigmoid function, the hyperbolic tangent function (Tanh) and the rectified linear unit (ReLU); all three perform well and are the most widely used in practice, with ReLU being the default choice of most models. The expressions and curves of the three activation functions are shown in FIG. 5.

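$$\mathrm{sigmoid}(z) = \frac{1}{1 + e^{-z}} \tag{2-4}$$

$$\tanh(z) = \frac{e^{z} - e^{-z}}{e^{z} + e^{-z}} \tag{2-5}$$

$$\mathrm{relu}(z) = \max\{0, z\} \tag{2-6}$$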

As can be seen from FIG. 5, the Sigmoid and Tanh functions have an obvious disadvantage: they saturate easily. When the input is very large or very small, the function value hardly changes, which makes training difficult when the network is deep and easily causes the vanishing gradient problem. For the ReLU activation function, by contrast, the derivative is constantly 1 when the input is greater than 0 and constantly 0 when the input is less than 0, so better results are obtained when training with the back propagation algorithm.
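The saturation effect can be made concrete by evaluating the derivatives of the three functions at a small and a large input. The sketch below uses the standard definitions of Sigmoid, Tanh and ReLU; the function names are illustrative.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)

def tanh_grad(z):
    return 1.0 - math.tanh(z) ** 2

def relu(z):
    return max(0.0, z)

def relu_grad(z):
    # Derivative is 1 for positive inputs and 0 for negative inputs.
    return 1.0 if z > 0 else 0.0
```

At z = 10 the Sigmoid and Tanh gradients are already close to zero, whereas the ReLU gradient is still 1, which is why gradients pass through deep stacks of ReLU layers more easily.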

In summary, the forward propagation of a convolutional layer can be expressed by the following equation, where the superscript l denotes the layer index, a denotes the layer output (activation), * denotes the convolution (cross-correlation) operation, W denotes the weights, b denotes the bias, and σ denotes the activation function.

$$a^{l} = \sigma(z^{l}) = \sigma(a^{l-1} * W^{l} + b^{l}) \tag{2-7}$$

In addition, a pooling layer, also called a downsampling layer, is usually added after one or more convolutional layers to reduce dimensionality and thereby the number of network parameters and the amount of computation. The pooling layer not only reduces the size of the model and shortens training time, but also enhances the robustness of the extracted features and helps prevent overfitting. Pooling is generally divided into maximum pooling and average pooling, computed as given by formulas (2-8) and (2-9): maximum pooling outputs the maximum value in the window, and average pooling outputs the average of all values in the window. FIG. 6 illustrates the two pooling operations; maximum pooling is the one mainly used in practice.

Like the convolutional layer, the pooling layer has several hyper-parameters to set: the sampling window size, the stride and the padding. Let the size of the input feature map be n_H × n_W × n_c, denoting height, width and number of channels respectively, and let the window size f and stride s follow the same notation as for the convolutional layer. The output size of the pooling layer is then given by equation (2-10):

$$\left(\frac{n_H - f}{s} + 1\right) \times \left(\frac{n_W - f}{s} + 1\right) \times n_c \tag{2-10}$$

The expression omits the floor function and the padding parameter because, in practice, most models use maximum pooling with a 2 × 2 window and a stride of 2, padding is generally not used, and the side length of the feature map is generally even. Such a pooling operation halves the length and width of the feature map, which effectively reduces the size of the model and speeds up computation. Note that the pooling layer has no parameters to be learned during model training.
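Both pooling modes can be sketched for a single channel as follows. The function name and the default 2 × 2 window with stride 2 follow the common configuration mentioned above; padding is not supported in this sketch.

```python
def pool2d(feature_map, window=2, stride=2, mode="max"):
    """Max or average pooling over one H x W channel (no padding)."""
    if mode not in ("max", "avg"):
        raise ValueError("mode must be 'max' or 'avg'")
    height, width = len(feature_map), len(feature_map[0])
    out_h = (height - window) // stride + 1
    out_w = (width - window) // stride + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            values = [feature_map[i * stride + u][j * stride + v]
                      for u in range(window) for v in range(window)]
            row.append(max(values) if mode == "max" else sum(values) / len(values))
        output.append(row)
    return output
```

Applied to a 4 × 4 feature map, either mode returns a 2 × 2 result, halving both side lengths; a multi-channel feature map is pooled channel by channel, so the number of channels is unchanged.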

The convolutional layers (including the activation part) and the pooling layers can be regarded as the feature extraction layers of a convolutional neural network, and the extracted features are finally used for image classification or target detection. The fully-connected layer is identical to a layer of a traditional feedforward neural network and can be regarded as a feature classification layer, so it is not described in detail here. All extracted features are flattened and then connected to one or more fully-connected layers for model learning. The fully-connected layer can be regarded as a buffer between the feature map and the output layer, as shown in FIG. 7; it is equivalent to a hidden layer of a traditional neural network and enhances the model's ability to recognize the extracted image features. The expression of the fully-connected layer is as follows:

$$a^{l} = \sigma(z^{l}) = \sigma(W^{l} a^{l-1} + b^{l}) \tag{2-11}$$
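The flattening step and the fully-connected forward pass can be illustrated with plain lists. The helper names are hypothetical, and ReLU is used as the activation purely as an example.

```python
def flatten(feature_maps):
    """Flatten a C x H x W nested list into one feature vector."""
    return [v for channel in feature_maps for row in channel for v in row]

def relu(z):
    return max(0.0, z)

def dense_forward(weights, biases, inputs, activation=relu):
    """Compute a = activation(W a_prev + b) for one fully-connected layer.

    weights has one row per output neuron; each row has one entry per input.
    """
    return [activation(sum(w * a for w, a in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]
```

Each output neuron is connected to every element of the flattened feature vector, which is why this layer usually holds most of the trainable parameters of a small network.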

The output layer of a convolutional neural network differs according to the purpose of the model; in image classification tasks it is generally a Softmax classifier. Assuming that the classification task has k classes in total, the output of the Softmax function can be expressed as:

$$\mathrm{softmax}(z_i) = \frac{e^{z_i}}{\sum_{j=1}^{k} e^{z_j}}, \quad i = 1, \dots, k$$

In the formula, z_i denotes the pre-activation output of the i-th neuron in the last layer of the network. Note that the formula assumes that the number of neurons in the last layer equals the total number of classes. Some references regard the last layer of the network as part of the Softmax classifier; in essence it is also a fully-connected layer whose activation function is the Softmax function, and here only the result of the Softmax operation is defined as the output layer. Inspection of the formula shows that the terms sum to 1, so the Softmax function maps the scalars output by the fully-connected layer to a normalized probability distribution and thus outputs the confidence of the classification result. Each output term can be regarded as the probability of belonging to the corresponding class: the larger the output of a neuron, the higher the probability that its class is the true class, and the term with the highest probability is taken as the final classification prediction.
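A minimal sketch of the Softmax output and the final class decision is shown below. It uses the standard definition, exp(z_i) divided by the sum of exp(z_j) over all k classes; subtracting the maximum before exponentiating is a common numerical safeguard added here and does not change the result. The function names are illustrative.

```python
import math

def softmax(z):
    """Map k raw scores to a probability distribution that sums to 1."""
    m = max(z)  # shift for numerical stability; the ratios are unchanged
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def predict_class(z):
    """Index of the class with the highest Softmax probability."""
    probs = softmax(z)
    return max(range(len(probs)), key=probs.__getitem__)
```

Because the exponential is monotonic, the predicted class is simply the index of the largest raw score; the Softmax values add the confidence attached to that decision.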

After the convolutional neural network model is built, the depth of the network and the sizes of the convolution kernels are determined and the classes to be distinguished are known, but the specific network parameters (weights and biases) still have to be trained and optimized. The setting of training parameters such as the learning rate is also very important: unreasonable settings can cause the model to overfit or underfit, leaving it with insufficient generalization capability. This section mainly introduces the basic objective function design and the means of model training.

In deep learning tasks, a cost function together with any necessary regularization term is often used as the objective function. The cost function measures the error between the classification prediction and the true value. The cost function commonly used in classification tasks is the average of the cross-entropy losses of the samples. Cross entropy measures the similarity of two probability distributions: with the target distribution denoted p(x) and the predicted distribution denoted q(x), the cross entropy between the two is defined as follows:

$$H(p, q) = -\sum_{x} p(x) \log q(x) \tag{2-14}$$

If the label is represented as a one-hot vector, the label of a k-class classification problem is a target vector of length k: [p^1, ..., p^j, ..., p^k]. If the class of sample i is y_i = c, then p^c = 1 and all other terms are 0. The final objective function can then be expressed as formula (2-15), where the indicator 1{c = y_i} equals 1 when the condition in braces is satisfied and 0 otherwise.
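As an illustration of how the indicator term selects a single log-probability per sample, the following sketch computes the average cross entropy for one-hot labels. It is an assumption-laden example: the regularization term is omitted, the probabilities are hypothetical, and `one_hot` and `mean_cross_entropy` are illustrative helper names.

```python
import numpy as np

def one_hot(labels, k):
    """Build the target vectors [p^1, ..., p^k] with p^c = 1 for c = y_i."""
    out = np.zeros((len(labels), k))
    out[np.arange(len(labels)), labels] = 1.0
    return out

def mean_cross_entropy(probs, labels):
    """Average over samples of -sum_c 1{c = y_i} * log q_i(c)."""
    p = one_hot(labels, probs.shape[1])
    return float(-(p * np.log(probs + 1e-12)).sum(axis=1).mean())

probs = np.array([[0.7, 0.2, 0.1],     # hypothetical Softmax outputs, k = 3
                  [0.1, 0.8, 0.1]])
labels = np.array([0, 1])              # true classes y_i
loss = mean_cross_entropy(probs, labels)
```

Only the probability assigned to the true class of each sample enters the loss, so here the loss is the mean of -log 0.7 and -log 0.8.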

Back propagation is the key step of neural network parameter optimization: after the partial derivatives of the objective function have been computed, each weight and bias parameter is updated according to an optimization algorithm. As noted above, the objective function is a measure of error, so back propagation essentially transfers the error between the output value and the target value or, more precisely, the sensitivity of the error to the network parameters. Unlike back propagation in an ordinary deep neural network, error propagation through pooling and convolutional layers needs special treatment; the pooling layer has no parameters to learn, and back propagation through a fully connected layer is essentially the same as in the traditional BP algorithm. For convenience, the expressions in the rest of this section are written for a single neuron or a single channel of a single filter.

To simplify the calculation of error propagation, δ^l is defined as the sensitivity of the objective function J with respect to z^l, also called the error, given in partial derivative form in equation (2-16). The expression for z^l depends on whether the current layer is a fully connected layer, a convolutional layer or a pooling layer, as shown in equation (2-17). The superscript l on every symbol denotes the l-th layer of the neural network.

When the error δ^l of a fully connected layer is known, the error δ^{l-1} of the previous hidden layer and the sensitivities of the objective function to the weights and biases follow directly from the BP algorithm, as shown in equations (2-18) to (2-20).
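The textbook BP relations for a fully connected layer can be sketched as follows. The equations themselves are not reproduced in this text, so the code assumes the standard form z^l = W^l a^{l-1} + b^l with a sigmoid activation in the previous layer; the layer sizes, the toy objective and the name `dense_backward` are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def dense_backward(delta_l, W_l, a_prev, z_prev):
    """One standard BP step through a layer z^l = W^l a^{l-1} + b^l."""
    grad_W = np.outer(delta_l, a_prev)               # dJ/dW^l
    grad_b = delta_l.copy()                          # dJ/db^l
    s = sigmoid(z_prev)
    delta_prev = (W_l.T @ delta_l) * s * (1.0 - s)   # delta^{l-1}, sigmoid assumed
    return delta_prev, grad_W, grad_b

rng = np.random.default_rng(0)
W = rng.normal(size=(2, 3))          # hypothetical 3 -> 2 layer
b = rng.normal(size=2)
z_prev = rng.normal(size=3)
a_prev = sigmoid(z_prev)
z_l = W @ a_prev + b
delta_l = z_l                        # error for the toy objective J = 0.5 * ||z^l||^2
delta_prev, grad_W, grad_b = dense_backward(delta_l, W, a_prev, z_prev)
```

A finite-difference check of J with respect to `z_prev` is a simple way to confirm that `delta_prev` really is the sensitivity of the objective to the previous layer.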

When the error δ^l of a pooling layer is known, δ^{l-1} of the previous hidden layer is derived as follows. During forward propagation the pooling layer generally down-samples with max pooling or average pooling, so back propagation needs one up-sampling step to restore the larger area. Every submatrix of δ^l is first restored to its pre-pooling size. For max pooling, each value of δ^l is placed at the position that held the maximum of the corresponding pooling region during forward propagation; for average pooling, each value is divided equally over the corresponding pooling region. This enlargement and redistribution of the pooling error matrix is denoted upsample(·). Because the pooling layer has no parameters, no gradients with respect to weights or biases need to be considered.
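The upsample operation can be illustrated for a single channel. The sketch below assumes non-overlapping pooling regions (stride equal to the pooling size of 2); the input `x` and the error matrix `delta` are hypothetical values, and the function names are illustrative.

```python
import numpy as np

def upsample_avg(delta, pool=2):
    """Average pooling: spread each error value evenly over its pool x pool region."""
    return np.kron(delta, np.ones((pool, pool))) / (pool * pool)

def upsample_max(delta, x, pool=2):
    """Max pooling: route each error value to the position of the forward-pass maximum."""
    out = np.zeros_like(x, dtype=float)
    for i in range(delta.shape[0]):
        for j in range(delta.shape[1]):
            region = x[i * pool:(i + 1) * pool, j * pool:(j + 1) * pool]
            r, c = np.unravel_index(np.argmax(region), region.shape)
            out[i * pool + r, j * pool + c] = delta[i, j]
    return out

x = np.array([[1., 5., 2., 0.],      # hypothetical pre-pooling feature map
              [3., 4., 1., 7.],
              [0., 2., 9., 8.],
              [6., 1., 3., 4.]])
delta = np.array([[1.0, 2.0],        # hypothetical error of the pooling layer
                  [3.0, 4.0]])
d_max = upsample_max(delta, x)
d_avg = upsample_avg(delta)
```

In both cases the total error is preserved: it is either concentrated at the four maximum positions or spread uniformly over each 2 × 2 region.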

Suppose δ^l is the error of a convolutional layer. The expression for δ^{l-1} is then different: the error of the previous hidden layer is given by (2-22), where rot180(·) denotes rotating the matrix by 180 degrees, i.e. flipping it once vertically and once horizontally, a result that follows from the back propagation derivation. In addition, the convolution operation in back propagation needs padding to align the dimensions.
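The role of rot180(·) and of the padding can be seen in a single-channel sketch. It assumes the forward pass is a stride-1 cross-correlation without padding and computes only the sensitivity with respect to the layer input (the multiplication by the activation derivative of the previous layer is left out); all function names are illustrative.

```python
import numpy as np

def correlate_valid(a, w):
    """Forward pass of one channel: stride-1 cross-correlation without padding."""
    k = w.shape[0]
    rows, cols = a.shape[0] - k + 1, a.shape[1] - k + 1
    out = np.zeros((rows, cols))
    for i in range(rows):
        for j in range(cols):
            out[i, j] = np.sum(a[i:i + k, j:j + k] * w)
    return out

def rot180(w):
    """Flip the kernel once vertically and once horizontally."""
    return w[::-1, ::-1]

def conv_backward_input(delta, w):
    """Zero-pad delta by k - 1 on every side, then correlate with rot180(w)."""
    k = w.shape[0]
    padded = np.pad(delta, k - 1)    # padding restores the input dimensions
    return correlate_valid(padded, rot180(w))

rng = np.random.default_rng(1)
a = rng.normal(size=(5, 5))          # hypothetical layer input
w = rng.normal(size=(3, 3))          # hypothetical 3 x 3 kernel
delta = rng.normal(size=(3, 3))      # error at the convolution output
grad_a = conv_backward_input(delta, w)
```

The result `grad_a` has the same 5 × 5 shape as the input, which is exactly what the padding is for.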

The optimization algorithm: after back propagation has been computed, the weights are updated by an optimization algorithm. The most common optimization algorithm for neural networks is gradient descent, whose basic idea is to decrease the objective function by moving the variables along the negative gradient direction. According to the number of samples used in each update, it is divided into stochastic gradient descent, batch gradient descent and mini-batch gradient descent. Batch gradient descent uses all training samples, so it represents the overall sample distribution and can be computed in parallel; its drawback is that training is slow when the training set is large, because every iteration processes all samples. Mini-batch gradient descent is a compromise between the other two: the number of samples used per iteration, i.e. the batch size, has to be set, and it achieves both good computational efficiency and fast parameter updates. For neural network models the gradient can be computed efficiently with the BP algorithm, which makes gradient descent well suited to training them. The following table shows the algorithm flow of mini-batch gradient descent for network optimization.

Mini-batch stochastic gradient descent optimization process
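The mini-batch procedure can be sketched on a toy problem. This is an illustrative example only: a linear model with a mean squared error gradient stands in for the network and its BP gradient, and the learning rate, batch size and epoch count are assumed values.

```python
import numpy as np

def minibatch_sgd(X, y, grad_fn, theta, lr=0.1, batch_size=16, epochs=50, seed=0):
    """Update theta along the negative gradient, one mini-batch at a time."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    for _ in range(epochs):
        order = rng.permutation(n)                 # reshuffle the samples every epoch
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            theta = theta - lr * grad_fn(theta, X[idx], y[idx])
    return theta

def mse_grad(theta, Xb, yb):
    """Gradient of the mean squared error of a linear model (stand-in for BP)."""
    return 2.0 * Xb.T @ (Xb @ theta - yb) / len(yb)

rng = np.random.default_rng(42)
X = rng.normal(size=(200, 3))
true_theta = np.array([1.5, -2.0, 0.5])
y = X @ true_theta                                 # noiseless toy targets
theta = minibatch_sgd(X, y, mse_grad, np.zeros(3))
```

Setting `batch_size` to 1 gives stochastic gradient descent and setting it to the full sample count gives batch gradient descent, which shows why the mini-batch variant is a compromise between the two.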

In practical deep learning training, several techniques are often combined to obtain the fastest and best result, although finding the best strategy usually takes repeated trials. A general rule of thumb is to apply batch normalization to the input of each convolutional or fully connected layer and to apply dropout after the fully connected layers, which greatly reduces the likelihood of overfitting. The learning rate is usually searched by starting from 0.001 or smaller and increasing it in steps of ten or five times until the most suitable learning rate hyperparameter is found.

FIG. 8 shows the diagnostic flow from sample acquisition to data pre-processing to model training. Once the model is trained and optimized, the model can be used for classification and identification of unknown state samples so as to diagnose whether the bearing state is healthy or not.

As shown in figs. 9-10, the fault signal detection further includes comprehensive composite fault diagnosis. A composite fault is considered to belong to several single faults at once, i.e. it has to be identified as several fault patterns simultaneously. The model then becomes a single-input multi-output model in which each predefined composite fault must be identified as its corresponding single fault patterns; in pattern recognition this is a multi-label classification problem.

It should be noted that the convolutional neural network has strong image recognition capability and is characterized by sparse connections and weight sharing. The input layer takes images, either single-channel gray-scale maps or three-channel color maps. The output layer is a Softmax classifier that outputs the image classification result; in object detection or image segmentation the output layer has a special structure. The convolutional layers perform feature learning, the pooling layers perform feature selection, and the classifier uses the learned deep features to produce the classification output. The network parameters of all layers are optimized simultaneously during training.

Furthermore, the data preprocessing rearranges the sampled values of the one-dimensional signal, in order, into a two-dimensional matrix, which is regarded as a vibration gray-scale map. When samples are constructed, overlapping sampling with a fixed window length is used to increase the number of samples; in this system, signal sampling and sample construction are performed in this way by default. The length of a single signal sample (the number of sampling points it contains) can be adjusted according to the sampling frequency, the rotation frequency of the system and so on. To suit the convolutional neural network and simplify the design of the network structure parameters, the side length of the input image is usually even, and preferably a power of 2. Commonly used sample lengths are 400 (20 × 20), 1024 (32 × 32) and 4096 (64 × 64). Many publications have obtained research results with this signal processing method, so it is used directly here as one of the signal preprocessing methods, and the related diagnostic test results serve as a reference for comparison. In addition, the invention uses the short-time Fourier transform (STFT) and the continuous wavelet transform (CWT) as alternative signal processing methods for comparison, in order to evaluate comprehensively the feasibility and efficiency of the algorithm system.
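The rearrangement of a one-dimensional record into overlapping gray-scale samples can be sketched as follows. The sketch makes several assumptions: the 32 × 32 size and 30% overlap are taken from the simulation experiment described later, min-max scaling is one possible normalization, the sine record is only a stand-in for a vibration signal, and `make_gray_samples` is an illustrative name.

```python
import numpy as np

def make_gray_samples(signal, side=32, overlap=0.3):
    """Cut overlapping windows of side*side points and reshape each row by row."""
    length = side * side
    step = int(length * (1.0 - overlap))           # shift between consecutive windows
    samples = []
    for start in range(0, len(signal) - length + 1, step):
        seg = signal[start:start + length]
        lo, hi = seg.min(), seg.max()
        seg = (seg - lo) / (hi - lo + 1e-12)       # assumed min-max scaling to [0, 1]
        samples.append(seg.reshape(side, side))    # sampling order is kept row by row
    return np.stack(samples)

signal = np.sin(0.01 * np.arange(10_000))          # hypothetical stand-in record
images = make_gray_samples(signal, side=32, overlap=0.3)
```

With a window of 1024 points and a step of 716 points, a record of 10,000 points yields 13 gray-scale samples, more than the 9 that non-overlapping windows would give.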

It should be noted that fault signal detection acquires data from, and monitors, the part under test by means of sensors, and that different signals suit different components. Vibration signals are particularly suitable for gearboxes, blades, transmission shafts and the like. Temperature reflects whether equipment has degraded and whether its operating condition is abnormal, and is suitable for monitoring gears, generators, converters and the like. The amplitude and harmonic components of electrical signals directly reflect electrical faults, which makes them particularly suitable for generators, sensors and the like; moreover, because the wind turbine is a strongly coupled electromechanical system, mechanical faults can also cause abnormal electrical signals. Monitoring the composition of the oil is suitable for analyzing whether mechanical components lubricated by the oil are damaged. In a real system, several sensors, or even several kinds of signal, are installed at different positions at the same time, and fault diagnosis is performed jointly using data fusion, although the processing methods and analysis difficulty differ from signal to signal. The vibration signal is the most widely used monitoring signal and plays an important role in diagnosing faults of bearings, gears, blades, transmission shafts and similar parts; it often contains intrinsic fault features and reflects the operating state of the turbine. It is also worth noting that the various signal sensors, including vibration sensors, are intrusive and have to be embedded in the equipment, which increases its complexity and likelihood of failure, so non-intrusive acoustic signal detection has very good application prospects in fault diagnosis.

Furthermore, the effectiveness of the convolutional neural network is verified with simulated signals and test bench signals. The simulation is a simplified mathematical model of bearing vibration that simplifies the problem to some extent and allows the effectiveness of the method to be verified quickly. The test bench data set consists of data acquired while real faulty bearings were running; testing with data covering several fault types and fault severities allows the effectiveness and superiority of the intelligent diagnosis algorithm to be analyzed and evaluated comprehensively.

It should be noted that, after the model structure is built, the objective function is designed and the model is trained; a network structure with 3 convolution-pooling pairs and 2 fully connected layers is shown in fig. 11.

It should be noted that when the fault diagnosis task changes, for example when the number of classes grows, the number of neurons in the output layer increases while the structure of the preceding feature extraction layers remains unchanged, so the network structure has a certain general applicability. Here the convolutional layers use Same padding so that the input and output sizes are equal, and the pooling layers use Valid padding, i.e. no padding, to realize down-sampling. Table 3.1 shows the details of the designed convolutional neural network model, where 32 × 32@16 means 16 channels of size 32 × 32 each, and the other entries are read in the same way. Besides the items in the table, the experiments use some training techniques, mainly the batch normalization and dropout mentioned above: batch normalization is used in the convolutional and fully connected layers to speed up training, while dropout is used in the fully connected layers to prevent overfitting and improve the generalization capability of the model. The dropout rate is set to the recommended value of 0.5 in the experiments.
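The effect of the two padding modes on the feature map size can be traced with simple arithmetic. The sketch below assumes stride-1 convolutions and 2 × 2 pooling with stride 2, which are not stated in this text, and follows a 32 × 32 input through three convolution-pooling pairs.

```python
def conv_same_size(n, stride=1):
    """Same padding: the output size is ceil(n / stride), so stride 1 keeps the size."""
    return -(-n // stride)

def pool_valid_size(n, pool=2, stride=2):
    """Valid padding (no padding): the output size is floor((n - pool) / stride) + 1."""
    return (n - pool) // stride + 1

size = 32                      # side length of the input gray-scale map
trace = [size]
for _ in range(3):             # three convolution-pooling pairs
    size = conv_same_size(size)
    size = pool_valid_size(size)
    trace.append(size)
```

Under these assumptions the side length halves at every pooling stage, giving 32, 16, 8 and 4, so only the pooling layers reduce the spatial size.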

The convolutional neural network structure is designed as follows:

The effectiveness of the CNN-based fault diagnosis method is verified with simulated signals and test bench signals. The simulation is a simplified mathematical model of bearing vibration that simplifies the problem to some extent and allows the effectiveness of the method to be verified quickly. The test bench data set consists of data acquired while real faulty bearings were running; testing with data covering several fault types and fault severities allows the effectiveness and superiority of the intelligent diagnosis algorithm to be analyzed and evaluated comprehensively.

The vibration signal is simulated numerically. The fault signal of a rolling bearing consists mainly of three parts: impulse impacts excited by the fault defect point, vibration harmonics generated by bearing unbalance or gear meshing in the gearbox, and white Gaussian noise introduced during measurement, as shown in the following formula.

where A_i is the amplitude of the i-th pulse excited by the fault and T_i is the time at which the pulse occurs; B_k and φ_k are the amplitude and initial phase of the k-th harmonic caused by bearing unbalance or gear meshing in the gearbox; n(t) is white Gaussian noise.

Two cases of the vibration signal are considered here: the normal case and the fault case. In the normal case two vibration harmonics are generated; their simplified mathematical expressions are given in (3-2) and (3-3):

s₁＝0.5sin(2πf₁t) (3-2)

s₂＝0.3sin(2πf₂t) (3-3)

where f₁ = 200 Hz and f₂ = 400 Hz are the two vibration harmonic frequencies, t is time and satisfies t = n/f_s, n is the sampling point index, and f_s = 10 kHz is the sampling frequency. The amplitude of the simulated signal is kept low because vibration is small under normal conditions. The two fault signals are simulated by expressions (3-4) and (3-5): expression (3-4) represents the vibration caused by an outer ring fault, and expression (3-5) represents the vibration caused by an inner ring fault, to which a signal modulation component is added. In these expressions f_ro = 2000 Hz, f_ri = 3000 Hz, f_o = 30 Hz and f_i = 150 Hz are the resonant frequencies and the fault characteristic frequencies of the outer and inner rings, respectively, and f₀ = 20 Hz is the rotation frequency of the shaft. Fig. 12 shows the time domain waveforms of the two fault signals.
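The simulated signals can be generated numerically along the following lines. Only the two harmonics follow expressions (3-2) and (3-3) directly. The fault expressions are not reproduced in this text, so the damped impulse train, its decay constant of 800 and the cosine amplitude modulation used below are assumptions chosen to match the verbal description, and summing the two harmonics for the normal case is likewise an assumption.

```python
import numpy as np

fs = 10_000                                   # sampling frequency f_s = 10 kHz
n = np.arange(fs)                             # one second of sampling points
t = n / fs                                    # t = n / f_s

s1 = 0.5 * np.sin(2 * np.pi * 200 * t)        # expression (3-2), f1 = 200 Hz
s2 = 0.3 * np.sin(2 * np.pi * 400 * t)        # expression (3-3), f2 = 400 Hz
normal = s1 + s2                              # assumed: normal state is the sum of both

def impulse_train(t, f_fault, f_res, decay=800.0):
    """ASSUMED form: a damped oscillation at f_res restarted every 1 / f_fault seconds."""
    tau = np.mod(t, 1.0 / f_fault)            # time since the latest impact
    return np.exp(-decay * tau) * np.sin(2 * np.pi * f_res * tau)

# Outer ring: f_o = 30 Hz impacts ringing at f_ro = 2000 Hz, equal amplitudes.
outer = impulse_train(t, f_fault=30.0, f_res=2000.0)
# Inner ring: f_i = 150 Hz impacts at f_ri = 3000 Hz, modulated by f0 = 20 Hz (assumed form).
modulation = (1.0 + np.cos(2 * np.pi * 20.0 * t)) / 2.0
inner = modulation * impulse_train(t, f_fault=150.0, f_res=3000.0)
```

Plotting `outer` and `inner` gives equally spaced decaying pulses in the first case and a pulse train with a slowly varying envelope in the second.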

Fig. 12 shows that, in the ideal case, the impulses caused by an outer ring fault have equal maximum amplitudes, are spaced at intervals equal to the reciprocal of the fault characteristic frequency, and each decay while oscillating at the resonant frequency. Panel (b) shows a typical inner ring fault signal: because the relative position of the vibration acceleration sensor and the inner ring fault point changes as the shaft rotates, the amplitude of the impact produced each time the defect is struck is modulated by the rotation frequency, so the envelope of the signal shows a wave-like variation.

Based on the signals generated by these vibration sources, and after superimposing white Gaussian noise at a signal-to-noise ratio of 20 dB, simulated signals for the various bearing health states can be constructed according to formula (3-1): the healthy state, outer ring fault, inner ring fault, and the composite fault in which the inner and outer rings fail simultaneously, as shown in fig. 13.

In the experiment, samples are taken with a length of 1024 points and an overlap rate of 30%. 250 samples are obtained for each state, 1000 in total; 60% of the samples are randomly selected as the training set and the remaining 40% are used as the test set. The training data are first normalized, rearranged into 32 × 32 two-dimensional gray-scale maps and fed into the CNN for training, which ends when the training accuracy no longer improves noticeably. The test data are fed into the trained model in the same way to obtain the test accuracy, which is used to evaluate the recognition accuracy and generalization capability of the model. Because the simulated data are ideal and the fault features are distinct, the experiment reaches a test accuracy of 100%, which verifies the feasibility and effectiveness of the algorithm model based on the two-dimensional gray-scale map and the CNN.

Since a neural network is a black-box model with limited interpretability, feature visualization is used to help understand why the model can distinguish the fault types and health states. Fig. 14 shows the distribution, in the reduced feature space, of the high-dimensional features extracted from each sample after dimensionality reduction with the t-SNE method. The figure shows that, before entering the classifier, the signals of the different classes are already separated from one another in feature space; the farther apart the clusters, the more pronounced the feature differences between classes and the easier the classification. The convolutional neural network thus has very good feature learning capability and can learn discriminative features from the raw input signal. Fig. 15 is the confusion matrix of the test result. The confusion matrix relates the predicted results of each class to the actual results, which makes it possible to judge whether the model recognizes some classes poorly or confuses certain classes; the figure shows that the prediction accuracy for every class of the simulated signal reaches 100%. It should be noted that this ideal result is obtained with few fault types and ideal signal quality. Bearing test data from a real environment often cannot achieve such a perfect diagnosis, so further analysis with actual test data is necessary.
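A confusion matrix of this kind can be computed directly from the true and predicted labels. The sketch below uses a small hypothetical set of predictions, not the experimental results, and assumes the class order healthy, outer ring fault, inner ring fault, composite fault.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, k):
    """Rows are actual classes, columns are predicted classes."""
    cm = np.zeros((k, k), dtype=int)
    for actual, predicted in zip(y_true, y_pred):
        cm[actual, predicted] += 1
    return cm

# Assumed class order: 0 healthy, 1 outer ring, 2 inner ring, 3 composite.
y_true = np.array([0, 0, 1, 1, 2, 2, 3, 3])     # hypothetical actual labels
y_pred = np.array([0, 0, 1, 1, 2, 3, 3, 3])     # hypothetical predictions
cm = confusion_matrix(y_true, y_pred, k=4)
per_class_acc = cm.diagonal() / cm.sum(axis=1)  # fraction of each class recognized
```

In this toy case one inner ring sample is mistaken for a composite fault, which shows up as an off-diagonal entry and a per-class accuracy of 0.5 for that class; a diagnosis with 100% accuracy for every class corresponds to a purely diagonal matrix.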

The simulation test results are shown in fig. 21: (a) visualization of t-SNE characteristics; (b) test result confusion matrix

Further, the vibration signal monitoring and convolutional neural network model is used for identifying different fault types and different fault degrees, detecting fault signals under noise interference and effectively identifying composite faults.

The bearing vibration signal with superimposed noise can be simulated with additive white Gaussian noise. The signal-to-noise ratio is defined in formula (4-1) as the logarithm of the ratio of signal energy to noise energy, expressed in decibels (dB). For example, at a signal-to-noise ratio of 10 dB the signal energy is 10 times the noise energy, and at 0 dB the signal energy equals the noise energy, at which point the influence of the noise is already pronounced. Fig. 16 shows the time domain waveforms before and after adding white Gaussian noise at a signal-to-noise ratio of 0 dB for a rolling element fault of 0.007 inch; after the noise is superimposed, the fault impact component is hard to distinguish in the waveform.
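Adding white Gaussian noise at a prescribed signal-to-noise ratio can be sketched as follows, using the usual decibel definition SNR = 10 log10(P_signal / P_noise), which is consistent with the 10 dB and 0 dB examples above. The sine wave is a hypothetical stand-in for a bearing signal and the function names are illustrative.

```python
import numpy as np

def add_awgn(signal, snr_db, rng):
    """Add white Gaussian noise so that 10*log10(P_signal / P_noise) equals snr_db."""
    p_signal = np.mean(signal ** 2)
    p_noise = p_signal / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(p_noise), size=signal.shape)
    return signal + noise, noise

def measured_snr_db(signal, noise):
    """Signal-to-noise ratio actually realized by one noise draw."""
    return 10.0 * np.log10(np.mean(signal ** 2) / np.mean(noise ** 2))

rng = np.random.default_rng(0)
t = np.arange(100_000) / 10_000.0
clean = np.sin(2 * np.pi * 200 * t)             # hypothetical stand-in signal
noisy, noise = add_awgn(clean, snr_db=0.0, rng=rng)
```

At 0 dB the noise power equals the signal power, so `measured_snr_db(clean, noise)` is close to zero for a record of this length.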

To improve the generalization capability of the model, the improved cost function is used as the optimization target for model training. In addition, to improve the model's feature learning on the samples, noise interference (SI) is applied to part of the training samples during training, and these samples are randomly mixed with the original signal samples before being fed into the model. Because part of the samples are disturbed by noise, learning their features strengthens the noise immunity of the model to a certain extent. The details of the experimental implementation are given in the following table, in which the training set is augmented with signal samples at three signal-to-noise ratio levels.

Training set construction and parameter set

In addition, models are also trained with the improved cost function alone (without added interference samples) and with added interference samples alone (without the improved cost function), with the experimental parameter settings unchanged; together with the conventional convolutional neural network model, the test results of the four models are compared and analyzed. In all experiments the vibration signals undergo no complex time-frequency analysis and are simply rearranged, in sampling order, into two-dimensional vibration gray-scale maps, so that the gain in test accuracy under noise interference brought by the improved cost function can be measured. After model training is finished, white Gaussian noise is added to the original signals at signal-to-noise ratios from -4 dB to 6 dB, and the noisy signals are fed into the trained models for testing. The resulting accuracies under the various noise conditions are plotted in fig. 17.

The curves show that, with the improved cost function, the convolutional neural network model achieves better fault identification accuracy in a noisy environment, and that adding sample interference strengthens the generalization capability of the model further. At a signal-to-noise ratio (SNR) of -4 dB the original CNN model reaches only about 81% fault identification accuracy, whereas the improved LDCNN + SI model reaches about 94%, a large gain that is mainly due to the improved cost function and the augmentation with interference samples. Comparing the test results before and after sample interference is added also shows that both algorithm improvement and data augmentation are important: for data-driven artificial intelligence algorithms, the more data there are, the more fully the model learns the features and the stronger its generalization capability. Sample augmentation is easier to implement than algorithm improvement, so it deserves particular attention in practical applications.

In addition, when the signal-to-noise ratio is high, both the original model and the improved model give satisfactory classification results. Although the recognition rate of the improved model is still lower than that obtained after wavelet transform signal processing, its value lies in the fact that a simple improvement of the cost function, without complex signal processing steps, still raises the accuracy of the model considerably. At an SNR of about 0 dB the improved model already achieves a satisfactory fault diagnosis accuracy of 99.71%, so the improvement can be considered practically significant and effective.

For example, in bearing fault diagnosis the known single faults, excluding the healthy state, comprise three types: inner ring fault, outer ring fault and rolling element fault. The composite fault is the simultaneous presence of faults in the inner and outer rings. Single-label classification therefore has 5 independent classes, whereas multi-label classification has 3 fault classes: the composite fault belongs to both the inner ring and outer ring fault classes, and the normal state belongs to none of them. The corresponding fault diagnosis flows are shown in figs. 18 and 19. Multi-label classification requires special attention to how labels are set during sample construction. Unlike ordinary one-hot coding, the label vector of a multi-label sample may contain several elements equal to 1, and the label of the normal state has all elements equal to 0, meaning that no single fault is present. The loss function and the accuracy evaluation method also differ. Here multi-label classification is implemented with several binary classifiers, similar to the one-vs-rest strategy of a multi-class support vector machine. For convenience, a logistic regression binary classifier with a Sigmoid activation function is used for each label class; in theory other machine learning classifiers, such as a support vector machine, could also be used. The features extracted by the CNN model are mapped to a probability between 0 and 1 by the Sigmoid function of the logistic regression classifier, and in practice the common threshold θ = 0.5 is used to decide between the positive and negative class. The cross-entropy cost function is adjusted accordingly, as shown in formula (5-1).

In formula (5-1), the activation output of the c-th classifier for the i-th sample represents the predicted probability that the label class is positive, and k is the total number of classifiers, i.e. the total number of single fault classes. The accuracy can be defined as the average accuracy over labels, in which case each single fault correctly identified within a composite fault contributes to the overall accuracy. It can also be defined as the average accuracy over classes, in which case a composite fault contributes to the overall accuracy only when every one of its label classes is identified correctly. The latter is used here as the evaluation index and is referred to as the absolute accuracy.
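The multi-label scheme with one Sigmoid output per single fault can be sketched as follows. Formula (5-1) is not reproduced in this text, so the loss below is the usual sum of binary cross entropies averaged over samples, taken as an assumption; the logits, the label order and the helper names are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def multilabel_bce(probs, targets):
    """Binary cross entropy summed over the k labels and averaged over samples."""
    eps = 1e-12
    per_label = -(targets * np.log(probs + eps)
                  + (1 - targets) * np.log(1 - probs + eps))
    return float(per_label.sum(axis=1).mean())

# Assumed label order: [inner ring, outer ring, rolling element]
targets = np.array([[0, 0, 0],               # healthy: all elements are 0
                    [1, 0, 0],               # inner ring fault
                    [1, 1, 0]])              # composite fault: inner and outer ring
logits = np.array([[-3.0, -2.0, -4.0],       # hypothetical classifier inputs
                   [ 2.5, -1.0, -3.0],
                   [ 1.5, -0.5, -2.0]])
probs = sigmoid(logits)
pred = (probs >= 0.5).astype(int)            # threshold theta = 0.5
loss = multilabel_bce(probs, targets)
label_accuracy = float((pred == targets).mean())
absolute_accuracy = float((pred == targets).all(axis=1).mean())
```

In this toy case the composite sample is recognized only as an inner ring fault, so 8 of 9 labels are correct while only 2 of 3 samples are completely correct, which is why the absolute accuracy is the stricter of the two measures.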

As described above, a composite fault is closely related to the single faults and can be regarded as a nonlinear coupling of several single faults. Even though its generation mechanism is complicated and hard to interpret, the fault frequency components of the constituent single faults may be assumed to remain present in the composite fault; the conventional diagnostic methods based on signal decoupling and fault frequency component analysis rest on this principle. It can therefore be expected that constructing "pseudo-composite fault" data containing the frequency components of several single faults, and adding them to the training samples, helps the model acquire a stronger signal decoupling capability. The attempt here starts from time domain signal superposition. In this method, empirical mode decomposition (EMD) is applied to several single fault signals and selected intrinsic mode functions (IMFs) are linearly superimposed. Because the order of occurrence and the impact intervals of different single faults in the time domain are uncertain, the linear superposition is performed with the signals staggered by a number of sampling points proportional to the sample length. Table 5.1 gives the EMD algorithm flow, and the construction of the "pseudo-composite fault" signal is shown in fig. 20.

TABLE 1 empirical mode decomposition Algorithm flow

Decomposing each single fault signal with EMD yields intrinsic mode functions that represent the fault information, and superimposing the intrinsic mode functions of signals of different fault types approximately describes the signal characteristics of a composite fault. The method requires no parameter setting, can be used to analyze the coexistence of two faults, can be extended to cases with more simultaneous faults, and therefore has a certain generality.
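The superposition step can be sketched under the assumption that the intrinsic mode functions have already been computed and selected by some EMD routine, which is not shown. The circular shift used for the staggering, the shift ratio of 0.25 and the name `pseudo_composite` are illustrative assumptions, and the tiny arrays merely stand in for real IMFs.

```python
import numpy as np

def pseudo_composite(imfs_a, imfs_b, shift_ratio=0.25):
    """Linearly superimpose the selected IMFs of two single fault signals.

    imfs_a, imfs_b: arrays of shape (number of IMFs, sample length), assumed to be
    precomputed. The second fault is staggered by a share of the sample length.
    """
    part_a = np.sum(imfs_a, axis=0)
    part_b = np.sum(imfs_b, axis=0)
    shift = int(shift_ratio * part_a.size)       # stagger in sampling points
    return part_a + np.roll(part_b, shift)       # assumed: circular shift

imfs_inner = np.array([[1.0, 0.0, 0.0, 0.0],     # stand-ins for inner ring IMFs
                       [0.0, 0.0, 0.0, 0.0]])
imfs_outer = np.array([[0.0, 1.0, 0.0, 0.0]])    # stand-in for an outer ring IMF
mixed = pseudo_composite(imfs_inner, imfs_outer, shift_ratio=0.25)
```

Varying `shift_ratio` produces several pseudo-composite samples from the same pair of single fault signals, reflecting the uncertain relative timing of the two faults.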

A method is proposed for constructing pseudo-composite fault samples from the EMD results of single fault data. A convolutional neural network feature extractor and binary classifiers are trained with the multi-label classification method, and correct diagnosis of real composite fault samples is achieved in a preliminary way, which shows that the proposed method is practically feasible. Without increasing the complexity or difficulty of signal processing and model construction, the convolutional neural network model can diagnose the coexistence of multiple single faults effectively even when composite fault samples are lacking.

In summary, the invention provides a method flow for intelligent bearing fault diagnosis based on vibration signal monitoring and a convolutional neural network model. Vibration signals are acquired by an acceleration sensor; historical data are sampled and transformed from 1D to 2D; bearing signal samples in the various fault states are divided into a training set and a test set; the training set is fed into the deep convolutional neural network for model learning; and after learning is complete the test set is used to verify the generalization capability of the model, i.e. the test accuracy. The tested convolutional neural network model can further be deployed in a wind turbine monitoring and control system to realize online real-time condition monitoring and fault diagnosis.

Next, because a convolutional neural network with two-dimensional input is used, the one-dimensional time domain vibration signal has to be converted into two-dimensional form by dedicated processing. Thanks to the strong image recognition capability of the deep convolutional neural network, good fault diagnosis accuracy can be achieved with any of the two-dimensional image forms; among them, the vibration gray-scale map based on rearranging signal sampling points has a unique advantage because it is simple to implement and needs no complex transformation.

Next, the number of convolution-pooling layers, the number of filters and similar hyperparameters are compared under the principle that the structure should be as simple as possible, training as efficient as possible and prediction as accurate as possible. This yields the deep convolutional neural network model best suited to the research task, which strikes a good balance between training cost and prediction accuracy.
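One simple way to compare candidate structures before training is to trace the feature-map size and count the trainable parameters. The sketch below is illustrative only; the 64 x 64 input, 3 x 3 same-padded kernels, 2 x 2 pooling, filter counts, 128 hidden units and 10 classes are all assumed values, not values stated by the invention.

```python
def conv_pool_cnn_summary(input_side, in_channels, filters, kernel=3,
                          fc_hidden=128, num_classes=10):
    """Trace feature-map sizes and count trainable parameters.

    Assumes 'same'-padded kernel x kernel convolutions, each followed by
    2x2 max pooling with stride 2, then two fully connected layers.
    """
    side, channels, params = input_side, in_channels, 0
    for out_channels in filters:
        params += (kernel * kernel * channels + 1) * out_channels  # weights + biases
        channels = out_channels
        side //= 2                           # 2x2 pooling halves each spatial side
    flat = side * side * channels            # inputs to the first fully connected layer
    params += (flat + 1) * fc_hidden         # first fully connected layer
    params += (fc_hidden + 1) * num_classes  # output layer
    return side, flat, params

# Compare a 2-pair and a 3-pair structure on a hypothetical 64x64 gray-scale input.
two_pairs = conv_pool_cnn_summary(64, 1, [16, 32])
three_pairs = conv_pool_cnn_summary(64, 1, [16, 32, 64])
```

Under these assumed settings the third convolution-pooling pair shrinks the flattened feature vector, so the deeper structure ends up with fewer parameters in total because the first fully connected layer dominates the count.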

Then, considering how common noise interference is in actual industrial production environments, the invention studies the ability of the convolutional neural network model to identify fault signals under strong noise interference. Experiments show that the identification method based on the vibration gray-scale image retains very good fault identification accuracy when the signal-to-noise ratio is not too low, but its accuracy drops sharply once the signal-to-noise ratio falls below 0 dB, whereas time-frequency image identification based on the continuous wavelet transform retains very high fault identification accuracy because the wavelet transform has a noise reduction effect. On this basis, an improvement strategy based on a clustering cost function and training sample interference is proposed, which improves the fault signal identification accuracy of the vibration gray-scale image method at low signal-to-noise ratios.
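To reproduce test conditions at a chosen signal-to-noise ratio, noise can be scaled so that SNR = 10·log10(P_signal / P_noise) matches the target. The following is a minimal sketch assuming additive white Gaussian noise, which the text does not specify; the function names, the 50 Hz test tone and the -4 dB target are illustrative.

```python
import math
import random

def add_noise_at_snr(signal, target_snr_db, seed=0):
    """Return signal + white Gaussian noise scaled to the requested SNR in dB."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in signal]
    p_signal = sum(v * v for v in signal) / len(signal)
    p_noise = sum(v * v for v in noise) / len(noise)
    wanted_p_noise = p_signal / (10 ** (target_snr_db / 10))  # from the SNR definition
    scale = math.sqrt(wanted_p_noise / p_noise)
    return [s + scale * n for s, n in zip(signal, noise)]

def measure_snr_db(clean, noisy):
    """Measure the SNR of a noisy signal against its clean reference."""
    p_signal = sum(c * c for c in clean)
    p_noise = sum((n - c) ** 2 for c, n in zip(clean, noisy))
    return 10 * math.log10(p_signal / p_noise)

# Toy clean signal: a 50 Hz tone sampled at 1 kHz, corrupted to -4 dB SNR.
clean = [math.sin(2 * math.pi * 50 * n / 1000) for n in range(1000)]
noisy = add_noise_at_snr(clean, target_snr_db=-4.0)
```

The same routine could be used to perturb training samples, which is one possible reading of the training sample interference mentioned above.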

Finally, experiments show that, provided composite fault samples can be obtained, satisfactory fault identification accuracy is achieved whether the composite fault is treated as a separate class or the model is trained with a multi-label method. Furthermore, unlike the traditional approach of adding real composite fault samples to model training, a method is studied for the case in which real composite fault data are lacking: pseudo-composite fault signal samples are constructed from the intrinsic mode functions (IMFs) obtained by empirical mode decomposition of single-fault signals and are added to model training. The multi-label classification convolutional neural network model thereby preliminarily acquires the ability to diagnose and identify real composite faults, and, compared with a model trained only on single-fault samples, the accuracy with which a composite fault is identified as simultaneously containing its several single-fault components is greatly improved.
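The construction of a pseudo-composite sample with a multi-label target can be sketched as follows. This is an assumption-laden illustration: the text does not state how the intrinsic mode functions are selected or combined, so summing a few IMFs from each single-fault signal is only one plausible choice, the IMFs are taken as already computed by EMD, and the fault names and function names are hypothetical.

```python
import random

FAULT_TYPES = ["inner_ring", "outer_ring", "rolling_element"]  # illustrative labels

def multi_hot(faults):
    """Encode a set of fault names as a multi-label target vector."""
    return [1 if name in faults else 0 for name in FAULT_TYPES]

def pseudo_composite(imfs_by_fault, n_imfs=2, seed=0):
    """Sum a few IMFs from each single-fault signal into one pseudo-composite sample.

    imfs_by_fault maps a fault name to the list of IMFs (equal-length lists)
    obtained beforehand by EMD of that single-fault signal.
    """
    rng = random.Random(seed)
    length = len(next(iter(imfs_by_fault.values()))[0])
    mixed = [0.0] * length
    for imfs in imfs_by_fault.values():
        for imf in rng.sample(imfs, min(n_imfs, len(imfs))):
            mixed = [m + v for m, v in zip(mixed, imf)]
    return mixed, multi_hot(set(imfs_by_fault))

# Toy IMFs standing in for real EMD output of two single-fault signals.
toy_imfs = {
    "inner_ring": [[1.0, 0.0, -1.0, 0.0], [0.5, 0.5, 0.5, 0.5]],
    "outer_ring": [[0.0, 2.0, 0.0, -2.0], [0.1, 0.1, 0.1, 0.1]],
}
sample, label = pseudo_composite(toy_imfs, n_imfs=2)
```

The multi-hot label marks every single-fault component present, so a multi-label model trained on such samples is asked to flag each component independently rather than to pick one class.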

It should be understood that the above-described embodiments of the present invention merely illustrate or explain the principles of the invention and are not to be construed as limiting the invention. Therefore, any modification, equivalent replacement, improvement and the like made without departing from the spirit and scope of the present invention should be included in the protection scope of the present invention. Further, it is intended that the appended claims cover all such variations and modifications as fall within the scope and boundaries of the appended claims or the equivalents of such scope and boundaries.

Claims

1. An audio identification and fault diagnosis method for mechanical faults of a wind generating set, comprising fault signal detection and a convolutional neural network, characterized in that the fault signal detection comprises vibration signal detection, acoustic emission signal detection, strain force signal detection, temperature signal detection, oil parameter detection and electric signal detection; intelligent diagnosis of bearing faults based on the convolutional neural network comprises data acquisition, feature extraction and feature classification; by means of the automatic feature extraction capability of the deep learning model, the fault diagnosis problem is converted into an image-recognition-like classification task; signal data are preprocessed, a model structure is then constructed and trained and optimized, overlapping sampling is performed, and whether the bearing state is healthy is diagnosed from the two-dimensional representation of the vibration data;

the model is designed as a network structure with 3 convolution-pooling pairs and 2 fully connected layers; after the samples are obtained and preprocessed, training and optimization are carried out, each two-dimensional image and its corresponding health-state label forming one sample pair; a portion of the full sample set is randomly selected in a given proportion as the training set and the remaining samples serve as the test set; the training set is used for model training by supervised learning on the two-dimensional images and health-state labels; after model training is finished, the test set data, without their labels, are fed to the model for testing to obtain classification prediction results; the generalization capability of the model is evaluated by comparing the classification prediction results with the true results, and the contingency of the experimental results is reduced;

inner ring fault frequency:

outer ring fault frequency:

rolling element fault frequency:

cage fault frequency:

in the above formulas, N is the number of rolling elements, D is the bearing diameter, D_b is the diameter of the rolling element, α is the contact angle, and f_r is the rotational frequency of the shaft;
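The formula images for the four characteristic frequencies did not survive in this text. For reference only, the standard textbook expressions that are consistent with the symbol definitions above are given below. They assume that D is the pitch diameter of the bearing, that the outer ring is stationary and that the inner ring rotates with the shaft; the symbols f_i, f_o, f_b and f_c are labels introduced here and may differ from the notation of the original formulas.

```latex
\begin{aligned}
f_i &= \frac{N}{2}\left(1 + \frac{D_b}{D}\cos\alpha\right) f_r
      && \text{(inner ring)} \\
f_o &= \frac{N}{2}\left(1 - \frac{D_b}{D}\cos\alpha\right) f_r
      && \text{(outer ring)} \\
f_b &= \frac{D}{2 D_b}\left[1 - \left(\frac{D_b}{D}\cos\alpha\right)^{2}\right] f_r
      && \text{(rolling element)} \\
f_c &= \frac{1}{2}\left(1 - \frac{D_b}{D}\cos\alpha\right) f_r
      && \text{(cage)}
\end{aligned}
```

Under these assumptions the outer ring frequency equals N times the cage frequency, and the inner and outer ring frequencies sum to N·f_r, which gives a quick consistency check on computed values.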

the convolutional neural network learning module consists of an input layer, a convolutional layer, a pooling layer, a fully connected layer and an output layer; back propagation is the key step of neural network parameter optimization in the convolutional neural network learning module, in which the partial derivatives of the objective function are computed and each weight parameter and bias parameter is then updated by the optimization algorithm;
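The parameter update that follows back propagation can be illustrated on the smallest possible case. The sketch below is not the network of the invention: it assumes a single linear neuron, a squared-error objective and plain gradient descent with a hypothetical learning rate of 0.1, purely to show how partial derivatives drive the weight and bias updates.

```python
def sgd_step(weights, bias, x, target, lr=0.1):
    """One gradient-descent update for a linear neuron with squared-error loss."""
    y = sum(w * xi for w, xi in zip(weights, x)) + bias  # forward pass
    error = y - target                                   # dL/dy for L = 0.5*(y - t)^2
    new_weights = [w - lr * error * xi                   # dL/dw_i = error * x_i
                   for w, xi in zip(weights, x)]
    new_bias = bias - lr * error                         # dL/db = error
    return new_weights, new_bias, 0.5 * error ** 2       # loss before the update

# Repeated updates on one toy sample; the loss shrinks at every step.
weights, bias = [0.0, 0.0], 0.0
losses = []
for _ in range(20):
    weights, bias, loss = sgd_step(weights, bias, x=[1.0, 2.0], target=3.0)
    losses.append(loss)
```

In a deep network the same rule is applied layer by layer, with the chain rule supplying each layer's partial derivatives.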

2. The method according to claim 1, wherein the pooling layer is key to the picture recognition capability and has sparse connection and weight sharing; the input layer receives an image input in the form of a single-channel gray-scale map or a three-channel color map; the output layer is a Softmax classifier that outputs the image classification and recognition result, and a dedicated output layer structure is used for the field of target detection or image segmentation; the convolutional layer is used for feature learning, the pooling layer is used for feature selection, the classifier performs classification output using the learned deep features, and the network parameters of all layers are optimized simultaneously during training.
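The role of the Softmax output layer can be shown with a short sketch. The class names and raw scores below are hypothetical; only the Softmax computation itself is standard.

```python
import math

def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(logits)                       # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict(logits, class_names):
    """Return the most probable class name and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs[best]

# Hypothetical scores from the last fully connected layer for three health states.
class_names = ["normal", "inner_ring", "outer_ring"]
name, confidence = predict([0.1, 2.0, -1.0], class_names)
```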

3. The method of claim 1, wherein the data preprocessing rearranges the sampling values of the one-dimensional signal, in sequence, into a two-dimensional matrix that is regarded as a vibration gray-scale map, and, when the samples are constructed, successive samples overlap by a certain length so as to increase the number of samples.

4. The method for audio recognition and fault diagnosis of mechanical faults of wind generating sets according to claim 1, wherein fault signal detection is performed on the parts to be detected through a sensor for data acquisition and monitoring control.

5. The method for audio identification and fault diagnosis of mechanical faults of wind turbine generators of claim 1, wherein the convolutional neural network is validated through simulation signals and test bench signals.

6. The method for audio identification and fault diagnosis of mechanical faults of wind turbine generators of claim 1, wherein the fault signal detection is based on convolutional neural network simulation training under noise interference, so that the noise immunity of the model is increased.

7. The method for audio identification and fault diagnosis of mechanical faults of a wind turbine generator system of claim 1, wherein objective function design and model training are performed after model structure construction is completed.

8. The method of claim 1, wherein the vibration signal monitoring and the convolutional neural network model are used to identify different fault types and fault degrees, to detect fault signals under noise interference, and to effectively identify composite faults.