CN114037001A - Mechanical pump small sample fault diagnosis method based on WGAN-GP-C and metric learning - Google Patents

Mechanical pump small sample fault diagnosis method based on WGAN-GP-C and metric learning

Info

Publication number: CN114037001A
Application number: CN202111183493.3A
Authority: CN (China)
Other languages: Chinese (zh)
Legal status: Pending
Prior art keywords: data, WGAN, sample, model, fault diagnosis
Inventors: 王雪仁, 高晟耀, 刘瑞杰, 苏常伟, 管峰, 缪旭弘, 寻波, 张海峰
Current Assignee: People's Liberation Army 92578
Original Assignee: People's Liberation Army 92578
Application filed by People's Liberation Army 92578
Priority to CN202111183493.3A
Publication of CN114037001A


Classifications

    • G06F 18/24147 — PHYSICS; COMPUTING; ELECTRIC DIGITAL DATA PROCESSING; Pattern recognition; Analysing; Classification techniques relating to the classification model based on distances to training or reference patterns; distances to closest patterns, e.g. nearest neighbour classification
    • G06F 18/214 — PHYSICS; COMPUTING; ELECTRIC DIGITAL DATA PROCESSING; Pattern recognition; Design or setup of recognition systems or techniques; extraction of features in feature space; Generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06N 3/045 — PHYSICS; COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS; Computing arrangements based on biological models; Neural networks; Architecture, e.g. interconnection topology; Combinations of networks

Abstract

The invention discloses a small-sample fault diagnosis method for mechanical pumps based on WGAN-GP-C and metric learning. The method is designed around one-dimensional convolution and time-series data and comprises four parts: data preprocessing, data enhancement, metric learning and model optimization. Data preprocessing performs adaptive denoising and standardization of the data. In the data enhancement part, the network and structure of WGAN-GP are modified to obtain WGAN-GP-C, which expands the data by category and strengthens the boundary information of the data. The metric network combines the residual idea with a spatial adaptive structure to realize feature mapping, and state classification is then carried out with the KNN algorithm. In the model optimization part, the network is optimized with the weight quantization idea, KNN is implemented with a Ball-tree, and training data is pruned according to important factors, improving overall algorithm performance. The invention performs well when data are scarce, has high practical value, and can provide ideas for workers engaged in maintaining mechanical pumps.

Description

Mechanical pump small sample fault diagnosis method based on WGAN-GP-C and metric learning
Technical Field
The invention relates to a mechanical pump fault diagnosis method, in particular to a mechanical pump small sample fault diagnosis method based on WGAN-GP-C and metric learning.
Background
With the development of science and technology, ship equipment has become increasingly complicated in structure, complete in function and automated in operation, and the complexity of its mechanical equipment has multiplied accordingly. When a component of such mechanical equipment fails, the whole system may fail; the failure may not only cause serious economic loss but even lead to casualties. There is therefore an urgent need to develop and improve intelligent diagnosis technology that can effectively diagnose faults in mechanical equipment.
Intelligent diagnosis technology combines fault diagnosis with network transmission technology and uses computer software to automate the diagnostic process. This lowers the threshold for fault diagnosis, makes diagnosis convenient to use, and improves the reliability of the diagnostic result; intelligent diagnosis has therefore received growing attention and become a hotspot of current research and application.
With the emergence of techniques such as fuzzy set theory, expert systems, neural networks and wavelet analysis, intelligent fault diagnosis has developed into a comprehensive, multidisciplinary technology. Understanding and properly adopting these techniques greatly improves the efficiency of ship equipment, supports full-life, full-system and full-cost management of ship equipment, and is of great significance for ensuring the reliability and safety of ship systems.
At present, intelligent fault diagnosis systems for mechanical pumps generally adopt data-driven models. Most rely on machine learning algorithms such as gradient-boosted trees, support vector machines and logistic regression: extracted fault features are fed into the algorithm model, model parameters are tuned, and a diagnosis model is obtained through multiple rounds of iterative training.
In recent years, the development of deep learning and neural networks has brought new ideas to fault diagnosis. These methods have strong expressive power, overcome shortcomings of traditional machine learning algorithms, and have the potential to identify subtle faults, so their application in fault diagnosis has attracted wide attention from researchers in the field. Such diagnostic models usually work well in specific scenarios, but most require considerable data to build, and data scale strongly affects many of them — particularly models based on deep learning and neural network structures. In reality, however, fault data for mechanical equipment such as pumps is scarce and may be partially missing, and is often insufficient to support effective training of common algorithms. Small-sample data increases the difficulty of data analysis and makes ideal results hard to obtain, so fault diagnosis faces problems such as low accuracy, missed alarms and false alarms.
Designing mechanical fault diagnosis algorithms for small-sample settings is currently a hot research problem, and many strategies have emerged in recent years. Most of them, however, suffer from narrow applicability or deficiencies in algorithm performance, which is why continued research remains meaningful.
Disclosure of Invention
To overcome the defects of the prior art, the invention aims to provide a small-sample fault diagnosis method for mechanical pumps based on WGAN-GP-C and metric learning. Against the background of scarce fault data for ship mechanical pumps, a small-sample fault diagnosis algorithm based on WGAN-GP-C and metric learning is designed around one-dimensional convolution and time-series data, and the algorithm is verified and implemented with fault data from a typical mechanical pump.
In order to realize the purpose of the invention, the technical scheme adopted by the invention is as follows:
a fault diagnosis method for small samples of a mechanical pump based on WGAN-GP-C and metric learning, the method comprises the following steps:
(1) reading time sequence data and carrying out data preprocessing;
(2) constructing a WGAN-GP-C model, and performing data enhancement based on the WGAN-GP-C model;
(3) constructing a fault diagnosis model based on metric learning and KNN classification algorithm;
(4) carrying out model optimization;
(5) using the optimized model to carry out fault diagnosis on data that has been preprocessed and enhanced.
Further, the step (1) specifically includes:
(1.1) reading time sequence data, intercepting according to a set length, and filtering by using a normalized least mean square adaptive filter (NLMS);
(1.2) removing an abnormal value by using median filtering;
(1.3) counting the mean and variance of each dimension of the data, and standardizing the data with the Z-score method.
Further, the NLMS filtering algorithm is as follows:
y(k) = w^T(k)·x(k)
e(k) = d(k) - y(k)
w(k+1) = w(k) + μ̃·e(k)·x(k) / (α + ‖x(k)‖²)
where x(k) is the input signal, y(k) is the output signal, d(k) is the reference signal, w(k) is the filter coefficient vector, α is an adjustment factor preventing the denominator from being zero, ‖x(k)‖² is the squared Euclidean norm of x(k), and μ̃ is the step-size parameter.
Further, the step (2) specifically includes:
(2.1) constructing a WGAN-GP-C model based on WGAN-GP and a residual structure, adding a label for class guidance when random noise is input to the generator, and judging the class of the data at the output of the discriminator so that data is generated by category;
(2.2) training the WGAN-GP-C model: the generated samples and real samples are fed to the discriminator; the data distribution is trained first so that the generated samples approach the real samples, and classification is then trained while the distribution is refined;
(2.3) screening the generated sample data when the trained WGAN-GP-C is used to generate samples;
(2.4) evaluating the quality of the generated sample data with the maximum mean discrepancy (MMD) and adjusting the model accordingly.
Further, the loss function of the generator G, the loss function of the discriminator D, and the loss function of the data class are as follows:
L_G = -E_{x_f~P_f}[D(x_f)] + λ1·l_c(y_{x_f})
L_D = E_{x_f~P_f}[D(x_f)] - E_{x_r~P_r}[D(x_r)] + λ2·E_{x̂~P_{x̂}}[(‖∇_{x̂}D(x̂)‖_2 - 1)²] + λ3·l_c(y_{x_f}) + λ4·l_c(y_{x_r}) + λ5·L2
l_c(y) = -(1/n)·Σ_{i=1}^{n} class_i·log(y_i)
where E_{x~P}[f(x)] denotes the mathematical expectation of f(x) for x under the distribution P; x_r is a real sample and x_f a generated sample; P_r denotes the real data distribution and P_f the generated data distribution; y_{x_r} and y_{x_f} denote the categories of x_r and x_f respectively; x̂ denotes data points on the line connecting the distributions P_r and P_f, and P_{x̂} denotes the distribution of x̂; L2 denotes the L2 regularization penalty; λ1, λ2, λ3, λ4 and λ5 are hyperparameters controlling the different parts of the loss function; n represents the number of samples, i.e. the dimension of y; y_i and class_i represent the network output for the i-th sample and its true category.
Further, the data screening step comprises: judging the category of each generated sample with the discriminator and keeping the generated samples whose input label matches the discriminator's output category; then, for each kept sample, deciding whether to retain it with probability a, where a is the accuracy of the discriminator on the real-data test set.
Further, the step (3) specifically includes:
(3.1) constructing a metric network model comprising a convolutional neural network Vector-CNN and a spatial adaptive structure: the sample data is turned into a Vector by the convolutional neural network, a spatial position matrix is constructed in the spatial adaptive structure, and different weights are then given to the spatial positions according to the Vector features so as to realize the spatial metric mapping;
(3.2) constructing a triplet data set and training the metric network model with an Adam optimizer;
(3.3) constructing a classification algorithm model, searching for the optimal k value and corresponding accuracy with the KNN algorithm, and testing the trained metric network model.
Further, the spatially adaptive structure loss function is as follows:
l_triplet(x_a, x_p, x_n) = max{0, d(E_a, E_p) - d(E_a, E_n) + μ}
loss = l_triplet(x_a, x_p, x_n) + λ·l_E
where d(E_a, E_p) denotes the distance between E_a and E_p; μ is a margin hyperparameter and λ is a hyperparameter controlling the degree of the penalty; x_a denotes the anchor, x_p and x_n denote the positive and negative samples respectively; E_a, E_p and E_n denote the metric network mapping outputs of the three; l_E denotes the L2 regularization term.
Further, the step (4) specifically includes:
(4.1) quantizing the metric network: specifically, the convolutional neural network Vector-CNN is quantized and adjacent network modules are fused; symmetric quantization is adopted for the convolution weights, the bias is not quantized, and asymmetric quantization is adopted for the activation layer;
(4.2) accelerating the KNN algorithm with a Ball-tree structure;
(4.3) screening the training set data: an important factor IF is proposed for the training data, all samples are sorted by important factor from small to large, and a portion of the samples with small IF is removed on the premise of maintaining algorithm accuracy.
Further, the important factor of a sample is defined as the sum of the reciprocals of the squared Euclidean distances between the sample and the samples of the same class, plus, after size adjustment, the sum of the squared Euclidean distances between the sample and the samples of the other classes; because the important factors of different classes follow different size distributions, sorting and deletion of the data are carried out separately per class:
IF_s(x_in) = Σ_{m=1, m≠n}^{N_i} 1 / (d(x_in, x_im)² + inf)
IF_d(x_in) = Σ_{j=1, j≠i}^{C} Σ_{m=1}^{N_j} d(x_in, x_jm)²
IF(x_in) = IF_s(x_in) + (mean(S̃_s) / mean(S̃_d)) · IF_d(x_in)
where x_in denotes the n-th sample of class i in the training set, N_i denotes the number of class-i samples, C denotes the number of sample classes, and d(x_in, x_im) denotes the Euclidean distance between x_in and x_im; S̃_s and S̃_d are the sets of IF_s and IF_d values of the class after deleting a certain proportion of the large values and small values (those near the maximum and minimum of the set); mean(·) represents the average of all values in a set, and inf represents a small constant preventing the denominator from being zero.
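Under the reading above, a per-sample important factor can be sketched in pure Python. The `scale` argument stands in for the trimmed-mean size adjustment, whose exact form is an assumption; all names here are illustrative:

```python
def important_factor(sample, same_class, other_classes, inf=1e-12, scale=1.0):
    """Important factor IF of a training sample: reciprocal squared
    distances to the same-class samples, plus (scale-adjusted) squared
    distances to the samples of the other classes."""
    def d2(a, b):
        # squared Euclidean distance
        return sum((x - y) ** 2 for x, y in zip(a, b))
    same = sum(1.0 / (d2(sample, s) + inf) for s in same_class)
    diff = sum(d2(sample, s) for cls in other_classes for s in cls)
    return same + scale * diff
```

Samples of a class would be ranked by this value and a portion with the smallest IF removed before building the Ball-tree.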
Compared with the prior art, the invention has the beneficial effects that:
(1) For scarce fault data, the NLMS + median filtering scheme obtains better filtering performance at a small cost in time; filtering combined with data standardization forms the data preprocessing part, which noticeably improves model performance;
(2) the data enhancement part is the first-step scheme for the small-sample situation: WGAN-GP is modified to obtain WGAN-GP-C so that data is generated by category, and, combined with parameter tuning and an effective data screening strategy, it realizes data enhancement and strengthens the edge information of the data;
(3) in the model optimization part, quantizing Vector-CNN reduces the model size to roughly a quarter while preserving operation speed and classification performance; replacing the conventional KNN implementation with a Ball-tree effectively improves search speed; the important factor is defined, and data is pruned with important factors computed on the training set as reference, further reducing the Ball-tree construction and search time;
(4) the method retains good fault identification accuracy in complex scenarios such as high noise interference and insufficient sample quantity; it is scientific, reasonable, highly adaptable and of high practical value, and can provide a reference for personnel engaged in mechanical pump development, large-scale machinery operation and maintenance, and research.
Drawings
FIG. 1 is a general block diagram of a mechanical pump small sample fault diagnosis method based on WGAN-GP and metric learning;
FIG. 2 is a WGAN-GP-C network design diagram;
FIG. 3 is a design drawing of a metric-net structure;
FIG. 4 is a schematic diagram of a KNN classifier;
FIG. 5 is a diagram of the loss changes during WGAN-GP-C model training; a is the generator loss and b is the discriminator loss;
FIG. 6 is a diagram of the loss changes during metric-net training; a is the change of loss and b is the change of l_E;
FIG. 7 is a two-dimensional visualization of the quantized vector library;
FIG. 8 is a diagram of the changes observed during data screening; a is the optimal-accuracy change, b is the search-time change, c is the tree-construction-time change, and d is the accuracy change.
Detailed Description
The technical solution of the present invention is further described below with reference to the accompanying drawings and examples. The following examples are only for illustrating the technical solutions of the present invention more clearly, and the protection scope of the present application is not limited thereby.
As shown in figure 1, the fault diagnosis method for the mechanical pump small sample based on WGAN-GP-C and metric learning is composed of four parts, namely data preprocessing, data enhancement based on WGAN-GP-C, a classification algorithm based on metric learning and KNN and model optimization, and accurate diagnosis of the mechanical pump fault with the small sample background is achieved.
The invention relates to a fault diagnosis method for a small sample of a mechanical pump based on WGAN-GP-C and metric learning, which comprises the following specific steps:
(1) designing data preprocessing;
The original data is read and intercepted into segments of length 2015; the data is filtered with a Normalized Least Mean Square (NLMS) adaptive filter of order 16 and step factor 0.05, and abnormal values are removed with a median filter with kernel size 5. The mean and variance of each dimension of the data are then counted, and the data is standardized with the Z-score method.
(2) The data enhancement design based on WGAN-GP-C;
When the WGAN-GP-C model is built, it is designed from the WGAN-GP and residual-structure ideas: class guidance information is added when the generator receives random noise, and a component that judges the data category is introduced at the discriminator output. During WGAN-GP-C training, the first 50,000 epochs train the data distribution only, after which data distribution and data classification are trained simultaneously so that data is generated by category. Data generation quality is evaluated with the Maximum Mean Discrepancy (MMD) and the network adjusted accordingly. The data enhancement process follows a data screening strategy.
(3) Designing a classification algorithm based on metric learning and KNN;
A metric network model is built; based on the residual idea, a spatial adaptive structure is proposed so that each sample finds a more appropriate spatial position according to its own properties during space mapping. The data mapping output dimension is 32. A triplet data set is constructed, and the network model is trained with an Adam optimizer at learning rate lr = 5×10⁻⁵. When the classification algorithm model is constructed, the KNN parameter k is traversed over the range [1, 100], and the performance of the KNN algorithm is then tested on the test set to find the optimal k value and the corresponding accuracy.
(4) Constructing a model and optimizing the model;
Quantization is performed on the metric network: symmetric quantization is adopted for the convolution weights, the bias is not quantized, and asymmetric quantization is adopted for the activation layer, converting 32-bit operations to 8-bit operations. KNN is implemented with a tree structure: after comparing the search times of linear scan, Kd-tree and Ball-tree, the Ball-tree implementation is selected. The training set is screened: the concept of an Important Factor (IF) of the training data is proposed, all samples are sorted by important factor from small to large, and a portion of the samples with small IF is removed on the premise of maintaining algorithm accuracy, thereby realizing data screening, optimizing the speed of building the Ball-tree and improving the search speed.
(5) The constructed and optimized model is used to perform fault diagnosis on the preprocessed data.
In the step (1), the step (c),
(1.1) The NLMS in the data preprocessing is an improvement on the Least Mean Square (LMS) adaptive filter. The LMS is widely applied in signal processing, channel equalization and other fields; it obtains an optimal solution based on stochastic gradient descent, and its operation consists of filter output calculation (Equation 1), error signal calculation (Equation 2) and weight coefficient updating (Equation 3).
y(k)=wT(k)x(k) (1)
e(k)=d(k)-y(k) (2)
w(k+1)=w(k)+μe(k)x(k) (3)
where x(k) is the input signal, y(k) is the output signal, d(k) is the reference signal (in practice x(k) is used as the reference), w(k) is the coefficient vector of the adaptive filter, and μ is the step-size parameter.
When the signal is unstable or complex, the LMS often has the problem of unstable convergence or even difficulty in convergence due to a fixed convergence step. A Normalized Least Mean Square adaptive filter (NLMS) normalizes a fixed step size parameter with an input signal to continuously change the step size parameter, so that the NLMS algorithm has higher adaptability and convergence stability than the LMS algorithm.
When the NLMS is run, the filter coefficients are updated as shown in Equation 4.
w(k+1) = w(k) + μ̃·e(k)·x(k) / (α + ‖x(k)‖²) (4)
where α is an adjustment factor preventing the denominator from being zero, ‖x(k)‖² is the squared Euclidean norm of x(k), and μ̃
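The NLMS recursion of Equations 1, 2 and 4 can be sketched in pure Python. This is a minimal illustration, not the patent's implementation: the defaults follow the order 16 and step factor 0.05 given in the embodiment, while `alpha` is an assumed small constant:

```python
def nlms_filter(x, d, order=16, mu=0.05, alpha=1e-6):
    """Normalized LMS adaptive filter.

    x -- input signal (list of floats)
    d -- reference signal (same length as x)
    Returns the error signal e(k), which serves as the denoised output
    when the filter is used for noise cancellation.
    """
    w = [0.0] * order                                   # filter coefficients w(k)
    e = [0.0] * len(x)
    for k in range(len(x)):
        # the most recent `order` input samples, newest first (zero-padded)
        xk = [x[k - i] if k - i >= 0 else 0.0 for i in range(order)]
        y = sum(wi * xi for wi, xi in zip(w, xk))       # Eq. 1: y(k) = w^T(k) x(k)
        e[k] = d[k] - y                                 # Eq. 2: e(k) = d(k) - y(k)
        norm2 = sum(xi * xi for xi in xk)               # ||x(k)||^2
        step = mu / (alpha + norm2)                     # normalized step, Eq. 4
        w = [wi + step * e[k] * xi for wi, xi in zip(w, xk)]
    return e
```

Because the step size is normalized by the input energy, the recursion stays stable even when the signal level varies, which is the advantage over plain LMS described above.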
(1.2) the median filtering is to replace the value of one point in the digital sequence with the median of each point in a neighborhood of the point, thereby eliminating isolated noise points and effectively inhibiting abnormal points.
(1.3) Data standardization ensures that the network treats each dimension of the data equally so as to achieve a good convergence effect. The invention uses the Z-score standardization method, calculated as follows.
x'_ik = (x_ik - mean_k) / std_k (5)
where x_ik is the parameter in row i, column k of the data set, and mean_k and std_k are the mean and standard deviation of the parameters in column k.
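Steps (1.2) and (1.3) can be sketched with the standard library alone; `kernel=5` matches the kernel size used in the embodiment, and the zero-std guard is an added assumption:

```python
import statistics

def median_filter(seq, kernel=5):
    """Replace each point with the median of its neighbourhood,
    suppressing isolated outliers (step 1.2)."""
    half = kernel // 2
    out = []
    for i in range(len(seq)):
        window = seq[max(0, i - half): i + half + 1]
        out.append(statistics.median(window))
    return out

def z_score(columns):
    """Z-score standardization (Equation 5): x' = (x - mean) / std,
    applied column by column."""
    result = []
    for col in columns:
        m = statistics.mean(col)
        s = statistics.pstdev(col) or 1.0   # guard against zero std
        result.append([(v - m) / s for v in col])
    return result
```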
In the step (2), the step (c),
(2.1) Data enhancement based on WGAN-GP-C is the first step of the invention's solution for mechanical pump fault diagnosis. WGAN-GP-C is obtained by structurally modifying WGAN-GP so that data is generated by category.
The Wasserstein Generative Adversarial Network (WGAN) alleviates the training difficulty of the native Generative Adversarial Network (GAN), but WGAN's weight clipping still easily causes vanishing or exploding gradients. To solve this problem, a gradient penalty training technique is introduced into WGAN, yielding WGAN-GP.
As shown in fig. 2, an important premise of the data generation in the present invention is that generation is performed according to the data category, so class guidance information needs to be added to the WGAN-GP-based network structure design; the result is called WGAN-GP-C (WGAN-GP with category).
A label is added to the random noise input of the generator for class guidance; the label is represented by trainable embeddings of the same dimension as the random noise. The generated samples and real samples are fed to the discriminator, which outputs both the Wasserstein distance (EM distance, Earth Mover's Distance, EMD) between generated and real samples and the class of each sample. The generator and discriminator of WGAN-GP-C are designed with the residual idea based on 1DCNN. It is worth noting that, in the construction of the discriminator, Batch Normalization is not added in the residual structures. From this structural design, the loss function of the generator G(x) is given by Equation 6 and the loss function of the discriminator D(x) by Equation 7; the loss of the data class is computed with a cross-entropy function, as shown in Equation 8.
L_G = -E_{x_f~P_f}[D(x_f)] + λ1·l_c(y_{x_f}) (6)
L_D = E_{x_f~P_f}[D(x_f)] - E_{x_r~P_r}[D(x_r)] + λ2·E_{x̂~P_{x̂}}[(‖∇_{x̂}D(x̂)‖_2 - 1)²] + λ3·l_c(y_{x_f}) + λ4·l_c(y_{x_r}) + λ5·L2 (7)
l_c(y) = -(1/n)·Σ_{i=1}^{n} class_i·log(y_i) (8)
where E_{x~P}[f(x)] denotes the mathematical expectation of f(x) for x under the distribution P; x_r is a real sample and x_f a generated sample; P_r denotes the real data distribution and P_f the generated data distribution; y_{x_r} and y_{x_f} denote the categories of x_r and x_f respectively; x̂ denotes data points on the line connecting the distributions P_r and P_f, and P_{x̂} denotes the distribution of x̂; L2 denotes the L2 regularization penalty; λ1, λ2, λ3, λ4 and λ5 are hyperparameters controlling the different parts of the loss function; n represents the number of samples, i.e. the dimension of y; y_i and class_i represent the network output for the i-th sample and its true category.
The training mode of WGAN-GP-C is designed to better meet the requirement of generating data by category: the data distribution is trained first so that the generated data approaches the real data, and classification is then trained while the distribution is refined, searching for the boundary information of the generated data. That is, the parameters of Equations 6 and 7 are set as follows: for the first 50,000 epochs, λ2 = 10 and λ1 = λ3 = λ4 = λ5 = 0, and training works on the data distribution only; in the training after 50,000 epochs, λ1 = 1, λ2 = 10, λ3 = 0, λ4 = 10 and λ5 = 0.004, and training continues.
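The gradient penalty term penalizes the discriminator gradient at points sampled on the line between real and generated samples. A minimal pure-Python sketch of both pieces, using a finite-difference gradient on a toy discriminator purely for illustration (a real implementation would rely on a deep-learning framework's autodiff; all names here are hypothetical):

```python
import math
import random

def interpolate(x_real, x_fake, rng=random):
    """Sample a point on the line between a real and a generated sample."""
    eps = rng.random()
    return [eps * r + (1 - eps) * f for r, f in zip(x_real, x_fake)]

def gradient_penalty(D, x_hat, h=1e-5):
    """(||grad D(x_hat)||_2 - 1)^2, with the gradient approximated by
    central finite differences -- illustrative only."""
    grad = []
    for i in range(len(x_hat)):
        xp = list(x_hat); xp[i] += h
        xm = list(x_hat); xm[i] -= h
        grad.append((D(xp) - D(xm)) / (2 * h))
    norm = math.sqrt(sum(g * g for g in grad))
    return (norm - 1.0) ** 2
```

The penalty drives the discriminator toward unit gradient norm along the real-to-fake line, which is what keeps WGAN-GP training stable.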
When the trained WGAN-GP-C is used to generate samples, the design screens the generated sample data. Because the discriminator is considered to perform well, the screening process is tied to the discriminator. The data screening steps are: (1) judging the category of each generated sample with the discriminator and keeping the generated samples whose input label matches the discriminator's output category; (2) for each sample kept in step (1), deciding whether to retain it with probability a, where a is the accuracy of the discriminator on the real-data test set.
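The two-stage screening rule can be sketched as follows; `discriminator_class` stands in for the class output of the trained discriminator and, like the other names, is hypothetical:

```python
import random

def screen_generated(samples, labels, discriminator_class, acc, rng=random):
    """Two-stage screening of generated samples.

    samples             -- generated samples
    labels              -- the class labels fed to the generator
    discriminator_class -- function: sample -> predicted class
    acc                 -- accuracy a of the discriminator on the real test set
    """
    kept = []
    for s, lab in zip(samples, labels):
        if discriminator_class(s) != lab:     # step (1): label must match
            continue
        if rng.random() < acc:                # step (2): keep with probability a
            kept.append((s, lab))
    return kept
```

Tying the keep probability to the discriminator's real-data accuracy means a weaker discriminator admits fewer generated samples.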
(2.2) The quality of the generated samples is evaluated with the Maximum Mean Discrepancy (MMD), which measures the distance between the training data distribution and the generated data distribution in a Hilbert space; the smaller the value, the closer the two data distributions.
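As a minimal illustration, the (squared, biased) MMD between two one-dimensional sample sets can be computed as below; the Gaussian kernel and its bandwidth are assumptions for the sketch, not taken from the patent:

```python
import math

def mmd(xs, ys, sigma=1.0):
    """Squared maximum mean discrepancy between two 1-D sample sets
    under a Gaussian kernel. Smaller values mean closer distributions."""
    def k(a, b):
        return math.exp(-((a - b) ** 2) / (2 * sigma ** 2))
    def avg(us, vs):
        # mean kernel value over all pairs
        return sum(k(u, v) for u in us for v in vs) / (len(us) * len(vs))
    return avg(xs, xs) + avg(ys, ys) - 2 * avg(xs, ys)
```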
In the step (3), the step (c),
(3.1) The metric network model, designed with metric learning and a KNN-based classification algorithm, is the second step of the invention's solution to the small-sample problem for mechanical pumps. Metric Learning is a space mapping method that can learn a more discriminative feature space (embedding space), making it more effective than plain feature extraction, with clear benefits on small-sample problems.
As shown in fig. 3, the Network model design based on metric learning of the present invention is designed based on 1DCNN and a residual structure, and a spatial adaptive structure is proposed with reference to a Similarity Condition Embedding Network, so that a sample automatically selects a more suitable spatial position according to its own data property when performing spatial mapping. It was named as metric-net. The metric-net is mainly composed of a convolutional neural network named Vector-CNN and a spatial adaptive structure. The Vector-CNN is realized based on 1DCNN, a plurality of residual error structures and pooling structures are used for achieving a better convergence effect, and one-dimensional data passes through the residual error structures and pooling structures to obtain a Vector, namely a Vector.
In a space adaptive structure, Vector is used for processing, and a network uses a learnable matrix mask in the part, wherein the matrix mask comprises n arrays with the same dimension as that of the Vector, namely M1, M2, … and Mn, and the function of the matrix mask is to automatically learn n space positions which can effectively represent the vibration data characteristics of the mechanical pump; in addition, the Vector outputs n data, namely W1, W2, … and Wn, through a small network, and aims to give different weights to the n spatial positions according to Vector features so as to reasonably recombine to realize better spatial metric mapping.
In the calculation of the spatial adaptive part, each Mi (i = 1, 2, …, n) is normalized by the L2 norm and combined with the Vector by a Hadamard product to map the sample to the different spatial positions; each result is then multiplied by Wi (i = 1, 2, …, n), and the n resulting vectors are accumulated dimension-wise to obtain the final Embedding. metric-net uses a learnable mask without seeking sparsity of the mask, thereby realizing adaptive search of the spatial positions.
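The mask/weight recombination described above can be sketched in pure Python. Here the masks and weights are plain lists for illustration; in the actual network the masks are learnable parameters and the weights come from a small sub-network conditioned on the Vector:

```python
import math

def spatial_adaptive(vector, masks, weights):
    """Spatial adaptive mapping: each mask Mi is L2-normalized, combined
    element-wise (Hadamard product) with the input Vector, scaled by its
    weight Wi, and the n results are summed dimension-wise to give the
    final Embedding."""
    dim = len(vector)
    emb = [0.0] * dim
    for m, w in zip(masks, weights):
        norm = math.sqrt(sum(v * v for v in m)) or 1.0
        m_n = [v / norm for v in m]                     # L2-normalize Mi
        mapped = [a * b for a, b in zip(m_n, vector)]   # Hadamard product with Vector
        emb = [e + w * v for e, v in zip(emb, mapped)]  # weighted accumulation
    return emb
```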
(3.2) To train the metric network, the invention defines a triplet {x_a, x_p, x_n}, where x_a denotes the anchor and x_p and x_n denote the positive and negative samples respectively; E_a, E_p and E_n denote their outputs under the metric-network mapping. The triplet loss is then given by equation 9. The invention applies L2 regularization to the final Embedding, denoted l_E, so the loss function of the network is given by equation 10.
l_triplet(x_a, x_p, x_n) = max{0, d(E_a, E_p) − d(E_a, E_n) + μ}    (9)
loss = l_triplet(x_a, x_p, x_n) + λ·l_E    (10)
where d(E_a, E_p) denotes the distance between E_a and E_p, μ is the margin hyperparameter, and λ is a hyperparameter controlling the degree of the regularization penalty.
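Equations 9 and 10 can be sketched numerically as follows (NumPy, Euclidean distance assumed for d). The exact form of the regularization term l_E is an assumption: the text only states that the final Embedding is L2-regularized, so here it is taken as the summed squared norms of the three embeddings.

```python
import numpy as np

def triplet_loss(e_a, e_p, e_n, mu=0.6, lam=5e-3):
    """Equation 9 (margin triplet loss) plus equation 10's L2 penalty on the
    embeddings, scaled by lambda (assumed form of l_E)."""
    d_ap = np.linalg.norm(e_a - e_p)          # d(E_a, E_p)
    d_an = np.linalg.norm(e_a - e_n)          # d(E_a, E_n)
    l_trip = max(0.0, d_ap - d_an + mu)
    l_e = (e_a**2).sum() + (e_p**2).sum() + (e_n**2).sum()
    return l_trip + lam * l_e

# anchor and positive coincide, negative is far away: the triplet term vanishes
e_a = np.zeros(2); e_p = np.zeros(2); e_n = np.array([10.0, 0.0])
print(triplet_loss(e_a, e_p, e_n))  # 5e-3 * 100 = 0.5
```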
When the triplet data set is constructed, it follows the principle that 2 groups of same-category data are matched with 9 groups of different-category data. To obtain good convergence, a small learning rate is used during training: an Adam optimizer with learning rate lr = 5×10⁻⁵ and a batch size of 512. The parameters of equations 9 and 10 are set to μ = 0.6 and λ = 5e-3.
(3.3) For data to be diagnosed, after data preprocessing the trained network maps the data into a vector; a nearest-neighbour search is then performed in the surrounding embedding space, and the category is decided from the categories of the surrounding vectors.
As shown in FIG. 4, the present invention uses the KNN algorithm for the nearest-neighbour search, traverses the parameter k over the range [1, 100], and then tests the performance of the KNN algorithm on the test set to find the optimal k value and the corresponding accuracy.
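The k-sweep can be sketched with scikit-learn as follows. The synthetic 32-dimensional clusters stand in for the metric-net embeddings (which are not reproducible here), and weights="distance" matches the weighted-voting variant of KNN used later in step (4).

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# synthetic stand-in for metric-net embeddings: 3 well-separated classes
rng = np.random.default_rng(0)
X_train = np.vstack([rng.normal(c, 0.3, (40, 32)) for c in range(3)])
y_train = np.repeat(np.arange(3), 40)
X_test = np.vstack([rng.normal(c, 0.3, (10, 32)) for c in range(3)])
y_test = np.repeat(np.arange(3), 10)

best_k, best_acc = 1, 0.0
for k in range(1, 101):  # traverse k over [1, 100]
    knn = KNeighborsClassifier(n_neighbors=k, weights="distance")
    acc = knn.fit(X_train, y_train).score(X_test, y_test)
    if acc > best_acc:
        best_k, best_acc = k, acc
print(best_k, best_acc)
```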
In step (4),
(4.1) The main purpose of model optimization is to reduce the memory footprint of the model and increase its running speed while preserving performance. Quantization is the principal means by which the invention processes the metric network: converting 32-bit operations into 8-bit operations reduces model size and speeds up computation, at the cost of some error. The invention adopts a scheme combining module fusion with Post-Training Static Quantization. For the trained network model, after the quantization positions are specified, the parameters of the convolutional layers, bn layers (Batch Normalization) and activation layers are fused, the quantization mode is determined, inference is run on the training set, and reasonable quantization parameters are obtained by channel-by-channel observation.
(4.2) KNN is implemented with a tree structure to increase its search speed. The linear-scan form of the KNN algorithm is slow; to improve real-time performance, a tree structure is used to accelerate the computation. The commonly used Kd-tree saves online search time, but it partitions data with hyperrectangles and its search speed degrades when the data dimension is high. The Ball-tree improves on this by partitioning data into a series of nested hyperspheres and pruning during search with the triangle inequality (comparing the sum of two sides against the third side), thereby improving computational efficiency.
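A minimal Ball-tree neighbour query with scikit-learn; the random 32-dimensional vectors here stand in for the embedding library, 32 being a dimension at which Kd-tree pruning starts to degrade.

```python
import numpy as np
from sklearn.neighbors import BallTree

rng = np.random.default_rng(1)
X = rng.standard_normal((1000, 32))  # stand-in for the 32-dim embedding library

tree = BallTree(X)                   # partitions the data into nested hyperspheres
dist, idx = tree.query(X[:5], k=3)   # 3 nearest neighbours of the first 5 points
print(idx[:, 0])                     # each query point is its own nearest neighbour
```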
(4.3) The training-set data are screened to further increase the Ball-tree search speed, reduce its tree-building time and lower memory usage. Although the tree structure searches efficiently, building the tree is complex and time-consuming, so reducing the training data without affecting performance further improves both tree-building and search efficiency. The method obtains the final classification result with weighted voting in the KNN algorithm; training samples that effectively support classification are close to their own class and far from other classes, giving the classes better separability.
Accordingly, the Importance Factor (IF) of a sample is defined as the reciprocal of the sum of squared Euclidean distances to same-class samples, plus the sum of squared Euclidean distances to different-class samples after a size adjustment, where the size adjustment is based on the means of the two sums, as shown in equation 11. Considering that the size distributions of the importance factors differ between classes, sorting and data deletion are carried out separately per data class.
[Equation 11 and its auxiliary definitions are rendered as images in the original publication.]
where x_in denotes the n-th sample of class i in the training samples, N_i denotes the number of class-i samples, C denotes the number of sample classes, and d(x_in, x_im) denotes the Euclidean distance between x_in and x_im. The maxima and minima of the intra-class and inter-class distance-sum sets (the set symbols are rendered as images in the original publication) are taken after deleting the larger and smaller values of each set in a certain proportion; mean(·) denotes the average of all values in a set, and inf denotes a small constant preventing the denominator from being zero.
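Since equation 11 appears only as an image, the following NumPy sketch encodes one plausible reading of the definition: the reciprocal of the summed squared intra-class distances, plus the summed squared inter-class distances rescaled by the ratio of the two sums' means ("the size adjustment refers to the mean"). Both the exact rescaling and the omission of the 0.15 trimming of extreme values are assumptions.

```python
import numpy as np

def importance_factors(X, y, eps=1e-12):
    """Plausible reading of the Importance Factor: samples close to their own
    class (small intra sum -> large reciprocal) and far from other classes
    (large rescaled inter sum) receive large IF values."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # pairwise squared distances
    same = y[:, None] == y[None, :]
    intra = np.where(same, d2, 0.0).sum(1)   # summed squared distance to same class
    inter = np.where(~same, d2, 0.0).sum(1)  # summed squared distance to other classes
    scale = intra.mean() / (inter.mean() + eps)  # size adjustment via the means
    return 1.0 / (intra + eps) + scale * inter

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(c, 0.2, (20, 8)) for c in range(3)])
y = np.repeat(np.arange(3), 20)
IF = importance_factors(X, y)
# per-class ascending sort, then drop the lowest-IF fraction (here 4 of 20 per class)
keep = np.concatenate([np.where(y == c)[0][np.argsort(IF[y == c])][4:] for c in range(3)])
```

Per the text, sorting and deletion are done class by class, which the `keep` construction illustrates.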
The invention performs multiple comparative experiments on different parameter settings of the algorithm design so as to select better parameters and broaden the practicality and generality of the overall algorithm.
The invention relates to a fault diagnosis method for a small sample of a mechanical pump based on WGAN-GP-C and metric learning, which comprises the following specific steps:
s1, designing data preprocessing;
Read the original data and cut it into segments of length 2015; filter the data with a Normalized Least Mean Square (NLMS) adaptive filter of order 16 and step factor 0.05, and remove outliers with a median filter of kernel size 5. Compute the mean and variance of each dimension of the data, and standardize the data with the Z-score method.
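The preprocessing chain can be sketched as follows. The NLMS filter is wired here as an adaptive line enhancer that predicts each sample from its recent past — the text does not specify how the reference signal d(k) is obtained, so this self-referencing arrangement is an assumption — followed by a kernel-5 median filter and Z-score standardization.

```python
import numpy as np

def nlms(x, d, order=16, mu=0.05, alpha=1e-4):
    """Normalized LMS adaptive filter: order 16, step factor 0.05, with
    alpha keeping the normalizing denominator away from zero."""
    w = np.zeros(order)
    y = np.zeros(len(x))
    for k in range(order, len(x)):
        xk = x[k - order:k][::-1]
        y[k] = w @ xk
        e = d[k] - y[k]
        w += (mu / (alpha + xk @ xk)) * e * xk  # NLMS weight update
    return y

def medfilt5(s):
    """Kernel-5 median filter via a sliding window (edge-padded)."""
    win = np.lib.stride_tricks.sliding_window_view(np.pad(s, 2, mode="edge"), 5)
    return np.median(win, axis=1)

rng = np.random.default_rng(0)
t = np.linspace(0, 4 * np.pi, 2015)            # one intercepted segment of length 2015
clean = np.sin(t)
noisy = clean + rng.normal(0, 0.3, t.size)
enhanced = nlms(noisy, noisy)                  # line enhancer: predict x[k] from its past
smoothed = medfilt5(enhanced)                  # kernel-5 median filter for outliers
z = (smoothed - smoothed.mean()) / smoothed.std()  # Z-score standardization
```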
The invention places vibration sensors on a mechanical pump test bed, transfers the time-series data after conversion to PC software, and extracts part of the data to verify the algorithm. The data comprise 10 classes, where class 0 is normal and classes 1–9 are faults. After processing, there are 150 samples per class, of which 90 are used for model building and 60 for testing. The experimental computer is configured with an 8-core CPU (i7-6700), 8 GB of RAM and a GTX 1080 graphics card; it should be noted that model running times are measured on the CPU.
The parameters and experimental settings of this step are as follows: (1) Count the data volume of all category samples and unify the format. (2) Set the number of NLMS sampling points to 2015, the filter order to 16, the step factor to μ = 0.05 and the adjustment factor to α = 0.0001, and process all data with NLMS. (3) Set the median-filter kernel to 5 and process the sample data. (4) Standardize the data with the Z-score method, recording the data-set means and standard deviations.
A composite signal d(n) = 0.6 sin(5x) + 0.3 cos(10x) + 0.1 sin(20x) is constructed, and Gaussian noise with mean 0 and standard deviation 0.3 is added to form a mixed signal. The mixed signal is processed respectively with LMS, with NLMS, and with the scheme combining NLMS and median filtering; the processing effect is evaluated by Root Mean Square Error (RMSE) and running time, and the performance of the different filtering schemes is shown in Table 1.
TABLE 1
[Table 1 is rendered as an image in the original publication.]
To show the influence of the filtering scheme and Z-score data standardization on model performance, four data variants — unprocessed, standardized only, filtered only, and both filtered and standardized — were used to train a BP neural network (Back-Propagation neural network) with 4 hidden layers of 1000, 500, 100 and 50 neurons for 1500 epochs; the test-set accuracies are shown in Table 2.
TABLE 2
[Table 2 is rendered as an image in the original publication.]
As can be seen from Table 2, against the data background of the mechanical pump set, the filtering scheme designed by the invention, combined with Z-score data standardization, clearly improves the expressive capacity of the data and benefits model building.
S2, enhancing design based on WGAN-GP-C data;
The WGAN-GP-C model is designed based on WGAN-GP and the residual-structure idea: category guidance information is added to the random-noise input of the generator, and a branch that discriminates the data category is introduced at the output of the discriminator. In WGAN-GP-C training, the first 50,000 epochs train only the data distribution; thereafter the data distribution and data classification are trained simultaneously, so that data can be generated by class. The quality of the generated data is evaluated with the Maximum Mean Discrepancy (MMD) and the network is adjusted accordingly. The data-enhancement process follows a data-screening strategy.
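A minimal MMD computation with an RBF kernel for scoring how closely generated samples match real ones; the kernel choice and its bandwidth are illustrative assumptions, since the text does not state which kernel the evaluation used.

```python
import numpy as np

def mmd_rbf(X, Y, gamma=1 / 16):
    """Biased estimate of squared Maximum Mean Discrepancy with an RBF kernel.
    gamma ~ 1/(2*dim) is a simple bandwidth heuristic (assumption)."""
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()

rng = np.random.default_rng(0)
real = rng.normal(0, 1, (200, 8))
close = rng.normal(0, 1, (200, 8))   # generator matching the real distribution
far = rng.normal(3, 1, (200, 8))     # poorly matched generator
print(mmd_rbf(real, close) < mmd_rbf(real, far))  # True: lower MMD = better match
```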
The experimental procedure is as follows: (1) Process the samples to make them better suited to network training. (2) Set the random-noise dimension to 128 and learn with an Adam optimizer at an initial learning rate lr = 1×10⁻⁴, reduced by a factor of 10 at 100,000 and at 150,000 epochs; the generator is updated every 6 steps and the batch size is set to 128. (3) Set the parameters of equations 6 and 7: for the first 50,000 epochs, λ2 = 10 and λ1 = λ3 = λ4 = λ5 = 0, and begin training on the imported data. (4) In training after 50,000 epochs, set λ1 = 1, λ2 = 10, λ3 = 0, λ4 = 10 and λ5 = 0.004, and continue training. (5) Select a well-behaved model by the EM distance, evaluate the data with MMD, adjust parameters appropriately, and screen the generated data to obtain the enhanced data set.
As shown in fig. 5, for λ3 and λ4 of equation 7 the invention carried out two experiments comparing different settings; meanwhile, to demonstrate the effect of WGAN-GP-C, a comparison experiment was carried out with an Auxiliary Classifier GAN (ACGAN). To show the effect of the data-screening scheme, two groups of data were generated, one without data screening (no filter) and one with data screening (filter), and the final generated samples were evaluated with MMD. The data-evaluation results under the different models and processing modes are shown in Table 3: the quality of the data generated by WGAN-GP-C is far higher than that of ACGAN, the parameter-setting scheme of the invention gives the model better performance, and the data-screening scheme achieves a further improvement in data quality.
TABLE 3
[Table 3 is rendered as an image in the original publication.]
Experiments show that the classification accuracy of the discriminator on the test set is 83.33%, and data enhancement is based on this model. To test the enhancement effect, a BP neural network with 4 hidden layers of 1000, 500, 100 and 50 neurons was trained on data sets enhanced by different multiples and its test-set accuracy measured. The accuracy improves fastest at an enhancement multiple of 1; therefore, to avoid as far as possible introducing data characteristics that do not belong to the real data, the enhancement multiple is set to 1, and the corresponding enhanced data set is generated for the subsequent algorithm design.
S3, designing a classification algorithm based on metric learning and KNN;
A metric network model is built based on the residual idea, and a spatial adaptive structure is proposed so that each sample finds a more suitable spatial position according to its own properties during spatial mapping. The data-mapping output dimension is 32. A triplet data set is constructed, and the network model is trained with an Adam optimizer at learning rate lr = 5×10⁻⁵. When constructing the classification-algorithm model, the KNN algorithm traverses the parameter k over the range [1, 100] and then tests the performance of the KNN algorithm on the test set to find the optimal k value and the corresponding accuracy.
The specific experimental steps are as follows: (1) Construct the triplet-format data set from the training set for metric-net training. (2) Learn with an Adam optimizer at learning rate lr = 5×10⁻⁵, with the batch size set to 512. (3) Set the parameters of equations 9 and 10, with μ = 0.6 and λ = 5e-3. (4) Convert the data set into a vector library with the trained network, traverse the parameter k over [1, 100] with the KNN algorithm, and find the optimal k value and the corresponding accuracy. (5) Test the several trained models in the manner of (4) to select a better metric network model.
As shown in FIG. 6, the present invention performed comparative experiments on different parameter settings to achieve a better model design. The margin μ of equation 9 was compared at 0.2, 0.6 and 1, with validation on both the original and the enhanced data set. The data dimension of the Embedding output by the model was set to 16 and 32 respectively and verified on the original data set. The spatial adaptive structure adds only a small amount of computation to the algorithm model; to verify its function, the metric-learning experiments were run both on metric-net and on Vector-CNN without the spatial adaptive structure, with the accuracy results shown in rows (1)–(14) of Table 4. To demonstrate that metric learning better suits the small-sample setting, Vector-CNN was converted into a classification network and trained with multi-class cross-entropy on the original and enhanced data sets; the results are shown in rows (15) and (16) of Table 4.
TABLE 4
[Table 4 is rendered as an image in the original publication.]
As can be seen from rows (1), (3), (8) and (10) of Table 4, on the data set of the invention, setting the Embedding output dimension to 32 lets the model achieve better performance. In theory, an output dimension of 32 retains more data information, making the data more expressive and better suited to the classification task.
From the accuracies under the different settings of μ, μ = 0.6 performs best under the same conditions. Compared with μ = 0.2, it pulls different-category data further apart, making the data boundaries more obvious and better suiting the complex task of mechanical-equipment fault diagnosis; compared with μ = 1, the required separation between categories is smaller and easier for the network to reach in training. Although the model accuracy is 99.33% in both cases (13) and (14), case (13) performs more consistently and favourably across training and final model selection. The invention therefore finally selects μ = 0.6.
Comparing rows (1)–(7) with rows (8)–(14) of Table 4 shows that, under the metric-learning approach, the spatial adaptive structure lets the data find more suitable spatial locations according to their own behaviour, improving the performance of the algorithm. Meanwhile, to rule out the possibility that the spatial adaptive result merely reflects the high performance of a large, computation-heavy component, Vector-CNN and the spatial adaptive structure are compared on model size, computation (FLOPs) and CPU time, with the results shown in Table 5. The size ratio of the Vector-CNN model to the spatial adaptive structure is 75:1, the MFLOPs ratio 6663:1 and the running-time ratio 3:1, so under the background of the invention the spatial adaptive structure obtains a large performance gain at small cost, demonstrating the rationality of the design.
Comparing rows (1)–(14) with rows (15) and (16) of Table 4 shows that the metric-learning design is better suited to mechanical pump fault diagnosis under a small-sample background, demonstrating the reasonableness and effectiveness of the scheme design.
Comparing model performance on the original and enhanced data sets shows that data enhancement raises the model accuracy from 98.16% to the 99.33% of row (10) in Table 4, a clear performance gain from the data-enhancement scheme. This also demonstrates that the two-step solution designed for the small-sample problem — data enhancement based on WGAN-GP-C, and a classification algorithm based on metric learning and KNN — performs well on the small-sample fault-diagnosis task of the mechanical pump. The comparison of Vector-CNN and the spatial adaptive structure is shown in Table 5.
TABLE 5
[Table 5 is rendered as an image in the original publication.]
S4, optimizing the model;
Quantization is performed on the metric network: symmetric quantization is adopted for the convolution weights, biases are not quantized, and asymmetric quantization is adopted for the activation layers, converting 32-bit operations into 8-bit operations. KNN is implemented with a tree structure: the search times of linear scanning, Kd-tree and Ball-tree are compared, and the Ball-tree is selected to implement the KNN algorithm. The training-set data are screened: the concept of an Importance Factor (IF) of the training data is proposed, all samples are sorted by IF from small to large, and, on the premise of preserving algorithm accuracy, a portion of samples with small IF are removed, realizing data screening, speeding up Ball-tree construction and improving search speed.
The model optimization part is divided into 3 steps, and the steps are as follows:
Quantization of metric-net: (1) Analyse the network characteristics. The CPU-time ratio of Vector-CNN to the spatial adaptive structure is about 3:1, the model-size ratio about 75:1 and the computation (FLOPs) ratio about 6663:1, so quantization is applied to the Vector-CNN part. (2) Merge some adjacent modules of the network. (3) Adopt symmetric quantization for the convolution weights, leave biases unquantized, and adopt asymmetric quantization for the activation layers. (4) Insert tensor observers into the network and run the training set through it to perform calibration. (5) Convert the network model to its quantized form and test the performance of the new model together with the KNN algorithm. As shown in fig. 7.
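The weight/activation treatment in step (3) can be illustrated with a toy per-tensor quantizer — symmetric int8 for weights (zero point fixed at 0), asymmetric uint8 for activations (scale and zero point from the calibrated min/max). This is a schematic of the arithmetic only, not the actual implementation of any quantization engine.

```python
import numpy as np

def quantize_symmetric(w, bits=8):
    """Symmetric per-tensor quantization for convolution weights:
    zero point fixed at 0, scale from the maximum absolute value."""
    qmax = 2 ** (bits - 1) - 1                      # 127 for int8
    scale = np.abs(w).max() / qmax
    q = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def quantize_asymmetric(a, bits=8):
    """Asymmetric quantization for activations: scale and zero point
    derived from the observed (calibrated) min/max range."""
    qmax = 2 ** bits - 1                            # 255 for uint8
    scale = (a.max() - a.min()) / qmax
    zero_point = -np.round(a.min() / scale)         # maps a.min() to 0
    q = np.clip(np.round(a / scale) + zero_point, 0, qmax).astype(np.uint8)
    return q, scale, zero_point

rng = np.random.default_rng(0)
w = rng.normal(0, 0.1, 64)                          # toy convolution weights
q, s = quantize_symmetric(w)
print(np.abs(w - q.astype(np.float64) * s).max() < s)  # dequantization error < one step
```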
Accelerating KNN with a tree structure: (1) Replace linear scanning with Kd-tree and Ball-tree implementations. (2) Compare the search times of linear scanning, Kd-tree and Ball-tree. (3) Analyse the tree-building times of Kd-tree and Ball-tree. As shown in fig. 8.
Screening the vector-library data by importance factors: (1) Determine the intra-class and inter-class distance-sum sets (the set symbols are rendered as images in the original publication), delete their larger and smaller values at a ratio of 0.15, compute the Importance Factor (IF) of each sample in the vector library, and sort by category from small to large. (2) Take the data-removal ratio over [0, 0.99] at intervals of 0.01. (3) Remove data from each class at that ratio, test the algorithm accuracy, search time and tree-building time, and select the largest removal ratio that preserves accuracy.
Vector-CNN of metric-net is quantized according to the above, and the optimization effect is measured by the final model's size, computation (MFLOPs), running time and classification accuracy. For comparison, the invention also processes Vector-CNN with Group Convolution and Depthwise Separable Convolution; the results are shown in Table 6, where GC denotes Group Convolution, DPC denotes Depthwise Separable Convolution, and the model size, MFLOPs and time in the table refer to Vector-CNN.
TABLE 6
[Table 6 is rendered as an image in the original publication.]
Compared with group convolution and depthwise separable convolution, quantization reduces model volume with less impact on model accuracy. Lightweight network structures such as group convolution and depthwise separable convolution are not well supported on a CPU, so their running time increases, whereas quantization effectively reduces model size while keeping the running time essentially unchanged.
The invention builds the classification model with the above steps and compares its running time and accuracy against other common algorithms on the preprocessed data set. The compared algorithms include Naive Bayes classifiers, Decision Trees, Support Vector Machines, Ball-tree, a BP neural network (hidden neurons 1000, 500, 100 and 50), classification implemented with Vector-CNN, and others. The classifier performance comparison is shown in Table 7.
TABLE 7
[Table 7 is rendered as an image in the original publication.]
The comparison results in Table 7 show that the mechanical pump fault data used in the invention have a clear small-sample character. Although the scheme designed for this character is relatively complex to train, its accuracy holds an obvious advantage over the common classification models, the running speed after model optimization is respectable, and the overall performance is good.
To better verify the effect of the method under different amounts of training data, the per-class training-set size is set to 10, 20, 30 and 60 respectively; the model performance is tested and compared against the SVM and the Vector-CNN classification, the better performers in Table 7, with the accuracy results shown in Table 8.
TABLE 8
[Table 8 is rendered as an image in the original publication.]
It can be seen that the algorithm scheme proposed for the small-sample background of mechanical pumps performs well on the data set of the invention and achieves higher accuracy than the common methods under the different training-data volumes, demonstrating the validity of the scheme.
The present applicant has described and illustrated embodiments of the present invention in detail with reference to the accompanying drawings, but it should be understood by those skilled in the art that the above embodiments are merely preferred embodiments of the present invention, and the detailed description is only for the purpose of helping the reader to better understand the spirit of the present invention, and not for limiting the scope of the present invention, and on the contrary, any improvement or modification made based on the spirit of the present invention should fall within the scope of the present invention.

Claims (10)

1. A fault diagnosis method for small samples of a mechanical pump based on WGAN-GP-C and metric learning is characterized by comprising the following steps:
(1) reading time sequence data and carrying out data preprocessing;
(2) constructing a WGAN-GP-C model, and performing data enhancement based on the WGAN-GP-C model;
(3) constructing a fault diagnosis model based on metric learning and KNN classification algorithm;
(4) carrying out model optimization;
(5) and utilizing the optimized model to carry out fault diagnosis on the data which is preprocessed and subjected to data enhancement.
2. The WGAN-GP-C and metric learning based mechanical pump small sample fault diagnosis method according to claim 1, wherein the step (1) specifically comprises:
(1.1) reading time sequence data, intercepting according to a set length, and filtering by using a normalized least mean square adaptive filter (NLMS);
(1.2) removing an abnormal value by using median filtering;
(1.3) counting the mean and variance of each dimension of the data, and carrying out data standardization by adopting a Z-score method.
3. The WGAN-GP-C and metric learning based mechanical pump small sample fault diagnosis method of claim 2, wherein an NLMS filtering algorithm is as follows:
y(k) = wᵀ(k)x(k)
e(k) = d(k) − y(k)
w(k+1) = w(k) + [μ / (α + ‖x(k)‖²)]·e(k)·x(k)
where x(k) is the input signal, y(k) is the output signal, d(k) is the reference signal, w(k) is the filter coefficient vector, α is an adjustment factor preventing the denominator from being zero, ‖x(k)‖² is the square of the Euclidean norm of x(k), and μ is the step-size parameter.
4. The WGAN-GP-C and metric learning based mechanical pump small sample fault diagnosis method according to claim 1, wherein the step (2) specifically comprises:
(2.1) constructing a WGAN-GP-C model based on the WGAN-GP and a residual error structure, adding a label for class guidance when random noise is input into a generator, and judging the class of the data at the output end of a discriminator to generate the data according to the class;
(2.2) training the WGAN-GP-C model: the generated samples and real samples are fed into the discriminator; the data distribution is trained first, drawing the generated samples close to the real samples, and then classification is trained while the distribution is further refined;
(2.3) when the WGAN-GP-C which is trained is used for generating a sample, screening the generated sample data;
(2.4) the quality of the generated sample data is evaluated with the Maximum Mean Discrepancy (MMD), and the model is adjusted accordingly.
5. The WGAN-GP-C and metric learning based mechanical pump small sample fault diagnosis method according to claim 4, wherein the loss function of the generator G, the loss function of the discriminator D and the loss function of the data category are as follows:
[The three loss functions are rendered as images in the original publication.]
where E_{x∼P}[f(x)] denotes the mathematical expectation of f(x) for x under the distribution P, x_r is a real sample, x_f is a generated sample, P_r denotes the real data distribution and P_f the generated data distribution; the class labels of x_r and x_f and the data distribution of the labelled real data are as rendered in the original images; x̂ denotes data points on the line between distributions P_r and P_f, with P_x̂ its data distribution; L2 denotes the L2 regularization penalty; λ1, λ2, λ3, λ4 and λ5 are hyperparameters controlling different parts of the loss function; n denotes the number of samples, i.e. the dimension of y; and y_i, class_i denote the network output and true category of the i-th sample.
6. The WGAN-GP-C and metric learning based mechanical pump small sample fault diagnosis method according to claim 4, wherein the data screening step comprises: performing category judgment on the generated samples with the discriminator and retaining those generated samples whose input label matches the output category of the discriminator; and retaining each remaining sample with a probability a, where a is the accuracy of the discriminator on the real-data test set.
7. The WGAN-GP-C and metric learning based mechanical pump small sample fault diagnosis method according to claim 1, wherein the step (3) specifically comprises:
(3.1) constructing a measurement network model, which comprises a convolutional neural network Vector-CNN and a space adaptive structure, wherein sample data obtains a Vector through the convolutional neural network, a space position matrix is constructed in the space adaptive structure, and then different weights are given to the space position according to Vector characteristics so as to realize space measurement mapping;
(3.2) constructing a triplet data set, and training a measurement network model by adopting an Adam optimizer;
and (3.3) constructing a classification algorithm model, searching an optimal k value and corresponding accuracy by using a KNN algorithm, and testing the metric network model obtained by training.
8. The WGAN-GP-C and metric learning based mechanical pump small sample fault diagnosis method according to claim 7, wherein the loss functions used with the spatial adaptive structure are as follows:
l_triplet(x_a, x_p, x_n) = max{0, d(E_a, E_p) − d(E_a, E_n) + μ}
loss = l_triplet(x_a, x_p, x_n) + λ·l_E
where d(E_a, E_p) denotes the distance between E_a and E_p, μ is a hyperparameter, and λ is a hyperparameter controlling the penalty degree; x_a denotes the anchor, x_p and x_n denote the positive and negative samples respectively, and E_a, E_p and E_n denote their metric-network mapping outputs; l_E denotes the L2 regularization term.
9. The WGAN-GP-C and metric learning based mechanical pump small sample fault diagnosis method according to claim 1, wherein the step (4) specifically comprises:
(4.1) quantizing for a metric network; specifically, quantization is carried out on a Vector-CNN of a convolutional neural network, and adjacent network modules are fused; specifically, symmetric quantization is adopted for the weight of convolution, bias is not quantized, and asymmetric quantization is adopted for the active layer;
(4.2) accelerating the KNN algorithm by adopting a tree structure Ball-tree;
(4.3) screening the training set data, proposing important factors IF of the training set data, and sequencing all samples from small to large according to the important factors; and on the premise of ensuring the accuracy of the algorithm, a part of samples with smaller IF are removed.
10. The WGAN-GP-C and metric learning based mechanical pump small sample fault diagnosis method of claim 9, wherein the importance factor of a sample is defined as the reciprocal of the sum of squared Euclidean distances to same-class samples, plus the sum of squared Euclidean distances to different-class samples after a size adjustment; considering that the size distributions of the importance factors differ between sample classes, the sorting and data deletion are carried out separately per data class;
[three equation images defining the importance factor; not reproduced in this text record]

wherein x_in denotes the n-th sample of the i-th class of training samples, N_i denotes the number of class-i samples, C denotes the number of sample classes, and d(x_in, x_im) denotes the Euclidean distance between x_in and x_im; the remaining symbols in the equations denote, respectively, the maximum value of the within-class distance set, the minimum value of the between-class distance set, and the sets obtained from these after deleting the largest and smallest values in a certain proportion; mean() denotes the average of all values in a set, and inf denotes a small constant preventing the denominator from being zero.
CN202111183493.3A 2021-10-11 2021-10-11 Mechanical pump small sample fault diagnosis method based on WGAN-GP-C and metric learning Pending CN114037001A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111183493.3A CN114037001A (en) 2021-10-11 2021-10-11 Mechanical pump small sample fault diagnosis method based on WGAN-GP-C and metric learning

Publications (1)

Publication Number Publication Date
CN114037001A true CN114037001A (en) 2022-02-11

Family

ID=80134816



Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115204031A (en) * 2022-05-13 2022-10-18 哈尔滨工业大学 Vibration value prediction method for aircraft engine assembly process
CN116047427A (en) * 2023-03-29 2023-05-02 西安电子科技大学 Small sample radar active interference identification method


Similar Documents

Publication Publication Date Title
CN112784881B (en) Network abnormal flow detection method, model and system
CN111181939B (en) Network intrusion detection method and device based on ensemble learning
CN111832608B (en) Iron spectrum image multi-abrasive particle identification method based on single-stage detection model yolov3
CN112039903B (en) Network security situation assessment method based on deep self-coding neural network model
CN109581339B (en) Sonar identification method based on automatic adjustment self-coding network of brainstorming storm
CN114037001A (en) Mechanical pump small sample fault diagnosis method based on WGAN-GP-C and metric learning
CN113240201B (en) Method for predicting ship host power based on GMM-DNN hybrid model
CN111353373A (en) Correlation alignment domain adaptive fault diagnosis method
CN112560596B (en) Radar interference category identification method and system
CN109949200B (en) Filter subset selection and CNN-based steganalysis framework construction method
CN115051864B (en) PCA-MF-WNN-based network security situation element extraction method and system
CN115147341A (en) Fabric surface defect classification depth network method based on biological vision inspiration
CN112528554A (en) Data fusion method and system suitable for multi-launch multi-source rocket test data
Lassouaoui et al. Genetic algorithms and multifractal segmentation of cervical cell images
CN114782761B (en) Intelligent storage material identification method and system based on deep learning
CN111275109A (en) Power equipment state data characteristic optimization method and system based on self-encoder
CN114882007A (en) Image anomaly detection method based on memory network
CN115643153A (en) Alarm correlation analysis method based on graph neural network
CN114330650A (en) Small sample characteristic analysis method and device based on evolutionary element learning model training
CN111340111B (en) Method for recognizing face image set based on wavelet kernel extreme learning machine
CN112364892B (en) Image identification method and device based on dynamic model
CN114510715B (en) Method and device for testing functional safety of model, storage medium and equipment
CN117197591B (en) Data classification method based on machine learning
CN112633399B (en) Sparse collaborative joint representation pattern recognition method
CN116405368B (en) Network fault diagnosis method and system under high-dimensional unbalanced data condition

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination