CN117192358A

CN117192358A - Permanent magnet synchronous motor turn-to-turn short circuit fault diagnosis method based on convolutional neural network

Info

Publication number: CN117192358A
Application number: CN202310938232.0A
Authority: CN
Inventors: 陈辉; 毛念玲; 葛斌; 江友华
Original assignee: Anhui University of Science and Technology
Current assignee: Anhui University of Science and Technology
Priority date: 2023-07-26
Filing date: 2023-07-26
Publication date: 2023-12-08

Abstract

A permanent magnet synchronous motor turn-to-turn short circuit fault diagnosis method based on a convolutional neural network belongs to the technical field of motor fault diagnosis, and solves the problems of low accuracy and poor generalization capability of permanent magnet synchronous motor fault diagnosis caused by the insufficient feature extraction capability of convolutional neural networks in a strong noise environment. Firstly, the acquired one-dimensional time-sequence signal is converted into a two-dimensional gray scale map; secondly, the multi-scale feature extraction module is improved by replacing part of its ordinary convolutions with dilated convolutions, which enlarges the receptive field and extracts as much effective information from the data signal as possible; then a mixed attention mechanism is introduced to dynamically update the weight parameters, continuously strengthening the fault features and weakening the influence of noise and other interference signals; finally, turn-to-turn short circuit faults are diagnosed by a classifier. The method has good accuracy and robustness: the accuracy of the model is more than 97% under each noise background, so the method has strong anti-noise performance and generalization capability.

Description

Permanent magnet synchronous motor turn-to-turn short circuit fault diagnosis method based on convolutional neural network

Technical Field

The invention belongs to the technical field of motor fault diagnosis, and relates to a permanent magnet synchronous motor turn-to-turn short circuit fault diagnosis method based on a convolutional neural network.

Background

In recent years, permanent magnet synchronous motors have developed rapidly and are widely applied in fields such as national defense, medical equipment and electric vehicles. However, the permanent magnet synchronous motor has a complex structure and a variable working environment and is prone to various faults. Among them, the stator turn-to-turn short circuit is one of the most frequent faults of the permanent magnet synchronous motor and is highly destructive. If such a fault is not discovered and repaired in time it worsens, and it can even endanger life. Rapid, accurate and intelligent fault diagnosis of the motor is therefore of great significance.

Conventional motor fault diagnosis methods are mainly model-based methods and signal-processing-based methods. As motor operation data take on big-data characteristics such as large volume and diversity, the shortcomings of the traditional fault diagnosis methods are gradually being exposed. Motor control systems are increasingly complex and characteristic models of the equipment are difficult to build, so model-based methods can hardly achieve rapid analysis. Methods based on signal processing algorithms such as the Fourier transform (FT), empirical mode decomposition (EMD) and the wavelet transform (WT) have very complex feature extraction and fault diagnosis processes, rely on extensive expert experience, and are therefore difficult to popularize widely.

In recent years, with the rise of artificial intelligence technology, machine learning methods based on the multilayer perceptron (MLP), the support vector machine (SVM) and the like have been widely studied, but they cannot meet current fault diagnosis requirements because they depend on manually designed features.

The deep learning algorithm proposed by Hinton et al. in the document "Reducing the dimensionality of data with neural networks" published in 2006 (HINTON G E, SALAKHUTDINOV R R. Science, 2006, 313(5786): 504-507) broke through the bottleneck of neural networks and has become a research hotspot in the development of artificial intelligence technology in recent years. Deep learning has strong feature extraction capability and end-to-end learning capability; it can automatically extract the relevant features from input signals and eliminates the interference and errors caused by manual feature extraction. One document (Li Junqing, Li Sixuan, Chen Yating. Journal of North China Electric Power University, 2020, 47(5): 48-55) proposes a synchronous motor fault diagnosis method based on a deep belief network, which has lower complexity and higher accuracy than traditional fault diagnosis methods. The document "Synchronous motor rotor winding turn-to-turn short circuit fault diagnosis method based on CGAN-CNN" (Li Junqing, Li Sixuan, Chen Yating, Wang Zhenxing, He Yuling. Electric Power Automation Equipment, 2021, 41(08): 169-174) completes rotor winding turn-to-turn short circuit fault diagnosis of a synchronous motor by using a conditional generative adversarial network and a one-dimensional convolutional neural network. The document "Research on fault diagnosis methods of alternating current motor systems based on LSTM" (Zhang Peng, Shu Xiaoman, Li Xueyi, Hang Jun, Ding Danchuan, Wang Qunjing. Electric Machines and Control, 2022, 26(03): 109-116) proposes a fault diagnosis method for alternating current motor systems that combines a long short-term memory network with a Softmax multi-classifier; the long short-term memory network processes the data and extracts features, the network parameters are optimized, and the effectiveness of the method is verified in practice. The document "DE algorithm optimized CNN rolling bearing fault diagnosis research" (Sun Qichun, Li Yuanyuan. Noise and Vibration Control, 2022, 42(4): 165-171+176) proposes a convolutional neural network optimized by the differential evolution (DE) algorithm, thereby improving the extraction of bearing fault features.

Compared with the traditional methods, the above methods obtain a better fault diagnosis effect. However, most traditional CNN fault diagnosis models extract features with single-scale convolution layers or by increasing the number of network layers. Motor signals have complex time-scale characteristics and the operating environment is complex and changeable, so interference signals are often mixed into the fault features. Especially under noise interference, single-scale convolution can hardly extract weak fault features effectively, and the diagnosis effect is mediocre.

Disclosure of Invention

The technical scheme of the invention is used for solving the problems of low accuracy and poor generalization capability of fault diagnosis of the permanent magnet synchronous motor caused by insufficient feature extraction capability of the convolutional neural network in a strong noise environment.

The invention solves the technical problems through the following technical scheme:

the method for diagnosing the turn-to-turn short circuit fault of the permanent magnet synchronous motor based on the convolutional neural network comprises the following steps:

step 1, acquiring zero sequence voltage data of a permanent magnet synchronous motor under normal and fault conditions by using a motor experiment platform;

step 2, converting the collected one-dimensional time sequence zero sequence voltage data into a two-dimensional gray scale map, and dividing the sample into a training set and a testing set;

step 3, designing a multi-scale feature extraction framework based on an Inception network structure, generating a multi-scale feature extraction module, constructing a multi-scale convolution network model combined with a mixed attention mechanism on the basis of the multi-scale feature extraction module, and initializing model parameters;

step 4, training the model by using a sample training set, and continuously adjusting parameters to update the bias and weight of the model until the model converges;

step 5, after training the network parameters of the model, storing the model, inputting a test set into the model for fault diagnosis, judging whether the model meets the diagnosis requirement according to the accuracy of a test sample, if so, executing step 6, otherwise, turning to step 3, and reestablishing the model parameters;

step 6, using the determined network model for fault diagnosis of the permanent magnet synchronous motor, and outputting a fault diagnosis result.

Further, the convolutional neural network consists of an input layer, a convolutional layer, a pooling layer, a full-connection layer and an output layer; the convolution layer is used for extracting features and obtaining an output feature map according to input calculation; the pooling layer is used for calculating a downsampled output image according to the input image; the full connection layer is positioned at the tail position of the network and plays a role of a classifier;

the mathematical model of the convolution layer is:

$$y_{i,j} = f\left(\sum x * \omega_{ij} + b\right) \tag{1}$$

wherein "*" is the two-dimensional discrete convolution operator; b is the bias; ω_ij is the convolution kernel; x is the input feature map; f(·) is the activation function;

the form of the pooling layer is as follows:

$$z = f\left(\beta\,\mathrm{down}(x) + b\right) \tag{2}$$

wherein β is the multiplicative bias; down(·) is the downsampling function; b is the additive bias; f(·) is the activation function;

the full connection layer is at the end of the neural network, each neuron of the full connection layer is fully connected with all neurons output by the previous layer, the full connection layer can integrate local information with category distinction in the convolution layer or the pooling layer, and the output is that:

$$h(x) = f(\omega x + b) \tag{3}$$

wherein x is the input of the full connection layer; h(x) is the output of the full connection layer; ω is the weight; b is the additive bias; f(·) is the activation function;

the activation function applies a nonlinear transformation to the output of the neural network, and the activation function adopts the ReLU function, whose formula is as follows:

$$\mathrm{ReLU}(x) = \max(x, 0) \tag{4}$$

in the classification task, softmax is used at the output layer to perform label classification.

Further, the method for converting the collected one-dimensional time-sequence zero-sequence voltage data into the two-dimensional gray scale map in step 2 is as follows: the collected zero-sequence voltage data are cut and aligned, so that the data of the permanent magnet synchronous motor under the normal condition and under the different fault degrees are each 800 groups, and each group has 1024 sampling points; the acquired one-dimensional time-sequence signal is cut into segments of size 1 × 32² (length 32² = 1024), and each segment is converted into a 32 × 32 two-dimensional matrix through discretization, thereby obtaining a gray scale map.

Further, the convolution kernel sizes used by the Inception network structure in step 3 are 1×1, 3×3 and 5×5 respectively; the convolution operations are combined in series and in parallel, so that different receptive fields are obtained while mining information; the features of different scales are concatenated, and the features of all branches are integrated to obtain the output of the network.

Further, in step 3, the multi-scale feature extraction module adopts a three-branch parallel convolution structure, and convolution layers with the sizes of 1 × 1, 3 × 3 and 5 × 5 are respectively added to the first layers of the 3 branches; the 1 × 1 convolution layers are used for adjusting the dimension of the feature channels and increasing the width of the network; 3 × 3 and 5 × 5 convolution layers are adopted as the second layers of the 2nd and 3rd branches respectively, and 3 × 3 and 5 × 5 dilated convolution layers are then adopted as the third layers of these two branches; a batch normalization (BatchNorm) function and an activation function are added at the output end of each branch convolution layer, and the feature dimensions of the three branches are stacked and spliced through the concat layer.

Further, the mixed attention mechanism in step 3 includes a channel attention module and a spatial attention module;

the channel attention module is designed based on the SENet module; in the channel attention module, a feature of size H × W × C is subjected to global average pooling and global maximum pooling, which compress the height and the width to 1 and yield two features of size 1 × 1 × C, wherein H represents the height, W represents the width, and C represents the number of channels; the two 1 × 1 × C features are both sent into a multi-layer perceptron and the output results are added; the sum is mapped by a sigmoid activation function to obtain the channel attention weight Wc, which is multiplied by the input feature to obtain the output of the channel attention module;

the spatial attention module splices the maximum pooling and the average pooling together in the channel dimension; the input original feature map passes through the pooling layer to obtain an H × W × 2 feature map, which is then mapped by a convolution operation followed by a sigmoid activation function to generate the weight of the spatial attention module; this weight is multiplied element-wise with the original feature map to complete the calculation of the spatial attention module.

Further, the multi-scale convolution network model in step 3 consists of 3 MSDB modules, 3 CBAM modules, 2 convolution layers, 1 global average pooling layer and 1 full connection layer; one MSDB module and one CBAM module form 1 multi-scale feature extraction module, and the dilation rates of the dilated convolutions in the 3 parallel multi-scale feature extraction modules are set to r=1, r=2 and r=3 respectively; the obtained features are then fused through concat, more feature information is further learned through the two convolution layers, and the input dimensions are flattened through global average pooling, which reduces the network model parameters, generates features corresponding to the categories and prevents overfitting; finally, classification is performed through the full connection layer using a softmax classifier.

The invention has the advantages that:

1) Multi-scale feature extraction is carried out on the gray scale map; the multi-scale framework formed by convolution kernels of three different scales extracts the local and global features of the data to the greatest extent without deepening the network, which effectively improves the fault diagnosis accuracy of the model;

2) A mixed attention mechanism is introduced to adaptively assign weights to the input features, suppress interference features such as noise, enhance the response to fault features and optimize the learning mechanism of the CNN; experimental results show that the model achieves a good diagnosis effect under different signal-to-noise ratio backgrounds, with superior noise resistance and good stability and robustness;

3) Compared with the other network models mentioned in the invention, the model of the invention has better fault diagnosis performance and higher accuracy, and has practical application value.

Drawings

FIG. 1 is a block diagram of a convolutional neural network of an embodiment of the present invention;

FIG. 2 is a diagram of an Inception model architecture of an embodiment of the present invention;

FIG. 3 is a block diagram of a mixed-attention module of an embodiment of the present invention;

FIG. 4 is a block diagram of a channel and spatial attention module according to an embodiment of the present invention;

FIG. 5 is a zero sequence voltage diagram under turn-to-turn short circuit fault of an embodiment of the present invention;

FIG. 6 is a signal-to-image conversion flow chart of an embodiment of the present invention;

FIG. 7 is a two-dimensional gray scale map of an embodiment of the present invention;

FIG. 8 is a block diagram of a multi-scale module according to an embodiment of the invention;

FIG. 9 is a diagram of a multi-scale CNN network architecture of an embodiment of the invention;

FIG. 10 is a model diagnostic flow diagram of an embodiment of the present invention;

FIG. 11 is a diagram of an experimental platform of an embodiment of the invention;

FIG. 12 is a waveform diagram of signals under normal and fault conditions of the method of the present invention;

FIGS. 13 (a) and 13 (b) are graphs of training and testing experimental results of the model of the present invention;

FIG. 14 is a graph of test set confusion matrix results for the model of the present invention;

FIG. 15 is a graph of accuracy of the model of the present invention versus other models.

Detailed Description

For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions in the embodiments of the present invention will be clearly and completely described in the following in conjunction with the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

The technical scheme of the invention is further described below with reference to the attached drawings and specific embodiments:

Example 1

1. Basic theory

1.1 convolutional neural network

The convolutional neural network (CNN) is a feedforward artificial neural network that replaces the conventional matrix multiplication operation with a convolution operation, and is specially used for processing data having a grid-like structure, such as time-series data and image data.

As shown in fig. 1, the basic structure of CNN is composed of an input layer, a convolution layer, a pooling layer (downsampling layer), a full connection layer, and an output layer, wherein the convolution layer, the pooling layer, and the full connection layer are the most critical. The convolution layer is used for extracting features and obtaining an output feature map according to input calculation; the pooling layer is used for calculating a downsampled output image according to the input image; the full connection layer is positioned at the end position of the network and plays a role of a classifier.

The convolution layer is composed of a plurality of feature planes, each feature plane is composed of a plurality of neurons, each of which is connected to a local area of a feature plane of a previous layer by a convolution kernel. The convolution kernel is a weight matrix, one convolution layer contains a plurality of convolution kernels, and each convolution kernel detects specific features on all positions of the input features, so that weight sharing on the same input feature is realized. In order to extract the different features of the input feature map, different convolution kernels are used for the convolution operation. The convolution layer can extract deep relation from a large amount of data, so that the operation of manual feature extraction is avoided, and the automation of feature engineering is realized. The mathematical model of the convolution layer is:

$$y_{i,j} = f\left(\sum x * \omega_{ij} + b\right) \tag{1}$$

wherein "*" is the two-dimensional discrete convolution operator; b is the bias; ω_ij is the convolution kernel; x is the input feature map; f(·) is the activation function.

The pooling layer is also known as the downsampling layer. It immediately follows a convolution layer and likewise consists of a plurality of feature maps, each of which corresponds one-to-one to a feature map of the previous layer. It is mainly used for feature dimension reduction: by reducing the resolution of the feature maps it obtains features with spatial invariance, and thereby performs a secondary feature extraction. The general form of the pooling layer is:

$$z = f\left(\beta\,\mathrm{down}(x) + b\right) \tag{2}$$

wherein β is the multiplicative bias; down(·) is the downsampling function; b is the additive bias; f(·) is the activation function.

The fully connected layer is usually at the end of the neural network, and each neuron of the fully connected layer is fully connected with all neurons output by the previous layer, and the layer can integrate local information with category distinction in a convolution layer or a pooling layer. The output is:

$$h(x) = f(\omega x + b) \tag{3}$$

wherein x is the input of the full connection layer; h(x) is the output of the full connection layer; ω is the weight; b is the additive bias; f(·) is the activation function.

The invention adopts the ReLU function, which is a commonly used activation function in neural networks; its formula is as follows:

$$\mathrm{ReLU}(x) = \max(x, 0) \tag{4}$$

In classification tasks, softmax is typically used at the output layer to perform label classification.
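Equations (1) to (4) and the softmax output can be illustrated with a minimal NumPy sketch. It is only an illustration under stated assumptions, not the implementation used in the invention: a single input feature map and a single kernel are used (the sum over input maps in equation (1) is omitted), pooling is taken to be 2 × 2 max pooling with β = 1, b = 0 and an identity activation, and all function names and random weights are hypothetical.

```python
import numpy as np

def relu(x):
    # Equation (4): element-wise max(x, 0).
    return np.maximum(x, 0.0)

def conv2d_valid(x, w, b=0.0):
    """Equation (1) for one input map and one kernel ('valid' padding, stride 1)."""
    kh, kw = w.shape
    wf = w[::-1, ::-1]  # flip the kernel so this is a true discrete convolution
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    y = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            y[i, j] = np.sum(x[i:i + kh, j:j + kw] * wf) + b
    return relu(y)

def max_pool(x, size=2):
    """down(.) of equation (2), assumed here to be non-overlapping max pooling."""
    h, w = x.shape[0] // size, x.shape[1] // size
    return x[:h * size, :w * size].reshape(h, size, w, size).max(axis=(1, 3))

def dense(x, w, b):
    # Pre-activation part of equation (3): w x + b.
    return w @ x + b

def softmax(z):
    # Numerically stable softmax used for label classification.
    e = np.exp(z - np.max(z))
    return e / e.sum()

rng = np.random.default_rng(0)
image = rng.random((32, 32))                    # stand-in for one gray scale map
kernel = rng.standard_normal((3, 3))            # hypothetical 3 x 3 kernel
feat = conv2d_valid(image, kernel, b=0.1)       # shape (30, 30)
pooled = max_pool(feat)                         # shape (15, 15)
flat = pooled.ravel()
W = rng.standard_normal((4, flat.size)) * 0.01  # 4 output classes (assumed)
probs = softmax(dense(flat, W, np.zeros(4)))
```

The loop-based convolution is chosen for readability rather than speed; a practical model would use a deep learning framework.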

1.2 Inception network architecture

The Inception structure is the core of the GoogLeNet neural network framework. It is formed by splicing a plurality of neural network branches and concatenating the results computed by each branch. The main idea is to increase the network width through convolution kernels of different sizes, simplifying the complex stacked network structure and extracting richer feature information. A 1 × 1 convolution kernel is used to reduce the dimension of the feature map, which reduces the number of parameters and the computation cost and speeds up network training. By using a multi-branch structure, the CNN model can achieve better feature expression.

FIG. 2 shows a schematic diagram of a common Inception structure, which includes different branches. After the data are input into the network they are passed to each branch, whose convolution kernel sizes are 1×1, 3×3 and 5×5 respectively. The convolution operations are combined in series and in parallel, so that different receptive fields are obtained while mining information; the features of different scales are concatenated, and the features of all branches are integrated to obtain the output of the network.

1.3 attention mechanism

The main function of the attention mechanism is to increase the discrimination of the neural network for local key regions, thereby improving classification and feature extraction performance. The mixed attention mechanism, the Convolutional Block Attention Module (CBAM), includes two modules, a Channel Attention Module (CAM) and a Spatial Attention Module (SAM), as shown in FIG. 3.

The channel attention module is designed based on the SENet module. In the channel attention module, a feature of size H × W × C is compressed to 1 in both height and width by global average pooling (AvgPool) and global maximum pooling (MaxPool), resulting in two features of size 1 × 1 × C, where H represents the height, W the width and C the number of channels. The two 1 × 1 × C features are then both sent into a multi-layer perceptron and the outputs are added; the sum is mapped by a sigmoid activation function to obtain the channel attention weight Wc, which is multiplied by the input feature to obtain the output of the channel attention module, as shown in FIG. 4 (a).

In the spatial attention module, the maximum pooling and the average pooling are spliced together in the channel dimension. The input original feature map passes through the pooling layer to obtain an H × W × 2 feature map, which is then mapped by a convolution operation followed by a sigmoid activation function to generate the spatial attention weight Ws; Ws is multiplied element-wise with the original feature map to complete the calculation of the spatial attention module, as shown in FIG. 4 (b).
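The two attention modules can be sketched in NumPy as follows. This is a hedged illustration of the computation described above, not the invention's implementation: the multi-layer perceptron is assumed to be a shared two-layer network with ReLU and a channel reduction ratio of 4, the spatial convolution is assumed to use a 7 × 7 kernel with zero padding, and the names `channel_attention`, `spatial_attention`, `w1`, `w2` and `kernel` are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(x, w1, w2):
    """x: (H, W, C). w1: (C, C//r) and w2: (C//r, C) form the assumed shared MLP."""
    avg = x.mean(axis=(0, 1))                 # global average pooling -> (C,)
    mx = x.max(axis=(0, 1))                   # global maximum pooling -> (C,)

    def mlp(v):
        return np.maximum(v @ w1, 0.0) @ w2   # two layers with ReLU in between

    wc = sigmoid(mlp(avg) + mlp(mx))          # channel weight Wc, shape (C,)
    return x * wc                             # broadcast over H and W

def spatial_attention(x, kernel):
    """x: (H, W, C). kernel: (k, k, 2) with odd k; zero 'same' padding."""
    pooled = np.stack([x.max(axis=2), x.mean(axis=2)], axis=2)  # (H, W, 2)
    k = kernel.shape[0]
    p = k // 2
    padded = np.pad(pooled, ((p, p), (p, p), (0, 0)))
    h, w = x.shape[:2]
    ws = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            ws[i, j] = np.sum(padded[i:i + k, j:j + k, :] * kernel)
    ws = sigmoid(ws)                          # spatial weight Ws, shape (H, W)
    return x * ws[:, :, None]                 # element-wise product with the input

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8, 16))           # toy H x W x C feature map
w1 = rng.standard_normal((16, 4)) * 0.1       # reduction ratio 4 (assumed)
w2 = rng.standard_normal((4, 16)) * 0.1
kernel = rng.standard_normal((7, 7, 2)) * 0.1 # 7 x 7 kernel (assumed)
out = spatial_attention(channel_attention(x, w1, w2), kernel)
```

Applying channel attention first and spatial attention second follows the usual CBAM ordering; the text does not state the order explicitly, so this ordering is also an assumption.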

2. Convolutional neural network fault diagnosis model based on improved multiscale

Aiming at the limitations of traditional motor fault diagnosis methods and of single-scale CNNs, the invention combines the advantages of a multi-scale framework and an attention mechanism and provides a convolutional neural network fault diagnosis model with improved multi-scale convolution (multi-scale dilated convolutions residual block, MSDB) and mixed attention (CBAM), which is used for diagnosing turn-to-turn short circuit faults of the stator windings of a permanent magnet synchronous motor. The multi-scale network structure widens the network and increases the adaptability of the model to the convolution kernel scale; the attention mechanism assigns different weights to different features of the data and extracts the more critical and important information, so that the model makes more accurate judgments. The diagnosis flow comprises three stages: data acquisition, data preprocessing and model pre-training. After training and optimization, the model can detect and diagnose turn-to-turn short circuit faults of the permanent magnet synchronous motor.

2.1 zero sequence voltage

The voltage between the neutral point of the Y-connected winding of the permanent magnet synchronous motor and the midpoint of the direct current bus voltage is the zero-sequence voltage V_o, as shown in FIG. 5. To prevent the inverter from affecting the measurement of the zero-sequence voltage V_o, a three-phase balanced resistor network is added to the model.

When the permanent magnet synchronous motor operates in a healthy state, the zero-sequence voltage V_o is related only to the zero-sequence flux linkage of the permanent magnets, and this zero-sequence flux linkage consists of the third harmonic and its odd multiples. Therefore, the zero-sequence voltage V_o collected in the normal operating state of the motor contains only the third harmonic and its odd multiples. When a turn-to-turn short circuit fault occurs in the permanent magnet synchronous motor, its zero-sequence voltage contains not only the third harmonic and its odd multiples but also the fundamental and the fifth, seventh and other harmonic components. Compared with the zero-sequence voltage in normal operation, the zero-sequence voltage under a turn-to-turn short circuit fault therefore gains a fundamental component and fifth and seventh harmonic components, and the amplitude of the fundamental component is pronounced.

In conclusion, the zero-sequence voltage signals collected in normal operation and in operation under a fault have clearly different characteristics, so turn-to-turn short circuit faults of the stator winding can be diagnosed by comparing these characteristics.
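The harmonic signature described above can be made concrete with a small spectral check on synthetic signals. The sketch below is purely illustrative: the fundamental frequency (50 Hz), the sampling rate (5120 Hz) and every amplitude are assumed values chosen for the example, not measurements from the experimental platform, and the function names are hypothetical.

```python
import numpy as np

F1 = 50.0    # assumed fundamental electrical frequency in Hz
FS = 5120.0  # assumed sampling rate in Hz
N = 1024     # samples per record (10 full fundamental periods at these values)

def synth_v0(harmonics):
    """Build a synthetic zero-sequence voltage from {harmonic order: amplitude}."""
    t = np.arange(N) / FS
    return sum(a * np.sin(2 * np.pi * h * F1 * t) for h, a in harmonics.items())

def harmonic_amplitudes(v, orders):
    """Read the amplitude at each harmonic order from a one-sided FFT."""
    spec = np.abs(np.fft.rfft(v)) * 2.0 / N
    bins_per_harmonic = int(round(F1 * N / FS))  # 10 bins per harmonic here
    return {h: float(spec[h * bins_per_harmonic]) for h in orders}

# Healthy case: only the third harmonic and an odd multiple of it (ninth).
healthy = synth_v0({3: 1.0, 9: 0.2})
# Faulty case: fundamental, fifth and seventh components appear in addition.
faulty = synth_v0({1: 0.8, 3: 1.0, 5: 0.15, 7: 0.1, 9: 0.2})

orders = (1, 3, 5, 7, 9)
amp_healthy = harmonic_amplitudes(healthy, orders)
amp_faulty = harmonic_amplitudes(faulty, orders)
```

Because the record holds an integer number of fundamental periods, each harmonic falls exactly on one FFT bin and no window is needed; with measured data, leakage would have to be handled.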

2.2 data Pre-processing

A one-dimensional time-sequence signal is prone to overfitting during training, and features are easily lost during feature extraction. The invention avoids the problem of feature loss by converting the one-dimensional time-domain signal into a gray scale map.

In order to ensure that the amounts of experimental data for the motor at the different fault degrees are consistent, the collected data are cut and aligned, so that the data of the motor under the normal condition and under the different fault degrees are each 800 groups of 1024 sampling points. The signal-to-image conversion flow is shown in FIG. 6: the collected one-dimensional time-sequence signal is first cut into segments of size 1 × 32² (length 32² = 1024), and each segment is then converted into a 32 × 32 two-dimensional matrix through discretization, from which a gray scale map is obtained.

As shown in FIG. 7, signals under a total of 4 operating conditions, namely the normal condition and the turn-to-turn short circuit fault (ISF) conditions of the permanent magnet synchronous motor, are collected on the experimental platform, where u is the fault ratio, namely the ratio of the number of short-circuited turns to the total number of turns. One picture is extracted from each state, and the size of each picture is 32 × 32.
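The signal-to-image conversion can be sketched as below. The text only says the segment is converted "through discrete processing", so the min-max scaling to integer gray levels 0 to 255 is an assumption, as are the row-by-row filling order and the function names `segment_signal` and `signal_to_gray`.

```python
import numpy as np

def segment_signal(signal, length=1024):
    """Cut a 1-D signal into consecutive segments of `length` samples (remainder dropped)."""
    sig = np.asarray(signal, dtype=float)
    n = sig.size // length
    return sig[:n * length].reshape(n, length)

def signal_to_gray(segment, side=32):
    """Map one segment of side*side samples to a side x side uint8 gray scale image.

    Assumption: 'discrete processing' is min-max normalisation to 0..255
    followed by rounding; the segment is laid out row by row.
    """
    seg = np.asarray(segment, dtype=float)
    if seg.size != side * side:
        raise ValueError("segment must contain side*side samples")
    lo, hi = seg.min(), seg.max()
    scale = hi - lo if hi > lo else 1.0   # avoid division by zero on flat segments
    pixels = np.round((seg - lo) / scale * 255.0).astype(np.uint8)
    return pixels.reshape(side, side)

# Toy stand-in for a measured zero-sequence voltage record.
signal = np.sin(np.linspace(0.0, 20.0 * np.pi, 3000))
segments = segment_signal(signal)                            # shape (2, 1024)
images = np.stack([signal_to_gray(s) for s in segments])     # shape (2, 32, 32)
```

Each row of the resulting image holds 32 consecutive samples, so periodic content in the signal shows up as a repeating texture that two-dimensional convolutions can pick up.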

2.3 Multi-scale feature extraction Module

Aiming at the problem that the feature signals extracted by a traditional CNN are not continuous because only a single scale is used, the invention designs a multi-scale feature extraction framework based on the Inception network structure and generates a multi-scale feature extraction module (Multi Scale Dilated Block, MSDB), which has stronger feature extraction capability and extracts as much feature information as possible; its structure is shown in FIG. 8. The module adopts a three-branch parallel convolution structure, and convolution layers with the sizes of 1 × 1, 3 × 3 and 5 × 5 are respectively added to the first layers of the 3 branches. The 1 × 1 convolution layers are used for adjusting the dimension of the feature channels and increasing the width of the network; 3 × 3 and 5 × 5 convolution layers are adopted as the second layers of the 2nd and 3rd branches respectively, and 3 × 3 and 5 × 5 dilated convolution layers are then adopted as the third layers of these two branches. Compared with a traditional CNN, dilated convolution lets a convolution kernel of the same size obtain a larger receptive field through the dilation rate, which effectively enlarges the range over which image features are extracted while the number of parameters remains unchanged. To optimize the features, a batch normalization (BatchNorm) function and an activation function are added at the output end of each branch convolution layer to accelerate training and prevent overfitting, which further improves the diagnosis effect of the model. Finally, the feature dimensions of the three branches are stacked and spliced through a concat layer.
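The claim that dilation enlarges the receptive field without adding parameters follows from the standard relation k_eff = k + (k − 1)(r − 1) for a k × k kernel with dilation rate r. The short sketch below evaluates it for the 3 × 3 and 5 × 5 kernels and for the rates r = 1, 2, 3 that section 2.4 assigns to the parallel modules; the helper names are hypothetical.

```python
import numpy as np

def effective_kernel_size(k, r):
    """Side length covered by a k x k kernel with dilation rate r."""
    return k + (k - 1) * (r - 1)

def dilate_kernel(kernel, r):
    """Insert r - 1 zeros between kernel taps; the learnable weights are unchanged."""
    k = kernel.shape[0]
    ke = effective_kernel_size(k, r)
    out = np.zeros((ke, ke), dtype=kernel.dtype)
    out[::r, ::r] = kernel
    return out

for k in (3, 5):
    for r in (1, 2, 3):
        d = dilate_kernel(np.ones((k, k)), r)
        # Coverage grows with r while the number of non-zero weights stays k * k.
        print(k, r, d.shape[0], int(np.count_nonzero(d)))
```

A 3 × 3 kernel thus covers 3, 5 and 7 pixels per side at r = 1, 2, 3 with 9 weights in every case, and a 5 × 5 kernel covers 5, 9 and 13 pixels per side with 25 weights.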

2.4 network model framework

On the basis of the multi-scale feature extraction module, a multi-scale convolution network model combined with a mixed attention mechanism is constructed; the overall structure is shown in FIG. 9. The model consists of 3 MSDB modules, 3 CBAM modules, 2 convolution layers, 1 global average pooling layer and 1 full connection layer. One MSDB module and one CBAM module form 1 multi-scale attention module (MBM). The dilation rates of the dilated convolutions in the 3 parallel MBM modules are set to r=1, r=2 and r=3 respectively, which enhances the generalization capability of the network. The obtained features are fused through concat, and more feature information is further learned through the two convolution layers. The input dimensions are then flattened through global average pooling, which reduces the network model parameters, generates features corresponding to the categories and prevents overfitting. Finally, classification is performed through the full connection layer using a softmax classifier.

2.5 Fault diagnosis procedure

Based on the model designed in the invention, zero-sequence voltage data are collected on the motor experiment platform, converted into grayscale images, and divided into a training set and a test set for training and testing. The fault diagnosis flow is shown in Fig. 10.

The specific steps of fault diagnosis are as follows:

(1) Collecting zero sequence voltage data of a motor under normal and fault conditions by using a motor experiment platform;

(2) Converting the acquired one-dimensional time sequence signal into a two-dimensional gray scale image, and dividing the sample into a training set and a testing set;

(3) Establishing a network model and initializing model parameters;

(4) Training the model by using a sample training set, and continuously adjusting parameters to update the bias and weight of the model until the model converges;

(5) After the network parameters have been trained, the model is saved; the test set is then input into the model for fault diagnosis, and the diagnosis result is output.
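Step (2) above can be sketched as follows, using the 1024-point segments and 32 × 32 images that the method specifies. The min-max scaling to the range 0 to 255 is an assumption (the text only says the signal is converted by discretization), and the function names and the synthetic sine input are purely illustrative.

```python
import math


def split_into_segments(signal, length=1024):
    """Cut a 1-D signal into consecutive, non-overlapping segments of `length` points."""
    return [signal[i:i + length] for i in range(0, len(signal) - length + 1, length)]


def segment_to_gray_image(segment, size=32):
    """Map a segment of size*size samples to a size x size matrix of integers in 0..255.

    Assumption: pixel values come from min-max scaling followed by rounding.
    """
    if len(segment) != size * size:
        raise ValueError(f"expected {size * size} samples, got {len(segment)}")
    lowest, highest = min(segment), max(segment)
    span = highest - lowest
    if span == 0:
        pixels = [0] * len(segment)  # constant segment: no contrast to encode
    else:
        pixels = [round((v - lowest) / span * 255) for v in segment]
    # Fill the image row by row: samples 0..31 form row 0, 32..63 form row 1, and so on.
    return [pixels[r * size:(r + 1) * size] for r in range(size)]


# Synthetic stand-in for a measured zero-sequence voltage (not real data).
synthetic = [math.sin(2 * math.pi * 6.67 * n / 2000) for n in range(4096)]
images = [segment_to_gray_image(seg) for seg in split_into_segments(synthetic)]
print(len(images), len(images[0]), len(images[0][0]))
```

Scaling each segment separately discards its absolute amplitude; scaling with limits shared across the whole data set would be an alternative if amplitude carries fault information.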

3 Experimental verification and analysis

To fully verify the performance and generalization capability of the multi-scale convolutional neural network model in a strong noise environment, the invention acquires data on the permanent magnet synchronous motor fault simulation platform of a national-local joint laboratory for control technology at a university in Anhui.

3.1 Construction of the experiment platform and data acquisition

To verify the effectiveness of the multi-scale convolutional neural network model, an experimental platform is built, as shown in Fig. 11. Two permanent magnet synchronous motors are directly connected through a coupler; one serves as the test motor and the other as the load motor. Turn-to-turn short circuit faults of the stator winding of the test motor are simulated by manually connecting winding taps, which gives three short-circuit conditions with short-circuit turn ratios u of 0.05, 0.1 and 0.15. The main controller is a dSPACE DS1103, and the motor parameters used for experimental verification are listed in Table 1.

Table 1 Parameters of the permanent magnet synchronous motor

Parameter	Value
Rated power (W)	550
Rated current (A)	1.5
Number of pole pairs	2
Permanent magnet flux linkage (Wb)	0.624
Direct-axis inductance (H)	0.16
Quadrature-axis inductance (H)	0.16

In the test, phase A of the motor is the faulty phase, and the sampling frequency is 2000 Hz. Zero-sequence voltage signals are collected at a speed of 200 r/min and a load of 2 N·m. The collected data cover the normal state and three fault degrees with short-circuit turn ratios of 0.05, 0.1 and 0.15, giving 4 state labels. The waveforms of the collected zero-sequence voltage signals are shown in Fig. 12, where Fig. 12(a) is the normal zero-sequence voltage and Fig. 12(b), (c) and (d) are the zero-sequence voltage waveforms under turn-to-turn short circuit faults with u of 0.05, 0.1 and 0.15.
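These operating conditions determine how many electrical periods each sample contains. The short calculation below is a sketch that combines the speed and sampling frequency above with the 2 pole pairs from Table 1 and the 1024-point segment length given in the claims; the function name is illustrative, and the formula f = n·p/60 is the standard relation between mechanical speed and electrical frequency of a synchronous machine.

```python
def electrical_frequency_hz(speed_rpm: float, pole_pairs: int) -> float:
    """Fundamental electrical frequency of a synchronous machine: f = n * p / 60."""
    return speed_rpm * pole_pairs / 60.0


SPEED_RPM = 200          # rotating speed from the text
POLE_PAIRS = 2           # from Table 1
SAMPLING_HZ = 2000       # sampling frequency from the text
SEGMENT_POINTS = 1024    # segment length given in the claims

f_electrical = electrical_frequency_hz(SPEED_RPM, POLE_PAIRS)
samples_per_period = SAMPLING_HZ / f_electrical
segment_seconds = SEGMENT_POINTS / SAMPLING_HZ
periods_per_segment = segment_seconds * f_electrical

print(f"electrical frequency: {f_electrical:.3f} Hz")
print(f"samples per electrical period: {samples_per_period:.1f}")
print(f"segment duration: {segment_seconds:.3f} s")
print(f"electrical periods per segment: {periods_per_segment:.2f}")
```

Each 1024-point sample therefore spans a little over three electrical periods, so every grayscale image contains several repetitions of the fundamental waveform.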

3.2 Experimental parameter setting

The convolutional neural network model is built and trained with PyTorch; the operating system is Windows 10 and the graphics card model is RTX1060ti. Several parameters strongly influence the diagnosis process and its result, and reasonable parameter settings improve both the speed and the accuracy of network training. When training the model, the batch size is set to 64, the maximum number of iterations is 40, the learning rate is 0.001, and the model parameters are updated with the Adam optimization algorithm. The data set converted into grayscale images is divided into training samples and test samples in a 3:1 ratio, which gives 600 samples for the training set; the remaining samples form the test set.
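The settings above can be collected in one place together with the 3:1 split. The sketch below uses only the standard library; shuffling before the split and the fixed seed are assumptions (the text does not say how samples are assigned), and the total of 800 samples is inferred from 600 training samples at a 3:1 ratio.

```python
import math
import random

# Hyperparameters as stated in the text.
HYPERPARAMS = {
    "batch_size": 64,
    "max_iterations": 40,
    "learning_rate": 0.001,
    "optimizer": "Adam",
}


def split_train_test(samples, train_parts=3, test_parts=1, seed=0):
    """Shuffle a copy of `samples` and split it train:test = train_parts:test_parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # assumption: random assignment, fixed seed
    n_train = len(shuffled) * train_parts // (train_parts + test_parts)
    return shuffled[:n_train], shuffled[n_train:]


# Sample indices stand in for the grayscale images.
train_ids, test_ids = split_train_test(range(800))
batches_per_pass = math.ceil(len(train_ids) / HYPERPARAMS["batch_size"])
print(len(train_ids), len(test_ids), batches_per_pass)
```

With 600 training samples and a batch size of 64, one pass over the training set takes 10 batches, the last of which is only partly filled.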

3.3 Experimental part

3.3.1 Model fault diagnosis results and analysis

The model was evaluated on the training samples and the test samples, and the results are shown in Fig. 13(a), Fig. 13(b) and Fig. 14. The accuracy curve in Fig. 13(a) shows that the accuracy rises as the number of iterations increases; after 15 training iterations the model levels off, and the training and test curves almost coincide. The loss curve in Fig. 13(b) shows that the loss decreases steadily as the number of iterations increases and becomes stable after 12 iterations.

Fig. 14 is the confusion matrix, in which the abscissa represents the predicted labels and the ordinate represents the true labels. The figure shows that some samples are confused between different fault degrees, but the accuracy remains above 96%, which indicates that the model is well trained.
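Reading accuracy from a confusion matrix laid out this way (rows are true labels, columns are predicted labels) can be sketched as follows. The counts in the matrix are made-up placeholders chosen only to show the arithmetic; they are not the values of Fig. 14, and the function names are illustrative.

```python
def per_class_recall(matrix):
    """Fraction of each true class (row) that was predicted correctly (diagonal entry)."""
    return [row[i] / sum(row) for i, row in enumerate(matrix)]


def overall_accuracy(matrix):
    """Correct predictions (diagonal sum) divided by all predictions."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


LABELS = ["normal", "u=0.05", "u=0.1", "u=0.15"]

# Placeholder counts, NOT the patent's results: rows are true labels,
# columns are predicted labels, 50 test samples per class.
placeholder_matrix = [
    [50, 0, 0, 0],
    [0, 48, 2, 0],
    [0, 1, 49, 0],
    [0, 0, 1, 49],
]

for label, recall in zip(LABELS, per_class_recall(placeholder_matrix)):
    print(f"{label}: {recall:.2%}")
print(f"overall: {overall_accuracy(placeholder_matrix):.2%}")
```

Off-diagonal entries show which fault degrees are mistaken for one another, which is the kind of confusion the text describes.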

3.3.2 Comparative analysis of different models

To verify that the proposed model has a better diagnostic effect, it is compared with existing deep learning methods, namely ResNet, AlexNet and the MSDB-CNN model, where MSDB-CNN is the proposed network with the CBAM modules removed. To reduce the influence of chance in a single experiment, each model was run 10 times in its respective environment on the same data set, and the results were averaged. The diagnostic accuracy of the 4 models is compared in Fig. 15.

The average diagnostic accuracy of the AlexNet model is only 90.65%, because the AlexNet structure simply stacks several convolutions and cannot fully mine the correlation between signals, so rich data features are difficult to extract. The accuracy of ResNet is 1.74% higher than that of AlexNet, because the residual structure links the convolution layers together and thereby improves the feature extraction capability. Compared with AlexNet and ResNet, the accuracy of MSDB-CNN is higher by 4.68% and 2.94% respectively, because multi-scale convolution replaces single-size convolution, which widens the network and makes the extracted feature information more comprehensive. The proposed method combines the multi-scale module with the mixed attention mechanism and further improves the diagnostic effect: its average accuracy reaches 97.43%, clearly higher than that of the other methods, because the attention mechanism strengthens the extraction of sample features so that deeper feature information can be extracted and used.
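The reported figures can be cross-checked with simple arithmetic. The sketch below reads each improvement as a difference in percentage points, which is an interpretation rather than something the text states; under that reading the two routes to the MSDB-CNN accuracy agree.

```python
# Accuracies in percent, taken from the text.
alexnet = 90.65
proposed = 97.43

# Improvements read as percentage-point differences (an interpretation).
resnet = alexnet + 1.74
msdb_cnn_via_alexnet = alexnet + 4.68
msdb_cnn_via_resnet = resnet + 2.94

# Both routes should give the same MSDB-CNN accuracy if the figures are consistent.
consistent = abs(msdb_cnn_via_alexnet - msdb_cnn_via_resnet) < 1e-9
gain_from_attention = proposed - msdb_cnn_via_alexnet

print(f"ResNet:   {resnet:.2f}%")
print(f"MSDB-CNN: {msdb_cnn_via_alexnet:.2f}% (consistent: {consistent})")
print(f"Gain from adding the attention modules: {gain_from_attention:.2f} points")
```

The last line isolates the contribution of the attention modules, since MSDB-CNN differs from the proposed model only by their removal.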

3.3.3 Model anti-noise performance analysis

Under actual working conditions, the motor is disturbed not only by noise in the environment; the motor body also generates noise during operation, which can mask fault signals. To ensure that the diagnosis method remains stable under noise, the invention adds Gaussian white noise with signal-to-noise ratios of 3 dB, 6 dB, 9 dB and 12 dB to the test samples of the data set and uses these samples to evaluate the noise resistance of the proposed model. The model is compared with the ResNet, AlexNet and MSDB-CNN models, and each result is the average of 10 runs.
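Adding Gaussian white noise at a prescribed signal-to-noise ratio follows from the definition SNR(dB) = 10·log10(P_signal / P_noise). The following is a minimal sketch using only the standard library; the function names and the synthetic sine signal are illustrative assumptions, and the signal power is taken as the mean square of the samples without removing any offset.

```python
import math
import random


def mean_power(values):
    """Mean square of the samples."""
    return sum(v * v for v in values) / len(values)


def add_awgn(signal, snr_db, rng):
    """Return `signal` plus zero-mean Gaussian noise scaled to the requested SNR in dB."""
    noise_power = mean_power(signal) / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [v + rng.gauss(0.0, sigma) for v in signal]


def measured_snr_db(clean, noisy):
    """SNR actually achieved, computed from the clean signal and the added noise."""
    noise = [n - c for c, n in zip(clean, noisy)]
    return 10 * math.log10(mean_power(clean) / mean_power(noise))


rng = random.Random(0)  # fixed seed so the sketch is repeatable
clean = [math.sin(2 * math.pi * n / 300) for n in range(20000)]  # synthetic stand-in

for target in (3, 6, 9, 12):
    noisy = add_awgn(clean, target, rng)
    print(f"target {target} dB, measured {measured_snr_db(clean, noisy):.2f} dB")
```

A lower signal-to-noise ratio means stronger noise, so the 3 dB case is the hardest of the four conditions.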

As shown in Table 2, the accuracy of the proposed method is clearly better than that of the other methods at signal-to-noise ratios of 3 dB, 6 dB, 9 dB and 12 dB.

Table 2 Accuracy at different signal-to-noise ratios

ResNet and AlexNet have relatively simple structures and do not extract features completely, so their noise resistance is weak. With its multi-scale extraction and dilated convolution modules, MSDB-CNN has a wider receptive field, mines the internal features of the data more deeply and markedly improves the diagnostic performance. Table 2 also shows that the proposed method achieves the best accuracy, because it makes full use of the advantages of both the multi-scale convolution module and the attention mechanism: the attention mechanism adjusts the channel weights, captures more effective feature signals and suppresses interference such as noise. The anti-noise performance is therefore stronger, the recognition capability of the model is further improved, and the diagnostic accuracy on the test set exceeds 98% in the 9 dB and 12 dB noise environments.

The above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims

1. The method for diagnosing the turn-to-turn short circuit fault of the permanent magnet synchronous motor based on the convolutional neural network is characterized by comprising the following steps of:

2. The method for diagnosing the turn-to-turn short circuit fault of the permanent magnet synchronous motor based on the convolutional neural network according to claim 1, wherein the convolutional neural network consists of an input layer, a convolutional layer, a pooling layer, a full-connection layer and an output layer; the convolution layer is used for extracting features and obtaining an output feature map according to input calculation; the pooling layer is used for calculating a downsampled output image according to the input image; the full connection layer is positioned at the tail position of the network and plays a role of a classifier;

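the mathematical model of the convolution layer is:

$$y_{i,j} = f\left(\sum x * \omega_{ij} + b\right) \tag{1}$$

the form of the pooling layer is as follows:

$$z = f\left(\beta\,\mathrm{down}(x) + b\right) \tag{2}$$

$$h(x) = f(\omega x + b) \tag{3}$$

$$\mathrm{ReLU}(x) = \max(x, 0) \tag{4}$$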

3. The method for diagnosing turn-to-turn short circuit fault of permanent magnet synchronous motor based on convolutional neural network according to claim 2, wherein the method for converting the collected one-dimensional time-series zero-sequence voltage data into a two-dimensional grayscale image in step 2 is as follows: the collected zero-sequence voltage data are segmented, the data of the permanent magnet synchronous motor under normal and different fault degrees comprise 800 groups, and each group has 1024 sampling points; the acquired one-dimensional time-series signal is cut into 1 × 32² segments of length 32², and each segment is converted into a 32 × 32 two-dimensional matrix through discretization, thereby obtaining a grayscale image.

4. The method for diagnosing turn-to-turn short circuit faults of a permanent magnet synchronous motor based on a convolutional neural network according to claim 3, wherein the convolution kernel sizes used by the Inception network structure in step 3 are 1×1, 3×3 and 5×5 respectively, the convolution operations are combined in series and in parallel, different receptive fields are obtained in the process of mining information, features of different scales are cascaded together, and the features of each branch are integrated to obtain the output of the network.

5. The method for diagnosing a turn-to-turn short circuit fault of a permanent magnet synchronous motor based on a convolutional neural network according to claim 4, wherein the multi-scale feature extraction module in step 3 adopts a three-branch parallel convolution structure, in which convolution layers of size 1×1, 3×3 and 5×5 are added to the first layers of the 3 branches respectively; the 1×1 convolution layers are used for adjusting the channel dimension of the features and increasing the width of the network, convolution layers of size 3×3 and 5×5 are adopted as the second layers of the 2nd and 3rd branches respectively, dilated convolution layers of size 3×3 and 5×5 are then adopted as the third layers of the two branches, a BatchNorm layer and an activation function are added at the output of each convolution layer in every branch, and the features of the three branches are stacked and concatenated along the feature dimension by the concat layer.

6. The method for diagnosing a turn-to-turn short circuit fault of a permanent magnet synchronous motor based on a convolutional neural network according to claim 5, wherein the hybrid attention mechanism in step 3 comprises a channel attention module and a spatial attention module;

the spatial attention module concatenates the maximum pooling and the average pooling results along the channel dimension; the input original feature map passes through the pooling layers to give an H × W × 2 feature map, a convolution operation followed by the sigmoid activation function then completes the mapping and generates the spatial attention weight Ws, and Ws is multiplied element-wise with the original feature map to complete the calculation of the spatial attention module.

7. The method for diagnosing the turn-to-turn short circuit fault of the permanent magnet synchronous motor based on the convolutional neural network according to claim 6, wherein the multi-scale convolutional network model in step 3 consists of 3 MSDB modules, 3 CBAM modules, 2 convolution layers, 1 global average pooling layer and 1 fully connected layer; one MSDB module and one CBAM module form 1 multi-scale feature extraction module, the dilation rates of the dilated convolutions in the 3 parallel multi-scale feature extraction modules are set to r=1, r=2 and r=3 respectively, the obtained features are then fused through concat, more feature information is further learned through two convolution layers, the input dimensions are reduced through global average pooling, which lowers the number of network model parameters, generates features corresponding to the categories and prevents overfitting; finally, classification is performed by the fully connected layer using a softmax classifier.