CN115062648A

CN115062648A - Fault diagnosis method, system and device for rotary machine and storage medium

Info

Publication number: CN115062648A
Application number: CN202210620387.5A
Authority: CN
Inventors: 李巍华; 梁靖康; 陈祝云; 廖奕校; 陈浚彬
Original assignee: South China University of Technology SCUT
Current assignee: South China University of Technology SCUT
Priority date: 2022-06-02
Filing date: 2022-06-02
Publication date: 2022-09-16
Anticipated expiration: 2042-06-02
Also published as: CN115062648B

Abstract

The invention discloses a fault diagnosis method, system, device and storage medium for a rotary machine. The method comprises the following steps: acquiring a data set; constructing a lightweight one-dimensional convolutional neural network for fault diagnosis, the network framework comprising four parts: a first convolution layer with a large convolution kernel width, a stack of k intermediate layers with selectable operation operators, a global average pooling layer, and a fully connected output layer; defining a search space of the hyper-parameters of the neural network to be optimized; training the network using an automatic hyper-parameter optimization method based on the Tree-structured Parzen Estimator (TPE) to obtain the optimal hyper-parameter set within the number of iterations; and training the neural network with the optimal hyper-parameter set on the training set to obtain a trained model. The method automatically searches the hyper-parameters of the fault diagnosis model and saves manual labor; the resulting fault diagnosis model has few network parameters, runs fast, and is highly accurate, and the method can be widely applied in the field of intelligent fault diagnosis of rotary machines.

Description

Fault diagnosis method, system and device for rotary machine and storage medium

Technical Field

The invention relates to the field of intelligent fault diagnosis of rotary machines, in particular to a method, a system and a device for fault diagnosis of a rotary machine and a storage medium.

Background

Modern electromechanical devices play an important role in manufacturing and industry. Gears and bearings are critical components of rotary machines, and failure thereof will greatly affect the performance of the machine, causing serious safety risks and economic losses. In order to ensure continuous and stable operation of the equipment, an algorithm capable of diagnosing a fault of the rotary machine in time is required, thereby reducing economic loss. Therefore, the development of the fault diagnosis algorithm of the rotary machine is significant.

Deep-learning-based fault diagnosis methods can automatically extract data features and achieve end-to-end diagnosis, and have become the mainstream intelligent fault diagnosis approach. However, existing fault diagnosis networks focus on improving accuracy and ignore the computation time that model size incurs; the model size should be reduced as far as possible, while maintaining accuracy, to shorten computation time and enable real-time fault diagnosis. Another problem is that different fault diagnosis tasks often require different network structures: a complex network may overfit a simple task, while a simple network cannot complete a complex diagnosis task. One solution is to design a specific network structure and tune the hyper-parameters for each specific task. However, designing a fault diagnosis network structure and tuning its hyper-parameters requires expert knowledge, and manual tuning is time-consuming and laborious. It is therefore difficult to manually design a suitable lightweight network for a specific task.

Disclosure of Invention

To solve at least some of the technical problems in the prior art, an object of the present invention is to provide a method, a system, a device and a storage medium for diagnosing a fault of a rotating machine.

The technical scheme adopted by the invention is as follows:

a method of fault diagnosis for a rotating machine, comprising the steps of:

acquiring an original vibration acceleration signal of a rotary machine, forming a sample by intercepting signal data with a preset length, carrying out normalization processing and labeling to obtain a data set, and acquiring a training set, a verification set and a test set according to the data set;

constructing a lightweight one-dimensional convolutional neural network for fault diagnosis, the convolutional neural network comprising: a first convolution layer with a large convolution kernel width, k intermediate layers with selectable operation operators, a global average pooling layer, and a fully connected output layer; wherein the first convolution layer with a large convolution kernel width is used for feature extraction; the k intermediate layers are structures to be optimized and are used for further feature operations; and the global average pooling layer maps the feature vector of each channel to a single value;

defining a search space Θ of the hyper-parameters of the convolutional neural network to be optimized;

training the convolutional neural network using an automatic hyper-parameter optimization method based on the Tree-structured Parzen Estimator (TPE); training network models that use different hyper-parameter sets on the training set, and obtaining the evaluation values corresponding to the different hyper-parameter sets on the validation set; and obtaining the optimal hyper-parameter set within the number of iterations according to the evaluation values;

retraining the convolutional neural network using the optimal hyper-parameter set in a training set to obtain a trained fault diagnosis model;

and inputting the test set into the trained fault diagnosis model, and outputting a classification diagnosis result.

Further, forming a sample by intercepting signal data of a preset length includes:

intercepting, from the original vibration acceleration signal, data segments that span at least one period of the fault characteristic frequency for every fault category, to form the samples.

Further, a first intermediate layer and a second intermediate layer in the intermediate layers are structures to be optimized.

Further, when the number k of intermediate layers is 2, the search space Θ of the hyper-parameters of the convolutional neural network to be optimized includes:

learning rate: the learning rate of the Adam optimizer controls the optimization of the neural network; a well-chosen learning rate lets the network converge faster and reach higher accuracy;

number of first-layer channels: the number of filters in the first convolution layer; the larger the number, the more parameters are required and the longer the computation time;

first-layer convolution kernel size: the first convolution layer performs the main feature extraction and therefore deserves particular attention; the larger the convolution kernel, the more parameters are required;

first-layer convolution stride: the larger the stride, the more the features are compressed and the shorter the computation time;

operator of the first intermediate layer: different operation operators can be selected and combined to form the structure of the first intermediate layer, and different structures affect network performance such as computation time and accuracy;

operator of the second intermediate layer: its function and description are the same as for the first intermediate layer;

defining the distribution type of the candidate values of each hyper-parameter to be optimized, the types comprising: a categorical choice type, a uniform distribution type, a discrete uniform distribution type, a log-uniform distribution type, and a discrete log-uniform distribution type.
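
The first-layer trade-offs listed above follow from the standard formulas for a one-dimensional convolution. The sketch below is illustrative only: the function names, the single input channel, the zero-padding default, and the three example settings are assumptions and are not values given in the source.

```python
def conv1d_params(in_channels: int, out_channels: int, kernel_size: int,
                  groups: int = 1, bias: bool = True) -> int:
    """Trainable parameters of a 1-D convolution layer."""
    weights = out_channels * (in_channels // groups) * kernel_size
    return weights + (out_channels if bias else 0)


def conv1d_output_length(length: int, kernel_size: int, stride: int,
                         padding: int = 0) -> int:
    """Output length of a 1-D convolution without dilation."""
    return (length + 2 * padding - kernel_size) // stride + 1


# Hypothetical first-layer settings for a 2048-point, single-channel sample:
# (channels, kernel size, stride)
for channels, kernel, stride in [(16, 64, 8), (32, 64, 8), (16, 128, 16)]:
    print(channels, kernel, stride,
          conv1d_params(1, channels, kernel),
          conv1d_output_length(2048, kernel, stride))
```

More channels or a wider kernel increase the parameter count, while a larger stride shortens the feature map that every later layer has to process.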

Further, the candidate operation set of the first intermediate layer and of the second intermediate layer is represented as {F_i}, representing i different operators;

the operation operators comprise {"maxpool", "avgpool", "skip", "conv_1", "conv_3"}, wherein "maxpool" represents a max pooling operation with a width of 3 and a stride of 1; "avgpool" represents an average pooling operation with a width of 3 and a stride of 1; "skip" represents an identity mapping; "conv_1" represents a one-dimensional grouped convolution with a kernel size of 1 and a stride of 1, in which the number of groups equals the number of input channels, the purpose of grouped convolution being to reduce the number of model parameters; and "conv_3" represents a one-dimensional grouped convolution with a kernel size of 3 and a stride of 1, in which the number of groups equals the number of input channels;

n different operation operators can be selected from the candidate operation set and combined to form the network structure of an intermediate layer. When n is 0, no operation operator is selected, and the feature map skips the layer and passes directly to the next layer; when n is 1, the output of the intermediate layer is F(x), where F is the selected operator; when n > 1, the input feature map is copied n times, the n operations are applied to the copies respectively, and the results are added to give the output of the layer, that is, the output of the intermediate layer is

$$\sum_{j=1}^{n} F_j(x)$$

where F_j is the j-th selected operator.
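
The combination rule can be sketched for a single channel in plain Python. This is an illustration and not the implementation of the invention: truncating the pooling windows at the borders (so that width-3, stride-1 pooling preserves length), leaving out the convolution operators (which need learned weights), and all function names are assumptions.

```python
from typing import Callable, Dict, List, Sequence

Signal = List[float]


def _windows(x: Sequence[float], width: int = 3) -> List[Sequence[float]]:
    """Width-3, stride-1 windows, truncated at the borders to keep the length."""
    half = width // 2
    return [x[max(0, i - half): i + half + 1] for i in range(len(x))]


def maxpool(x: Sequence[float]) -> Signal:
    return [max(w) for w in _windows(x)]


def avgpool(x: Sequence[float]) -> Signal:
    return [sum(w) / len(w) for w in _windows(x)]


def skip(x: Sequence[float]) -> Signal:
    return list(x)


OPS: Dict[str, Callable[[Sequence[float]], Signal]] = {
    "maxpool": maxpool, "avgpool": avgpool, "skip": skip,
}


def intermediate_layer(x: Sequence[float], selected: Sequence[str]) -> Signal:
    """n = 0: pass through; n = 1: F(x); n > 1: element-wise sum of F_j(x)."""
    if not selected:
        return list(x)
    outputs = [OPS[name](x) for name in selected]
    return [sum(values) for values in zip(*outputs)]


x = [1.0, 4.0, 2.0, 8.0]
print(intermediate_layer(x, []))                   # [1.0, 4.0, 2.0, 8.0]
print(intermediate_layer(x, ["maxpool"]))          # [4.0, 4.0, 8.0, 8.0]
print(intermediate_layer(x, ["maxpool", "skip"]))  # [5.0, 8.0, 10.0, 16.0]
```

Because every operator has a stride of 1 and preserves the feature-map length, the outputs of the selected operators can be added element by element.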

Further, the evaluation value is calculated as follows:

$$y = acc_{val} - \lambda \times t_{val}$$

where $y$ is the evaluation value of a hyper-parameter set; $acc_{val}$ is the accuracy on the validation set; $t_{val}$ is the time taken to compute one pass over the validation set on the device; and $\lambda$ is a scaling factor that brings $\lambda \times t_{val}$ to the same order of magnitude as $acc_{val}$, so that the TPE algorithm treats computation time as one of its optimization objectives. The value of $\lambda$ should be chosen according to the magnitude of $t_{val}$, since $t_{val}$ can vary widely with the computing device, the selected batch size, and so on.

Further, retraining the convolutional neural network using the optimal hyper-parameter set on the training set comprises:

in each iteration, retraining the convolutional neural network on the training set;

using the hyper-parameters and their corresponding evaluation values as the historical prior that guides the Tree-structured Parzen Estimator in selecting the next hyper-parameters;

and, within a preset number of iterations, taking the hyper-parameter set with the maximum evaluation value as the optimal hyper-parameter set.

The other technical scheme adopted by the invention is as follows:

a rotary machine fault diagnostic system comprising:

the data processing module is used for acquiring an original vibration acceleration signal of the rotary machine, forming a sample by intercepting signal data with a preset length, carrying out normalization processing and labeling to obtain a data set, and acquiring a training set, a verification set and a test set according to the data set;

a network construction module for constructing a lightweight one-dimensional convolutional neural network for fault diagnosis, the convolutional neural network comprising: a first convolution layer with a large convolution kernel width, k intermediate layers with selectable operation operators, a global average pooling layer, and a fully connected output layer; wherein the first convolution layer with a large convolution kernel width is used for feature extraction; the k intermediate layers are structures to be optimized and are used for further feature operations; and the global average pooling layer maps the feature vector of each channel to a single value;

the parameter definition module is used for defining a search space Θ of the hyper-parameters of the convolutional neural network to be optimized;

the model training module is used for training the convolutional neural network using an automatic hyper-parameter optimization method based on the Tree-structured Parzen Estimator (TPE); training network models that use different hyper-parameter sets on the training set, and obtaining the evaluation values corresponding to the different hyper-parameter sets on the validation set; and obtaining the optimal hyper-parameter set within the number of iterations according to the evaluation values;

the iterative training module is used for retraining the convolutional neural network using the optimal hyper-parameter set in a training set to obtain a trained fault diagnosis model;

and the diagnosis testing module is used for inputting the test set into the trained fault diagnosis model and outputting a classification diagnosis result.

The other technical scheme adopted by the invention is as follows:

a rotary machine fault diagnosis device comprising:

at least one processor;

at least one memory for storing at least one program;

the at least one program, when executed by the at least one processor, causes the at least one processor to implement the method described above.

The invention adopts another technical scheme that:

a computer readable storage medium in which a processor executable program is stored, which when executed by a processor is for performing the method as described above.

The beneficial effects of the invention are as follows: the method automatically searches the hyper-parameters of the fault diagnosis model and saves manual labor, and the resulting fault diagnosis model has few network parameters, runs fast, and is highly accurate.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the following description is made on the drawings of the embodiments of the present invention or the related technical solutions in the prior art, and it should be understood that the drawings in the following description are only for convenience and clarity of describing some embodiments in the technical solutions of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to these drawings without creative efforts.

FIG. 1 is a flow chart of a method of fault diagnosis of a rotating machine in an embodiment of the present invention;

FIG. 2 is a schematic diagram of a network framework of a lightweight network in an embodiment of the invention;

FIG. 3 is a diagram illustrating a hyper-parametric search space in an embodiment of the invention.

Detailed Description

Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the accompanying drawings are illustrative only for the purpose of explaining the present invention and are not to be construed as limiting the present invention. The step numbers in the following embodiments are provided only for convenience of illustration, the order between the steps is not limited at all, and the execution order of each step in the embodiments can be adapted according to the understanding of those skilled in the art.

In the description of the present invention, it should be understood that the orientation or positional relationship referred to in the description of the orientation, such as the upper, lower, front, rear, left, right, etc., is based on the orientation or positional relationship shown in the drawings, and is only for convenience of description and simplification of description, and does not indicate or imply that the device or element referred to must have a specific orientation, be constructed and operated in a specific orientation, and thus, should not be construed as limiting the present invention.

In the description of the present invention, "several" means one or more and "a plurality" means two or more; "greater than", "less than", "exceeding" and the like are understood as excluding the stated number, and "above", "below", "within" and the like are understood as including the stated number. Where "first" and "second" are used, they serve only to distinguish technical features and are not to be understood as indicating or implying relative importance, the number of the technical features indicated, or the precedence of the technical features indicated.

In the description of the present invention, unless otherwise specifically limited, terms such as set, installation, connection and the like should be understood in a broad sense, and those skilled in the art can reasonably determine the specific meanings of the above terms in the present invention by combining the specific contents of the technical solutions.

As shown in FIG. 1, this embodiment provides a rotary machine fault diagnosis method based on a lightweight network and an improved TPE (Lightweight Network with Modified Tree-structured Parzen Estimators, LN-MT). The method constructs a lightweight one-dimensional convolutional neural network framework based on global average pooling and grouped convolution; defines a hyper-parameter search space for the framework; uses an improved TPE optimization algorithm, whose optimization target is a weighted combination of diagnosis accuracy and computation time, to search that space for a hyper-parameter set that gives the network high accuracy and the shortest possible computation time; and then trains the one-dimensional convolutional neural network from scratch using the optimal hyper-parameter set. For different fault diagnosis tasks, LN-MT automatically searches out a suitable lightweight model, which saves labor cost and achieves high diagnosis accuracy and efficiency. The method specifically comprises the following steps:

Step 1: collecting original vibration acceleration signals of the rotary machine under given experimental working conditions, intercepting signal data of a given length to form samples, labeling the samples after simple normalization, and dividing the data set into a training set, a validation set and a test set.
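
Step 1 can be sketched with the standard library alone. The source does not specify the normalization, the overlap between segments, or the split ratios, so the z-score normalization, the non-overlapping windows, the 70/15/15 split, the function names, and the synthetic signals below are all assumptions made for illustration.

```python
import math
import random
from typing import Dict, List, Sequence, Tuple

Sample = Tuple[List[float], int]  # (normalized segment, fault label)


def segment(signal: Sequence[float], length: int) -> List[List[float]]:
    """Cut a signal into non-overlapping segments of a preset length."""
    return [list(signal[i:i + length])
            for i in range(0, len(signal) - length + 1, length)]


def normalize(values: Sequence[float]) -> List[float]:
    """Z-score normalization (assumed; the source only says 'normalization')."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(var) or 1.0  # guard against a constant segment
    return [(v - mean) / std for v in values]


def build_dataset(signals_by_label: Dict[int, Sequence[float]], length: int,
                  percents: Tuple[int, int, int] = (70, 15, 15), seed: int = 0):
    """Segment, normalize, label, shuffle, and split into train/val/test."""
    samples: List[Sample] = [
        (normalize(seg), label)
        for label, signal in signals_by_label.items()
        for seg in segment(signal, length)
    ]
    random.Random(seed).shuffle(samples)
    n_train = len(samples) * percents[0] // 100
    n_val = len(samples) * percents[1] // 100
    return (samples[:n_train],
            samples[n_train:n_train + n_val],
            samples[n_train + n_val:])


# Synthetic stand-in signals: two fault labels, 20 segments of 2048 points each.
rng = random.Random(1)
signals = {label: [rng.gauss(0.0, 1.0) for _ in range(2048 * 20)]
           for label in (0, 1)}
train, val, test = build_dataset(signals, length=2048)
print(len(train), len(val), len(test))  # 28 6 6
```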

In some embodiments of the present invention, the data segments intercepted from the vibration acceleration signal to form the samples span at least one period of the fault characteristic frequency for every fault category.
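
The length requirement amounts to a simple bound: a sample must contain at least as many points as one period of the lowest fault characteristic frequency. The sketch below checks this; the function name and the fault frequencies are hypothetical, since the source does not list the characteristic frequencies.

```python
import math
from typing import Iterable


def min_sample_length(sampling_rate_hz: float,
                      fault_freqs_hz: Iterable[float]) -> int:
    """Fewest points that span one period of the slowest fault frequency."""
    slowest = min(fault_freqs_hz)
    return math.ceil(sampling_rate_hz / slowest)


# Hypothetical fault characteristic frequencies (Hz), one per fault category.
freqs = [37.5, 60.0, 112.0]
needed = min_sample_length(24_000, freqs)
print(needed, needed <= 2048)  # 640 True
```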

In some embodiments of the present invention, the training set sample length is not equal to the test set sample length.

Step 2: constructing a lightweight one-dimensional convolutional neural network for fault diagnosis, comprising: a first convolution layer with a large convolution kernel width, a first intermediate layer, a second intermediate layer, a global average pooling layer, and a fully connected output layer. The first convolution layer with a large convolution kernel width is used for feature extraction; the first and second intermediate layers are structures to be optimized and are used for further feature operations; and the global average pooling layer maps the feature vector of each channel to a single value, which facilitates the fully connected operation.

In some examples of the invention, the constructed one-dimensional convolutional neural network includes a plurality of intermediate layers, each intermediate layer has a plurality of candidate operations, and the number of operations selectable for each intermediate layer may differ. Referring to FIG. 2, in the present example there are two intermediate layers, each provided with 5 different optional operations, and each can select at most 2 operations simultaneously. The candidate operations are: [], ["maxpool"], ["avgpool"], ["skip"], ["conv_1"], ["conv_3"], ["maxpool", "avgpool"], ["maxpool", "skip"], ["maxpool", "conv_1"], ["maxpool", "conv_3"], ["avgpool", "skip"], ["avgpool", "conv_1"], ["avgpool", "conv_3"], ["skip", "conv_1"], ["skip", "conv_3"], ["conv_1", "conv_3"]. In the figure, the two operations ["conv_3", "maxpool"] are selected in the first intermediate layer, and the single operator ["conv_1"] is selected in the second intermediate layer.
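
The candidate list above is every combination of zero, one, or two of the five operators. A short sketch reproduces it; the variable names are illustrative.

```python
from itertools import combinations

OPERATORS = ["maxpool", "avgpool", "skip", "conv_1", "conv_3"]
MAX_SELECTED = 2  # each intermediate layer selects at most 2 operators here

# All ways of picking 0, 1, or 2 distinct operators, in the listed order.
candidates = [list(combo)
              for n in range(MAX_SELECTED + 1)
              for combo in combinations(OPERATORS, n)]

print(len(candidates))  # 16
for candidate in candidates:
    print(candidate)
```

With two intermediate layers, each choosing independently from these 16 candidates, the structural part of the search space contains 16 × 16 = 256 combinations.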

Step 3: defining a search space of the hyper-parameters of the neural network to be optimized.

In some examples of the present invention, the hyper-parameter search space defined in step 3 (see FIG. 3) includes: the learning rate, the number of first-layer channels, the first-layer convolution kernel size, the first-layer convolution stride, the operator of the first intermediate layer, and the operator of the second intermediate layer. The distribution type of the candidate values of each hyper-parameter to be optimized is also defined, the types comprising: a categorical choice type, a uniform distribution type, a discrete uniform distribution type, a log-uniform distribution type, and a discrete log-uniform distribution type.
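
The distribution types can be illustrated with a small sampler built on the Python standard library. It is a stand-in for the sampling step of a real optimization library, and every range and candidate value in `SEARCH_SPACE` is a hypothetical placeholder, because the source gives the actual ranges only in FIG. 3.

```python
import math
import random


def sample(spec, rng: random.Random):
    """Draw one value from a (kind, *args) distribution specification."""
    kind = spec[0]
    if kind == "choice":        # categorical choice
        return rng.choice(spec[1])
    if kind == "uniform":       # uniform on [low, high]
        return rng.uniform(spec[1], spec[2])
    if kind == "quniform":      # uniform, rounded to a multiple of q
        low, high, q = spec[1:]
        return round(rng.uniform(low, high) / q) * q
    if kind == "loguniform":    # uniform on a logarithmic axis
        return math.exp(rng.uniform(math.log(spec[1]), math.log(spec[2])))
    if kind == "qloguniform":   # log-uniform, rounded to a multiple of q
        low, high, q = spec[1:]
        value = math.exp(rng.uniform(math.log(low), math.log(high)))
        return round(value / q) * q
    raise ValueError(f"unknown distribution type: {kind}")


# Abbreviated operator candidates; the full example list has 16 entries.
LAYER_CHOICES = [[], ["maxpool"], ["conv_1", "conv_3"]]

# Hypothetical ranges, for illustration only.
SEARCH_SPACE = {
    "learning_rate": ("loguniform", 1e-4, 1e-2),
    "first_channels": ("quniform", 8, 64, 8),
    "first_kernel_size": ("quniform", 16, 128, 16),
    "first_stride": ("choice", [2, 4, 8, 16]),
    "layer1_ops": ("choice", LAYER_CHOICES),
    "layer2_ops": ("choice", LAYER_CHOICES),
}

rng = random.Random(0)
params = {name: sample(spec, rng) for name, spec in SEARCH_SPACE.items()}
print(params)
```

A log-uniform type suits the learning rate because useful values span several orders of magnitude, while the discrete types suit integer-valued settings such as channel counts and kernel sizes.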

Step 4: training the network using the automatic hyper-parameter optimization method of the Tree-structured Parzen Estimator (TPE): network models using different hyper-parameter sets are trained on the training set, and the evaluation value of each hyper-parameter set is obtained on the validation set; the optimal hyper-parameter set within the number of iterations is thereby obtained.

The evaluation value of the improved TPE is calculated from the validation-set accuracy and the computation time of the network:

$$y = acc_{val} - \lambda \times t_{val} \tag{1}$$

where $y$ is the evaluation value of a hyper-parameter set, $acc_{val}$ is the accuracy on the validation set, $t_{val}$ is the time taken to compute one pass over the validation set on the device, and $\lambda$ is a scaling factor that brings $\lambda \times t_{val}$ to the same order of magnitude as $acc_{val}$, so that the TPE algorithm treats computation time as one of its optimization objectives. The value of $\lambda$ should be chosen according to the magnitude of $t_{val}$, since $t_{val}$ can vary widely with the computing device, the selected batch size, and so on.
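
A short numeric example shows how the scaling factor lets computation time influence the ranking. The accuracies, times, and the value of the scaling factor below are hypothetical and chosen only to make the effect visible.

```python
def evaluation_value(acc_val: float, t_val: float, lam: float) -> float:
    """Evaluation value: validation accuracy minus lam times validation time."""
    return acc_val - lam * t_val


# Hypothetical numbers: accuracy as a fraction, validation time in seconds.
lam = 0.1
slow_but_accurate = evaluation_value(acc_val=0.990, t_val=0.80, lam=lam)
fast_and_close = evaluation_value(acc_val=0.985, t_val=0.20, lam=lam)
print(slow_but_accurate, fast_and_close)
print(fast_and_close > slow_but_accurate)  # True
```

Here the faster model wins despite a 0.5-point lower accuracy. With a much smaller scaling factor the time term becomes negligible and the search reduces to maximizing accuracy alone.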

While the TPE selects the optimal hyper-parameters in step 4, in each iteration the network is trained anew on the training set, and the evaluation value is calculated from the validation-set accuracy and the computation time; the hyper-parameters and their evaluation values serve as the historical prior that guides the TPE's next choice of hyper-parameters; and within a given number of iterations, the hyper-parameter set with the largest evaluation value is taken as the optimal hyper-parameter set.
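
The structure of this search loop can be sketched as follows. To stay dependency-free, the proposal step here draws hyper-parameters at random and ignores the history; in the method described, that step is performed by the TPE, which uses the history as its prior. `train_and_validate`, `toy_objective`, and the two hyper-parameters shown are hypothetical stand-ins for real training and validation.

```python
import random
from typing import Callable, Dict, List, Tuple

Params = Dict[str, float]
History = List[Tuple[Params, float]]


def propose(history: History, rng: random.Random) -> Params:
    """Stand-in for the TPE proposal step (random search; history unused)."""
    return {"learning_rate": 10 ** rng.uniform(-4, -2),
            "first_stride": rng.choice([2, 4, 8, 16])}


def search(train_and_validate: Callable[[Params], Tuple[float, float]],
           lam: float, n_iterations: int, seed: int = 0) -> Tuple[Params, float]:
    """Return the hyper-parameter set with the largest evaluation value."""
    rng = random.Random(seed)
    history: History = []
    for _ in range(n_iterations):
        params = propose(history, rng)
        acc_val, t_val = train_and_validate(params)  # train anew, then validate
        history.append((params, acc_val - lam * t_val))
    return max(history, key=lambda item: item[1])


def toy_objective(params: Params) -> Tuple[float, float]:
    """Toy stand-in for training: larger strides are faster but less accurate."""
    acc = 0.99 - 0.002 * params["first_stride"]
    t = 1.0 / params["first_stride"]
    return acc, t


best_params, best_value = search(toy_objective, lam=0.1, n_iterations=30)
print(best_params, best_value)
```

Swapping `propose` for a TPE-based proposal leaves the rest of the loop unchanged, which is why the history of hyper-parameters and evaluation values is kept as a single list.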

Step 5: retraining the neural network with the optimal hyper-parameter set on the training set to obtain a trained model.

Step 6: and inputting the test data into the trained diagnostic model for testing, and outputting a final classification diagnostic result.

The invention is further explained below with reference to the drawings and experimental examples.

To evaluate the performance of the method, the first experiment uses a rotary machine test platform with a three-shaft, five-speed transmission as the gearbox. The transmission has five forward gears and one reverse gear, and is in fifth gear in this test. Vibration data are collected by an acceleration sensor mounted on the output bearing seat. During operation the gearbox runs at four rotating speeds, 500 rpm, 750 rpm, 1000 rpm and 1250 rpm, each with a loading torque of 50 N·m, and faults of the same type share the same label across all working conditions. The sampling frequency of the signal is set to 24 kHz. Different types of faults are seeded on the inner ring of the bearing and on the teeth of the fifth-gear gear, respectively. The faulty bearing is located at the output shaft end and is of type NPU311EN. The bearing fault is on the inner ring, with a fault width of 0.2 mm and a fault depth of 1 mm. The faulty gear is the fifth-gear driven gear; local faults of different severity are simulated by wire-cutting its teeth to different degrees, giving three fault types: slight, moderate and complete tooth breakage. To make the classification task more challenging, compound fault types that combine the bearing inner-ring fault with gear faults of different severity are also introduced. The details of the experimental data set are shown in Table 1. Two tasks are created, one using the raw data and one using data with added noise at a signal-to-noise ratio (SNR) of 0 dB. The length of each sample is 2048.

The second experiment is verified on a numerically controlled machine tool bearing data set. The data are acquired with the input shaft rotating at 8000 rpm. The data set comprises four bearing states: normal, inner-ring fault, outer-ring fault and retainer fault. During operation the bearing runs under three working conditions, aluminum alloy cutting, stainless steel cutting and no load, and faults of the same type share the same label across all working conditions. The sampling frequency of the signal is set to 25.6 kHz. Different types of faults are arranged on the inner ring of the bearing and the gear teeth of the five-gear respectively. The faulty bearing is located at the output shaft end and is of type NSK 40BNR 10. The details of the experimental data set are shown in Table 2. Two tasks are created using data with added noise at signal-to-noise ratios (SNR) of 0 dB and -5 dB, respectively. The length of each sample is 2048.
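
The noisy tasks require adding noise at a prescribed signal-to-noise ratio. A minimal sketch follows, assuming additive white Gaussian noise (the source does not state the noise type); the function names and the sinusoidal test signal are illustrative.

```python
import math
import random
from typing import List, Sequence


def power(x: Sequence[float]) -> float:
    """Mean power of a signal."""
    return sum(v * v for v in x) / len(x)


def add_noise(signal: Sequence[float], snr_db: float, seed: int = 0) -> List[float]:
    """Add Gaussian noise scaled so that the realized SNR equals snr_db."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in signal]
    target_noise_power = power(signal) / (10 ** (snr_db / 10))
    scale = math.sqrt(target_noise_power / power(noise))
    return [s + scale * n for s, n in zip(signal, noise)]


def snr_db(clean: Sequence[float], noisy: Sequence[float]) -> float:
    """Measured SNR in decibels between a clean signal and its noisy version."""
    noise_only = [n - c for n, c in zip(noisy, clean)]
    return 10 * math.log10(power(clean) / power(noise_only))


# Synthetic stand-in for one 2048-point vibration sample.
clean = [math.sin(2 * math.pi * 50 * i / 24_000) for i in range(2048)]
print(snr_db(clean, add_noise(clean, snr_db=0.0)))
print(snr_db(clean, add_noise(clean, snr_db=-5.0)))
```

At 0 dB the noise power equals the signal power, and at -5 dB the noise power is about 3.16 times the signal power.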

TABLE 1 gearbox data set

TABLE 2 bearing data set for numerically controlled machine tool

To verify the superiority of the proposed method, several classical algorithms were used for comparison on four classification tasks, including:

WDCNN: a convolutional neural network with a large convolution kernel in the first layer, which has been shown to perform well in fault diagnosis classification.

WDCNN-T: the WDCNN hyper-parameters are optimized using TPE, with diagnosis accuracy as the optimization target.

WDCNN-MT: the WDCNN hyper-parameters are optimized using TPE, with a weighted combination of diagnosis accuracy and computation time as the optimization target.

DARTS: a gradient-based neural network structure search algorithm that can search out a suitable network structure for a specific fault diagnosis task.

deployed-FC: the GAP layer in the proposed method is replaced with a fully connected layer with 50 neurons.

deployed-STD: the packet convolution of the proposed method is replaced by a standard convolution.

deployed-T: the optimization objective of the TPE algorithm of the proposed method is changed to only improve the diagnostic accuracy.

The performance of the algorithms is evaluated along several dimensions:

Parameter count: the number of trainable parameters in the model; the larger it is, the more complex the model and the more memory it requires.

Training time: the time the model takes to compute one pass over the training data set.

Test time: the time the model takes to compute one pass over the test data set.

Accuracy: the diagnosis accuracy of the model on the test set.

Evaluation score: the accuracy minus λ times the test time, which matches the optimization target of the TPE in the invention.

For experimental fairness, all methods use the same convolutional neural network structure. To rule out chance results, each task is run 5 times and the accuracy is averaged. The results are shown in Tables 3, 4, 5 and 6:

TABLE 3 LN-MT of the invention compared with other methods (original data set of gearbox)

TABLE 4 LN-MT of the invention compared with other methods (gearbox signal-to-noise ratio 0dB data set)

TABLE 5 comparison of LN-MT of the invention with other methods (numerical control machine bearing signal-to-noise ratio 0dB data set)

TABLE 6 comparison of LN-MT of the invention with other methods (numerical control machine bearing signal-to-noise ratio-5 dB data set)

In the four experiments, the proposed LN-MT method shows good classification performance and, because of its shorter test time, a higher evaluation score. In addition, compared with deployed-STD, LN-MT has fewer model parameters and a shorter training time while maintaining accuracy, and is therefore a lighter network.

Comparing WDCNN, WDCNN-T and WDCNN-MT, WDCNN-T (which uses TPE) is clearly more accurate than WDCNN, which demonstrates the effectiveness of the TPE automatic hyper-parameter optimization algorithm; WDCNN-MT further outperforms WDCNN-T in computation time and parameter count, which shows that the improved TPE optimization target of the proposed method can effectively produce a lighter network while maintaining accuracy.

Compared with DARTS, the proposed LN-MT achieves markedly higher diagnosis accuracy with less computation time and fewer parameters, which demonstrates its advantage over DARTS as a neural network structure search method.

Compared with deployed-STD, the proposed LN-MT is slightly worse in accuracy and evaluation score but markedly better in parameter count and training time, which shows that grouped convolution further reduces model size and computation time without materially affecting accuracy, achieving the goal of a lightweight network.

Compared with deployed-FC, the proposed LN-MT is clearly better in parameter count, computation time, accuracy and the other evaluation indexes, which shows that using global average pooling instead of a fully connected layer effectively reduces model size and overfitting, giving more efficient and accurate fault diagnosis.

Compared with deployed-T, the proposed LN-MT is slightly less accurate but markedly better in parameter count, computation time and evaluation score, which shows that the improved TPE optimization target can effectively produce a lighter network while maintaining accuracy.

Different fault diagnosis tasks require a new network structure to be designed and the hyper-parameters to be readjusted, and the designed network must balance computation time against accuracy. To address these problems, the invention takes rotary machine faults as its research object, replaces the fully connected layer with global average pooling, replaces standard convolution with grouped convolution, and performs automatic hyper-parameter optimization with TPE using accuracy and computation time as the optimization target, thereby effectively and automatically obtaining a lightweight model suited to a specific fault diagnosis task.

In summary, compared with the prior art, the invention has the following advantages and beneficial effects:

(1) The method uses one-dimensional convolution kernels to construct the convolutional neural network, which avoids manual feature extraction, reduces dependence on specialist knowledge such as signal processing, and effectively extracts high-dimensional features from the data.

(2) The method replaces conventional convolution with grouped convolution and the usual fully connected layer with global average pooling, which effectively reduces the number of model parameters and yields a lightweight model structure without loss of diagnostic accuracy.

(3) The invention adopts global average pooling and can therefore accept signal inputs of different lengths, with diagnostic accuracy increasing as the input length increases. This gives better flexibility in practical industrial applications and overcomes the inability of an ordinary fully connected layer to accept inputs of different lengths.

(4) The invention designs an improved TPE algorithm whose evaluation value is a weighted combination of network accuracy and computation time, so that the automatically searched hyper-parameter set gives the network both high accuracy and high running speed. This overcomes the problem that the ordinary TPE algorithm attends only to network accuracy and ignores computation time.

(5) Using the TPE automatic hyper-parameter optimization algorithm from automated machine learning, the method constructs a surrogate model based on prior knowledge to search automatically for the optimal hyper-parameters. This removes the need to design a different network and tune its hyper-parameters by hand for each rotary machine fault diagnosis task, and provides a feasible way to automatically customize condition monitoring and fault diagnosis models for rotary machine equipment.
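As a minimal sketch of how an evaluation value that weights accuracy against computation time changes which hyper-parameter set is preferred, the following Python snippet scores three hypothetical trials with y = acc_val - λ × t_val. The trial names, accuracies, timings and the value of λ are assumptions made up for illustration, and the snippet only computes and compares evaluation values; it does not implement the TPE surrogate model itself.

```python
def evaluation_value(acc_val: float, t_val: float, lam: float) -> float:
    """Evaluation value y = acc_val - lam * t_val (higher is better)."""
    return acc_val - lam * t_val


# Hypothetical trials: (name, validation accuracy, validation time in seconds).
trials = [
    ("A", 0.990, 0.50),  # most accurate, but slowest
    ("B", 0.985, 0.20),  # nearly as accurate, much faster
    ("C", 0.900, 0.15),  # fastest, noticeably less accurate
]

# Assumed scaling factor; in practice it would be chosen so that
# lam * t_val is comparable in magnitude to acc_val.
lam = 1.0

# Score every trial with the weighted evaluation value.
scored = {name: evaluation_value(acc, t, lam) for name, acc, t in trials}

# Selection by evaluation value vs. selection by accuracy alone.
best_by_value = max(scored, key=scored.get)
best_by_accuracy = max(trials, key=lambda trial: trial[1])[0]

print("evaluation values:", scored)
print("best by evaluation value:", best_by_value)
print("best by accuracy only:", best_by_accuracy)
```

With these made-up numbers the accuracy-only criterion prefers trial A, while the weighted evaluation value prefers the faster trial B. An optimizer that minimizes its objective would be given -y instead of y.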

The present embodiment further provides a fault diagnosis system for a rotary machine, including:

The rotary machine fault diagnosis system of the embodiment can execute the rotary machine fault diagnosis method provided by the method embodiment of the invention, can execute any combination of the implementation steps of the method embodiment, and has corresponding functions and beneficial effects of the method.

A rotary machine fault diagnosis apparatus comprising:

at least one processor;

at least one memory for storing at least one program;

the at least one program, when executed by the at least one processor, causes the at least one processor to implement the method of fig. 1.

The rotary machine fault diagnosis device of the embodiment can execute the rotary machine fault diagnosis method provided by the method embodiment of the invention, can execute any combination of the implementation steps of the method embodiment, and has corresponding functions and beneficial effects of the method.

The embodiment of the application also discloses a computer program product or a computer program, which comprises computer instructions, and the computer instructions are stored in a computer readable storage medium. The computer instructions may be read by a processor of a computer device from a computer-readable storage medium, and executed by the processor to cause the computer device to perform the method illustrated in fig. 1.

The embodiment also provides a storage medium, which stores instructions or programs for executing the method for diagnosing the fault of the rotary machine, and when the instructions or the programs are executed, the steps can be executed in any combination of the method embodiments, and the corresponding functions and the beneficial effects of the method are achieved.

In alternative embodiments, the functions/acts noted in the block diagrams may occur out of the order noted in the operational illustrations. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality/acts involved. Furthermore, the embodiments presented and described in the flow charts of the present invention are provided by way of example in order to provide a more thorough understanding of the technology. The disclosed methods are not limited to the operations and logic flows presented herein. Alternative embodiments are contemplated in which the order of various operations is changed and in which sub-operations described as part of larger operations are performed independently.

Furthermore, although the present invention is described in the context of functional modules, it should be understood that, unless otherwise stated to the contrary, one or more of the described functions and/or features may be integrated in a single physical device and/or software module, or one or more functions and/or features may be implemented in a separate physical device or software module. It will also be appreciated that a detailed discussion of the actual implementation of each module is not necessary for an understanding of the present invention. Rather, the actual implementation of the various functional modules in the apparatus disclosed herein will be understood within the ordinary skill of an engineer, given the nature, function, and internal relationship of the modules. Accordingly, those skilled in the art can, using ordinary skill, practice the invention as set forth in the claims without undue experimentation. It is also to be understood that the specific concepts disclosed are merely illustrative of and not intended to limit the scope of the invention, which is defined by the appended claims and their full scope of equivalents.

If the functions are implemented in the form of software functional units and sold or used as a stand-alone product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, or the part of it that contributes over the prior art, may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.

The logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.

More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.

It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.

In the foregoing description of the specification, reference to the description of "one embodiment/example," "another embodiment/example," or "certain embodiments/examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.

While embodiments of the present invention have been shown and described, it will be understood by those of ordinary skill in the art that: various changes, modifications, substitutions and alterations can be made to the embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.

While the preferred embodiments of the present invention have been illustrated and described, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims

1. A method of diagnosing a fault in a rotary machine, comprising the steps of:

acquiring an original vibration acceleration signal of a rotary machine, forming a sample by intercepting signal data with a preset length, carrying out normalization processing, labeling to obtain a data set, and acquiring a training set, a verification set and a test set according to the data set;

defining a search space Θ of the hyper-parameters to be optimized of the convolutional neural network;

training the convolutional neural network by using the automatic hyper-parameter optimization method of the tree-structured Parzen estimator (TPE): training network models with different hyper-parameter sets on the training set, and obtaining the evaluation values corresponding to the different hyper-parameter sets on the verification set; obtaining the optimal hyper-parameter set within the number of iterations according to the evaluation values;

retraining the convolutional neural network using the optimal hyper-parameter set in a training set to obtain a trained fault diagnosis model; and inputting the test set into the trained fault diagnosis model, and outputting a classification diagnosis result.

2. A method according to claim 1, wherein the constructing the sample by intercepting the signal data of a preset length comprises:

3. A method according to claim 1, wherein the first and second intermediate layers are structures to be optimized.

4. The method according to claim 3, wherein when the number k of intermediate layers is 2, the search space Θ defining the hyper-parameters to be optimized for the convolutional neural network comprises:

learning rate: the learning rate of the Adam optimizer is used for controlling the convolutional neural network optimization;

number of channels in first layer: the number of filters of the first layer of convolution layer;

a first layer convolution kernel size;

a first layer of convolution kernel step size;

an operator of the first intermediate layer;

an operator of the second intermediate layer;

and the distribution type of the candidate values of the hyper-parameters to be optimized.

5. A method according to claim 3, wherein the set of candidate operations for the first and second intermediate layers is represented as: $\{F_i\}$, representing i different operators;

the operation operators comprise { "maxpool", "avgpool", "skip", "conv_1", "conv_3" }, wherein "maxpool" represents a maximum pooling operation with a width of 3 and a step size of 1; "avgpool" represents an average pooling operation with a width of 3 and a step size of 1;

"skip" represents an identity mapping; "conv_1" represents a one-dimensional grouped convolution with a convolution kernel size of 1, a step size of 1, and a number of groups equal to the number of input channels, the purpose of the grouped convolution being to reduce the number of parameters of the model; "conv_3" represents a one-dimensional grouped convolution with a convolution kernel size of 3, a step size of 1, and a number of groups equal to the number of input channels;

wherein $F_j$ is the selected operator.

6. A rotary machine fault diagnosis method according to claim 1, characterized in that said evaluation value is calculated as follows:

$y = \mathrm{acc}_{val} - \lambda \times t_{val}$

in the formula, $y$ represents the evaluation value of a hyper-parameter set; $\mathrm{acc}_{val}$ is the accuracy on the validation set; $t_{val}$ is the time taken on the device to compute one round of the validation set; $\lambda$ is a scaling factor, chosen so that $\lambda \times t_{val}$ and $\mathrm{acc}_{val}$ are of the same order of magnitude.

7. The method of claim 1, wherein the retraining the convolutional neural network using the optimal hyper-parameter set in a training set comprises:

8. A rotary machine fault diagnostic system, comprising:

a network construction module for constructing a lightweight one-dimensional convolutional neural network for fault diagnosis, the convolutional neural network comprising: a first convolution layer with a large convolution kernel width, k intermediate layers with selectable operation operators, a global average pooling layer and a fully connected output layer; wherein the first convolution layer with a large convolution kernel width is used for feature extraction; the k intermediate layers are structures to be optimized and are used for further feature operations; the global average pooling layer is used for mapping the feature vector of each channel into a single value;

and a diagnosis testing module for inputting test data into the trained fault diagnosis model and outputting the classification diagnosis result.

9. A rotary machine fault diagnosis device characterized by comprising:

at least one processor;

at least one memory for storing at least one program;

the at least one program, when executed by the at least one processor, causes the at least one processor to implement the method of any one of claims 1-7.

10. A computer-readable storage medium, in which a program executable by a processor is stored, wherein the program executable by the processor is adapted to perform the method according to any one of claims 1 to 7 when executed by the processor.
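
For illustration only, the search space of claim 4 and the operator set of claim 5 could be written down as a small Python data structure. The operator names are those listed in claim 5; the candidate values, the ranges and the distribution type names ("choice", "loguniform") are assumptions introduced for this sketch. The sampler draws candidates by plain random sampling as a simple stand-in and is not the TPE procedure.

```python
import math
import random

# Operator names as listed in claim 5.
OPERATORS = ["maxpool", "avgpool", "skip", "conv_1", "conv_3"]

# Hypothetical search space: name -> (distribution type, arguments...).
# All candidate values and ranges below are illustrative assumptions.
SEARCH_SPACE = {
    "learning_rate": ("loguniform", 1e-4, 1e-2),
    "first_channels": ("choice", [8, 16, 32, 64]),
    "first_kernel_size": ("choice", [16, 32, 64, 128]),
    "first_stride": ("choice", [2, 4, 8, 16]),
    "op_1": ("choice", OPERATORS),  # operator of the first intermediate layer
    "op_2": ("choice", OPERATORS),  # operator of the second intermediate layer
}


def sample(space: dict, rng: random.Random) -> dict:
    """Draw one hyper-parameter set by random sampling (a stand-in, not TPE)."""
    params = {}
    for name, (kind, *args) in space.items():
        if kind == "choice":
            params[name] = rng.choice(args[0])
        elif kind == "loguniform":
            low, high = args
            # Uniform in log space, then mapped back to the original scale.
            params[name] = math.exp(rng.uniform(math.log(low), math.log(high)))
        else:
            raise ValueError(f"unknown distribution type: {kind}")
    return params


rng = random.Random(0)  # fixed seed so the draw is repeatable
candidate = sample(SEARCH_SPACE, rng)
print(candidate)
```

Each sampled dictionary corresponds to one hyper-parameter set that would be used to train a network on the training set and scored on the verification set.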