CN112131907A - Method and device for training classification model - Google Patents
- Publication number
- CN112131907A (application number CN201910550658.2A)
- Authority
- CN
- China
- Prior art keywords
- signal
- signal set
- classification model
- signals
- training
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2218/00—Aspects of pattern recognition specially adapted for signal processing
- G06F2218/12—Classification; Matching
Abstract
A method and a device for training a classification model relate to the field of artificial intelligence and are used for training a classification model with higher accuracy. The method comprises the following steps: after a collecting device collects a training set, it sends the training set to a classification model training device. If the number of signals in a first signal set in the training set is small, i.e., does not exceed a threshold, the classification model training device augments the first signal set with an augmentation model, and then trains the classification model with the augmented first signal set; the trained classification model is used to determine the type of an input signal to be classified, and the augmentation model and the classification model adopt neural network models with different structures. Because the classification model is trained with the augmented first signal set, it can learn richer features from that set, which improves the accuracy of the classification model.
Description
Technical Field
The application relates to the field of artificial intelligence, in particular to a method and a device for training a classification model.
Background
With the development of deep learning, neural network models with various structures are used to solve different problems. A neural network model is commonly used to identify (essentially a two-class task) or classify (a multi-class task) a signal; a neural network model that can determine the type of an input signal is called a classification model. In daily applications, many scenarios require classification with such a classification model.
Taking fault diagnosis of mechanical equipment as an example, some mechanical equipment, such as steam turbines, generators, and centrifugal compressors, is key equipment for the production capacity of petrochemical, metallurgical, aviation, and other important industrial sectors. When such equipment fails, the whole plant may stop working normally; chain reactions and even serious casualty accidents can easily follow. To find faults of the mechanical equipment in time and prevent the chain reactions they cause, fault diagnosis needs to be performed on the mechanical equipment, so that a fault can be detected quickly at its initial stage and its type can be determined.
With the help of the classification model, fault diagnosis can judge, from the currently obtained signals of the mechanical equipment, whether the equipment has a fault and which type of fault it is.
The accuracy of the classification model depends, to a certain extent, on the training set used in the pre-training process: when the signals included in the training set are rich, the classification model can learn rich and stable features from the training set, and its accuracy after training is higher.
However, in daily production, mechanical equipment mostly operates normally or with a few common faults; many signals are available for normal operation and common faults, but for some rarer fault types, fewer signals (such as vibration signals) are collected during operation under those fault types, so the training set contains fewer signals of these types than of others. Currently, an oversampling method is usually adopted to increase the signals of such fault types in the training set: by copying, cropping, flipping, and similar operations on the collected signals of the fault type, oversampling brings those signals to the same order of magnitude as the signals of other types. However, oversampling is based only on simple transformations such as copying, cropping, and flipping; the added signals provide no new features themselves, which may cause the classification model trained on this training set to overfit, so that its accuracy is reduced.
Disclosure of Invention
The application provides a method for training a classification model, which is used for improving the accuracy of the trained classification model.
In a first aspect, an embodiment of the present application provides a method for training a classification model, where the method includes: after the collection device collects the first signal set, the first signal set may be sent to the classification model training apparatus, where the first signal set includes a plurality of signals of the same type. If the number of signals in the first signal set is small, i.e., does not exceed a threshold, the classification model training device may increase the number of signals in the first signal set after acquiring it; for example, the classification model training device may augment the first signal set with an augmentation model to add new signals to it, so that the augmented first signal set contains more signals than the original first signal set. The classification model is then trained according to the augmented first signal set, yielding a classification model for determining the type of an input signal to be classified, where the augmentation model and the classification model adopt neural network models with different structures.
By this method, for a first signal set with a small number of signals, the signals in the set are increased by augmentation based on the augmentation model; in subsequent training, the classification model can learn richer features from the augmented first signal set, which improves its accuracy. Moreover, augmentation based on the augmentation model is unlikely to generate excessive noise signals, which helps avoid overfitting of the classification model.
In one possible implementation, the augmentation model includes a generator and a discriminator. The generator generates a candidate signal set; the candidate signal set and the first signal set are input into the discriminator, which discriminates whether the candidate signal set is a true signal set, a true signal set being a signal set having the same or similar features as the first signal set. The generator and the discriminator repeat this process and are trained alternately until the discriminator cannot tell whether the candidate signal set is a true signal set; at that point the candidate signal set generated by the generator is close to a true signal set, and its signals can be added to the first signal set to augment it.
By this method, the generator in the augmentation model cooperates with the discriminator to augment the first signal set, so that the augmented first signal set has the same or similar features as the original first signal set.
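The alternating generator/discriminator training described above can be sketched as follows. This is a toy 1-D example in NumPy: the "real" signals, the affine generator, and the logistic discriminator are all hypothetical stand-ins (the patent's embodiments use a DCGAN-style network instead), chosen only so that the alternating update structure is visible.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda v: 1.0 / (1.0 + np.exp(-v))

# Toy "real" minority-class signals: scalar samples around 3.0
# (a stand-in for feature values of real vibration signals).
real = rng.normal(3.0, 0.5, size=(200, 1))

gw, gb = 1.0, 0.0        # generator: G(z) = gw*z + gb
da, dc = 0.0, 0.0        # discriminator: D(x) = sigmoid(da*x + dc)

lr = 0.05
for step in range(3000):
    # --- discriminator step: real -> label 1, fake -> label 0 ---
    z = rng.normal(size=(64, 1))
    fake = gw * z + gb
    xb = rng.choice(real[:, 0], size=64)
    for x, y in [(xb, 1.0), (fake[:, 0], 0.0)]:
        p = sigmoid(da * x + dc)
        da += lr * np.mean((y - p) * x)   # gradient ascent on log-likelihood
        dc += lr * np.mean(y - p)
    # --- generator step: push D(G(z)) toward 1 ---
    z = rng.normal(size=(64, 1))[:, 0]
    g = gw * z + gb
    p = sigmoid(da * g + dc)
    # gradient descent on -log D(G(z)) with respect to gw, gb
    gw += lr * np.mean((1 - p) * da * z)
    gb += lr * np.mean((1 - p) * da)

# Augment: add generated candidate signals to the small real set.
aug = np.concatenate([real, gw * rng.normal(size=(100, 1)) + gb])
```

After training, the generator's offset has drifted toward the real signals' mean, so the generated candidates are statistically close to the first signal set rather than simple copies of it.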
In a possible implementation, when the classification model is trained according to the augmented first signal set, the signals are converted from the time domain to the frequency domain: envelope detection is performed on each signal in the augmented first signal set to extract its envelope, then a Fourier transform is applied to each envelope to obtain the envelope spectrum of each signal in the augmented first signal set; the envelope spectrum of each signal in the augmented first signal set is input into the classification model to train it.
By this method, converting the signals from the time domain to the frequency domain extracts their frequency-domain features, which effectively reduces the data dimension of the inputs to the classification model, reduces the number of parameters in the classification model, and better ensures its generalization.
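The envelope-spectrum extraction above (envelope detection, then a Fourier transform of the envelope) can be sketched in NumPy. The FFT-based Hilbert transform used here for envelope detection is one common choice, not something the claims prescribe.

```python
import numpy as np

def envelope_spectrum(signal, fs):
    """Envelope spectrum: Hilbert envelope, then Fourier transform."""
    n = len(signal)
    # Analytic signal via the FFT-based Hilbert transform.
    spec = np.fft.fft(signal)
    h = np.zeros(n)
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0
    else:
        h[1:(n + 1) // 2] = 2.0
    analytic = np.fft.ifft(spec * h)
    envelope = np.abs(analytic)                # envelope detection
    env = envelope - envelope.mean()           # drop the DC component
    amp = np.abs(np.fft.rfft(env)) / n         # envelope spectrum
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    return freqs, amp

# A bearing-like test signal: a 200 Hz carrier amplitude-modulated at 10 Hz,
# so the envelope spectrum should peak at the 10 Hz modulation frequency.
fs = 2000
t = np.arange(2000) / fs
x = (1.0 + 0.5 * np.cos(2 * np.pi * 10 * t)) * np.cos(2 * np.pi * 200 * t)
freqs, amp = envelope_spectrum(x, fs)
peak_hz = freqs[np.argmax(amp)]
```

For this signal the carrier disappears from the envelope and the modulation frequency dominates, which is exactly why envelope spectra are a compact input for fault classification.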
In one possible implementation, the signals in the augmented first signal set are of the same type as the signals in the original first signal set.
By this method, the signals in the augmented first signal set have the same type as the signals in the original first signal set, which can improve the accuracy of the classification model when it is subsequently trained.
In one possible implementation, applied in a fault diagnosis scenario, the type of the signals in the first signal set represents a fault state of a mechanical device; the signals in the first signal set may be vibration signals generated when the mechanical device experiences that fault.
By the method, the classification model trained and completed based on the signal sets can be applied to fault diagnosis of the mechanical equipment by acquiring the signal sets of the mechanical equipment in different fault states.
In a second aspect, an embodiment of the present application further provides a classification model training device, which includes an obtaining module, an augmentation model, and a training module: the obtaining module is configured to obtain a first signal set, where the first signal set includes a plurality of signals; the augmentation model is configured to receive the signals in the first signal set as input and augment the first signal set, where the number of signals in the augmented first signal set is larger than that in the original first signal set; the training module is configured to train the classification model according to the augmented first signal set, and the trained classification model is used to determine the type of an input signal to be classified, where the augmentation model and the classification model adopt neural network models with different structures.
In one possible implementation, the augmentation model includes a generator and a discriminator, and the augmentation model is specifically configured to: generate a candidate signal set with the generator; and input the first signal set and the candidate signal set to the discriminator to augment the first signal set.
In a possible implementation, the training module is specifically configured to: obtain an envelope spectrum of each signal in the augmented first signal set; and input the envelope spectrum of each signal in the augmented first signal set to the classification model to train the classification model.
In one possible implementation, the signals in the augmented first signal set are of the same type as the signals in the first signal set.
In a possible implementation, the type of the signals in the first signal set represents a fault state of a mechanical device.
In a third aspect, an embodiment of the present application further provides a computer device, where the computer device includes a processor and a memory, and may further include a communication interface, and the processor executes program instructions in the memory to perform the method provided in the first aspect or any possible implementation manner of the first aspect. The memory is coupled to the processor and stores program instructions and data necessary for the training of the classification model. The communication interface is used for communicating with other devices, such as acquiring the first signal set from other devices.
In a fourth aspect, the present application provides a computing device system comprising at least one computing device. Each computing device includes a memory and a processor. A processor of at least one computing device is configured to access code in the memory to perform the method provided by the first aspect or any one of its possible implementations.
In a fifth aspect, the present application provides a classification apparatus, which includes a processing module and a classification model, where the classification model is trained by the method of the first aspect or any possible design of the first aspect. The processing module may extract an envelope spectrum of a signal to be classified (e.g., a vibration signal of a mechanical device) and input it to the classification model, and the classification model outputs the type of the signal to be classified according to the input envelope spectrum.
In a possible implementation, the classification apparatus includes a preprocessing module, which is configured to perform preprocessing, such as signal denoising, on the signal to be classified before the processing module extracts its envelope spectrum. The processing module then extracts the envelope spectrum of the preprocessed signal to be classified.
In a sixth aspect, the present application provides a non-transitory readable storage medium storing a program that, when executed by a computing device, performs the method provided in the first aspect or any possible implementation of the first aspect. The storage medium includes, but is not limited to, volatile memory such as random access memory, and non-volatile memory such as flash memory, a hard disk drive (HDD), or a solid state drive (SSD).
In a seventh aspect, the present application provides a computing device program product comprising computer instructions that, when executed by a computing device, perform the method provided in the first aspect or any possible implementation of the first aspect. The computer program product may be a software installation package, which can be downloaded and executed on a computing device whenever the method provided in the first aspect or any possible implementation of the first aspect is needed.
Drawings
FIG. 1 is a system architecture diagram provided herein;
FIG. 2 is a schematic diagram of another system architecture provided herein;
FIG. 3 is a schematic illustration of a vibration signal provided herein;
FIG. 4 is a schematic flow chart of a method for training a classification model provided herein;
FIG. 5 is a schematic structural diagram of the DCGAN provided herein;
FIG. 6 is a schematic diagram of an envelope spectrum of a vibration signal provided herein;
fig. 7 is a schematic structural diagram of CNN provided in the present application;
fig. 8 is a schematic flowchart of a vibration signal classification method provided in the present application;
FIG. 9 is a schematic diagram of a classification model training apparatus 900 according to an embodiment of the present application;
fig. 10 is a schematic diagram of a computing device 1000 provided by an embodiment of the present application;
fig. 11 is a schematic diagram of a computing device 1100 in a computing device system according to an embodiment of the present application.
Detailed Description
The application provides a method and a device for training a classification model, which are used for solving the problem of poor accuracy of the classification model.
In the embodiments of the present application, the neural network model used to determine the type of an input signal is referred to as a classification model. A neural network model is a mathematical computation model that mimics the structure and function of a biological neural network (the central nervous system of an animal); it includes a plurality of network layers, each configured with weights and a computational formula. Neural network models currently have various structures; models with different structures can be used in different scenarios (for example, classification and signal augmentation) or provide different effects in the same scenario. The differences between structures specifically include one or more of the following: different network layers, a different ordering of the network layers, and different weights, parameters, or computational formulas in each network layer. Generally, a classification model is a supervised-learning neural network model; that is, when training the classification model, the training set must include labeled signals, and for the signals in the training set of the classification model, the label of each signal is the type of that signal. Therefore, in the present application, the training set used for training the classification model includes a plurality of signal sets of different types, and the signals in each signal set belong to the same type (i.e., the signals in each signal set have the same label).
If the classification model needs to implement fault diagnosis of the mechanical equipment, signals in different signal sets in the training set are signals of operation of the mechanical equipment obtained when the mechanical equipment has different faults, and a signal in each signal set is labeled as a fault type of the mechanical equipment, for example, a signal in one signal set in the training set is labeled as a loosening type (a signal indicating operation of the mechanical equipment obtained when a part in the mechanical equipment is loosened), a signal in another signal set is labeled as a wear type (a signal indicating operation of the mechanical equipment obtained when a part in the mechanical equipment is worn), and the like.
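As a concrete illustration, such a labeled training set can be represented as an array of signals plus per-signal integer fault-type labels; the class names, set sizes, and signal length below are hypothetical placeholders, not values from the patent.

```python
import numpy as np

rng = np.random.default_rng(42)
# Hypothetical signal sets: each row is one vibration signal of 256 samples.
loosening = rng.normal(size=(100, 256))   # label 0: "loosening" fault type
wear      = rng.normal(size=(80, 256))    # label 1: "wear" fault type

X = np.vstack([loosening, wear])          # all training signals
y = np.array([0] * 100 + [1] * 80)        # fault-type label per signal
```

Each signal set shares a single label, matching the "one fault type per signal set" convention described above.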
For a signal set in the training set, if the number of signals in the signal set is smaller than a threshold, the signal set is a small sample signal set; otherwise, it is a large sample signal set.
When training the classification model based on the training set, the following two methods can be adopted:
In the first method, the signals in each signal set in the training set are input into the classification model, so that the classification model learns the rules and features of the labeled signals while the weight of each network layer in the classification model is continuously adjusted; the training of the classification model is finished when the loss function converges.
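The loop just described (adjust the layer weights until the loss function converges) can be sketched with a minimal softmax classifier in NumPy; the toy data, learning rate, and stopping rule are illustrative choices, not the patent's.

```python
import numpy as np

rng = np.random.default_rng(7)
# Toy labeled training set: two well-separated classes of feature vectors.
X = np.vstack([rng.normal(-2, 1, size=(100, 8)),
               rng.normal(+2, 1, size=(100, 8))])
y = np.array([0] * 100 + [1] * 100)
Y = np.eye(2)[y]                          # one-hot labels

W = np.zeros((8, 2))
b = np.zeros(2)
prev_loss, lr = np.inf, 0.1
for step in range(10000):
    logits = X @ W + b
    logits -= logits.max(axis=1, keepdims=True)       # numerical stability
    p = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    loss = -np.mean(np.sum(Y * np.log(p + 1e-12), axis=1))
    if prev_loss - loss < 1e-6:           # loss function has converged
        break
    prev_loss = loss
    grad = (p - Y) / len(X)               # cross-entropy gradient
    W -= lr * (X.T @ grad)                # adjust the layer weights
    b -= lr * grad.sum(axis=0)

acc = np.mean((X @ W + b).argmax(axis=1) == y)
```

A real classification model would stack many such layers, but the convergence-based stopping criterion is the same.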
When the number of signals in each signal set in the training set is large (for example, each signal set includes 5000- signals), this method works well. However, in some classification scenarios (such as fault diagnosis of mechanical equipment or sound classification of rare marine creatures), it is difficult to acquire some specific types of signals, so the number of signals of those types in the training set is smaller, and such signal sets are small sample signal sets. When a small sample signal set exists in the training set, training the classification model in this way means the trained classification model cannot accurately identify the type of the small sample signal set, and the accuracy of the classification model is reduced.
In the second method, the training set is preprocessed, the signals in each signal set of the preprocessed training set are input into the classification model in turn, and the classification model is trained based on the preprocessed training set.
When the training set is preprocessed, the number of signals in the small sample signal set in the training set may be increased; for example, signals may be generated for the small sample signal set by oversampling and added to it, so that its number of signals increases and reaches the same order of magnitude as the number of signals in a large sample signal set. Oversampling may duplicate some signals drawn from the small sample signal set and add the resulting copies to it. Oversampling may also add random noise or white noise to some signals in the small sample signal set, or simply filter them, to form new signals that are added to the set, so that the number of signals in the small sample signal set reaches the same order of magnitude as that in a large sample signal set.
In the training process of the classification model, the features that the classification model extracts from the small sample signal set of the preprocessed training set are the same as those it would extract from the small sample signal set before preprocessing. Thus, preprocessing by oversampling does not add signals with richer features to the small sample signal set. Preprocessing by oversampling also easily increases the noise signals in the small sample signal set, which makes the trained classification model overfit and reduces classification accuracy.
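The limitation just described is easy to see numerically: oversampling by duplication raises the signal count but adds no new distinct signals. The array sizes here are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
small = rng.normal(size=(50, 128))             # small sample signal set
idx = rng.integers(0, len(small), size=450)    # draw copies with replacement
oversampled = np.vstack([small, small[idx]])   # now 500 signals in the set

n_distinct = np.unique(oversampled, axis=0).shape[0]
# 500 signals, but still only the original 50 distinct ones: the set's
# feature content is unchanged, which is what invites overfitting.
```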
In the embodiment of the application, to improve the accuracy of the trained classification model, the classification model is trained by a classification model training device that can increase the number of signals in the small sample signal set in the training set. Using an augmentation model, the classification model training device extracts richer features from the signals in the small sample signal set and infers, from those features, signals with the same or similar features, which are then added to the small sample signal set. In the embodiment of the present application, this way of inferring signals with the same or similar features from the richer features of the signals in the small sample signal set, and thereby increasing the number of signals in the small sample signal set in the training set, is referred to as augmentation. Unlike oversampling, augmentation based on an augmentation model does not simply transform the signals in the small sample signal set; instead, it extracts richer features of those signals and adds signals with the same or similar features. When the classification model is trained with the augmented training set, it can learn more features from the signal set, which improves its accuracy; moreover, augmentation based on the augmentation model is unlikely to generate excessive noise signals, which helps avoid overfitting of the classification model. In the present application, the augmentation model employs a neural network model.
Before describing the method for training the classification model provided in the embodiment of the present application, a system architecture to which the embodiment of the present application is applied is described.
The method for training the classification model provided in the embodiment of the present application may be executed by a classification model training device, and the position where the classification model training device is deployed is not limited in the embodiment of the present application. For example, as shown in fig. 1, the classification model training apparatus may be run on a cloud computing device system (including at least one cloud computing device, such as a server, etc.), may also be run on an edge computing device system (including at least one edge computing device, such as a server, a desktop, etc.), and may also be run on various terminal computing devices, such as: notebook computers, personal desktop computers, and the like.
The classification model training device may also be a device composed of multiple parts; for example, it may include an acquisition module, a preprocessing module, an augmentation model, and a training module, and these components may each be deployed in different systems or servers. For example, as shown in fig. 2, the parts of the apparatus may run in three environments, namely the cloud computing device system, the edge computing device system, and the terminal computing device, or in any two of the three. The cloud computing device system, the edge computing device system, and the terminal computing device are connected by communication paths and can communicate with each other and transmit data. The training method of the classification model provided by the embodiment of the application is then executed cooperatively by the parts of the classification model training device running in the three environments (or in any two of them).
It should be understood that, in the embodiment of the present application, each signal set in the initial training set may be collected in advance by a collecting device and then sent to the classification model training apparatus. In a fault diagnosis scenario for mechanical equipment, the collecting device can acquire, through a vibration sensor installed in the mechanical equipment, the vibration signal of the equipment when a fault occurs, and use it as a signal in the signal set of that fault type; the collected vibration signals are labeled manually with their fault types, the vibration signals belonging to the same fault type form one signal set, and after collecting a plurality of signal sets the collecting device sends them to the classification model training apparatus as an initial training set. The embodiment of the present application does not limit the specific form of the collecting device; for example, when collecting vibration signals of different fault types of mechanical devices, the collecting device may be a device connected to one or more mechanical devices and having a vibration signal collecting function, or may be a module in the mechanical devices.
Fig. 3 is a visualization of a vibration signal collected by the vibration sensor, in which the abscissa is the serial number of the sampling point in the vibration signal and the ordinate is the vibration acceleration in m/s².
The vibration signal collected by the vibration sensor records the vibration acceleration of the mechanical equipment at each sampling point, where a sampling point is a moment at which the vibration acceleration is collected by the vibration sensor. The embodiment of the present application takes as an example a collected vibration signal in which the vibration acceleration at each sampling point is recorded. It should be understood that other information may be recorded in the vibration signal, such as vibration amplitude or vibration velocity; the information recorded in the vibration signal is not limited in the embodiment of the present application.
The following describes the training method of the classification model with reference to the accompanying drawings, taking the fault diagnosis scenario of mechanical equipment as an example.
As shown in fig. 4, a method for training a classification model according to an embodiment of the present application includes:
Step 401: the classification model training apparatus preprocesses each signal in an initial training set (for example, signal denoising and signal interception). The initial training set includes a plurality of signal sets, and the signals in different signal sets are of different types. Each signal set consists of vibration signals of the mechanical equipment under the same fault, so the type of each signal set can be a fault type (also called a fault state) of the mechanical equipment, and each signal set includes a plurality of vibration signals. Each signal in each signal set of the initial training set is labeled with the fault type to which it belongs.
Signal denoising refers to removing abnormal values in a signal, and signal interception refers to cutting out the signal within a certain time period; both preprocessing modes are explained below. It should be noted that the embodiment of the present application does not limit when signal denoising is executed: the classification model training apparatus may perform signal denoising directly after receiving the initial training set, as in step 401, or may perform it after step 403. The embodiment shown in fig. 4 takes performing signal denoising in step 401 as an example.
Step 402: the classification model training device determines a large sample signal set and a small sample signal set.
The initial training set can include a plurality of signal sets of different fault types. The classification model training apparatus can determine large sample signal sets and small sample signal sets in the initial training set according to a preset threshold: a signal set whose number of signals is not less than the preset threshold is determined to be a large sample signal set, and a signal set whose number of signals is less than the preset threshold is determined to be a small sample signal set. The embodiment of the present application does not limit the numbers of small sample signal sets and large sample signal sets in the initial training set; one or more of each may exist.
It should be noted that, in the present application, the determination of the large sample signal set and the small sample signal set depends on a preset threshold, and the present application does not limit the specific value of the preset threshold, and the preset threshold may be an empirical value, or a value determined after integrating the number of signals in each signal set in the initial training set.
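The threshold-based partition described above can be sketched as follows. This is a hypothetical illustration: the fault-type names, signal counts, and threshold value are assumptions, not values from this application.

```python
# Hypothetical sketch of step 402: partition the initial training set into
# large-sample and small-sample signal sets by comparing each set's signal
# count with a preset threshold. Names and numbers are illustrative only.

def split_by_threshold(initial_training_set, threshold):
    """Return ({large sample sets}, {small sample sets}) keyed by fault type."""
    large, small = {}, {}
    for fault_type, signals in initial_training_set.items():
        if len(signals) >= threshold:      # "not less than" the threshold
            large[fault_type] = signals
        else:                              # fewer signals than the threshold
            small[fault_type] = signals
    return large, small

# Toy initial training set: three fault types with different signal counts.
training_set = {
    "inner_race_fault": [[0.1] * 8] * 500,
    "outer_race_fault": [[0.2] * 8] * 480,
    "cage_fault":       [[0.3] * 8] * 20,   # under-represented fault type
}
large, small = split_by_threshold(training_set, threshold=100)
```

With a threshold of 100, the first two fault types become large sample signal sets and the under-represented "cage_fault" set becomes the small sample signal set to be augmented.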
Step 403: the classification model training apparatus augments the small sample signal set determined in step 402 based on the augmentation model to generate an augmented signal set, where the number of signals in the augmented signal set is greater than the number of signals in the small sample signal set and is not less than the preset threshold.
The extended training set is composed of the augmented signal set obtained in step 403 and the large sample signal set determined in step 402. Optionally, the numbers of signals in the signal sets of the extended training set are of the same order of magnitude.
Step 404: the classification model training apparatus calculates the envelope spectrum of each signal in the extended training set; the obtained envelope spectra of the signals in the extended training set constitute a spectrum training set.
For example, the classification model training apparatus may extract an envelope of each signal in the extended training set, and then perform fourier transform on the envelope of each signal in the extended training set to obtain an envelope spectrum of each signal in the extended training set. It is noted that the envelope spectrum of each signal still carries the label of each signal.
Step 405: the classification model training apparatus trains the classification model according to the spectrum training set. In the present application, any neural network model available for classification in the industry may be used as the classification model. The envelope spectra of the signals in the spectrum training set are input into the classification model; the classification model gradually extracts the features of each envelope spectrum and predicts its type. The predicted type is compared with the labeled type of the signal, and the parameters of each layer in the classification model are modified accordingly, so that the classification model learns the features of the envelope spectra of the signals in each signal set of the spectrum training set until the classification model learns to classify them.
In the embodiment of the application, the augmentation model and the classification model adopt neural network models with different structures.
The foregoing steps 401 to 405 will each be described in more detail below.
In step 401, the classification model training apparatus may perform multiple preprocessing operations on each signal in the initial training set. Two of them, signal interception and signal denoising, are specifically described below; the classification model training apparatus may perform either or both of these operations.
1. Signal interception.
One signal in the initial training set may be a signal collected by the collecting device within a preset time period; the longer the time period, the longer the duration the signal represents. Signals of long duration are not needed in the subsequent training of the classification model, so the signals can be intercepted: for example, the classification model training apparatus can intercept each signal in the initial training set, keeping only segments of a fixed duration.
As a possible implementation manner, the classification model training apparatus may intercept one signal in the initial training set into a plurality of signals of fixed duration, and the embodiment of the present application does not limit the position of the intercepted signal. For example, the classification model training device may intercept a signal of a fixed duration at each interception position with the first sampling point of the signal as a starting point and 200 sampling points at intervals as one interception position.
The specific length of the fixed duration is not limited in the embodiment of the present application; it may be an empirical value, a randomly determined value, or a value determined according to the rotation speed of the mechanical equipment and the sampling frequency of the collecting device. For example, if the operation cycle of the mechanical equipment is m seconds (the time of one rotation of the mechanical equipment may be taken as the operation cycle) and the collecting device collects the vibration signal through the vibration sensor at a frequency of n kilohertz (kHz), the fixed duration may be set to cover an integral multiple of m × n × 1000 sampling points, so that the vibration signal within the fixed duration more completely reflects the vibration of the mechanical equipment over a complete operation cycle.
2. Signal denoising.
Outliers are usually present in the signals in the initial training set, for example: in the process of collecting the vibration signal of the mechanical equipment by the collecting equipment through the vibration sensor, due to the influence of the environment and the vibration sensor, some abnormal values exist in the vibration signal, for example, the signal intensity of the vibration signal at a certain point suddenly increases or decreases, and the abnormal values are formed. The classification model training device can remove the abnormal value in each signal in the initial training set, and the process of removing the abnormal value can also be called signal denoising.
For any signal, the classification model training device may remove outliers in the signal based on the 3sigma principle.
For example, for a vibration signal collected by the collecting device through the vibration sensor, the interval (μ - 3σ, μ + 3σ) may be calculated, where μ is the average vibration acceleration over the sampling points in the vibration signal and σ is the standard deviation of the vibration acceleration over those sampling points; the sampling points whose vibration acceleration is not within (μ - 3σ, μ + 3σ) may then be removed.
It should be understood that the embodiment of the present application does not limit the order of signal denoising and signal interception: signal denoising may be performed first, or signal interception may be performed first. If the duration of each signal in the initial training set is uniform and short, signal interception is not needed; if the signals in the initial training set have no or few abnormal values, signal denoising is not needed.
In the foregoing step 403, since the number of signals in the small sample signal set is small, an augmented signal set may be generated by augmentation. In this embodiment, the classification model training apparatus augments the small sample signal set based on an augmentation model, where the augmentation model adopts a neural network model. This embodiment does not limit the structure of the augmentation model; for example, it may be any model based on the generative adversarial network (GAN) framework, such as a deep convolutional generative adversarial network (DCGAN), or another neural network model that can generate new signals based on the characteristics of the signals in the small sample signal set.
The following describes the principle of GAN:
The GAN includes a generator G and a discriminator D. The generator G is used for generating a candidate signal set based on the input of the GAN, and the discriminator D, connected to the output of the generator G, is used for discriminating whether the candidate signal set output by the generator G is a true signal set.
In the GAN, the generator G and the discriminator D are trained in an alternating, adversarial process. An arbitrary signal set is used as the input of the generator G, and the generator G outputs a candidate signal set. The discriminator D takes as inputs the candidate signal set generated by the generator G and a small sample signal set (such as the small sample signal set in the embodiment of the present application), compares their features, and outputs the probability that the candidate signal set and the small sample signal set belong to the same type of signal set (a candidate signal set of the same type as the small sample signal set is also called a true signal set; the signals in a true signal set and in the small sample signal set have the same or similar features). The parameters in the generator G are optimized according to the output probability that the candidate signal set is a true signal set (the parameters in the discriminator D are unchanged at this time) until the probability that the discriminator D judges the candidate signal set output by the generator G to be a true signal set is greater than a threshold. The discriminator D then optimizes the parameters of each of its internal network layers according to its output probability (the parameters in the generator G are unchanged at this time), so that the discriminator D can again distinguish whether the candidate signal set output by the generator G and the small sample signal set belong to the same class. The parameters in the generator G and the discriminator D are optimized alternately in this way until the generator G generates candidate signal sets for which the discriminator D cannot discriminate whether they are true signal sets.
As can be seen from the training process, alternately training the generator G and the discriminator D is a process in which the generator G and the discriminator D play a game against each other. When the candidate signal set generated by the generator G has the same or similar features as the small sample signal set, that is, when the candidate signal set is close to a true signal set, the discriminator D cannot accurately discriminate whether the input signal set is a true signal set, and the GAN training is complete.
The GAN framework does not limit the structures of the generator G and the discriminator D but defines their functions in principle; the DCGAN is a concrete model under the GAN framework that specifies the structures of the generator G and the discriminator D.
The following describes a manner of augmenting a small sample signal set in the present application, taking a DCGAN as the augmentation model as an example:
Fig. 5 is a schematic structural diagram of a DCGAN. The DCGAN includes a generator G and a discriminator D, where the generator G includes one fully-connected layer and four deconvolution (transposed convolution) layers, and the discriminator D includes four convolution layers and two fully-connected layers.
In the generator G of the DCGAN, the input value is a four-dimensional matrix Z that can be defined by the user. The generator G first uses the fully-connected layer to raise the dimension, mapping the input value into a high-dimensional space whose dimensions match the input requirements of the subsequent deconvolution layers. The four deconvolution layers then perform upsampling, progressively recovering the high-dimensional hidden features (that is, latent features in the high-dimensional space) of the input four-dimensional matrix Z, and the output of the last deconvolution layer is the candidate signal set generated by the generator G.
It should be noted that the high-dimensional features mentioned in the embodiment of the present application are high-dimensional relative to the input value: because of the fully-connected layer, the subsequent dimensions are increased, and the features extracted by the deconvolution layers are high-dimensional hidden features.
The candidate signal set generated by the generator G is input into the discriminator D of the DCGAN. Here, the small sample signal set and the candidate signal set generated by the generator G are mixed to form a mixed signal set, organized in a matrix of the form (N, H, W, C): N indicates the number of signals in one mixed signal set; H indicates the height of the signals in the mixed signal set (since a vibration signal is a one-dimensional vector, H is 1 in this example); W indicates the width of the signals, i.e., the interception length of the signals; and C indicates the number of channels of the signal set. For example, in a fault diagnosis scenario of mechanical equipment, the number of vibration sensors can be used as the number of channels.
As a possible embodiment, the classification model training apparatus may mix the small sample signal set and the candidate signal set generated by the generator G, and then divide the mixed signal into a plurality of sets of signal sets, where each set of signal set includes one or more signals in the small sample signal set and the candidate signal set generated by the generator G, and each set of signal sets is organized into a matrix form of (N, H, W, C), where the description of N, H, W, C may refer to the foregoing description, and is not repeated here.
The above-mentioned manner of constructing the small sample signal set and the candidate signal set generated by the generator G into the matrix meeting the requirement of the input value of the discriminator D is only an example, and the embodiment of the present application does not limit the manner of organizing the small sample signal set and the candidate signal set generated by the generator G into other types of matrix forms and inputting the matrix forms into the discriminator D.
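The mixing and (N, H, W, C) packing described above can be sketched as follows. Only the shapes matter here; the signal values and set sizes are placeholders, not values from this application.

```python
# Illustrative sketch of mixing the small sample signal set with candidate
# signals from the generator G and organizing them as an (N, H, W, C) batch
# for the discriminator D.
import random

W, C = 480, 1                       # intercepted length; one vibration sensor
real = [[[[0.1] for _ in range(W)]] for _ in range(16)]   # small-sample signals
fake = [[[[0.9] for _ in range(W)]] for _ in range(16)]   # generated candidates

batch = real + fake
random.shuffle(batch)               # mix real and candidate signals

N = len(batch)                      # number of signals in the mixed set
H = len(batch[0])                   # H = 1, since a vibration signal is 1-D
shape = (N, H, len(batch[0][0]), len(batch[0][0][0]))
```

The resulting batch has shape (32, 1, 480, 1): 32 mixed signals, height 1, width equal to the interception length, and one channel per vibration sensor.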
The discriminator D of the DCGAN uses 4 convolution layers to extract, layer by layer, the low-dimensional hidden features of each signal in the input signal set, and then uses 2 fully-connected layers to judge, based on those low-dimensional hidden features, the probability that the signal set is a true signal set.
The discriminator D uses the convolution layers to extract features of the signals in the input signal set; the extracted features, called low-dimensional hidden features, are the signals mapped into a low-dimensional space and better reflect the characteristics of the signals in the signal set. Based on these low-dimensional hidden features, the discriminator D can discriminate the input signal set more accurately.
The generator G and the discriminator D in the DCGAN are trained alternately until the discriminator D can hardly judge whether the candidate signal set generated by the generator G is a true signal set (for example, the probability output by the discriminator D is close to 0.5). In this case, the candidate signal set generated by the generator G can be considered close to a true signal set and is taken as the augmented signal set.
After the small sample signal set is augmented by the DCGAN to generate an augmented signal set, the augmented signal set and the large sample signal set together form the extended training set. The classification model training apparatus may then perform step 404. The manner in which the classification model training apparatus acquires the envelope spectrum of each signal in the extended training set is described below, taking an augmented signal set in the extended training set as an example:
the classification model training device can extract the envelope of each signal in the amplified signal set in an envelope detection mode. The envelope detection (envelope-demodulation) is a method for processing signals based on filtering detection, and is a process of extracting low-frequency and low-frequency features from high-frequency signals, and further extracting an envelope of each signal in an amplified signal set. Illustratively, the envelope of each signal in the set of amplified signals may be obtained by performing envelope detection in a Hilbert (Hilbert) transform.
For a vibration signal such as the one shown in fig. 3, the envelope of the vibration signal describes the variation trend of the vibration acceleration over the sampling points in the time domain. For any signal in the augmented signal set, the classification model training apparatus can use the Fourier transform to decompose the envelope of the signal into a plurality of sinusoidal signals with different frequencies and different amplitudes, and then project these sinusoidal signals into the frequency-domain space to generate the envelope spectrum of the signal; the envelope of the signal thus becomes an envelope spectrum that describes the variation of amplitude with frequency.
Fig. 6 is a visualization of the envelope spectrum of the vibration signal shown in fig. 3, where the abscissa is frequency and the ordinate is the amplitude corresponding to each frequency. The envelope spectrum shown in fig. 6 describes, in the frequency-domain space, the relationship between the frequency and the amplitude of a vibration signal in the augmented signal set, and represents the frequency-domain features of that vibration signal.
For the other signal sets in the extended training set, such as the large sample signal set, the classification model training apparatus may obtain the envelope spectrum of each signal in the same manner as for the signals in the augmented signal set.
In step 404, envelope detection on each signal in the extended training set filters out the noise in the signal, that is, the high-frequency components, and extracts the effective part of the signal, that is, the low-frequency components with better generalization capability, which effectively helps ensure the accuracy of the classification model obtained by subsequent training. The Fourier transform converts the signal from the time domain to the frequency domain and extracts the frequency-domain features of each signal in the extended training set; this conversion reduces the data dimensionality of the input values of the subsequent classification model, which can reduce the number of parameters in the classification model and better ensure its generalization.
The envelope spectra of all signals in the extended training set form a new training set, called the spectrum training set, and the classification model training apparatus executes step 405 to train the classification model. The number of network layers in the classification model and the weights and calculation formulas in each network layer are not limited in the embodiments of the present application; the classification model may be a model composed of fully-connected layers, or a model composed of convolution layers, fully-connected layers, and pooling layers.
The classification model may be a Convolutional Neural Network (CNN) or a neural network model with other structures, which is not limited in the embodiment of the present application.
The following description takes as an example a classification model that is a CNN including two convolution layers, two pooling layers, and two fully-connected layers.
As shown in fig. 7, the structure of the CNN, from input to output, is: a convolution layer, a pooling layer, a convolution layer, a pooling layer, a fully-connected layer, and a fully-connected layer.
The convolution layers are used for extracting the features of the signal. The pooling layers keep the main features of the signal while reducing the parameters (achieving dimension reduction) and the computation introduced in processing, preventing overfitting of the CNN and improving its generalization capability. The fully-connected layers combine the features output by the preceding convolution and pooling layers and map them to a label space (the space formed by taking the type of each signal as its label) to obtain an output value.
The classification model training apparatus can input the signals in the spectrum training set into the CNN in the (N, H, W, C) form. The alternating convolution and pooling layers in the CNN extract the features of the signals, and the fully-connected layers then map these features to the label space, so as to judge the type of the signal input into the CNN. The weights of the network layers in the CNN are adjusted until the type output by the CNN is consistent with the labeled type of the signal; training may then be regarded as complete, and the trained CNN can be used as the classification model.
The trained classification model may be used in a classification apparatus for classifying one or more signals. The classification device comprises a preprocessing module and a classification model, and the classification model is trained by adopting the classification model training device.
When a vibration signal of a certain mechanical device is input into the classification device, the classification device can acquire an envelope frequency spectrum of the vibration signal, the envelope frequency spectrum of the vibration signal is used as an input value of a classification model, and the fault type of the vibration signal is judged.
Fig. 8 shows a process of the classification device performing fault diagnosis based on the classification model, which includes the following steps:
step 801: the classification device performs preprocessing on the input vibration signal, such as signal denoising. The signal denoising method can be referred to in the related description in step 401, and is not described herein again.
Step 802: the classification device extracts the envelope spectrum of the preprocessed vibration signal. For example, the classification device may perform envelope detection on the preprocessed vibration signal to extract its envelope, and then perform a Fourier transform on the envelope to generate the envelope spectrum. The way the classification device extracts the envelope spectrum of the preprocessed vibration signal is similar to the way the classification model training apparatus extracts the envelope spectrum of each signal in the extended training set in step 404, and is not described here again.
Step 803: the classification device may input the envelope spectrum of the preprocessed vibration signal into the classification model, and determine the fault type to which the vibration signal belongs according to the output of the classification model.
Based on the same inventive concept as the method embodiments, the embodiment of the present application further provides a classification model training apparatus, which is used for executing the method executed by the classification model training apparatus in the method embodiments. As shown in fig. 9, the classification model training apparatus 900 includes an obtaining module 901, an augmentation model 902, and a training module 903, which may be software modules. Specifically, in the classification model training apparatus 900, the modules are connected to each other through communication paths.
An obtaining module 901, configured to obtain a first signal set, where the first signal set includes a plurality of signals; the obtaining module 901 may perform the operation of the classification model training apparatus obtaining the initial training set in the foregoing method embodiments.
An augmentation model 902, configured to receive signals in the first signal set, and augment the first signal set, where the number of signals in the first signal set after augmentation is greater than the number of signals in the first signal set; the augmented model 902 may perform steps 402-403 as in the embodiment shown in FIG. 4.
And a training module 903, configured to train a classification model according to the augmented first signal set, where the trained classification model is used to determine a type of an input signal to be classified. The training module 903 may perform steps 404-405 as in the embodiment shown in fig. 4.
The classification model training apparatus 900 may further include a preprocessing module 904; after the obtaining module 901 obtains the first signal set, the preprocessing module 904 may preprocess the signals in the first signal set. The preprocessing module 904 may perform step 401 in the embodiment shown in fig. 4.
The division of the modules in the embodiments of the present application is schematic and is merely a division by logical function; in actual implementation, there may be other division manners. In addition, the functional modules in the embodiments of the present application may be integrated into one processor, may exist alone physically, or two or more modules may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module.
The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes instructions for causing a terminal device (which may be a personal computer, a mobile phone, or a network device) or a processor to execute all or part of the steps of the methods according to the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The present application further provides a computing device 1000 as shown in fig. 10. The computing device 1000 includes a bus 1001, a processor 1002, a communication interface 1003, and a memory 1004. The processor 1002, the memory 1004, and the communication interface 1003 communicate with each other via the bus 1001.
The processor 1002 may be a central processing unit (CPU). The memory 1004 may include a volatile memory, such as a random access memory (RAM). The memory 1004 may also include a non-volatile memory, such as a read-only memory (ROM), a flash memory, an HDD, or an SSD. The memory 1004 stores executable code, and the processor 1002 executes the executable code to perform the classification model training methods described above. The memory 1004 may also include other software modules required to run processes, such as an operating system. The operating system may be LINUX™, UNIX™, WINDOWS™, and the like.
Specifically, the memory 1004 stores the modules of the apparatus 900. The memory 1004 may also store an initial training set, an extended training set, a spectrum training set, and the like. In addition to the aforementioned modules, the memory 1004 may include other software modules required to run processes, such as an operating system. The operating system may be LINUX™, UNIX™, WINDOWS™, and the like.
The present application also provides a computing device system that includes at least one computing device 1100 as shown in FIG. 11. The computing device 1100 includes a bus 1101, a processor 1102, a communication interface 1103, and a memory 1104. The processor 1102, the memory 1104, and the communication interface 1103 communicate with each other via the bus 1101. The computing devices 1100 in the computing device system communicate with one another over a communication path.
The processor 1102 may be a CPU. The memory 1104 may include volatile memory, such as random access memory. The memory 1104 may also include non-volatile memory, such as read-only memory, flash memory, an HDD, or an SSD. The memory 1104 stores executable code that the processor 1102 executes to perform any or all of the classification model training methods described above. The memory 1104 may also include other software modules required to run processes, such as an operating system, which may be LINUX™, UNIX™, WINDOWS™, or the like.
Specifically, the memory 1104 stores any one or more modules of the apparatus 900. The memory 1104 may further store an initial training set, an extended training set, a frequency spectrum training set, and the like. In addition to any one or more of the foregoing modules, the memory 1104 may further include an operating system and other software modules required for running a process. The operating system may be LINUX™, UNIX™, WINDOWS™, or the like.
The computing devices 1100 in the computing device system, each running any one or more of the modules of the apparatus 900, establish communication with one another over a communication network. The at least one computing device 1100 collectively performs the aforementioned operations of augmenting the small-sample signal set and training the classification model.
The descriptions of the flows corresponding to the above figures each have their own emphasis; for parts not described in detail in one flow, reference may be made to the related descriptions of the other flows.
In the above embodiments, the implementation may be realized wholly or partially by software, hardware, firmware, or any combination thereof. When software is used, the implementation may take the form, in whole or in part, of a computer program product. A computer program product for classification model training comprises one or more computer program instructions that, when loaded and executed on a computer, produce, in whole or in part, the processes or functions of classification model training according to the embodiments of the present application.
The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from one website, computer, server, or data center to another over a wired link (e.g., coaxial cable, optical fiber, or digital subscriber line) or a wireless link (e.g., infrared or microwave). The computer-readable storage medium may be any available medium accessible to a computer, or a data storage device such as a server or data center integrating one or more available media. The available media may be magnetic media (e.g., a floppy disk, hard disk, or magnetic tape), optical media (e.g., a DVD), or semiconductor media (e.g., an SSD).
Claims (12)
1. A method of training a classification model, the method comprising:
obtaining a first signal set, the first signal set comprising a plurality of signals;
inputting signals in the first signal set into an augmentation model to augment the first signal set, wherein the number of signals in the augmented first signal set is greater than the number of signals in the first signal set;
and training the classification model according to the augmented first signal set, wherein the trained classification model is used for determining the type of an input signal to be classified, and the augmentation model and the classification model adopt neural network models with different structures.
2. The method of claim 1, wherein the augmentation model comprises a generator and a discriminator, and the inputting signals in the first signal set into the augmentation model to augment the first signal set comprises:
generating a candidate signal set with the generator;
inputting the first signal set and the candidate signal set into the discriminator to augment the first signal set.
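Claims 1 and 2 describe a generator that proposes candidate signals and a discriminator that judges candidates against the real signal set before they join it. The following is a minimal, self-contained numpy sketch of that workflow; the single-layer random generator, the logistic-regression discriminator, and the top-k selection rule are all illustrative stand-ins for the neural network models the claims refer to, not the patent's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
SIG_LEN, LATENT = 64, 8

# Small "real" signal set: noisy sinusoids standing in for sampled vibration signals.
t = np.linspace(0.0, 1.0, SIG_LEN, endpoint=False)
real = np.stack([
    np.sin(2 * np.pi * 5 * t + phase) + 0.05 * rng.standard_normal(SIG_LEN)
    for phase in rng.uniform(0.0, 2.0 * np.pi, 20)
])

# Generator: one untrained random linear layer mapping latent noise to a signal.
W_g = 0.3 * rng.standard_normal((LATENT, SIG_LEN))
def generate(count):
    return np.tanh(rng.standard_normal((count, LATENT)) @ W_g)

# Discriminator: logistic regression trained to separate real from generated signals.
w = np.zeros(SIG_LEN)
b = 0.0
def score(x):
    return 1.0 / (1.0 + np.exp(-(x @ w + b)))

candidates = generate(200)
X = np.vstack([real, candidates[:20]])
y = np.concatenate([np.ones(len(real)), np.zeros(20)])
for _ in range(200):                 # a few gradient-descent steps
    g = score(X) - y
    w -= 0.1 * X.T @ g / len(y)
    b -= 0.1 * g.mean()

# Augment: keep the candidates the discriminator scores as most "real".
keep = candidates[np.argsort(score(candidates))[::-1][:40]]
augmented = np.vstack([real, keep])
print(real.shape, augmented.shape)   # (20, 64) (60, 64)
```

The augmented set is strictly larger than the first signal set, matching the claim's size requirement; a real system would adversarially train the generator as well.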
3. The method of claim 1 or 2, wherein the training the classification model according to the augmented first signal set comprises:
obtaining an envelope spectrum of each signal in the augmented first signal set;
inputting the envelope spectrum of each signal in the augmented first signal set into the classification model to train the classification model.
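The envelope spectrum named in claim 3 is conventionally computed from the amplitude envelope of the analytic signal. A short numpy sketch of that conventional computation (the patent does not specify its exact method; the FFT-based Hilbert transform and the synthetic test signal here are assumptions):

```python
import numpy as np

def envelope_spectrum(x, fs):
    """Envelope spectrum via the analytic signal (FFT-based Hilbert transform)."""
    n = len(x)
    h = np.zeros(n)
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0
    else:
        h[1:(n + 1) // 2] = 2.0
    env = np.abs(np.fft.ifft(np.fft.fft(x) * h))   # amplitude envelope
    env = env - env.mean()                         # drop the DC component
    spec = np.abs(np.fft.rfft(env)) / n
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    return freqs, spec

# Amplitude-modulated test signal: 200 Hz carrier with 20 Hz modulation,
# a stand-in for a fault impulse train riding on a structural resonance.
fs = 1000
t = np.arange(fs) / fs                             # 1 second of samples
x = (1.0 + 0.5 * np.cos(2 * np.pi * 20 * t)) * np.sin(2 * np.pi * 200 * t)

freqs, spec = envelope_spectrum(x, fs)
peak_hz = freqs[np.argmax(spec)]
print(peak_hz)   # 20.0 — the modulation frequency, not the 200 Hz carrier
```

This is why the envelope spectrum suits fault classification: the dominant peak sits at the modulation (fault characteristic) frequency rather than at the carrier.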
4. The method of any one of claims 1 to 3, wherein the signals in the augmented first signal set are of the same type as the signals in the first signal set.
5. The method of any one of claims 1 to 4, wherein the type of a signal in the first signal set is a fault condition of a mechanical device.
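Claims 3 to 5 together outline the intended pipeline: compute each signal's envelope spectrum and train a classifier that maps it to a fault condition. A hedged end-to-end sketch of that pipeline, with a logistic-regression classifier standing in for the neural network classification model and synthetic "healthy" versus "faulty" vibration signals (both are illustrative assumptions):

```python
import numpy as np

def env_spec(x):
    # Envelope spectrum: magnitude spectrum of the analytic-signal envelope
    # (FFT-based Hilbert transform; assumes even-length input).
    n = len(x)
    h = np.zeros(n)
    h[0] = h[n // 2] = 1.0
    h[1:n // 2] = 2.0
    env = np.abs(np.fft.ifft(np.fft.fft(x) * h))
    return np.abs(np.fft.rfft(env - env.mean())) / n

fs, n = 1000, 1000
t = np.arange(n) / fs
rng = np.random.default_rng(1)

def make_signal(faulty):
    # "Faulty" signals carry 20 Hz amplitude modulation on a 200 Hz carrier.
    am = 1.0 + (0.5 if faulty else 0.0) * np.cos(2 * np.pi * 20 * t)
    return am * np.sin(2 * np.pi * 200 * t) + 0.1 * rng.standard_normal(n)

labels = np.array([0, 1] * 30)
X = np.stack([env_spec(make_signal(lab)) for lab in labels])

# Logistic-regression classifier trained by gradient descent on the spectra.
w = np.zeros(X.shape[1])
b = 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    g = p - labels
    w -= 0.5 * X.T @ g / len(labels)
    b -= 0.5 * g.mean()

pred = (1.0 / (1.0 + np.exp(-(X @ w + b))) > 0.5).astype(int)
acc = (pred == labels).mean()
print(acc)
```

On this separable synthetic set the training accuracy should be near 1.0, because the fault class has a strong envelope-spectrum peak at 20 Hz that the healthy class lacks; the patent's classifier would be a neural network trained on the augmented set instead.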
6. An apparatus for training a classification model, wherein the apparatus comprises an acquisition module, an augmentation model, and a training module:
the acquisition module is configured to acquire a first signal set, where the first signal set includes a plurality of signals;
the augmentation model is configured to receive signals in the first signal set as input to augment the first signal set, wherein the number of signals in the augmented first signal set is greater than the number of signals in the first signal set;
the training module is configured to train the classification model according to the augmented first signal set, and the trained classification model is used for determining the type of an input signal to be classified, wherein the augmentation model and the classification model adopt neural network models with different structures.
7. The apparatus of claim 6, wherein the augmentation model comprises a generator and a discriminator; the augmentation model is specifically configured to:
generate a candidate signal set with the generator;
input the first signal set and the candidate signal set into the discriminator to augment the first signal set.
8. The apparatus of claim 6 or 7, wherein the training module is specifically configured to:
obtain an envelope spectrum of each signal in the augmented first signal set;
input the envelope spectrum of each signal in the augmented first signal set into the classification model to train the classification model.
9. The apparatus of any one of claims 6 to 8, wherein the signals in the augmented first signal set are of the same type as the signals in the first signal set.
10. The apparatus of any one of claims 6 to 9, wherein the type of a signal in the first signal set is a fault condition of a mechanical device.
11. A computing device system comprising at least one computing device, each computing device comprising a memory and a processor, wherein the memory of the at least one computing device is configured to store computer instructions;
the processor of the at least one computing device executes the computer instructions stored in the memory to perform the method of any one of claims 1 to 5.
12. A non-transitory readable storage medium storing computer instructions that, when executed by a computing device, cause the computing device to perform the method of any one of claims 1 to 5.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910550658.2A CN112131907A (en) | 2019-06-24 | 2019-06-24 | Method and device for training classification model |
PCT/CN2020/077595 WO2020258913A1 (en) | 2019-06-24 | 2020-03-03 | Method and device for training classification model |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910550658.2A CN112131907A (en) | 2019-06-24 | 2019-06-24 | Method and device for training classification model |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112131907A true CN112131907A (en) | 2020-12-25 |
Family
ID=73849920
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910550658.2A Pending CN112131907A (en) | 2019-06-24 | 2019-06-24 | Method and device for training classification model |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN112131907A (en) |
WO (1) | WO2020258913A1 (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112686896A (en) * | 2021-03-12 | 2021-04-20 | 苏州鼎纳自动化技术有限公司 | Glass defect detection method based on frequency domain and space combination of segmentation network |
CN113160402A (en) * | 2021-04-09 | 2021-07-23 | 西安建筑科技大学 | DEM (digital elevation model) augmentation method based on DCGAN (digital elevation model) |
CN113269041A (en) * | 2021-04-25 | 2021-08-17 | 南京南瑞水利水电科技有限公司 | Signal abnormity detection method applied to synchronization device |
CN113298134A (en) * | 2021-05-20 | 2021-08-24 | 华中科技大学 | BPNN-based remote non-contact health monitoring system and method for fan blade |
CN113744030A (en) * | 2021-09-08 | 2021-12-03 | 未鲲(上海)科技服务有限公司 | Recommendation method, device, server and medium based on AI user portrait |
CN117114144A (en) * | 2023-10-24 | 2023-11-24 | 青岛农业大学 | Rice salt and alkali resistance prediction method and system based on artificial intelligence |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114332500B (en) * | 2021-09-14 | 2024-07-19 | 腾讯科技(深圳)有限公司 | Image processing model training method, device, computer equipment and storage medium |
CN115410048B (en) * | 2022-09-29 | 2024-03-19 | 昆仑芯(北京)科技有限公司 | Training of image classification model, image classification method, device, equipment and medium |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2017161756A1 (en) * | 2016-03-23 | 2017-09-28 | Leshi Holding (Beijing) Co., Ltd. | Video identification method and system |
CN108030488A (en) * | 2017-11-30 | 2018-05-15 | 北京医拍智能科技有限公司 | The detecting system of arrhythmia cordis based on convolutional neural networks |
CN109614979A (en) * | 2018-10-11 | 2019-04-12 | 北京大学 | A kind of data augmentation method and image classification method based on selection with generation |
CN109684478A (en) * | 2018-12-18 | 2019-04-26 | 腾讯科技(深圳)有限公司 | Disaggregated model training method, classification method and device, equipment and medium |
CN109765462A (en) * | 2019-03-05 | 2019-05-17 | 国家电网有限公司 | Fault detection method, device and the terminal device of transmission line of electricity |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10557719B2 (en) * | 2014-09-10 | 2020-02-11 | Siemens Energy, Inc. | Gas turbine sensor failure detection utilizing a sparse coding methodology |
CN107811649B (en) * | 2017-12-13 | 2020-12-22 | 四川大学 | Heart sound multi-classification method based on deep convolutional neural network |
CN109489977B (en) * | 2018-12-28 | 2021-03-05 | 西安工程大学 | KNN-AdaBoost-based bearing fault diagnosis method |
-
2019
- 2019-06-24 CN CN201910550658.2A patent/CN112131907A/en active Pending
-
2020
- 2020-03-03 WO PCT/CN2020/077595 patent/WO2020258913A1/en active Application Filing
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2017161756A1 (en) * | 2016-03-23 | 2017-09-28 | Leshi Holding (Beijing) Co., Ltd. | Video identification method and system |
CN108030488A (en) * | 2017-11-30 | 2018-05-15 | 北京医拍智能科技有限公司 | The detecting system of arrhythmia cordis based on convolutional neural networks |
CN109614979A (en) * | 2018-10-11 | 2019-04-12 | 北京大学 | A kind of data augmentation method and image classification method based on selection with generation |
CN109684478A (en) * | 2018-12-18 | 2019-04-26 | 腾讯科技(深圳)有限公司 | Disaggregated model training method, classification method and device, equipment and medium |
CN109765462A (en) * | 2019-03-05 | 2019-05-17 | 国家电网有限公司 | Fault detection method, device and the terminal device of transmission line of electricity |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112686896A (en) * | 2021-03-12 | 2021-04-20 | 苏州鼎纳自动化技术有限公司 | Glass defect detection method based on frequency domain and space combination of segmentation network |
CN112686896B (en) * | 2021-03-12 | 2021-07-06 | 苏州鼎纳自动化技术有限公司 | Glass defect detection method based on frequency domain and space combination of segmentation network |
CN113160402A (en) * | 2021-04-09 | 2021-07-23 | 西安建筑科技大学 | DEM (digital elevation model) augmentation method based on DCGAN (digital elevation model) |
CN113269041A (en) * | 2021-04-25 | 2021-08-17 | 南京南瑞水利水电科技有限公司 | Signal abnormity detection method applied to synchronization device |
WO2022227415A1 (en) * | 2021-04-25 | 2022-11-03 | 南京南瑞水利水电科技有限公司 | Signal anomaly detection method applied to synchronization apparatus |
CN113269041B (en) * | 2021-04-25 | 2023-10-10 | 南京南瑞水利水电科技有限公司 | Signal abnormality detection method applied to synchronous device |
CN113298134A (en) * | 2021-05-20 | 2021-08-24 | 华中科技大学 | BPNN-based remote non-contact health monitoring system and method for fan blade |
CN113744030A (en) * | 2021-09-08 | 2021-12-03 | 未鲲(上海)科技服务有限公司 | Recommendation method, device, server and medium based on AI user portrait |
CN117114144A (en) * | 2023-10-24 | 2023-11-24 | 青岛农业大学 | Rice salt and alkali resistance prediction method and system based on artificial intelligence |
CN117114144B (en) * | 2023-10-24 | 2024-01-26 | 青岛农业大学 | Rice salt and alkali resistance prediction method and system based on artificial intelligence |
Also Published As
Publication number | Publication date |
---|---|
WO2020258913A1 (en) | 2020-12-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112131907A (en) | Method and device for training classification model | |
JP2022523563A (en) | Near real-time detection and classification of machine anomalies using machine learning and artificial intelligence | |
US20200265119A1 (en) | Site-specific anomaly detection | |
CN107111610B (en) | Mapper component for neuro-linguistic behavior recognition systems | |
CN105518656A (en) | A cognitive neuro-linguistic behavior recognition system for multi-sensor data fusion | |
Ao et al. | A Roller Bearing Fault Diagnosis Method Based on LCD Energy Entropy and ACROA‐SVM | |
EP1958034B1 (en) | Use of sequential clustering for instance selection in machine condition monitoring | |
CN117040917A (en) | Intelligent switch with monitoring and early warning functions | |
CN107111609B (en) | Lexical analyzer for neural language behavior recognition system | |
JP2019105871A (en) | Abnormality candidate extraction program, abnormality candidate extraction method and abnormality candidate extraction apparatus | |
CN111738467A (en) | Running state abnormity detection method, device and equipment | |
JP4760614B2 (en) | Method for selecting learning data of signal identification device | |
Kirschner et al. | Cavitation detection in hydraulic machinery by analyzing acoustic emissions under strong domain shifts using neural networks | |
CN116756225B (en) | Situation data information processing method based on computer network security | |
Wang et al. | Fault diagnosis of rolling bearing based on relevance vector machine and kernel principal component analysis | |
Du et al. | Fault diagnosis of plunger pump in truck crane based on relevance vector machine with particle swarm optimization algorithm | |
Irgat et al. | An IoT-Based Monitoring System for Induction Motor Faults Utilizing Deep Learning Models | |
CN117134958B (en) | Information processing method and system for network technology service | |
Zhang et al. | Multiple‐Fault Diagnosis Method Based on Multiscale Feature Extraction and MSVM_PPA | |
Hu et al. | [Retracted] A Deep Spiking Neural Network Anomaly Detection Method | |
US20220309407A1 (en) | Systems and Methods for Hybrid Integration and Development Pipelines | |
CN115758237A (en) | Bearing fault classification method and system based on intelligent inspection robot | |
Zhang et al. | Intelligent fault diagnosis system based on vibration signal edge computing | |
Hao et al. | New fusion features convolutional neural network with high generalization ability on rolling bearing fault diagnosis | |
CN114417940A (en) | Equipment for detecting data center, method and device for obtaining equipment detection model |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
TA01 | Transfer of patent application right | ||
TA01 | Transfer of patent application right |
Effective date of registration: 20220223 Address after: 550025 Huawei cloud data center, jiaoxinggong Road, Qianzhong Avenue, Gui'an New District, Guiyang City, Guizhou Province Applicant after: Huawei Cloud Computing Technologies Co.,Ltd. Address before: 518129 Bantian HUAWEI headquarters office building, Longgang District, Guangdong, Shenzhen Applicant before: HUAWEI TECHNOLOGIES Co.,Ltd. |