CN112232252B

CN112232252B - Transmission chain unsupervised domain adaptive fault diagnosis method based on optimal transportation

Info

Publication number: CN112232252B
Application number: CN202011152085.7A
Authority: CN
Inventors: 刘朝华; 蒋林博; 王畅通; 吴亮红; 陈磊; 张红强; 李小花
Original assignee: Hunan University of Science and Technology
Current assignee: Hunan University of Science and Technology
Priority date: 2020-10-23
Filing date: 2020-10-23
Publication date: 2023-12-01
Anticipated expiration: 2040-10-23
Also published as: CN112232252A

Abstract

The invention discloses a transmission chain unsupervised domain adaptive fault diagnosis method based on optimal transportation, which comprises the following steps: constructing a transmission chain fault database; constructing an unsupervised domain adaptive feature extraction and classification model; establishing an unsupervised domain adaptive fault diagnosis model fused with optimal transportation; and sending the target domain data into a trained feature extractor and a softmax classifier to obtain a predictive label of the target domain data, and calculating the classification accuracy of the predictive label. According to the invention, the optimal transport theory and the domain adaptation theory are combined together, and the distribution of the feature space and the label space is aligned by minimizing the distance between the features of the source domain and the target domain and the labels, so that the domain adaptation problem of the source domain and the target domain in fault diagnosis is solved, and the precision of fault diagnosis is improved.

Description

Transmission chain unsupervised domain adaptive fault diagnosis method based on optimal transportation

Technical Field

The invention relates to a transmission chain fault diagnosis method, in particular to a transmission chain unsupervised domain adaptive fault diagnosis method based on optimal transportation.

Background

Transmission chains are a key part of many industrial systems, such as high-speed rail, aircraft, and wind power systems, but they are prone to failure. Fault diagnosis of the transmission chain is important for ensuring safe system operation, reducing machine downtime, and saving maintenance cost. Transmission chain fault diagnosis has therefore received a great deal of attention, and a large number of fault diagnosis methods have been proposed.

Fault diagnosis methods can be broadly classified into model-based methods and data-driven methods. Early fault diagnosis was mainly based on physical models, which can accurately describe the connection between faults and the related industrial system. However, model-based fault diagnosis has two main disadvantages: 1) it is highly dependent on a priori knowledge of the system; 2) disturbances in the industrial operating process may violate the assumptions made about the system, such as the form of the noise and the operating conditions, which may lead to uncertainty and misdiagnosis. Data-driven fault diagnosis methods analyze the acquired data directly using techniques such as signal processing and machine learning, which reduces the dependence on a priori knowledge of the system and makes them better suited to modern industrial applications; for example, the Hilbert transform and the wavelet transform are widely used for fault feature analysis. Although these methods can achieve good performance in fault diagnosis tasks, they still face the challenge of automatically extracting fault features from the raw fault signal, and manually extracting fault features is very time consuming.

In recent years, deep learning methods such as deep stacked networks and convolutional neural networks have advanced greatly in the field of fault diagnosis, and they have been applied to fault diagnosis problems because of their ability to extract features automatically. Most existing methods can detect faults accurately, but their success rests on two assumptions: 1) a large amount of labeled training data is available; 2) the training data of the source domain and the test data of the target domain are identically distributed. When these two assumptions are violated, the performance of these algorithms may drop significantly. In practical industrial applications, such as wind power generation systems, these assumptions often fail because of changes in the operating environment and instability of the load torque. Collecting data is also very time consuming, so it is often difficult to collect a large amount of labeled training data. It is also widely recognized that models trained on raw data may be less reliable. Even when labeled data are available under certain operating conditions, the data distribution may change under new operating conditions.

Domain adaptation is an effective method for dealing with data imbalance or data scarcity: it can mitigate the influence of differences in data distribution and markedly improve classifier performance. However, existing fault diagnosis methods based on domain adaptation theory have two main disadvantages: 1) when the distributions of the source domain and the target domain are multi-modal, the probability measures used by these methods cannot represent the difference between the source domain and the target domain; 2) only the differences between the source and target domains in the high-dimensional feature space are considered, while the differences in the label space are often ignored, resulting in inadequate adaptation. Optimal transport (OT) theory is a powerful tool for calculating the distance between probability distributions. The OT distance can be calculated directly from samples of the distributions, without the need for density estimation or other non-parametric methods.

Disclosure of Invention

In order to solve the technical problems, the invention provides the transmission chain unsupervised domain adaptive fault diagnosis method based on optimal transportation, which is simple in algorithm and high in diagnosis precision.

The technical scheme for solving the problems is as follows: an optimal transportation-based transmission chain unsupervised domain adaptation fault diagnosis method comprises the following steps:

(1) Constructing a transmission chain fault database: fault data of a source domain and a target domain of the transmission chain are collected. The labeled source domain fault sample set is $X^s = \{x_1^s, x_2^s, \ldots, x_m^s\}$ and its corresponding label set is $Y^s = \{y_1^s, y_2^s, \ldots, y_m^s\}$, where $x_m^s$ is the m-th source domain fault sample, $y_m^s$ is the label corresponding to $x_m^s$, s denotes the source domain, and m is the total number of source domain samples; the unlabeled target domain fault sample set is $X^t = \{x_1^t, x_2^t, \ldots, x_n^t\}$, where $x_n^t$ is the n-th target domain fault sample, t denotes the target domain, and n is the total number of target domain samples; the source domain fault sample set and the target domain fault sample set together form the transmission chain fault database;

(2) Constructing an unsupervised domain adaptive feature extraction and classification model: an automatic encoder (autoencoder, AE) is adopted as a feature extractor f to automatically extract features of the fault samples of the source domain and the target domain, and a softmax classifier g is trained using the source domain samples so as to obtain prediction labels for the target domain fault samples, thereby realizing unsupervised domain adaptive feature extraction and classification of the target domain fault samples;

(3) Establishing an unsupervised domain adaptive fault diagnosis model fused with optimal transport: the fault samples of the source domain and the target domain are processed by the unsupervised domain adaptive feature extraction and classification model to obtain sample features and corresponding labels; the feature extractor f and the softmax classifier g are fused into the objective function of an optimal transport solver, and a transport plan matrix γ is calculated from the sample features and labels, so that the distributions of the source domain and the target domain are aligned and the source domain label information is transported to the target domain; the unsupervised domain adaptive feature extraction and classification model and the optimal transport solver are optimized alternately, so as to obtain the unsupervised domain adaptive fault diagnosis model fused with optimal transport;

(4) And sending the target domain data into a trained feature extractor f and a softmax classifier g to obtain a predictive label of the target domain data, and calculating the classification accuracy of the predictive label.

In the above-mentioned transmission chain unsupervised domain adaptive fault diagnosis method based on optimal transportation, in the step (2), an automatic encoder is adopted as a feature extractor f, and the specific process of automatically extracting the features of the source domain and the target domain fault samples is as follows:

the automatic encoder encodes the input data x into a representative feature y through a function $h_\theta$; this process is expressed as:

$$y = h_\theta(x) = \sigma(Wx + b)$$

where θ denotes the parameters of the encoding part, W denotes the weight matrix of the encoding process, b denotes the bias vector of the encoding process, and σ denotes the activation function;

accordingly, the decoding part reconstructs the representative feature back into the input data through a function $h'_{\theta'}$:

$$x' = h'_{\theta'}(y) = \sigma'(W'y + b')$$

where θ′ denotes the parameters of the decoding part, x′ denotes the reconstructed input data, W′ denotes the weight matrix of the decoding process, b′ denotes the bias vector of the decoding process, and σ′ denotes the activation function of the decoding process;

the loss function of the entire automatic encoder network is:

$$L_f = \frac{1}{I}\sum_{\rho=1}^{I}\left\|x'_\rho - x_\rho\right\|_2^2$$

where $x_\rho$ and $x'_\rho$ denote the ρ-th input data and the corresponding input data reconstructed by the decoder, respectively, ρ = 1, 2, …, I, I is the number of input data, and $\|x'_\rho - x_\rho\|_2$ denotes the 2-norm of $x'_\rho - x_\rho$.

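In the above-mentioned transmission chain unsupervised domain adaptive fault diagnosis method based on optimal transportation, in the step (2), the softmax classifier is used to estimate the probability that each sample belongs to each category, and the category with the highest probability is taken as the category of the sample;

given input data and corresponding labels $\{(x^{(1)}, y^{(1)}), (x^{(2)}, y^{(2)}), \ldots, (x^{(\tau)}, y^{(\tau)})\}$, where $x^{(\tau)}$ denotes the τ-th input data, $y^{(\tau)}$ denotes the label corresponding to $x^{(\tau)}$, $y^{(\tau)} \in \{1, 2, \ldots, k\}$, and k denotes the number of label categories; the label probability $p(y^{(\tau)} = l \mid x^{(\tau)}; \theta)$ of the input data $x^{(\tau)}$ is:

$$p\left(y^{(\tau)} = l \mid x^{(\tau)}; \theta\right) = \frac{e^{\theta_l^T x^{(\tau)}}}{\sum_{c=1}^{k} e^{\theta_c^T x^{(\tau)}}}, \quad l = 1, 2, \ldots, k$$

where $\theta = (\theta_1, \ldots, \theta_k)$ denotes the parameters of the softmax classifier, T denotes the transpose of a vector, $\sum_{c=1}^{k} e^{\theta_c^T x^{(\tau)}}$ sums all $e^{\theta_c^T x^{(\tau)}}$ so as to normalize the probability values, and $\theta_l^T$ denotes the transpose of the l-th parameter vector $\theta_l$;

the cross-entropy loss is used as the loss function of the softmax classifier:

$$L_g = -\frac{1}{\tau}\sum_{\alpha=1}^{\tau}\sum_{\beta=1}^{k} 1\left\{y^{(\alpha)} = \beta\right\}\log\frac{e^{\theta_\beta^T x^{(\alpha)}}}{\sum_{c=1}^{k} e^{\theta_c^T x^{(\alpha)}}}$$

where 1{expression is true} = 1 and 1{expression is false} = 0; $y^{(\alpha)}$ is the label of the α-th input data $x^{(\alpha)}$, α = 1, 2, …, τ; β denotes the class-β label, β = 1, 2, …, k; and $\theta_\beta^T$ denotes the transpose of the β-th parameter vector $\theta_\beta$.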

In the above-mentioned transmission chain unsupervised domain adaptive fault diagnosis method based on optimal transport, in the step (3), the optimal transport solver is used to calculate the distance between the probability distribution of the source domain and the target domain, and the domain adaptation problem is regarded as a special case of the discrete optimal transport problem;

let the marginal distributions of the source domain and target domain samples be $\mu_s$ and $\mu_t$, respectively; the distance between the source domain and the target domain in the feature space and the label space is:

$$C(i,j) = \eta\left\|f(x_i^s) - f(x_j^t)\right\|_2^2 + L_g\left(y_i^s, y_j^t\right)$$

where C(i, j) denotes the cost of moving probability mass from the i-th source domain sample feature and label to the j-th target domain sample feature and label; C is the cost matrix, i = 1, 2, …, m, j = 1, 2, …, n; $f(x_i^s)$ denotes the feature of the i-th source domain sample; $f(x_j^t)$ denotes the feature of the j-th target domain sample; η denotes a trade-off parameter; $\|f(x_i^s) - f(x_j^t)\|_2^2$ is the squared L2 distance between the sample features of the source domain and the target domain; $L_g(y_i^s, y_j^t)$ is the cross-entropy loss between the source domain and target domain labels;

since the target domain samples do not have corresponding labels, $y_j^t$ cannot be used directly, so the softmax classifier g trained on the source domain is used to generate a label $\hat{y}_j^t$ from the target domain sample feature, namely:

$$\hat{y}_j^t = g\left(f(x_j^t)\right)$$

the objective function for solving the optimal transport plan is:

$$\gamma^* = \arg\min_{\gamma \in \Pi}\ \langle \gamma, C \rangle_F = \arg\min_{\gamma \in \Pi}\ \sum_{i=1}^{m}\sum_{j=1}^{n}\gamma(i,j)\,C(i,j)$$

where $\Pi = \{\gamma \in (\mathbb{R}^+)^{m \times n} \mid \gamma 1_n = \mu_s,\ \gamma^T 1_m = \mu_t\}$ denotes the set of all transport plans γ; γ(i, j) is the probability mass transferred from the i-th source domain sample feature and label to the j-th target domain sample feature and label; $1_n$ denotes an n×1 vector whose elements are all 1; $1_m$ denotes an m×1 vector whose elements are all 1; $(\mathbb{R}^+)^{m \times n}$ denotes the set of m×n matrices with non-negative real entries; $\langle \cdot, \cdot \rangle_F$ denotes the Frobenius dot product;

the initial value $\gamma_0$ of the transport plan is:

$$\gamma_0 = \arg\min_{\gamma \in \Pi}\ \langle \gamma, C_0 \rangle_F$$

where $C_0$ is the cost matrix for moving probability mass from source domain samples to target domain samples, with $C_0(i,j) = \|x_i^s - x_j^t\|_2^2$, and $\gamma_0(i,j)$ is the probability mass transferred from the i-th source domain sample to the j-th target domain sample.

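In the above-mentioned transmission chain unsupervised domain adaptive fault diagnosis method based on optimal transportation, in the step (3), the objective loss function to be optimized for the whole model is:

$$\min_{\gamma \in \Pi,\, f,\, g}\ \frac{1}{m}\sum_{i=1}^{m} L_g\left(y_i^s, g(f(x_i^s))\right) + \sum_{i=1}^{m}\sum_{j=1}^{n}\gamma(i,j)\left(\eta\left\|f(x_i^s) - f(x_j^t)\right\|_2^2 + L_g\left(y_i^s, g(f(x_j^t))\right)\right) + \frac{1}{m}\sum_{i=1}^{m} L_f\left(x_i^s, f'(f(x_i^s))\right) + \frac{1}{n}\sum_{j=1}^{n} L_f\left(x_j^t, f'(f(x_j^t))\right)$$

where f′ denotes the decoder of the automatic encoder, $f'(f(x_i^s))$ and $f'(f(x_j^t))$ denote the source domain feature $f(x_i^s)$ and the target domain feature $f(x_j^t)$ decoded back into input data, respectively, and $L_f$ and $L_g$ denote the loss functions of the feature extractor f and the softmax classifier g, respectively;

the objective loss function is solved alternately using a block coordinate descent algorithm:

when the parameters of the feature extractor f and the softmax classifier g are fixed, the objective loss function is written as:

$$\min_{\gamma \in \Pi}\ \sum_{i=1}^{m}\sum_{j=1}^{n}\gamma(i,j)\left(\eta\left\|f(x_i^s) - f(x_j^t)\right\|_2^2 + L_g\left(y_i^s, g(f(x_j^t))\right)\right)$$

the above is a standard linear programming problem and is solved using the network simplex algorithm;

when the transport plan γ is fixed, the objective loss function is written as:

$$\min_{f,\, g}\ \frac{1}{m}\sum_{i=1}^{m} L_g\left(y_i^s, g(f(x_i^s))\right) + \sum_{i=1}^{m}\sum_{j=1}^{n}\gamma(i,j)\left(\eta\left\|f(x_i^s) - f(x_j^t)\right\|_2^2 + L_g\left(y_i^s, g(f(x_j^t))\right)\right) + \frac{1}{m}\sum_{i=1}^{m} L_f\left(x_i^s, f'(f(x_i^s))\right) + \frac{1}{n}\sum_{j=1}^{n} L_f\left(x_j^t, f'(f(x_j^t))\right)$$

the above problem is solved using the Adam algorithm.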

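In the above-mentioned transmission chain unsupervised domain adaptive fault diagnosis method based on optimal transportation, in the step (4), the prediction accuracy of the target domain sample labels is calculated according to the following formula:

$$\mathrm{Accuracy} = \frac{1}{n}\sum_{j=1}^{n} 1\left\{\hat{y}_j^t = y_j^t\right\}$$

where $\hat{y}_j^t$ is the predicted label of sample $x_j^t$ and $y_j^t$ is the corresponding true label; $1\{\cdot\}$ is an indicator function: when $\hat{y}_j^t = y_j^t$, $1\{\hat{y}_j^t = y_j^t\}$ is 1; when $\hat{y}_j^t \neq y_j^t$, $1\{\hat{y}_j^t = y_j^t\}$ is 0.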

The invention has the beneficial effects that:

1. according to the invention, the AE network is used as a feature extractor to extract representative features in the original data, and the softmax classifier trained by the source domain sample is used for classifying the target domain fault sample, so that the unsupervised fault diagnosis of the target domain sample is realized.

2. According to the invention, optimal transport theory is integrated into the domain adaptation problem: by aligning the feature space and the label space of the source domain and the target domain, the feature extractor and the softmax classifier trained on source domain samples can be applied to target domain samples, which improves the feature extraction capability and the fault diagnosis capability.

Drawings

FIG. 1 is a flow chart of the present invention.

FIG. 2 is a schematic diagram of an automatic encoder according to the present invention.

FIG. 3 is a schematic diagram of the optimal transportation problem of the present invention.

FIG. 4 is a bar graph of a comparative experiment of the present invention.

Detailed Description

The invention is further described below with reference to the drawings and examples.

As shown in fig. 1, a transmission chain unsupervised domain adaptive fault diagnosis method based on optimal transportation includes the following steps:

(1) Constructing a transmission chain fault database: fault data of a source domain and a target domain of the transmission chain are collected. The labeled source domain fault sample set is $X^s = \{x_1^s, x_2^s, \ldots, x_m^s\}$ and its corresponding label set is $Y^s = \{y_1^s, y_2^s, \ldots, y_m^s\}$, where $x_m^s$ is the m-th source domain fault sample, $y_m^s$ is the label corresponding to $x_m^s$, s denotes the source domain, and m is the total number of source domain samples; the unlabeled target domain fault sample set is $X^t = \{x_1^t, x_2^t, \ldots, x_n^t\}$, where $x_n^t$ is the n-th target domain fault sample, t denotes the target domain, and n is the total number of target domain samples.

(2) Constructing an unsupervised domain adaptive feature extraction and classification model: and an automatic encoder is adopted as a feature extractor f to automatically extract the features of the source domain fault samples and the target domain fault samples, and a softmax classifier g is trained by using the source domain samples, so that a prediction label of the target domain fault samples is obtained, and the unsupervised domain adaptation feature extraction and classification of the target domain fault samples are realized.

Automatic encoders are used to learn more compact sample features with class discrimination, thereby speeding up the training process;

As shown in FIG. 2, the automatic encoder encodes the input data x into a representative feature y through a function $h_\theta$. This process is expressed as:

$$y = h_\theta(x) = \sigma(Wx + b)$$

where θ denotes the parameters of the encoding part, W denotes the weight matrix of the encoding process, b denotes the bias vector of the encoding process, and σ denotes the activation function.

Accordingly, the decoding part reconstructs the representative feature back into the input data through a function $h'_{\theta'}$:

$$x' = h'_{\theta'}(y) = \sigma'(W'y + b')$$

where θ′ denotes the parameters of the decoding part, x′ is the reconstructed input data, W′ denotes the weight matrix of the decoding process, b′ denotes the bias vector of the decoding process, and σ′ denotes the activation function of the decoding process;

the loss function of the entire automatic encoder network is:

$$L_f = \frac{1}{I}\sum_{\rho=1}^{I}\left\|x'_\rho - x_\rho\right\|_2^2$$

where $x_\rho$ and $x'_\rho$ denote the ρ-th input data and the corresponding input data reconstructed by the decoder, respectively, ρ = 1, 2, …, I, I is the number of input data, and $\|x'_\rho - x_\rho\|_2$ denotes the 2-norm of $x'_\rho - x_\rho$;
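As a minimal sketch of the encoding, decoding and reconstruction loss described above, the following pure-Python example uses a single-layer encoder and decoder with a sigmoid activation. The layer sizes, the random initialization, the helper names (affine_sigmoid, reconstruction_loss) and the choice of a mean squared 2-norm as the loss are illustrative assumptions, not details taken from the invention.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine_sigmoid(W, b, v):
    """Return sigma(W v + b) for a weight matrix W (list of rows) and bias vector b."""
    return [sigmoid(sum(w * x for w, x in zip(row, v)) + bias)
            for row, bias in zip(W, b)]


def reconstruction_loss(inputs, reconstructions):
    """Mean over the inputs of ||x'_rho - x_rho||_2^2 (assumed form of the loss)."""
    total = 0.0
    for x, x_rec in zip(inputs, reconstructions):
        total += sum((a - b) ** 2 for a, b in zip(x_rec, x))
    return total / len(inputs)


random.seed(0)
d_in, d_hidden = 4, 2  # hypothetical input and feature dimensions

# Encoder parameters (W, b) and decoder parameters (W', b'), randomly initialized.
W = [[random.uniform(-0.5, 0.5) for _ in range(d_in)] for _ in range(d_hidden)]
b = [0.0] * d_hidden
W_dec = [[random.uniform(-0.5, 0.5) for _ in range(d_hidden)] for _ in range(d_in)]
b_dec = [0.0] * d_in

X = [[random.random() for _ in range(d_in)] for _ in range(3)]   # three toy inputs
Y = [affine_sigmoid(W, b, x) for x in X]                         # y = sigma(W x + b)
X_rec = [affine_sigmoid(W_dec, b_dec, y) for y in Y]             # x' = sigma'(W' y + b')
loss = reconstruction_loss(X, X_rec)
print("reconstruction loss:", loss)
```

In a full implementation the parameters would be trained to reduce this loss; the sketch only evaluates the forward pass and the loss once.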

the softmax classifier is the most common algorithm to solve the multi-classification problem. The main function of Softmax is to estimate the probability that each sample belongs to each category, and the category with the highest probability is taken as the category of the sample;

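Given input data and corresponding labels $\{(x^{(1)}, y^{(1)}), (x^{(2)}, y^{(2)}), \ldots, (x^{(\tau)}, y^{(\tau)})\}$, where $x^{(\tau)}$ denotes the τ-th input data, $y^{(\tau)}$ denotes the label corresponding to $x^{(\tau)}$, $y^{(\tau)} \in \{1, 2, \ldots, k\}$, and k denotes the number of label categories, the label probability of the input data $x^{(\tau)}$ is:

$$p\left(y^{(\tau)} = l \mid x^{(\tau)}; \theta\right) = \frac{e^{\theta_l^T x^{(\tau)}}}{\sum_{c=1}^{k} e^{\theta_c^T x^{(\tau)}}}, \quad l = 1, 2, \ldots, k$$

where $\theta = (\theta_1, \ldots, \theta_k)$ denotes the parameters of the softmax classifier, T denotes the transpose of a vector, $\sum_{c=1}^{k} e^{\theta_c^T x^{(\tau)}}$ sums all $e^{\theta_c^T x^{(\tau)}}$ so as to normalize the probability values, and $\theta_l^T$ denotes the transpose of the l-th parameter vector $\theta_l$;

the cross-entropy loss is used as the loss function of the softmax classifier:

$$L_g = -\frac{1}{\tau}\sum_{\alpha=1}^{\tau}\sum_{\beta=1}^{k} 1\left\{y^{(\alpha)} = \beta\right\}\log\frac{e^{\theta_\beta^T x^{(\alpha)}}}{\sum_{c=1}^{k} e^{\theta_c^T x^{(\alpha)}}}$$

where 1{expression is true} = 1 and 1{expression is false} = 0; $y^{(\alpha)}$ is the label of the α-th input data $x^{(\alpha)}$, α = 1, 2, …, τ; β denotes the class-β label, β = 1, 2, …, k; and $\theta_\beta^T$ denotes the transpose of the β-th parameter vector $\theta_\beta$.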
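The softmax probability and the cross-entropy loss described above can be sketched in a few lines of pure Python. The parameter values in theta, the toy inputs, the averaging over the τ samples, and the max-shift used for numerical stability are assumptions made for illustration; labels run from 1 to k as in the text.

```python
import math


def softmax_probabilities(theta, x):
    """Return [p(y = l | x)] for l = 1..k, with p proportional to exp(theta_l^T x)."""
    scores = [sum(t * v for t, v in zip(theta_l, x)) for theta_l in theta]
    shift = max(scores)  # subtracting the max does not change the ratios
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)    # normalizing constant: sum over all classes
    return [e / total for e in exps]


def cross_entropy(theta, xs, ys):
    """Average of -log p(y = y_alpha | x_alpha) over the samples (labels in 1..k)."""
    loss = 0.0
    for x, y in zip(xs, ys):
        probs = softmax_probabilities(theta, x)
        # The indicator 1{y_alpha = beta} selects the log-probability of the true class.
        loss -= math.log(probs[y - 1])
    return loss / len(xs)


def predict(theta, x):
    """Return the class (1..k) with the highest probability."""
    probs = softmax_probabilities(theta, x)
    return probs.index(max(probs)) + 1


theta = [[2.0, 0.0], [0.0, 2.0], [-1.0, -1.0]]  # hypothetical parameters, k = 3
xs = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
ys = [1, 2, 3]

print([predict(theta, x) for x in xs])
print(cross_entropy(theta, xs, ys))
```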

(3) Establishing an unsupervised domain adaptive fault diagnosis model fused with optimal transport: the fault samples of the source domain and the target domain are processed by the unsupervised domain adaptive feature extraction and classification model to obtain sample features and corresponding labels; the feature extractor f and the softmax classifier g are fused into the objective function of an optimal transport (OT) solver, and a transport plan matrix γ is calculated from the sample features and labels, so that the distributions of the source domain and the target domain are aligned and the source domain label information is transported to the target domain; the unsupervised domain adaptive feature extraction and classification model and the OT solver are optimized alternately, so as to obtain the unsupervised domain adaptive fault diagnosis model fused with optimal transport.

The optimal transport OT is used to calculate the distance between the source domain and the target domain probability distribution. Domain adaptation problems can be seen as a special case of discrete OT problems;

As shown in FIG. 3, let the marginal distributions of the source domain and target domain samples be $\mu_s$ and $\mu_t$, respectively. The distance between the source domain and the target domain in the feature space and the label space is:

$$C(i,j) = \eta\left\|f(x_i^s) - f(x_j^t)\right\|_2^2 + L_g\left(y_i^s, y_j^t\right)$$

where C(i, j) denotes the cost of moving probability mass from the i-th source domain sample feature and label to the j-th target domain sample feature and label. C is the cost matrix, i = 1, 2, …, m, j = 1, 2, …, n. $f(x_i^s)$ denotes the feature of the i-th source domain sample. $f(x_j^t)$ denotes the feature of the j-th target domain sample. η denotes a trade-off parameter. $\|f(x_i^s) - f(x_j^t)\|_2^2$ is the squared L2 distance between the sample features of the source domain and the target domain. $L_g(y_i^s, y_j^t)$ is the cross-entropy loss between the source domain and target domain labels;
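The cost matrix described above can be illustrated with a short pure-Python sketch. Because the target domain has no labels, the sketch follows the approach stated earlier in this document and uses the class probabilities output by the source-trained classifier in place of target labels. The squared L2 distance, the placement of η on the feature term, the helper names and all numeric values are assumptions for illustration only.

```python
import math


def squared_l2(u, v):
    """Squared L2 distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def label_cross_entropy(y_source, target_probs):
    """Cross-entropy between a source label (1..k) and predicted target probabilities."""
    return -math.log(max(target_probs[y_source - 1], 1e-12))  # clip to avoid log(0)


def cost_matrix(source_feats, source_labels, target_feats, target_probs, eta):
    """C[i][j] = eta * ||f(x_i^s) - f(x_j^t)||^2 + cross-entropy(y_i^s, g(f(x_j^t)))."""
    return [[eta * squared_l2(fs, ft) + label_cross_entropy(ys, pt)
             for ft, pt in zip(target_feats, target_probs)]
            for fs, ys in zip(source_feats, source_labels)]


# Toy data: two source samples (classes 1 and 2) and two target samples.
source_feats = [[0.0, 0.0], [1.0, 1.0]]
source_labels = [1, 2]
target_feats = [[0.0, 0.1], [1.0, 0.9]]
target_probs = [[0.9, 0.1], [0.2, 0.8]]  # hypothetical classifier outputs

C = cost_matrix(source_feats, source_labels, target_feats, target_probs, eta=1.0)
for row in C:
    print([round(c, 4) for c in row])
```

Pairs that are close in feature space and agree in predicted class receive a low cost, so a transport plan minimizing the total cost tends to match them.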

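the objective function for solving the optimal transport plan is:

$$\gamma^* = \arg\min_{\gamma \in \Pi}\ \langle \gamma, C \rangle_F = \arg\min_{\gamma \in \Pi}\ \sum_{i=1}^{m}\sum_{j=1}^{n}\gamma(i,j)\,C(i,j)$$

where $\Pi = \{\gamma \in (\mathbb{R}^+)^{m \times n} \mid \gamma 1_n = \mu_s,\ \gamma^T 1_m = \mu_t\}$ denotes the set of all transport plans γ. γ(i, j) is the probability mass transferred from the i-th source domain sample feature and label to the j-th target domain sample feature and label. $1_n$ denotes an n×1 vector whose elements are all 1, and $1_m$ denotes an m×1 vector whose elements are all 1. $(\mathbb{R}^+)^{m \times n}$ denotes the set of m×n matrices with non-negative real entries. $\langle \cdot, \cdot \rangle_F$ denotes the Frobenius dot product.

The initial value $\gamma_0$ of the transport plan is:

$$\gamma_0 = \arg\min_{\gamma \in \Pi}\ \langle \gamma, C_0 \rangle_F$$

where $C_0$ is the cost matrix for moving probability mass from source domain samples to target domain samples, with $C_0(i,j) = \|x_i^s - x_j^t\|_2^2$, and $\gamma_0(i,j)$ is the probability mass transferred from the i-th source domain sample to the j-th target domain sample;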
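To make the transport plan concrete, the following pure-Python sketch solves a tiny instance of the discrete optimal transport problem exactly. It does not use the network simplex algorithm named in this document; instead it relies on a special case chosen for illustration: when m = n and both marginals are uniform, the linear program attains its optimum at a permutation matrix scaled by 1/n, so enumerating permutations is exact (and only practical for very small n). The function name and the cost values are hypothetical.

```python
from itertools import permutations


def uniform_ot_plan(C):
    """Exact OT plan for a square cost matrix with uniform marginals 1/n.

    Enumerates all permutations, so it is only suitable for very small n.
    Returns the plan gamma (n x n list of lists) and its total cost <gamma, C>.
    """
    n = len(C)
    best_perm, best_cost = None, float("inf")
    for perm in permutations(range(n)):
        cost = sum(C[i][perm[i]] for i in range(n)) / n
        if cost < best_cost:
            best_perm, best_cost = perm, cost
    gamma = [[0.0] * n for _ in range(n)]
    for i, j in enumerate(best_perm):
        gamma[i][j] = 1.0 / n  # source sample i sends all of its mass to target j
    return gamma, best_cost


# Hypothetical 3 x 3 cost matrix (source samples in rows, target samples in columns).
C = [[0.1, 3.0, 2.0],
     [2.5, 0.2, 1.0],
     [3.0, 1.5, 0.3]]

gamma, plan_cost = uniform_ot_plan(C)
for row in gamma:
    print(row)
print("total transport cost:", plan_cost)
```

Each row and each column of the returned plan sums to 1/n, which is the marginal constraint defining the set of admissible transport plans.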

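the objective loss function to be optimized for the whole model is:

$$\min_{\gamma \in \Pi,\, f,\, g}\ \frac{1}{m}\sum_{i=1}^{m} L_g\left(y_i^s, g(f(x_i^s))\right) + \sum_{i=1}^{m}\sum_{j=1}^{n}\gamma(i,j)\left(\eta\left\|f(x_i^s) - f(x_j^t)\right\|_2^2 + L_g\left(y_i^s, g(f(x_j^t))\right)\right) + \frac{1}{m}\sum_{i=1}^{m} L_f\left(x_i^s, f'(f(x_i^s))\right) + \frac{1}{n}\sum_{j=1}^{n} L_f\left(x_j^t, f'(f(x_j^t))\right)$$

where f′ denotes the decoder of the automatic encoder, $f'(f(x_i^s))$ and $f'(f(x_j^t))$ denote the source domain feature $f(x_i^s)$ and the target domain feature $f(x_j^t)$ decoded back into input data, respectively, and $L_f$ and $L_g$ denote the loss functions of the feature extractor f and the softmax classifier g, respectively;

the objective loss function is solved alternately using a block coordinate descent algorithm. When the parameters of the feature extractor f and the softmax classifier g are fixed, the objective loss function can be written as:

$$\min_{\gamma \in \Pi}\ \sum_{i=1}^{m}\sum_{j=1}^{n}\gamma(i,j)\left(\eta\left\|f(x_i^s) - f(x_j^t)\right\|_2^2 + L_g\left(y_i^s, g(f(x_j^t))\right)\right)$$

the above is a standard linear programming problem, which can be solved using the network simplex algorithm;

when the transport plan γ is fixed, the objective loss function can be written as:

$$\min_{f,\, g}\ \frac{1}{m}\sum_{i=1}^{m} L_g\left(y_i^s, g(f(x_i^s))\right) + \sum_{i=1}^{m}\sum_{j=1}^{n}\gamma(i,j)\left(\eta\left\|f(x_i^s) - f(x_j^t)\right\|_2^2 + L_g\left(y_i^s, g(f(x_j^t))\right)\right) + \frac{1}{m}\sum_{i=1}^{m} L_f\left(x_i^s, f'(f(x_i^s))\right) + \frac{1}{n}\sum_{j=1}^{n} L_f\left(x_j^t, f'(f(x_j^t))\right)$$

the above problem is solved using the Adam algorithm.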

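(4) The target domain data are sent into the trained feature extractor f and softmax classifier g to obtain the prediction labels of the target domain data, and the prediction accuracy of the target domain sample labels is calculated as follows:

$$\mathrm{Accuracy} = \frac{1}{n}\sum_{j=1}^{n} 1\left\{\hat{y}_j^t = y_j^t\right\}$$

where $\hat{y}_j^t$ is the predicted label of sample $x_j^t$ and $y_j^t$ is the corresponding true label. $1\{\cdot\}$ is an indicator function: when $\hat{y}_j^t = y_j^t$, $1\{\hat{y}_j^t = y_j^t\}$ is 1; when $\hat{y}_j^t \neq y_j^t$, $1\{\hat{y}_j^t = y_j^t\}$ is 0.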
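The accuracy computation is simply the fraction of target domain samples whose predicted label equals the true label. A minimal Python sketch, with a hypothetical function name and toy labels, is:

```python
def accuracy(predicted_labels, true_labels):
    """Fraction of samples whose predicted label equals the true label."""
    if len(predicted_labels) != len(true_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("at least one sample is required")
    # The indicator 1{predicted = true} contributes 1 for a match and 0 otherwise.
    matches = sum(1 for p, t in zip(predicted_labels, true_labels) if p == t)
    return matches / len(true_labels)


predicted = [1, 2, 3, 1]   # hypothetical predicted target labels
actual = [1, 2, 2, 1]      # hypothetical true target labels
print(accuracy(predicted, actual))
```

The true target labels are used only for this evaluation step, not during training.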

To verify the effectiveness of the invention, real bearing fault data from a transmission system are used. The source domain samples are labeled normal and fault samples under one load condition, while the target domain samples are unlabeled normal and fault samples under another load condition. Four motor loads are used, namely 0 hp, 1 hp, 2 hp and 3 hp, which form 6 cross-domain tasks for experimental verification: 0-1hp, 0-2hp, 0-3hp, 1-2hp, 1-3hp and 2-3hp. The fault diameters are 0.007 inches and 0.014 inches. The method of the invention is compared with methods including a support vector machine (SVM), K-nearest neighbor, a softmax classifier, a back-propagation neural network (BP), transfer component analysis (TCA), joint distribution adaptation (JDA), and correlation alignment (CORAL). The experimental results are shown in FIG. 4: the method achieves bearing fault diagnosis under various working conditions, with higher accuracy than the other compared methods.
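The six cross-domain tasks listed above correspond to all unordered pairs of the four motor loads. As a small illustration (the variable names are hypothetical), they can be enumerated as follows:

```python
from itertools import combinations

loads_hp = [0, 1, 2, 3]  # the four motor loads, in horsepower

# Each task pairs a source-domain load with a different target-domain load.
tasks = [f"{source}-{target}hp" for source, target in combinations(loads_hp, 2)]
print(tasks)
```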

In summary, the transmission chain unsupervised domain adaptive fault diagnosis method based on optimal transportation aligns the distributions of the source domain and the target domain by minimizing the distances between their fault feature spaces and label spaces, so that deep features of the target domain samples are better extracted and the label classifier trained on the source domain can be applied well to target domain data, which realizes unsupervised domain adaptive label prediction for the target domain and improves the fault diagnosis precision.

Claims

1. An optimal transportation-based transmission chain unsupervised domain adaptive fault diagnosis method, characterized by comprising the following steps:

2. The optimal transport-based drive train unsupervised domain adaptive fault diagnosis method according to claim 1, characterized by: in the step (2), an automatic encoder is adopted as a feature extractor f, and the specific process of automatically extracting the features of the source domain and target domain fault samples is as follows:

$$y = h_\theta(x) = \sigma(Wx + b)$$

$$x' = h'_{\theta'}(y) = \sigma'(W'y + b')$$

the loss function of the entire automatic encoder network is:

3. The optimal transport-based drive train unsupervised domain adaptive fault diagnosis method according to claim 2, characterized by: in the step (2), the softmax classifier has the functions of estimating the probability that each sample belongs to each category, and taking the category with the highest probability as the category of the sample;

cross entropy loss is used as a loss function of a softmax classifier

4. The optimal transport-based drive train unsupervised domain adaptive fault diagnosis method according to claim 3, characterized by: in the step (3), the optimal transportation solver is used for calculating the distance between the probability distribution of the source domain and the probability distribution of the target domain, and the domain adaptation problem is regarded as a special case of the discrete optimal transportation problem;

the objective function for solving the optimal transport plan is:

where $\Pi = \{\gamma \in (\mathbb{R}^+)^{m \times n} \mid \gamma 1_n = \mu_s,\ \gamma^T 1_m = \mu_t\}$ denotes the set of all transport plans γ; γ(i, j) is the probability mass transferred from the i-th source domain sample feature and label to the j-th target domain sample feature and label; $1_n$ denotes an n×1 vector whose elements are all 1; $1_m$ denotes an m×1 vector whose elements are all 1; $(\mathbb{R}^+)^{m \times n}$ denotes the set of m×n matrices with non-negative real entries; $\langle \cdot, \cdot \rangle_F$ denotes the Frobenius dot product;

5. The optimal transport-based drive train unsupervised domain adaptive fault diagnosis method according to claim 4, characterized by: in the step (3), the objective loss function to be optimized for the whole model is as follows:

where f′ denotes the decoder of the automatic encoder, $f'(f(x_i^s))$ and $f'(f(x_j^t))$ denote the source domain feature $f(x_i^s)$ and the target domain feature $f(x_j^t)$ decoded back into input data, respectively, and $L_f$ and $L_g$ denote the loss functions of the feature extractor f and the softmax classifier g, respectively;

when the transport plan γ is fixed, the target loss function is written as:

the Adam algorithm is used to solve the above equation.

6. The optimal transport-based drive train unsupervised domain adaptive fault diagnosis method according to claim 5, characterized by: in the step (4), the prediction accuracy of the target domain sample tag is calculated according to the following formula:

where $\hat{y}_j^t$ is the predicted label of sample $x_j^t$ and $y_j^t$ is the corresponding true label; $1\{\cdot\}$ is an indicator function: when $\hat{y}_j^t = y_j^t$, $1\{\hat{y}_j^t = y_j^t\}$ is 1; when $\hat{y}_j^t \neq y_j^t$, $1\{\hat{y}_j^t = y_j^t\}$ is 0.