CN114818996B - Method and system for diagnosing mechanical fault based on federal domain generalization - Google Patents

Info

Publication number
CN114818996B
Authority
CN
China
Prior art keywords
client
central server
loss
clients
global
Prior art date
Legal status
Active
Application number
CN202210738070.1A
Other languages
Chinese (zh)
Other versions
CN114818996A (en
Inventor
宋艳
李沂滨
贾磊
崔明
王代超
Current Assignee
Shandong University
Original Assignee
Shandong University
Application filed by Shandong University
Priority to CN202210738070.1A
Publication of CN114818996A
Application granted
Publication of CN114818996B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/2431 Multiple classes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Bioethics (AREA)
  • Molecular Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Databases & Information Systems (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Test And Diagnosis Of Digital Computers (AREA)

Abstract

The invention discloses a method and a system for diagnosing mechanical faults based on federated domain generalization, and relates to the technical field of fault diagnosis. In the first step, the central server randomly initializes a global model and sends it to all clients. In the second step, each client independently trains the model using its own training data set. In the third step, the models trained by all clients are sent to the central server, which averages all model parameters to obtain a global model. In the fourth step, the clients and the central server cooperate to train the global model. In the testing stage, the server sends the global model to the client holding the target domain data to complete fault diagnosis. The invention exploits the inherent relationship between the labels and the features of the source domain data, and completes the training of the global fault diagnosis model by weighting and aggregating, in the central server, the training losses and model parameters of the different client models.

Description

Method and system for diagnosing mechanical fault based on federal domain generalization
Technical Field
The invention relates to the technical field of fault diagnosis, and in particular to a method and a system for diagnosing mechanical faults based on federated domain generalization.
Background
Mechanical fault data usually come from different types of equipment, different working conditions, or different operating environments, and a fault diagnosis model trained jointly on such data tends to have low accuracy and poor generalization when predicting on new data. The domain generalization and domain adaptation methods of transfer learning address this domain shift problem by aligning the data feature spaces. In both methods, labeled data are generally called source domain data, and the unlabeled data to be predicted are called target domain data. Domain adaptation achieves fault prediction on the target domain data by aligning the feature spaces of the source domain data and the target domain data. Unlike domain adaptation, domain generalization has only source domain data and no target domain data, and achieves domain transfer by exploiting the intrinsic relationship between data features and labels in the source domain.
In the prior art, some fault diagnosis research has been based on domain generalization. The document [Deep Domain Generalization Combining A Priori Diagnosis Knowledge Toward Cross-Domain Fault Diagnosis of Rolling Bearing] proposes a rolling bearing fault diagnosis method based on domain generalization. The method eliminates potential differences among multiple domains when the target domain has only healthy samples, and achieves efficient fault diagnosis. The document [Conditional Adversarial Domain Generalization With a Single Discriminator for Bearing Fault Diagnosis] proposes a conditional adversarial domain generalization method with a single discriminator, which aims to extract domain-invariant features from data collected under different working conditions and to generalize them to new fault data. The document [Intelligent Fault Identification Based on Multisource Domain Generalization Towards Actual Diagnosis Scenario] proposes a novel intelligent fault identification method based on multi-source domain generalization. The method uses local Fisher discriminant analysis to describe the discriminant structure of each source domain as a point on the Grassmann manifold. By preserving the local within-class structure, local Fisher discriminant analysis can learn an effective discriminant from multimodal fault data. The document [A New Multiple Source Domain Adaptation Fault Diagnosis Method Between Different Rotating Machines] proposes a multi-source domain adaptation fault diagnosis method. The method uses a multi-adversarial learning strategy to obtain a domain-aligned feature representation that is also discriminative for the target domain. The document [Deep Adversarial Domain Adaptation Model for Bearing Fault Diagnosis] proposes a deep adversarial domain adaptation model for fault diagnosis of rolling bearings. The model constructs an adversarial domain adaptation network to address the inconsistent feature distributions of the source domain and the target domain.
It can be seen that data-driven fault diagnosis algorithms train the diagnostic model on a large amount of fault data. To guarantee the effectiveness of deep learning methods, as much fault data as possible therefore needs to be pooled. However, because of data security and privacy requirements, pooling the data of different clients is not allowed in most cases. Federated learning emerged to make effective joint use of data while keeping each client's data secure, and to solve the data island problem in deep learning. In federated learning, the learning task is solved jointly by multiple participating devices (i.e., clients) under the coordination of a central server. From the perspective of theoretical research, scholars at home and abroad have studied common scientific problems in federated learning, such as non-independent and identically distributed (non-IID) data, unlabeled data, and security. From the application perspective, domestic and foreign scholars have conducted a great deal of research on combining federated learning with specific application scenarios, such as finance, medical treatment, robotics, and smart cities.
Federated learning uses the data of different clients to collaboratively train a model, but because the operating conditions or device models differ between clients, the data usually exhibit domain shift, so federated transfer learning has attracted more and more researchers. Zhang et al. [Federated Transfer Learning for Intelligent Fault Diagnostics Using Deep Adversarial Networks With Data Privacy] propose a federated transfer learning method for fault diagnosis that designs different network model structures for different clients. The document [Data privacy preserving federated transfer learning in machinery fault diagnostics using prior distributions] proposes a federated transfer learning method for mechanical fault diagnosis. It addresses the domain shift problem indirectly by using prior distributions, and performs fault diagnosis by extracting domain-invariant features across different users.
Existing federated transfer learning fault diagnosis methods consider data security and the domain shift between a source domain and a target domain. However, they assume that the target domain data exist and participate in the training process, and do not consider the case in which the target domain data are unavailable or the need for personalized model training.
Disclosure of Invention
In view of the above problems, the invention trains a personalized fault diagnosis model for each client based on the differences among the source domain data, and provides a method and a system for diagnosing mechanical faults based on federated domain generalization.
In the invention, to ensure data security, fault data and fault features are not shared among clients or between clients and the central server. Instead, the proposed method trains a global fault diagnosis model in the central server using partial model parameters and weighted losses from the different clients.
To achieve this purpose, the invention is realized by the following technical scheme:
A first aspect of the disclosure provides a mechanical fault diagnosis method based on federated domain generalization, comprising the following steps:
the central server randomly initializes the global model and sends the global model to all the clients;
each client independently trains the model using its own training data set;
the models trained by all the clients are sent to the central server, and all the model parameters are averaged in the central server to obtain a global model;
the central server sends the global model to all the clients, and the clients and the central server cooperate to train the global model;
and the central server sends the trained global model to the client holding the target domain data to complete fault diagnosis.
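The parameter-averaging step above can be sketched in a few lines of Python. This is a minimal illustration, not the patent's implementation: it assumes an unweighted element-wise mean, and the flat `name -> list of floats` parameter layout and the parameter names are hypothetical.

```python
from typing import Dict, List

# Hypothetical layout: parameter name -> flattened list of values.
Params = Dict[str, List[float]]


def average_parameters(client_params: List[Params]) -> Params:
    """Return the element-wise mean of every parameter across all clients."""
    if not client_params:
        raise ValueError("need at least one client model")
    n = len(client_params)
    averaged: Params = {}
    for name in client_params[0]:
        # zip(*...) groups the i-th value of this parameter from every client.
        columns = zip(*(params[name] for params in client_params))
        averaged[name] = [sum(column) / n for column in columns]
    return averaged


# Two clients that trained the same architecture on their own data.
clients = [
    {"conv1.weight": [0.25, 0.5], "fc.bias": [1.0]},
    {"conv1.weight": [0.75, 0.0], "fc.bias": [3.0]},
]
global_model = average_parameters(clients)
```

Only model parameters cross the network in this step; the raw fault data stay on each client.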
Further, after the central server sends the global model to all the clients, each client completes the following tasks:
calculating the classification loss based on the classification loss function and sending it to the central server;
acquiring the output features of the client's feature extraction network, and sending the covariance matrix of the output features to the central server;
and calculating the invariant risk minimization loss based on the invariant risk minimization loss function and sending it to the central server.
Further, each client computes its classification loss with the classification loss function (given as image 002 in the original publication), whose inputs are the training data of the k-th client, the true labels of the k-th client's training data set, and the prediction result.
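For illustration only: the classification loss formula is an image in the original publication and is not reproduced in this text. Assuming the conventional choice for a Softmax classifier, the mean cross-entropy, the loss of the k-th client could be written as follows. The symbols $n_k$ (number of training samples of client k), $M$ (number of fault categories), $y^{k}_{j,c}$ (one-hot true label) and $\hat{y}^{k}_{j,c}$ (predicted probability) are notation introduced for this sketch and may differ from the patent's own.

```latex
\mathcal{L}_{cls}^{k}
  = -\frac{1}{n_k} \sum_{j=1}^{n_k} \sum_{c=1}^{M}
    y^{k}_{j,c} \, \log \hat{y}^{k}_{j,c}
```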
Further, the invariant risk minimization loss function is given as image 007 in the original publication, wherein IRM is the invariant risk minimization loss; the gradient is taken with respect to a scalar; b represents the number of samples in each of two groups of features and labels taken from the k-th client; each group consists of input data and input labels; the features of each group are the outputs obtained by passing its input data through the feature extraction network; and the loss inside the gradient is the classification loss function.
Further, the central server receives the classification losses, the invariant risk minimization losses, and the feature covariance matrices of all the clients. The second-order statistical feature distance between the feature covariance matrices of every pair of clients is calculated in the central server, and the feature distance metric loss is obtained from a feature distance metric loss function.
Further, the feature distance metric loss function is given as image 024 in the original publication, wherein N represents the number of clients, F denotes the Frobenius norm of a matrix, the normalizing term is the size of the feature vector, and the two matrices being compared are the feature covariance matrices of any two clients.
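A second-order distance of this kind can be sketched as below. The sketch assumes a CORAL-style form, the squared Frobenius norm of the difference of two covariance matrices divided by 4d², averaged over all client pairs; the exact normalization in the patent's formula image is not known, and all function names are hypothetical.

```python
from itertools import combinations
from typing import List

Matrix = List[List[float]]


def covariance(features: Matrix) -> Matrix:
    """Sample covariance (d x d) of an n x d batch of features, n >= 2."""
    n, d = len(features), len(features[0])
    means = [sum(row[j] for row in features) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in features]
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def squared_frobenius_distance(a: Matrix, b: Matrix) -> float:
    """Squared Frobenius norm of (a - b)."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def feature_distance_loss(covariances: List[Matrix]) -> float:
    """Mean CORAL-style distance over all client pairs (assumed normalization)."""
    d = len(covariances[0])
    pairs = list(combinations(covariances, 2))
    if not pairs:
        raise ValueError("need covariance matrices from at least two clients")
    total = sum(squared_frobenius_distance(a, b) for a, b in pairs)
    return total / (4 * d * d * len(pairs))
```

Each client would call `covariance` locally and upload only the resulting d x d matrix, so the server can compare feature distributions without seeing the features themselves.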
Further, a global loss value is calculated at the central server based on the global loss function, and the global model of the central server is trained by back-propagating the global loss value.
Further, the global loss function is given as image 030 in the original publication. It combines the classification loss, the invariant risk minimization loss, and the feature distance metric loss, wherein k denotes the k-th client; each client's classification loss carries a weight, in whose definition i is an integer between 1 and N and the associated term denotes the loss value of the i-th source domain data set; and N represents the number of clients.
A second aspect of the present disclosure provides a mechanical fault diagnosis system based on federated domain generalization, including:
a central server and clients; the central server comprises a global feature extraction network and a global classification network, and exchanges information with a plurality of clients simultaneously; the central server is also used for initializing the global model.
Furthermore, each client comprises a feature extraction network and a classification network, wherein N clients hold N source domain data sets, and the (N + 1)-th client holds the target domain data set.
The beneficial effects of the above embodiments of the present invention are as follows:
The method adopts a federated learning approach, so source domain data need not be disclosed to untrusted third parties; the privacy of the source domain data is protected, data security is guaranteed, and a global fault diagnosis model is trained on the source domain data of all clients. Compared with other domain adaptation methods, the proposed federated domain generalization fault diagnosis method improves data security and fault diagnosis accuracy, improves the interpretability of the machine learning method, and addresses the cross-domain fault diagnosis problem at its root.
The method jointly considers the inherent relationship between data features and labels, the spatial distance between client data features, and the transfer between different client models; it takes the intrinsic causal relationship between the fault features and the fault types of each client as the training objective, and trains under a federated learning framework to obtain a fault diagnosis model with strong generalization ability.
The method does not share fault data or features. It reduces the differences between data from different domains through the training and transfer of part of the model parameters in each client, adopts a model transfer strategy in the feature extraction layers of the client model, and reduces the training workload without affecting the generalization ability of the model.
The method quantifies the differences in feature space distance between different source domains, and realizes domain generalization by weighting the model losses of different clients.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, are included to provide a further understanding of the invention. They illustrate exemplary embodiments of the invention and, together with the description, serve to explain the invention and not to limit it.
Fig. 1 is a structural relationship diagram of the central server and the clients in conventional federated learning;
Fig. 2 is a framework diagram of the federated domain generalization method of the present invention.
Detailed Description
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of exemplary embodiments according to the present application. As used herein, the singular forms "a", "an", and/or "the" are intended to include the plural forms as well, and it should be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof.
Embodiment One:
Let N clients hold N source domain data sets (given as image 035 in the original publication), and let the (N + 1)-th client hold a target domain data set (image 036). The samples of the target domain data set are indexed by an integer running from 1 to the total number of samples in that data set, and the source domain data sets are indexed by an integer i between 1 and N. The target domain data set does not participate in model training and is used only for testing; only the source domain data sets participate in training.
Each client has a feature extraction network and a classification network, which gives a set of N feature extraction network models and a set of N classification network models across the clients. The central server holds a global feature extraction network model and a global classification network model.
Federated learning emerged to make effective joint use of data while keeping each client's data secure, and to solve the data island problem in deep learning. The distribution architecture of the clients and the central server in conventional federated learning is shown in Fig. 1. In federated learning, the learning task is solved jointly by multiple participating devices (i.e., clients) under the coordination of a central server.
Existing federated transfer learning fault diagnosis methods consider data security and the domain shift between a source domain and a target domain. However, they assume that the target domain data exist and participate in the training process, and do not consider the case in which the target domain data are unavailable or the need for personalized model training. In view of these problems, the invention trains a personalized fault diagnosis model for each client based on the differences among the source domain data. As shown in Fig. 2, it exploits the inherent relationship between the labels and the features of the source domain data, and completes the training of the global model by performing weighted loss-gradient aggregation, in the central server, over the training losses and model parameters of the different client models. To ensure data security, fault data and fault features are not shared among clients or between clients and the central server. Instead, the method uses partial model parameters and weighted losses from the different clients for loss-gradient aggregation in the central server, thereby training the model and overcoming the shortcomings that the target domain data are unavailable and that the model is not trained in a personalized way.
Embodiment One of the disclosure provides a mechanical fault diagnosis method based on federated domain generalization. In the training phase, the central server first transmits a randomly initialized global model to all clients. Each client independently trains the model on its own training data set and transmits the client model to the central server, which averages all model parameters to obtain a global model and transmits it back to the clients. The clients then transmit their losses to the central server, and the central server trains the global model based on the loss values. Finally, the trained global model is sent to the client holding the target domain data, and the fault data to be tested are input to the trained global model to diagnose the fault. The method specifically comprises the following steps:
First, the central server sends the randomly initialized global feature extraction model and global classification model to each client.
Second, each client takes these two models as its initial model and trains the model parameters on its own data set to obtain its client feature extraction model and client classification model; the source domain data sets are indexed by an integer i between 1 and N.
Third, the sets of all client feature extraction models and classification models are sent to the central server, which averages all model parameters to obtain the global model, consisting of the global feature extraction network model and the global classification network model of the central server. The specific feature extraction network and classification network structures are shown in the following table:
Table 1. Network architecture (given as image 050 in the original publication)
As shown in the above table, the feature extraction network consists of three groups connected in series, each containing a one-dimensional convolutional layer, a batch normalization layer, a rectified linear unit layer, and a one-dimensional max pooling layer. Each of the three one-dimensional convolutional layers has 128 convolution kernels; the kernel sizes are 17, 17, and 3 respectively, and the convolution stride is 1 in all three. The batch normalization layers have no parameter; the parameter of each of the three rectified linear unit layers is 0.2; the parameters of the three one-dimensional max pooling layers are 16, 16, and 2 respectively.
The classification network consists of a fully connected layer, a batch normalization layer, a rectified linear unit layer, a Dropout layer, and a Softmax layer. The parameter of the fully connected layer is 512, the parameter of the rectified linear unit layer is 0.2, the random zeroing (Dropout) parameter is 0.3, and the parameter of the Softmax layer is the number of fault categories.
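To see how the three convolution and pooling groups shrink a one-dimensional signal, the output length can be computed layer by layer. This is a sketch under stated assumptions that the text does not confirm: no padding, pooling stride equal to the pooling window, the pooling parameters 16/16/2 interpreted as window sizes, and a hypothetical input length of 8192 samples.

```python
from typing import List, Tuple

# (convolution kernel size, pooling window) for the three groups in the text.
GROUPS: List[Tuple[int, int]] = [(17, 16), (17, 16), (3, 2)]
NUM_KERNELS = 128  # convolution kernels per layer, from the text


def conv1d_out(length: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Output length of a 1-D convolution (standard formula)."""
    return (length + 2 * padding - kernel) // stride + 1


def maxpool1d_out(length: int, window: int) -> int:
    """Output length of 1-D max pooling with stride == window, no padding."""
    return (length - window) // window + 1


def feature_length(input_length: int) -> int:
    """Temporal length after all three convolution + pooling groups."""
    length = input_length
    for kernel, window in GROUPS:
        length = maxpool1d_out(conv1d_out(length, kernel), window)
        if length < 1:
            raise ValueError("input signal is too short for this architecture")
    return length


steps = feature_length(8192)            # hypothetical input length
flattened_size = steps * NUM_KERNELS    # input width of the fully connected layer
```

Under these assumptions the flattened feature vector, not the raw signal length, determines the input width of the 512-unit fully connected layer.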
Fourth, the central server sends the global feature extraction model and the global classification model to all clients, and each client completes the following tasks:
Calculate the classification loss with the classification loss function (given as image 002 in the original publication), whose inputs are the training data of the k-th client, the true labels of the k-th client's training data set, and the prediction result. The classification loss is then sent to the central server.
Obtain the output features of the client's feature extraction network, and send the covariance matrix of the output features to the central server.
Calculate the invariant risk minimization (IRM) loss. Invariant risk minimization assumes that the data distributions of different domains differ, but that the causal relationship between data features and labels is constant: it does not change with the operating conditions or the environment. The purpose of invariant risk minimization is to find this underlying invariance across domains. The invariant risk minimization loss function is given as image 007 in the original publication, wherein IRM is the invariant risk minimization loss; the gradient is taken with respect to a scalar; b represents the number of samples in each of two groups of features and labels taken from the k-th client; each group consists of input data and input labels; the features of each group are the outputs obtained by passing its input data through the feature extraction network; and the loss inside the gradient is the classification loss function.
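The description above (a gradient with respect to a scalar, evaluated on two groups of samples from the same client) resembles the widely used IRMv1 penalty. The sketch below follows that formulation and is not taken from the patent's formula image: it assumes the scalar multiplies the classifier logits, that the classification loss is cross-entropy, and that the penalty is the product of the two groups' gradients at scalar value 1. All function names are hypothetical.

```python
import math
from typing import List, Sequence


def _softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mean_cross_entropy(batch_logits, labels, w: float = 1.0) -> float:
    """Mean cross-entropy of softmax(w * logits) against integer labels."""
    losses = [
        -math.log(_softmax([w * v for v in z])[y])
        for z, y in zip(batch_logits, labels)
    ]
    return sum(losses) / len(losses)


def grad_wrt_scale(batch_logits, labels, w: float = 1.0) -> float:
    """Analytic d/dw of mean_cross_entropy: mean of sum_c (p_c - 1[c == y]) * z_c."""
    total = 0.0
    for z, y in zip(batch_logits, labels):
        p = _softmax([w * v for v in z])
        total += sum(
            (p_c - (1.0 if c == y else 0.0)) * z_c
            for c, (p_c, z_c) in enumerate(zip(p, z))
        )
    return total / len(labels)


def irm_penalty(logits_a, labels_a, logits_b, labels_b) -> float:
    """Product of the scale gradients of two groups drawn from one client."""
    return grad_wrt_scale(logits_a, labels_a) * grad_wrt_scale(logits_b, labels_b)
```

Using two separate groups makes the product an unbiased estimate of the squared gradient, which is why implementations of this penalty commonly split each batch in half.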
Fifthly, the central server receives the loss of all the clients
Figure 307521DEST_PATH_IMAGE057
And
Figure 33031DEST_PATH_IMAGE058
and covariance matrix of features
Figure 669549DEST_PATH_IMAGE059
. Calculating the second-order statistical characteristic distance of the characteristic covariance matrix of every two clients in the central server to obtain a characteristic distance measurement loss function as follows:
[equation image not reproduced]
where [symbol image] is the feature distance metric loss, [symbol image] is the size of the feature vector, N is the number of clients, F denotes the Frobenius norm of the matrix, k denotes the k-th client, and [symbol images] are the feature covariance matrices of any two clients.
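The pairwise covariance distance described above can be sketched as follows. This is a minimal illustration only: the CORAL-style factor 1/(4d²), the averaging over client pairs, and the function names are assumptions, since the exact normalization appears only in the equation image.

```python
from itertools import combinations

def covariance(features):
    """Sample covariance matrix (d x d) of n feature vectors; requires n >= 2."""
    n, d = len(features), len(features[0])
    mean = [sum(row[c] for row in features) / n for c in range(d)]
    centered = [[row[c] - mean[c] for c in range(d)] for row in features]
    return [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(d)]
            for a in range(d)]

def frobenius_sq(m1, m2):
    """Squared Frobenius norm of the difference of two equally sized matrices."""
    return sum((x - y) ** 2 for r1, r2 in zip(m1, m2) for x, y in zip(r1, r2))

def feature_distance_loss(cov_matrices):
    """Average squared Frobenius distance over all client pairs,
    scaled by 1 / (4 * d^2) (assumed normalization)."""
    d = len(cov_matrices[0])
    pairs = list(combinations(cov_matrices, 2))
    total = sum(frobenius_sq(a, b) for a, b in pairs)
    return total / (4 * d * d * len(pairs))

# Each client computes its covariance locally and shares only the matrix.
cov_a = covariance([[0.0, 0.0], [2.0, 0.0]])
cov_b = covariance([[0.0, 0.0], [0.0, 2.0]])
loss = feature_distance_loss([cov_a, cov_b])
```

Because only the d × d covariance matrices leave the clients, the server can measure how far apart the source domains are without seeing any raw samples.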
Sixth, the global loss value is calculated on the central server. The global loss function is as follows:
[equation image not reproduced]
where [symbol images] are the classification loss, the invariant risk minimization loss, and the feature distance metric loss, respectively; k denotes the k-th client; [symbol image] is the weight of each client's classification loss, in which [symbol image] is an integer between 1 and N and [symbol image] is the loss value of the samples of the [symbol image]-th source domain data set; and N is the number of clients.
Under this weighting, a client with poor classification performance (a large loss value) contributes a larger share of the total loss, so global model training gives more consideration to the worst-performing client model.
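The effect of the weighting can be illustrated with a small sketch. It assumes that each client's weight is its classification loss divided by the sum of the losses over all N source domain clients; this normalization and the function names are assumptions, as the exact weight definition is given only in the equation image.

```python
def client_weights(classification_losses):
    """Weight of each client: its loss divided by the sum of all client losses
    (assumed form of the weighting)."""
    total = sum(classification_losses)
    return [loss / total for loss in classification_losses]

def weighted_classification_loss(classification_losses):
    """Loss-weighted sum of the per-client classification losses."""
    weights = client_weights(classification_losses)
    return sum(w * l for w, l in zip(weights, classification_losses))

losses = [1.0, 3.0]                    # client 2 performs worse
weights = client_weights(losses)       # [0.25, 0.75]
combined = weighted_classification_loss(losses)  # 0.25 * 1.0 + 0.75 * 3.0 = 2.5
```

With a plain average the two clients would count equally (combined loss 2.0); the loss-proportional weights shift the emphasis toward the client with the larger loss.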
Seventh, the fault diagnosis model on the central server is trained by backpropagation based on the global loss value. Because the low-level layers of the feature extraction network capture general features of fault data, the corresponding model parameters differ little across clients. To reduce the training burden, a model transfer strategy is adopted: the parameters of the low-level layers of the global model are frozen and only the parameters of the high-level layers are trained, which reduces the training workload without affecting the generalization capability of the model.
The model transfer strategy means that, while the clients and the server train the model cooperatively, the central server can freeze the low-level network parameters and train only the parameters of the higher-level layers, because the features extracted by the low-level layers of the feature extraction network are general features.
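The freezing step can be sketched as a gradient update that skips the frozen layers. The layer names, the learning rate, and the choice of which layers count as low-level are hypothetical; the patent does not state them in this passage.

```python
def sgd_step(params, grads, frozen, lr=0.1):
    """Return updated parameters, leaving every layer named in `frozen` unchanged."""
    updated = {}
    for name, values in params.items():
        if name in frozen:
            updated[name] = list(values)  # frozen low-level layer: copy as-is
        else:
            updated[name] = [v - lr * g for v, g in zip(values, grads[name])]
    return updated

# Hypothetical three-layer model: two low-level convolution layers and one
# high-level fully connected layer.
params = {"conv1": [0.5, -0.2], "conv2": [0.1, 0.3], "fc": [1.0, -1.0]}
grads = {"conv1": [1.0, 1.0], "conv2": [1.0, 1.0], "fc": [2.0, -2.0]}
new_params = sgd_step(params, grads, frozen={"conv1", "conv2"})
```

Only `fc` changes in this example, so the server spends no computation updating the layers whose parameters are assumed to be shared across clients.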
Finally, the trained global model is sent to the client holding the target domain data, where the fault data to be tested are input for fault diagnosis.
Experimental verification:
The following experiments verify the fault diagnosis accuracy and data security of the fault diagnosis model obtained by the present application.
1) Data set: the bearing fault data set is provided by Case Western Reserve University (CWRU). In this data set, the bearings have three fault diameters: 7, 14, and 21 mils. The label information is shown in Table 2, and the operating condition information is shown in Table 3. Each sample has a length of 4096. The training and testing scheme of the present invention is shown in Table 4.
Case 1: the first experiment uses the data sets numbered 0 and 1 in Table 3 as source domain 1 and source domain 2, respectively, and the data set numbered 3 as the target domain.
Case 2: the second experiment uses the data sets numbered 1 and 2 in Table 3 as source domain 1 and source domain 2, respectively, and the data set numbered 3 as the target domain.
Table 2. CWRU fault data set label information
[table image not reproduced]
Table 3. CWRU equipment operating conditions
[table image not reproduced]
Table 4. Source and target domain information in the CWRU experiments
[table image not reproduced]
2) Experimental results: the results are shown in Table 5. As Table 5 shows, compared with the domain adaptation method in the paper [Conditional Adversarial Domain Generalization With a Single Discriminator for Bearing Fault Diagnosis], the method provided by the present invention improves the fault diagnosis accuracy on the target domain while keeping the source domain data secure.
Table 5. Experimental results
[table image not reproduced]
Example two:
The second embodiment of the present disclosure provides a mechanical fault diagnosis system based on federal domain generalization, comprising:
a central server configured to initialize a global model, wherein the central server comprises a global feature extraction network and a global classification network and exchanges information with a plurality of clients simultaneously; each client comprises a feature extraction network and a classification network, N clients comprise N source domain data sets, and the (N + 1)-th client comprises the target domain data set.
The above description is only a preferred embodiment of the present application and is not intended to limit the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (5)

1. A mechanical fault diagnosis method based on federal domain generalization, characterized by comprising the following steps:
the central server randomly initializes the global model and sends the global model to all the clients; each client completes the following tasks:
calculating the classification loss based on the classification loss function and sending the classification loss to the central server;
acquiring the output features of the feature extraction network of each client, and sending the covariance matrix of the output features to the central server;
calculating the invariant risk minimization loss based on the invariant risk minimization loss function and sending the invariant risk minimization loss to the central server;
the client independently trains the model by using its own training data set;
the models trained by all the clients are sent to the central server, and all the model parameters are averaged in the central server to obtain the global model; the specific process is as follows: the central server receives the classification loss and the invariant risk minimization loss of all the clients and the covariance matrix of the features; the second-order statistical distance between the feature covariance matrices of every two clients is calculated in the central server, and the feature distance metric loss is obtained based on the feature distance metric loss function;
the classification loss is denoted [symbol image], and the classification loss function is:
[equation image not reproduced]
wherein [symbol image] is the training data of the k-th client, [symbol image] is the true label of the training data set of the k-th client, and [symbol image] is the predicted result;
the invariant risk minimization loss function is as follows:
[equation image not reproduced]
wherein IRM is the invariant risk minimization loss; [symbol image] denotes gradient calculation; [symbol image]; b represents the number of [symbol image] and [symbol image];
[symbol images] are the input data;
[symbol images] are the input labels;
[symbol image] is the feature output after [symbol image] passes through the feature extraction network [symbol image];
[symbol image] is the feature output after [symbol image] passes through the feature extraction network [symbol image];
[symbol image] is the [symbol image]-th group of features and labels of the k-th client;
[symbol image] is the [symbol image]-th group of features and labels of the k-th client;
[symbol image] is the classification loss function;
[symbol image] is the feature extraction network;
[symbol image] is a scalar;
the feature distance metric loss function is as follows:
[equation image not reproduced]
wherein [symbol image] is the feature distance metric loss, N represents the number of clients, F denotes the Frobenius norm of the matrix, [symbol image] is the size of the feature vector, k denotes the k-th client, and [symbol images] are the feature covariance matrices of any two clients;
the central server sends the global model to all the clients, and the clients and the central server cooperate to train the global model;
and the central server sends the trained global model to a client containing target domain data to complete fault diagnosis.
2. The federal domain generalization-based mechanical fault diagnosis method as claimed in claim 1, wherein the global loss value is calculated on the central server based on a global loss function, and the global model of the central server is trained based on the global loss value back propagation.
3. The federal domain generalization-based mechanical fault diagnosis method of claim 2, wherein the global loss function is as follows:
[equation image not reproduced]
wherein [symbol images] are the classification loss, the invariant risk minimization loss, and the feature distance metric loss, respectively; k denotes the k-th client; [symbol image] is the weight of each client's classification loss, in which [symbol image] is an integer between 1 and N and [symbol image] denotes the loss value of the samples of the [symbol image]-th source domain data set; and N represents the number of clients.
4. A diagnosis system for the federal domain generalization-based mechanical fault diagnosis method as defined in any one of claims 1 to 3, comprising:
a central server and clients; the central server comprises a global feature extraction network and a global classification network, and exchanges information with a plurality of clients simultaneously; the central server is also used for initializing a global model; the feature extraction network consists of three groups, connected in series, each comprising a one-dimensional convolutional layer, a batch normalization layer, a rectified linear unit (ReLU) layer, and a one-dimensional max pooling layer; the classification network consists of a fully connected layer, a batch normalization layer, a rectified linear unit (ReLU) layer, a Dropout layer, and a Softmax layer.
5. The diagnosis system of claim 4, wherein each client comprises a feature extraction network and a classification network, N clients comprise N source domain data sets, and the (N + 1)-th client comprises the target domain data set; the feature extraction network consists of three groups, connected in series, each comprising a one-dimensional convolutional layer, a batch normalization layer, a rectified linear unit (ReLU) layer, and a one-dimensional max pooling layer; the classification network consists of a fully connected layer, a batch normalization layer, a rectified linear unit (ReLU) layer, a Dropout layer, and a Softmax layer.
CN202210738070.1A 2022-06-28 2022-06-28 Method and system for diagnosing mechanical fault based on federal domain generalization Active CN114818996B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210738070.1A CN114818996B (en) 2022-06-28 2022-06-28 Method and system for diagnosing mechanical fault based on federal domain generalization

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210738070.1A CN114818996B (en) 2022-06-28 2022-06-28 Method and system for diagnosing mechanical fault based on federal domain generalization

Publications (2)

Publication Number Publication Date
CN114818996A CN114818996A (en) 2022-07-29
CN114818996B true CN114818996B (en) 2022-10-11

Family

ID=82522738

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210738070.1A Active CN114818996B (en) 2022-06-28 2022-06-28 Method and system for diagnosing mechanical fault based on federal domain generalization

Country Status (1)

Country Link
CN (1) CN114818996B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115328691B (en) * 2022-10-14 2023-03-03 山东大学 Fault diagnosis method, system, storage medium and equipment based on model difference
CN116226784A (en) * 2023-02-03 2023-06-06 中国人民解放军92578部队 Federal domain adaptive fault diagnosis method based on statistical feature fusion
CN115952442B (en) * 2023-03-09 2023-06-13 山东大学 Global robust weighting-based federal domain generalized fault diagnosis method and system
CN117172312A (en) * 2023-08-18 2023-12-05 南京理工大学 Equipment fault diagnosis method based on improved federal element learning
CN116992336B (en) * 2023-09-04 2024-02-13 南京理工大学 Bearing fault diagnosis method based on federal local migration learning

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112308158B (en) * 2020-11-05 2021-09-24 电子科技大学 Multi-source field self-adaptive model and method based on partial feature alignment
CN112560991B (en) * 2020-12-25 2023-07-07 中山大学 Personalized federal learning method based on mixed expert model
CN112784872B (en) * 2020-12-25 2023-06-30 北京航空航天大学 Cross-working condition fault diagnosis method based on open set joint transfer learning
CN114186237A (en) * 2021-10-26 2022-03-15 北京理工大学 Truth-value discovery-based robust federated learning model aggregation method
CN114254700A (en) * 2021-12-06 2022-03-29 中国海洋大学 TBM hob fault diagnosis model construction method based on federal learning
CN114399055A (en) * 2021-12-28 2022-04-26 重庆大学 Domain generalization method based on federal learning
CN114358286A (en) * 2022-03-08 2022-04-15 浙江中科华知科技股份有限公司 Mobile equipment federal learning method and system

Also Published As

Publication number Publication date
CN114818996A (en) 2022-07-29

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant