CN114818996A - Method and system for diagnosing mechanical fault based on federal domain generalization - Google Patents

Publication number
CN114818996A
Authority
CN
China
Prior art keywords
central server
client
loss
domain
clients
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210738070.1A
Other languages
Chinese (zh)
Other versions
CN114818996B (en
Inventor
宋艳
李沂滨
贾磊
崔明
王代超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong University
Original Assignee
Shandong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong University
Priority to CN202210738070.1A
Publication of CN114818996A
Application granted
Publication of CN114818996B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/2431 Multiple classes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Medical Informatics (AREA)
  • Molecular Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioethics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Databases & Information Systems (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Test And Diagnosis Of Digital Computers (AREA)

Abstract

The invention discloses a method and a system for diagnosing mechanical faults based on federated domain generalization, and relates to the technical field of fault diagnosis. In the first step, the central server randomly initializes a global model and sends it to all clients. In the second step, each client independently trains the model using its own training data set. In the third step, the models trained by all clients are sent to the central server, which averages all model parameters to obtain a global model. In the fourth step, the clients and the central server cooperate to train the global model. In the testing stage, the central server sends the global model to the client holding the target domain data to complete fault diagnosis. The invention exploits the inherent relationship between the labels and the features of the source domain data, and completes the training of the global fault diagnosis model by weighting and aggregating, in the central server, the training losses and model parameters of the different client models.

Description

Method and system for diagnosing mechanical fault based on federal domain generalization
Technical Field
The invention relates to the technical field of fault diagnosis, and in particular to a method and a system for diagnosing mechanical faults based on federated domain generalization.
Background
Mechanical fault data generally come from devices of different models, under different working conditions, or in different operating environments. A fault diagnosis model trained jointly on such data tends to have low accuracy and poor generalization when predicting on new data. The domain generalization and domain adaptation methods of transfer learning address this domain shift problem by aligning the feature spaces of the data. In both families of methods, labeled data are generally called source domain data, and the unlabeled data to be predicted are called target domain data. Domain adaptation predicts faults in the target domain data by aligning the feature spaces of the source domain data and the target domain data. Domain generalization differs from domain adaptation in that only source domain data are available and no target domain data; it achieves domain transfer by exploiting the internal relationship between data features and labels in the source domains.
In the prior art, a number of fault diagnosis studies are based on domain generalization. The document [Deep Domain Generalization Combining A Priori Diagnosis Knowledge Toward Cross-Domain Fault Diagnosis of Rolling Bearing] proposes a rolling bearing fault diagnosis method based on domain generalization. The method eliminates potential differences among multiple domains when the target domain has only healthy samples, and achieves efficient fault diagnosis. The document [Conditional Adversarial Domain Generalization With a Single Discriminator for Bearing Fault Diagnosis] proposes a conditional adversarial domain generalization method with a single discriminator, which aims to extract domain-invariant features from data under different working conditions and to generalize them to new fault data. The document [Intelligent Fault Identification Based on Multisource Domain Generalization Towards Actual Diagnosis Scenario] proposes an intelligent fault identification method based on multiple source domains. The method describes the discriminant structure of each source domain as a point on the Grassmann manifold using local Fisher discriminant analysis. By preserving the local structure within each class, local Fisher discriminant analysis can learn an effective discriminator from multi-modal fault data. The document [A New Multiple Source Domain Adaptation Fault Diagnosis Method Between Different Rotating Machines] proposes a multi-source domain adaptation fault diagnosis method. The method uses a multi-adversarial learning strategy to obtain a domain-aligned feature representation that is also discriminative for the target domain. The document [Deep Adversarial Domain Adaptation Model for Bearing Fault Diagnosis] proposes a deep adversarial domain adaptation model for fault diagnosis of rolling bearings. The model constructs an adversarial domain adaptation network to address the inconsistent distribution of source domain and target domain features.
It can be seen that data-driven fault diagnosis algorithms train the diagnostic model on a large amount of fault data. To guarantee the effectiveness of deep learning methods, as much fault data as possible should therefore be aggregated. However, because of data security and privacy requirements, aggregating the data of different clients is not allowed in most cases. Federated learning has emerged to aggregate and use data effectively while keeping the data of different clients secure, and to solve the data island problem in deep learning. In federated learning, the learning task is solved jointly by multiple participating devices (i.e., clients) under the coordination of a central server. From the perspective of theoretical research, scholars at home and abroad have studied common scientific problems in federated learning, such as non-independent and identically distributed (non-IID) data, unlabeled data, and security. From the application perspective, scholars at home and abroad have carried out a great deal of research on combining federated learning with specific application scenarios, such as finance, medical treatment, robotics, and smart cities.
Federated learning uses the data of different clients to train a model cooperatively, but because the devices of different clients differ in operating conditions or model, the data usually exhibit domain shift, so federated transfer learning is attracting more and more researchers. Zhang et al. [Federated Transfer Learning for Intelligent Fault Diagnostics Using Deep Adversarial Networks With Data Privacy] propose a federated transfer learning method for fault diagnosis, which designs different network model structures for different clients. The document [Data privacy preserving federated transfer learning in machinery fault diagnostics using prior distributions] proposes a federated transfer learning method for mechanical fault diagnosis. The method addresses the domain shift problem indirectly by using prior distributions, and performs fault diagnosis by extracting domain-invariant features of different users.
Existing federated transfer learning fault diagnosis methods consider data security and the domain shift between the source domain and the target domain. However, they assume that target domain data exist and participate in the training process, and they do not consider the case where target domain data are unavailable or the need for personalized model training.
Disclosure of Invention
To address the above problems, the invention provides a method and a system for diagnosing mechanical faults based on federated domain generalization, which train a personalized fault diagnosis model for each client based on the differences among the source domain data.
In the invention, to ensure data security, neither fault data nor fault features are shared between clients or between a client and the central server. At the same time, the method trains a global fault diagnosis model in the central server using partial model parameters and weighted losses from the different clients.
To achieve the above purpose, the invention adopts the following technical scheme:
A first aspect of the disclosure provides a mechanical fault diagnosis method based on federated domain generalization, which includes the following steps:
the central server randomly initializes the global model and sends it to all clients;
each client independently trains the model using its own training data set;
the models trained by all clients are sent to the central server, which averages all model parameters to obtain a global model;
the central server sends the global model to all clients, and the clients and the central server cooperate to train the global model;
and the central server sends the trained global model to the client containing the target domain data to complete fault diagnosis.
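The parameter-averaging step in the list above can be illustrated with a minimal sketch. It assumes that each client model is represented as a dictionary mapping a layer name to a flat list of floats and that all clients are weighted equally; the function name `average_parameters` and the layer names are hypothetical and do not come from the patent.

```python
from typing import Dict, List

# Hypothetical flat representation of a model: layer name -> list of floats.
Params = Dict[str, List[float]]


def average_parameters(client_params: List[Params]) -> Params:
    """Return the element-wise, unweighted mean of the clients' parameters."""
    if not client_params:
        raise ValueError("need at least one client model")
    n = len(client_params)
    global_params: Params = {}
    for name in client_params[0]:
        # Group the values that sit at the same position in every client.
        columns = zip(*(p[name] for p in client_params))
        global_params[name] = [sum(col) / n for col in columns]
    return global_params


# Two illustrative client models with made-up values.
clients = [
    {"conv1.weight": [1.0, 2.0], "fc.bias": [0.0]},
    {"conv1.weight": [3.0, 4.0], "fc.bias": [1.0]},
]
global_model = average_parameters(clients)
print(global_model)
```

With tensor-valued parameters the same element-wise mean would apply to each tensor.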
Further, after the central server sends the global model to all clients, each client completes the following tasks:
calculating the classification loss based on the classification loss function and sending it to the central server;
obtaining the output features of the client's feature extraction network and sending the covariance matrix of the output features to the central server;
and calculating the invariant risk minimization loss based on the invariant risk minimization loss function and sending it to the central server.
Further, the classification loss of the $k$-th client is denoted $L_{cls}^{k}$. The classification loss function is:
[formula image not reproduced in this text]
where $x_{k}$ is the training data of the $k$-th client, $y_{k}$ is the true label of the training data set of the $k$-th client, and $\hat{y}_{k}$ is the prediction result.
Further, the invariant risk minimization loss function is as follows:
[formula image not reproduced in this text]
where IRM is the invariant risk minimization loss; $\nabla$ denotes the gradient calculation; $b$ is the size of the groups $(\Phi(X^{k,p}), Y^{k,p})$ and $(\Phi(X^{k,q}), Y^{k,q})$; $X^{k,p}$ and $X^{k,q}$ are the input data; $Y^{k,p}$ and $Y^{k,q}$ are the input labels; $\Phi(X^{k,p})$ is the feature output after $X^{k,p}$ passes through the feature extraction network $\Phi$; $\Phi(X^{k,q})$ is the feature output after $X^{k,q}$ passes through the feature extraction network $\Phi$; $(\Phi(X^{k,p}), Y^{k,p})$ is the $p$-th group of features and labels of the $k$-th client; $(\Phi(X^{k,q}), Y^{k,q})$ is the $q$-th group of features and labels of the $k$-th client; $\ell$ is the classification loss function; $\Phi$ is the feature extraction network; and $w$ is a scalar.
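The symbol definitions above (a gradient with respect to a scalar, two groups of features and labels of the same size $b$, and a classification loss function) match the unbiased mini-batch penalty estimator commonly used for invariant risk minimization. The following is a reconstruction under that assumption, not a copy of the patent's formula image; the symbol names $w$, $\Phi$, $\ell$, $p$, $q$, and the sample index $n$ are chosen here, and any normalizing factor such as $1/b$ is unknown.

```latex
L_{IRM}^{k}
  = \sum_{n=1}^{b}
    \nabla_{w \mid w = 1}\, \ell\!\left(w \cdot \Phi\!\left(X_{n}^{k,p}\right),\, Y_{n}^{k,p}\right)
    \cdot
    \nabla_{w \mid w = 1}\, \ell\!\left(w \cdot \Phi\!\left(X_{n}^{k,q}\right),\, Y_{n}^{k,q}\right)
```

Using two independent groups rather than squaring a single gradient keeps the estimate of the squared gradient norm unbiased, which would explain why the text defines a $p$-th and a $q$-th group for each client.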
Further, the central server receives the classification losses, the invariant risk minimization losses, and the feature covariance matrices of all clients. The central server calculates the second-order statistical feature distance between the feature covariance matrices of every pair of clients, and obtains the feature distance metric loss based on a feature distance metric loss function.
Further, the feature distance metric loss function is as follows:
[formula image not reproduced in this text]
where $L_{dis}$ is the feature distance metric loss; $N$ is the number of clients; $\|\cdot\|_{F}$ denotes the Frobenius norm (F-norm) of a matrix; $d$ is the size of the feature vector; $C$ is a covariance matrix; and $C_{i}$ and $C_{j}$ are the feature covariance matrices of any two clients.
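The quantities defined above (a squared Frobenius norm of the difference between two covariance matrices, scaled by the feature size) resemble the correlation alignment (CORAL) distance. A possible reconstruction is sketched below, assuming the CORAL scaling $1/(4d^{2})$ and a plain average over all client pairs; both the scaling and the averaging factor are assumptions, since the formula image is not available.

```latex
L_{dis}
  = \frac{2}{N(N-1)} \sum_{1 \le i < j \le N}
    \frac{1}{4 d^{2}} \left\| C_{i} - C_{j} \right\|_{F}^{2}
```

Only the $d \times d$ covariance matrices enter this expression, which is consistent with the statement that raw fault data and features are never sent to the central server.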
Further, a global loss value is calculated at the central server based on the global loss function, and the global model of the central server is trained by back propagation based on the global loss value.
Further, the global loss function is as follows:
[formula image not reproduced in this text]
where $L_{cls}^{k}$, $L_{IRM}^{k}$, and $L_{dis}$ are the classification loss, the invariant risk minimization loss, and the feature distance metric loss, respectively; $k$ denotes the $k$-th client; $\alpha_{k}$ is the weight of each client's classification loss, in which $i$ is an integer between 1 and $N$ and $L_{i}$ is the loss value of the $i$-th source domain data set; and $N$ is the number of clients.
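One way the three terms could be combined is written out below. This is a sketch under assumptions: the weight $\alpha_{k}$ is taken to be each client's share of the total classification loss, and the three terms are summed without further trade-off coefficients. The patent's formula image may differ in both respects.

```latex
L_{global}
  = \sum_{k=1}^{N} \alpha_{k}\, L_{cls}^{k}
  + \sum_{k=1}^{N} L_{IRM}^{k}
  + L_{dis},
\qquad
\alpha_{k} = \frac{L_{cls}^{k}}{\sum_{i=1}^{N} L_{cls}^{i}}
```

With this choice the weights sum to 1 and grow with the client's loss, so a client whose model classifies poorly has more influence on the global update.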
A second aspect of the present disclosure provides a mechanical fault diagnosis system based on federated domain generalization, including:
a central server and a client; the central server comprises a global feature extraction network and a global classification network, and the central server simultaneously carries out information interaction with a plurality of clients; the central server is also used for initializing the global model.
Furthermore, each client comprises a feature extraction network and a classification network, wherein N clients comprise N source domain data sets, and the (N + 1) th client comprises a target domain data set.
The beneficial effects of the above-mentioned embodiment of the present invention are as follows:
The method adopts federated learning, so the source domain data need not be disclosed to untrusted third parties; the privacy and security of the source domain data are protected, and a global fault diagnosis model is trained on the basis of the source domain data of all clients. Compared with other domain adaptation methods, the federated domain generalization fault diagnosis method provided by the invention improves data security and fault diagnosis accuracy, improves the interpretability of the machine learning method, and fundamentally addresses the problem of cross-domain fault diagnosis.
The method jointly considers the inherent relationship between data features and labels, the spatial distance between client data features, and the transfer between different client models. It takes the intrinsic causal relationship between the fault features and the fault types of each client as a training objective function, and trains under a federated learning framework to obtain a fault diagnosis model with strong generalization capability.
The method does not share fault data or features. It reduces the differences between data from different domains through the training and transfer of part of the model parameters in each client, and it adopts a model transfer strategy in the feature extraction layers of the client model, which reduces the workload of model training without affecting the generalization capability of the model.
The method quantifies the differences in feature space distance between different source domains, and realizes domain generalization by weighting the model losses of different clients.
Drawings
The accompanying drawings, which constitute a part of this specification, are included to provide a further understanding of the invention. The exemplary embodiments of the invention and their description serve to explain the invention and do not limit it.
FIG. 1 is a structural relationship diagram of the central server and the clients in conventional federated learning;
FIG. 2 is a framework diagram of the federated domain generalization method of the invention.
Detailed Description
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the example embodiments according to the present application. As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, and it should be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof.
Embodiment 1:
Let N clients contain N source domain data sets $\{D_{i}^{s}\}_{i=1}^{N}$, and let the (N + 1)-th client hold the target domain data set $D^{t} = \{x_{j}^{t}\}_{j=1}^{n_{t}}$, where $n_{t}$ is the total number of samples in the target domain data set, $j$ denotes the $j$-th sample of the target domain data set, and $j$ is an integer from 1 to $n_{t}$. $i$ denotes the $i$-th source domain data set, and $i$ is an integer between 1 and N. The target domain data set does not participate in model training and is used only for testing; the source domain data sets participate in training.
Each client has a feature extraction network and a classification network. Let the set of the N feature extraction network models be $\{\Phi_{i}\}_{i=1}^{N}$ and the set of the N classification network models be $\{h_{i}\}_{i=1}^{N}$. Let the global feature extraction network model in the central server be $\Phi_{g}$ and the global classification network model be $h_{g}$.
Federated learning has emerged to aggregate and use data effectively while keeping the data of different clients secure, and to solve the data island problem in deep learning. The architecture of the clients and the central server in conventional federated learning is shown in FIG. 1. In federated learning, the learning task is solved jointly by multiple participating devices (i.e., clients) under the coordination of a central server.
Existing federated transfer learning fault diagnosis methods consider data security and the domain shift between the source domain and the target domain. However, they assume that target domain data exist and participate in the training process, and they do not consider the case where target domain data are unavailable or the need for personalized model training. To address these problems, the invention trains a personalized fault diagnosis model for each client based on the fact that the source domain data differ. As shown in FIG. 2, the invention exploits the inherent relationship between the labels and the features of the source domain data, and completes the training of the global model by performing, in the central server, a weighted aggregation of loss gradients over the training losses and model parameters of the different client models. To ensure data security, neither fault data nor fault features are shared between clients or between a client and the central server. At the same time, the method aggregates loss gradients in the central server using partial model parameters and weighted losses from the different clients, thereby training the model and overcoming the shortcomings that target domain data are unavailable and that models are not trained in a personalized way.
The first embodiment of the disclosure provides a mechanical fault diagnosis method based on federated domain generalization. In the training phase, the central server first transmits a randomly initialized global model to all clients. Each client independently trains the model using its own training data set and then transmits its model to the central server, which averages all model parameters to obtain a global model and transmits the averaged model back to the clients. The clients transmit their losses to the central server, and the central server trains the global model based on these loss values. Finally, the trained global model is sent to the client holding the target domain data, and the fault data to be tested are input to the trained global model to diagnose the fault. The method specifically comprises the following steps:
First, the central server sends the randomly initialized models $\Phi_{g}$ and $h_{g}$ to each client.
Second, each client takes $\Phi_{g}$ and $h_{g}$ as its initial model and trains the model parameters using its own data set to obtain the client models $\Phi_{i}$ and $h_{i}$, where $i$ denotes the $i$-th source domain data set and is an integer between 1 and N.
Third, the sets of all client models $\{\Phi_{i}\}_{i=1}^{N}$ and $\{h_{i}\}_{i=1}^{N}$ are sent to the central server, which averages all model parameters to obtain the global models $\bar{\Phi}_{g}$ and $\bar{h}_{g}$, where $\bar{\Phi}_{g}$ is the global model of the central server's feature extraction network and $\bar{h}_{g}$ is the global model of the central server's classification network. The specific structures of the feature extraction network and the classification network are shown in the following table:
TABLE 1 Network architecture
[table image not reproduced in this text]
As shown in the above table, the feature extraction network consists of three groups connected in series, each made up of a one-dimensional convolutional layer, a batch normalization layer, a rectified linear unit layer, and a one-dimensional max pooling layer. Each of the three one-dimensional convolutional layers has 128 convolution kernels; the kernel sizes are 17, 17, and 3, respectively, and the convolution stride is 1 in every layer. The batch normalization layers have no parameters. The parameter of each of the three rectified linear unit layers is 0.2. The parameters of the three one-dimensional max pooling layers are 16, 16, and 2, respectively.
The classification network consists of a fully connected layer, a batch normalization layer, a rectified linear unit layer, a Dropout layer, and a Softmax layer. The parameter of the fully connected layer is 512, the parameter of the rectified linear unit layer is 0.2, the Dropout (random zeroing) parameter is 0.3, and the parameter of the Softmax layer is the number of fault categories.
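To see how the layer parameters above shrink a one-dimensional signal, the following sketch computes the output length of the three convolution and pooling groups. It assumes unpadded convolutions, a pooling stride equal to the pooling size, and an input of 2048 samples; none of these are stated in the text, and the function names are hypothetical.

```python
def conv_out(length: int, kernel: int, stride: int = 1) -> int:
    """Output length of an unpadded 1-D convolution."""
    return (length - kernel) // stride + 1


def pool_out(length: int, size: int) -> int:
    """Output length of 1-D max pooling with stride equal to the pool size."""
    return (length - size) // size + 1


def feature_length(length: int,
                   kernels=(17, 17, 3),
                   pools=(16, 16, 2)) -> int:
    """Length of the signal after the three conv + pool groups."""
    for kernel, pool in zip(kernels, pools):
        length = pool_out(conv_out(length, kernel), pool)
        if length < 1:
            raise ValueError("input signal is too short for this network")
    return length


CHANNELS = 128        # number of convolution kernels per layer (from the text)
INPUT_LENGTH = 2048   # assumed input length, not given in the text

out_len = feature_length(INPUT_LENGTH)
print(out_len, CHANNELS * out_len)  # 2 256
```

Under these assumptions the flattened feature vector has 256 entries, which would then feed the fully connected layer of the classification network.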
Fourth, the central server sends $\bar{\Phi}_{g}$ and $\bar{h}_{g}$ to all clients, and each client completes the following tasks:
Calculate the classification loss $L_{cls}^{k}$ with the classification loss function
[formula image not reproduced in this text]
where $x_{k}$ is the training data of the k-th client, $y_{k}$ is the true label of the training data set of the k-th client, and $\hat{y}_{k}$ is the prediction result. The classification loss $L_{cls}^{k}$ is sent to the central server.
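The text does not state which classification loss function is used. Because the classification network ends in a Softmax layer, cross-entropy is a common choice, and the sketch below assumes it. The function name `cross_entropy` and the example values are illustrative only.

```python
import math
from typing import List


def cross_entropy(probs: List[List[float]], labels: List[int]) -> float:
    """Mean negative log-probability assigned to the true class.

    probs[n][c] is the Softmax output for sample n and class c;
    labels[n] is the index of the true fault class of sample n.
    """
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for p, y in zip(probs, labels):
        total -= math.log(max(p[y], eps))
    return total / len(labels)


# Two samples, three fault classes (made-up predictions).
predictions = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]
true_labels = [0, 1]
client_loss = cross_entropy(predictions, true_labels)
print(client_loss)
```

Only the scalar `client_loss` would need to leave the client, which fits the requirement that raw data stay local.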
Obtain the output features of the client's feature extraction network, and send the covariance matrix $C_{k}$ of the output features to the central server.
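The covariance matrix mentioned above can be computed locally from a batch of feature vectors. The sketch below assumes the features arrive as a list of equal-length rows (one row per sample) and uses the unbiased sample covariance with a divisor of n - 1; the divisor and the function name `covariance_matrix` are assumptions.

```python
from typing import List

Matrix = List[List[float]]


def covariance_matrix(features: Matrix) -> Matrix:
    """Sample covariance (d x d) of n feature vectors of size d."""
    n = len(features)
    if n < 2:
        raise ValueError("need at least two samples")
    d = len(features[0])
    means = [sum(row[c] for row in features) / n for c in range(d)]
    centered = [[row[c] - means[c] for c in range(d)] for row in features]
    return [
        [sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(d)]
        for a in range(d)
    ]


# Three samples with two features each (made-up values).
batch = [[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]]
cov = covariance_matrix(batch)
print(cov)  # [[4.0, 8.0], [8.0, 16.0]]
```

The matrix has size d x d regardless of how many samples the client holds, so it summarizes the feature distribution without exposing individual samples.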
Calculate the invariant risk minimization (IRM) loss. Invariant risk minimization assumes that the data distributions of different domains differ, but that the causal relationship between data features and labels is constant: it does not change as working conditions or environments change. The purpose of invariant risk minimization is to find this latent invariance across domains. The invariant risk minimization loss function is as follows:
[formula image not reproduced in this text]
where IRM is the invariant risk minimization loss; $\nabla$ denotes the gradient calculation; $b$ is the size of the groups $(\Phi(X^{k,p}), Y^{k,p})$ and $(\Phi(X^{k,q}), Y^{k,q})$; $X^{k,p}$ and $X^{k,q}$ are the input data; $Y^{k,p}$ and $Y^{k,q}$ are the input labels; $\Phi(X^{k,p})$ is the feature output after $X^{k,p}$ passes through the feature extraction network $\Phi$; $\Phi(X^{k,q})$ is the feature output after $X^{k,q}$ passes through the feature extraction network $\Phi$; $(\Phi(X^{k,p}), Y^{k,p})$ is the $p$-th group of features and labels of the $k$-th client; $(\Phi(X^{k,q}), Y^{k,q})$ is the $q$-th group of features and labels of the $k$-th client; $\ell$ is the classification loss function; $\Phi$ is the feature extraction network; and $w$ is a scalar.
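A numerical sketch of an invariant risk minimization penalty is given below. It assumes the common mini-batch form in which the gradient of the loss with respect to a scalar multiplier w (evaluated at w = 1) is computed on two groups from the same client and the two gradients are multiplied. It further assumes a Softmax cross-entropy loss applied to class scores; the function names and inputs are hypothetical, and the patent's exact formula may differ.

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable Softmax over one vector of class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def grad_wrt_scale(scores: List[List[float]], labels: List[int]) -> float:
    """d/dw of the mean cross-entropy of softmax(w * z), evaluated at w = 1.

    For one sample the derivative is sum_c (p_c - onehot_c) * z_c.
    """
    total = 0.0
    for z, y in zip(scores, labels):
        p = softmax(z)
        total += sum(
            (p[c] - (1.0 if c == y else 0.0)) * z[c] for c in range(len(z))
        )
    return total / len(labels)


def irm_penalty(scores_p, labels_p, scores_q, labels_q) -> float:
    """Product of the scale gradients of two groups from the same client."""
    return grad_wrt_scale(scores_p, labels_p) * grad_wrt_scale(scores_q, labels_q)


# Two small groups of class scores from one client (made-up values).
group_p = ([[2.0, -1.0], [0.5, 0.5]], [1, 0])
group_q = ([[1.0, 0.0], [-0.5, 1.5]], [0, 1])
print(irm_penalty(*group_p, *group_q))
```

A penalty near zero indicates that rescaling the classifier output would not lower the loss on either group, which is the invariance condition that the method looks for across domains.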
Fifth, the central server receives the losses $L_{cls}^{k}$ and $L_{IRM}^{k}$ of all clients and the feature covariance matrices $C_{k}$. The central server calculates the second-order statistical feature distance between the feature covariance matrices of every pair of clients. The feature distance metric loss function is as follows:
[formula image not reproduced in this text]
where $L_{dis}$ is the feature distance metric loss; $d$ is the size of the feature vector; $N$ is the number of clients; $\|\cdot\|_{F}$ denotes the Frobenius norm (F-norm) of a matrix; $k$ denotes the $k$-th client; and $C_{i}$ and $C_{j}$ are the feature covariance matrices of any two clients.
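The pairwise distance computation described above can be sketched as follows. The sketch assumes a CORAL-style distance, that is, the squared Frobenius norm of the difference between two covariance matrices divided by 4d², averaged over all client pairs. The scaling, the averaging, and the function names are assumptions rather than details taken from the patent.

```python
from itertools import combinations
from typing import List

Matrix = List[List[float]]


def frobenius_sq(a: Matrix, b: Matrix) -> float:
    """Squared Frobenius norm of the difference a - b."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def feature_distance_loss(covs: List[Matrix]) -> float:
    """Mean CORAL-style distance over all pairs of client covariances."""
    pairs = list(combinations(range(len(covs)), 2))
    if not pairs:
        raise ValueError("need covariance matrices from at least two clients")
    d = len(covs[0])  # size of the feature vector
    total = sum(frobenius_sq(covs[i], covs[j]) / (4 * d * d) for i, j in pairs)
    return total / len(pairs)


# Covariance matrices received from two clients (made-up values).
client_covs = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[3.0, 0.0], [0.0, 1.0]],
]
print(feature_distance_loss(client_covs))  # 0.25
```

The loss is zero when all clients share the same feature covariance, so minimizing it pulls the second-order statistics of the source domains together.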
Sixth, a global loss value is calculated on the central server. The global loss function is as follows:
Figure 667537DEST_PATH_IMAGE030
wherein Figure 427683DEST_PATH_IMAGE031 are, respectively, the classification loss, the invariant risk minimization loss and the feature distance metric loss; Figure 632399DEST_PATH_IMAGE004 denotes the Figure 628037DEST_PATH_IMAGE004-th client; Figure 437861DEST_PATH_IMAGE032 is the weight of the classification loss of each client, where Figure 634487DEST_PATH_IMAGE033 is an integer between 1 and N, Figure 60921DEST_PATH_IMAGE034 is the loss value of the samples of the Figure 594670DEST_PATH_IMAGE033-th source domain data set, and N is the number of clients.
Through this weighting, clients with poor classification performance (large loss values) contribute a larger share of the loss, so the worst-performing models are taken into account when the global model is trained.
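The weighting idea can be illustrated with a short sketch. It assumes that each client's weight is its classification loss divided by the sum of all clients' classification losses, and that the three loss terms are added with unit coefficients; both choices, and the function names, are assumptions for illustration and are not confirmed by the text.

```python
def client_weights(classification_losses):
    """Weight each client by its share of the total classification loss."""
    total = sum(classification_losses)
    return [loss / total for loss in classification_losses]

def global_loss(classification_losses, irm_losses, distance_loss):
    """Weighted classification loss plus IRM losses plus feature distance loss."""
    weights = client_weights(classification_losses)
    weighted_cls = sum(w * l for w, l in zip(weights, classification_losses))
    return weighted_cls + sum(irm_losses) + distance_loss
```

For example, with classification losses of 1.0 and 3.0 the second client receives weight 0.75, so its loss dominates the weighted classification term.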
Seventh, the fault diagnosis model of the central server is trained by back propagation based on the global loss value. Because the low-level feature extraction layers capture general features of the fault data, their parameters differ little between clients. To reduce the training burden, a model transfer strategy is adopted: the low-level network parameters of the global model are frozen and only the parameters of the high-level network are trained, which reduces the training workload without affecting the generalization ability of the model.
The model transfer strategy means that, while the clients and the server cooperatively train the model, the central server can freeze the low-level network parameters of the feature extraction layers, since the features they extract are general features, and train only the parameters of the higher-level network.
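A minimal sketch of such a freezing step is shown below. It uses plain Python lists as stand-ins for parameter tensors and a simple gradient descent update; the layer names (`conv1`, `fc`), the choice of which layers count as low-level, and the function `sgd_step` are hypothetical and serve only to illustrate the idea.

```python
def sgd_step(params, grads, frozen_prefixes, lr=0.01):
    """Return updated parameters, leaving layers with a frozen prefix unchanged."""
    updated = {}
    for name, values in params.items():
        if name.startswith(tuple(frozen_prefixes)):
            # Frozen low-level layer: copy the values without applying the gradient.
            updated[name] = list(values)
        else:
            # Trainable high-level layer: ordinary gradient descent update.
            updated[name] = [v - lr * g for v, g in zip(values, grads[name])]
    return updated

params = {"conv1.weight": [1.0, 2.0], "fc.weight": [1.0]}
grads = {"conv1.weight": [10.0, 10.0], "fc.weight": [2.0]}
new_params = sgd_step(params, grads, frozen_prefixes=["conv1"], lr=0.5)
```

In a deep learning framework the same effect is usually obtained by excluding the frozen parameters from gradient computation or from the optimizer.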
Finally, the trained global model is sent to the client holding the target domain data, where the fault data to be tested is input for fault diagnosis.
Experimental verification:
The fault diagnosis accuracy and security of the fault diagnosis model obtained in the present application are verified by the following experiments.
1) Data set: the bearing fault data set is provided by Case Western Reserve University (CWRU). In this data set, the bearing faults have three different diameters: 7, 14 and 21 mils. The label information is shown in Table 2, and the working condition information is shown in Table 3. Each data set sample has a length of 4096. The training and testing schemes of the invention are shown in Table 4.
Case 1 means that the first experiment uses the data sets numbered 0 and 1 in Table 3 as source domain 1 and source domain 2, respectively, and the data set numbered 3 as the target domain.
Case 2 means that the second experiment uses the data sets numbered 1 and 2 in Table 3 as source domain 1 and source domain 2, respectively, and the data set numbered 3 as the target domain.
Table 2 CWRU fault data set label information
Figure 790159DEST_PATH_IMAGE062
Table 3 CWRU equipment working conditions
Figure 157687DEST_PATH_IMAGE063
Table 4 Source and target domain information in the CWRU experiments
Figure 805837DEST_PATH_IMAGE064
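The two cases described above can be written as a small configuration. The sketch below is only an illustration: the dictionary layout, the client naming scheme and the function `assign_clients` are hypothetical, while the working condition numbers are those stated for Case 1 and Case 2.

```python
# Working condition numbers (Table 3) used as source and target domains per case.
CASES = {
    "Case 1": {"source_domains": [0, 1], "target_domain": 3},
    "Case 2": {"source_domains": [1, 2], "target_domain": 3},
}

def assign_clients(case_name):
    """Map N source domains to clients 1..N and the target domain to client N+1."""
    case = CASES[case_name]
    clients = {
        f"client_{i + 1}": {"role": "source", "condition": condition}
        for i, condition in enumerate(case["source_domains"])
    }
    n = len(case["source_domains"])
    clients[f"client_{n + 1}"] = {"role": "target", "condition": case["target_domain"]}
    return clients
```

With two source domains this yields three clients, the third of which holds only the target domain data.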
2) Experimental results: the results of the experiment are shown in Table 5. As Table 5 shows, compared with the domain adaptation method in the paper [Conditional Adversarial Domain Generalization With a Single Discriminator for Bearing Fault Diagnosis], the method provided by the invention improves the fault diagnosis accuracy on the target domain while ensuring the data security of the source domains.
Table 5 Experimental results
Figure 18643DEST_PATH_IMAGE065
Embodiment 2:
Embodiment 2 of the present disclosure provides a mechanical fault diagnosis system based on federated domain generalization, comprising:
a central server for initializing a global model, the central server comprising a global feature extraction network and a global classification network and exchanging information with a plurality of clients simultaneously; each client comprises a feature extraction network and a classification network, the N clients hold N source domain data sets, and the (N + 1)-th client holds the target domain data set.
The above description is only a preferred embodiment of the present application and is not intended to limit the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (10)

1. A mechanical fault diagnosis method based on federated domain generalization, characterized by comprising the following steps:
the central server randomly initializes the global model and sends the global model to all the clients;
each client independently trains the model using its own training data set;
the models trained by all the clients are sent to the central server, and all the model parameters are averaged in the central server to obtain a global model;
the central server sends the global model to all the clients, and the clients and the central server cooperatively train the global model;
and the central server sends the trained global model to the client containing the target domain data to complete fault diagnosis.
2. The mechanical fault diagnosis method based on federated domain generalization as claimed in claim 1, characterized in that
the central server sends the global model to all the clients, and the clients complete the following tasks:
calculating the classification loss based on the classification loss function and sending the classification loss to the central server;
acquiring the output features of the feature extraction network of each client, and sending the covariance matrix of the output features to the central server;
and calculating the invariant risk minimization loss based on the invariant risk minimization loss function and sending the invariant risk minimization loss to the central server.
3. The mechanical fault diagnosis method based on federated domain generalization as claimed in claim 2, characterized in that
the classification loss is
Figure 919227DEST_PATH_IMAGE001
and the classification loss function is:
Figure 150488DEST_PATH_IMAGE002
wherein Figure 863229DEST_PATH_IMAGE003 is the training data of the k-th client, Figure 990585DEST_PATH_IMAGE004 is the true label of the training data set of the k-th client, and Figure 562512DEST_PATH_IMAGE005 is the prediction result.
4. The mechanical fault diagnosis method based on federated domain generalization as claimed in claim 3, characterized in that
the invariant risk minimization loss function is as follows:
Figure 800726DEST_PATH_IMAGE006
wherein IRM is the invariant risk minimization loss;
Figure 633553DEST_PATH_IMAGE007 denotes gradient calculation;
Figure 931810DEST_PATH_IMAGE008 b represents the number of Figure 194296DEST_PATH_IMAGE009 and Figure 111567DEST_PATH_IMAGE010;
Figure 939846DEST_PATH_IMAGE011 and Figure 409005DEST_PATH_IMAGE012 are the input data;
Figure 548999DEST_PATH_IMAGE013 and Figure 129016DEST_PATH_IMAGE014 are the input labels;
Figure 77380DEST_PATH_IMAGE015 is the feature output after Figure 983019DEST_PATH_IMAGE011 passes through the feature extraction network Figure 16835DEST_PATH_IMAGE016;
Figure 400542DEST_PATH_IMAGE017 is the feature output after Figure 328047DEST_PATH_IMAGE018 passes through the feature extraction network Figure 545533DEST_PATH_IMAGE019;
Figure 660120DEST_PATH_IMAGE009 is the Figure 036054DEST_PATH_IMAGE021 group of features and labels of the Figure 113098DEST_PATH_IMAGE020-th client;
Figure 17917DEST_PATH_IMAGE010 is the Figure 220676DEST_PATH_IMAGE022 group of features and labels of the Figure 229586DEST_PATH_IMAGE020-th client;
Figure 857194DEST_PATH_IMAGE023 is the classification loss function, Figure 009958DEST_PATH_IMAGE016 is the feature extraction network, and Figure 240082DEST_PATH_IMAGE008 is a scalar.
5. The mechanical fault diagnosis method based on federated domain generalization as claimed in claim 4, characterized in that the central server receives the classification losses and invariant risk minimization losses of all clients and the covariance matrices of the features; the central server calculates the second-order statistical feature distance between the feature covariance matrices of every pair of clients, and obtains the feature distance metric loss based on a feature distance metric loss function.
6. The mechanical fault diagnosis method based on federated domain generalization as claimed in claim 5, characterized in that the feature distance metric loss function is as follows:
Figure 503704DEST_PATH_IMAGE024
wherein Figure 666832DEST_PATH_IMAGE025 is the feature distance metric loss, N is the number of clients, F denotes the Frobenius norm (F-norm) of a matrix, Figure 990497DEST_PATH_IMAGE026 is the size of the feature vector, Figure 707918DEST_PATH_IMAGE027 denotes the Figure 899864DEST_PATH_IMAGE027-th client, and Figure 651920DEST_PATH_IMAGE028 and Figure 146486DEST_PATH_IMAGE029 are the feature covariance matrices of any two clients.
7. The mechanical fault diagnosis method based on federated domain generalization as claimed in claim 6, characterized in that a global loss value is calculated on the central server based on a global loss function, and the global model of the central server is trained by back propagation based on the global loss value.
8. The mechanical fault diagnosis method based on federated domain generalization as claimed in claim 7, characterized in that the global loss function is as follows:
Figure 85623DEST_PATH_IMAGE030
wherein Figure 956627DEST_PATH_IMAGE031 are, respectively, the classification loss, the invariant risk minimization loss and the feature distance metric loss; Figure 422244DEST_PATH_IMAGE027 denotes the Figure 353291DEST_PATH_IMAGE027-th client; Figure 45303DEST_PATH_IMAGE032 is the weight of the classification loss of each client, where Figure 985577DEST_PATH_IMAGE033 is an integer between 1 and N, Figure 446646DEST_PATH_IMAGE034 is the loss value of the samples of the Figure 017435DEST_PATH_IMAGE033-th source domain data set, and N is the number of clients.
9. A mechanical fault diagnosis system based on federated domain generalization, characterized by comprising:
a central server and clients; the central server comprises a global feature extraction network and a global classification network, and the central server exchanges information with a plurality of clients simultaneously; the central server is also used for initializing a global model; the feature extraction network consists of three groups connected in series, each group comprising a one-dimensional convolutional layer, a batch normalization layer, a rectified linear unit (ReLU) layer and a one-dimensional max pooling layer; the classification network consists of a fully connected layer, a batch normalization layer, a rectified linear unit (ReLU) layer, a Dropout layer and a Softmax layer.
10. The system according to claim 9, characterized in that each client comprises a feature extraction network and a classification network, the N clients hold N source domain data sets, and the (N + 1)-th client holds the target domain data set; the feature extraction network consists of three groups connected in series, each group comprising a one-dimensional convolutional layer, a batch normalization layer, a rectified linear unit (ReLU) layer and a one-dimensional max pooling layer; the classification network consists of a fully connected layer, a batch normalization layer, a rectified linear unit (ReLU) layer, a Dropout layer and a Softmax layer.
CN202210738070.1A 2022-06-28 2022-06-28 Method and system for diagnosing mechanical fault based on federal domain generalization Active CN114818996B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210738070.1A CN114818996B (en) 2022-06-28 2022-06-28 Method and system for diagnosing mechanical fault based on federal domain generalization

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210738070.1A CN114818996B (en) 2022-06-28 2022-06-28 Method and system for diagnosing mechanical fault based on federal domain generalization

Publications (2)

Publication Number Publication Date
CN114818996A true CN114818996A (en) 2022-07-29
CN114818996B CN114818996B (en) 2022-10-11

Family

ID=82522738

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210738070.1A Active CN114818996B (en) 2022-06-28 2022-06-28 Method and system for diagnosing mechanical fault based on federal domain generalization

Country Status (1)

Country Link
CN (1) CN114818996B (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115328691A (en) * 2022-10-14 2022-11-11 山东大学 Fault diagnosis method, system, storage medium and equipment based on model difference
CN115952442A (en) * 2023-03-09 2023-04-11 山东大学 Global robust weighting-based federal domain generalized fault diagnosis method and system
CN116226784A (en) * 2023-02-03 2023-06-06 中国人民解放军92578部队 Federal domain adaptive fault diagnosis method based on statistical feature fusion
CN116992336A (en) * 2023-09-04 2023-11-03 南京理工大学 Bearing fault diagnosis method based on federal local migration learning
CN117172312A (en) * 2023-08-18 2023-12-05 南京理工大学 Equipment fault diagnosis method based on improved federal element learning
CN117938691A (en) * 2024-03-25 2024-04-26 山东科技大学 Industrial Internet of things fault diagnosis method, system, equipment and storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112560991A (en) * 2020-12-25 2021-03-26 中山大学 Personalized federal learning method based on hybrid expert model
CN112784872A (en) * 2020-12-25 2021-05-11 北京航空航天大学 Cross-working-condition fault diagnosis method based on open set joint migration learning
CN114186237A (en) * 2021-10-26 2022-03-15 北京理工大学 Truth-value discovery-based robust federated learning model aggregation method
CN114254700A (en) * 2021-12-06 2022-03-29 中国海洋大学 TBM hob fault diagnosis model construction method based on federal learning
CN114358286A (en) * 2022-03-08 2022-04-15 浙江中科华知科技股份有限公司 Mobile equipment federal learning method and system
CN114399055A (en) * 2021-12-28 2022-04-26 重庆大学 Domain generalization method based on federal learning
US20220138495A1 (en) * 2020-11-05 2022-05-05 University Of Electronic Science And Technology Of China Model and method for multi-source domain adaptation by aligning partial features

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220138495A1 (en) * 2020-11-05 2022-05-05 University Of Electronic Science And Technology Of China Model and method for multi-source domain adaptation by aligning partial features
CN112560991A (en) * 2020-12-25 2021-03-26 中山大学 Personalized federal learning method based on hybrid expert model
CN112784872A (en) * 2020-12-25 2021-05-11 北京航空航天大学 Cross-working-condition fault diagnosis method based on open set joint migration learning
CN114186237A (en) * 2021-10-26 2022-03-15 北京理工大学 Truth-value discovery-based robust federated learning model aggregation method
CN114254700A (en) * 2021-12-06 2022-03-29 中国海洋大学 TBM hob fault diagnosis model construction method based on federal learning
CN114399055A (en) * 2021-12-28 2022-04-26 重庆大学 Domain generalization method based on federal learning
CN114358286A (en) * 2022-03-08 2022-04-15 浙江中科华知科技股份有限公司 Mobile equipment federal learning method and system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
陈金龙等: "基于级联卷积神经网络的手势特征提取方法", 《计算机应用》 *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115328691A (en) * 2022-10-14 2022-11-11 山东大学 Fault diagnosis method, system, storage medium and equipment based on model difference
CN115328691B (en) * 2022-10-14 2023-03-03 山东大学 Fault diagnosis method, system, storage medium and equipment based on model difference
CN116226784A (en) * 2023-02-03 2023-06-06 中国人民解放军92578部队 Federal domain adaptive fault diagnosis method based on statistical feature fusion
CN115952442A (en) * 2023-03-09 2023-04-11 山东大学 Global robust weighting-based federal domain generalized fault diagnosis method and system
CN115952442B (en) * 2023-03-09 2023-06-13 山东大学 Global robust weighting-based federal domain generalized fault diagnosis method and system
CN117172312A (en) * 2023-08-18 2023-12-05 南京理工大学 Equipment fault diagnosis method based on improved federal element learning
CN116992336A (en) * 2023-09-04 2023-11-03 南京理工大学 Bearing fault diagnosis method based on federal local migration learning
CN116992336B (en) * 2023-09-04 2024-02-13 南京理工大学 Bearing fault diagnosis method based on federal local migration learning
CN117938691A (en) * 2024-03-25 2024-04-26 山东科技大学 Industrial Internet of things fault diagnosis method, system, equipment and storage medium
CN117938691B (en) * 2024-03-25 2024-05-31 山东科技大学 Industrial Internet of things fault diagnosis method, system, equipment and storage medium

Also Published As

Publication number Publication date
CN114818996B (en) 2022-10-11

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant