CN117893807A - Knowledge distillation-based federal self-supervision contrast learning image classification system and method - Google Patents
- Publication number: CN117893807A; Application number: CN202410047272.0A
- Authority: CN; Legal status: Granted
Classifications
- G06V10/764: Image or video recognition or understanding using pattern recognition or machine learning, using classification, e.g. of video objects
- G06V10/82: Image or video recognition or understanding using neural networks
- G06N3/042: Knowledge-based neural networks; logical representations of neural networks
- G06N3/0455: Auto-encoder networks; encoder-decoder networks
- G06N3/0464: Convolutional networks [CNN, ConvNet]
- G06N3/0895: Weakly supervised learning, e.g. semi-supervised or self-supervised learning
- G06N3/098: Distributed learning, e.g. federated learning
- Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management
- Y02T10/40: Engine management systems
Abstract
The invention discloses a federal self-supervision contrast learning image classification system and method based on knowledge distillation, belonging to the technical field of deep learning and computers, and specifically comprising the following steps: the central server randomly initializes the global model and sends it to each client, and the clients dynamically update their local models; clients participating in the aggregation are randomly selected, receive the global model parameters, dynamically update the local model parameters according to a divergence-aware method, perform self-supervised contrastive learning based on the SimCLR algorithm using a local image dataset, learn the structural knowledge of the global model through knowledge distillation, and upload the parameters to the central server; the central server performs a weighted average of the received model parameters based on the FedAvg algorithm according to the client data volume to obtain an aggregated global model, and sends the aggregated global model to each client; execution is repeated until each client obtains a converged global model for completing image classification. The invention has the advantages of good security, high efficiency and good accuracy.
Description
Technical Field
The invention belongs to the technical field of deep learning and computers, and particularly relates to a federal self-supervision contrast learning image classification system and method based on knowledge distillation.
Background
With the development of big data technology, data privacy and security are receiving increasing attention. Meanwhile, data in most industries exists in isolated "data islands", and how to cooperate on data across users while meeting user data privacy protection is a difficult problem; federal learning is a key technology for solving it.
For example, in an industrial scenario, different enterprises need to jointly build a fault detection model. Enterprise A has some fault data, but because its data volume is small, it wants to model jointly with more data, such as enterprise B's fault data.
For massive image data, traditional centralized machine learning gathers the data in a central location and then performs training, but centralized training has the following problems: 1) data security problems, because data must be transmitted to a centralized server; 2) when the data volume is very large, centralized learning faces long training times and high resource costs, which make it difficult to apply to tasks that require processing large amounts of data; 3) a large amount of unlabeled data at the edge cannot be utilized, wasting data resources.
The federal self-supervised learning paradigm aims to realize collaborative training of models at the network edge without centralizing the raw data, thereby greatly alleviating the data privacy problem. However, data heterogeneity between user devices can severely degrade the performance of traditional federal averaging. Meanwhile, a large amount of unlabeled data exists in actual scenes, and manually labeling this data is time-consuming and labor-intensive, and suffers from generalization errors, spurious correlations, adversarial vulnerability and similar problems. How to efficiently utilize such unlabeled data is also a hot spot of current research.
Disclosure of Invention
The invention aims to provide a federal self-supervision contrast learning image classification system and method based on knowledge distillation, which solve the problems that data heterogeneity between user devices severely degrades traditional federal averaging performance, and that manually labeling a large amount of unlabeled data is time-consuming and labor-intensive and suffers from generalization errors, spurious correlations and adversarial vulnerability.
In order to achieve the above purpose, the invention provides a federal self-supervision contrast learning image classification system based on knowledge distillation, which comprises an initialization module, a client selection module, a self-supervision contrast learning module based on knowledge distillation, a global model aggregation module and a model issuing module, wherein:
the initialization module randomly initializes the global model through the central server and sends the global model to each client;
the client selection module is used for randomly selecting clients participating in the aggregation;
the self-supervision contrast learning module based on knowledge distillation receives the global model parameters issued by the initialization module through each selected client, dynamically updates the parameters by using the divergence-aware technique DAU and uses them as the initialization parameters of the local model, with the specific expression:

θ_i^t = μ·θ^t + (1 − μ)·θ_i^(t−1)

wherein θ^t is the global parameter of the t-th round of aggregation, θ_i^(t−1) is the parameter after client i finishes round t−1 of local training, and μ is a value obtained from the KL divergence between θ^t and θ_i^(t−1); the module then uses the local image dataset to perform self-supervised contrastive learning based on the SimCLR algorithm while learning the structural knowledge of the global model through knowledge distillation, and the client uploads the shared-layer parameters to the central server after local training is completed;
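As an illustrative, non-limiting sketch of the divergence-aware update, the example below blends the global and previous local parameters. The exact mapping from the KL divergence to μ is not specified above, so the normalized-magnitude KL proxy, the exponential form, and the gamma knob are assumptions of this example only.

```python
import numpy as np

def dau_update(theta_global, theta_local, gamma=1.0):
    """Divergence-aware update (DAU) sketch: blend global and previous local
    parameters, weighting the global model less when it has diverged far from
    the local one.  The divergence-to-mu mapping below is an illustrative
    assumption; the text only states that mu is derived from a KL divergence
    between the two parameter sets."""
    # Treat normalized absolute parameter magnitudes as distributions for a KL proxy.
    p = np.abs(theta_global) / np.sum(np.abs(theta_global))
    q = np.abs(theta_local) / np.sum(np.abs(theta_local))
    kl = float(np.sum(p * np.log((p + 1e-12) / (q + 1e-12))))
    mu = float(np.exp(-gamma * kl))  # large divergence -> small weight on global
    return mu * theta_global + (1.0 - mu) * theta_local
```

When the two parameter sets coincide, the divergence is zero and the update simply keeps the global parameters.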
the global model aggregation module is used for performing a weighted average of the received model parameters through the central server based on the FedAvg algorithm according to the client data volume to obtain an aggregated global model;
the model issuing module issues the aggregated global model to each client through the central server;
and repeatedly executing the client selection module, the knowledge distillation-based self-supervision comparison learning module, the global model aggregation module and the model issuing module until a converged global model is obtained and is used for completing image classification.
The invention also provides a federal self-supervision contrast learning image classification method based on knowledge distillation, which comprises the following steps:
step 1, a central server randomly initializes a global model and sends the global model to each client;
step 2, randomly selecting the clients participating in the aggregation;
step 3, each selected client receives the global model parameters issued in step 1, dynamically updates the parameters by using the divergence-aware technique DAU, uses them as initialization parameters of local model training, performs self-supervised contrastive learning based on the SimCLR algorithm using a local image dataset while learning the structural knowledge of the global model through knowledge distillation, and uploads the shared-layer parameters and data volume to the central server after local training is completed;
step 4, the central server performs weighted average on the received model parameters based on FedAvg algorithm according to the client data volume to obtain an aggregated global model;
step 5, the central server transmits the aggregated global model to each client;
and 6, repeatedly executing the steps 2 to 5 until a converged global model is obtained and is used for completing image classification.
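The server-client loop of steps 1 to 6 can be sketched schematically as below; this is an illustrative, non-limiting example in which the local SimCLR-plus-distillation training of step 3 is replaced by a trivial placeholder update, and the client count, parameter dimension, round count and participation fraction are assumed values.

```python
import numpy as np

rng = np.random.default_rng(0)

def local_train(theta):
    """Placeholder for step 3 (SimCLR contrastive training with knowledge
    distillation); a trivial shrinkage stands in for gradient updates."""
    return 0.99 * theta

def federated_rounds(n_clients=4, dim=8, n_rounds=3, frac=0.5):
    theta = rng.normal(size=dim)                               # step 1: random global init
    data_sizes = rng.integers(50, 200, size=n_clients)
    for _ in range(n_rounds):                                  # step 6: repeat
        k = max(1, int(frac * n_clients))
        chosen = rng.choice(n_clients, size=k, replace=False)  # step 2: select clients
        updates = [local_train(theta) for _ in chosen]         # step 3: local training
        w = data_sizes[chosen] / data_sizes[chosen].sum()
        theta = sum(wi * ui for wi, ui in zip(w, updates))     # steps 4-5: FedAvg + issue
    return theta
```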
Preferably, in step 1, the global model consists of an encoder with a ResNet18 network structure and a two-layer MLP projection head.
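For orientation, the two-layer MLP projection head alone can be sketched as below; this is a non-limiting example in which the hidden and output dimensions (512 and 128) and the random initialization are assumptions, and the ResNet18 encoder itself is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp_projection_head(h, d_hidden=512, d_out=128):
    """Two-layer MLP projection head sketch: Linear -> ReLU -> Linear,
    applied to encoder features h of shape (batch, d_in)."""
    d_in = h.shape[1]
    w1 = rng.normal(scale=d_in ** -0.5, size=(d_in, d_hidden))
    w2 = rng.normal(scale=d_hidden ** -0.5, size=(d_hidden, d_out))
    return np.maximum(h @ w1, 0.0) @ w2
```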
Preferably, in step 3, self-supervision contrast learning is performed based on SimCLR algorithm, specifically:
s31, under the condition that no data label exists, the SimCLR learns and expresses by maximizing the consistency of the same data under different enhancements through the contrast loss of the hidden space, and the loss function is as follows:
wherein z is i ,z j Is the output of the image i, j through the model, sim (z i ,z j ) Representing z i ,z j Cosine similarity, τ is the temperature coefficient of contrast learning, 2N is the number of data points obtained by pairwise enhancement of a small sample N, z k Is the output of the image k through the model, l i,j Is the loss of positive sample pair (i, j);
s32, randomly extracting a small batch of samples containing N samples, and obtaining paired enhanced samples in the small batch of samples, wherein the total number of the enhanced samples is 2N; image i and its enhanced samples j form a positive sample pair (i, j), the loss function is shown in the above equation, and finally the loss function on the batch of samplesTo calculate the average of the losses for all positive sample pairs in 2N data points.
Preferably, in step 3, knowledge distillation is used at the same time to learn the structural knowledge of the global model, specifically: a portion of a public dataset D_P is used for distillation to learn the structural knowledge of the global model, with the loss function:

L_KD = L_angle + L_dist

wherein knowledge distillation comprises two parts: an angle-based loss L_angle and a distance-based loss L_dist; (t_1, t_2, …, t_n) is the output of the public dataset through the global model, and (s_1, s_2, …, s_n) is the output of the public dataset through the local model; the distance potential ψ_D(t_i, t_j) = (1/w)·‖t_i − t_j‖_2 represents the distance relationship between outputs, with w a normalization factor, while the angle-based part compares the corresponding potentials formed by triples of outputs; each partial loss accumulates a penalty l_δ(x, y) between a relational potential x computed directly from the outputs of the global model and the corresponding potential y computed from the outputs of the local model, where v_1, v_2, …, v_n refer generically to the outputs of the global model or the local model.
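The distance-based part of this relational loss can be sketched in NumPy as below; this is an illustrative, non-limiting example in which the Huber penalty for l_δ and the mean pairwise distance for the normalization factor w follow the usual relational-distillation convention and are assumptions here, and the angle-based part is omitted for brevity.

```python
import numpy as np

def huber(x, y, delta=1.0):
    """Huber penalty standing in for l_delta."""
    d = np.abs(x - y)
    return np.where(d <= delta, 0.5 * d**2, delta * (d - 0.5 * delta))

def rkd_distance_loss(t, s):
    """Distance-based relational distillation sketch: match pairwise-distance
    potentials of teacher (global model) outputs t and student (local model)
    outputs s on a shared batch."""
    def potentials(v):
        d = np.linalg.norm(v[:, None, :] - v[None, :, :], axis=-1)
        iu = np.triu_indices(len(v), k=1)   # unique pairs only
        d = d[iu]
        return d / d.mean()                 # w = mean pairwise distance (assumption)
    return huber(potentials(t), potentials(s)).mean()
```

Identical teacher and student outputs yield zero loss, since their relational potentials coincide.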
Preferably, in step 3, self-supervised contrastive learning is performed based on the SimCLR algorithm using the local image dataset while the structural knowledge of the global model is learned by knowledge distillation, specifically:

after client i dynamically updates the global model in the t-th communication round, the local model is trained on the local data:

θ_i^t ← θ_i^t − η·∇( L_con + λ(t)·L_KD )

wherein L_con and L_KD are the contrastive and distillation loss functions, ∇ is the gradient operator, η is the learning step size of the optimizer, θ_i^t is initialized with the parameters of client i's t-th training round, and λ(t) is the coefficient of knowledge distillation, computed from σ²(t) and γ, wherein σ²(t) is the variance of the client local losses computed in the t-th round and γ is a configurable parameter.
Preferably, in step 4, the central server performs weighted average on the received model parameters according to the client data volume based on the FedAvg algorithm to obtain an aggregated global model, and the adopted formula is as follows:
θ^(t+1) = Σ_{i∈S_t} ( n_i / Σ_{j∈S_t} n_j ) · θ_i^t

wherein S_t is the set of clients participating in federal training in the t-th round, n_i is the amount of data of client i, θ_i^t is the model parameter after client i's t-th round of local training, and θ^(t+1) is the initialization parameter of round t+1 of federal learning.
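Step 4 reduces to the following weighted average; a minimal sketch in which client parameters are treated as flat NumPy arrays.

```python
import numpy as np

def fedavg(params, data_sizes):
    """FedAvg aggregation sketch: weighted average of client parameters,
    weighted by each client's data volume."""
    data_sizes = np.asarray(data_sizes, dtype=float)
    weights = data_sizes / data_sizes.sum()
    return sum(w * np.asarray(p) for w, p in zip(weights, params))
```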
The invention also provides a mobile terminal, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor realizes the federal self-supervision contrast learning image classification method based on knowledge distillation when executing the program.
Therefore, the federal self-supervision contrast learning image classification system and method based on knowledge distillation have the following beneficial effects:
(1) The centralized model training part in the Internet of things is transferred to the edge equipment, so that local data of all mechanisms are combined, and meanwhile, the calculation load of a cloud end or a server end is reduced;
(2) Privacy data is always kept in the local area of the edge equipment, so that the safety of the data can be improved;
(3) For situations where a large amount of private data in actual scenes lacks labels, labeling cost is reduced through self-supervised contrastive learning;
(4) For the problem of unbalanced data among all mechanisms in an actual application scene, global model knowledge is effectively learned through structural knowledge distillation, and the problem of model divergence is restrained, so that the quality of a final model is improved;
(5) After training is finished, a global model applicable to all clients can be obtained.
The technical scheme of the invention is further described in detail through the drawings and the embodiments.
Drawings
FIG. 1 is a flow chart of a federal self-supervised contrast learning image classification method based on knowledge distillation of the present invention;
FIG. 2 is a schematic diagram of a system for implementing a model training process in accordance with the present invention;
FIG. 3 is a graph comparing the performance of the method of the present invention and the conventional method in the examples of the present invention.
Detailed Description
The following detailed description of the embodiments of the invention, provided in the accompanying drawings, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The federal self-supervision contrast learning image classification system based on knowledge distillation comprises an initialization module, a client selection module, a self-supervision contrast learning module based on knowledge distillation, a global model aggregation module and a model issuing module, wherein: the initialization module randomly initializes the global model through the central server and sends the global model to each client; the client selection module is used for randomly selecting the clients participating in the aggregation; the self-supervision contrast learning module based on knowledge distillation receives the global model parameters issued by the initialization module through each selected client, dynamically updates the parameters by using the divergence-aware technique DAU and uses them as the initialization parameters of the local model, with the specific expression θ_i^t = μ·θ^t + (1 − μ)·θ_i^(t−1), wherein θ^t is the global parameter of the t-th round of aggregation, θ_i^(t−1) is the parameter after client i finishes round t−1 of local training, and μ is a value obtained from the KL divergence between θ^t and θ_i^(t−1); the module then uses the local image dataset to perform self-supervised contrastive learning based on the SimCLR algorithm while learning the structural knowledge of the global model through knowledge distillation, and the client uploads the shared-layer parameters to the central server after local training is completed; the global model aggregation module performs a weighted average of the received model parameters through the central server based on the FedAvg algorithm according to the client data volume to obtain an aggregated global model; the model issuing module issues the aggregated global model to each client through the central server; the client selection module, the knowledge distillation-based self-supervision contrast learning module, the global model aggregation module and the model issuing module are executed repeatedly until a converged global model is obtained for completing image classification.
Referring to fig. 1, a federal self-supervision contrast learning image classification method based on knowledge distillation includes the following steps:
step 1, a central server randomly initializes a global model and transmits it to each client, wherein the global model consists of an encoder with a ResNet18 network structure and a two-layer MLP projection head;
step 2, randomly selecting the clients participating in the aggregation;
step 3, each selected client receives the global model parameters issued in step 1, dynamically updates the parameters by using the divergence-aware technique DAU, uses them as initialization parameters of local model training, performs self-supervised contrastive learning based on the SimCLR algorithm using a local image dataset while learning the structural knowledge of the global model through knowledge distillation, and uploads the shared-layer parameters and data volume to the central server after local training is completed;
the self-supervision contrast learning is performed based on the SimCLR algorithm, and specifically comprises the following steps:
s31, under the condition that no data label exists, the SimCLR learns and expresses by maximizing the consistency of the same data under different enhancements through the contrast loss of the hidden space, and the loss function is as follows:
wherein z is i ,z j Is the output of the image i, j through the model, sim (z i ,z j ) Representing z i ,z j Cosine similarity, τ is the temperature coefficient of contrast learning, 2N is the number of data points obtained by pairwise enhancement of a small sample N, z k Is the output of the image k through the model, l i,j Is the loss of positive sample pair (i, j);
s32, randomly extracting a small batch of samples containing N samples, and obtaining paired enhanced samples in the small batch of samples, wherein the total number of the enhanced samples is 2N; image i and its enhanced samples j form a positive sample pair (i, j), the loss function is shown in the above equation, and finally the loss function on the batch of samplesTo calculate the average of the losses for all positive sample pairs in 2N data points.
The structural knowledge of the global model is learned by knowledge distillation, specifically: a portion of a public dataset D_P is used for distillation to learn the structural knowledge of the global model, with the loss function:

L_KD = L_angle + L_dist

wherein knowledge distillation comprises two parts: an angle-based loss L_angle and a distance-based loss L_dist; (t_1, t_2, …, t_n) is the output of the public dataset through the global model, and (s_1, s_2, …, s_n) is the output of the public dataset through the local model; the distance potential ψ_D(t_i, t_j) = (1/w)·‖t_i − t_j‖_2 represents the distance relationship between outputs, with w a normalization factor, while the angle-based part compares the corresponding potentials formed by triples of outputs; each partial loss accumulates a penalty l_δ(x, y) between a relational potential x computed directly from the outputs of the global model and the corresponding potential y computed from the outputs of the local model, where v_1, v_2, …, v_n refer generically to the outputs of the global model or the local model.
Self-supervised contrastive learning is carried out based on the SimCLR algorithm using a local image dataset while the structural knowledge of the global model is learned by knowledge distillation, specifically:

after client i dynamically updates the global model in the t-th communication round, the local model is trained on the local data:

θ_i^t ← θ_i^t − η·∇( L_con + λ(t)·L_KD )

wherein L_con and L_KD are the contrastive and distillation loss functions, ∇ is the gradient operator, η is the learning step size of the optimizer, θ_i^t is initialized with the parameters of client i's t-th training round, and λ(t) is the coefficient of knowledge distillation, computed from σ²(t) and γ, wherein σ²(t) is the variance of the client local losses computed in the t-th round and γ is a configurable parameter.
Step 4, the central server performs weighted average on the received model parameters based on FedAvg algorithm according to the client data volume to obtain an aggregated global model; the formula used is as follows:
θ^(t+1) = Σ_{i∈S_t} ( n_i / Σ_{j∈S_t} n_j ) · θ_i^t

wherein S_t is the set of clients participating in federal training in the t-th round, n_i is the amount of data of client i, θ_i^t is the model parameter after client i's t-th round of local training, and θ^(t+1) is the initialization parameter of round t+1 of federal learning.
Step 5, the central server transmits the aggregated global model to each client;
and 6, repeatedly executing the steps 2 to 5 until a converged global model is obtained and is used for completing image classification.
The invention also provides a mobile terminal, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor realizes the federal self-supervision contrast learning image classification method based on knowledge distillation when executing the program.
Examples
The embodiment provides a federal self-supervision contrast learning image classification method based on structural knowledge distillation. Federal learning, as a distributed machine learning technique, allows multiple users to train cooperatively; it can effectively help multiple institutions to use data and build machine learning models while meeting the requirements of user privacy protection, data security and government regulations, can effectively solve the data island problem, enables participants to model jointly without sharing data, and thus technically breaks data islands and realizes artificial intelligence (AI) cooperation. Self-supervised learning automatically generates the labels or categories of training samples from the structure of the data or other information, avoiding the large amount of manual labeling required by traditional supervised learning.
The federal self-supervision contrast learning image classification method based on structural knowledge distillation builds on the intuition that a client's local model contains knowledge of its local data: because heterogeneous data characteristics across clients cause the model to gradually deviate from the global model during local training, the initialization parameters are dynamically updated through divergence-aware updating, and the influence of non-IID data on the model is alleviated through structural knowledge distillation.
By aggregating only the parameters of each enterprise's fault prediction model, federal learning avoids accessing the private data of different enterprises, effectively protecting enterprise privacy while fusing and analyzing the fault data of different enterprises. The specific process is as follows:
(1) Each enterprise uses the same initial model and network architecture to train in the centralized federal learning framework, and the central server randomly initializes the global model and issues the global model to each enterprise;
(2) After each enterprise receives an initial global model issued by a server, training the local model by using local fault data under the condition of no label and utilizing federal self-supervision contrast learning based on structural knowledge distillation, and uploading local model parameters and data volume to the server after each enterprise completes training;
(3) After receiving the local model parameters, the central server performs weighted average according to the data volume to obtain aggregated global model parameters;
(4) The enterprise receives the aggregated global model parameters, dynamically updates the initialization model parameters using the divergence-aware method, and performs the next round of training. The above process is repeated until a converged global model is obtained. The training process is shown in fig. 2.
The invention is applicable to a federal learning framework that includes a central server with computing functionality and a plurality of participants. By aggregating only the shared layer model parameters of the client, each enterprise can get a personalized local model.
According to the invention, in real industrial scenes where a large amount of unlabeled data exists, the federal self-supervised learning framework efficiently utilizes the large amount of unlabeled data of each distributed user to train feature extractors, improving model performance while ensuring data security and privacy. The method solves the problems in the prior art that supervised learning cannot be performed when labeled data is scarce and that the local data volume of participants is insufficient to train a model with high accuracy and strong applicability. As can be seen from fig. 3, the invention greatly improves model accuracy compared with conventional federal learning algorithms that train a single global model.
Therefore, by adopting the knowledge distillation-based federated self-supervised contrastive learning image classification system and method, the performance of enterprises' federated models is improved in scenarios with too few industrial fault data labels and heterogeneous industrial privacy data, and safe, effective sharing of private data among enterprises is realized.
Finally, it should be noted that the above embodiments merely illustrate, rather than limit, the technical solution of the present invention. Although the invention has been described in detail with reference to the preferred embodiments, those skilled in the art will understand that the technical solution may be modified or equivalently replaced without departing from its spirit and scope.
Claims (8)
1. A federated self-supervised contrastive learning image classification system based on knowledge distillation, characterized in that the system comprises an initialization module, a client selection module, a knowledge distillation-based self-supervised contrastive learning module, a global model aggregation module, and a model issuing module, wherein:
the initialization module randomly initializes the global model through the central server and sends the global model to each client;
the client selection module is used for randomly selecting clients participating in the aggregation;
the knowledge distillation-based self-supervised contrastive learning module, through each selected client, receives the global model parameters issued by the initialization module and dynamically updates them with the divergence-aware update technique DAU to obtain the initialization parameters of the local model, the specific expression being $\hat{\theta}_i^t = \mu\,\theta^t + (1-\mu)\,\theta_i^{t-1}$, wherein $\theta^t$ is the global parameter of the t-th aggregation round, $\theta_i^{t-1}$ is the parameter of client i after the (t-1)-th round of local training, and $\mu$ is a value obtained from the KL divergence between $\theta^t$ and $\theta_i^{t-1}$; self-supervised contrastive learning based on the SimCLR algorithm is then performed on the local image dataset while knowledge distillation is used to learn the structural knowledge of the global model, and after local training is completed the client uploads the shared-layer parameters to the central server;
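The divergence-aware update (DAU) described above can be sketched as follows. The claim states only that μ is obtained from the KL divergence of the global and previous local parameters, so the specific mapping μ = exp(−β·KL) below, and the softmax normalization used to treat parameter vectors as distributions, are assumptions made for illustration:

```python
import numpy as np

def dau_update(theta_global, theta_local, beta=1.0):
    """Divergence-aware update sketch: mix the global parameters and the
    previous local parameters with a weight mu derived from their KL
    divergence. mu = exp(-beta * KL) is an assumed mapping, not the claim's."""
    # interpret softmax-normalized parameter vectors as distributions
    p = np.exp(theta_global) / np.exp(theta_global).sum()
    q = np.exp(theta_local) / np.exp(theta_local).sum()
    kl = float(np.sum(p * np.log(p / q)))
    mu = float(np.exp(-beta * kl))       # identical models: kl = 0, mu = 1
    return mu * theta_global + (1.0 - mu) * theta_local

# when global and local parameters agree, the update returns them unchanged
init = dau_update(np.array([0.5, 0.5]), np.array([0.5, 0.5]))
```

The larger the divergence between the two models, the smaller μ becomes, so a client that has drifted far from the global model keeps more of its own parameters.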
the global model aggregation module performs, through the central server and based on the FedAvg algorithm, a weighted average of the received model parameters according to client data volume to obtain the aggregated global model;
the model issuing module issues the aggregated global model to each client through the central server;
and the client selection module, the knowledge distillation-based self-supervised contrastive learning module, the global model aggregation module, and the model issuing module are executed repeatedly until a converged global model is obtained for completing image classification.
2. A federated self-supervised contrastive learning image classification method based on knowledge distillation, characterized by comprising the following steps:
step 1, a central server randomly initializes a global model and sends the global model to each client;
step 2, randomly selecting the clients participating in the aggregation;
step 3, each selected client receives the global model parameters issued in step 1, dynamically updates them with the divergence-aware update technique DAU, and uses the result as the initialization parameters for local model training; self-supervised contrastive learning based on the SimCLR algorithm is performed on the local image dataset while knowledge distillation is used to learn the structural knowledge of the global model; after local training is completed, the shared-layer parameters and the data volume are uploaded to the central server;
step 4, the central server performs weighted average on the received model parameters based on FedAvg algorithm according to the client data volume to obtain an aggregated global model;
step 5, the central server transmits the aggregated global model to each client;
step 6, repeating steps 2 to 5 until a converged global model is obtained for completing image classification.
3. The federated self-supervised contrastive learning image classification method based on knowledge distillation of claim 2, wherein: in step 1, the global model consists of an encoder with a ResNet-18 network structure and a two-layer MLP projection head.
4. The federated self-supervised contrastive learning image classification method based on knowledge distillation of claim 2, wherein in step 3 the self-supervised contrastive learning based on the SimCLR algorithm is specifically:
S31, without data labels, SimCLR learns representations by maximizing, through a contrastive loss in the latent space, the agreement between different augmentations of the same data; the loss for a positive pair is:

$$\ell_{i,j} = -\log \frac{\exp\big(\mathrm{sim}(z_i, z_j)/\tau\big)}{\sum_{k=1}^{2N} \mathbb{1}_{[k \neq i]} \exp\big(\mathrm{sim}(z_i, z_k)/\tau\big)}$$

wherein $z_i, z_j$ are the outputs of images i, j through the model, $\mathrm{sim}(z_i, z_j)$ denotes the cosine similarity of $z_i$ and $z_j$, $\tau$ is the temperature coefficient of contrastive learning, $2N$ is the number of data points obtained by pairwise augmentation of a mini-batch of N samples, $z_k$ is the output of image k through the model, and $\ell_{i,j}$ is the loss of the positive pair (i, j);
S32, a mini-batch of N samples is drawn at random and augmented pairwise, giving 2N augmented samples in total; image i and its augmented counterpart j form a positive pair (i, j) with the loss shown above, and finally the loss $\mathcal{L}$ over the mini-batch is the average of the losses of all positive pairs among the 2N data points.
5. The federated self-supervised contrastive learning image classification method based on knowledge distillation of claim 2, wherein in step 3 the structural knowledge of the global model is learned with knowledge distillation, specifically: a portion of a public dataset $D_P$ is used for distillation, and the structural knowledge of the global model is learned with the loss:

$$\mathcal{L}_{KD} = \mathcal{L}_{A} + \mathcal{L}_{D}$$

wherein knowledge distillation comprises two parts, an angle-based loss $\mathcal{L}_{A}$ and a distance-based loss $\mathcal{L}_{D}$; $(t_1, t_2, \dots, t_n)$ are the outputs of the public dataset through the global model and $(s_1, s_2, \dots, s_n)$ the outputs through the local model; the relational potential $\psi(v_1, \dots, v_n)$ represents the distance relationship between outputs, with $w$ a normalization factor; $v_1, v_2, \dots, v_n$ refer generically to the outputs of the global model or the local model, $x$ denotes the relational potential computed directly from the outputs of the global model, and $y$ the relational potential computed from the outputs of the local model, each loss term penalizing the difference between $x$ and $y$.
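The distance-based relational term of claim 5 can be illustrated as follows: pairwise distances of the global-model outputs t and local-model outputs s are each normalized by their mean pairwise distance w, and the resulting potentials x and y are compared. The squared-error penalty is an assumption (the claim's exact penalty is not recoverable from the text), and the angle-based term is omitted for brevity:

```python
import numpy as np

def rkd_distance_loss(t, s):
    """Distance-based structural distillation sketch: match the pairwise
    distance structure of teacher outputs t and student outputs s."""
    def potentials(v):
        # all pairwise Euclidean distances between output vectors
        d = np.linalg.norm(v[:, None, :] - v[None, :, :], axis=-1)
        w = d[d > 0].mean()              # normalization factor w: mean distance
        return d / w
    x, y = potentials(t), potentials(s)  # relational potentials of each model
    return float(np.mean((x - y) ** 2))  # assumed squared-error penalty
```

Because each potential is normalized by its own mean distance, the loss compares the *shape* of the two output geometries: a student whose outputs are a uniformly scaled copy of the teacher's incurs zero loss.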
6. The federated self-supervised contrastive learning image classification method based on knowledge distillation of claim 2, wherein in step 3 the self-supervised contrastive learning based on the SimCLR algorithm on the local image dataset and the learning of the structural knowledge of the global model with knowledge distillation are specifically:
after client i dynamically updates the global model in the t-th communication round, the local model is trained on the local data:

$$\theta_i^t \leftarrow \hat{\theta}_i^t - \eta\, \nabla\big(\mathcal{L}_{con} + \lambda\, \mathcal{L}_{KD}\big)$$

wherein $\mathcal{L}_{con}$ and $\mathcal{L}_{KD}$ are the contrastive and distillation loss functions, $\nabla$ is the gradient operator, $\eta$ is the learning step of the optimizer, $\hat{\theta}_i^t$ is the initialization parameter of the t-th training round of client i, and $\lambda$ is the knowledge-distillation coefficient, determined by $\sigma^2(t)$, the variance of the client local losses computed in the t-th round, and a configurable parameter $\gamma$.
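The local update of claim 6 combines the contrastive gradient with a distillation gradient weighted by λ. A minimal sketch follows; since the exact formula tying λ to σ²(t) is not recoverable from the text, λ = γ·var(losses) is an assumption, as are the function and argument names:

```python
import numpy as np

def local_step(theta, grad_con, grad_kd, round_losses, eta=0.1, gamma=0.5):
    """One local SGD step: theta <- theta - eta * grad(L_con + lam * L_KD).
    lam = gamma * sigma^2(t) is assumed; the claim only names the symbols."""
    lam = gamma * np.var(round_losses)   # sigma^2(t): variance of local losses
    return theta - eta * (grad_con + lam * grad_kd)

# when all local losses agree, sigma^2 = 0 and distillation is switched off
out = local_step(np.array([1.0, 2.0]), np.array([0.5, 0.5]),
                 np.array([9.9, 9.9]), [0.25, 0.25, 0.25])
```

Under this reading, a client whose local losses fluctuate strongly distills more from the global model, while a stable client trains mostly on its own contrastive objective.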
7. The federated self-supervised contrastive learning image classification method based on knowledge distillation of claim 2, wherein in step 4 the central server performs a weighted average of the received model parameters, based on the FedAvg algorithm and according to client data volume, to obtain the aggregated global model, using the formula:

$$\theta^{t+1} = \sum_{i \in S_t} \frac{n_i}{\sum_{j \in S_t} n_j}\, \theta_i^t$$

wherein $S_t$ is the set of clients participating in federated training in the t-th round, $n_i$ is the data volume of client i, $\theta_i^t$ is the model parameter of client i after the t-th round of local training, and $\theta^{t+1}$ is the initialization parameter of the (t+1)-th round of federated learning.
8. A mobile terminal comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, characterized in that when executing the program the processor implements the knowledge distillation-based federated self-supervised contrastive learning image classification method of any one of claims 2-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410047272.0A CN117893807B (en) | 2024-01-12 | 2024-01-12 | Knowledge distillation-based federal self-supervision contrast learning image classification system and method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117893807A true CN117893807A (en) | 2024-04-16 |
CN117893807B CN117893807B (en) | 2024-06-25 |
Family
ID=90646396
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410047272.0A Active CN117893807B (en) | 2024-01-12 | 2024-01-12 | Knowledge distillation-based federal self-supervision contrast learning image classification system and method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117893807B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118070077A (en) * | 2024-04-25 | 2024-05-24 | 山东大学 | Fault diagnosis method and system based on federal learning and dual-supervision contrast learning |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116664930A (en) * | 2023-05-29 | 2023-08-29 | 南京理工大学 | Personalized federal learning image classification method and system based on self-supervision contrast learning |
CN116665000A (en) * | 2023-05-29 | 2023-08-29 | 河南大学 | Federal learning algorithm based on diffusion model and weight self-adaptive knowledge distillation |
CN117292221A (en) * | 2023-09-26 | 2023-12-26 | 山东省计算中心(国家超级计算济南中心) | Image recognition method and system based on federal element learning |
Non-Patent Citations (3)
Title |
---|
ATHANASIOS PSALTIS等: "FedRCIL: Federated Knowledge Distillation for Representation based Contrastive Incremental Learning", 《2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW)》, 25 December 2023 (2023-12-25), pages 3463 - 3472 * |
RAN WU等: "ADCL: Adversarial Distilled Contrastive Learning on lightweight models for self-supervised image classification", 《KNOWLEDGE-BASED SYSTEMS》, vol. 278, 25 October 2023 (2023-10-25), pages 1 - 11 * |
CHEN Xuebin et al.: "PFKD: A personalized federated learning framework considering both data heterogeneity and model heterogeneity", Journal of Nanjing University of Information Science & Technology (Natural Science Edition), 26 October 2023 (2023-10-26), pages 1 - 10 *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2021063171A1 (en) | Decision tree model training method, system, storage medium, and prediction method | |
Gou et al. | Multilevel attention-based sample correlations for knowledge distillation | |
CN117893807B (en) | Knowledge distillation-based federal self-supervision contrast learning image classification system and method | |
CN111932386B (en) | User account determining method and device, information pushing method and device, and electronic equipment | |
Gao | Network intrusion detection method combining CNN and BiLSTM in cloud computing environment | |
Zhang et al. | Towards data-independent knowledge transfer in model-heterogeneous federated learning | |
CN115686868B (en) | Cross-node-oriented multi-mode retrieval method based on federated hash learning | |
WO2023087549A1 (en) | Efficient, secure and less-communication longitudinal federated learning method | |
CN114091667A (en) | Federal mutual learning model training method oriented to non-independent same distribution data | |
CN116664930A (en) | Personalized federal learning image classification method and system based on self-supervision contrast learning | |
CN116310385A (en) | Single data set domain generalization method in 3D point cloud data | |
CN115631008A (en) | Commodity recommendation method, commodity recommendation device, commodity recommendation equipment and commodity recommendation medium | |
CN113033652A (en) | Image recognition system and method based on block chain and federal learning | |
Hu et al. | A novel federated learning approach based on the confidence of federated Kalman filters | |
CN116187469A (en) | Client member reasoning attack method based on federal distillation learning framework | |
CN117914690A (en) | Edge node network fault prediction method based on deep learning GCN-LSTM | |
Caiqian et al. | Multimedia system and database simulation based on internet of things and cloud service platform | |
CN116702976A (en) | Enterprise resource prediction method and device based on modeling dynamic enterprise relationship | |
CN116541792A (en) | Method for carrying out group partner identification based on graph neural network node classification | |
Sah et al. | Aggregation techniques in federated learning: Comprehensive survey, challenges and opportunities | |
CN113886547B (en) | Client real-time dialogue switching method and device based on artificial intelligence and electronic equipment | |
CN115457365A (en) | Model interpretation method and device, electronic equipment and storage medium | |
Hong et al. | Retracted: Artificial intelligence point‐to‐point signal communication network optimization based on ubiquitous clouds | |
Liu | Simulation Training Auxiliary Model Based on Neural Network and Virtual Reality Technology | |
Feng et al. | Intelligent Evaluation Mechanism for Cloud-Edge-End based Next Generation Ship Simulator towards Maritime Pilot Training |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||