CN117893807A - Knowledge distillation-based federated self-supervised contrastive learning image classification system and method


Info

Publication number
CN117893807A
CN117893807A (application CN202410047272.0A)
Authority
CN
China
Prior art keywords
model
global model
client
self
local
Prior art date
Legal status
Granted
Application number
CN202410047272.0A
Other languages
Chinese (zh)
Other versions
CN117893807B (en)
Inventor
李骏
罗丹
夏鹏程
崔继轩
Current Assignee
Nanjing University of Science and Technology
Original Assignee
Nanjing University of Science and Technology
Priority date
Filing date
Publication date
Application filed by Nanjing University of Science and Technology
Priority to CN202410047272.0A
Publication of CN117893807A
Application granted
Publication of CN117893807B
Legal status: Active


Classifications

    • G06V 10/764: Image or video recognition or understanding using pattern recognition or machine learning, using classification, e.g. of video objects
    • G06V 10/82: Image or video recognition or understanding using pattern recognition or machine learning, using neural networks
    • G06N 3/042: Knowledge-based neural networks; logical representations of neural networks
    • G06N 3/0455: Auto-encoder networks; encoder-decoder networks
    • G06N 3/0464: Convolutional networks [CNN, ConvNet]
    • G06N 3/0895: Weakly supervised learning, e.g. semi-supervised or self-supervised learning
    • G06N 3/098: Distributed learning, e.g. federated learning
    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management
    • Y02T 10/40: Engine management systems


Abstract

The invention discloses a federated self-supervised contrastive learning image classification system and method based on knowledge distillation, belonging to the technical field of deep learning and computers. The method specifically comprises the following steps: the central server randomly initializes the global model and sends it to each client; clients participating in the current round of aggregation are randomly selected and receive the global model parameters; each selected client dynamically updates its local model parameters according to a divergence-aware method, performs self-supervised contrastive learning based on the SimCLR algorithm on its local image dataset, learns the structural knowledge of the global model through knowledge distillation, and uploads its parameters to the central server; the central server computes a weighted average of the received model parameters according to each client's data volume based on the FedAvg algorithm to obtain an aggregated global model, and sends the aggregated global model to each client; these steps are repeated until each client obtains a converged global model for completing image classification. The invention offers good security, high efficiency and good accuracy.

Description

Knowledge distillation-based federated self-supervised contrastive learning image classification system and method
Technical Field
The invention belongs to the technical field of deep learning and computers, and particularly relates to a federated self-supervised contrastive learning image classification system and method based on knowledge distillation.
Background
With the development of big data technology, attention to data privacy and security has become a major trend. At the same time, data in most industries exists in isolated silos, and enabling cross-user data collaboration while satisfying users' data privacy requirements is a difficult problem; federated learning is a key technology for solving it.
For example, in an industrial scenario, different enterprises need to jointly build a fault detection model. Enterprise A has some fault data, but because the amount is small it wishes to model jointly with more data, such as enterprise B's fault data.
For massive image data, traditional centralized machine learning gathers the data in a central location and then performs training, but centralized training has the following problems: 1) data security risks, because the data must be transmitted to a centralized server; 2) when the data volume is very large, training time and resource costs increase sharply, which makes centralized learning difficult to apply to tasks that require processing large amounts of data; 3) the large amount of unlabeled data at the edge cannot be utilized, wasting data resources.
The federated self-supervised learning paradigm aims to train models collaboratively at the network edge without gathering the raw data, thereby greatly alleviating the data privacy problem. However, data heterogeneity across user devices can severely degrade the performance of traditional federated averaging. Meanwhile, a large amount of unlabeled data exists in real scenarios; manually labeling it is time-consuming and labor-intensive and suffers from generalization error, spurious correlations, and adversarial vulnerability. How to use such unlabeled data efficiently is also a hot topic of current research.
Disclosure of Invention
The invention aims to provide a federated self-supervised contrastive learning image classification system and method based on knowledge distillation, which solve the problems in the prior art that data heterogeneity among user devices severely degrades traditional federated averaging performance, and that manually labeling large amounts of unlabeled data is time-consuming and labor-intensive and suffers from generalization error, spurious correlations, and adversarial vulnerability.
In order to achieve the above purpose, the invention provides a federated self-supervised contrastive learning image classification system based on knowledge distillation, which comprises an initialization module, a client selection module, a knowledge distillation-based self-supervised contrastive learning module, a global model aggregation module and a model issuing module, wherein:
the initialization module randomly initializes the global model through the central server and sends the global model to each client;
the client selection module is used for randomly selecting clients participating in the aggregation;
the knowledge distillation-based self-supervised contrastive learning module receives, through each selected client, the global model parameters issued by the initialization module, dynamically updates them using the divergence-aware update (DAU) technique, and uses the result as the initialization parameters of the local model, the specific expression being:

$$\theta_i^{t} = \mu\,\theta^{t} + (1-\mu)\,\theta_i^{t-1}$$

where θ^t is the global parameter of the t-th aggregation round, θ_i^{t-1} is the parameter of client i after its (t-1)-th round of local training has finished, and μ is a value obtained from the KL divergence between θ^t and θ_i^{t-1}; the client then performs self-supervised contrastive learning based on the SimCLR algorithm on its local image dataset while learning the structural knowledge of the global model through knowledge distillation, and uploads the shared-layer parameters to the central server after local training is completed;
the global model aggregation module is used by the central server to compute a weighted average of the received model parameters according to each client's data volume based on the FedAvg algorithm, obtaining an aggregated global model;
the model issuing module issues the aggregated global model to each client through the central server;
and the client selection module, the knowledge distillation-based self-supervised contrastive learning module, the global model aggregation module and the model issuing module are executed repeatedly until a converged global model is obtained, which is used for completing image classification.
The invention also provides a federated self-supervised contrastive learning image classification method based on knowledge distillation, which comprises the following steps:
step 1, a central server randomly initializes a global model and sends the global model to each client;
step 2, randomly selecting clients to participate in the aggregation;
step 3, each selected client receives the global model parameters issued in step 1, dynamically updates the parameters using the divergence-aware update (DAU) technique, uses them as the initialization parameters of local model training, performs self-supervised contrastive learning based on the SimCLR algorithm on its local image dataset while learning the structural knowledge of the global model through knowledge distillation, and uploads the shared-layer parameters and data volume to the central server after local training is completed;
step 4, the central server performs a weighted average of the received model parameters according to each client's data volume based on the FedAvg algorithm to obtain an aggregated global model;
step 5, the central server transmits the aggregated global model to each client;
step 6, repeatedly executing steps 2 to 5 until a converged global model is obtained, which is used for completing image classification.
Preferably, in step 1, the global model is composed of an encoder with a ResNet-18 network structure and a two-layer MLP mapping head.
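For illustration only, the following is a minimal sketch of such a global model, assuming PyTorch and torchvision (implementation details such as the 128-dimensional projection and the hidden layer size are illustrative assumptions, not specified by the patent):

```python
# Minimal sketch (assumes PyTorch/torchvision): ResNet-18 encoder plus a two-layer
# MLP mapping head; the 128-dim output and hidden sizes are illustrative choices.
import torch.nn as nn
from torchvision.models import resnet18

class GlobalModel(nn.Module):
    def __init__(self, feature_dim: int = 128):
        super().__init__()
        backbone = resnet18(weights=None)        # encoder with ResNet-18 structure
        hidden_dim = backbone.fc.in_features     # 512 for ResNet-18
        backbone.fc = nn.Identity()              # drop the supervised classification head
        self.encoder = backbone
        self.projector = nn.Sequential(          # two-layer MLP mapping head
            nn.Linear(hidden_dim, hidden_dim),
            nn.ReLU(inplace=True),
            nn.Linear(hidden_dim, feature_dim),
        )

    def forward(self, x):
        return self.projector(self.encoder(x))
```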
Preferably, in step 3, self-supervised contrastive learning is performed based on the SimCLR algorithm, specifically:

S31, in the absence of data labels, SimCLR learns representations by maximizing, through a contrastive loss in the latent space, the agreement between different augmentations of the same data; the loss function is:

$$\ell_{i,j} = -\log \frac{\exp\left(\mathrm{sim}(z_i, z_j)/\tau\right)}{\sum_{k=1}^{2N} \mathbb{1}_{[k \neq i]} \exp\left(\mathrm{sim}(z_i, z_k)/\tau\right)}$$

where z_i and z_j are the outputs of images i and j through the model, sim(z_i, z_j) denotes their cosine similarity, τ is the temperature coefficient of contrastive learning, 2N is the number of data points obtained by pairwise augmentation of a mini-batch of N samples, z_k is the output of image k through the model, and ℓ_{i,j} is the loss of the positive pair (i, j);

S32, a mini-batch of N samples is randomly drawn and paired augmented samples are generated from it, giving 2N augmented samples in total; image i and its augmented sample j form a positive pair (i, j) whose loss is given by the above formula, and the loss function over the batch,

$$\mathcal{L} = \frac{1}{2N}\sum_{k=1}^{N}\left[\ell_{2k-1,\,2k} + \ell_{2k,\,2k-1}\right],$$

is the average of the losses of all positive pairs among the 2N data points.
Preferably, in step 3, the structural knowledge of the global model is simultaneously learned through knowledge distillation, specifically: distillation is performed using a portion of a public dataset D_P to learn the structural knowledge of the global model, the loss function being:

$$\mathcal{L}_{KD} = \mathcal{L}_{D} + \mathcal{L}_{A}$$

where knowledge distillation comprises two parts, the distance-based loss L_D and the angle-based loss L_A; (t_1, t_2, …, t_n) are the outputs of the public dataset through the global model and (s_1, s_2, …, s_n) are its outputs through the local model; the distance potential

$$\psi_D(v_i, v_j) = \frac{1}{w}\left\lVert v_i - v_j \right\rVert_2$$

represents the distance relationship between outputs, with w a normalization factor, and the two losses are

$$\mathcal{L}_{D} = \sum_{(i,j)} \ell\big(\psi_D(t_i, t_j),\, \psi_D(s_i, s_j)\big), \qquad \mathcal{L}_{A} = \sum_{(i,j,k)} \ell\big(\psi_A(t_i, t_j, t_k),\, \psi_A(s_i, s_j, s_k)\big),$$

where v_1, v_2, …, v_n refer generically to the outputs of either the global model or the local model, ψ_A is the corresponding angle-based potential, and ℓ(x, y) measures the discrepancy between x, the relational potential computed directly from the outputs of the global model, and y, the relational potential computed from the outputs of the local model.
Preferably, in step 3, self-supervised contrastive learning is performed on the local image dataset based on the SimCLR algorithm while the structural knowledge of the global model is learned through knowledge distillation, specifically:

after client i has dynamically updated the global model in the t-th communication round, the local model is trained on the local data:

$$\theta_i^{t} \leftarrow \theta_i^{t} - \eta\,\nabla\!\left(\mathcal{L}_{con} + \lambda\,\mathcal{L}_{KD}\right)$$

where L_con and L_KD are the contrastive and distillation loss functions, ∇ is the gradient operator, η is the learning step size of the optimizer, θ_i^t is the initialization parameter of client i's t-th training round, and λ is the coefficient of knowledge distillation, determined from σ²(t), the variance of the clients' local losses computed in the t-th round, and a configurable parameter γ.
Preferably, in step 4, the central server performs a weighted average of the received model parameters according to each client's data volume based on the FedAvg algorithm to obtain an aggregated global model, using the formula:

$$\theta^{t+1} = \sum_{i \in S_t} \frac{n_i}{\sum_{j \in S_t} n_j}\, \theta_i^{t}$$

where S_t is the set of clients participating in federated training in round t, n_i is the data volume of client i, θ_i^t is the model parameter of client i after its t-th round of local training, and θ^{t+1} is the initialization parameter of the (t+1)-th round of federated learning.
The invention also provides a mobile terminal comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the federated self-supervised contrastive learning image classification method based on knowledge distillation.
Therefore, the federated self-supervised contrastive learning image classification system and method based on knowledge distillation have the following beneficial effects:
(1) The centralized model training in the Internet of Things is moved to edge devices, so that the local data of all institutions are jointly exploited while the computational load on the cloud or server side is reduced;
(2) Private data always remains local to the edge devices, which improves data security;
(3) For the common situation in real scenarios where large amounts of private data lack labels, labeling cost is reduced through self-supervised contrastive learning;
(4) For the problem of imbalanced data across institutions in practical application scenarios, global model knowledge is effectively learned through structural knowledge distillation and model divergence is suppressed, thereby improving the quality of the final model;
(5) After training is finished, a global model applicable to all clients can be obtained.
The technical scheme of the invention is further described in detail through the drawings and the embodiments.
Drawings
FIG. 1 is a flow chart of the federated self-supervised contrastive learning image classification method based on knowledge distillation of the present invention;
FIG. 2 is a schematic diagram of a system for implementing a model training process in accordance with the present invention;
FIG. 3 is a graph comparing the performance of the method of the present invention and the conventional method in the examples of the present invention.
Detailed Description
The following detailed description of embodiments of the invention, made with reference to the accompanying drawings, is not intended to limit the scope of the invention as claimed, but is merely representative of selected embodiments of the invention. All other embodiments obtained by those skilled in the art based on the embodiments of the invention without inventive effort fall within the scope of the invention.
The federated self-supervised contrastive learning image classification system based on knowledge distillation comprises an initialization module, a client selection module, a knowledge distillation-based self-supervised contrastive learning module, a global model aggregation module and a model issuing module, wherein: the initialization module randomly initializes the global model through the central server and sends the global model to each client; the client selection module is used for randomly selecting clients participating in the aggregation; the knowledge distillation-based self-supervised contrastive learning module receives, through each selected client, the global model parameters issued by the initialization module, dynamically updates them using the divergence-aware update (DAU) technique, and uses the result as the initialization parameters of the local model, the specific expression being:

$$\theta_i^{t} = \mu\,\theta^{t} + (1-\mu)\,\theta_i^{t-1}$$

where θ^t is the global parameter of the t-th aggregation round, θ_i^{t-1} is the parameter of client i after its (t-1)-th round of local training, and μ is a value obtained from the KL divergence between θ^t and θ_i^{t-1}; the client then performs self-supervised contrastive learning based on the SimCLR algorithm on its local image dataset while learning the structural knowledge of the global model through knowledge distillation, and uploads the shared-layer parameters to the central server after local training is completed; the global model aggregation module is used by the central server to compute a weighted average of the received model parameters according to each client's data volume based on the FedAvg algorithm, obtaining an aggregated global model; the model issuing module issues the aggregated global model to each client through the central server; and the client selection module, the knowledge distillation-based self-supervised contrastive learning module, the global model aggregation module and the model issuing module are executed repeatedly until a converged global model is obtained, which is used for completing image classification.
Referring to FIG. 1, a federated self-supervised contrastive learning image classification method based on knowledge distillation comprises the following steps:
step 1, a central server randomly initializes a global model and transmits it to each client, wherein the global model is composed of an encoder with a ResNet-18 network structure and a two-layer MLP mapping head;
step 2, randomly selecting clients to participate in the aggregation;
step 3, each selected client receives the global model parameters issued in step 1, dynamically updates the parameters using the divergence-aware update (DAU) technique, uses them as the initialization parameters of local model training, performs self-supervised contrastive learning based on the SimCLR algorithm on its local image dataset while learning the structural knowledge of the global model through knowledge distillation, and uploads the shared-layer parameters and data volume to the central server after local training is completed;
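For illustration only, a minimal sketch of this divergence-aware update is given below; it assumes PyTorch, uses the combination θ_i^t = μ·θ^t + (1-μ)·θ_i^{t-1} described above, and maps the KL divergence between softmax-normalized parameter vectors to μ with an illustrative exponential, since only the use of the KL divergence is specified:

```python
# Sketch only: divergence-aware update of the local initialization parameters.
# Assumptions: mu = exp(-KL) between softmax-normalized parameter vectors, and a
# convex combination of global and previous local weights; the exact mapping from
# the KL divergence to mu is an illustrative choice, not taken from the patent.
import torch
import torch.nn.functional as F

def divergence_aware_update(global_state: dict, local_state: dict) -> dict:
    g = torch.cat([p.detach().flatten().float() for p in global_state.values()])
    l = torch.cat([p.detach().flatten().float() for p in local_state.values()])
    # scalar divergence between the two parameter sets
    kl = F.kl_div(F.log_softmax(l, dim=0), F.softmax(g, dim=0), reduction="sum")
    mu = torch.exp(-kl).clamp(0.0, 1.0).item()   # small divergence -> mu close to 1
    return {k: mu * global_state[k] + (1.0 - mu) * local_state[k] for k in global_state}
```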
The self-supervised contrastive learning is performed based on the SimCLR algorithm, and specifically comprises the following steps:

S31, in the absence of data labels, SimCLR learns representations by maximizing, through a contrastive loss in the latent space, the agreement between different augmentations of the same data; the loss function is:

$$\ell_{i,j} = -\log \frac{\exp\left(\mathrm{sim}(z_i, z_j)/\tau\right)}{\sum_{k=1}^{2N} \mathbb{1}_{[k \neq i]} \exp\left(\mathrm{sim}(z_i, z_k)/\tau\right)}$$

where z_i and z_j are the outputs of images i and j through the model, sim(z_i, z_j) denotes their cosine similarity, τ is the temperature coefficient of contrastive learning, 2N is the number of data points obtained by pairwise augmentation of a mini-batch of N samples, z_k is the output of image k through the model, and ℓ_{i,j} is the loss of the positive pair (i, j);

S32, a mini-batch of N samples is randomly drawn and paired augmented samples are generated from it, giving 2N augmented samples in total; image i and its augmented sample j form a positive pair (i, j) whose loss is given by the above formula, and the loss function over the batch,

$$\mathcal{L} = \frac{1}{2N}\sum_{k=1}^{N}\left[\ell_{2k-1,\,2k} + \ell_{2k,\,2k-1}\right],$$

is the average of the losses of all positive pairs among the 2N data points.
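For concreteness, a compact sketch of this batch contrastive loss is given below (standard SimCLR NT-Xent form, assuming PyTorch; the view-pairing convention and the default τ are implementation assumptions, not taken from the patent):

```python
# Sketch of the batch contrastive loss L over 2N augmented views (standard SimCLR
# NT-Xent form). Convention: row i and row i+N are the two views of sample i.
import torch
import torch.nn.functional as F

def nt_xent_loss(z: torch.Tensor, tau: float = 0.5) -> torch.Tensor:
    z = F.normalize(z, dim=1)                       # so dot products are cosine similarities
    sim = z @ z.t() / tau                           # sim(z_i, z_k) / tau for all pairs
    n2 = z.size(0)                                  # 2N
    sim.fill_diagonal_(float("-inf"))               # implements the indicator 1_[k != i]
    pos = (torch.arange(n2, device=z.device) + n2 // 2) % n2   # index of each positive partner
    return F.cross_entropy(sim, pos)                # mean of l_{i,j} over all 2N anchors
```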
The structural knowledge of the global model is learned through knowledge distillation, specifically: distillation is performed using a portion of a public dataset D_P to learn the structural knowledge of the global model, the loss function being:

$$\mathcal{L}_{KD} = \mathcal{L}_{D} + \mathcal{L}_{A}$$

where knowledge distillation comprises two parts, the distance-based loss L_D and the angle-based loss L_A; (t_1, t_2, …, t_n) are the outputs of the public dataset through the global model and (s_1, s_2, …, s_n) are its outputs through the local model; the distance potential

$$\psi_D(v_i, v_j) = \frac{1}{w}\left\lVert v_i - v_j \right\rVert_2$$

represents the distance relationship between outputs, with w a normalization factor, and the two losses are

$$\mathcal{L}_{D} = \sum_{(i,j)} \ell\big(\psi_D(t_i, t_j),\, \psi_D(s_i, s_j)\big), \qquad \mathcal{L}_{A} = \sum_{(i,j,k)} \ell\big(\psi_A(t_i, t_j, t_k),\, \psi_A(s_i, s_j, s_k)\big),$$

where v_1, v_2, …, v_n refer generically to the outputs of either the global model or the local model, ψ_A is the corresponding angle-based potential, and ℓ(x, y) measures the discrepancy between x, the relational potential computed directly from the outputs of the global model, and y, the relational potential computed from the outputs of the local model.
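The following sketch illustrates the two relational terms (assuming PyTorch and the usual distance-wise/angle-wise formulation of relational knowledge distillation, with a Huber-type comparison of the potentials; any constants are illustrative, not taken from the patent):

```python
# Sketch of the structural (relational) distillation loss on public-set outputs.
# t, s: (n, d) outputs of the public dataset D_P through the global / local model.
import torch
import torch.nn.functional as F

def rkd_distance(t: torch.Tensor, s: torch.Tensor) -> torch.Tensor:
    def normalized_pdist(v):
        d = torch.cdist(v, v, p=2)                  # pairwise Euclidean distances
        w = d[d > 0].mean()                         # normalization factor w (mean distance)
        return d / w
    return F.smooth_l1_loss(normalized_pdist(s), normalized_pdist(t))

def rkd_angle(t: torch.Tensor, s: torch.Tensor) -> torch.Tensor:
    def triplet_cosines(v):
        e = F.normalize(v.unsqueeze(0) - v.unsqueeze(1), p=2, dim=2)  # unit difference vectors
        return torch.bmm(e, e.transpose(1, 2)).flatten()              # cosines of triplet angles
    return F.smooth_l1_loss(triplet_cosines(s), triplet_cosines(t))

def structural_kd_loss(t: torch.Tensor, s: torch.Tensor) -> torch.Tensor:
    return rkd_distance(t, s) + rkd_angle(t, s)     # L_KD = L_D + L_A
```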
Self-supervised contrastive learning is performed on the local image dataset based on the SimCLR algorithm while the structural knowledge of the global model is learned through knowledge distillation, specifically:

after client i has dynamically updated the global model in the t-th communication round, the local model is trained on the local data:

$$\theta_i^{t} \leftarrow \theta_i^{t} - \eta\,\nabla\!\left(\mathcal{L}_{con} + \lambda\,\mathcal{L}_{KD}\right)$$

where L_con and L_KD are the contrastive and distillation loss functions, ∇ is the gradient operator, η is the learning step size of the optimizer, θ_i^t is the initialization parameter of client i's t-th training round, and λ is the coefficient of knowledge distillation, determined from σ²(t), the variance of the clients' local losses computed in the t-th round, and a configurable parameter γ.
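A sketch of one resulting local optimization step is given below; it reuses nt_xent_loss and structural_kd_loss from the sketches above and treats the distillation coefficient lambda_t as supplied by the caller, since the exact schedule derived from σ²(t) and γ is not reproduced here:

```python
# Sketch of one local training step on client i: contrastive loss on private data
# plus lambda_t times the structural distillation loss on a public batch.
# Reuses nt_xent_loss and structural_kd_loss from the sketches above; lambda_t is
# assumed given (its derivation from sigma^2(t) and gamma is not shown).
import torch

def local_train_step(local_model, global_model, optimizer,
                     private_views, public_batch, lambda_t, tau=0.5):
    x1, x2 = private_views                          # two augmentations of a private mini-batch
    z = local_model(torch.cat([x1, x2], dim=0))     # 2N projections
    loss_con = nt_xent_loss(z, tau)
    with torch.no_grad():
        t_out = global_model(public_batch)          # teacher outputs on public data D_P
    s_out = local_model(public_batch)               # student outputs on the same data
    loss = loss_con + lambda_t * structural_kd_loss(t_out, s_out)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```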
Step 4, the central server performs a weighted average of the received model parameters according to each client's data volume based on the FedAvg algorithm to obtain an aggregated global model; the formula used is as follows:

$$\theta^{t+1} = \sum_{i \in S_t} \frac{n_i}{\sum_{j \in S_t} n_j}\, \theta_i^{t}$$

where S_t is the set of clients participating in federated training in round t, n_i is the data volume of client i, θ_i^t is the model parameter of client i after its t-th round of local training, and θ^{t+1} is the initialization parameter of the (t+1)-th round of federated learning.
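In code, this data-volume-weighted aggregation can be sketched as follows (plain PyTorch-style state dictionaries of the shared layers; a sketch, not the patent's exact implementation):

```python
# Sketch of the FedAvg aggregation: theta^{t+1} = sum_i (n_i / sum_j n_j) * theta_i^t.
def fedavg_aggregate(client_states, client_sizes):
    """client_states: list of shared-layer state dicts; client_sizes: list of n_i."""
    total = float(sum(client_sizes))
    return {
        key: sum((n / total) * state[key] for state, n in zip(client_states, client_sizes))
        for key in client_states[0]
    }
```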
Step 5, the central server transmits the aggregated global model to each client;
step 6, repeatedly executing steps 2 to 5 until a converged global model is obtained, which is used for completing image classification.
The invention also provides a mobile terminal comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the federated self-supervised contrastive learning image classification method based on knowledge distillation.
Examples
This embodiment provides a federated self-supervised contrastive learning image classification method based on structural knowledge distillation. Federated learning, as a distributed machine learning technology, allows multiple users to train cooperatively; it can effectively help multiple institutions use data and build machine learning models while satisfying the requirements of user privacy protection, data security, and government regulations, effectively solving the data-silo problem. It enables participants to model jointly without sharing data, technically breaking down data silos and realizing artificial intelligence (AI) collaboration. Self-supervised learning automatically generates labels or categories for training samples from the structure of the data or other information, thereby avoiding the large amount of manual labeling required by traditional supervised learning.
The federated self-supervised contrastive learning image classification method based on structural knowledge distillation builds on the intuition that a client's local model contains knowledge of its local data, and that the heterogeneous data characteristics of different clients cause the local model to gradually drift away from the global model during local training; the initialization parameters are therefore dynamically updated through divergence awareness, and the influence of non-IID data on the model is mitigated through structural knowledge distillation.
By aggregating only the parameters of each enterprise's fault prediction model, federated learning never accesses the private data of the different enterprises, so that their private data are effectively protected while their fault data are jointly fused and analyzed. The specific process is as follows:
(1) Each enterprise trains with the same initial model and network architecture in the centralized federated learning framework; the central server randomly initializes the global model and issues it to each enterprise;
(2) After receiving the initial global model issued by the server, each enterprise trains its local model on its unlabeled local fault data using federated self-supervised contrastive learning based on structural knowledge distillation, and uploads its local model parameters and data volume to the server once training is complete;
(3) After receiving the local model parameters, the central server performs a weighted average according to the data volumes to obtain the aggregated global model parameters;
(4) Each enterprise receives the aggregated global model parameters, dynamically updates its initialization parameters using the divergence-aware method, and performs the next round of training. The above process is repeated until a converged global model is obtained, as sketched below. The training process is shown in FIG. 2.
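As an illustration of how these four steps fit together, the following is a hypothetical orchestration of the training rounds; the Client objects, their local_train method and last_local_state attribute are assumptions made for this sketch, which reuses divergence_aware_update and fedavg_aggregate from the sketches above:

```python
# Hypothetical sketch of the overall federated training loop; Client objects with a
# local_train(init_state, global_model) method and a last_local_state attribute are
# assumptions made for illustration only.
import copy
import random

def run_federated_training(global_model, clients, num_rounds, clients_per_round):
    for t in range(num_rounds):
        selected = random.sample(clients, clients_per_round)           # random client selection
        states, sizes = [], []
        for client in selected:
            init = divergence_aware_update(global_model.state_dict(),
                                           client.last_local_state)    # DAU initialization
            state, n_i = client.local_train(init, copy.deepcopy(global_model))
            states.append(state)
            sizes.append(n_i)
        global_model.load_state_dict(fedavg_aggregate(states, sizes))  # weighted aggregation
    return global_model                                                # converged global model
```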
The invention is applicable to a federated learning framework that includes a central server with computing capability and a plurality of participants. Because only the shared-layer model parameters of the clients are aggregated, each enterprise can obtain a personalized local model.
According to the invention, when a large amount of unlabeled data exists in a real industrial scenario, the federated self-supervised learning framework is adopted so that the large amounts of unlabeled data held by each distributed user are used efficiently to train feature extractors, improving model performance while guaranteeing data security and privacy. The method solves the problems in the prior art that labeled data are too scarce for supervised learning and that the local data volume of each participant is insufficient to train a model with high accuracy and broad applicability. As can be seen from FIG. 3, the present invention greatly improves model accuracy compared with conventional federated learning algorithms that train a single generic global model.
Therefore, by adopting the federated self-supervised contrastive learning image classification system and method based on knowledge distillation, the performance of the enterprises' federated model is improved in scenarios with too few industrial fault data labels and heterogeneous industrial private data, and safe and effective sharing of private data among enterprises is realized.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solution of the present invention and not to limit it. Although the present invention has been described in detail with reference to the preferred embodiments, those skilled in the art will understand that the technical solution of the present invention may be modified or equivalently replaced without departing from the spirit and scope of the technical solution of the present invention.

Claims (8)

1. A federated self-supervised contrastive learning image classification system based on knowledge distillation, characterized in that: the system comprises an initialization module, a client selection module, a knowledge distillation-based self-supervised contrastive learning module, a global model aggregation module and a model issuing module, wherein:
the initialization module randomly initializes the global model through the central server and sends the global model to each client;
the client selection module is used for randomly selecting clients participating in the aggregation;
the knowledge distillation-based self-supervised contrastive learning module receives, through each selected client, the global model parameters issued by the initialization module, dynamically updates them using the divergence-aware update (DAU) technique, and uses the result as the initialization parameters of the local model, the specific expression being:

$$\theta_i^{t} = \mu\,\theta^{t} + (1-\mu)\,\theta_i^{t-1}$$

where θ^t is the global parameter of the t-th aggregation round, θ_i^{t-1} is the parameter of client i after its (t-1)-th round of local training, and μ is a value obtained from the KL divergence between θ^t and θ_i^{t-1}; the client then performs self-supervised contrastive learning based on the SimCLR algorithm on its local image dataset while learning the structural knowledge of the global model through knowledge distillation, and uploads the shared-layer parameters to the central server after local training is completed;
the global model aggregation module is used by the central server to compute a weighted average of the received model parameters according to each client's data volume based on the FedAvg algorithm, obtaining an aggregated global model;
the model issuing module issues the aggregated global model to each client through the central server;
and the client selection module, the knowledge distillation-based self-supervised contrastive learning module, the global model aggregation module and the model issuing module are executed repeatedly until a converged global model is obtained, which is used for completing image classification.
2. A federated self-supervised contrastive learning image classification method based on knowledge distillation, characterized by comprising the following steps:
step 1, a central server randomly initializes a global model and sends the global model to each client;
step 2, randomly selecting clients to participate in the aggregation;
step 3, each selected client receives the global model parameters issued in step 1, dynamically updates the parameters using the divergence-aware update (DAU) technique, uses them as the initialization parameters of local model training, performs self-supervised contrastive learning based on the SimCLR algorithm on its local image dataset while learning the structural knowledge of the global model through knowledge distillation, and uploads the shared-layer parameters and data volume to the central server after local training is completed;
step 4, the central server performs a weighted average of the received model parameters according to each client's data volume based on the FedAvg algorithm to obtain an aggregated global model;
step 5, the central server transmits the aggregated global model to each client;
step 6, repeatedly executing steps 2 to 5 until a converged global model is obtained, which is used for completing image classification.
3. The federated self-supervised contrastive learning image classification method based on knowledge distillation of claim 2, wherein: in step 1, the global model is composed of an encoder with a ResNet-18 network structure and a two-layer MLP mapping head.
4. The federated self-supervised contrastive learning image classification method based on knowledge distillation of claim 2, wherein: in step 3, self-supervised contrastive learning is performed based on the SimCLR algorithm, specifically:

S31, in the absence of data labels, SimCLR learns representations by maximizing, through a contrastive loss in the latent space, the agreement between different augmentations of the same data; the loss function is:

$$\ell_{i,j} = -\log \frac{\exp\left(\mathrm{sim}(z_i, z_j)/\tau\right)}{\sum_{k=1}^{2N} \mathbb{1}_{[k \neq i]} \exp\left(\mathrm{sim}(z_i, z_k)/\tau\right)}$$

where z_i and z_j are the outputs of images i and j through the model, sim(z_i, z_j) denotes their cosine similarity, τ is the temperature coefficient of contrastive learning, 2N is the number of data points obtained by pairwise augmentation of a mini-batch of N samples, z_k is the output of image k through the model, and ℓ_{i,j} is the loss of the positive pair (i, j);

S32, a mini-batch of N samples is randomly drawn and paired augmented samples are generated from it, giving 2N augmented samples in total; image i and its augmented sample j form a positive pair (i, j) whose loss is given by the above formula, and the loss function over the batch,

$$\mathcal{L} = \frac{1}{2N}\sum_{k=1}^{N}\left[\ell_{2k-1,\,2k} + \ell_{2k,\,2k-1}\right],$$

is the average of the losses of all positive pairs among the 2N data points.
5. The federated self-supervised contrastive learning image classification method based on knowledge distillation of claim 2, wherein: in step 3, the structural knowledge of the global model is simultaneously learned through knowledge distillation, specifically: distillation is performed using a portion of a public dataset D_P to learn the structural knowledge of the global model, the loss function being:

$$\mathcal{L}_{KD} = \mathcal{L}_{D} + \mathcal{L}_{A}$$

where knowledge distillation comprises two parts, the distance-based loss L_D and the angle-based loss L_A; (t_1, t_2, …, t_n) are the outputs of the public dataset through the global model and (s_1, s_2, …, s_n) are its outputs through the local model; the distance potential

$$\psi_D(v_i, v_j) = \frac{1}{w}\left\lVert v_i - v_j \right\rVert_2$$

represents the distance relationship between outputs, with w a normalization factor, and the two losses are

$$\mathcal{L}_{D} = \sum_{(i,j)} \ell\big(\psi_D(t_i, t_j),\, \psi_D(s_i, s_j)\big), \qquad \mathcal{L}_{A} = \sum_{(i,j,k)} \ell\big(\psi_A(t_i, t_j, t_k),\, \psi_A(s_i, s_j, s_k)\big),$$

where v_1, v_2, …, v_n refer generically to the outputs of either the global model or the local model, ψ_A is the corresponding angle-based potential, and ℓ(x, y) measures the discrepancy between x, the relational potential computed directly from the outputs of the global model, and y, the relational potential computed from the outputs of the local model.
6. The federated self-supervised contrastive learning image classification method based on knowledge distillation of claim 2, wherein: in step 3, self-supervised contrastive learning is performed on the local image dataset based on the SimCLR algorithm while the structural knowledge of the global model is learned through knowledge distillation, specifically:

after client i has dynamically updated the global model in the t-th communication round, the local model is trained on the local data:

$$\theta_i^{t} \leftarrow \theta_i^{t} - \eta\,\nabla\!\left(\mathcal{L}_{con} + \lambda\,\mathcal{L}_{KD}\right)$$

where L_con and L_KD are the contrastive and distillation loss functions, ∇ is the gradient operator, η is the learning step size of the optimizer, θ_i^t is the initialization parameter of client i's t-th training round, and λ is the coefficient of knowledge distillation, determined from σ²(t), the variance of the clients' local losses computed in the t-th round, and a configurable parameter γ.
7. The federated self-supervised contrastive learning image classification method based on knowledge distillation of claim 2, wherein: in step 4, the central server performs a weighted average of the received model parameters according to each client's data volume based on the FedAvg algorithm to obtain an aggregated global model, using the formula:

$$\theta^{t+1} = \sum_{i \in S_t} \frac{n_i}{\sum_{j \in S_t} n_j}\, \theta_i^{t}$$

where S_t is the set of clients participating in federated training in round t, n_i is the data volume of client i, θ_i^t is the model parameter of client i after its t-th round of local training, and θ^{t+1} is the initialization parameter of the (t+1)-th round of federated learning.
8. A mobile terminal comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, characterized in that: the processor, when executing the program, implements the federated self-supervised contrastive learning image classification method based on knowledge distillation as set forth in any one of claims 2-7.
CN202410047272.0A (priority 2024-01-12, filed 2024-01-12): Knowledge distillation-based federated self-supervised contrastive learning image classification system and method (Active; granted as CN117893807B)

Priority Applications (1)

CN202410047272.0A (granted as CN117893807B): Knowledge distillation-based federated self-supervised contrastive learning image classification system and method


Publications (2)

CN117893807A, published 2024-04-16
CN117893807B, published 2024-06-25

Family

ID=90646396

Family Applications (1)

CN202410047272.0A (Active; granted as CN117893807B): Knowledge distillation-based federated self-supervised contrastive learning image classification system and method

Country Status (1)

CN: CN117893807B

Cited By (1)

* Cited by examiner, † Cited by third party
CN118070077A * (Shandong University; priority 2024-04-25, published 2024-05-24): Fault diagnosis method and system based on federated learning and dual-supervised contrastive learning


Patent Citations (3)

* Cited by examiner, † Cited by third party
CN116664930A * (Nanjing University of Science and Technology; priority 2023-05-29, published 2023-08-29): Personalized federated learning image classification method and system based on self-supervised contrastive learning
CN116665000A * (Henan University; priority 2023-05-29, published 2023-08-29): Federated learning algorithm based on diffusion model and weight-adaptive knowledge distillation
CN117292221A * (Shandong Computer Science Center (National Supercomputer Center in Jinan); priority 2023-09-26, published 2023-12-26): Image recognition method and system based on federated meta-learning

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
ATHANASIOS PSALTIS et al.: "FedRCIL: Federated Knowledge Distillation for Representation based Contrastive Incremental Learning", 2023 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), 25 December 2023, pages 3463-3472 *
RAN WU et al.: "ADCL: Adversarial Distilled Contrastive Learning on lightweight models for self-supervised image classification", Knowledge-Based Systems, vol. 278, 25 October 2023, pages 1-11 *
CHEN Xuebin et al.: "PFKD: A personalized federated learning framework comprehensively considering data heterogeneity and model heterogeneity", Journal of Nanjing University of Information Science & Technology (Natural Science Edition), 26 October 2023, pages 1-10 *


Also Published As

CN117893807B, published 2024-06-25


Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
GR01: Patent grant