CN114863092A - Knowledge distillation-based federal target detection method and system - Google Patents

Knowledge distillation-based federal target detection method and system

Info

Publication number
CN114863092A
CN114863092A (application CN202210474634.5A)
Authority
CN
China
Prior art keywords
model
knowledge distillation
information
student
aggregation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210474634.5A
Other languages
Chinese (zh)
Inventor
梁天恺
田丰
黄宇恒
徐天适
陈�光
张华俊
冼金才
苏新铎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
GRG Banking Equipment Co Ltd
Original Assignee
GRG Banking Equipment Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by GRG Banking Equipment Co Ltd
Priority to CN202210474634.5A
Publication of CN114863092A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/04 Inference or reasoning models
    • G06N5/041 Abduction

Abstract

The invention discloses a knowledge distillation-based federal target detection method and system. The method comprises the following steps: acquiring a picture data set as sample data; performing federated learning with the sample data, which specifically comprises: constructing a knowledge distillation model of the local participant from the sample data, sending the knowledge distillation model information to a server, receiving first aggregation information sent by the server for aggregating the local participant with the other participants, and updating the student model in the distillation model with the first aggregation information to obtain the local participant's first aggregated student models for all participants; obtaining an inference detection model from the first aggregated student models; and detecting the data to be detected with the inference detection model to obtain an inference result. The method can be used in the field of target detection, preserves data privacy and security, and improves the inference efficiency and detection performance of the target detection method.

Description

Knowledge distillation-based federal target detection method and system
Technical Field
The invention relates to the technical field of target detection, in particular to a knowledge distillation-based federal target detection method and system.
Background
The learning algorithms of current target detection systems are mostly centralized: every terminal in the system must transmit its collected data to a centralized server, and the centralized server then allocates resources to perform the learning.
Moreover, the inference detection models of current target detection systems are mostly sets of large models with a single or similar structure; they contain a large number of parameters, so their learning and inference efficiency is low. Small models, by contrast, have few parameters and learn and infer efficiently, but their simple structure gives them poor learning and generalization ability.
Disclosure of Invention
In view of the above technical problems, the invention aims to provide a knowledge distillation-based federal target detection method and system that solve the following problems of traditional target detection systems: centralized learning algorithms create hidden data security risks, large models learn and infer inefficiently, and small models have poor learning and generalization ability.
The invention adopts the following technical scheme:
a knowledge distillation-based federal target detection method is applied to an intelligent terminal and comprises the following steps:
acquiring a picture data set as sample data;
and performing federated learning with the sample data, which specifically comprises the following steps: constructing a knowledge distillation model of the local participant from the sample data, and sending the knowledge distillation model information to a server so that the server aggregates the knowledge distillation model information of the local participant with that of the other n-1 participants; receiving first aggregation information sent by the server for aggregating the local participant with the other n-1 participants, where n is a natural number; updating the student model in the distillation model according to the first aggregation information to obtain the local participant's first aggregated student models for all participants; obtaining an inference detection model from the first aggregated student models;
and detecting the data to be detected according to the inference detection model to obtain an inference result.
Optionally, obtaining an inference detection model from the first aggregated student models comprises:
respectively calculating the information gain value of each of the local participant's first aggregated student models for all participants, sorting the information gain values by magnitude, and sending the participant information corresponding to the top m information gain values to the server, where m is a natural number and m ≤ n; the participants corresponding to the top m information gain values are the beneficial participants for the local participant;
and receiving second aggregation information sent by the server for aggregating the local participant with its beneficial participants, updating the student model of the local participant according to the second aggregation information to obtain a second aggregated student model of the local participant, and taking the second aggregated student model as the inference detection model.
Optionally, the constructing a knowledge distillation model of a local participant by using the sample data includes:
the knowledge distillation model comprises a teacher model and a student model; the sample data is learned by the teacher model to obtain a first target classification soft label and a first target bounding-box soft prediction;
the sample data is learned by the student model: a second target classification soft label and a second target bounding-box soft prediction are obtained through a softmax layer with distillation temperature t, and a target classification hard label and a target bounding-box hard prediction are obtained through a softmax layer with distillation temperature 1;
the distillation loss L_soft is calculated from the first target classification soft label, the first target bounding-box soft prediction, the second target classification soft label, and the second target bounding-box soft prediction;
the student loss L_hard is obtained from the target classification hard label and the target bounding-box hard prediction together with the real sample label and real sample bounding box of the sample data;
the objective function is L = t·L_soft + (1-t)·L_hard, where 0 ≤ t ≤ 1, L_hard is the student loss, and L_soft is the distillation loss; the knowledge distillation model is iterated to minimize this objective function, and the model after the last iteration is taken as the constructed knowledge distillation model of the local participant.
Optionally, m is equal to n/2.
Optionally, the first m information gain values are all greater than 0.
Optionally, the teacher model comprises a SwinT model, and the student model comprises a NanoDet-m model.
Optionally, detecting the data to be detected according to the inference detection model to obtain an inference result comprises:
the data to be detected is picture data to be detected; the input picture data to be detected is received, and the picture to be detected is preprocessed so that its size is consistent with that of the sample data pictures; the preprocessed picture to be detected is then input into the inference detection model to obtain the inference result.
A knowledge distillation based federal target detection system comprising:
the sample data acquisition unit is used for acquiring the picture data set as sample data;
the federated learning unit is used for performing federated learning with the sample data; specifically, it constructs a knowledge distillation model of the local participant from the sample data and sends the knowledge distillation model information to the server so that the server aggregates the knowledge distillation model information of the local participant with that of the other n-1 participants; receives first aggregation information sent by the server for aggregating the local participant with the other n-1 participants, where n is a natural number; updates the student model in the distillation model according to the first aggregation information to obtain the local participant's first aggregated student models for all participants; and obtains an inference detection model from the first aggregated student models;
and the detection unit is used for detecting the data to be detected according to the inference detection model to obtain an inference result.
An electronic device, comprising: at least one processor, and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the knowledge distillation-based federal target detection method.
A computer storage medium having stored thereon a computer program that, when executed by a processor, implements the knowledge distillation-based federal target detection method.
Compared with the prior art, the invention has the beneficial effects that:
the invention carries out federal learning by utilizing the sample data, which specifically comprises the following steps: constructing a knowledge distillation model of a local participant by using the sample data, and sending knowledge distillation model information to a server so that the server aggregates the knowledge distillation model information of the local participant and the knowledge distillation model information of the other n-1 participants; receiving first aggregation information which is sent by a server and used for aggregating a local participant and n-1 other participants, wherein n is a natural number; updating the student models in the distillation model according to the first aggregation information to obtain first aggregation student models of local participants aiming at all participants; obtaining a reasoning detection model according to the first aggregation student model; the method and the system for detecting the federal target based on knowledge distillation can be constructed by utilizing a large number of images and based on the technologies of knowledge distillation, artificial intelligence, federal learning and the like, can realize the functions of target class labels and target frames contained in the input image detection, can be used in the field of target detection such as image classification, face detection, object detection, biological monitoring and the like, and can realize the learning of a small model assisted by the large model by using the knowledge distillation model for the image detection, so that the small model for reasoning has the performance of the large model and the reasoning detection efficiency of the system is improved.
Furthermore, the local participant's student model is updated with second aggregation information sent by the server for aggregating the local participant with its beneficial participants, so that a federated aggregation mode with suitable participants can be selected dynamically. Members of the federated system can freely select the knowledge distillation model information that best matches their own characteristics, which improves the efficiency and effect of federated learning. This avoids the problem that arises when the main server performs indiscriminate whole-membership aggregation of every participant's knowledge distillation model information: mixing in model information from participants whose characteristics differ greatly from those of the local participant degrades the performance of the local model.
Drawings
FIG. 1 is a schematic flow diagram of a knowledge distillation-based federal target detection method according to an embodiment of the present invention;
FIG. 2 is a schematic flow diagram of a knowledge distillation-based Federal target detection method according to an embodiment of the present invention;
fig. 3 is a schematic structural diagram of a target detection system according to an embodiment of the present invention;
fig. 4 is an architecture diagram of a federated learning system according to an embodiment of the present invention;
FIG. 5 is a schematic flow diagram of a Federal knowledge distillation polymerization with dynamic selection of participants according to an embodiment of the present invention;
FIG. 6 is a schematic flow chart of a target detection method for knowledge distillation according to an embodiment of the present invention;
fig. 7 is a schematic structural diagram of a NanoDet-m model according to an embodiment of the present invention;
fig. 8 is a schematic structural diagram of a ShuffleNetV2 model according to an embodiment of the present invention;
fig. 9 is a schematic structural diagram of a SwinT model according to an embodiment of the present invention;
fig. 10 is a schematic structural diagram of a SwinT block model according to an embodiment of the present invention;
fig. 11 is a schematic diagram of an application scenario of the target detection system according to an embodiment of the present invention;
FIG. 12 is a schematic diagram of a knowledge distillation based federal target detection system in accordance with an embodiment of the present invention;
fig. 13 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The present invention will be further described with reference to the accompanying drawings and specific embodiments. It should be noted that, provided there is no conflict, the embodiments and technical features described below may be combined arbitrarily to form new embodiments:
the first embodiment is as follows:
referring to fig. 1, fig. 1 shows a knowledge distillation-based federal target detection method applied to an intelligent terminal, which includes the following steps:
a, acquiring terminal data to obtain an original picture data set, and performing secure data alignment on the raw data of all parties to obtain a unified sample picture data set for learning;
b, carrying out federal learning aiming at a target detection method to obtain a reasoning detection model;
and c, utilizing the inference detection model to carry out scene application.
In the implementation process, by performing federated learning for the target detection method on the basis of a large amount of picture data, the invention provides a knowledge distillation-based federal target detection method that preserves data privacy and security while improving the inference detection efficiency of the target detection method. This avoids the problems of the traditional target detection system, in which each terminal transmits its collected data to a centralized server that allocates resources for learning: compromised data privacy for each terminal, hidden data security risks such as data hijacking and data eavesdropping during transmission, and wasted computing resources because the idle resources of the terminals cannot be reasonably mobilized.
Example two:
referring to fig. 2, fig. 2 shows a knowledge distillation-based federal target detection method applied to an intelligent terminal, which includes the following steps:
step S1, acquiring a picture data set as sample data;
in this embodiment, the picture data set may select picture data according to an application scene that needs to be detected.
For example, when used for image classification, the selected picture data set is a picture data set including image data of a plurality of classification categories.
The application scenes can include application scenes in the field of target detection such as image classification, face detection, object detection, biological detection and the like.
Step S2, performing federated learning with the sample data, which specifically comprises: constructing a knowledge distillation model of the local participant from the sample data, and sending the knowledge distillation model information to a server so that the server aggregates the knowledge distillation model information of the local participant with that of the other n-1 participants; receiving first aggregation information sent by the server for aggregating the local participant with the other n-1 participants, where n is a natural number; updating the student model in the distillation model according to the first aggregation information to obtain the local participant's first aggregated student models for all participants; and obtaining an inference detection model from the first aggregated student models.
It should be noted that knowledge distillation refers to transferring the learned behavior of a cumbersome model (the teacher) to a smaller model (the student), where the outputs produced by the teacher are used as "soft targets" for training the student. The simplest form of knowledge distillation trains the distilled model on a transfer set using the teacher's soft target distribution. The student model is thus trained toward two targets: the correct labels (hard targets), and the soft targets generated by the teacher network, i.e., feature maps obtained from the teacher's intermediate layers, which guide the student to imitate the teacher's behavior as closely as possible.
Fig. 3 shows an architecture diagram of the target detection system provided in this embodiment. In fig. 3, the target detection system includes four modules: a data acquisition module, a secure data alignment module, a federated learning module, and a scenario application module.
Referring to fig. 4, fig. 4 is a diagram illustrating an architecture of the federal learning system according to the embodiment; the federal learning process of the system mainly comprises the following main steps:
(1) each participant constructs a local model from its own picture data set using the target detection method provided by the invention;
(2) each participant encrypts the model parameters of its local model with a homomorphic encryption algorithm and uploads the encrypted parameters to the main server;
(3) the main server performs averaging aggregation on the participants' model parameters;
(4) the main server broadcasts the aggregated model parameters to all participants;
(5) each participant decrypts the aggregation information transmitted by the main server and updates its local model with the decrypted information;
(6) steps (1) to (5) are iterated until the model converges or the number of iterations reaches 500.
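To make the round structure above concrete, the following is a minimal sketch of one aggregation round in Python. It is an illustration only: the homomorphic encryption of step (2) is reduced to placeholder encrypt/decrypt functions, and names such as local_train and aggregate_average are assumptions rather than identifiers from the patent.

```python
import numpy as np

def encrypt(params):
    # Placeholder for a homomorphic encryption scheme; an additively
    # homomorphic scheme would let the server average ciphertexts directly.
    return params

def decrypt(params):
    # Placeholder for the matching decryption on the participant side.
    return params

def local_train(params, data):
    # Hypothetical local update: stands in for training the local
    # model on the participant's own picture data set (step (1)).
    return [p - 0.01 * np.random.randn(*p.shape) for p in params]

def aggregate_average(all_params):
    # Step (3): the main server averages the participants' parameters.
    return [np.mean(np.stack(layer), axis=0) for layer in zip(*all_params)]

def federated_round(participant_data, global_params):
    uploads = [encrypt(local_train(global_params, d))    # steps (1)-(2)
               for d in participant_data]
    aggregated = aggregate_average(uploads)              # step (3)
    return decrypt(aggregated)                           # steps (4)-(5)

# Step (6): iterate until convergence or 500 rounds (convergence test omitted).
global_params = [np.zeros((4, 4))]
participant_data = [object(), object(), object()]
for _ in range(500):
    global_params = federated_round(participant_data, global_params)
```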
Example three:
different from the second embodiment, the obtaining of the inference detection model according to the first aggregation student model includes:
respectively calculating information gain values of the first aggregation student models of the local participants aiming at all the participants, arranging the information gain values according to the magnitude, and sending participant information corresponding to the first m information gain values to a server, wherein m is a natural number and is less than or equal to n, and the participants corresponding to the first m information gain values are beneficial participants corresponding to the local participants;
and receiving second aggregation information which is sent by the server and used for aggregating the local participant and the favorable participant thereof, updating a second aggregation student model of the local participant according to the second aggregation information to obtain a second aggregation student model of the local participant, and taking the second aggregation student model as an inference detection model.
As an embodiment, please refer to fig. 5, fig. 5 shows a flow chart of federal knowledge distillation polymerization for dynamically selecting participants according to an embodiment of the present invention;
1) the main server receives the current round's local knowledge distillation model information (including the student model information and teacher model information in the knowledge distillation process) from the n participants;
2) the main server aggregates (by averaging) the student model information of the currently processed participant A with the student model information of each of the other n-1 participants, and issues the resulting n-1 aggregated student models to participant A;
3) participant A updates its local student model with each of the n-1 aggregated student models in turn and computes the loss of each new student model;
4) participant A computes the information gain of each of the n-1 aggregated student models using formulas (0-1), (0-2) and (0-3), and reports to the main server the participant numbers corresponding to the top n/2 information gains. Here g_i denotes the first derivative of the loss function and h_i its second derivative; Σ_after g_i denotes the sum of the first derivatives of the local student model's loss function after the local student model has been updated with another participant's student model information, while Σ_before g_i denotes the same sum without that update; Σ_after h_i and Σ_before h_i are the corresponding sums of second derivatives with and without the update. If the information gain between some participant and participant A is greater than 0, that participant's current-round model information helps participant A learn more positive knowledge; the participants selected in this way are the positive learning partners with which participant A wants to perform federated learning in the current round.
[Formulas (0-1), (0-2) and (0-3), which define the information gain computation, appear only as images in the original publication.]
5) after receiving the partner number lists from all participants, the main server looks up and aggregates the knowledge distillation model information of each participant and its learning partners according to the participant numbers each participant has reported for federated learning;
6) the main server issues the corresponding aggregation information to each participant;
7) each participant updates its local knowledge distillation model with the aggregation information.
In the implementation process, this federated aggregation strategy computes different information gains at different training stages of the model, dynamically replacing the learning partners of each participant, which helps prevent knowledge distillation model information that is unhelpful to a participant's current learning from affecting its local knowledge distillation model. Moreover, only the participants' student model information is used when computing the information gain; because of the student models' lightweight character, the gain computation does not incur a large computational cost.
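Formulas (0-1) to (0-3) survive only as images above, so their exact form cannot be recovered here. The snippet below is therefore a hedged sketch that substitutes an XGBoost-style gain built from the first- and second-derivative sums the text describes (the before/after sums of g_i and h_i); every name in it, including the regularizer lam, is an illustrative assumption rather than the patent's formula.

```python
def information_gain(g_before, h_before, g_after, h_after, lam=1.0):
    # Surrogate for formulas (0-1)-(0-3): compare a curvature-normalized
    # score of the local student model's loss after absorbing another
    # participant's student model information against the score before.
    score_before = g_before ** 2 / (h_before + lam)
    score_after = g_after ** 2 / (h_after + lam)
    return score_after - score_before

def select_partners(local_stats, partner_stats, n):
    # partner_stats maps participant id -> (sum_g_after, sum_h_after),
    # measured after updating the local student model with that
    # participant's aggregated student model (steps 2) and 3) above).
    g_b, h_b = local_stats
    gains = {pid: information_gain(g_b, h_b, g_a, h_a)
             for pid, (g_a, h_a) in partner_stats.items()}
    # Step 4): keep the top n/2 partners whose gain is strictly positive.
    ranked = sorted(gains.items(), key=lambda kv: kv[1], reverse=True)
    return [pid for pid, gain in ranked[: n // 2] if gain > 0]
```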
By dynamically selecting a federated aggregation mode over suitable participants, the method of the invention lets members of the federated system freely select the knowledge distillation model information that best matches their own data characteristics, improving the efficiency and effect of federated learning. It also avoids indiscriminate aggregation by the main server of every participant's knowledge distillation model information: for a given participant, the knowledge distillation model information of all other participants does not necessarily affect its local model positively, and mixing in model information from participants whose data characteristics differ greatly from the local data degrades the performance of the local model.
Optionally, the constructing a knowledge distillation model of a local participant by using the sample data includes:
the knowledge distillation model comprises a teacher model and a student model; the sample data is learned by the teacher model to obtain a first target classification soft label and a first target bounding-box soft prediction;
the sample data is learned by the student model: a second target classification soft label and a second target bounding-box soft prediction are obtained through a softmax layer with distillation temperature t, and a target classification hard label and a target bounding-box hard prediction are obtained through a softmax layer with distillation temperature 1;
the distillation loss L_soft is calculated from the first target classification soft label, the first target bounding-box soft prediction, the second target classification soft label, and the second target bounding-box soft prediction;
the student loss L_hard is obtained from the target classification hard label and the target bounding-box hard prediction together with the real sample label and real sample bounding box of the sample data;
the objective function is L = t·L_soft + (1-t)·L_hard, where 0 ≤ t ≤ 1, L_hard is the student loss, and L_soft is the distillation loss; the knowledge distillation model is iterated to minimize this objective function, and the model after the last iteration is taken as the constructed knowledge distillation model of the local participant.
Illustratively, m is equal to n/2.
Illustratively, the first m information gain values are each greater than 0.
It should be noted that the softmax function, also called the normalized exponential function, generalizes the binary sigmoid function to multiple classes and presents the multi-class result in the form of probabilities. Softmax passes the model's predicted values through an exponential function, which guarantees that the probabilities are non-negative, and normalizes them so that the probabilities of all predicted outcomes sum to 1.
As an embodiment, please refer to fig. 6, fig. 6 illustrates a target detection method based on a knowledge distillation model according to the present invention;
In the target detection method of this embodiment, a high-performance large model guides the learning of an efficient small model, and the small model finally carries out the inference detection application, resolving the technical tension between the low inference detection efficiency of large models and the poor performance of small models.
The target detection method based on knowledge distillation comprises the following main processes:
(1) Learning based on the teacher model using sample data: the teacher model is the SwinT model, which currently performs excellently in the target detection field. A softmax layer with distillation temperature t (t ∈ [0,1]; the invention uses t = 0.5) is appended to the end of the SwinT model to predict the teacher model's soft labels, yielding the target classification soft label and the target bounding-box soft prediction. The distillation temperature t means that the teacher model's loss enters the objective function with weight t; that is, the student model draws on the teacher model's knowledge with weight t. The generalized softmax function of this layer is shown in formula (1), where z_i denotes the output of the i-th neuron.
q_i = exp(z_i/t) / Σ_j exp(z_j/t) - formula (1)
(2) Learning based on the student model using sample data: the student model is the NanoDet-m model, currently a comparatively lightweight model in the target detection field. Two branches split off at the end of the NanoDet-m model:
1. the first branch obtains the target classification soft label and target bounding-box soft prediction through a softmax layer with distillation temperature t; combining these with the teacher model's target classification soft label and target bounding-box soft prediction, formula (2) gives the distillation loss L_soft at temperature t, which represents the knowledge drawn from the teacher model with weight t when the temperature is t. Here p denotes the teacher model's soft output and q the student model's soft output.
L_soft = -Σ_j p_j(t) log(q_j(t)) - formula (2)
2. the second branch obtains the target classification hard label and target bounding-box hard prediction through a softmax layer with distillation temperature 1; combining these with the real sample label and real sample bounding box of the same data, formula (3) gives the student loss L_hard. Finally, combining L_hard with the distillation loss L_soft yields the objective function L of the algorithm proposed by the invention, shown in formula (4). Here c denotes the ground-truth value of the sample.
L_hard = -Σ_j c_j log(q_j(1)) - formula (3)
L = t·L_soft + (1-t)·L_hard - formula (4)
The model is iterated with the goal of minimizing the objective function until it converges or the number of iterations reaches 500.
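Formulas (1) to (4) translate almost directly into code. The sketch below is a minimal PyTorch rendering under the notation above (teacher soft output p at temperature t, student soft output q at temperatures t and 1, ground truth c); it illustrates the objective rather than reproducing the patent's implementation.

```python
import torch
import torch.nn.functional as F

def softmax_t(logits, t):
    # Formula (1): generalized softmax with distillation temperature t.
    return F.softmax(logits / t, dim=-1)

def distillation_objective(teacher_logits, student_logits, targets, t=0.5):
    p_t = softmax_t(teacher_logits, t)    # teacher soft labels
    q_t = softmax_t(student_logits, t)    # student soft predictions
    # Formula (2): L_soft = -sum_j p_j(t) * log(q_j(t))
    l_soft = -(p_t * q_t.log()).sum(dim=-1).mean()
    # Formula (3): L_hard = -sum_j c_j * log(q_j(1)); cross entropy at t = 1.
    l_hard = F.cross_entropy(student_logits, targets)
    # Formula (4): L = t * L_soft + (1 - t) * L_hard
    return t * l_soft + (1 - t) * l_hard

# Toy usage: 8 samples, 5 classes.
teacher_logits = torch.randn(8, 5)
student_logits = torch.randn(8, 5, requires_grad=True)
targets = torch.randint(0, 5, (8,))
loss = distillation_objective(teacher_logits, student_logits, targets)
loss.backward()
```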
Illustratively, the teacher model comprises a SwinT model and the student model comprises a NanoDet-m model.
Referring to fig. 7, fig. 7 is a schematic structural diagram of the NanoDet-m model provided by the present invention; the NanoDet-m model mainly comprises three major parts:
(1) Backbone network ShuffleNetV2: fig. 8 shows a schematic structural diagram of ShuffleNetV2 according to an embodiment of the present invention.
ShuffleNetV2 mainly comprises 5 convolutional layers, 4 max-pooling layers, 1 global pooling layer, and 1 fully connected layer. The main task of the convolutional layers is to perform local perception of the picture through convolution operations and extract its features. After each convolutional layer, to break the linear relation between input and output and enlarge the learnable space of the subsequent convolutional layers, the ReLU function, shown in formula (5), is used as the activation function to apply a nonlinear mapping to the output. Max pooling is used to compress the amount of data and the number of parameters, avoiding overfitting and improving the model's generalization ability. Finally, several output vectors are obtained through the global pooling layer and the fully connected layer. The objective function of ShuffleNetV2, with cross entropy as the loss function, is shown in formula (6), where m denotes the number of classes, n the number of samples, x a sample, y_ic indicates whether sample i belongs to the true class c, and P_ic is the predicted probability that sample i belongs to class c. ShuffleNetV2 is trained by iterating the model to minimize this objective function until the model converges or the number of iterations reaches 500.
ReLU(x) = max(0, x) - formula (5)
L = -(1/n) Σ_i Σ_c y_ic log(P_ic) - formula (6)
(2) Feature pyramid PAFPN: the number of levels of the PAFPN feature pyramid matches the number of stages of the backbone ShuffleNetV2, and the pyramid is connected to the backbone stage by stage; it is mainly responsible for extracting features from the corresponding ShuffleNetV2 feature maps to generate multi-scale feature representations and produce predicted values.
(3) Lightweight head:
The lightweight head of the NanoDet-m model is a simple convolutional neural network formed by connecting 5 convolutional layers end to end; it mainly performs the convolution operations and outputs the corresponding target labels and bounding boxes.
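As a rough illustration of such a head, the sketch below chains five convolutional layers end to end in PyTorch. The channel width and the split of the output into label scores and four bounding-box values are assumptions made for illustration, not dimensions taken from NanoDet-m.

```python
import torch
import torch.nn as nn

class LightweightHead(nn.Module):
    # Five 3x3 convolutions connected end to end; the last layer emits
    # num_classes label scores plus 4 bounding-box values per location.
    def __init__(self, in_channels=96, num_classes=80):
        super().__init__()
        layers = []
        for _ in range(4):
            layers += [nn.Conv2d(in_channels, in_channels, 3, padding=1),
                       nn.ReLU(inplace=True)]
        layers.append(nn.Conv2d(in_channels, num_classes + 4, 3, padding=1))
        self.head = nn.Sequential(*layers)

    def forward(self, feature_map):
        out = self.head(feature_map)
        cls_scores, box_preds = torch.split(
            out, [out.shape[1] - 4, 4], dim=1)
        return cls_scores, box_preds

# Toy usage on a 40x40 feature map with 96 channels.
features = torch.randn(1, 96, 40, 40)
labels, boxes = LightweightHead()(features)
```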
Referring to fig. 9, fig. 9 is a schematic structural diagram of the teacher model, namely the SwinT model, provided by an embodiment of the present invention; it mainly comprises 1 patch embedding layer, 1 linear embedding layer, 3 patch merging modules, and 4 SwinT modules:
Patch embedding layer: its main function is to split the input H×W×3 RGB color image into N non-overlapping patches of equal size P²×3. Each P²×3 patch is treated as one token, so the image is split into N tokens. The present invention uses P² = 4×4;
Linear embedding layer: linearly mapping the high-dimensional tensor onto the low-dimensional space;
Patch merging module: merges 2×2 adjacent patches, enlarging the tensor size of each token while reducing the number of tokens;
SwinT module: the SwinT module is the core component of the SwinT model and consists of several SwinT blocks; fig. 10 is a schematic structural diagram of a single SwinT block according to an embodiment of the present invention. Each SwinT block processes its input z^l as follows: z^l is linearly mapped by a LayerNorm (LN) module and fed to the SW-MSA attention module, whose output is added to the block input z^l as a residual to obtain ẑ^l; ẑ^l is then linearly mapped by a second LN module and passed through a two-layer convolutional neural network MLP with a GELU activation function sandwiched in between, and the MLP output is added to ẑ^l as a residual to give the block output z^{l+1}. As can be seen from fig. 9, the SwinT modules of stages 1, 2 and 4 each consist of 2 SwinT blocks connected end to end, while the SwinT module of stage 3 consists of 6 SwinT blocks connected end to end.
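A minimal sketch of this residual wiring follows, with the SW-MSA attention reduced to plain multi-head self-attention over flattened tokens; the window partitioning and shifting of real Swin blocks are omitted, so the sketch shows only the LN / attention / MLP residual structure described above.

```python
import torch
import torch.nn as nn

class SwinTBlockSketch(nn.Module):
    def __init__(self, dim=96, num_heads=3, mlp_ratio=4):
        super().__init__()
        self.ln1 = nn.LayerNorm(dim)
        # Stand-in for SW-MSA: plain multi-head self-attention on tokens.
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.ln2 = nn.LayerNorm(dim)
        # Two-layer MLP with a GELU activation sandwiched in between.
        self.mlp = nn.Sequential(
            nn.Linear(dim, mlp_ratio * dim),
            nn.GELU(),
            nn.Linear(mlp_ratio * dim, dim),
        )

    def forward(self, z):                  # z: (batch, tokens, dim)
        h = self.ln1(z)
        attn_out, _ = self.attn(h, h, h)
        z_hat = z + attn_out               # first residual: z + MSA(LN(z))
        return z_hat + self.mlp(self.ln2(z_hat))  # second residual

# Toy usage: one image's worth of 56x56 tokens of dimension 96.
tokens = torch.randn(1, 56 * 56, 96)
out = SwinTBlockSketch()(tokens)
```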
And step S3, detecting the data to be detected according to the inference detection model to obtain an inference result.
Optionally, detecting the data to be detected according to the inference detection model to obtain an inference result comprises:
the data to be detected is picture data to be detected; the input picture data to be detected is received, and the picture to be detected is preprocessed so that its size is consistent with that of the sample data pictures; the preprocessed picture to be detected is then input into the inference detection model to obtain the inference result.
In this embodiment, the application scenarios may include scenarios in the target detection field such as image classification, face detection, object detection, and biological detection.
As a specific embodiment, please refer to fig. 11, which shows an application scenario of the target detection system provided in an embodiment of the present invention. The inference detection step specifically comprises:
(1) inputting a picture to be detected;
(2) scaling the input picture, i.e., enlarging or shrinking it proportionally according to its aspect ratio so that its size is consistent with that of the sample data;
(3) inputting the processed picture into the inference detection model obtained by the knowledge distillation-based federal target detection method provided by the invention; through the model's computation, the final inference detection result is obtained and can be applied in the corresponding scenario.
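A minimal sketch of steps (1) to (3) follows, assuming Pillow for image handling, a 320x320 sample-picture size (the actual training resolution is not stated in the text), and padding to a square after proportional scaling (the text only requires scaling by aspect ratio); inference_model stands in for the model produced by the federated distillation procedure.

```python
from PIL import Image
import numpy as np

SAMPLE_SIZE = 320  # assumed sample-picture size; not specified in the text

def preprocess(path, size=SAMPLE_SIZE):
    # Step (2): scale proportionally (aspect ratio preserved), then pad
    # to the sample-data size so the input matches the training pictures.
    img = Image.open(path).convert("RGB")
    scale = size / max(img.width, img.height)
    resized = img.resize((round(img.width * scale), round(img.height * scale)))
    canvas = Image.new("RGB", (size, size))
    canvas.paste(resized, ((size - resized.width) // 2,
                           (size - resized.height) // 2))
    return np.asarray(canvas, dtype=np.float32) / 255.0

def detect(path, inference_model):
    # Step (3): run the inference detection model on the preprocessed picture.
    batch = preprocess(path)[None]  # add a batch dimension
    return inference_model(batch)   # target labels and bounding boxes
```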
Example four:
referring to fig. 12, fig. 12 shows a knowledge distillation-based federal target detection system according to an embodiment of the present invention, which includes:
the sample data acquisition unit is used for acquiring the picture data set as sample data;
the federated learning unit is used for performing federated learning with the sample data; specifically, it constructs a knowledge distillation model of the local participant from the sample data and sends the knowledge distillation model information to the server so that the server aggregates the knowledge distillation model information of the local participant with that of the other n-1 participants; receives first aggregation information sent by the server for aggregating the local participant with the other n-1 participants, where n is a natural number; updates the student model in the distillation model according to the first aggregation information to obtain the local participant's first aggregated student models for all participants; and obtains an inference detection model from the first aggregated student models;
and the detection unit is used for detecting the data to be detected according to the inference detection model to obtain an inference result.
Example five:
Fig. 13 is a schematic structural diagram of an electronic device provided in an embodiment of the present application; the electronic device 100 for implementing the knowledge distillation-based federal target detection method of the embodiments of the present application may be described with reference to the schematic diagram shown in fig. 13.
As shown in fig. 13, an electronic device 100 includes one or more processors 102, one or more memory devices 104, and the like, which are interconnected via a bus system and/or other type of connection mechanism (not shown). It should be noted that the components and structure of the electronic device 100 shown in fig. 13 are only exemplary and not restrictive, and the electronic device may have some of the components shown in fig. 13 and may have other components and structures not shown in fig. 13 as needed.
The processor 102 may be a Central Processing Unit (CPU) or other form of processing unit having data processing capabilities and/or instruction execution capabilities, and may control other components in the electronic device 100 to perform desired functions.
The storage 104 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, random access memory (RAM) and/or cache memory. The non-volatile memory may include, for example, read-only memory (ROM), a hard disk, or flash memory. One or more computer program instructions may be stored on the computer-readable storage medium, and the processor 102 may execute them to implement the functions of the embodiments of the application described herein and/or other desired functions. Various applications and various data, such as data used and/or generated by the applications, may also be stored in the computer-readable storage medium.
The invention also provides a computer storage medium on which a computer program is stored. If the method of the invention is implemented in the form of software functional units and sold or used as a stand-alone product, it can be stored in such a computer storage medium. Based on this understanding, all or part of the flow of the method of the embodiments may be implemented by a computer program, which can be stored in a computer storage medium and, when executed by a processor, implements the steps of the method embodiments. The computer program comprises computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer storage medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the content covered by the computer storage medium may be increased or decreased as required by legislation and patent practice in a jurisdiction; for example, in some jurisdictions, computer storage media exclude electrical carrier signals and telecommunications signals.
Various other modifications and changes may be made by those skilled in the art based on the above-described technical solutions and concepts, and all such modifications and changes should fall within the scope of the claims of the present invention.

Claims (10)

1. A knowledge distillation-based federal target detection method is applied to an intelligent terminal and comprises the following steps:
acquiring a picture data set as sample data;
and performing federated learning with the sample data, which specifically comprises the following steps: constructing a knowledge distillation model of the local participant from the sample data, and sending the knowledge distillation model information to a server so that the server aggregates the knowledge distillation model information of the local participant with that of the other n-1 participants; receiving first aggregation information sent by the server for aggregating the local participant with the other n-1 participants, where n is a natural number; updating the student model in the distillation model according to the first aggregation information to obtain the local participant's first aggregated student models for all participants; obtaining an inference detection model from the first aggregated student models;
and detecting the data to be detected according to the inference detection model to obtain an inference result.
2. The knowledge distillation-based federal target detection method as claimed in claim 1, wherein obtaining an inference detection model from the first aggregated student models comprises:
respectively calculating the information gain value of each of the local participant's first aggregated student models for all participants, sorting the information gain values by magnitude, and sending the participant information corresponding to the top m information gain values to the server, where m is a natural number and m ≤ n; the participants corresponding to the top m information gain values are the beneficial participants for the local participant;
and receiving second aggregation information sent by the server for aggregating the local participant with its beneficial participants, updating the student model of the local participant according to the second aggregation information to obtain a second aggregated student model of the local participant, and taking the second aggregated student model as the inference detection model.
3. The knowledge distillation-based federal target detection method as claimed in claim 1, wherein the construction of the knowledge distillation model of the local participant by using the sample data comprises:
the knowledge distillation model comprises a teacher model and a student model; the sample data is learned by the teacher model to obtain a first target classification soft label and a first target bounding-box soft prediction;
the sample data is learned by the student model: a second target classification soft label and a second target bounding-box soft prediction are obtained through a softmax layer with distillation temperature t, and a target classification hard label and a target bounding-box hard prediction are obtained through a softmax layer with distillation temperature 1;
the distillation loss L_soft is calculated from the first target classification soft label, the first target bounding-box soft prediction, the second target classification soft label, and the second target bounding-box soft prediction;
the student loss L_hard is obtained from the target classification hard label and the target bounding-box hard prediction together with the real sample label and real sample bounding box of the sample data;
the objective function is L = t·L_soft + (1-t)·L_hard, where 0 ≤ t ≤ 1, L_hard is the student loss, and L_soft is the distillation loss; the knowledge distillation model is iterated to minimize this objective function, and the model after the last iteration is taken as the constructed knowledge distillation model of the local participant.
4. The knowledge distillation-based federal target detection method as claimed in claim 2, wherein m is equal to n/2.
5. The knowledge distillation-based federal target detection method as claimed in claim 2, wherein the top m information gain values are all greater than 0.
6. A knowledge distillation based federal target detection method as claimed in claim 3, wherein the teacher model comprises a SwinT model and the student model comprises a NanoDet-m model.
7. The knowledge distillation-based federal target detection method as claimed in claim 1, wherein detecting the data to be detected according to the inference detection model to obtain an inference result comprises:
the data to be detected is picture data to be detected; the input picture data to be detected is received, and the picture to be detected is preprocessed so that its size is consistent with that of the sample data pictures; the preprocessed picture to be detected is then input into the inference detection model to obtain the inference result.
8. A knowledge distillation based federal target detection system, comprising:
the sample data acquisition unit is used for acquiring the picture data set as sample data;
the federated learning unit is used for performing federated learning with the sample data; specifically, it constructs a knowledge distillation model of the local participant from the sample data and sends the knowledge distillation model information to the server so that the server aggregates the knowledge distillation model information of the local participant with that of the other n-1 participants; receives first aggregation information sent by the server for aggregating the local participant with the other n-1 participants, where n is a natural number; updates the student model in the distillation model according to the first aggregation information to obtain the local participant's first aggregated student models for all participants; and obtains an inference detection model from the first aggregated student models;
and the detection unit is used for detecting the data to be detected according to the inference detection model to obtain an inference result.
9. An electronic device, comprising: at least one processor, and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the knowledge distillation based federal target detection method as defined in any of claims 1 to 7.
10. A computer storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the knowledge distillation based federal objective test method as claimed in any one of claims 1 to 7.
CN202210474634.5A 2022-04-29 2022-04-29 Knowledge distillation-based federal target detection method and system Pending CN114863092A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210474634.5A CN114863092A (en) 2022-04-29 2022-04-29 Knowledge distillation-based federal target detection method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210474634.5A CN114863092A (en) 2022-04-29 2022-04-29 Knowledge distillation-based federal target detection method and system

Publications (1)

Publication Number Publication Date
CN114863092A (en) 2022-08-05

Family

ID=82634790

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210474634.5A Pending CN114863092A (en) 2022-04-29 2022-04-29 Knowledge distillation-based federal target detection method and system

Country Status (1)

Country Link
CN (1) CN114863092A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115423540A (en) * 2022-11-04 2022-12-02 中邮消费金融有限公司 Financial model knowledge distillation method and device based on reinforcement learning
CN115423540B (en) * 2022-11-04 2023-02-03 中邮消费金融有限公司 Financial model knowledge distillation method and device based on reinforcement learning
CN117131951A (en) * 2023-02-16 2023-11-28 荣耀终端有限公司 Federal learning method and electronic equipment
CN117236421A (en) * 2023-11-14 2023-12-15 湘江实验室 Large model training method based on federal knowledge distillation
CN117236421B (en) * 2023-11-14 2024-03-12 湘江实验室 Large model training method based on federal knowledge distillation

Similar Documents

Publication Publication Date Title
CN110084281B (en) Image generation method, neural network compression method, related device and equipment
CN111191791A (en) Application method, training method, device, equipment and medium of machine learning model
CN111444878B (en) Video classification method, device and computer readable storage medium
CN109583501B (en) Method, device, equipment and medium for generating image classification and classification recognition model
CN111507993B (en) Image segmentation method, device and storage medium based on generation countermeasure network
CN114863092A (en) Knowledge distillation-based federal target detection method and system
CN111507378A (en) Method and apparatus for training image processing model
CN111275107A (en) Multi-label scene image classification method and device based on transfer learning
CN111741330A (en) Video content evaluation method and device, storage medium and computer equipment
CN111598167B (en) Small sample image identification method and system based on graph learning
CN112257841A (en) Data processing method, device and equipment in graph neural network and storage medium
WO2022111387A1 (en) Data processing method and related apparatus
CN113821668A (en) Data classification identification method, device, equipment and readable storage medium
CN114445461A (en) Visible light infrared target tracking training method and device based on non-paired data
Zhou et al. MSAR‐DefogNet: Lightweight cloud removal network for high resolution remote sensing images based on multi scale convolution
CN114627331A (en) Model training method and device
CN109345497B (en) Image fusion processing method and system based on fuzzy operator and computer program
CN114596477A (en) Foggy day train fault detection method based on field self-adaption and attention mechanism
CN114511733A (en) Fine-grained image identification method and device based on weak supervised learning and readable medium
CN117095460A (en) Self-supervision group behavior recognition method and system based on long-short time relation predictive coding
CN114155388B (en) Image recognition method and device, computer equipment and storage medium
CN111935259B (en) Method and device for determining target account set, storage medium and electronic equipment
Jin et al. Blind image quality assessment for multiple distortion image
Yi et al. DCNet: dual-cascade network for single image dehazing
Pal et al. A deep learning model to detect foggy images for vision enhancement

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination