CN115099334A - Multi-party cooperative data learning system and learning model training method - Google Patents


Info

Publication number
CN115099334A
Authority
CN
China
Prior art keywords
model
local
training
learning
client
Prior art date
Legal status
Pending
Application number
CN202210715202.9A
Other languages
Chinese (zh)
Inventor
武星
裴洁
钱权
Current Assignee
University of Shanghai for Science and Technology
Original Assignee
University of Shanghai for Science and Technology
Priority date
Filing date
Publication date
Application filed by University of Shanghai for Science and Technology
Priority to CN202210715202.9A
Publication of CN115099334A


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 - Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/70 - Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer
    • G06F21/71 - Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer, to assure secure computing or processing of information
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 - Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Security & Cryptography (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Machine Translation (AREA)

Abstract

The invention relates to a multi-party collaborative data learning system and a learning model training method that combine the advantages of active learning and federated learning. On the premise of protecting data privacy, the small amount of labeled data held by each client is used for collaborative model training. In each round of federated learning, the predicted loss values are mapped to a probability distribution from which the clients that participate in training are selected, which accelerates the convergence of the global model and reduces communication traffic. Each client loads the current global model parameters into its local model and then performs active learning: the local model guides sample querying, and high-information samples are selected and labeled. The labeled data set of each client is thereby expanded, and federated learning is carried out again to obtain a model with better performance. The accuracy and generalization performance of the model are improved while keeping the sample-labeling cost as low as possible.

Description

Multi-party cooperative data learning system and learning model training method
Technical Field
The invention relates to data information technology, and in particular to a multi-party cooperative data learning system and a learning model training method.
Background
The development of artificial intelligence depends on large amounts of data. Supported by large labeled data sets, deep learning has made great progress, and the resulting deep learning models are widely applied in production practice, for example in face recognition systems, smoke alarm systems, intelligent workshops, and automatic driving. Deep learning also has great prospects in fields such as medical treatment, finance, industry, and education. In recent years, the internet has matured and the internet of things has developed rapidly; all kinds of industries generate large amounts of data every day, and how to use these data to train deep learning models has become a practical challenge.
Using these data faces the following challenges. First, the data are characterized by large volume and low value density, and most of them need to be labeled manually or by other costly methods. The high cost of building labeled data sets has driven many researchers to look for ways to reduce labeling cost, giving rise to research directions such as active learning. Second, an excellent deep learning model cannot be obtained without being driven by a large amount of data: the amount of data collected by a single enterprise in a given field may not meet the requirements for training a deep learning model, and some of the data collected by an enterprise may not belong to the enterprise at all but to individuals or other enterprises. In the traditional centralized approach, multiple enterprises or individuals cooperate by uploading their data to the same high-performance server center for deep learning model training, which raises problems such as privacy protection, industry competition, legal restrictions, intellectual property protection, data storage, and communication. To address these issues, researchers have proposed the federated learning framework.
Disclosure of Invention
Aiming at the problem that improving the accuracy of a learning model depends on a large amount of high-quality data, a multi-party collaborative data learning system and a learning model training method are provided. The advantages of active learning and federated learning are fully combined: on the premise of protecting data privacy, the labeled data of each client are used for federated collaborative training on a classification task; the model obtained from federated training guides each participant in active learning to sample informative data; federated training is then carried out again. The sample-labeling cost is reduced as far as possible while the accuracy and generalization performance of the model are improved.
The technical scheme of the invention is as follows. A multi-party cooperative data learning system comprises a central server that hosts a global model and multiple clients, each with a local classification model. The central server issues model parameters; each client performs inference and training with its local labeled data and returns the inference and training results to the central server; the central server receives the results from the multiple clients and carries out federated training of the global model. Each client then performs active learning on its local unlabeled data according to the trained global model parameters it receives, expanding its local labeled data set, and the global model is trained again by federated learning on the labeled data sets expanded by the multiple clients.
A multi-party cooperative data learning model training method specifically comprises the following steps:
1) the central server initially generates a random global model parameter W and sends it to all clients;
2) each client receives the global model parameter W issued by the central server, loads it into the local model, performs one round of model inference on all samples of its local labeled data set D_L with the local model, records the predicted loss value of the current round, and uploads it to the central server;
3) the central server receives the set V of predicted loss values uploaded by all clients in the round, applies a linear numerical mapping to all elements of V so that they sum to 1, and uses the mapped values as a probability distribution to select the clients that participate in the next round of federated learning; the number of selected clients is half the total number of clients;
4) the selected clients train their local models with their local labeled data sets D_L and upload the updated local model parameters to the central server;
5) the central server receives the local model parameters uploaded by the selected clients, updates the global model parameter W with a model aggregation algorithm, and sends it to all clients;
6) steps 2)-5) are repeated iteratively until a given number of training rounds is reached, completing the training of the federated global model;
7) each client receives the trained global model parameters W* sent by the central server, loads them into the local model for active learning, selects the unlabeled samples with the largest information gain using the local model, and asks an expert to label them;
8) with the supplemented labeled data sets, step 6) is executed again to retrain the global model, until the labeled data sets can no longer be expanded.
Further, the predicted loss value in step 2) is calculated as:

$V_i^t = \frac{1}{n_i^2} \sum_{k=1}^{n_i} \ell(x_k, y_k; W_{local})$

where V_i^t denotes the predicted loss value of the i-th client in the t-th round of federated learning, t denotes the t-th round of federated training, i denotes the i-th client, n_i denotes the number of samples currently in the labeled data set D_L of the i-th client, l(·) denotes the loss function, x_k and y_k denote the k-th sample and its label respectively, and W_local denotes the local model parameters.
Further, the linear numerical mapping in step 3) is calculated as:

$newV_i^t = V_i^t / \sum_{j=1}^{m} V_j^t$

where newV_i^t denotes the value of V_i^t after the linear numerical mapping and $\sum_{j=1}^{m} V_j^t$ denotes the sum of the predicted loss values of all clients. The linearly mapped predicted loss values are used as a discrete probability distribution to sample clients: the higher a client's predicted loss value, the higher its probability of being sampled.
Further, the model aggregation algorithm in step 5) is formulated as:

$W = \frac{1}{h} \sum_{i=1}^{h} W_{local}^{i}$

where h is the number of currently selected clients and $W_{local}^{i}$ denotes the local model parameters uploaded by the i-th of the selected clients.
Further, the active learning method of the local model in step 7) is as follows:
7.1) two auxiliary classifiers are added to the local model architecture used in global model training; the auxiliary classifiers are connected to the backbone network of the local model, in parallel with the main classifier of the local model, forming the local active learning model;
7.2) the local active learning model is trained using the labeled data set D_L and the unlabeled data set D_U;
7.3) with the difference loss function as the objective function, training maximizes the differences among the auxiliary classifiers to obtain a tighter decision boundary, so that high-information samples are selected from the unlabeled samples and added to the labeled data set.
Furthermore, the two additional auxiliary classifiers have the same network architecture as the main classifier. Their network parameters are generated by adding random Gaussian noise to the network parameters of the main classifier, with the added noise p ~ N(0, 0.1). The feature map obtained after a data sample passes through the backbone network enters the main classifier and the auxiliary classifiers respectively, and the classifiers do not affect one another.
Further, the local active learning model in step 7.2) is trained as follows:
the backbone network together with the main classifier is denoted by θ, the backbone network alone by b, and the two auxiliary classifiers by θ_1 and θ_2; p denotes the probability distribution output by a sample passing through θ, p_1 the probability distribution output through (b, θ_1), and p_2 the probability distribution output through (b, θ_2);
A: the local active learning model is trained with the labeled data set;
A-1: the cross-entropy loss L_CE produced by inference of a sample through θ, (b, θ_1) and (b, θ_2) is calculated;
the cross-entropy loss function is:

$L_{CE} = -\sum_{c=1}^{C} \mathbb{1}[y = c] \log p_c(y \mid x)$

where C is the total number of sample classes, c denotes a sample class, 1[·] denotes the indicator function, and p_c(y|x) denotes the probability that sample x belongs to class c;
A-2: the local active learning model parameters are updated, where η is the learning rate and ∇ denotes the gradient:

$\theta \leftarrow \theta - \eta \nabla_{\theta} L_{CE}$
$\theta_1 \leftarrow \theta_1 - \eta \nabla_{\theta_1} L_{CE}$
$\theta_2 \leftarrow \theta_2 - \eta \nabla_{\theta_2} L_{CE}$

B: the auxiliary classifiers are trained with the unlabeled data set;
B-1: the difference loss L_dist produced by inference of a sample through (b, θ_1) and (b, θ_2) is calculated:

$L_{dist} = d(p_1, p_2) + d(p_1, p) + d(p_2, p)$

where d(·, ·) denotes the discrepancy between two output probability distributions;
B-2: the auxiliary classifier parameters are updated by a gradient step on L_dist, maximizing the differences among the auxiliary classifiers:

$\theta_1 \leftarrow \theta_1 + \eta \nabla_{\theta_1} L_{dist}$
$\theta_2 \leftarrow \theta_2 + \eta \nabla_{\theta_2} L_{dist}$
the invention has the beneficial effects that: the invention relates to a multi-party collaborative data learning system and a learning model training method, which fully utilize a small amount of labeled data sets of all clients through federal learning and are used for guiding the sampling of samples actively learned by a single client; in the federal learning training, the client is subjected to sampling training by using the prediction loss value as probability distribution, so that the model convergence can be accelerated and the communication traffic can be reduced; in active learning, a tighter decision boundary can be obtained by maximizing the difference between the auxiliary classifiers, so that high-information samples are effectively selected; by utilizing the advantages of active learning and federal learning, a large amount of unmarked data of each participant can be fully utilized to carry out the collaborative training of the model on the premise of protecting the data privacy.
Drawings
FIG. 1 is a schematic diagram of a multi-party collaborative data learning system according to the present invention;
FIG. 2 is a block flow diagram of a multi-party collaborative data learning model training method of the present invention;
FIG. 3 is a schematic diagram of a sample sampling strategy in the multi-party collaborative data learning system according to the present invention.
Detailed Description
The invention is described in detail below with reference to the figures and specific embodiments. The present embodiment is implemented on the premise of the technical solution of the present invention, and a detailed implementation manner and a specific operation process are given, but the scope of the present invention is not limited to the following embodiments.
As shown in the structural schematic of the multi-party collaborative data learning system in fig. 1, a local client loads the global model parameters into its local model, performs local model inference with its local labeled data, obtains a predicted loss value, and sends it back to the central server, so that the central server can select the clients that participate in the next round of federated learning. After a selected client trains its local model with its locally labeled data, it uploads the local model parameters to the central server; the central server aggregates the selected clients' model parameters to obtain an updated global model and issues the updated global model parameters to all clients for the next round of federated learning; federated training ends once the global model has converged. Each client then loads the trained global model parameters into its local model, performs active learning on its local unlabeled data with the local model, expands its local labeled data set, and federated learning is carried out again.
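For illustration only, the following is a minimal Python/PyTorch-style sketch of this overall flow. The function and parameter names (run_collaborative_learning, report_loss, select_clients, train_local, aggregate, active_learning) are hypothetical stand-ins for the steps S200-S700 described below and are not part of the patent; the per-step callables are assumed to be supplied by the caller.

```python
# Illustrative orchestration skeleton only; the per-step callables correspond to
# S200-S700 described below and are supplied by the caller.
import copy

def run_collaborative_learning(server_model, clients, fl_rounds, al_cycles,
                               report_loss, select_clients, train_local,
                               aggregate, active_learning):
    """Alternate federated-learning rounds with active-learning cycles."""
    for _ in range(al_cycles):                         # one cycle per labeled-set expansion
        for _ in range(fl_rounds):                     # federated-learning rounds (S200-S600)
            global_w = copy.deepcopy(server_model.state_dict())
            losses = [report_loss(c, global_w) for c in clients]      # S200
            chosen = select_clients(clients, losses)                   # S300
            local_ws = [train_local(c, global_w) for c in chosen]      # S400
            server_model.load_state_dict(aggregate(local_ws))          # S500
        for c in clients:                              # S700: expand each labeled set
            active_learning(c, copy.deepcopy(server_model.state_dict()))
    return server_model
```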
Fig. 2 is a flow diagram of the multi-party collaborative data learning model training method; with reference to fig. 2, the method specifically includes the following steps:
S100: the central server initially generates a random global model parameter W and sends it to all clients;
S200: each client receives the global model parameter W issued by the central server, loads it into the local model, performs one round of model inference on all samples of its local labeled data set D_L with the local model, records the predicted loss value V_i^t of the current round of federated learning, and uploads it to the central server;
specifically, the predicted loss value V_i^t is calculated as:

$V_i^t = \frac{1}{n_i^2} \sum_{k=1}^{n_i} \ell(x_k, y_k; W_{local})$

where t denotes the t-th round of federated training, i denotes the i-th client, n_i denotes the number of samples currently in the labeled data set D_L of the i-th client, l(·) denotes the loss function, x_k and y_k denote the k-th sample and its label respectively, and W_local denotes the local model parameters.
Each client locally stores a labeled data set D_L with a small number of samples and an unlabeled data set D_U with a large number of samples. The denominator of the predicted loss value is the square of the number of labeled samples for two reasons: on the one hand, it accounts for clients with more labeled data samples; on the other hand, it reduces the influence of a client whose labeled data set contains few samples but a large amount of noisy data.
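As a concrete illustration of this computation, a minimal PyTorch-style sketch is given below, assuming the loss function l is cross-entropy and that the labeled set D_L is exposed as a standard data loader of (sample, label) batches; the function name is hypothetical.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def predicted_loss_value(local_model, labeled_loader, device="cpu"):
    """V_i^t = (1 / n_i^2) * sum_k l(x_k, y_k; W_local), with l = cross-entropy here."""
    local_model.eval()
    total_loss, n_labeled = 0.0, 0
    for x, y in labeled_loader:                  # iterate the labeled set D_L once
        x, y = x.to(device), y.to(device)
        logits = local_model(x)
        total_loss += F.cross_entropy(logits, y, reduction="sum").item()
        n_labeled += y.size(0)
    return total_loss / (n_labeled ** 2)
```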
S300: the central server receives the set V of predicted loss values uploaded by all clients in the round, V = {V_1^t, V_2^t, ..., V_m^t}, where m denotes the number of clients. A linear numerical mapping is applied to all elements of V so that they sum to 1, and the mapped values are used as a probability distribution to select the clients that participate in the next round of federated learning; the number of selected clients is half the total number of clients.
Specifically, the linear numerical mapping is calculated as:

$newV_i^t = V_i^t / \sum_{j=1}^{m} V_j^t$

where newV_i^t denotes the value of V_i^t after the linear numerical mapping and $\sum_{j=1}^{m} V_j^t$ denotes the sum of the predicted loss values of all clients.
The benefit of this step is that the central server can sample clients by treating the linearly mapped predicted loss values newV_i^t as a discrete probability distribution: the higher a client's predicted loss value, the higher its probability of being sampled. First, training on the local labeled data set of a client with a higher predicted loss value is more helpful to the current global model and accelerates the convergence of the global model; second, because not all clients participate in each round of federated training, the communication volume is reduced.
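The selection step could, for example, be sketched as follows, assuming NumPy for the sampling; the function and argument names are illustrative, and sampling without replacement is one possible reading of selecting half of the clients.

```python
import numpy as np

def select_clients(client_ids, pred_losses, fraction=0.5, rng=None):
    """Map predicted losses to a probability distribution and sample a fraction of clients."""
    rng = rng or np.random.default_rng()
    v = np.asarray(pred_losses, dtype=float)
    probs = v / v.sum()                          # linear mapping: newV_i = V_i / sum_j V_j
    k = max(1, int(len(client_ids) * fraction))  # half of the clients by default
    picked = rng.choice(len(client_ids), size=k, replace=False, p=probs)
    return [client_ids[j] for j in picked]
```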
S400: the selected clients train their local models with their local labeled data sets D_L and upload the updated local model parameters W_local to the central server;
S500: the central server receives the local model parameters uploaded by the selected clients, updates the global model parameter W with a model aggregation algorithm, and sends it to all clients;
specifically, the algorithm formula of the aggregate global model is as follows:
Figure BDA0003709199840000071
wherein h is the number of currently selected clients,
Figure BDA0003709199840000072
and representing the local model parameters uploaded by the ith client in the selected clients.
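A minimal sketch of this aggregation, assuming the local model parameters are exchanged as PyTorch state dictionaries, is shown below; an unweighted average is used, matching the 1/h formula above, and the function name is illustrative.

```python
import torch

def aggregate(local_state_dicts):
    """W = (1/h) * sum_i W_local^i over the h selected clients."""
    h = len(local_state_dicts)
    global_state = {}
    for name in local_state_dicts[0]:
        stacked = torch.stack([sd[name].float() for sd in local_state_dicts])
        global_state[name] = stacked.sum(dim=0) / h   # plain average of the h parameter sets
    return global_state
```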
S600: steps S200-S500 are repeated iteratively until a given number of training rounds is reached, completing the training of the federated global model.
The benefit of this step is that repeating S200-S500 makes full use of the information in the labeled data samples that each client already has. The global model obtained by federated training performs better than a local model trained by a single client on its small number of labeled samples, and using it to guide each client's active learning allows samples that benefit model performance to be queried more effectively.
S700: each client receives the updated parameters W* of the trained global model sent by the central server and loads them into its local model for active learning; using the local model, it selects the part of the unlabeled samples with the largest information gain and asks an expert to label them. As shown in fig. 3, the active learning performed by each client in S700 includes the following steps:
S700-1: the local model architecture is modified; two auxiliary classifier modules, Classifier1 and Classifier2, identical to the main Classifier module, are added after the backbone network module to build the local active learning model;
S700-2: the local active learning model is trained using the labeled data set D_L and the unlabeled data set D_U;
S700-3: using the local active learning model, the part of the unlabeled data set D_U with the largest information gain is selected and moved to the labeled data set D_L.
Specifically, in sub-step S700-1 of step S700, the two added modules Classifier1 and Classifier2 have the same network architecture as the main Classifier module, and their network parameters are generated by adding random Gaussian noise to the network parameters of the main classifier, with noise p ~ N(0, 0.1). The feature map obtained after a data sample passes through the backbone module enters the main classifier and the auxiliary classifiers respectively, and the classifiers do not affect one another.
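A minimal sketch of constructing the two auxiliary classifiers, assuming PyTorch modules and reading N(0, 0.1) as Gaussian noise with standard deviation 0.1 (an assumption, since the notation could also denote the variance), could look as follows; the function name is illustrative.

```python
import copy
import torch

def make_auxiliary_classifiers(main_classifier, noise_std=0.1):
    """Copy the main classifier twice and perturb each copy's parameters with Gaussian noise."""
    aux1, aux2 = copy.deepcopy(main_classifier), copy.deepcopy(main_classifier)
    for aux in (aux1, aux2):
        with torch.no_grad():
            for p in aux.parameters():
                p.add_(torch.randn_like(p) * noise_std)   # assumed reading: noise ~ N(0, 0.1)
    return aux1, aux2
```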
Specifically, in sub-step S700-2 of step S700, the backbone network module together with the main classifier module is denoted by θ, the backbone network alone by b, and the two auxiliary classifier modules by θ_1 and θ_2; p denotes the probability distribution output by a sample passing through θ, p_1 the probability distribution output through (b, θ_1), and p_2 the probability distribution output through (b, θ_2). The local active learning model is trained as follows:
1. The local active learning model is trained with the labeled data set.
(1) The cross-entropy loss L_CE produced by inference of a sample through θ, (b, θ_1) and (b, θ_2) is calculated; the cross-entropy loss function is:

$L_{CE} = -\sum_{c=1}^{C} \mathbb{1}[y = c] \log p_c(y \mid x)$

where C is the total number of sample classes, c denotes a sample class, 1[·] denotes the indicator function, and p_c(y|x) denotes the probability that sample x belongs to class c.
(2) The local active learning model parameters are updated, where η is the learning rate and ∇ denotes the gradient:

$\theta \leftarrow \theta - \eta \nabla_{\theta} L_{CE}$
$\theta_1 \leftarrow \theta_1 - \eta \nabla_{\theta_1} L_{CE}$
$\theta_2 \leftarrow \theta_2 - \eta \nabla_{\theta_2} L_{CE}$
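One labeled-data training step could be sketched as follows, assuming a single optimizer covering the backbone and all three classifiers; whether the backbone also receives gradients from the auxiliary branches is an implementation choice made here for compactness, and all names are illustrative.

```python
import torch
import torch.nn.functional as F

def labeled_step(backbone, main_head, aux1, aux2, optimizer, x, y):
    """One step of part A: cross-entropy through theta, (b, theta_1) and (b, theta_2)."""
    feat = backbone(x)                                # shared feature b(x)
    loss = (F.cross_entropy(main_head(feat), y)       # L_CE through theta
            + F.cross_entropy(aux1(feat), y)          # L_CE through (b, theta_1)
            + F.cross_entropy(aux2(feat), y))         # L_CE through (b, theta_2)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()                                  # updates theta, theta_1, theta_2 with lr eta
    return loss.item()
```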
2. The auxiliary classifiers are trained with the unlabeled data set.
(1) The difference loss L_dist produced by inference of a sample through (b, θ_1) and (b, θ_2) is calculated; the difference loss function is:

$L_{dist} = d(p_1, p_2) + d(p_1, p) + d(p_2, p)$

where d(·, ·) denotes the discrepancy between two output probability distributions.
(2) The auxiliary classifier parameters are updated by a gradient step on L_dist, maximizing the differences among the auxiliary classifiers:

$\theta_1 \leftarrow \theta_1 + \eta \nabla_{\theta_1} L_{dist}$
$\theta_2 \leftarrow \theta_2 + \eta \nabla_{\theta_2} L_{dist}$
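One unlabeled-data training step could be sketched as follows. The discrepancy d(·, ·) is taken here to be the mean absolute difference between class-probability vectors, which is an assumption (the formula above only fixes the form of L_dist); the auxiliary classifiers are updated by gradient ascent on L_dist so that their differences are maximized, and aux_optimizer is assumed to cover only θ_1 and θ_2.

```python
import torch
import torch.nn.functional as F

def d(p, q):
    """Assumed discrepancy: mean absolute difference between class probabilities."""
    return (p - q).abs().mean()

def unlabeled_step(backbone, main_head, aux1, aux2, aux_optimizer, x_unlabeled):
    """One step of part B: maximize L_dist with respect to theta_1 and theta_2 only."""
    with torch.no_grad():                             # backbone and main head are not updated here
        feat = backbone(x_unlabeled)
        p = F.softmax(main_head(feat), dim=1)
    p1 = F.softmax(aux1(feat), dim=1)
    p2 = F.softmax(aux2(feat), dim=1)
    l_dist = d(p1, p2) + d(p1, p) + d(p2, p)          # L_dist as defined above
    aux_optimizer.zero_grad()
    (-l_dist).backward()                              # gradient ascent on L_dist
    aux_optimizer.step()
    return l_dist.item()
```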
the auxiliary classifier is the same as the main classifier in the model architecture, and the difference is that:
(1) the parameters of the auxiliary classifier are different, and the parameters of the auxiliary classifier are obtained by adding random Gaussian noise to the parameters of the main classifier, so the parameters of the three classifiers are different;
(2) the invention can be briefly described as three stages, namely an initial federal learning stage, a local active learning stage and a federal learning stage after local labeled data set expansion, wherein the parameters of a main classifier are obtained by initial federal learning training, and in the local active learning stage, after two auxiliary classifiers are added to a local model, the local model also needs to be trained, wherein the parameters of the main classifier are updated only when the labeled data set is used for training,
Figure BDA0003709199840000093
however, the parameters of the two auxiliary classifiers are not only updated when trained with labeled data sets:
Figure BDA0003709199840000094
Figure BDA0003709199840000095
it is also updated when training is performed with unlabeled datasets:
Figure BDA0003709199840000096
Figure BDA0003709199840000097
the auxiliary classifier updates parameters by using an unmarked data set, and aims to train the differences among the maximized auxiliary classifiers by using a difference loss function as a target function to obtain a tighter decision boundary, thereby selecting high-information samples and adding the samples into a label data set.
Specifically, in sub-step S700-3 of step S700, the unlabeled samples are sorted in descending order of F(x), and the part of the unlabeled data set D_U with the largest information gain is selected in order and moved to the labeled data set D_L, where F(x) is:

$F(x) = d(p_1(x), p_2(x))$
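The query step could be sketched as follows, again taking d(·, ·) as the mean absolute difference between the two predicted class distributions and assuming a map-style dataset of (sample, placeholder-label) pairs; budget and the function name are illustrative.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def query_samples(backbone, aux1, aux2, unlabeled_dataset, budget):
    """Rank unlabeled samples by F(x) = d(p1(x), p2(x)) and return the top-budget indices."""
    scores = []
    for i in range(len(unlabeled_dataset)):
        x = unlabeled_dataset[i][0].unsqueeze(0)          # one unlabeled sample, batch of 1
        feat = backbone(x)
        p1 = F.softmax(aux1(feat), dim=1)
        p2 = F.softmax(aux2(feat), dim=1)
        scores.append((p1 - p2).abs().mean().item())      # assumed d: mean absolute difference
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:budget]                                  # indices to send to the expert for labeling
```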
the method has the advantages that two auxiliary classifiers are added to the local model, the structure of the local model is the same as that of the main classifier, so that the method is simple to implement, the difference between the maximized auxiliary classifiers is trained by taking the difference loss function as the objective function, a tighter decision boundary can be obtained, and the high-information samples can be effectively selected.
S800: with the supplemented labeled data sets, step S600 is executed again to retrain the global model, until the labeled data sets can no longer be expanded.
The benefit of this step is that after S700 the labeled data set of each client has been further expanded, and federated training is then performed again to obtain a model with better performance.
In this embodiment, as shown in fig. 2, an active federated learning model training system implemented on the basis of the method includes a central server and a plurality of participant devices; the participant devices store large unlabeled data sets, and each participant has experts in the relevant domain to perform high-quality labeling of data samples.
The above-described embodiments express only several embodiments of the present invention, and their description is specific and detailed, but they are not to be understood as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and improvements without departing from the inventive concept, and these fall within the protection scope of the present invention. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (8)

1. A multi-party cooperative data learning system, characterized by comprising a central server that hosts a global model and multiple clients each having a local classification model, wherein the central server issues model parameters; each client performs inference and training with its local labeled data and returns the inference and training results to the central server; the central server receives the results from the multiple clients and carries out federated training of the global model; each client performs active learning on its local unlabeled data according to the trained global model parameters it receives, expanding its local labeled data set; and the global model is trained again by federated learning on the labeled data sets expanded by the multiple clients.
2. A multi-party cooperative data learning model training method, characterized by comprising the following steps:
1) the central server initially generates a random global model parameter W and sends it to all clients;
2) each client receives the global model parameter W issued by the central server, loads it into the local model, performs model inference on all samples of the local labeled data set D_L with the local model, records the predicted loss value of the current round, and uploads it to the central server;
3) the central server receives the set V of predicted loss values uploaded by all clients in the round, applies a linear numerical mapping to all elements of V so that they sum to 1, and uses the mapped values as a probability distribution to select the clients that participate in the next round of federated learning, the number of selected clients being half the total number of clients;
4) the selected clients train their local models with their local labeled data sets D_L and upload the updated local model parameters to the central server;
5) the central server receives the local model parameters uploaded by the selected clients, updates the global model parameter W with a model aggregation algorithm, and sends it to all clients;
6) steps 2)-5) are repeated iteratively until a given number of training rounds is reached, completing the training of the federated global model;
7) each client receives the trained global model parameters W* sent by the central server, loads them into the local model for active learning, selects the unlabeled samples with the largest information gain using the local model, and asks an expert to label them;
8) with the supplemented labeled data sets, step 6) is executed again to retrain the global model, until the labeled data sets can no longer be expanded.
3. The multi-party collaborative data learning model training method according to claim 2, characterized in that the predicted loss value in step 2) is calculated as:

$V_i^t = \frac{1}{n_i^2} \sum_{k=1}^{n_i} \ell(x_k, y_k; W_{local})$

where V_i^t denotes the predicted loss value of the i-th client in the t-th round of federated learning, t denotes the t-th round of federated training, i denotes the i-th client, n_i denotes the number of samples currently in the labeled data set D_L of the i-th client, l(·) denotes the loss function, x_k and y_k denote the k-th sample and its label respectively, and W_local denotes the local model parameters.
4. The multi-party collaborative data learning model training method according to claim 3, characterized in that the linear numerical mapping in step 3) is calculated as:

$newV_i^t = V_i^t / \sum_{j=1}^{m} V_j^t$

where newV_i^t denotes the value of V_i^t after the linear numerical mapping and $\sum_{j=1}^{m} V_j^t$ denotes the sum of the predicted loss values of all clients; the linearly mapped predicted loss values are used as a discrete probability distribution to sample clients, and the higher a client's predicted loss value, the higher its probability of being sampled.
5. The multi-party collaborative data learning model training method according to any one of claims 2 to 4, characterized in that the model aggregation algorithm in step 5) is formulated as:

$W = \frac{1}{h} \sum_{i=1}^{h} W_{local}^{i}$

where h is the number of currently selected clients and $W_{local}^{i}$ denotes the local model parameters uploaded by the i-th of the selected clients.
6. The multi-party collaborative data learning model training method according to claim 5, characterized in that the active learning method of the local model in step 7) is as follows:
7.1) two auxiliary classifiers are added to the local model architecture used in global model training; the auxiliary classifiers are connected to the backbone network of the local model, in parallel with the main classifier of the local model, forming the local active learning model;
7.2) the local active learning model is trained using the labeled data set D_L and the unlabeled data set D_U;
7.3) with the difference loss function as the objective function, training maximizes the differences among the auxiliary classifiers to obtain a tighter decision boundary, so that high-information samples are selected from the unlabeled samples and added to the labeled data set.
7. The multi-party collaborative data learning model training method according to claim 6, characterized in that the two added auxiliary classifiers have the same network architecture as the main classifier; their network parameters are generated by adding random Gaussian noise to the network parameters of the main classifier, with the added noise p ~ N(0, 0.1); the feature map obtained after a data sample passes through the backbone network enters the main classifier and the auxiliary classifiers respectively, and the classifiers do not affect one another.
8. The multi-party collaborative data learning model training method according to claim 7, characterized in that the local active learning model of step 7.2) is trained as follows:
the backbone network together with the main classifier is denoted by θ, the backbone network alone by b, and the two auxiliary classifiers by θ_1 and θ_2; p denotes the probability distribution output by a sample passing through θ, p_1 the probability distribution output through (b, θ_1), and p_2 the probability distribution output through (b, θ_2);
A: the local active learning model is trained with the labeled data set;
A-1: the cross-entropy loss L_CE produced by inference of a sample through θ, (b, θ_1) and (b, θ_2) is calculated;
the cross-entropy loss function is:

$L_{CE} = -\sum_{c=1}^{C} \mathbb{1}[y = c] \log p_c(y \mid x)$

where C is the total number of sample classes, c denotes a sample class, 1[·] denotes the indicator function, and p_c(y|x) denotes the probability that sample x belongs to class c;
A-2: the local active learning model parameters are updated, where η is the learning rate and ∇ denotes the gradient:

$\theta \leftarrow \theta - \eta \nabla_{\theta} L_{CE}$
$\theta_1 \leftarrow \theta_1 - \eta \nabla_{\theta_1} L_{CE}$
$\theta_2 \leftarrow \theta_2 - \eta \nabla_{\theta_2} L_{CE}$

B: the auxiliary classifiers are trained with the unlabeled data set;
B-1: the difference loss L_dist produced by inference of a sample through (b, θ_1) and (b, θ_2) is calculated:

$L_{dist} = d(p_1, p_2) + d(p_1, p) + d(p_2, p)$

where d(·, ·) denotes the discrepancy between two output probability distributions;
B-2: the auxiliary classifier parameters are updated by a gradient step on L_dist, maximizing the differences among the auxiliary classifiers:

$\theta_1 \leftarrow \theta_1 + \eta \nabla_{\theta_1} L_{dist}$
$\theta_2 \leftarrow \theta_2 + \eta \nabla_{\theta_2} L_{dist}$
CN202210715202.9A (filed 2022-06-23, priority 2022-06-23) Multi-party cooperative data learning system and learning model training method, published as CN115099334A (pending)

Priority Applications (1)

Application Number: CN202210715202.9A (CN)
Priority Date / Filing Date: 2022-06-23
Title: Multi-party cooperative data learning system and learning model training method

Publications (1)

Publication Number: CN115099334A (pending)
Publication Date: 2022-09-23
Family ID: 83292709
Country: CN



Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination