CN113961962A - Model training method and system based on privacy protection and computer equipment - Google Patents

Model training method and system based on privacy protection and computer equipment

Info

Publication number
CN113961962A
CN113961962A (application CN202111181017.8A)
Authority
CN
China
Prior art keywords
model
data
image data
sample
steganography
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111181017.8A
Other languages
Chinese (zh)
Inventor
李雪峰
梁亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Baibao Shanghai Technology Co ltd
Original Assignee
Baibao Shanghai Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Baibao Shanghai Technology Co ltd filed Critical Baibao Shanghai Technology Co ltd
Priority to CN202111181017.8A priority Critical patent/CN113961962A/en
Publication of CN113961962A publication Critical patent/CN113961962A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/62Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245Protecting personal data, e.g. for financial or medical purposes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Bioethics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a model training method, system, and computer device based on privacy protection. The model training method comprises the following steps: selecting K sample data from the data provider's own sample set, and performing steganography on the initial data among the K sample data using the remaining K-1 data to obtain steganographic data; inputting the K sample data and the steganographic data into a coding model for encoding to obtain a hidden vector, inputting the hidden vector into the model to be trained to obtain the weight parameters of the model to be trained, and sending the weight parameters to a third party; receiving the aggregation parameters fed back by the third party; updating the model to be trained according to the aggregation parameters to obtain a first updated model, and calculating the contrast loss corresponding to the first updated model; and updating the model according to the contrast loss and repeating the above steps until the contrast loss meets a preset first convergence condition, completing model training and obtaining the target model. The invention can build models for fields such as finance, medical treatment, and government affairs while guaranteeing user privacy.

Description

Model training method and system based on privacy protection and computer equipment
Technical Field
The invention relates to the technical field of information processing, in particular to a model training method and system based on privacy protection and computer equipment.
Background
AI technology is widely used in business scenarios such as finance, medical treatment, and government affairs. For example, AI techniques such as machine learning and deep learning are widely used in the medical field for image detection and recognition to assist diagnosis. The effectiveness of machine learning or deep learning depends heavily on the number of image samples: in general, the more numerous and varied the samples, the better the trained machine learning or deep learning model. However, each medical institution holds a limited amount of medical image data with a limited distribution of types, so machine learning or deep learning model training cannot be completed using the medical image data of a single institution alone. Yet directly collecting the raw image data of multiple medical institutions on a server, or uploading it to the cloud for model training, risks disclosing user privacy, fails to meet national regulatory requirements at the data layer, and conflicts with the institutions' reluctance to let other parties see their data.
Therefore, how to establish a model for the fields of finance, medical treatment, government affairs and the like on the premise of ensuring the privacy of the user is a technical problem which needs to be solved urgently by the technical personnel in the field.
Disclosure of Invention
The invention provides a privacy protection-based model training method, a system and computer equipment, which can establish models used in the fields of finance, medical treatment, government affairs and the like on the premise of ensuring the privacy of users.
The invention provides the following scheme:
in a first aspect, a model training method based on privacy protection is provided, and is applied to each data provider, and includes:
step S11: selecting K sample data from its own sample set, and performing steganography on the initial data among the K sample data using the remaining K-1 data to obtain steganographic data;
step S12: inputting the K sample data and the steganographic data into a coding model for coding to obtain a steganographic vector, inputting the steganographic vector into a model to be trained to obtain a weight parameter of the model to be trained, and sending the weight parameter to a third party;
step S13: receiving aggregation parameters fed back by the third party, wherein the aggregation parameters are obtained by aggregating the weight parameters of all the data providers by the third party;
step S14: updating the model to be trained according to the aggregation parameters to obtain a first updated model, and calculating the contrast loss corresponding to the first updated model;
step S15: judging whether the contrast loss meets a preset first convergence condition;
step S16: and if not, updating the first updated model through back propagation according to the contrast loss to obtain a second updated model, and repeating the steps S11-S15 until the contrast loss meets the preset first convergence condition, and completing model training to obtain the target model.
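The loop of steps S11-S16 can be sketched with a toy simulation. Everything below is invented for illustration and is not the patented method itself: the "model" is a single weight vector, steganography and encoding are collapsed into one local update, and the "contrast loss" is the squared distance of the aggregated weights to a fixed target.

```python
import numpy as np

rng = np.random.default_rng(0)
target = np.array([1.0, -1.0])                # pretend-optimal weights

def local_round(w):
    # S11-S12 stand-in: local training nudges the weights toward the target
    return w - 0.5 * (w - target)

w_a, w_b = rng.normal(size=2), rng.normal(size=2)  # two simulated providers
loss = float("inf")
for _ in range(50):                                # repeat S11-S15
    w_a, w_b = local_round(w_a), local_round(w_b)  # S11-S12: local updates
    agg = (w_a + w_b) / 2.0                        # S13: third-party average
    w_a = w_b = agg                                # S14: apply aggregation
    loss = float(np.sum((agg - target) ** 2))      # S14: "contrast loss"
    if loss < 1e-6:                                # S15: convergence check
        break                                      # S16: otherwise, next round
```

Because each local round halves the distance to the target, the loop converges well before the iteration cap.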
Optionally, the model training method further includes:
step S17: inputting the hidden vector into a decoding model, and calculating a reconstruction error of encoding-decoding according to an output parameter of the decoding model;
step S15 includes:
step S151: judging whether the sum of the contrast loss and the reconstruction error meets a preset second convergence condition or not;
step S16 includes:
step S161: and if the sum of the contrast loss and the reconstruction error does not meet the preset second convergence condition, respectively updating the first updating model and the coding model through back propagation according to the contrast loss and the reconstruction error, and repeating the steps S11-S15 until model training is completed when the sum of the contrast loss and the reconstruction error meets the preset second convergence condition to obtain the target model.
Optionally, the step S11 includes:
combining the initial image data among the K sample image data with the K-1 other image data to obtain new image data; K is a positive integer greater than 1;
the step S12 includes:
and inputting the new image data and the K sample image data into the coding model for coding to obtain the hidden vector.
Optionally, the combining an initial image data of the K sample image data with K-1 other image data to obtain new image data includes:
calculating the initial image data and the K-1 other image data according to a preset formula to obtain steganographic image data;
and turning and/or rotating the steganographic image data to obtain new image data.
Optionally, the preset formula is:

$$I_{new} = I_{mask} \odot \sum_{i=1}^{K} W_i\, I_i$$

where $\odot$ denotes the element-wise product, $I_{new}$ is the matrix of the new image data, $I_{mask}$ is a mask matrix of the same size as the initial image data whose elements each take the value 1 or -1, $W_i$ is the weight of the ith image data among the K sample image data and takes a value in [0, 1] with

$$\sum_{i=1}^{K} W_i = 1,$$

and $I_i$ is the matrix of the ith image data among the K sample image data; the initial image data has the largest weight.
Optionally, the reconstruction error is calculated according to the following formula:

$$L_R = \frac{1}{N} \sum_{n=1}^{N} \left\lVert I_M^{(n)} - I_R^{(n)} \right\rVert^2$$

where $L_R$ is the reconstruction error, N is the number of image data in the sample set, $I_M^{(n)}$ is the matrix of the nth initial image data, and $I_R^{(n)}$ is the matrix of the nth reconstructed image data.
Optionally, the contrast loss is calculated according to the following formula:

$$L_C = \frac{1}{2N} \sum_{i<j} \left[ y\, d_{ij}^2 + (1 - y) \max(margin - d_{ij},\ 0)^2 \right], \qquad d_{ij} = \left\lVert h_i - h_j \right\rVert_2$$

where $L_C$ is the contrast loss, $d_{ij}$ is the Euclidean distance between the hidden vector $h_i$ of the ith image data and the hidden vector $h_j$ of the jth image data, y = 1 when the labels of the ith and jth image data match and y = 0 when they do not, margin is a set threshold, and N is the number of image data in the sample set.
In a second aspect, a model training system based on privacy protection is further provided, including:
the steganography module is used for selecting K sample data from its own sample set, and performing steganography on the initial data among the K sample data using the remaining K-1 data to obtain steganographic data;
the comparison module is connected with the steganography module and used for inputting the K sample data and the steganography data into a coding model for coding to obtain a steganography vector, inputting the steganography vector into a model to be trained to obtain a weight parameter of the model to be trained, sending the weight parameter to a third party, receiving an aggregation parameter fed back by the third party, and aggregating the weight parameters of all the data providers by the third party to obtain the aggregation parameter;
the training module is connected with the comparison module and used for updating the model to be trained according to the aggregation parameters to obtain a first updated model, calculating the comparison loss corresponding to the first updated model, judging whether the comparison loss meets a preset first convergence condition or not, and if not, updating the first updated model through back propagation according to the comparison loss to obtain a second updated model;
and the steganography module, the comparison module, and the training module repeat these actions until the contrast loss meets the preset first convergence condition, at which point model training is complete and the target model is obtained.
In a third aspect, a computer device is provided, which includes a memory and a processor, where the memory stores a computer program that is executable on the processor, and when the computer program is executed by the processor, the method for model training based on privacy protection is implemented.
In a fourth aspect, a computer-readable storage medium is provided, in which a computer program is stored, and when the computer program is executed, the method for model training based on privacy protection is implemented.
According to the specific embodiment provided by the invention, the invention discloses the following technical effects:
the model training method based on privacy protection provided by the invention relates to a plurality of data providers and third parties, wherein the data providers are medical institutions. When a model is constructed, each data provider firstly performs steganography on sample data in a sample library of the data provider to protect original data, performs model training by using the steganography data, and then performs back propagation updating on a model to be trained through comparison learning. Specifically, each data provider inputs steganographically-written data into a coding model to obtain a hidden vector, the hidden vector is input into a model to be trained, then weight parameters of the model to be trained are sent to a third party, the third party aggregates the weight parameters provided by all the data providers and feeds back the aggregated parameters to each data provider, each data provider updates the respective model to be trained according to the aggregated parameters, calculates the comparison loss of the updated model to be trained, then judges whether the comparison loss meets a first convergence condition or not, if yes, the model to be trained is shown to be converged, otherwise, the model to be trained is updated again, and a model iteration process is completed. And circularly performing the processes until the model to be trained converges. Therefore, the original data is subjected to privacy protection in a data steganography mode, the safety of sample data of a plurality of data providers is guaranteed, other data providers and third parties cannot reversely deduce the original data participating in modeling, and the identification effect of the trained model is improved by comparing and learning the reverse update model.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings without creative efforts.
FIG. 1 is a block flow diagram of a model training method based on privacy protection according to an embodiment of the present invention;
FIG. 2 is a block flow diagram of a model training method based on privacy protection according to another embodiment of the present invention;
FIG. 3 is a block diagram of a model training system based on privacy protection according to an embodiment of the present invention;
fig. 4 is an architecture diagram of a computer device provided by an embodiment of the invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments that can be derived by one of ordinary skill in the art from the embodiments given herein are intended to be within the scope of the present invention.
The following describes a specific implementation scheme provided by the embodiment of the present invention in detail.
The invention provides a model training method, system, and computer device based on privacy protection. It is essentially a federated learning method based on data steganography and contrastive learning, involving two roles: data providers and a third party, where the third party is a trusted authority or platform. The method supports secure modeling by multiple data providers; neither the other data providers nor the third party can reverse-deduce the original data used in modeling, so models for fields such as finance, medical treatment, and government affairs can be built while guaranteeing user privacy, and contrastive learning further improves the models' recognition performance.
Example one
The embodiment provides a model training method based on privacy protection, which is applied to each data provider. Fig. 1 is a flow chart of a model training method based on privacy protection according to an embodiment of the present invention. As shown in FIG. 1, the model training method generally includes the following steps:
step S11: selecting K sample data from a sample set of the user, and performing steganography on initial data in the K sample data by using the rest K-1 other data to obtain steganography data;
the data provider makes the data of the data provider into a sample set for training a model, and performs steganography operation on sample data to protect the privacy of the sample data in the sample set. Taking a data provider as an example of a medical institution, the sample set is composed of data of a plurality of images.
Step S12: inputting the K sample data and the steganographic data into a coding model for coding to obtain a steganographic vector, inputting the steganographic vector into a model to be trained to obtain a weight parameter of the model to be trained, and sending the weight parameter to a third party;
after steganography operation is executed, inputting K sample data and steganography data into a coding model for coding to obtain data characteristics, and then expressing the data characteristics in a form of hidden vectors so that the sample data subjected to steganography can be recognized and used by a subsequent model to be trained.
Step S13: receiving aggregation parameters fed back by the third party, wherein the aggregation parameters are obtained by aggregating the weight parameters of all the data providers by the third party;
the third party may be a plurality of the third parties or one third party. All the data providers send the weight parameters of the models to be trained to the third party, the third party aggregates all the weight parameters according to a set rule to obtain aggregation parameters, and then the aggregation parameters are fed back to all the data providers. Since the data is subjected to steganography and encoding, each data provider cannot know the weight of the other data providers except the data provider for steganography and encoding of the data, the data of the other data providers cannot be reversely deduced, and similarly, a third party cannot know the weight of the data provider for steganography and encoding of the data, so that the privacy of the data is ensured. The predetermined rule is, for example, an average value.
Step S14: updating the model to be trained according to the aggregation parameters to obtain a first updated model, and calculating the contrast loss corresponding to the first updated model;
after the model to be trained is updated according to the aggregation parameters, the performance of the model needs to be tested, and the performance of the model is judged by comparing loss.
Step S15: judging whether the contrast loss meets a preset first convergence condition;
the first convergence condition may be set as needed.
Step S16: and if not, updating the first updated model through back propagation according to the contrast loss to obtain a second updated model, and repeating the steps S11-S15 until the contrast loss meets the preset first convergence condition, and completing model training to obtain the target model.
During the above loop, several updates may be needed to the model to make the contrast loss satisfy the preset first convergence condition.
In summary, the model training method based on privacy protection provided by this embodiment involves a plurality of data providers and a third party, where the data providers are, for example, medical institutions. When a model is constructed, each data provider first steganographizes the sample data in its own sample library to protect the original data, trains the model using the steganographic data, and then updates the trained model through back propagation driven by contrastive learning. Specifically, each data provider inputs the steganographic data and the K sample data into a coding model to obtain a hidden vector and feeds the hidden vector into the model to be trained; the weight parameters of the model to be trained are then sent to a third party, which aggregates the weight parameters from all data providers and feeds the aggregation parameters back to each of them. Each data provider updates its own model according to the aggregation parameters, calculates the contrast loss of the updated model, and judges whether the contrast loss meets the first convergence condition; if it does, the model has converged, otherwise the model is updated again, completing one model iteration. This process repeats until the model converges. The original data is thus privacy-protected through data steganography, the data of the multiple providers stays secure, neither other data providers nor the third party can reverse-deduce the original data used in modeling, and contrastive learning improves the recognition performance of the trained model.
It should be particularly noted that the model training method based on privacy protection provided by the invention can be applied to other fields such as finance, government affairs and the like, as well as the medical field.
Fig. 2 is a flowchart of a model training method based on privacy protection according to another embodiment of the present invention. As shown in fig. 2, specifically, the model training method further includes:
step S17: inputting the hidden vector into a decoding model, and calculating a reconstruction error of encoding-decoding according to an output parameter of the decoding model;
the performance of the coding model can be judged by the reconstruction error.
Step S15 includes:
step S151: judging whether the sum of the contrast loss and the reconstruction error meets a preset second convergence condition or not;
the second convergence condition may be set as needed.
Step S16 includes:
step S161: and if the sum of the contrast loss and the reconstruction error does not meet the preset second convergence condition, respectively updating the first updating model and the coding model through back propagation according to the contrast loss and the reconstruction error, and repeating the steps S11-S15 until the sum of the contrast loss and the reconstruction error meets the preset second convergence condition, and completing model training to obtain the target model.
The judgment on the contrast loss improves the prediction performance of the target model, while the judgment on the reconstruction error ensures that the encoded training data remains accurate, which further improves that performance. Specifically, the reconstruction error first ensures that during back propagation the coding model can update its weights using the parameters aggregated by the third party from every data provider; second, the updated encoder weights act on the forward pass of the coding model in the next iteration, so the hidden vector covers not only the current provider's data but also that of the other providers. The hidden vector can therefore be fused with the steganographic data of the other providers to further encode the local sample data without revealing any provider's initial data.
In some specific scenarios, for example in the medical field, a prediction must be made from a detected image; in such applications the sample set comprises image data and the model to be trained comprises a classification network. The contrast loss described in the above examples is calculated according to the following formula:

$$L_C = \frac{1}{2N} \sum_{i<j} \left[ y\, d_{ij}^2 + (1 - y) \max(margin - d_{ij},\ 0)^2 \right], \qquad d_{ij} = \left\lVert h_i - h_j \right\rVert_2$$

where $L_C$ is the contrast loss, $d_{ij}$ is the Euclidean distance between the hidden vector $h_i$ of the ith image data and the hidden vector $h_j$ of the jth image data, y = 1 when the labels of the ith and jth image data match and y = 0 when they do not, margin is a set threshold, N is the number of image data in the sample set, and N is greater than or equal to K. Each update requires C(K+1, 2) contrast-loss computations, where C(K+1, 2) means that each computation takes two of the K+1 image data.
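A direct implementation of this pairwise contrast loss might look like the following sketch (the function name and the exact placement of the 1/(2N) factor over all pairs are assumptions):

```python
import numpy as np

def contrastive_loss(hidden, labels, margin=1.0):
    """Pairwise contrast loss: y*d^2 for matching labels,
    (1-y)*max(margin-d, 0)^2 for mismatched ones, summed over all
    pairs i<j and scaled by 1/(2N), with d the Euclidean distance."""
    n = len(hidden)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            d = np.linalg.norm(hidden[i] - hidden[j])
            y = 1.0 if labels[i] == labels[j] else 0.0
            total += y * d ** 2 + (1.0 - y) * max(margin - d, 0.0) ** 2
    return total / (2.0 * n)

h = [np.array([0.0, 0.0]), np.array([3.0, 4.0])]   # d = 5 for this pair
same = contrastive_loss(h, [1, 1])    # matching labels: 5^2 / (2*2) = 6.25
diff = contrastive_loss(h, [1, 0])    # mismatched and d > margin: loss 0
```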
In the medical field, the reconstruction error is calculated according to the following formula:

$$L_R = \frac{1}{N} \sum_{n=1}^{N} \left\lVert I_M^{(n)} - I_R^{(n)} \right\rVert^2$$

where $L_R$ is the reconstruction error, N is the number of image data in the sample set, $I_M^{(n)}$ is the matrix of the nth initial image data, and $I_R^{(n)}$ is the matrix of the nth reconstructed image data.
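The reconstruction error can be sketched as a mean squared difference between the initial and reconstructed image matrices (averaging per pixel as well as per image is an assumption):

```python
import numpy as np

def reconstruction_error(originals, reconstructions):
    """L_R: mean squared pixel difference between each initial image matrix
    and its decoder reconstruction, averaged over the N images."""
    n = len(originals)
    return sum(float(np.mean((im - ir) ** 2))
               for im, ir in zip(originals, reconstructions)) / n

orig = [np.zeros((2, 2)), np.ones((2, 2))]
recon = [np.zeros((2, 2)), np.zeros((2, 2))]   # second image fully wrong
err = reconstruction_error(orig, recon)        # (0 + 1) / 2 = 0.5
```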
Preferably, the encoder weights are shared between decoding and contrastive learning, which ensures that the hidden vector produced by the coding model chiefly captures the image data with the largest weight and that effective classification features are extracted.
Specifically, the step S11 includes:
combining the initial image data among the K sample image data with the K-1 other image data to obtain new image data, where K is a positive integer greater than 1. The combination of image data may be random. The image data is, for example, image data from the medical field.
The step S12 includes:
and inputting the new image data and the K sample image data into the coding model for coding to obtain the hidden vector.
Specifically, combining an initial image data of the K sample image data with the K-1 other image data to obtain new image data includes:
calculating the initial image data and the K-1 other image data according to a preset formula to obtain steganographic image data;
and flipping and/or rotating the steganographic image data to obtain the new image data.
The flipping and/or rotating can be realized by inputting the steganographic image data into a flip model, in which a random value of plus or minus 1 is applied at each pixel position of the steganographic image data.
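One way the flip model could be sketched is below; the specific random flip/rotation choices and the interpretation of "randomly plus or minus 1 according to pixel positions" as a per-pixel ±1 perturbation are assumptions:

```python
import numpy as np

def flip_model(img, rng):
    """Randomly flip and/or rotate the steganographic image, then add a
    random +1 or -1 at every pixel position."""
    if rng.random() < 0.5:
        img = np.fliplr(img)       # random horizontal flip
    if rng.random() < 0.5:
        img = np.flipud(img)       # random vertical flip
    img = np.rot90(img, k=int(rng.integers(0, 4)))  # random 90-degree turns
    return img + rng.choice([-1, 1], size=img.shape)

out = flip_model(np.zeros((4, 4)), np.random.default_rng(0))
# every pixel of an all-zero input ends up at exactly +1 or -1
```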
More specifically, the preset formula is:

$$I_{new} = I_{mask} \odot \sum_{i=1}^{K} W_i\, I_i$$

where $\odot$ denotes the element-wise product, $I_{new}$ is the matrix of the new image data, $I_{mask}$ is a mask matrix of the same size as the initial image data whose elements each take the value 1 or -1, $W_i$ is the weight of the ith image data among the K sample image data and takes a value in [0, 1] with

$$\sum_{i=1}^{K} W_i = 1,$$

and $I_i$ is the matrix of the ith image data among the K sample image data; the initial image data has the largest weight.
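A minimal sketch of the mixing formula follows; the concrete weights (0.6 for the initial image, so that it is the largest, as required) are illustrative, and the ±1 mask is drawn at random:

```python
import numpy as np

def steganographic_mix(images, weights, rng):
    """I_new = I_mask * sum_i(W_i * I_i), where I_mask is a random matrix of
    +1/-1 entries the same size as the images and the weights sum to 1."""
    assert abs(sum(weights) - 1.0) < 1e-9
    mask = rng.choice([-1, 1], size=images[0].shape)
    return mask * sum(w * im for w, im in zip(weights, images))

rng = np.random.default_rng(0)
imgs = [np.full((2, 2), v) for v in (10.0, 20.0, 30.0)]   # K = 3 toy images
new = steganographic_mix(imgs, [0.6, 0.2, 0.2], rng)
# |new| equals the weighted sum 0.6*10 + 0.2*20 + 0.2*30 = 16 at every pixel
```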
In a specific embodiment, the model to be trained is a classification network, and the output weight parameters are the weight parameters of the fully-connected layer.
Example two
Corresponding to the above model training method based on privacy protection, the second embodiment provides a model training system based on privacy protection. Fig. 3 is a block diagram of a model training system based on privacy protection according to an embodiment of the present invention. As shown in Fig. 3, the model training system generally includes a steganography module 10, a comparison module 20, and a training module 30.
The steganography module 10 is configured to select K sample data from its own sample set and perform steganography on the initial data among the K sample data using the remaining K-1 data to obtain steganographic data. The comparison module 20 is connected to the steganography module 10 and configured to input the K sample data and the steganographic data into a coding model for encoding to obtain a hidden vector, input the hidden vector into the model to be trained to obtain its weight parameters, send the weight parameters to a third party, and receive the aggregation parameters fed back by the third party, which the third party obtains by aggregating the weight parameters of all data providers. The training module 30 is connected to the comparison module 20 and configured to update the model to be trained according to the aggregation parameters to obtain a first updated model, calculate the contrast loss of the first updated model, judge whether the contrast loss meets a preset first convergence condition, and, if not, update the first updated model through back propagation according to the contrast loss to obtain a second updated model. The steganography module 10, the comparison module 20, and the training module 30 repeat these actions until the contrast loss meets the preset first convergence condition, at which point model training is complete and the target model is obtained.
The model training system based on privacy protection provided by the invention is integrated in a data provider; each data provider interacts with the third party to complete its own model training, and the data providers are, for example, medical institutions. When a model is constructed, each data provider first steganographizes the sample data in its own sample library to protect the original data, trains the model using the steganographic data, and then updates the trained model through back propagation driven by contrastive learning. Specifically, each data provider inputs the steganographic data into a coding model to obtain a hidden vector and feeds the hidden vector into the model to be trained; the weight parameters of the model to be trained are then sent to a third party, which aggregates the weight parameters from all data providers and feeds the aggregation parameters back to each of them. Each data provider updates its own model according to the aggregation parameters, calculates the contrast loss of the updated model, and judges whether the contrast loss meets the first convergence condition; if it does, the model has converged, otherwise the model is updated again, completing one model iteration. This process repeats until the model converges. The original sample data is thus privacy-protected through data steganography, the sample data of the multiple providers stays secure, neither other data providers nor the third party can reverse-deduce the sample data used in modeling, and contrastive learning improves the recognition performance of the trained model.
For the parts of the second embodiment that are not described in detail, reference may be made to the first embodiment.
Example Three
The third embodiment provides a privacy-protection-based prediction method, applied to each data provider, where the data provider is also the party to be predicted. The prediction method includes: inputting data to be predicted into the target model to obtain target data. For example, when the data provider is a medical institution, the data to be predicted may be image data; the medical institution inputs the image data to be predicted into the target model, and the target model outputs the target data after prediction.
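A sketch of this prediction step, assuming the trained target model is available as a callable (the model below is a hypothetical stand-in with a made-up decision rule, not the architecture trained by the patent's method):

```python
import numpy as np

def target_model(image_data):
    """Hypothetical stand-in for the trained target model: any callable
    mapping image data to target data plays the same role here."""
    return "abnormal" if image_data.mean() > 0.5 else "normal"

image_to_predict = np.full((4, 4), 0.8)       # image data to be predicted
target_data = target_model(image_to_predict)  # prediction result
```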
For the parts of the third embodiment that are not described in detail, reference may be made to the descriptions in the foregoing embodiments, which are not described herein again.
Example Four
Corresponding to the above method, the present invention further provides a computer device, comprising a processor and a memory, wherein the memory stores a computer program executable on the processor, and the computer program, when executed by the processor, performs the privacy-protection-based model training method provided by any one of the above embodiments.
Fig. 4 illustratively shows a computer device, which may specifically include a processor 1510, a video display adapter 1511, a disk drive 1512, an input/output interface 1513, a network interface 1514, and a memory 1520. The processor 1510, the video display adapter 1511, the disk drive 1512, the input/output interface 1513, the network interface 1514, and the memory 1520 may be communicatively coupled via a communication bus 1530.
The processor 1510 may be implemented by a general-purpose CPU (central processing unit), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits, and is configured to execute related programs to implement the technical solutions provided by the present invention.
The memory 1520 may be implemented in the form of a ROM (read-only memory), a RAM (random access memory), a static storage device, a dynamic storage device, or the like. The memory 1520 may store an operating system 1521 for controlling the operation of the electronic device, and a basic input/output system (BIOS) for controlling low-level operations of the electronic device. In addition, a web browser 1523, a data storage management system 1524, a device identification information processing system 1525, and the like may also be stored. The device identification information processing system 1525 may be an application program that implements the operations of the foregoing steps of the embodiments of the present invention. In short, when the technical solutions provided by the present invention are implemented in software or firmware, the relevant program code is stored in the memory 1520 and called by the processor 1510 for execution.
The input/output interface 1513 is used to connect an input/output module to realize information input and output. The input/output module may be configured as a component in the device (not shown) or may be externally connected to the device to provide a corresponding function. The input device may include a keyboard, a mouse, a touch screen, a microphone, various sensors, and the like, and the output device may include a display, a speaker, a vibrator, an indicator light, and the like.
The network interface 1514 is used to connect a communication module (not shown) to enable the device to interact with other devices. The communication module may communicate in a wired manner (e.g., USB or network cable) or in a wireless manner (e.g., mobile network, Wi-Fi, or Bluetooth).
The bus includes a path that transfers information between the various components of the device, such as the processor 1510, the video display adapter 1511, the disk drive 1512, the input/output interface 1513, the network interface 1514, and the memory 1520.
It should be noted that although the above device shows only the processor 1510, the video display adapter 1511, the disk drive 1512, the input/output interface 1513, the network interface 1514, the memory 1520, the bus, and so on, in specific implementations the device may further include other components necessary for normal operation. Furthermore, those skilled in the art will understand that the device described above may also include only the components necessary to implement the solution of the present invention, and need not include all of the components shown in the figures.
Example Five
The invention further provides a computer-readable storage medium, in which a computer program is stored, and when the computer program is executed, the model training method based on privacy protection provided by any one of the above embodiments is implemented.
From the above description of the embodiments, it is clear to those skilled in the art that the present invention can be implemented by software plus a necessary general-purpose hardware platform. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which may be stored in a storage medium such as a ROM/RAM, a magnetic disk, or an optical disk, and which includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the embodiments or in certain parts of the embodiments.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, the system or system embodiments are substantially similar to the method embodiments and therefore are described in a relatively simple manner, and reference may be made to some of the descriptions of the method embodiments for related points. The above-described system and system embodiments are only illustrative, wherein the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
The technical solutions provided by the present invention are described in detail above. The principles and embodiments of the present invention are explained herein using specific examples, which serve only to aid in understanding the method and its core ideas. Meanwhile, for those skilled in the art, the specific embodiments and the scope of application may vary according to the ideas of the present invention. In view of the above, the contents of this specification should not be construed as limiting the invention.

Claims (10)

1. A model training method based on privacy protection is applied to each data provider and is characterized by comprising the following steps:
step S11: selecting K sample data from the data provider's own sample set, and performing steganography on initial data in the K sample data by using the remaining K-1 other data to obtain steganographic data;
step S12: inputting the K sample data and the steganographic data into a coding model for coding to obtain a hidden vector, inputting the hidden vector into a model to be trained to obtain a weight parameter of the model to be trained, and sending the weight parameter to a third party;
step S13: receiving aggregation parameters fed back by the third party, wherein the aggregation parameters are obtained by aggregating the weight parameters of all the data providers by the third party;
step S14: updating the model to be trained according to the aggregation parameters to obtain a first updated model, and calculating the contrast loss corresponding to the first updated model;
step S15: judging whether the contrast loss meets a preset first convergence condition or not;
step S16: and if not, updating the first updated model through back propagation according to the contrast loss to obtain a second updated model, and repeating the steps S11-S15 until the contrast loss meets the preset first convergence condition, and completing model training to obtain the target model.
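The control flow of steps S11-S16 can be sketched as follows for a single data provider. Every function body here is a toy stand-in (the actual steganography, coding model, aggregation, and loss are defined elsewhere in this document), so only the loop structure carries meaning:

```python
import numpy as np

rng = np.random.default_rng(0)

def steganographize(samples):              # S11: stand-in steganography
    return samples.mean(axis=0)

def encode(samples, stego):                # S12: stand-in coding model
    return np.concatenate([samples.ravel(), stego.ravel()])

def contrast_loss(w):                      # S14: toy convex surrogate loss
    return float(np.sum(w ** 2))

w = rng.normal(size=4)                     # weight parameter of the model to be trained
for _ in range(100):                       # repeat S11-S15
    samples = rng.normal(size=(3, 2))      # S11: select K = 3 sample data
    hidden = encode(samples, steganographize(samples))  # S12: hidden vector
    agg = w                                # S13: with one provider, aggregation is the identity
    loss = contrast_loss(agg)              # S14: loss of the first updated model
    if loss < 1e-4:                        # S15: first convergence condition met
        break
    w = agg - 0.2 * (2 * agg)              # S16: back-propagation step on the toy loss
```

With the surrogate loss above, each S16 step scales the weights by 0.6, so the loop exits through the S15 condition after a handful of iterations.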
2. The model training method of claim 1, further comprising:
step S17: inputting the hidden vector into a decoding model, and calculating a reconstruction error of encoding-decoding according to an output parameter of the decoding model;
step S15 includes:
step S151: judging whether the sum of the contrast loss and the reconstruction error meets a preset second convergence condition or not;
step S16 includes:
step S161: and if the sum of the contrast loss and the reconstruction error does not meet the preset second convergence condition, respectively updating the first updating model and the coding model through back propagation according to the contrast loss and the reconstruction error, and repeating the steps S11-S15 until model training is completed when the sum of the contrast loss and the reconstruction error meets the preset second convergence condition to obtain the target model.
3. The model training method according to claim 1, wherein the step S11 includes:
combining initial image data in the K sample image data with K-1 other image data to obtain new image data; k is a positive integer greater than 1;
the step S12 includes:
and inputting the new image data and the K sample image data into the coding model for coding to obtain the hidden vector.
4. The model training method of claim 3, wherein the combining an initial image data of the K sample image data with K-1 other image data to obtain a new image data comprises:
calculating the initial image data and the K-1 other image data according to a preset formula to obtain steganographic image data;
and turning and/or rotating the steganographic image data to obtain new image data.
5. The model training method of claim 4, wherein the predetermined formula is:
$$I_{new} = I_{mask} \odot \sum_{i=1}^{K} W_i I_i$$

where $I_{new}$ is the matrix of the new image data, $I_{mask}$ is a mask matrix of the same size as the initial image data in which each element takes the value 1 or -1, and $W_i$ is the weight of the i-th image data among the K sample image data, taking a value in [0, 1] and satisfying

$$\sum_{i=1}^{K} W_i = 1$$

$I_i$ is the matrix of the i-th image data among the K sample image data, and the weight of the initial image data is the largest.
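A numeric sketch of the steganographic combination described in this claim. The element-wise product with the ±1 mask and the specific weight values are assumptions recovered from the variable definitions, not values given by the patent:

```python
import numpy as np

rng = np.random.default_rng(42)
K = 3
# I_1 is the initial image; I_2, I_3 are the K-1 other images.
images = [np.full((2, 2), float(i + 1)) for i in range(K)]
# I_mask: same size as the initial image, every element 1 or -1.
mask = np.where(rng.random((2, 2)) < 0.5, -1.0, 1.0)
# W_i in [0, 1], summing to 1, with the initial image weighted most.
weights = np.array([0.6, 0.2, 0.2])

blended = sum(w * img for w, img in zip(weights, images))
i_new = mask * blended   # I_new = I_mask ⊙ Σ W_i · I_i
```

Per the dependent claim, `i_new` would then be flipped and/or rotated, e.g. with `np.flip(i_new, axis=0)` or `np.rot90(i_new)`, to give the new image data.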
6. The model training method of claim 3, wherein the reconstruction error is calculated according to the following formula:
$$L_R = \frac{1}{N} \sum_{i=1}^{N} \left\| I_M^{(i)} - I_R^{(i)} \right\|^2$$

where $L_R$ is the reconstruction error, N is the number of initial image data in the sample set, $I_M^{(i)}$ is the matrix of the i-th initial image data, and $I_R^{(i)}$ is the matrix of the corresponding reconstructed image data.
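A sketch of the reconstruction-error calculation described in this claim; the averaged squared-L2 form is an assumption consistent with the stated variables:

```python
import numpy as np

def reconstruction_error(initial_images, reconstructed_images):
    """L_R: average squared difference between each of the N initial
    images I_M and its decoded reconstruction I_R."""
    n = len(initial_images)
    return sum(float(np.sum((m - r) ** 2))
               for m, r in zip(initial_images, reconstructed_images)) / n

originals = [np.zeros((2, 2)), np.ones((2, 2))]
decoded = [np.ones((2, 2)), np.ones((2, 2))]
l_r = reconstruction_error(originals, decoded)
```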
7. The model training method of claim 3, wherein the contrast loss is calculated according to the following formula:
$$L_C = \frac{1}{2N} \sum \left[ y\, d^2 + (1 - y)\, \max(margin - d,\ 0)^2 \right]$$

where $L_C$ is the contrast loss; $d = \left\| h_i - h_j \right\|_2$ is the Euclidean distance between the hidden vector $h_i$ of the i-th image data and the hidden vector $h_j$ of the j-th image data; y = 1 when the labels of the i-th image data and the j-th image data match, and y = 0 when they do not; margin is a set threshold; and N is the number of initial image data in the sample set.
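The variables in this claim match the standard pairwise contrastive loss, which a sketch can make concrete; how the N samples are paired is an assumption here:

```python
import numpy as np

def contrast_loss(pairs, margin=1.0):
    """L_C over a list of (h_i, h_j, y) pairs, where y = 1 when the
    labels of the i-th and j-th image data match and y = 0 otherwise."""
    n = len(pairs)
    total = 0.0
    for h_i, h_j, y in pairs:
        d = float(np.linalg.norm(h_i - h_j))   # d = ||h_i - h_j||_2
        total += y * d ** 2 + (1 - y) * max(margin - d, 0.0) ** 2
    return total / (2 * n)

# A matching pair far apart is penalized; a non-matching pair already
# beyond the margin contributes nothing.
pairs = [(np.array([0.0, 0.0]), np.array([3.0, 4.0]), 1)]
loss = contrast_loss(pairs)   # d = 5, y = 1
```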
8. A model training system based on privacy protection, comprising:
the steganography module is used for selecting K sample data from a sample set of the steganography module, and performing steganography on initial data in the K sample data by using the rest K-1 other data to obtain steganography data;
the comparison module, connected to the steganography module and configured to input the K sample data and the steganographic data into a coding model for coding to obtain a hidden vector, input the hidden vector into a model to be trained to obtain a weight parameter of the model to be trained, send the weight parameter to a third party, and receive an aggregation parameter fed back by the third party, where the aggregation parameter is obtained by the third party aggregating the weight parameters of all data providers;
the training module is connected with the comparison module and used for updating the model to be trained according to the aggregation parameters to obtain a first updated model, calculating the comparison loss corresponding to the first updated model, judging whether the comparison loss meets a preset first convergence condition or not, and if not, updating the first updated model through back propagation according to the comparison loss to obtain a second updated model;
and the steganography module, the comparison module, and the training module repeat the above operations until the contrast loss meets the preset first convergence condition, whereupon model training is completed to obtain a target model.
9. A computer device comprising a memory and a processor, the memory having stored thereon a computer program operable on the processor, the computer program, when executed by the processor, implementing the privacy protection based model training method of any one of claims 1-7.
10. A computer-readable storage medium, having stored thereon a computer program which, when executed, implements the privacy protection-based model training method of any one of claims 1-7.
CN202111181017.8A 2021-10-11 2021-10-11 Model training method and system based on privacy protection and computer equipment Pending CN113961962A (en)

Publications (1)

Publication Number Publication Date
CN113961962A true CN113961962A (en) 2022-01-21


Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114926706A (en) * 2022-05-23 2022-08-19 支付宝(杭州)信息技术有限公司 Data processing method, device and equipment



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination