CN111178524A - Data processing method, device, equipment and medium based on federal learning - Google Patents

Data processing method, device, equipment and medium based on federal learning

Info

Publication number
CN111178524A
Authority
CN
China
Prior art keywords
model
data
trained
loss value
terminal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911346900.0A
Other languages
Chinese (zh)
Inventor
董厶溢
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Life Insurance Company of China Ltd
Original Assignee
Ping An Life Insurance Company of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Life Insurance Company of China Ltd filed Critical Ping An Life Insurance Company of China Ltd
Priority to CN201911346900.0A
Publication of CN111178524A
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00: Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60: Protecting data
    • G06F21/602: Providing cryptographic facilities or services
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00: Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60: Protecting data
    • G06F21/606: Protecting data by securing the transmission between two devices or processes

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Bioethics (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a data processing method, device, equipment, and medium based on federal learning, comprising the following steps executed by a first terminal: determining user feature data shared by the first terminal and a second terminal; performing feature encoding processing on the user feature data to obtain feature data to be processed; obtaining a model prediction value computed from the feature data to be processed; obtaining a loss value by processing training label data and the model prediction value with a predefined loss function; if the loss value is an external loss value, sending the external loss value to the second terminal in an encrypted manner; and if the loss value is an internal loss value, determining a target gradient based on the internal loss value and the current model parameters corresponding to the first model to be trained, and performing model optimization on the first model to be trained according to the target gradient to obtain a target prediction model. The data processing method based on federal learning can effectively improve the training efficiency and the accuracy of the model.

Description

Data processing method, device, equipment and medium based on federal learning
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a data processing method, a data processing device, data processing equipment and a data processing medium based on federal learning.
Background
With the wide application of artificial intelligence technology, machine learning methods continue to emerge. Federal learning is a novel learning method proposed to resolve data islands and to realize information interaction and model learning on the premise that sensitive data is never provided externally. It is a new foundational artificial intelligence technology whose design goal is to carry out efficient machine learning among multiple participants or multiple computing nodes while guaranteeing information security and legal compliance during big data exchange.
At present, when some financial institutions train models, their own user data cannot meet the training requirements, so user data from other financial institutions must also be learned from. To keep that user data secure, model training adopts the federal learning idea, so that models are trained by learning from user data provided by different data parties. Because the data used in current model training comes from multiple organizations (i.e., from different data parties), and the data correlation, data scale, and data types differ from party to party, the learning rates of the models trained by the parties differ greatly. The model learning process therefore fluctuates heavily and learning efficiency is low, that is, model training efficiency is low and each organization's model training cost is high; moreover, because the key information in the multi-party data differs while the same optimization algorithm is used for every party, model training accuracy is also low.
Disclosure of Invention
The embodiments of the invention provide a data processing method, device, equipment, and medium based on federal learning, aiming to solve the problem of low training efficiency and accuracy when models are currently trained on data provided by multiple data parties.
A data processing method based on federal learning comprises the following steps executed by a first terminal:
determining user characteristic data shared by the first terminal and the second terminal; the first terminal corresponds to a first model to be trained; the second terminal corresponds to a second model to be trained; the first terminal comprises training label data corresponding to the user characteristic data;
carrying out feature coding processing on the user feature data to obtain feature data to be processed;
obtaining a model predicted value obtained by processing based on the characteristic data to be processed;
obtaining a loss value obtained by processing the training label data and the model predicted value by adopting a predefined loss function;
if the loss value is an external loss value, sending the external loss value to a second terminal in an encrypted mode, so that the second terminal determines a target gradient based on the external loss value and a current model parameter corresponding to the second model to be trained, and performing model optimization on the second model to be trained according to the target gradient;
and if the loss value is an internal loss value, determining a target gradient based on the internal loss value and the current model parameter corresponding to the first model to be trained, and performing model optimization on the first model to be trained according to the target gradient to obtain a target prediction model.
A data processing apparatus based on federal learning, comprising:
the user characteristic data determining module is used for determining user characteristic data shared by the first terminal and the second terminal; the first terminal corresponds to a first model to be trained; the second terminal corresponds to a second model to be trained; the first terminal comprises training label data corresponding to the user characteristic data;
the characteristic coding module is used for carrying out characteristic coding processing on the user characteristic data to obtain characteristic data to be processed;
the model predicted value obtaining module is used for obtaining a model predicted value obtained by processing based on the characteristic data to be processed;
the loss value acquisition module is used for acquiring a loss value obtained by processing the training label data and the model predicted value by adopting a predefined loss function;
the external loss value sending module is used for sending the external loss value to a second terminal in an encrypted mode if the loss value is the external loss value, so that the second terminal determines a target gradient based on the external loss value and current model parameters corresponding to the second model to be trained, and carries out model optimization on the second model to be trained according to the target gradient;
and the target prediction model optimization module is used for determining a target gradient based on the internal loss value and the current model parameter corresponding to the first model to be trained if the loss value is the internal loss value, and performing model optimization on the first model to be trained according to the target gradient to obtain a target prediction model.
A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the above federated learning-based data processing method when executing the computer program.
A computer storage medium storing a computer program which, when executed by a processor, implements the steps of the above-described federated learning-based data processing method.
In the data processing method, device, equipment, and medium based on federal learning, the user feature data shared by the first terminal and the second terminal is determined and feature-encoded to obtain the feature data to be processed; a model prediction value computed from the feature data to be processed is then obtained, and a loss value is obtained by processing the training label data and the model prediction value with the predefined loss function. If the loss value is an external loss value, it is sent to the second terminal in an encrypted manner, which guarantees the security of data interaction during training; the second terminal determines a target gradient based on the external loss value and the current model parameters corresponding to the second model to be trained, and performs model optimization on the second model to be trained according to the target gradient. The first terminal thus acts as the collaborator in collaborative modeling, no trusted third-party collaborator is required, the limitation that existing federal learning can only rely on a trusted third-party collaborator for modeling is removed, and synchronous training is achieved. If the loss value is an internal loss value, a target gradient is determined based on the internal loss value and the current model parameters corresponding to the first model to be trained: the gradient magnitude is corrected directly against the model's own weights (i.e., the current model parameters). This ensures that the optimization gradient is determined from the characteristics of each data party's data, effectively resolves the problem in traditional federal learning that one shared optimization strategy cannot influence the gradient magnitudes of other data parties, which causes large fluctuations in the learning process and low learning efficiency, and guarantees the stability of model training. Finally, the first model to be trained is optimized according to the target gradient to obtain the target prediction model, so that the target prediction model retains the key information of the different data parties' data, further ensuring its accuracy.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments of the present invention will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to these drawings without inventive labor.
FIG. 1 is a schematic diagram of an application environment of a data processing method based on federated learning according to an embodiment of the present invention;
FIG. 2 is a flow chart of a federated learning-based data processing method in one embodiment of the present invention;
FIG. 3 is a detailed flowchart of step S60 in FIG. 2;
FIG. 4 is a detailed flowchart of step S62 in FIG. 3;
FIG. 5 is a detailed flowchart of step S63 in FIG. 3;
FIG. 6 is a flow chart of a data processing apparatus based on federated learning in one embodiment of the present invention;
FIG. 7 is a schematic diagram of the first terminal according to an embodiment of the invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The data processing method based on federal learning provided by the embodiments of the invention can be applied to terminal equipment deployed by financial institutions such as banks, securities firms, and insurance companies, or by other organizations, to synthesize data provided by multiple data parties and quickly train each organization's model; it can effectively ensure the stability of the training process and fully reflect the quality and characteristics of each data party's data, thereby improving model accuracy. The method can be applied in the application environment shown in fig. 1, where a first terminal communicates with a second terminal through a network. The first terminal or the second terminal may be, but is not limited to, a personal computer, a notebook computer, a smartphone, a tablet computer, or a portable wearable device.
In an embodiment, as shown in fig. 2, a data processing method based on federal learning is provided, which is described by taking the first terminal in fig. 1 as an example, and includes the following steps:
s10: determining user characteristic data shared by the first terminal and the second terminal; the first terminal corresponds to a first model to be trained; the second terminal corresponds to a second model to be trained; the first terminal includes training label data corresponding to the user characteristic data.
Specifically, the first terminal refers to a data-party terminal that assists multiple data organizations in synchronous modeling, or a terminal corresponding to a trusted third-party collaborator. The second terminal refers to a data-party terminal that relies on the first terminal for modeling. Following the idea of federal learning, the common users of the first terminal and the second terminal are identified without either side disclosing its own data, so that sample alignment is achieved and modeling is performed on the user feature data of those common users; the users that do not overlap are never exposed, which ensures the security of the user data.
It should be noted that the common user feature data in this embodiment may refer to feature data of users shared by the first terminal and the second terminal, that is, data where the users are the same but the user features are not completely the same; or it may refer to feature data where the user features of the first terminal and the second terminal are the same, that is, data where the user features are the same but the users are not completely the same.
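To make the sample-alignment step concrete, the sketch below aligns samples by exchanging salted hashes of user IDs instead of the raw IDs. This is only a simplified illustration under assumed names: the patent does not prescribe an alignment protocol, and a real deployment would typically use an encrypted entity-alignment or private set intersection (PSI) scheme so that even the identities of non-overlapping users leak nothing.

```python
import hashlib

def hashed_ids(user_ids, salt):
    """Map each raw user ID to a salted hash so raw IDs are never exchanged."""
    return {hashlib.sha256((salt + uid).encode()).hexdigest(): uid for uid in user_ids}

# Each terminal hashes its own user IDs with a jointly agreed salt, exchanges
# only the hashes, and keeps the intersection as the aligned sample set.
salt = "jointly-agreed-salt"  # assumed to be shared out of band
party_a = hashed_ids(["u01", "u02", "u03"], salt)
party_b = hashed_ids(["u02", "u03", "u04"], salt)

common = party_a.keys() & party_b.keys()
aligned_a = sorted(party_a[h] for h in common)  # ['u02', 'u03']
```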
It can be understood that if the data parties providing the training sample data are A and B, and A does not hold the label data required for training while B does, then B may serve as the first terminal and A as the second terminal. If both A and B hold the label data required for training, either A or B can be selected as the first terminal, or a trusted third-party collaborator can be employed to assist training. It should be noted that the number of second terminals may be one or more, which is not limited here.
In this embodiment, the first model to be trained or the second model to be trained includes, but is not limited to, a deep-learning model using a neural network algorithm (e.g., LSTM) or a machine-learning model (e.g., random forest). It should be noted that the learning algorithms used by the first model to be trained and the second model to be trained need to be consistent, that is, the model initialization parameters of the first model to be trained and the second model to be trained are the same.
The training label data includes, but is not limited to, the label data required for modeling a given model training subject. For example, when predicting a user's repayment ability, the training label data may be the loan amount, and the corresponding user feature data is the feature data of users who have borrowed and repaid; when predicting a user's favorite products, the training label data may be the purchased products (such as gloves), and the corresponding user feature data is the feature data of the users who purchased those products.
As an example, take data parties A and B, where B can provide the training label data required for training, A does not hold such label data, and the model prediction subject is the user's repayment ability, with the user feature data comprising the user's personal data. The training sample data required for training the model comprises user personal data (such as gender, age, and education background) owned by a loan company (data party A), and the training label data refers to the user's repayment ability data (repayment amount data) owned by a bank (data party B). It should be noted that data party B may also hold user feature data required for training that data party A does not own (such as work experience). The loan company and the bank need to train their models to be trained synchronously; because data party A needs to train according to the label data of data party B, and user privacy must be protected, A and B cannot train the models by directly exchanging data, so the federal learning idea needs to be introduced to ensure that neither party's data is leaked. In this scenario, since the loan company only owns the user's personal data and does not own the label data required for training, data party B can act as the "collaborator" in federal learning and receive the data sent by data party A (such as user feature data or model prediction values), so that data parties A and B can learn synchronously without revealing private data; that is, data party A can train according to the label data of data party B, and data party B can model according to the feature data of data party A together with its own data.
In this embodiment, the first terminal is taken as the executing terminal for explanation, so training is performed according to the initialization model parameters defined by the first terminal, and the model to be trained is trained in combination with the idea of federal learning: a new foundational artificial intelligence technology whose design goal is to carry out efficient machine learning among multiple participants or multiple computing nodes while guaranteeing information security and legal compliance during big data exchange.
S20: carrying out feature encoding processing on the user feature data to obtain the feature data to be processed.
The user feature data refers to feature data associated with a user that is required for modeling, such as gender, age, years of work experience, and education background, and can be acquired from a big-data platform. The feature data to be processed is feature data in a form the model to be trained can process. Specifically, the feature encoding of the user feature data may be performed according to preset encoding rules. For example, if the user feature data includes gender, age, years of work, and education, gender can be encoded in discrete form, such as male as "0" and female as "1"; years of work as "1" for 1 year, "2" for 2 years, and so on; and education, for example, as "0" for an associate degree, "1" for a bachelor's degree, and so on.
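As an illustration of the preset encoding rules just described, the following sketch encodes the example fields; the specific mappings and helper names are assumptions, and any rules agreed on by both terminals would work the same way.

```python
def encode_user_features(user):
    # Discrete codes per the preset rules above; the exact mappings are
    # illustrative and must simply be agreed by both terminals.
    gender_map = {"male": 0, "female": 1}
    education_map = {"associate": 0, "bachelor": 1, "master": 2}
    return [
        gender_map[user["gender"]],
        user["age"],                 # numeric fields can pass through directly
        user["working_years"],       # "1" for 1 year, "2" for 2 years, and so on
        education_map[user["education"]],
    ]

features = encode_user_features(
    {"gender": "female", "age": 32, "working_years": 5, "education": "bachelor"}
)  # -> [1, 32, 5, 1]
```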
S30: and obtaining a model predicted value obtained by processing based on the characteristic data to be processed.
The model prediction value is obtained by processing the feature data to be processed with the first model to be trained or the second model to be trained. That is, the model prediction value may be one that the first terminal computes by processing the feature data to be processed with the first model to be trained, or one that is received from the second terminal, sent in an encrypted manner, after the second terminal processes the feature data to be processed with the second model to be trained.
Because the second terminal relies on the first terminal for modeling, the second terminal needs to send the model prediction value obtained by processing the user feature data with the second model to be trained to the first terminal so that the loss value can be calculated; the second terminal can then perform model optimization according to the loss value fed back by the first terminal. Throughout this process, each data party's data stays local, protecting the security of each terminal user's personal data.
Further, when the second terminal sends the model prediction value to the first terminal, it needs to send the value in an encrypted manner to ensure the privacy of user data. Specifically, the first terminal may generate a key pair using an encryption algorithm and send the public key to the second terminal; the second terminal encrypts the model prediction value with the public key and sends the encrypted value to the first terminal, and the first terminal decrypts it with the private key, so that the corresponding loss value can be calculated from the decrypted model prediction value.
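The patent does not name a specific encryption scheme for this exchange. As one plausible sketch, the Paillier cryptosystem, via the python-paillier package ("phe"), supports exactly this generate-keypair / encrypt-with-public-key / decrypt-with-private-key flow; treating Paillier as the scheme here is an assumption, not the patent's prescription.

```python
from phe import paillier  # python-paillier; an assumed choice of scheme

# First terminal: generate a key pair and share only the public key.
public_key, private_key = paillier.generate_paillier_keypair(n_length=2048)

# Second terminal: encrypt its model prediction value with the public key.
encrypted_prediction = public_key.encrypt(0.87)

# First terminal: decrypt with the private key, then compute the loss value.
prediction = private_key.decrypt(encrypted_prediction)
```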
In this embodiment, obtaining either the model prediction value computed by the first terminal or the model prediction value computed by the second terminal with the second model to be trained means that collaborative modeling does not depend on a trusted third-party collaborator; the limitation that existing federal learning can only rely on a trusted third-party collaborator for modeling is removed, and the purpose of synchronous training is achieved.
S40: and obtaining a loss value obtained by processing the training label data and the model predicted value by adopting a predefined loss function.
In this embodiment, the description takes as the first terminal the terminal corresponding to the data party that holds the label data required for training. Specifically, for conventional back-propagation optimization, a loss function is generally defined to calculate a loss value for evaluating the performance of the model. One standard way of defining the loss function is:
L = f(y, ŷ)
where f is a loss function used to evaluate the difference between the true result y and the predicted result ŷ. In this embodiment, the loss function may be defined by the user and is not limited here. The loss value is obtained by applying the predefined loss function to the training label data and the model prediction value.
In this embodiment, the loss value obtained by processing the training label data and the model prediction value by using the predefined loss function includes an external loss value or an internal loss value. The external loss value is obtained by processing the training label data and the received model predicted value sent by the second terminal by adopting a predefined loss function; the internal loss value is obtained by processing the training label data and the feature data to be processed by the first terminal by adopting the first model to be trained by adopting a predefined loss function.
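For a concrete picture of the two kinds of loss value, the sketch below uses mean squared error as the user-defined f; the choice of f and the sample values are illustrative assumptions only.

```python
# Squared error as an assumed, user-defined f; the patent leaves f open.
def loss(y_true, y_pred):
    return sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true)

labels = [1.0, 0.0, 1.0]                         # training label data held locally
internal_loss = loss(labels, [0.9, 0.2, 0.8])    # predictions from the first model to be trained
external_loss = loss(labels, [0.7, 0.4, 0.6])    # decrypted predictions received from the second terminal
```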
S50: and if the loss value is the external loss value, sending the external loss value to the second terminal in an encryption mode, so that the second terminal determines a target gradient based on the external loss value and the current model parameter corresponding to the second model to be trained, and performing model optimization on the second model to be trained according to the target gradient.
The external loss value is sent to the second terminal in an encrypted manner, so that the second terminal determines a target gradient based on the external loss value and the current model parameters corresponding to the second model to be trained, and performs model optimization on the second model to be trained according to the target gradient to obtain a target prediction model. The current model parameters refer to the model parameters of the corresponding model to be trained at each training iteration. The target gradient refers to the optimized gradient determined by comprehensive analysis of the loss value and the current model parameters. It should be noted that, in this embodiment, the loss value used for determining the target gradient may be either an external loss value or an internal loss value, and the current model parameters correspond to the type of the loss value: if the loss value is an external loss value, the target gradient is determined according to the external loss value and the current model parameters corresponding to the second model to be trained; if the loss value is an internal loss value, the target gradient is determined according to the internal loss value and the current model parameters corresponding to the first model to be trained. This is not limited here.
Specifically, to ensure the security of data interaction during training, the loss value computed from the model prediction value sent by the second terminal needs to be returned to the second terminal in an encrypted manner, preventing data privacy from being revealed. Specifically, the second terminal may generate a key pair using an encryption algorithm and send the public key to the first terminal, so that the first terminal encrypts the loss value with the public key and sends it to the second terminal, and the second terminal decrypts the encrypted loss value with its private key.
S60: and if the loss value is an internal loss value, determining a target gradient based on the internal loss value and the current model parameter corresponding to the first model to be trained, and performing model optimization on the first model to be trained according to the target gradient to obtain a target prediction model.
The current model parameters corresponding to the first model to be trained refer to the model weights of the first model to be trained. Specifically, because the data participating in learning comes from different data parties, and the gradients differ markedly according to the quality of each party's data, one shared optimization strategy cannot influence the gradients of the other data parties. In this embodiment, the target gradient is therefore determined based on the internal loss value and the current model parameters corresponding to the first model to be trained: the gradient magnitude is corrected directly against the model's own weights (i.e., the current model parameters). This ensures that the optimization gradient is determined from the characteristics of each data party's data and fully embodies the quality and characteristics of that data, which improves model accuracy; at the same time it effectively resolves the problem in traditional federal learning that one shared optimization strategy cannot influence the gradient magnitudes of other data parties, which causes large fluctuations in the learning process and low learning efficiency, and it guarantees the stability of model training.
It should be noted that the process of determining the target gradient by the first terminal and the second terminal is consistent.
In this embodiment, the user feature data shared by the first terminal and the second terminal is determined and feature-encoded to obtain the feature data to be processed; a model prediction value computed from the feature data to be processed is then obtained, and a loss value is obtained by processing the training label data and the model prediction value with the predefined loss function. If the loss value is an external loss value, it is sent to the second terminal in an encrypted manner, which guarantees the security of data interaction during training; the second terminal determines a target gradient based on the external loss value and the current model parameters corresponding to the second model to be trained, and optimizes the second model to be trained according to that gradient. The first terminal thus acts as the collaborator in collaborative modeling, no trusted third-party collaborator is required, the limitation that existing federal learning can only rely on a trusted third-party collaborator for modeling is removed, and synchronous training is achieved. If the loss value is an internal loss value, the target gradient is determined based on the internal loss value and the current model parameters corresponding to the first model to be trained: the gradient magnitude is corrected directly against the model's own weights (i.e., the current model parameters), which ensures that the optimization gradient reflects the characteristics of each data party's data, effectively resolves the fluctuation and low-efficiency problems caused in traditional federal learning by one shared optimization strategy that cannot influence other parties' gradients, and guarantees the stability of model training. Finally, the first model to be trained is optimized according to the target gradient to obtain the target prediction model, so that the target prediction model retains the key information of the different data parties' data, further ensuring its accuracy.
In an embodiment, as shown in fig. 3, step S60, namely determining the target gradient based on the internal loss value and the current model parameters corresponding to the first model to be trained, specifically includes the following steps:
S61: calculating the original gradient according to the internal loss value and the current model parameters corresponding to the first model to be trained.
Specifically, the original gradient may be calculated from the internal loss value using a predefined gradient calculation formula. For example, the gradient calculation formula may be g = ∂J(ω)/∂ω_j, where J(ω) represents the internal loss value, ω_j represents the current model parameters (i.e., the weight parameters), and g represents the original gradient.
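As a worked illustration of computing the original gradient g = ∂J(ω)/∂ω_j, the sketch below assumes a simple linear model with squared-error loss; the patent leaves the model and loss generic, so this concrete form is an assumption.

```python
# Linear model y_hat = w . x with squared-error loss J = (y_hat - y)^2, so
# dJ/dw_j = 2 * (y_hat - y) * x_j, giving one gradient component per weight.
def original_gradient(weights, features, label):
    prediction = sum(w * x for w, x in zip(weights, features))
    error = prediction - label
    return [2.0 * error * x for x in features]

g = original_gradient([0.5, -0.2], [1.0, 3.0], label=1.0)  # -> [-2.2, -6.6]
```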
S62: and acquiring a boundary condition based on the current model parameter corresponding to the first model to be trained.
Specifically, the first terminal obtains a boundary condition based on a current model parameter corresponding to the first model to be trained, so as to control the gradient size of model optimization according to the boundary condition.
In an embodiment, as shown in fig. 4, step S62, namely obtaining the boundary condition based on the current model parameters corresponding to the first model to be trained, specifically includes the following steps:
S621: acquiring a first hyperparameter and a second hyperparameter.
The first hyperparameter is a predefined constant in the interval (0, 1), as is the second hyperparameter. It will be appreciated that the first hyperparameter determines the lower boundary of the boundary condition and the second hyperparameter determines the upper boundary; therefore, in this embodiment, the first hyperparameter is smaller than the second hyperparameter.
S622: and calculating a first norm of the current model parameter corresponding to the first model to be trained.
The first norm is the L2 norm of the current model parameters, i.e., the square root of the sum of the squares of the elements of the parameter vector. In this embodiment, the L2 norm is used to measure the magnitude of a gradient, so that the correction can be tuned to avoid the over-fitting problem; the first norm corresponding to the current model parameters therefore needs to be calculated to provide a data source for subsequently determining the target gradient.
S623: the product of the first hyperparameter and the first norm is used as the lower boundary, and the product of the second hyperparameter and the first norm is used as the upper boundary.
S624: based on the lower and upper boundaries, the boundary condition is obtained.
Specifically, assuming the first hyperparameter is denoted bl, the second hyperparameter bh, and the first norm ‖ω‖, the lower boundary of the boundary condition is bl·‖ω‖, the upper boundary is bh·‖ω‖, and the boundary condition is [bl·‖ω‖, bh·‖ω‖].
In this embodiment, the first and second hyperparameters are introduced and combined with the L2 norm of the current model parameters to obtain the lower and upper boundaries, and thus the boundary condition, so that the original gradient magnitude can subsequently be corrected against it. The gradient magnitude then satisfies a controllable boundary condition, which ensures the stability of model training while fully reflecting the quality and characteristics of each data party's data, further improving model accuracy.
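A minimal sketch of steps S621 to S624 follows; the hyperparameter values bl = 0.01 and bh = 0.1 are illustrative assumptions, since the patent only requires two constants in (0, 1) with bl smaller than bh.

```python
import math

def boundary_condition(weights, bl=0.01, bh=0.1):
    w_norm = math.sqrt(sum(w * w for w in weights))  # first norm ||w|| (S622)
    return bl * w_norm, bh * w_norm                  # (lower, upper) boundaries (S623, S624)

lower, upper = boundary_condition([0.5, -0.2, 1.3])
```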
S63: and processing the original gradient based on the boundary condition to obtain a target gradient.
In federal learning, the update scale of the model weights is influenced by several factors, such as the learning rate and the gradient magnitude. If the update gradient is too large, model updates oscillate heavily; if the update gradient is too small, model updates are abnormally slow and the learning effect of the model is poor. The learning effect of the model is mainly related to the gradient direction and the gradient magnitude; since the gradient direction should not be changed, the magnitude of the model's update gradient needs to be controlled to improve the learning effect.
In existing conventional optimization algorithms, the gradient magnitude is controlled mainly through the learning rate, either with decay functions or by introducing momentum and using the ratio of first- and second-order gradient moments. Such methods can limit the gradient magnitude to some extent in single-party learning. In federal learning, however, the data participating in learning comes from different data parties, the gradients of the models each party trains differ markedly with the quality of its data, and one set of optimization strategies can neither know nor influence the gradients of the other parties. In this embodiment, therefore, the model's own weights (i.e., the current model parameters) are introduced to correct the gradient (i.e., the original gradient) directly and obtain the target gradient, so that the gradient satisfies a controllable boundary condition and the stability of model training is ensured.
In an embodiment, as shown in fig. 5, step S63, namely processing the original gradient based on the boundary condition to obtain the target gradient, specifically includes the following steps:
S631: calculating the second norm corresponding to the original gradient, and taking the ratio of the second norm to the first norm as the ratio to be judged.
The second norm is the L2 norm of the original gradient. Specifically, the ratio to be judged is r = ‖g‖ / ‖ω‖, where ‖g‖ denotes the second norm (of the original gradient) and ‖ω‖ denotes the first norm (of the current model parameters).
S632: and if the ratio to be judged meets the boundary condition, taking the original gradient as the target gradient.
Specifically, if the ratio to be judged lies within the lower and upper boundaries, the original gradient is taken as the target gradient; that is, when bl ≤ ‖g‖/‖ω‖ ≤ bh, the original gradient is used unchanged.
S633: if the ratio to be judged is smaller than the lower boundary of the boundary condition, processing the original gradient according to a first calculation formula to obtain the target gradient. The first calculation formula is
g′ = bl · (‖ω‖ / ‖g‖) · g
where bl represents the first hyperparameter, ‖ω‖ represents the first norm, ‖g‖ represents the second norm, g represents the original gradient, and ω represents the current model parameters. In particular, when ‖g‖/‖ω‖ < bl, the original gradient is rescaled by the first calculation formula so that the norm of the target gradient equals bl·‖ω‖.
S634: if the ratio to be judged is greater than the upper boundary of the boundary condition, processing the original gradient according to a second calculation formula to obtain the target gradient. The second calculation formula is
g′ = bh · (‖ω‖ / ‖g‖) · g
where bh represents the second hyperparameter, ‖ω‖ represents the first norm, ‖g‖ represents the second norm, g represents the original gradient, and ω represents the current model parameters. In particular, when ‖g‖/‖ω‖ > bh, the original gradient is rescaled by the second calculation formula so that the norm of the target gradient equals bh·‖ω‖.
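Putting S631 to S634 together, the following sketch rescales the original gradient so that the judged ratio ‖g‖/‖ω‖ stays within [bl, bh] while the gradient direction is preserved; the function and parameter names, and the default hyperparameter values, are illustrative assumptions.

```python
import math

def l2_norm(v):
    return math.sqrt(sum(x * x for x in v))

def target_gradient(g, weights, bl=0.01, bh=0.1):
    w_norm, g_norm = l2_norm(weights), l2_norm(g)
    if g_norm == 0.0:                  # nothing to rescale
        return list(g)
    ratio = g_norm / w_norm            # the ratio to be judged (S631)
    if ratio < bl:                     # below the lower boundary: scale up (S633)
        scale = bl * w_norm / g_norm
    elif ratio > bh:                   # above the upper boundary: scale down (S634)
        scale = bh * w_norm / g_norm
    else:                              # within the boundary condition (S632)
        scale = 1.0
    return [scale * gi for gi in g]
```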
In this embodiment, the magnitude of the update gradient is controlled according to the boundary condition, which effectively resolves the problem in traditional federal learning that one shared optimization strategy cannot influence the gradient magnitudes of other data parties, and makes each data party's update gradient better fit the characteristics of the data that party provides, thereby enhancing the stability of the model training process and accelerating model training.
In an embodiment, if the internal loss value is not smaller than a preset threshold, the step of calculating the original gradient according to the internal loss value and the current model parameters corresponding to the first model to be trained is performed.
The preset threshold is the threshold for judging model convergence. Specifically, convergence may be determined in two ways: first, by judging whether the loss value of each model (i.e., the first model to be trained or the second model to be trained) meets the criterion (a preset threshold may be set); second, by judging whether the average of the models' loss values meets the criterion, so as to achieve the goal of synchronous convergence. If the internal loss value is greater than the preset threshold, the model has not yet converged and iterative training must continue, so the step of calculating the original gradient according to the internal loss value and the current model parameters corresponding to the first model to be trained is performed; likewise, if the external loss value is greater than the preset threshold, the step of sending the external loss value to the second terminal in an encrypted manner is performed. Alternatively, if the average of the internal loss value and the external loss value meets the criterion (i.e., is smaller than the preset threshold), the model has converged and training can stop.
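The convergence check and the optimization loop described above can be sketched as follows, reusing the illustrative `loss`, `original_gradient`, and `target_gradient` helpers from the earlier sketches; the threshold, learning rate, and loop structure are assumed values for illustration, not the patent's prescription.

```python
# Assumed helpers: loss(), original_gradient(), target_gradient() from the
# sketches above; `samples` is a list of (features, label) pairs.
def train(weights, samples, threshold=1e-3, learning_rate=0.05, max_rounds=1000):
    for _ in range(max_rounds):
        predictions = [sum(w * x for w, x in zip(weights, f)) for f, _ in samples]
        internal_loss = loss([y for _, y in samples], predictions)
        if internal_loss < threshold:      # converged: stop iterative training
            break
        for features, label in samples:    # not converged: keep optimizing
            g = target_gradient(original_gradient(weights, features, label), weights)
            weights = [w - learning_rate * gi for w, gi in zip(weights, g)]
    return weights
```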
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present invention.
In an embodiment, a data processing apparatus based on federal learning is provided, and the data processing apparatus based on federal learning corresponds to the data processing method based on federal learning in the above embodiment one to one. As shown in fig. 6, the data processing apparatus based on federal learning includes a user feature data determination module 10, a feature encoding module 20, a model prediction value acquisition module 30, a loss value acquisition module 40, an external loss value transmission module 50, and a target prediction model optimization module 60. The functional modules are explained in detail as follows:
a user characteristic data determining module 10, configured to determine user characteristic data common to the first terminal and the second terminal; the first terminal corresponds to a first model to be trained; the second terminal corresponds to a second model to be trained; the first terminal comprises training label data corresponding to the user characteristic data;
the feature coding module 20 is configured to perform feature coding processing on the user feature data to obtain feature data to be processed;
the model predicted value obtaining module 30 is used for obtaining a model predicted value obtained by processing based on the feature data to be processed;
a loss value obtaining module 40, configured to obtain a loss value obtained by processing the training label data and the model prediction value by using a predefined loss function;
the external loss value sending module 50 is used for sending the external loss value to the second terminal in an encrypted manner if the loss value is the external loss value, so that the second terminal determines a target gradient based on the external loss value and the current model parameter corresponding to the second model to be trained, and performs model optimization on the second model to be trained according to the target gradient;
and the target prediction model optimization module 60 is configured to determine a target gradient based on the internal loss value and the current model parameter corresponding to the first model to be trained if the loss value is the internal loss value, and perform model optimization on the first model to be trained according to the target gradient to obtain a target prediction model.
Specifically, the model prediction value obtaining module is configured to receive a model prediction value, sent by the second terminal in an encrypted manner, obtained by processing the feature data to be processed with the second model to be trained.
Alternatively, the model prediction value obtaining module is configured to process the feature data to be processed with the first model to be trained to obtain the model prediction value.
Specifically, the target prediction model optimization module comprises an original gradient acquisition unit, a boundary condition acquisition unit and a target gradient acquisition unit.
The original gradient acquisition unit is used for calculating the original gradient according to the internal loss value and the current model parameters corresponding to the first model to be trained.
The boundary condition acquisition unit is used for obtaining a boundary condition based on the current model parameters corresponding to the first model to be trained.
The target gradient acquisition unit is used for processing the original gradient based on the boundary condition to obtain the target gradient.
Specifically, the boundary condition obtaining unit comprises a hyper-parameter obtaining subunit, a first norm calculation subunit, an upper and lower boundary determining subunit and a boundary condition obtaining subunit.
The hyper-parameter obtaining subunit is used for obtaining the first hyperparameter and the second hyperparameter.
The first norm calculation subunit is used for calculating the first norm of the current model parameters corresponding to the first model to be trained.
The upper and lower boundary determining subunit is used for taking the product of the first hyperparameter and the first norm as the lower boundary, and the product of the second hyperparameter and the first norm as the upper boundary.
The boundary condition obtaining subunit is used for obtaining the boundary condition based on the lower and upper boundaries.
Specifically, the target gradient acquisition unit includes a to-be-judged ratio obtaining subunit, a first processing subunit, a second processing subunit, and a third processing subunit.
The to-be-judged ratio obtaining subunit is used for calculating the second norm corresponding to the original gradient and taking the ratio of the second norm to the first norm as the ratio to be judged.
The first processing subunit is used for taking the original gradient as the target gradient if the ratio to be judged meets the boundary condition.
The second processing subunit is used for processing the original gradient according to a first calculation formula to obtain the target gradient if the ratio to be judged is smaller than the lower boundary of the boundary condition; the first calculation formula is
g′ = bl · (‖ω‖ / ‖g‖) · g
where bl represents the first hyperparameter, ‖ω‖ represents the first norm, ‖g‖ represents the second norm, g represents the original gradient, and ω represents the current model parameters.
The third processing subunit is used for processing the original gradient according to a second calculation formula to obtain the target gradient if the ratio to be judged is greater than the upper boundary of the boundary condition; the second calculation formula is
g′ = bh · (‖ω‖ / ‖g‖) · g
where bh represents the second hyperparameter, ‖ω‖ represents the first norm, ‖g‖ represents the second norm, g represents the original gradient, and ω represents the current model parameters.
For specific definition of the data processing device based on the federal learning, the above definition of the data processing method based on the federal learning can be referred to, and details are not repeated herein. The various modules in the above-described federally learned data processing apparatus can be implemented in whole or in part by software, hardware, and combinations thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a server, the internal structure of which may be as shown in fig. 7. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a computer storage medium and an internal memory. The computer storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of an operating system and computer programs in the computer storage media. The database of the computer device is used to store data generated or obtained during execution of the federal learning based data processing method, such as a target prediction model. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a federated learning-based data processing method.
In one embodiment, a computer device is provided, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, and when the processor executes the computer program, the steps of the data processing method based on federal learning in the foregoing embodiments are implemented, for example, steps S10-S60 shown in fig. 2, or steps shown in fig. 3 to 5. Alternatively, the functions of each module/unit in the embodiment of the data processing apparatus based on federal learning, such as the functions of each module/unit shown in fig. 6, are implemented when the processor executes the computer program, and are not described herein again to avoid repetition.
In an embodiment, a computer storage medium is provided, where a computer program is stored on the computer storage medium, and when executed by a processor, the computer program implements the steps of the data processing method based on federated learning in the foregoing embodiments, such as steps S10-S60 shown in fig. 2 or steps shown in fig. 3 to fig. 5, which are not repeated here to avoid repetition. Alternatively, the computer program, when executed by the processor, implements the functions of each module/unit in the above-mentioned federate learning based data processing apparatus, for example, the functions of each module/unit shown in fig. 6, and is not described herein again to avoid repetition.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware; the computer program can be stored in a non-volatile computer-readable storage medium, and when executed can include the processes of the method embodiments described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present invention, and are intended to be included within the scope of the present invention.

Claims (10)

1. A data processing method based on federal learning is characterized by comprising the following steps executed by a first terminal:
determining user characteristic data shared by the first terminal and a second terminal, wherein the first terminal corresponds to a first model to be trained, the second terminal corresponds to a second model to be trained, and the first terminal comprises training label data corresponding to the user characteristic data;
performing feature coding processing on the user characteristic data to obtain feature data to be processed;
obtaining a model predicted value obtained by processing based on the feature data to be processed;
obtaining a loss value obtained by processing the training label data and the model predicted value with a predefined loss function;
if the loss value is an external loss value, sending the external loss value to the second terminal in an encrypted form, so that the second terminal determines a target gradient based on the external loss value and the current model parameter corresponding to the second model to be trained, and performs model optimization on the second model to be trained according to the target gradient;
and if the loss value is an internal loss value, determining a target gradient based on the internal loss value and the current model parameter corresponding to the first model to be trained, and performing model optimization on the first model to be trained according to the target gradient to obtain a target prediction model.
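Claim 1 amounts to one training step on the first terminal: encode the shared user characteristics, obtain a model predicted value, score it with the predefined loss function, then either forward an external loss in encrypted form or update the first model on an internal loss. The following minimal sketch assumes a linear stand-in model, min-max feature coding, an MSE loss, and an injected `send_encrypted` callback; every name and constant here is illustrative, not code from the patent.

```python
import numpy as np

LEARNING_RATE = 0.1  # placeholder step size; the claim does not fix one

def encode_features(raw):
    """Stand-in for the feature coding step: min-max scale each column to [0, 1]."""
    raw = np.asarray(raw, dtype=float)
    lo, hi = raw.min(axis=0), raw.max(axis=0)
    span = np.where(hi - lo == 0, 1.0, hi - lo)
    return (raw - lo) / span

def mse_loss(y_true, y_pred):
    """One possible predefined loss function; MSE is an illustrative choice."""
    return float(np.mean((y_true - y_pred) ** 2))

def train_step(x_raw, y, w, loss_is_external, send_encrypted):
    """One claim-1 step on the first terminal with a linear stand-in model."""
    x = encode_features(x_raw)                # feature data to be processed
    y_pred = x @ w                            # model predicted value
    loss = mse_loss(y, y_pred)                # loss value from the predefined loss
    if loss_is_external:
        send_encrypted(loss)                  # encrypted hand-off to the second terminal
        return w                              # the second terminal optimizes its own model
    grad = 2.0 * x.T @ (y_pred - y) / len(y)  # gradient of the internal loss
    return w - LEARNING_RATE * grad           # model optimization on the first model
```

For example, `train_step(np.eye(2), np.array([1.0, 0.0]), np.zeros(2), False, print)` performs one local update, while passing `True` only forwards the loss through the callback; a real deployment would substitute an additively homomorphic scheme such as Paillier for the trivial `print` stand-in.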
2. The federated learning-based data processing method as claimed in claim 1, wherein obtaining the model predicted value based on the feature data to be processed comprises:
receiving the model predicted value sent by the second terminal in an encrypted form, the model predicted value being obtained by processing the feature data to be processed with the second model to be trained.
3. The federated learning-based data processing method as claimed in claim 1, wherein obtaining the model predicted value based on the feature data to be processed comprises:
processing the feature data to be processed with the first model to be trained to obtain the model predicted value.
4. The federated learning-based data processing method as claimed in claim 1, wherein determining the target gradient based on the internal loss value and the current model parameter corresponding to the first model to be trained comprises:
calculating an original gradient from the internal loss value and the current model parameter corresponding to the first model to be trained;
acquiring a boundary condition based on the current model parameter corresponding to the first model to be trained;
and processing the original gradient based on the boundary condition to obtain the target gradient.
5. The federated learning-based data processing method as claimed in claim 4, wherein acquiring the boundary condition based on the current model parameter corresponding to the first model to be trained comprises:
acquiring a first hyperparameter and a second hyperparameter;
calculating a first norm of the current model parameter corresponding to the first model to be trained;
taking the product of the first hyperparameter and the first norm as an upper boundary, and the product of the second hyperparameter and the first norm as a lower boundary;
and obtaining the boundary condition based on the upper boundary and the lower boundary.
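Concretely, both bounds of the claim-5 boundary condition scale with the norm of the current model parameter. A minimal sketch, assuming the two hyperparameters are plain scalars (the defaults are placeholders, not values from the patent):

```python
import numpy as np

def boundary_condition(w, bl=0.5, bh=2.0):
    """Claim-5 sketch: both boundaries are proportional to the parameter norm.

    Per the claim's wording, bl (first hyperparameter) times the first norm
    gives the upper boundary, and bh (second hyperparameter) times the first
    norm gives the lower boundary.
    """
    first_norm = np.linalg.norm(w)  # first norm ||w|| of the current model parameter
    return bl * first_norm, bh * first_norm  # (upper boundary, lower boundary)
```

Making the admissible gradient scale track ‖ω‖ is in the spirit of layer-wise adaptive schemes such as LARS, where update magnitudes follow parameter magnitudes.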
6. The federated learning-based data processing method as claimed in claim 5, wherein processing the original gradient based on the boundary condition to obtain the target gradient comprises:
calculating a second norm of the original gradient, and taking the ratio of the first norm to the second norm as a ratio to be judged;
if the ratio to be judged satisfies the boundary condition, taking the original gradient as the target gradient;
if the ratio to be judged is smaller than the upper boundary of the boundary condition, processing the original gradient according to a first calculation formula to obtain the target gradient, the first calculation formula being
$$g' = \frac{bl \cdot \lVert \omega \rVert}{\lVert g \rVert} \cdot g$$
wherein g' represents the target gradient, bl represents the first hyperparameter, ‖ω‖ represents the first norm, ‖g‖ represents the second norm, g represents the original gradient, and ω represents the current model parameter;
if the ratio to be judged is larger than the lower boundary of the boundary condition, processing the original gradient according to a second calculation formula to obtain the target gradient, the second calculation formula being
$$g' = \frac{bh \cdot \lVert \omega \rVert}{\lVert g \rVert} \cdot g$$
wherein g' represents the target gradient, bh represents the second hyperparameter, ‖ω‖ represents the first norm, ‖g‖ represents the second norm, g represents the original gradient, and ω represents the current model parameter.
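Read together, claims 4 to 6 describe a norm-ratio test followed by a rescaling that pulls the original gradient back onto the violated boundary, so the bounded gradient keeps its direction but gets a norm proportional to ‖ω‖. Below is a sketch combining the three steps, using the formulas as reconstructed above; the branch order and the default hyperparameters are assumptions:

```python
import numpy as np

def determine_target_gradient(g, w, bl=0.5, bh=2.0):
    """Claims 4-6 sketch: bound the original gradient g via norms of w."""
    first_norm = np.linalg.norm(w)    # first norm ||w||
    second_norm = np.linalg.norm(g)   # second norm ||g||
    if second_norm == 0.0:
        return g                      # zero gradient: nothing to rescale
    upper = bl * first_norm           # upper boundary (claim 5)
    lower = bh * first_norm           # lower boundary (claim 5)
    ratio = first_norm / second_norm  # ratio to be judged (claim 6)
    if ratio < upper:
        return (bl * first_norm / second_norm) * g  # first calculation formula
    if ratio > lower:
        return (bh * first_norm / second_norm) * g  # second calculation formula
    return g                          # boundary condition satisfied: keep g
```

Either rescaling leaves the target gradient with norm bl·‖ω‖ or bh·‖ω‖; under this reading the update is clipped or amplified to the boundary rather than zeroed, which keeps step sizes commensurate with the current parameter scale.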
7. The federated learning-based data processing method as claimed in claim 4, wherein, before the step of calculating the original gradient from the internal loss value and the current model parameter corresponding to the first model to be trained, the method further comprises:
if the internal loss value is larger than a preset threshold, executing the step of calculating the original gradient from the internal loss value and the current model parameter corresponding to the first model to be trained.
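Claim 7 adds an early exit in front of the gradient computation: once the internal loss value no longer exceeds a preset threshold, the gradient work is skipped. A one-line guard suffices; the threshold value below is a placeholder, since the patent leaves it unspecified:

```python
PRESET_THRESHOLD = 1e-3  # placeholder; the patent does not fix the threshold

def should_compute_gradient(internal_loss):
    """Claim-7 guard: proceed to the original-gradient step only while the
    internal loss value still exceeds the preset threshold."""
    return internal_loss > PRESET_THRESHOLD
```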
8. A data processing apparatus based on federated learning, characterized by comprising:
a user characteristic data determining module, used for determining user characteristic data shared by the first terminal and a second terminal, wherein the first terminal corresponds to a first model to be trained, the second terminal corresponds to a second model to be trained, and the first terminal comprises training label data corresponding to the user characteristic data;
a feature coding module, used for performing feature coding processing on the user characteristic data to obtain feature data to be processed;
a model predicted value obtaining module, used for obtaining a model predicted value based on the feature data to be processed, the model predicted value being obtained by processing the feature data to be processed with the first model to be trained or the second model to be trained;
a loss value obtaining module, used for obtaining a loss value by processing the training label data and the model predicted value with a predefined loss function;
an external loss value sending module, used for sending the external loss value to the second terminal in an encrypted form if the loss value is an external loss value, so that the second terminal determines a target gradient based on the external loss value and the current model parameter corresponding to the second model to be trained, and performs model optimization on the second model to be trained according to the target gradient;
and a target prediction model optimization module, used for determining a target gradient based on the internal loss value and the current model parameter corresponding to the first model to be trained if the loss value is an internal loss value, and performing model optimization on the first model to be trained according to the target gradient to obtain a target prediction model.
9. A computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the federated learning-based data processing method as claimed in any one of claims 1 to 7 when executing the computer program.
10. A computer storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the federated learning-based data processing method as claimed in any one of claims 1 to 7.
CN201911346900.0A 2019-12-24 2019-12-24 Data processing method, device, equipment and medium based on federal learning Pending CN111178524A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911346900.0A CN111178524A (en) 2019-12-24 2019-12-24 Data processing method, device, equipment and medium based on federal learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911346900.0A CN111178524A (en) 2019-12-24 2019-12-24 Data processing method, device, equipment and medium based on federal learning

Publications (1)

Publication Number Publication Date
CN111178524A true CN111178524A (en) 2020-05-19

Family

ID=70653935

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911346900.0A Pending CN111178524A (en) 2019-12-24 2019-12-24 Data processing method, device, equipment and medium based on federal learning

Country Status (1)

Country Link
CN (1) CN111178524A (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190114569A1 (en) * 2017-10-12 2019-04-18 Gravyty Technologies, Inc. Systems and methods for providing and managing proactive and intelligent communications
CN109165725A (en) * 2018-08-10 2019-01-08 深圳前海微众银行股份有限公司 Neural network federation modeling method, equipment and storage medium based on transfer learning
CN109189825A (en) * 2018-08-10 2019-01-11 深圳前海微众银行股份有限公司 Lateral data cutting federation learning model building method, server and medium
CN109886417A (en) * 2019-03-01 2019-06-14 深圳前海微众银行股份有限公司 Model parameter training method, device, equipment and medium based on federation's study
CN110288094A (en) * 2019-06-10 2019-09-27 深圳前海微众银行股份有限公司 Model parameter training method and device based on federation's study
CN110399742A (en) * 2019-07-29 2019-11-01 深圳前海微众银行股份有限公司 A kind of training, prediction technique and the device of federation's transfer learning model

Cited By (30)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111598254A (en) * 2020-05-22 2020-08-28 深圳前海微众银行股份有限公司 Federal learning modeling method, device and readable storage medium
CN111754000A (en) * 2020-06-24 2020-10-09 清华大学 Quality-aware edge intelligent federal learning method and system
CN111754000B (en) * 2020-06-24 2022-10-14 清华大学 Quality-aware edge intelligent federal learning method and system
CN111783139A (en) * 2020-06-29 2020-10-16 京东数字科技控股有限公司 Federal learning classification tree construction method, model construction method and terminal equipment
CN112085159B (en) * 2020-07-24 2023-08-15 西安电子科技大学 User tag data prediction system, method and device and electronic equipment
CN112085159A (en) * 2020-07-24 2020-12-15 西安电子科技大学 User tag data prediction system, method and device and electronic equipment
CN112102939A (en) * 2020-07-24 2020-12-18 西安电子科技大学 Cardiovascular and cerebrovascular disease reference information prediction system, method and device and electronic equipment
CN112102939B (en) * 2020-07-24 2023-08-04 西安电子科技大学 Cardiovascular and cerebrovascular disease reference information prediction system, method and device and electronic equipment
CN111898767A (en) * 2020-08-06 2020-11-06 深圳前海微众银行股份有限公司 Data processing method, device, equipment and medium
CN111915023A (en) * 2020-08-28 2020-11-10 支付宝(杭州)信息技术有限公司 Hyper-parameter determination method and device based on federal learning
CN112286756A (en) * 2020-09-29 2021-01-29 深圳致星科技有限公司 FPGA power consumption prediction method and system for federated learning heterogeneous processing system
CN112347476B (en) * 2020-11-13 2024-02-02 脸萌有限公司 Data protection method, device, medium and equipment
CN112347476A (en) * 2020-11-13 2021-02-09 脸萌有限公司 Data protection method, device, medium and equipment
CN112257876B (en) * 2020-11-15 2021-07-30 腾讯科技(深圳)有限公司 Federal learning method, apparatus, computer device and medium
CN112257876A (en) * 2020-11-15 2021-01-22 腾讯科技(深圳)有限公司 Federal learning method, apparatus, computer device and medium
CN112633146B (en) * 2020-12-21 2024-03-26 杭州趣链科技有限公司 Multi-pose face gender detection training optimization method, device and related equipment
CN112633146A (en) * 2020-12-21 2021-04-09 杭州趣链科技有限公司 Multi-pose face gender detection training optimization method and device and related equipment
CN113051608A (en) * 2021-03-11 2021-06-29 佳讯飞鸿(北京)智能科技研究院有限公司 Method for transmitting virtualized sharing model for federated learning
CN113033819B (en) * 2021-03-25 2022-11-11 支付宝(杭州)信息技术有限公司 Heterogeneous model-based federated learning method, device and medium
CN113033819A (en) * 2021-03-25 2021-06-25 支付宝(杭州)信息技术有限公司 Heterogeneous model-based federated learning method, device and medium
CN113536667A (en) * 2021-06-22 2021-10-22 同盾科技有限公司 Federal model training method and device, readable storage medium and equipment
CN113536667B (en) * 2021-06-22 2024-03-01 同盾科技有限公司 Federal model training method, federal model training device, readable storage medium and federal model training device
CN113361658B (en) * 2021-07-15 2022-06-14 支付宝(杭州)信息技术有限公司 Method, device and equipment for training graph model based on privacy protection
CN113361658A (en) * 2021-07-15 2021-09-07 支付宝(杭州)信息技术有限公司 Method, device and equipment for training graph model based on privacy protection
CN113435537A (en) * 2021-07-16 2021-09-24 同盾控股有限公司 Cross-feature federated learning method and prediction method based on Soft GBDT
CN113935050A (en) * 2021-09-26 2022-01-14 平安科技(深圳)有限公司 Feature extraction method and device based on federal learning, electronic device and medium
CN114996733A (en) * 2022-06-07 2022-09-02 光大科技有限公司 Aggregation model updating processing method and device
CN114996733B (en) * 2022-06-07 2023-10-20 光大科技有限公司 Aggregation model updating processing method and device
CN115600512A (en) * 2022-12-01 2023-01-13 深圳先进技术研究院(Cn) Tool life prediction method based on distributed learning
CN117350354A (en) * 2023-09-21 2024-01-05 摩尔线程智能科技(北京)有限责任公司 Training method and device for large model, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
CN111178524A (en) Data processing method, device, equipment and medium based on federal learning
CN110399742B (en) Method and device for training and predicting federated migration learning model
KR102215773B1 (en) Blockchain data protection based on account note model with zero-knowledge proof
CN110633806B (en) Longitudinal federal learning system optimization method, device, equipment and readable storage medium
US20210312334A1 (en) Model parameter training method, apparatus, and device based on federation learning, and medium
CN112182595B (en) Model training method and device based on federal learning
US20230078061A1 (en) Model training method and apparatus for federated learning, device, and storage medium
CN110990871B (en) Machine learning model training method, prediction method and device based on artificial intelligence
TWI689841B (en) Data encryption, machine learning model training method, device and electronic equipment
CN110110229B (en) Information recommendation method and device
CN112085159B (en) User tag data prediction system, method and device and electronic equipment
JP2021140156A (en) Blockchain data protection using homomorphic encryption
US11601421B1 (en) Identity management system
CN111814985A (en) Model training method under federated learning network and related equipment thereof
CN110402561A (en) Block chain data protection based on universal account model and homomorphic cryptography
US20200177364A1 (en) Determining data processing model parameters through multiparty cooperation
CN113505882B (en) Data processing method based on federal neural network model, related equipment and medium
CN113159327A (en) Model training method and device based on federal learning system, and electronic equipment
CN114696990B (en) Multi-party computing method, system and related equipment based on fully homomorphic encryption
CN112989399B (en) Data processing system and method
CN113221153B (en) Graph neural network training method and device, computing equipment and storage medium
US20220374544A1 (en) Secure aggregation of information using federated learning
WO2021114922A1 (en) Method and apparatus for multi-party joint training of risk assessment model for iot machine
CN112633146B (en) Multi-pose face gender detection training optimization method, device and related equipment
CN112039702A (en) Model parameter training method and device based on federal learning and mutual learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination