WO2022193432A1 - Model parameter updating method, apparatus and device, storage medium, and program product

Publication number
WO2022193432A1
Authority: WO - WIPO (PCT)
Prior art keywords: model, parameter, local, loss, gradient value
Application number: PCT/CN2021/094936
Other languages: French (fr), Chinese (zh)
Inventors: Liang Xinle (梁新乐), Liu Yang (刘洋), Chen Tianjian (陈天健)
Original Assignee: Shenzhen Qianhai WeBank Co., Ltd. (深圳前海微众银行股份有限公司)
Application filed by Shenzhen Qianhai WeBank Co., Ltd.
Publication of WO2022193432A1

Classifications

    • G PHYSICS
      • G06 COMPUTING; CALCULATING OR COUNTING
        • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
          • G06N 20/00 Machine learning
            • G06N 20/20 Ensemble learning
          • G06N 3/00 Computing arrangements based on biological models
            • G06N 3/02 Neural networks
              • G06N 3/04 Architecture, e.g. interconnection topology
              • G06N 3/08 Learning methods
                • G06N 3/084 Backpropagation, e.g. using gradient descent

Definitions

  • the present application relates to the technical field of machine learning, and in particular, to a method, apparatus, device, storage medium and program product for updating model parameters.
  • Vertical federated learning applies when the participants' data features overlap little but their users overlap substantially: the participants take the part of the data corresponding to the same users but different user data characteristics and use it to jointly train a machine learning model.
  • The party holding the label data needs to communicate with the other parties multiple times to transmit the intermediate results the other parties require to update their parameters, such as the model output or the gradient value corresponding to the model output. The parties need to perform multiple rounds of joint parameter updates, that is, multiple communications, so the communication cost is relatively high.
  • A scheme has been proposed in which a participant uses an intermediate result sent by the other participants to perform multiple rounds of local iterations. By increasing the number of local iterations, the number of joint parameter updates is reduced, thereby reducing the communication cost.
  • The main purpose of this application is to provide a model parameter updating method, apparatus, device, storage medium and program product, addressing the problem that communication cost and model performance are difficult to balance in current vertical federated learning schemes.
  • the present application provides a method for updating model parameters.
  • the method is applied to a first device participating in vertical federated learning, and the first device is communicatively connected to a second device participating in vertical federated learning.
  • the method includes the following steps:
  • calculating a proximal optimization loss, where the proximal optimization loss represents the amount of change of the parameter values of the parameters of the first model in the first device in the current round of local iterations relative to their parameter values in a preset historical round of local iterations;
  • the parameters are updated using the gradient values to complete the current round of local iterations.
  • the present application provides a user risk prediction method. The method is applied to a first device participating in vertical federated learning, the first device is communicatively connected to a second device participating in vertical federated learning, and the method includes the following steps:
  • the local risk prediction model is obtained by performing vertical federated learning jointly with the second device based on the proximal optimization loss, where the proximal optimization loss represents the amount of change of the parameter values of the parameters of the local model to be trained in the current local iteration relative to their parameter values in a preset historical round of local iterations;
  • the risk value of the user to be predicted is obtained by prediction using the local risk prediction model.
  • the present application provides a model parameter updating apparatus. The apparatus is deployed in a first device participating in vertical federated learning, the first device is communicatively connected with a second device participating in vertical federated learning, and the apparatus includes:
  • a first calculation module configured to calculate a proximal optimization loss, where the proximal optimization loss represents the amount of change of the parameter values of the parameters of the first model in the first device in the current round of local iterations relative to their parameter values in a preset historical round of local iterations;
  • a second calculation module configured to calculate the gradient value corresponding to the parameter based on the proximal optimization loss, the model output of the first model in the current round of local iterations, and the vertical federated intermediate result received from the second device;
  • An update module configured to update the parameter by using the gradient value to complete the current round of local iteration.
  • the present application provides a user risk prediction apparatus. The apparatus is deployed in a first device participating in vertical federated learning, the first device is communicatively connected with a second device participating in vertical federated learning, and the apparatus includes:
  • a federated learning module configured to perform vertical federated learning jointly with the second device based on the proximal optimization loss to obtain a local risk prediction model, where the proximal optimization loss represents the amount of change of the parameter values of the parameters of the local model to be trained in the current local iteration relative to their parameter values in a preset historical round of local iterations;
  • a prediction module configured to use the local risk prediction model to predict and obtain the risk value of the user to be predicted.
  • the present application also provides a model parameter update device. The model parameter update device includes a memory, a processor, and a model parameter update program stored on the memory and runnable on the processor; when executed by the processor, the model parameter update program implements the steps of the model parameter update method described above.
  • the present application also provides a user risk prediction device. The user risk prediction device includes a memory, a processor, and a user risk prediction program stored on the memory and runnable on the processor; when executed by the processor, the user risk prediction program implements the steps of the user risk prediction method described above.
  • the present application also proposes a computer-readable storage medium on which a model parameter update program is stored; when the model parameter update program is executed by a processor, it implements the steps of the model parameter update method described above.
  • the present application also proposes a computer-readable storage medium on which a user risk prediction program is stored; when the user risk prediction program is executed by a processor, it implements the steps of the user risk prediction method described above.
  • the present application also proposes a computer program product, including a computer program, which implements the steps of the above-mentioned model parameter updating method when the computer program is executed by a processor.
  • the present application also proposes a computer program product, including a computer program, which implements the steps of the above-mentioned user risk prediction method when the computer program is executed by a processor.
  • The vertical federated intermediate result is used to calculate the gradient values corresponding to the parameters of the first model, and the parameters are updated according to the gradient values. That is, the proximal optimization loss is added to constrain the variation of the first model's parameters across local iterations, preventing parameter values from changing so much during local iteration that they become distorted. The communication cost can therefore be reduced by increasing the number of local iterations while preserving the model's prediction accuracy.
  • FIG. 1 is a schematic structural diagram of a hardware operating environment involved in a solution according to an embodiment of the present application
  • FIG. 2 is a schematic flowchart of the first embodiment of the model parameter updating method of the present application
  • FIG. 3 is a schematic diagram of updating joint parameters by a participant involved in an embodiment of the present application.
  • FIG. 4 is a hardware architecture diagram of vertical federated learning performed by a first device and a second device involved in an embodiment of the application;
  • FIG. 5 is a schematic diagram of an interaction flow of multiple rounds of joint parameter update between a first device and a second device according to an embodiment of the present application
  • FIG. 6 is a schematic diagram of functional modules of a preferred embodiment of a model parameter updating device of the present application.
  • FIG. 1 is a schematic diagram of a device structure of a hardware operating environment involved in the solution of the embodiment of the present application.
  • the device for updating model parameters in the embodiment of the present application may be devices such as smart phones, personal computers, and servers, which are not specifically limited here, and the device for updating model parameters may be the first device participating in vertical federated learning.
  • the model parameter updating device may include: a processor 1001 , such as a CPU, a network interface 1004 , a user interface 1003 , a memory 1005 , and a communication bus 1002 .
  • the communication bus 1002 is used to realize the connection and communication between these components.
  • the user interface 1003 may include a display screen (Display) and an input unit such as a keyboard (Keyboard); optionally, the user interface 1003 may also include a standard wired interface and a wireless interface.
  • the network interface 1004 may include a standard wired interface and a wireless interface (such as a WI-FI interface).
  • the memory 1005 may be high-speed RAM memory, or may be non-volatile memory, such as disk memory.
  • the memory 1005 may also be a storage device independent of the aforementioned processor 1001 .
  • the device structure shown in FIG. 1 does not constitute a limitation on the model parameter updating device, which may include more or fewer components than shown, combine some components, or use a different arrangement of components.
  • the memory 1005 as a computer storage medium may include an operating system, a network communication module, a user interface module and a model parameter updating program.
  • the operating system is a program that manages and controls the hardware and software resources of the device, and supports the operation of the model parameter update program and other software or programs.
  • the user interface 1003 is mainly used for data communication with the client;
  • the network interface 1004 is mainly used to establish a communication connection with the second device participating in the vertical federated learning;
  • the processor 1001 can be used to call the model parameter update program stored in the memory 1005 and perform the following operations:
  • the proximal optimization loss represents the amount of change of the parameter values of the parameters of the first model in the first device in the current round of local iterations relative to their parameter values in a preset historical round of local iterations;
  • the parameters are updated using the gradient values to complete the current round of local iterations.
  • the step of calculating the proximal optimization loss, where the proximal optimization loss represents the amount of change of the parameter values of the parameters of the first model in the first device in the current round of local iterations relative to their parameter values in a preset historical round of local iterations, includes:
  • the parameter vector of the parameters of the first model in the first device in the current round of local iteration and the parameter vector in the preset historical round of local iteration are subtracted element-wise to obtain a difference vector;
  • the sum of squares of the elements in the difference vector is calculated, and the proximal optimization loss is obtained based on the sum of squares.
  • the vertical federation intermediate result is the output of the model in the second device
  • the step of calculating the gradient value corresponding to the parameter based on the proximal optimization loss, the model output of the first model in the current round of local iterations, and the vertical federated intermediate result received from the second device includes:
  • a total loss is obtained by adding the prediction loss and the near-end optimization loss, and a gradient value corresponding to the parameter is calculated based on the total loss.
  • the vertical federated intermediate result is the gradient value, calculated by the second device and sent during the current round of joint parameter update, of the prediction loss relative to the model output of the first device.
  • the step of calculating the gradient value corresponding to the parameter based on the proximal optimization loss, the model output of the first model in the current round of local iterations, and the vertical federated intermediate result received from the second device includes:
  • a second sub-gradient value of the proximal optimization loss relative to the parameter is calculated, and the first sub-gradient value and the second sub-gradient value are added to obtain a gradient value corresponding to the parameter.
  • the step of adding the first sub-gradient value and the second sub-gradient value to obtain the gradient value corresponding to the parameter includes:
  • the gradient value corresponding to the parameter is obtained by multiplying the second sub-gradient value by a preset adjustment coefficient and then adding the first sub-gradient value.
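The sub-gradient combination described above can be sketched as follows; the function name and the coefficient name `mu` are illustrative (the text calls it a "preset adjustment coefficient" without fixing a value), so this is an assumption-laden sketch rather than the patent's implementation:

```python
import numpy as np

def combined_gradient(pred_grad, prox_grad, mu=0.1):
    """Scale the second sub-gradient (proximal optimization loss) by the
    preset adjustment coefficient mu, then add the first sub-gradient
    (prediction loss) to obtain the parameter's gradient value."""
    return np.asarray(pred_grad, dtype=float) + mu * np.asarray(prox_grad, dtype=float)

# Two sub-gradients for a two-parameter model, combined with mu = 0.5
g = combined_gradient([0.2, -0.1], [1.0, 2.0], mu=0.5)
```

The coefficient lets the participant tune how strongly the proximal term constrains parameter drift relative to the prediction objective.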
  • The embodiment of the present application also proposes a user risk prediction device. The user risk prediction device is a first device participating in vertical federated learning and establishes a communication connection with a second device participating in vertical federated learning. The user risk prediction device includes a memory, a processor, and a user risk prediction program stored on the memory and executable on the processor; when executed by the processor, the user risk prediction program implements the following steps:
  • the local risk prediction model is obtained by performing vertical federated learning jointly with the second device based on the proximal optimization loss, where the proximal optimization loss represents the amount of change of the parameter values of the parameters of the local model to be trained in the current local iteration relative to their parameter values in a preset historical round of local iterations;
  • the risk value of the user to be predicted is obtained by prediction using the local risk prediction model.
  • the step of jointly performing vertical federated learning with the second device based on the near-end optimization loss to obtain the local-end risk prediction model includes:
  • the local to-be-trained model after updating the parameters is used as the local-end risk prediction model
  • the step of locally and iteratively updating, for a preset number of rounds, the parameters of the local model to be trained based on the proximal optimization loss and the vertical federated intermediate result includes:
  • FIG. 2 is a schematic flowchart of a first embodiment of a method for updating model parameters of the present application. It should be noted that although a logical order is shown in the flowcharts, in some cases, the steps shown or described may be performed in an order different from that herein.
  • the model parameter update method of the present application is applied to a first device participating in vertical federated learning, the first device is connected to the second device participating in vertical federated learning, and the first device and the second device can be devices such as smart phones, personal computers, and servers.
  • the model parameter updating method includes:
  • Step S10: calculate a proximal optimization loss, where the proximal optimization loss represents the amount of change of the parameter values of the parameters of the first model in the first device in the current round of local iterations relative to their parameter values in a preset historical round of local iterations;
  • The participants in vertical federated learning are divided into two categories: data application participants, which hold labeled data, and data providing participants, which do not. There is one data application participant and one or more data providing participants.
  • Each participant deploys a data set and a machine learning model based on their respective data features, and the machine learning models of each participant are combined to form a complete model, which is used to complete model tasks such as prediction or classification.
  • the sample dimensions of the data sets of each participant are aligned, that is, the sample IDs of each data set are the same, but the data characteristics of each participant may be different.
  • Each participant may use the encrypted sample alignment method in advance to construct a sample dimension-aligned data set, which will not be described in detail here.
  • the machine learning models deployed by the participants can be ordinary machine learning models, such as linear regression models, neural network models, etc., or models used in automatic machine learning, such as search networks.
  • The search network refers to a network used in neural architecture search (NAS). The search network includes multiple units, each unit corresponding to a network layer, and connection operations are defined between some units. Taking two units as an example, N candidate connection operations can be preset between the two units, and a corresponding weight is defined for each connection operation. These weights are the structural parameters of the search network, and the network-layer parameters within the units are the model parameters of the search network.
  • For a search network, parameter updating needs to optimize both the structural parameters and the model parameters, and the final network structure, that is, which connection operation or operations to retain, can be determined based on the finally updated structural parameters. Since the network structure is determined through the network search, each participant does not need to design the model's network structure by hand as in a traditional vertical federated learning model, which reduces the difficulty of designing the model.
  • the first device may be a data application participant with tag data
  • the second device may be a data provider without tag data
  • the model in the first device is referred to as the first model
  • the model in the second device is referred to as the second model.
  • the parameters of the model in each participant are initialized and set in advance, and each participant performs multiple rounds of joint parameter update to continuously update the parameters in their respective models and improve the performance of the entire model, such as the prediction accuracy.
  • the parameters updated in each round of joint parameter update process are model parameters, such as weight parameters in a neural network.
  • the parameters updated in each round of joint parameter update may be structural parameters and/or model parameters.
  • the update sequence of the structural parameters and the model parameters is not limited.
  • the structural parameters can be updated in the first few rounds of joint parameter updating, and the model parameters can be updated in the subsequent rounds of joint parameter updating.
  • structural parameters and model parameters may be updated together in each round of joint parameter update.
  • In each round of joint parameter update, the participants first exchange the intermediate results used to update the parameters of their respective models (hereinafter also referred to as vertical federated intermediate results); each participant then performs multiple rounds of local iteration, and after the local iterations, the next round of joint parameter update is performed. That is, during one round of joint parameter update, a participant receives an intermediate result from the other participants only once, and the subsequent rounds of local iteration reuse that intermediate result in their calculations.
  • the intermediate result can be the gradient or the output of the model.
  • When the participant is a data providing participant, the intermediate result sent to the other party can be the output of the participant's model; when the participant is a data application participant, the intermediate result sent to the other party can be a gradient value that it calculates.
  • Party K is the data application participant;
  • Party 1 to Party K-1 are the data providing participants;
  • Net K is the model deployed in the data application participant;
  • Net j is the model deployed in data providing participant j;
  • Net c is the model deployed in the data application participant to calculate the prediction result (Y out) based on the model outputs of all parties;
  • N j is the output of model Net j.
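The Party/Net/N notation above can be illustrated with a minimal numeric sketch. The feature widths, the use of plain linear maps as stand-ins for each Net j, and a single aligned sample are all illustrative assumptions, not the patent's actual models:

```python
import numpy as np

rng = np.random.default_rng(0)

# K parties hold vertical slices of the same aligned sample;
# party K is the data application participant.
K = 3
feature_dims = [4, 3, 5]   # each party's local feature width (assumed)
out_dim = 2                # width of each local model output N_j (assumed)

# Net_j: each party's local model, here a stand-in linear map
Ws = [rng.normal(size=(d, out_dim)) for d in feature_dims]
xs = [rng.normal(size=(1, d)) for d in feature_dims]  # one aligned sample
Ns = [x @ W for x, W in zip(xs, Ws)]                  # N_j = Net_j(x_j)

# Net_c in the data application participant combines the model outputs
# of all parties into the prediction Y_out
Wc = rng.normal(size=(K * out_dim, 1))
Y_out = np.concatenate(Ns, axis=1) @ Wc
```

Only the outputs N j cross party boundaries; raw features x j stay local, which is the point of the vertical federated setup.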
  • the first device may calculate the proximal optimization loss while performing one round of local iterations.
  • the near-end optimization loss can represent the amount of change between the parameter values of the first model in the current round of local iterations and the parameter values in the preset historical rounds of local iterations.
  • By minimizing the proximal optimization loss, the variation range of the first model's parameter values in the current round of local iterations relative to their historical values can be constrained; that is, the parameter values of the first model change less during the current round of local iterations, avoiding distortion of the parameter values after many rounds of local iteration.
  • the calculation method of the near-end optimization loss is not limited, and the method for minimizing the near-end optimization loss is also not limited.
  • the proximal optimization loss can be treated as a loss function and minimized by standard loss-minimization methods; for example, a gradient descent algorithm can compute the gradient values of the proximal optimization loss with respect to the parameters of the first model and optimize the parameters along those gradients, thereby minimizing the proximal optimization loss.
  • Other methods can also minimize the proximal optimization loss; for example, the parameter values of the first model can be perturbed randomly while checking whether the proximal loss decreases, so that random experimentation yields parameter values that minimize the proximal optimization loss.
  • the preset historical round may be a round preset in the first device that is earlier than the current round of local iteration; if the current round of local iteration is the t-th round, the preset historical round is less than t.
  • The preset historical round may be fixed; that is, every round of local iteration within this round of joint parameter update computes the proximal optimization loss against the parameter values of the same historical round of local iteration. For example, the preset historical round may be fixed at 1, so that the parameter values in subsequent rounds of local iteration are constrained to change little relative to the parameter values in the first round of local iteration.
  • The preset historical round need not be fixed; that is, different preset historical rounds can be set for different rounds of local iteration within this round of joint parameter update. If the preset historical round for a given round of local iteration is set to the immediately preceding local iteration, the gradient value of the parameter computed from the proximal optimization loss may be 0; that is, the proximal optimization loss would not act as a constraint. Therefore, in a preferred embodiment, for all or some rounds of local iteration, the difference between the current round number and the corresponding preset historical round should be greater than 1.
  • The first device need not calculate the proximal optimization loss in every round of local iteration; for example, the first round of local iteration has no historical round of local iteration, so no proximal optimization loss needs to be calculated.
  • step S10 includes:
  • Step S101: the parameter vector of the parameters of the first model in the first device in the current round of local iteration and the parameter vector in the preset historical round of local iteration are subtracted element-wise to obtain a difference vector;
  • Step S102 Calculate the sum of squares of each element in the difference vector, and obtain the near-end optimization loss based on the sum of squares.
  • The first device subtracts, element by element, the parameter vector of the first model's parameters in the current round of local iteration and the parameter vector of the preset historical round of local iteration to obtain a difference vector formed by the differences, and then computes the sum of squares of the elements in the difference vector.
  • The first device may directly use the sum of squares as the proximal optimization loss, or may take the square root of the sum of squares as the proximal optimization loss. It should be noted that when calculating the proximal optimization loss, the first device treats each element of the parameter vector in the current round of local iterations as an unknown variable in the calculation.
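The proximal optimization loss computation of steps S101 and S102 can be sketched as follows; the function and argument names are illustrative, and the optional square root corresponds to the alternative mentioned above:

```python
import numpy as np

def proximal_loss(params_current, params_historical, use_sqrt=False):
    """Proximal optimization loss: the change of the current round's
    parameter vector relative to the preset historical round's vector.
    Steps: element-wise subtraction, then the sum of squares of the
    difference vector (optionally its square root)."""
    diff = np.asarray(params_current, dtype=float) - np.asarray(params_historical, dtype=float)
    ss = float(np.sum(diff ** 2))
    return float(np.sqrt(ss)) if use_sqrt else ss

# Current-round parameters vs. the preset historical round's parameters;
# here (0.1)^2 + (0.2)^2 + (-0.2)^2, i.e. about 0.09
loss = proximal_loss([0.5, 1.2, -0.3], [0.4, 1.0, -0.1])
```

In training, the current-round parameters are the optimization variables, so this loss is differentiable with respect to them and its gradient is simply twice the difference vector (for the sum-of-squares form).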
  • the first device may also use other calculation methods capable of calculating the variation between vectors to calculate the near-end optimization loss.
  • Step S20: calculate the gradient value corresponding to the parameter based on the proximal optimization loss, the model output of the first model in the current round of local iteration, and the vertical federated intermediate result received from the second device;
  • the first device inputs the training data in the data set into the first model, and the model output is obtained after processing by the first model.
  • the vertical federation intermediate result received from the second device is the intermediate result sent by the second device during the current round of joint parameter update.
  • the first device is a data application participant with label data
  • the vertical federated intermediate result is the output of the second model sent by the second device, and the first device can calculate the prediction loss according to the vertical federated intermediate result and the model output in the current round of local iteration. In one embodiment, the first device can add the proximal optimization loss and the prediction loss to obtain a total loss, and then calculate the gradient values of the total loss relative to the parameters in the first model; in another embodiment, the first device separately calculates the gradient values of the proximal optimization loss and the prediction loss relative to the parameters in the first model, and adds the two gradient values to obtain the final gradient value.
  • When the vertical federated intermediate result is the gradient value, calculated by the second device, of the prediction loss relative to the output of the first model, the first device can calculate the gradient values of the prediction loss relative to the parameters in the first model according to the vertical federated intermediate result and the model output in the current round of local iteration, then calculate the gradient values of the proximal optimization loss relative to the parameters in the first model, and add the two gradient values to obtain the final gradient value.
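For differentiable losses the two embodiments above coincide: the gradient of the total loss equals the sum of the two sub-gradients. A one-parameter sketch with illustrative quadratic losses (the loss forms and values are assumptions for illustration only):

```python
# Illustrative one-parameter example: prediction loss L_pred(w) = (w - y)^2
# and proximal loss L_prox(w) = (w - w_hist)^2, as in the earlier steps.
w, y, w_hist = 1.5, 2.0, 1.0

# Embodiment 1: form the total loss, then take one gradient
grad_total = 2 * (w - y) + 2 * (w - w_hist)

# Embodiment 2: compute the two sub-gradients separately, then add
grad_pred = 2 * (w - y)       # gradient of the prediction loss
grad_prox = 2 * (w - w_hist)  # gradient of the proximal optimization loss
grad_sum = grad_pred + grad_prox
```

Here the prediction term pulls w toward y while the proximal term pulls it back toward w_hist; at w = 1.5 the two pulls cancel exactly, illustrating how the proximal term constrains the update.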
  • the method for calculating the gradient value according to the loss may refer to the existing gradient calculation method, and details are not repeated.
  • Step S30 using the gradient value to update the parameter to complete the current round of local iteration.
  • The first device uses each gradient value to update each parameter; that is, each parameter corresponds to a gradient value, and the first device uses the gradient value corresponding to a parameter to update that parameter. Specifically, the first device may combine the parameter value obtained after the previous round of local iteration with the parameter's corresponding gradient value scaled by the learning rate to obtain the parameter value after the current round of local iteration. After every parameter has been updated, the current round of local iteration is complete.
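One local-iteration update step can be sketched as follows; the standard gradient-descent sign convention (subtracting the scaled gradient) is assumed here, and the names are illustrative:

```python
import numpy as np

def local_iteration_update(params_prev, grads, lr=0.01):
    """Update each parameter with its corresponding gradient value scaled
    by the learning rate (standard gradient-descent convention assumed),
    yielding the parameter values after this round of local iteration."""
    return np.asarray(params_prev, dtype=float) - lr * np.asarray(grads, dtype=float)

# Parameters after the previous local iteration, updated with lr = 0.1
new_params = local_iteration_update([1.0, -2.0], [10.0, -10.0], lr=0.1)
```

Because the gradient already includes the proximal term (Step S20), this single update simultaneously reduces the prediction loss and limits drift from the historical parameter values.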
  • In this way, the parameters change in the direction that minimizes the proximal optimization loss, which constrains the amount of parameter change and prevents parameter values from changing excessively and becoming distorted.
  • After completing the current round of local iteration, if the first device detects that the number of local iteration rounds for the current round of joint parameter update has been reached, it can proceed to the next round of joint parameter update; if it detects that the number has not been reached, it can perform the next round of local iteration.
  • a maximum number of rounds for jointly updating parameters can be set, and when the number of rounds is reached, the first device stops updating the model parameters.
  • the first device may detect whether the prediction loss has converged after one round of joint parameter updating or after one round of local iteration has ended, and if so, stop updating the parameters. After the parameter update is stopped, the first device takes the current parameter value as the final parameter value of the first model, and after determining the parameter value of the first model, the first model can be used to complete the prediction task.
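The overall control flow described above (joint update rounds, a preset number of local iterations per round, stopping on a maximum round count or on prediction-loss convergence) can be sketched with stand-in callbacks; every name here is illustrative:

```python
def run_vertical_federated_training(max_joint_rounds, local_rounds,
                                    do_local_iteration, pred_loss_converged):
    """Control-flow sketch: each joint parameter update round runs a preset
    number of local iterations; training stops once the maximum number of
    joint rounds is reached or the prediction loss has converged.
    The two callbacks are illustrative stand-ins for the real steps."""
    for joint_round in range(max_joint_rounds):
        # One intermediate result is received per joint round; the local
        # iterations below reuse it (exchange omitted in this sketch).
        for t in range(local_rounds):
            do_local_iteration(joint_round, t)
        if pred_loss_converged():
            break
    return joint_round + 1  # number of joint rounds actually performed

# Usage: stop as soon as the convergence check fires
calls = []
rounds = run_vertical_federated_training(
    max_joint_rounds=10, local_rounds=3,
    do_local_iteration=lambda r, t: calls.append((r, t)),
    pred_loss_converged=lambda: len(calls) >= 6)
```

Raising `local_rounds` trades extra local computation for fewer joint rounds, which is exactly the communication-cost lever the proximal term is meant to make safe.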
  • Figure 4 shows the hardware architecture diagram of the first device and the second device participating in the vertical federated learning in one embodiment.
  • the first device and the second device exchange intermediate results; based on the intermediate result sent by the other party, each of them performs multiple rounds of local iteration, and in each round of local iteration the calculation of the near-end optimization loss is added to constrain the variation of the parameters, so as to avoid distortion caused by the parameter values changing too much.
  • when the first device participating in the vertical federated learning performs local iterations, it adds the calculation of a near-end optimization loss that characterizes the amount of change of the parameter values of the first model in this round of local iteration compared with the parameter values in a preset historical round of local iteration. Based on the near-end optimization loss, the model output of the first model in this round of local iteration, and the vertical federated intermediate result received from the second device, the first device calculates the gradient values corresponding to the parameters in the first model and updates the parameters according to the gradient values. That is, the near-end optimization loss is added to constrain the changes of the parameters of the first model during local iteration, thereby avoiding distortion caused by excessive parameter value changes; in this way, the communication cost can be reduced by increasing the number of local iterations while the prediction accuracy of the model is ensured.
  • the intermediate result of the vertical federation is:
  • the output of the model in the second device includes:
  • Step S201 input the training data of the first device into the first model in the first device for processing, and obtain the model output of the first model in the current round of local iteration;
  • the vertical federated intermediate result is the model output obtained by the second device by inputting its training data into the second model during this round of joint parameter update.
  • the first device may input its training data into the first model for processing, and obtain the model output of the first model in this round of local iteration.
  • Step S202 calculating a prediction result according to the model output and the vertical federation intermediate result, and calculating a prediction loss based on the prediction result and the label data corresponding to the training data;
  • the first device calculates and obtains the prediction result according to the model output and the longitudinal federation intermediate result, and calculates the prediction loss based on the prediction result and the label data corresponding to the training data.
  • the calculation method of the prediction result differs depending on the machine learning model used in the vertical federated learning; for example, when the model is a linear regression model, the first device adds the model output and the vertical federated intermediate result to obtain the prediction result
  • when the machine learning model of the longitudinal federated learning is a neural network model, the first model in the first device includes two parts, Net K and Net c as shown in Figure 3. The first device inputs the training data into the Net K part of the first model for processing to obtain the model output N K , and then inputs N K and the vertical federated intermediate result N j into the Net c part for processing to obtain the prediction result Y out .
  • the first device calculates and obtains the prediction loss according to the prediction result and the label data corresponding to the training data.
  • the prediction loss can be calculated by using a common loss function calculation method, such as a cross entropy loss function, and different loss functions can be used according to different machine learning models being trained.
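  • For instance, with binary labels the prediction loss could be the binary cross-entropy between the prediction results and the label data. The sketch below uses hypothetical names; other loss functions may be chosen depending on the model being trained:

```python
import math

def binary_cross_entropy(predictions, labels):
    """Prediction loss between prediction results and label data,
    using binary cross-entropy as one common choice of loss function."""
    eps = 1e-12  # guard against log(0)
    n = len(labels)
    return -sum(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
                for p, y in zip(predictions, labels)) / n

loss = binary_cross_entropy([0.9, 0.2], [1, 0])
```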
  • Step S203 adding the prediction loss and the near-end optimization loss to obtain a total loss, and calculating a gradient value corresponding to the parameter based on the total loss.
  • the first device adds the predicted loss and the proximal optimization loss to obtain a total loss.
  • the first device may directly add the two losses, or may weight and sum the two losses, and the weights of the two losses may be set as required.
  • the first device may add the prediction loss to the product of the near-end optimization loss and an adjustment coefficient to obtain the total loss, wherein the adjustment coefficient may be preset and flexibly adjusted in each round of local iteration; for example, in one round of joint parameter update, the adjustment coefficient can be initialized to 0.1 and then increased as the local iteration round increases, so that the larger the local iteration round, the stronger the constraint on parameter change.
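  • A sketch of this weighted combination (the 0.1 initialization comes from the text; the linear growth rate and all names are assumptions for illustration):

```python
def total_loss(prediction_loss, proximal_loss, local_round,
               mu0=0.1, growth=0.05):
    """Total loss = prediction loss + adjustment coefficient * near-end
    optimization loss.  The coefficient starts at mu0 and grows with the
    local iteration round, so later local iterations constrain parameter
    change more strongly."""
    mu = mu0 + growth * (local_round - 1)
    return prediction_loss + mu * proximal_loss

loss_round_1 = total_loss(2.0, 1.0, local_round=1)  # coefficient 0.1
loss_round_5 = total_loss(2.0, 1.0, local_round=5)  # coefficient 0.3
```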
  • the first device calculates the gradient value corresponding to the parameter based on the total loss, and the specific calculation process is not described in detail here.
  • the first device may calculate the gradient value of the prediction loss relative to the parameters of the first model, then calculate the gradient value of the near-end optimization loss relative to the parameters of the first model, and then add the two gradient values, or take their weighted sum, to obtain the gradient value corresponding to the parameter.
  • FIG. 5 it is a schematic diagram of an interaction flow of a first device and a second device jointly performing multiple rounds of joint parameter update in an embodiment.
  • the first device calculates the gradient values corresponding to the parameters of the first model by using the prediction loss and the near-end optimization loss, and then updates the parameters according to the gradient values, so that the parameters are updated in the direction that minimizes both the prediction loss and the near-end optimization loss. This not only improves the prediction accuracy of the model but also constrains the variation of the parameters to avoid distortion caused by excessive parameter changes, ensuring that the communication cost is reduced by increasing the number of local iterations while the prediction accuracy of the model is maintained.
  • the second device is a participant with label data
  • the longitudinal federated intermediate result is the gradient value of the prediction loss in the second device relative to the output of the first model sent by the first device during the current round of joint parameter update.
  • the step S20 includes:
  • Step S204 input the training data of the first device into the first model of the first device for processing, and obtain the model output of the first model in the current round of local iteration;
  • when the first device is a data-providing participant without label data and the second device is a data-application participant with label data, the first device inputs its training data into the first model during this round of joint parameter update to obtain an output, and sends the output to the second device as an intermediate result.
  • the second device calculates the gradient value of the prediction loss relative to that output and sends it to the first device as an intermediate result; this intermediate result is the vertical federated intermediate result.
  • the first device may input its training data into the first model for processing, and obtain the model output of the first model in this round of local iteration.
  • the first device inputs the training data into Net j for processing, and obtains the model output N j .
  • Step S205 calculating and obtaining the first sub-gradient value of the predicted loss relative to the parameter according to the model output and the longitudinal federation intermediate result;
  • the first device calculates the gradient value of the prediction loss relative to the parameters in the first model according to the model output and the longitudinal federated intermediate result (hereinafter referred to as the first sub-gradient value for distinction).
  • the first device calculates the first sub-gradient value corresponding to each parameter in the first model according to the back-propagation method according to the longitudinal federation intermediate result and the model output.
  • the first sub-gradient value can be calculated according to the following formula:
  • ∂L/∂w = G(N j )·(∂N b /∂w)
  • where w is a parameter in the first model; N j is the intermediate result sent to the second device during the current round of joint parameter update (that is, the model output of the first model before the local iterations); G(N j ) is the gradient value of the prediction loss relative to N j returned by the second device, that is, the vertical federated intermediate result; and N b is the model output of the first model in this round of local iterations.
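  • Under the simplifying assumption that the first model is a single linear unit N b = Σ w k ·x k (a stand-in for the real model, which may be a deep network trained by backpropagation; all names are hypothetical), the chain rule behind this computation reduces to multiplying the received gradient G(N j ) by each input feature:

```python
def first_sub_gradient(received_grad, features):
    """First sub-gradient of the prediction loss w.r.t. each weight of a
    linear unit N_b = sum_k w_k * x_k.  received_grad is G(N_j), the
    gradient of the prediction loss relative to the output, as returned
    by the second device; since dN_b/dw_k = x_k, the chain rule gives
    dL/dw_k = G(N_j) * x_k."""
    return [received_grad * x for x in features]

grad = first_sub_gradient(received_grad=0.5, features=[1.0, 2.0, -4.0])
# grad -> [0.5, 1.0, -2.0]
```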
  • Step S206 Calculate a second sub-gradient value of the near-end optimization loss relative to the parameter, and add the first sub-gradient value and the second sub-gradient value to obtain a gradient value corresponding to the parameter.
  • the first device calculates the gradient value of the proximal optimization loss relative to the parameters in the first model (hereinafter referred to as the second sub-gradient value for distinction).
  • the first device adds the first sub-gradient value and the second sub-gradient value of the parameter to obtain the gradient value corresponding to the parameter.
  • each parameter has a corresponding first sub-gradient value and a second sub-gradient value, and the respective first sub-gradient value and second sub-gradient value of each parameter are added to obtain The corresponding gradient values for each parameter.
  • the first device calculates the gradient values corresponding to the parameters of the first model by using the prediction loss and the near-end optimization loss, and then updates the parameters according to the gradient values, so that the parameters are updated in the direction that minimizes both the prediction loss and the near-end optimization loss. This not only improves the prediction accuracy of the model but also constrains the variation of the parameters to avoid distortion caused by excessive parameter changes, ensuring that the communication cost is reduced by increasing the number of local iterations while the prediction accuracy of the model is maintained.
  • step S206 includes:
  • Step S2061 Multiply the second sub-gradient value by a preset adjustment coefficient and then add the first sub-gradient value to obtain a gradient value corresponding to the parameter.
  • an adjustment coefficient may be set in the first device to adjust the degree of constraint on the parameter variation during each round of local iteration.
  • the first device may multiply the second sub-gradient value by the adjustment coefficient and then add the first sub-gradient value to obtain the gradient value corresponding to the parameter.
  • the first device can adjust the adjustment coefficient according to the local iteration round; for example, in one round of joint parameter update, the adjustment coefficient can be initialized to 0.1 and then increased as the local iteration round increases, so that the larger the local iteration round, the stronger the constraint on the variation of the parameters.
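  • Step S2061 might be sketched as follows (the 0.1 initialization is from the text; the growth rate and names are assumptions):

```python
def combined_gradient(first_sub, second_sub, local_round,
                      mu0=0.1, growth=0.05):
    """Gradient value per parameter: the first sub-gradient (from the
    prediction loss) plus the second sub-gradient (from the near-end
    optimization loss) multiplied by a round-dependent adjustment
    coefficient, so later local iterations are constrained more."""
    mu = mu0 + growth * (local_round - 1)
    return [f + mu * s for f, s in zip(first_sub, second_sub)]

g_early = combined_gradient([1.0, -0.5], [0.2, 0.4], local_round=1)
g_late = combined_gradient([1.0, -0.5], [0.2, 0.4], local_round=3)
```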
  • a fourth embodiment of the user risk prediction method of the present application is proposed.
  • the method is applied to the first device participating in vertical federated learning
  • the first device is connected in communication with the second device participating in the longitudinal federated learning
  • the first device and the second device may be devices such as a smart phone, a personal computer, and a server.
  • the user risk prediction method includes the following steps:
  • Step A10: jointly perform vertical federated learning with the second device based on the near-end optimization loss to obtain a local-end risk prediction model, wherein the near-end optimization loss represents the amount of change of the parameter values of the local-end model to be trained in the current round of local iteration compared with the parameter values in a preset historical round of local iteration;
  • the first device may be a data application participant or a data providing participant.
  • the first device is deployed with a first data set constructed from each user's data under the first data feature and a first model (hereinafter also referred to as the local-end model to be trained); the second device is deployed with a second data set constructed from each user's data under the second data feature and a second model (hereinafter also referred to as the other-end model to be trained). The user dimensions of the two data sets are the same; the first data feature and the second data feature are both related to predicting user risk, and the two are different; the first model and the second model are two parts of one machine learning model.
  • a commonly used machine learning model can be selected as required, such as a linear regression model or a neural network model; the prediction result of the model is in a data form that can characterize the user's degree of risk, such as a risk value. The first device and the second device jointly use the first data set and the second data set to train the first model and the second model; after training is completed, the two models can be used jointly to predict a user's risk.
  • the risk may be the user's pre-loan credit risk, the user's in-loan repayment default risk, and the like.
  • the first device is a device deployed in a bank
  • the first data feature is a feature related to banking services, such as the number of historical loans of the user, the number of historical defaults of the user, etc.
  • the second device is a device deployed in an e-commerce enterprise
  • the second data feature is the feature related to e-commerce business, such as the user's historical purchase times, amount, etc.
  • the first device and the second device use their own data sets to perform vertical federated learning and train the pre-loan credit risk prediction model.
  • the first device and the second device jointly perform longitudinal federated learning based on the near-end optimization loss to obtain a local-end risk prediction model.
  • the first device may perform each round of local iteration in each round of joint parameter update according to the model parameter updating method in the above first, second, or third embodiment, so as to update the parameters in the first model; this will not be described in detail here. After performing multiple rounds of joint parameter update, the first device uses the first model with the finally updated parameters as the local-end risk prediction model.
  • Step A20 using the local end risk prediction model to predict and obtain the risk value of the user to be predicted.
  • the first device may use the local-end risk prediction model to predict the risk value of the user to be predicted.
  • the second device also performs each round of local iteration in each round of joint parameter updating according to the model parameter updating method in the above-mentioned embodiment to update the parameters in the second model.
  • the second device uses the second model after the parameters are finally updated as the other-end risk prediction model (wherein, the other end refers to the second device); the first device can use the local-end risk prediction model, combined with the other-end risk prediction model in the second device, to predict the risk value of the user to be predicted.
  • the risk value may be a value representing the user's risk level.
  • the second device may send the other-end risk prediction model to the first device; the first device obtains a model output by inputting the user data of the user to be predicted under the first data feature into the local-end risk prediction model, then inputs the user data of the user to be predicted under the second data feature into the other-end risk prediction model to obtain a model output, and obtains the risk value of the user to be predicted according to the two model outputs, for example by directly adding them.
  • the first device inputs the user data of the user to be predicted under the first data feature into the local risk prediction model to obtain a model output;
  • the second device inputs the user data of the user to be predicted under the second data feature into the risk prediction model of the other end to obtain a model output, and sends it to the first device;
  • the first device calculates the risk value of the user to be predicted according to the two model outputs. For example, when the local-end risk prediction model of the first device includes Net K and Net c as shown in FIG. 3, the first device inputs the two model outputs into the Net c part for processing to obtain the risk value of the user to be predicted.
  • the first device inputs the user data of the user to be predicted under the first data feature into the local-end risk prediction model to obtain a model output, and sends the model output to the second device; the second device inputs the user data of the user to be predicted under the second data feature into the other-end risk prediction model to obtain a model output, calculates the risk value of the user to be predicted according to the two model outputs, and returns the risk value to the first device.
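  • The linear-model variant of this joint prediction, where the two parties' outputs are simply added, can be sketched as follows (all names and numbers are hypothetical):

```python
def model_output(weights, features):
    """A party's linear model output on its own feature slice: w . x."""
    return sum(w * x for w, x in zip(weights, features))

def predict_risk(first_weights, first_features,
                 second_weights, second_features):
    """Each device computes its model output on the user data under its
    own data features; the two outputs are added to give the risk value
    (the linear-model case described above)."""
    return (model_output(first_weights, first_features)
            + model_output(second_weights, second_features))

# e.g. bank-side features on the first device, e-commerce-side on the second
risk = predict_risk([0.3, 0.2], [2.0, 1.0], [0.5], [4.0])
# risk -> 2.8
```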
  • the first device adds a near-end optimization loss that characterizes the amount of change of the parameter values of the first model in this round of local iteration compared with the parameter values in a preset historical round of local iteration, so as to constrain the change of the parameters of the first model during local iteration and avoid distortion caused by excessive parameter value changes. Therefore, the communication cost in user risk prediction can be reduced while the accuracy of user risk prediction is ensured.
  • the step A10 includes:
  • Step A101 receiving the vertical federation intermediate result of the current round of joint parameter update sent by the second device;
  • when the first device performs a round of joint parameter update, it receives the vertical federation intermediate result of the current round of joint parameter update sent by the second device. Specifically, if the first device is a data-application participant with label data, the second device inputs its training data into the second model for processing in this round of joint parameter update to obtain the model output, and sends the output to the first device as an intermediate result; this intermediate result is the vertical federated intermediate result.
  • if the first device is a data-providing participant without label data, the first device inputs its training data into the first model for processing to obtain the model output and sends the output to the second device as an intermediate result; the second device calculates the gradient value of the prediction loss relative to this output and sends it to the first device as an intermediate result, which is the longitudinal federated intermediate result.
  • Step A102 based on the near-end optimization loss and the vertical federation intermediate result, perform local iterative update of a preset number of rounds of parameters in the local model to be trained;
  • the first device performs a local iterative update of a preset number of rounds of parameters in the model to be trained at the local end based on the near-end optimization loss and the vertical federation intermediate result.
  • the method performs each round of local iteration on the model to be trained at the local end, which will not be described in detail here.
  • the preset number of rounds may be a number set in advance as required.
  • Step A103 detecting whether the local model to be trained after updating the parameters satisfies the preset model condition
  • the preset model condition may be a preset condition, such as the convergence of prediction loss, or the round of joint parameter update reaches a predetermined round, or the duration of joint parameter update reaches a predetermined duration.
  • Step A104 if it is satisfied, the local to-be-trained model after updating the parameters is used as the local-end risk prediction model;
  • the first device may use the local-end to-be-trained model after updating the parameters as the local-end risk prediction model.
  • the second device uses the other-end to-be-trained model after updating the parameters as the other-end risk prediction model.
  • Step A105 if not satisfied, return to the step of receiving the vertical federation intermediate result of the current round of joint parameter update sent by the second device.
  • if it is detected that the preset model condition is not met, the first device returns to the above step A101, that is, performs the next round of joint parameter update.
  • the step A102 includes:
  • Step A1021 Calculate the near-end optimization loss, and calculate the gradient value corresponding to the parameter based on the near-end optimization loss, the model output of the local model to be trained in this round of local iterations, and the vertical federation intermediate result;
  • Step A1022 using the gradient value to update the parameter to complete the current round of local iteration
  • the first device can calculate the near-end optimization loss and the gradient value corresponding to each parameter according to the specific implementation process of step S10 and step S20 in the above-mentioned first embodiment, and according to the specific implementation process of the above-mentioned step S30, according to the gradient value
  • the update parameters are not described in detail in this embodiment.
  • Step A1023 detecting whether the number of local iteration rounds reaches a preset number of rounds
  • Step A1024 if so, execute the step of detecting whether the local model to be trained after updating the parameters meets the preset model conditions;
  • Step A1025 if not reached, return to the step of calculating the near-end optimization loss, and increment the number of local iteration rounds by 1.
  • after completing one round of local iteration, the first device detects whether the current number of local iteration rounds has reached the preset number of rounds; if so, the first device executes step A103; if not, it increments the number of local iteration rounds by one and returns to step A1021, that is, performs the next round of local iteration.
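  • The control flow of steps A1021–A1025 might be sketched as below; the gradient computation is a stub (a hypothetical shrinking prediction-loss term plus the proximal pull back toward the historical parameter values), not the method's actual gradient:

```python
def run_local_iterations(params, preset_rounds, learning_rate=0.1):
    """One round of joint parameter update on the local end: repeat local
    iterations, incrementing the round counter, until the preset number
    of rounds is reached (steps A1021-A1025)."""
    historical = list(params)  # parameter values before this round
    for local_round in range(1, preset_rounds + 1):
        # Stub gradient: a prediction-loss term (-0.2 * p) plus the
        # proximal term pulling back toward the historical values.
        grads = [-0.2 * p - (p - h) for p, h in zip(params, historical)]
        params = [p + learning_rate * g for p, g in zip(params, grads)]
    return params, local_round

final_params, rounds_done = run_local_iterations([1.0, -2.0], preset_rounds=5)
```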
  • an embodiment of the present application also proposes an apparatus for updating model parameters.
  • the apparatus is deployed on a first device that participates in vertical federated learning, and the first device is communicatively connected to a second device that participates in vertical federated learning.
  • the device includes:
  • the first calculation module 10 is configured to calculate a near-end optimization loss, wherein the near-end optimization loss represents that the parameter values of the parameters of the first model in the first device in this round of local iterations are compared with those in the preset history. The amount of change in the parameter value in the local iteration of the round;
  • the second calculation module 20 is configured to calculate the parameter based on the proximal optimization loss, the model output of the first model in the current local iteration, and the longitudinal federation intermediate result received from the second device corresponding gradient value;
  • the updating module 30 is configured to update the parameter by using the gradient value to complete the current round of local iteration.
  • the first computing module 10 includes:
  • the first calculation unit is configured to subtract, element by element, the parameter vector of the parameters of the first model in the first device in this round of local iteration and the parameter vector in a preset historical round of local iteration, to obtain a difference vector;
  • the second calculation unit is configured to calculate the sum of squares of each element in the difference vector, and obtain the near-end optimization loss based on the sum of squares.
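  • The two calculation units above amount to the following minimal sketch (hypothetical names):

```python
def near_end_optimization_loss(current_params, historical_params):
    """Subtract the historical-round parameter vector from the current
    one element by element, then take the sum of squares of the
    difference vector."""
    diff = [c - h for c, h in zip(current_params, historical_params)]
    return sum(d * d for d in diff)

loss = near_end_optimization_loss([1.0, 2.0, 3.0], [0.5, 2.0, 2.0])
# loss -> 1.25
```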
  • the vertical federated intermediate result is the output of the model in the second device
  • the second calculation module 20 includes:
  • a first processing unit configured to input the training data of the first device into the first model in the first device for processing, and obtain the model output of the first model in the current round of local iteration;
  • a third computing unit configured to calculate and obtain a prediction result according to the model output and the vertical federation intermediate result, and calculate and obtain a prediction loss based on the prediction result and the label data corresponding to the training data;
  • the fourth calculation unit is configured to add the prediction loss and the near-end optimization loss to obtain a total loss, and calculate the gradient value corresponding to the parameter based on the total loss.
  • the vertical federation intermediate result is the gradient value of the prediction loss in the second device relative to the output of the first model sent by the first device during the current round of joint parameter update.
  • the second computing module 20 includes:
  • the second processing unit is used to input the training data of the first device into the first model of the first device for processing, and obtain the model output of the first model in this round of local iterations;
  • a fifth calculation unit configured to calculate and obtain a first sub-gradient value of the predicted loss relative to the parameter according to the model output and the longitudinal federation intermediate result
  • the sixth calculation unit is configured to calculate the second sub-gradient value of the near-end optimization loss relative to the parameter, and add the first sub-gradient value and the second sub-gradient value to obtain the corresponding value of the parameter. gradient value.
  • the sixth calculation unit is further configured to:
  • the gradient value corresponding to the parameter is obtained by multiplying the second sub-gradient value by a preset adjustment coefficient and then adding the first sub-gradient value.
  • an embodiment of the present application further proposes a user risk prediction device, the device is deployed on a first device participating in vertical federated learning, the first device is in communication connection with a second device participating in vertical federated learning, and the device includes:
  • the federated learning module is used to jointly perform vertical federated learning with the second device based on the near-end optimization loss to obtain a local-end risk prediction model, wherein the near-end optimization loss represents the parameters of the local-end model to be trained in the current local iteration The amount of change in the parameter value in compared to the parameter value in the local iteration of the preset historical round;
  • a prediction module configured to use the local risk prediction model to predict and obtain the risk value of the user to be predicted.
  • the federated learning module includes:
  • a receiving unit configured to receive the vertical federation intermediate result of the current round of joint parameter update sent by the second device
  • a local iterative unit configured to perform local iterative update of a preset number of rounds of parameters in the model to be trained at the local end based on the near-end optimization loss and the vertical federated intermediate result;
  • a detection unit configured to detect whether the local model to be trained after updating the parameters satisfies the preset model conditions
  • a determination unit configured to use the local-end to-be-trained model after updating the parameters as the local-end risk prediction model if it is satisfied;
  • the returning unit is configured to, if not satisfied, return to the step of receiving the vertical federation intermediate result of the current round of joint parameter update sent by the second device.
  • the local iterative unit includes:
  • the calculation subunit is used to calculate the near-end optimization loss, and based on the near-end optimization loss, the model output of the local to-be-trained model in the current round of local iterations, and the intermediate results of the vertical federation, the parameters are calculated and obtained corresponding gradient value;
  • an update subunit configured to update the parameter by using the gradient value to complete the current round of local iteration
  • an execution subunit configured to execute the step of detecting whether the model to be trained at the local end after updating the parameters meets the preset model condition if it is reached;
  • the returning subunit is used for returning to the step of calculating the near-end optimization loss if not reached, and incrementing the number of local iteration rounds by 1.
  • an embodiment of the present application also proposes a computer-readable storage medium, where a model parameter update program is stored on the storage medium, and when the model parameter update program is executed by a processor, the steps of the above-mentioned model parameter update method are implemented .
  • the present application also proposes a computer program product, including a computer program, which implements the steps of the above-mentioned model parameter updating method when the computer program is executed by a processor.
  • an embodiment of the present application also proposes a computer-readable storage medium, where a user risk prediction program is stored on the storage medium, and when the user risk prediction program is executed by a processor, the steps of the user risk prediction method described above are implemented .
  • the present application also proposes a computer program product, comprising a computer program, when the computer program is executed by a processor, the steps of the above-mentioned user risk prediction method are implemented.
  • the method of the above embodiment can be implemented by means of software plus a necessary general hardware platform, and can of course also be implemented by hardware, but in many cases the former is the better implementation.
  • the technical solution of the present application, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product; the computer software product is stored in a storage medium (such as a ROM/RAM, magnetic disk, or optical disc) and includes several instructions for causing a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, etc.) to execute the methods described in the embodiments of this application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A model parameter updating method, apparatus and device, a storage medium, and a program product. The method comprises: calculating a proximal optimization loss, where the proximal optimization loss represents the amount by which the value of a parameter of a first model in a first device changes in the current round of local iteration relative to its value in a preset historical round of local iteration (S10); calculating a gradient value corresponding to the parameter on the basis of the proximal optimization loss, the model output of the first model in the current round of local iteration, and a vertical federated intermediate result received from a second device (S20); and updating the parameter with the gradient value to complete the current round of local iteration (S30).

Description

Model Parameter Updating Method, Apparatus, Device, Storage Medium and Program Product
This application claims priority to Chinese patent application No. 202110287041.3, filed on March 17, 2021 and entitled "Model parameter updating method, apparatus, device, storage medium and program product", the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the technical field of machine learning, and in particular to a model parameter updating method, apparatus, device, storage medium and program product.
Background Art
With the development of artificial intelligence, the concept of "federated learning" was proposed to solve the problem of data silos: the parties to a federation can jointly train a model and obtain model parameters without handing over their own data, thereby avoiding leakage of private data. Vertical federated learning applies when the participants' data features overlap little while their users overlap heavily; the parties take the subset of users they share, whose data features differ across parties, and jointly train a machine learning model on that subset.
In a vertical federated learning process, the party that owns the label data must communicate with the other parties many times to transmit the intermediate results the other side needs to update its parameters, such as a model output or the gradient value corresponding to a model output. The parties need multiple rounds of joint parameter update, that is, multiple communications, so the communication cost is high. To address this, schemes have been proposed in which a participant performs multiple rounds of local iteration using a single intermediate result sent by the other participants; increasing the number of local iterations reduces the number of joint parameter updates and hence the communication cost.
However, in such schemes, when a participant runs many local iterations the parameters tend to become distorted, so the model's performance cannot be guaranteed, whereas with few local iterations the communication cost cannot be effectively reduced.
Summary of the Invention
The main purpose of the present application is to provide a model parameter updating method, apparatus, device, storage medium and program product, aimed at the problem that communication cost and model performance are difficult to balance in current vertical federated learning schemes.
To achieve the above purpose, the present application provides a model parameter updating method. The method is applied to a first device participating in vertical federated learning, the first device being communicatively connected to a second device participating in vertical federated learning. The method comprises the following steps:
calculating a proximal optimization loss, where the proximal optimization loss represents the amount by which the value of a parameter of a first model in the first device changes in the current round of local iteration relative to its value in a preset historical round of local iteration;
calculating a gradient value corresponding to the parameter based on the proximal optimization loss, the model output of the first model in the current round of local iteration, and a vertical federated intermediate result received from the second device; and
updating the parameter with the gradient value to complete the current round of local iteration.
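By way of illustration only, one local iteration implementing the three steps above can be sketched as follows. The linear first model, the squared-error prediction loss, the additive combination of the two model outputs, and the learning rate are assumptions made for the sketch, not details fixed by this application; `mu` stands for the preset adjustment coefficient described in a later embodiment.

```python
import numpy as np

def local_iteration(w, w_hist, x, fed_intermediate, y, lr=0.1, mu=0.01):
    """One local iteration on the first device (illustrative sketch).

    w:                current parameters of the first model
    w_hist:           parameter values from a preset historical local iteration
    x:                local training data held by the first device
    fed_intermediate: vertical federated intermediate result from the second
                      device (assumed here to be the second model's output)
    y:                label data (assumed: the first device holds the labels)
    """
    # Step S10: proximal optimization loss -- the squared change of the
    # parameters relative to their historical values.
    prox_loss = np.sum((w - w_hist) ** 2)

    # Step S20: gradient from the local model output plus the received
    # federated intermediate result (squared-error loss assumed).
    out = x @ w                             # first model's output this iteration
    pred = out + fed_intermediate           # combine with the second device's output
    grad_pred = x.T @ (pred - y) / len(y)   # d(prediction loss)/dw
    grad_prox = 2.0 * (w - w_hist)          # d(proximal loss)/dw
    grad = grad_pred + mu * grad_prox

    # Step S30: update the parameters to finish this round of local iteration.
    return w - lr * grad, prox_loss
```

Setting `mu=0` recovers plain local iteration without the proximal constraint, which is the baseline scheme the background section describes.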
To achieve the above purpose, the present application provides a user risk prediction method. The method is applied to a first device participating in vertical federated learning, the first device being communicatively connected to a second device participating in vertical federated learning. The method comprises the following steps:
performing vertical federated learning jointly with the second device based on a proximal optimization loss to obtain a local risk prediction model, where the proximal optimization loss represents the amount by which the values of the parameters of the local model to be trained change in the current local iteration relative to their values in a preset historical round of local iteration; and
predicting the risk value of a user to be predicted with the local risk prediction model.
To achieve the above purpose, the present application provides a model parameter updating apparatus. The apparatus is deployed on a first device participating in vertical federated learning, the first device being communicatively connected to a second device participating in vertical federated learning. The apparatus comprises:
a first calculation module, configured to calculate a proximal optimization loss, where the proximal optimization loss represents the amount by which the value of a parameter of a first model in the first device changes in the current round of local iteration relative to its value in a preset historical round of local iteration;
a second calculation module, configured to calculate a gradient value corresponding to the parameter based on the proximal optimization loss, the model output of the first model in the current round of local iteration, and a vertical federated intermediate result received from the second device; and
an updating module, configured to update the parameter with the gradient value to complete the current round of local iteration.
To achieve the above purpose, the present application provides a user risk prediction apparatus. The apparatus is deployed on a first device participating in vertical federated learning, the first device being communicatively connected to a second device participating in vertical federated learning. The apparatus comprises:
a federated learning module, configured to perform vertical federated learning jointly with the second device based on a proximal optimization loss to obtain a local risk prediction model, where the proximal optimization loss represents the amount by which the values of the parameters of the local model to be trained change in the current local iteration relative to their values in a preset historical round of local iteration; and
a prediction module, configured to predict the risk value of a user to be predicted with the local risk prediction model.
To achieve the above purpose, the present application further provides a model parameter updating device, comprising a memory, a processor, and a model parameter update program stored on the memory and executable on the processor, where the model parameter update program, when executed by the processor, implements the steps of the model parameter updating method described above.
To achieve the above purpose, the present application further provides a user risk prediction device, comprising a memory, a processor, and a user risk prediction program stored on the memory and executable on the processor, where the user risk prediction program, when executed by the processor, implements the steps of the user risk prediction method described above.
In addition, to achieve the above purpose, the present application further proposes a computer-readable storage medium on which a model parameter update program is stored, where the model parameter update program, when executed by a processor, implements the steps of the model parameter updating method described above.
In addition, to achieve the above purpose, the present application further proposes a computer-readable storage medium on which a user risk prediction program is stored, where the user risk prediction program, when executed by a processor, implements the steps of the user risk prediction method described above.
In addition, to achieve the above purpose, the present application further proposes a computer program product comprising a computer program, where the computer program, when executed by a processor, implements the steps of the model parameter updating method described above.
In addition, to achieve the above purpose, the present application further proposes a computer program product comprising a computer program, where the computer program, when executed by a processor, implements the steps of the user risk prediction method described above.
Compared with existing schemes, in the present application the first device participating in vertical federated learning additionally computes, during local iteration, a proximal optimization loss that represents how much the values of the first model's parameters change in the current round of local iteration relative to a preset historical round of local iteration. It computes the gradient values of the first model's parameters based on the proximal optimization loss, the model output of the first model in the current round of local iteration, and the vertical federated intermediate result received from the second device, and updates the parameters with these gradient values. In other words, the proximal optimization loss is added to constrain how much the first model's parameters may change across local iterations, preventing the distortion caused by excessive parameter changes. This makes it possible to reduce communication cost by increasing the number of local iterations while still guaranteeing the model's prediction accuracy.
Brief Description of the Drawings
Fig. 1 is a schematic structural diagram of the hardware operating environment involved in the solutions of the embodiments of the present application;
Fig. 2 is a schematic flowchart of the first embodiment of the model parameter updating method of the present application;
Fig. 3 is a schematic diagram of joint parameter updating by participants according to an embodiment of the present application;
Fig. 4 is a hardware architecture diagram of vertical federated learning performed by a first device and a second device according to an embodiment of the present application;
Fig. 5 is a schematic diagram of the interaction flow of multiple rounds of joint parameter update between a first device and a second device according to an embodiment of the present application;
Fig. 6 is a schematic diagram of the functional modules of a preferred embodiment of the model parameter updating apparatus of the present application.
The realization of the purpose, functional characteristics and advantages of the present application will be further described with reference to the accompanying drawings in conjunction with the embodiments.
Detailed Description of Embodiments
It should be understood that the specific embodiments described herein are only intended to explain the present application, not to limit it.
As shown in Fig. 1, Fig. 1 is a schematic diagram of the device structure of the hardware operating environment involved in the solutions of the embodiments of the present application.
It should be noted that the model parameter updating device in the embodiments of the present application may be a smartphone, a personal computer, a server or a similar device, which is not specifically limited here; the model parameter updating device may be the first device participating in vertical federated learning.
As shown in Fig. 1, the model parameter updating device may comprise a processor 1001 (for example a CPU), a network interface 1004, a user interface 1003, a memory 1005 and a communication bus 1002. The communication bus 1002 is used to realize connection and communication among these components. The user interface 1003 may comprise a display (Display) and an input unit such as a keyboard (Keyboard); optionally, the user interface 1003 may further comprise a standard wired interface and a wireless interface. The network interface 1004 may optionally comprise a standard wired interface and a wireless interface (such as a WI-FI interface). The memory 1005 may be a high-speed RAM memory or a stable non-volatile memory, such as a disk memory. Optionally, the memory 1005 may also be a storage apparatus independent of the aforementioned processor 1001.
Those skilled in the art will understand that the device structure shown in Fig. 1 does not constitute a limitation on the model parameter updating device, which may include more or fewer components than shown, combine certain components, or arrange the components differently.
As shown in Fig. 1, the memory 1005, as a computer storage medium, may include an operating system, a network communication module, a user interface module and a model parameter update program. The operating system is a program that manages and controls the hardware and software resources of the device and supports the running of the model parameter update program and other software or programs. In the device shown in Fig. 1, the user interface 1003 is mainly used for data communication with a client; the network interface 1004 is mainly used to establish a communication connection with the second device participating in vertical federated learning; and the processor 1001 may be used to call the model parameter update program stored in the memory 1005 and perform the following operations:
calculating a proximal optimization loss, where the proximal optimization loss represents the amount by which the value of a parameter of a first model in the first device changes in the current round of local iteration relative to its value in a preset historical round of local iteration;
calculating a gradient value corresponding to the parameter based on the proximal optimization loss, the model output of the first model in the current round of local iteration, and a vertical federated intermediate result received from the second device; and
updating the parameter with the gradient value to complete the current round of local iteration.
Further, the step of calculating the proximal optimization loss, where the proximal optimization loss represents the amount by which the value of a parameter of the first model in the first device changes in the current round of local iteration relative to its value in a preset historical round of local iteration, comprises:
performing element-wise subtraction between the parameter vector of the first model's parameters in the current round of local iteration and the parameter vector in the preset historical round of local iteration to obtain a difference vector; and
calculating the sum of squares of the elements of the difference vector, and obtaining the proximal optimization loss based on the sum of squares.
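In code, the two steps above reduce to an element-wise subtraction followed by a sum of squares, i.e. the squared Euclidean distance between the two parameter vectors; a minimal sketch:

```python
import numpy as np

def proximal_loss(w_current, w_hist):
    """Proximal optimization loss: squared change of the parameter vector
    relative to a preset historical round of local iteration."""
    diff = w_current - w_hist     # element-wise subtraction -> difference vector
    return np.sum(diff ** 2)      # sum of squares of the difference vector's elements
```

Since the embodiment only requires a loss obtained "based on the sum of squares", a scaled variant such as `0.5 * np.sum(diff ** 2)` would equally fit this description.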
Further, when the first device is the participant that owns the label data, the vertical federated intermediate result is the output of the model in the second device, and
the step of calculating the gradient value corresponding to the parameter based on the proximal optimization loss, the model output of the first model in the current round of local iteration and the vertical federated intermediate result received from the second device comprises:
inputting the training data of the first device into the first model in the first device for processing to obtain the model output of the first model in the current round of local iteration;
calculating a prediction result from the model output and the vertical federated intermediate result, and calculating a prediction loss based on the prediction result and the label data corresponding to the training data; and
adding the prediction loss and the proximal optimization loss to obtain a total loss, and calculating the gradient value corresponding to the parameter based on the total loss.
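A minimal sketch of this branch (the first device holds the labels) follows. The linear first model, the mean-squared-error prediction loss, and the additive combination of the two model outputs are illustrative assumptions, since this application does not fix a particular model or loss:

```python
import numpy as np

def label_party_gradient(w, w_hist, x, second_model_output, labels):
    # Model output of the first model in this round of local iteration.
    out = x @ w

    # Prediction result from the local model output and the received
    # vertical federated intermediate result (the second model's output).
    pred = out + second_model_output

    # Prediction loss against the label data (mean squared error assumed).
    pred_loss = np.mean((pred - labels) ** 2)

    # Total loss: prediction loss plus the proximal optimization loss.
    prox_loss = np.sum((w - w_hist) ** 2)
    total_loss = pred_loss + prox_loss

    # Gradient of the total loss with respect to the parameters.
    grad = 2.0 * x.T @ (pred - labels) / len(labels) + 2.0 * (w - w_hist)
    return total_loss, grad
```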
Further, when the second device is the participant that owns the label data, the vertical federated intermediate result is the gradient value of the prediction loss in the second device with respect to the output of the first model that the first device sent in the current round of joint parameter update, and
the step of calculating the gradient value corresponding to the parameter based on the proximal optimization loss, the model output of the model in the current round of local iteration and the vertical federated intermediate result received from the second device comprises:
inputting the training data of the first device into the first model of the first device for processing to obtain the model output of the first model in the current round of local iteration;
calculating a first sub-gradient value of the prediction loss with respect to the parameter from the model output and the vertical federated intermediate result; and
calculating a second sub-gradient value of the proximal optimization loss with respect to the parameter, and adding the first sub-gradient value and the second sub-gradient value to obtain the gradient value corresponding to the parameter.
Further, the step of adding the first sub-gradient value and the second sub-gradient value to obtain the gradient value corresponding to the parameter comprises:
multiplying the second sub-gradient value by a preset adjustment coefficient and then adding the first sub-gradient value to obtain the gradient value corresponding to the parameter.
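The two sub-gradients above can be combined as follows. This sketch assumes a linear first model, so the first sub-gradient is obtained from the received gradient by the chain rule; `mu` is a hypothetical name for the preset adjustment coefficient:

```python
import numpy as np

def data_party_gradient(w, w_hist, x, grad_wrt_output, mu=0.01):
    # First sub-gradient: back-propagate the received gradient through the
    # local model. For out = x @ w, d(out)/dw = x, so by the chain rule
    # d(prediction loss)/dw = x.T @ d(prediction loss)/d(out).
    first_sub_grad = x.T @ grad_wrt_output

    # Second sub-gradient: gradient of the proximal optimization loss.
    second_sub_grad = 2.0 * (w - w_hist)

    # Multiply the second sub-gradient by the preset adjustment coefficient,
    # then add the first sub-gradient.
    return first_sub_grad + mu * second_sub_grad
```

A larger `mu` constrains the parameters more tightly to their historical values, which is how the scheme trades off drift against progress across many local iterations.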
An embodiment of the present application further proposes a user risk prediction device. The user risk prediction device is a first device participating in vertical federated learning, and the first device establishes a communication connection with a second device participating in vertical federated learning. The user risk prediction device comprises a memory, a processor, and a user risk prediction program stored on the memory and executable on the processor, where the user risk prediction program, when executed by the processor, implements the following steps:
performing vertical federated learning jointly with the second device based on a proximal optimization loss to obtain a local risk prediction model, where the proximal optimization loss represents the amount by which the values of the parameters of the local model to be trained change in the current local iteration relative to their values in a preset historical round of local iteration; and
predicting the risk value of a user to be predicted with the local risk prediction model.
Further, the step of performing vertical federated learning jointly with the second device based on the proximal optimization loss to obtain the local risk prediction model comprises:
receiving the vertical federated intermediate result of the current round of joint parameter update sent by the second device;
performing a preset number of rounds of local iterative updates on the parameters of the local model to be trained based on the proximal optimization loss and the vertical federated intermediate result;
detecting whether the local model to be trained with the updated parameters satisfies a preset model condition;
if it does, taking the local model to be trained with the updated parameters as the local risk prediction model; and
if it does not, returning to the step of receiving the vertical federated intermediate result of the current round of joint parameter update sent by the second device.
Further, the step of performing a preset number of rounds of local iterative updates on the parameters of the local model to be trained based on the proximal optimization loss and the vertical federated intermediate result comprises:
calculating the proximal optimization loss, and calculating the gradient values corresponding to the parameters based on the proximal optimization loss, the model output of the local model to be trained in the current round of local iteration, and the vertical federated intermediate result;
updating the parameters with the gradient values to complete the current round of local iteration;
detecting whether the number of local iteration rounds has reached the preset number of rounds;
if it has, performing the step of detecting whether the local model to be trained with the updated parameters satisfies the preset model condition; and
if it has not, returning to the step of calculating the proximal optimization loss and incrementing the number of local iteration rounds by 1.
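The control flow of these steps, nested inside the joint-parameter-update loop of the previous embodiment, can be sketched as follows; `first_device` and its methods are placeholders standing in for the operations described above, not an API defined by this application:

```python
def train(first_device, preset_rounds):
    """Sketch of the outer (joint update) and inner (local iteration) loops."""
    while True:
        # One round of joint parameter update: the vertical federated
        # intermediate result is received from the second device only once ...
        intermediate = first_device.receive_intermediate()

        # ... and reused for a preset number of rounds of local iteration.
        local_round = 0
        while local_round < preset_rounds:
            prox_loss = first_device.proximal_loss()
            grad = first_device.gradient(prox_loss, intermediate)
            first_device.update_parameters(grad)  # completes this local iteration
            local_round += 1                      # increment the local round count

        # After the preset rounds, check the model condition; stop if it is
        # met, otherwise start the next round of joint parameter update.
        if first_device.model_condition_met():
            return first_device.model()
```

With, say, 2 communication rounds and `preset_rounds=3`, the device runs 6 local iterations but only 2 exchanges of intermediate results, which is the communication saving the scheme targets.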
Based on the above structure, various embodiments of the model parameter updating method are proposed.
Referring to Fig. 2, Fig. 2 is a schematic flowchart of the first embodiment of the model parameter updating method of the present application. It should be noted that although a logical order is shown in the flowchart, in some cases the steps shown or described may be performed in an order different from that given here. The model parameter updating method of the present application is applied to a first device participating in vertical federated learning; the first device is communicatively connected to a second device participating in vertical federated learning, and the first and second devices may be smartphones, personal computers, servers or the like. In this embodiment, the model parameter updating method comprises:
Step S10: calculating a proximal optimization loss, where the proximal optimization loss represents the amount by which the value of a parameter of a first model in the first device changes in the current round of local iteration relative to its value in a preset historical round of local iteration.
In this embodiment, the participants in vertical federated learning fall into two categories: data-application participants, which own label data, and data-providing participants, which do not. In general there is one data-application participant and one or more data-providing participants. Each participant deploys a dataset and a machine learning model built on its own data features, and the participants' machine learning models together constitute one complete model used to accomplish model tasks such as prediction or classification. The sample dimensions of the participants' datasets are aligned, that is, the sample IDs of the datasets are the same, although the participants' data features may differ. The participants may construct such sample-aligned datasets in advance via encrypted sample alignment, which is not described in detail here.
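For illustration, sample alignment on plaintext IDs can be sketched as below; as the text notes, real deployments use encrypted sample alignment (for example, private set intersection) so that the non-overlapping sample IDs are never revealed to the other party:

```python
def align_samples(ids_a, ids_b):
    """Plaintext illustration of sample alignment: keep only the sample IDs
    both parties share, in a deterministic order so that both parties index
    their local features consistently. This is NOT the encrypted protocol;
    production systems compute this intersection under encryption."""
    return sorted(set(ids_a) & set(ids_b))
```

Each party then reorders its local dataset by the returned ID list, so that row i on every device refers to the same user.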
The machine learning model deployed by a participant may be an ordinary machine learning model, such as a linear regression model or a neural network model, or a model used in automated machine learning, such as a search network. A search network is a network used for neural architecture search (NAS). It comprises multiple cells, each corresponding to a network layer, and connection operations are set between some of the cells. Taking two cells as an example, the connection operations between them may be N preset candidate connection operations, each with a defined weight; these weights are the structure parameters of the search network, while the network-layer parameters inside the cells are its model parameters. During model training, parameter updates are performed to optimize both the structure parameters and the model parameters, and the final network structure, that is, which connection operation or operations to retain, is determined from the finally updated structure parameters. Because the network structure is determined only after the architecture search, the participants do not need to design the model's network structure in advance as they would for a traditional vertical federated learning model, which lowers the difficulty of model design.
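Under the assumption that the search network follows a DARTS-style differentiable architecture search (one common NAS formulation; the application itself does not name one), the weighted connection operations between two cells can be sketched as:

```python
import numpy as np

def mixed_connection(x, ops, alpha):
    """Weighted mixture of N candidate connection operations between two
    cells. `alpha` holds the structure parameters; during training both
    `alpha` and the ops' internal parameters are updated by gradient descent."""
    w = np.exp(alpha - np.max(alpha))
    w = w / w.sum()                          # softmax over structure parameters
    return sum(wi * op(x) for wi, op in zip(w, ops))

def derive_architecture(alpha):
    """After training, retain the connection operation with the largest
    structure-parameter weight -- this fixes the final network structure."""
    return int(np.argmax(alpha))
```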
In this embodiment, the first device may be the data-application participant that owns the label data, in which case the second device is a data-providing participant without label data and there may be multiple second devices; alternatively, the first device may be a data-providing participant without label data, in which case the second device is the data-application participant that owns the label data. For ease of distinction, the model in the first device is hereinafter called the first model and the model in the second device the second model.
The parameters of each participant's model are initialized in advance, and the participants perform multiple rounds of joint parameter update to continuously update the parameters of their respective models and improve the performance of the overall model, such as its prediction accuracy. When each participant's model is an ordinary machine learning model, the parameters updated in each round of joint parameter update are model parameters, such as the weight parameters of a neural network. When each participant's model is a search network, the parameters updated in each round may be structure parameters and/or model parameters. This embodiment does not restrict the order in which the structure parameters and the model parameters are updated. For example, the structure parameters may be updated in the first several rounds of joint parameter update and the model parameters in the later rounds; alternatively, both may be updated together in every round of joint parameter update.
During one round of joint parameter update, the participants first exchange the intermediate results used to update the parameters of their respective models (hereinafter also called vertical federation intermediate results); each participant then performs multiple rounds of local iteration based on the received intermediate results, and only after these local iterations does the next round of joint parameter update begin. That is, within one round of joint parameter update, a participant receives the intermediate result from the other participants only once, and all subsequent local iterations use that same intermediate result in their computations. An intermediate result may be a gradient or a model output. Specifically, when a participant is a data provider, the intermediate result it sends may be the output of its model; when a participant is the data application participant, the intermediate result it sends may be the computed gradient corresponding to a model output sent by a data provider. Because intermediate results rather than the raw data of the data sets are transmitted, the participants do not leak their data privacy to one another, protecting each participant's data security. Figure 3 is a schematic diagram of joint parameter update by the participants in one embodiment, in which Party K is the data application participant, Party 1 to Party K-1 are data provider participants, Net K is the model deployed in the data application participant, Net j is the model deployed in a data provider participant, Net c is the model deployed in the data application participant that computes the prediction result (Y out) from the outputs of all parties' models, N j is the output of a model, and G(N j) is the gradient value corresponding to that model output.
When performing a round of local iteration, the first device may compute a proximal optimization loss. The proximal optimization loss characterizes the amount of change between the parameter values of the first model in the current round of local iteration and their values in a preset historical round of local iteration. Minimizing this loss constrains how far the parameter values of the first model in the current round may drift from their historical values; that is, it keeps the parameter values changing only slightly in each local iteration, preventing them from becoming distorted after many rounds of local iteration. This embodiment does not restrict how the proximal optimization loss is computed, nor how it is minimized. For example, in one implementation, the proximal optimization loss may be treated as a loss function and minimized with standard loss-minimization methods, such as using a gradient descent algorithm to compute the gradient of the proximal optimization loss with respect to the parameters of the first model and updating the parameters along that gradient. In other implementations, other methods may be used to minimize the proximal optimization loss, for example randomly perturbing the parameter values of the first model, checking whether the proximal optimization loss decreases, and obtaining loss-minimizing parameter values through random trials.
The preset historical round may be a round configured in advance in the first device that is earlier than the current round of local iteration; if the current round of local iteration is round t, the preset historical round is smaller than t. Within one round of joint parameter update, the preset historical round may be fixed, meaning that every local iteration in that round computes the proximal optimization loss against the parameter values of the same historical local iteration. For example, fixing the preset historical round to 1 ensures that the parameter values of all later local iterations deviate only slightly from those of the first local iteration. Alternatively, the preset historical round need not be fixed, and a different historical round may be set for each local iteration in that round of joint parameter update. When the historical round is set per local iteration, note that if the proximal optimization loss is computed against the immediately preceding local iteration, the parameter gradient derived from that loss may be zero, in which case the proximal optimization loss imposes no constraint. Therefore, in a preferred implementation, for all or some local iterations, the current iteration's round number minus its preset historical round should be greater than 1. It should also be noted that, in some implementations, the first device need not compute the proximal optimization loss in every local iteration; for example, the first round of local iteration has no historical round, so no proximal optimization loss needs to be computed for it.
Further, in one implementation, step S10 includes:
Step S101: performing element-wise subtraction between the parameter vector of the first model's parameters in the current round of local iteration and the parameter vector in the preset historical round of local iteration in the first device to obtain a difference vector;
Step S102: computing the sum of squares of the elements of the difference vector, and obtaining the proximal optimization loss based on the sum of squares.
The first model has multiple parameters, which can be represented as a vector. The first device subtracts, element by element, the parameter vector of the preset historical round of local iteration from the parameter vector of the current round to obtain a difference vector, then computes the sum of squares of its elements. The first device may use the sum of squares directly as the proximal optimization loss, or take its square root as the loss. Note that when computing the proximal optimization loss, the first device treats the elements of the current round's parameter vector as the unknown variables of the computation.
In other implementations, the first device may use any other method capable of measuring the amount of change between vectors to compute the proximal optimization loss.
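Steps S101 and S102 can be sketched as follows; this is a minimal illustration of the squared-difference form described above, with hypothetical names.

```python
def proximal_loss(current_params, historical_params):
    # Step S101: element-wise subtraction of the historical-round
    # parameter vector from the current-round parameter vector.
    diff = [c - h for c, h in zip(current_params, historical_params)]
    # Step S102: sum of squares of the difference vector; as noted
    # above, the square root of this sum may be used instead.
    return sum(d * d for d in diff)
```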
Step S20: computing the gradient values corresponding to the parameters based on the proximal optimization loss, the model output of the first model in the current round of local iteration, and the vertical federation intermediate result received from the second device;
The first device feeds the training data of its data set into the first model to obtain a model output, and computes the gradient value of each parameter of the first model from the proximal optimization loss, that model output, and the vertical federation intermediate result received from the second device. The vertical federation intermediate result received from the second device is the intermediate result the second device sent during the current round of joint parameter update. Specifically, when the first device is the data application participant holding the label data, the vertical federation intermediate result is the output of the second model sent by the second device, and the first device computes the prediction loss from that result and the model output of the current local iteration. In one implementation, the first device adds the proximal optimization loss and the prediction loss to obtain a total loss, then computes the gradient of the total loss with respect to the parameters of the first model. In another implementation, the first device separately computes the gradients of the proximal optimization loss and of the prediction loss with respect to the parameters of the first model, then adds the two gradient values to obtain the final gradient value. When the first device is a data provider without label data and the second device is the data application participant holding the label data, the vertical federation intermediate result is the gradient value, computed by the second device, of the prediction loss with respect to the output of the first model; the first device computes the gradient of the prediction loss with respect to the parameters of the first model from the vertical federation intermediate result and the model output of the current local iteration, computes the gradient of the proximal optimization loss with respect to those parameters, and adds the two gradient values to obtain the final gradient value. Note that, throughout the embodiments of this application, computing gradient values from a loss may follow existing gradient computation methods and is not described in detail.
Step S30: updating the parameters with the gradient values to complete the current round of local iteration.
After computing the gradient value of each parameter of the first model, the first device updates each parameter with its gradient value; that is, each parameter corresponds to one gradient value, and the first device updates the parameter using that value. Specifically, the first device adjusts the parameter value obtained after the previous round of local iteration by the corresponding gradient value multiplied by a learning rate to obtain the parameter value for the current round. Once every parameter has been updated, the current round of local iteration is complete. By adding the proximal optimization loss when computing the parameter gradients and then updating the parameters along those gradients, the parameters move in the direction that minimizes the proximal optimization loss, which constrains the amount of parameter change and prevents the parameter values from drifting so far that they become distorted.
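The per-parameter update described above can be sketched as follows, assuming the conventional gradient-descent sign (each parameter moves against its gradient); function and argument names are illustrative.

```python
def local_update(params, grads, learning_rate=0.01):
    # Each parameter is adjusted by its own gradient value scaled by
    # the learning rate, completing one local iteration (step S30).
    return [p - learning_rate * g for p, g in zip(params, grads)]
```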
Further, after completing the current round of local iteration, if the first device detects that the configured number of local iterations for the current round of joint parameter update has been reached, it may proceed to the next round of joint parameter update; if not, it performs the next round of local iteration. In one implementation, a maximum number of joint parameter update rounds may be set; when that number is reached, the first device stops updating the model parameters. In another implementation, the first device may check whether the prediction loss has converged at the end of a round of joint parameter update, or at the end of a round of local iteration, and stop updating the parameters if it has. After updating stops, the first device takes the current parameter values as the final parameter values of the first model; once the parameter values of the first model are determined, the first model can be used to perform prediction tasks.
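The stopping conditions described in this paragraph might be sketched as follows; the convergence tolerance and the function name are assumptions for illustration.

```python
def should_stop(joint_round, max_joint_rounds, loss_history, tol=1e-4):
    # Stop when the maximum number of joint parameter update rounds
    # is reached, or when the prediction loss has converged.
    if joint_round >= max_joint_rounds:
        return True
    if len(loss_history) >= 2 and abs(loss_history[-1] - loss_history[-2]) < tol:
        return True
    return False
```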
Figure 4 shows the hardware architecture of the first device and the second device participating in vertical federated learning in one implementation. The two devices exchange intermediate results and, based on the results sent by the other party, each performs multiple rounds of local iteration locally; in each round of local iteration, the proximal optimization loss is additionally computed to constrain the amount of parameter change and prevent the parameter values from changing so much that they become distorted.
Compared with existing solutions, in this embodiment the first device participating in vertical federated learning additionally computes, during local iteration, a proximal optimization loss that characterizes how much the parameter values of its first model in the current round of local iteration have changed relative to their values in a preset historical round of local iteration. It computes the gradient values of the first model's parameters from the proximal optimization loss, the model output of the current local iteration, and the vertical federation intermediate result received from the second device, and updates the parameters with those gradient values. Adding the proximal optimization loss thus constrains how much the first model's parameters change during local iteration and avoids the distortion caused by excessive parameter changes, so the communication cost can be reduced by increasing the number of local iterations while the model's prediction accuracy is still guaranteed.
Further, based on the first embodiment above, a second embodiment of the model parameter update method of this application is proposed. In this embodiment, when the first device is the participant holding the label data, the vertical federation intermediate result is the output of the model in the second device, and step S20 includes:
Step S201: inputting the training data of the first device into the first model in the first device for processing, to obtain the model output of the first model in the current round of local iteration;
In this embodiment, the first device is the data application participant holding the label data and the second device is a data provider participant without label data; the vertical federation intermediate result is the output obtained by the second device, during the current round of joint parameter update, by feeding its training data into the second model. In one round of local iteration of the current joint parameter update, the first device feeds its training data into the first model for processing to obtain the model output of the first model for that local iteration.
Step S202: computing a prediction result from the model output and the vertical federation intermediate result, and computing a prediction loss from the prediction result and the label data corresponding to the training data;
The first device computes the prediction result from the model output and the vertical federation intermediate result, and computes the prediction loss from the prediction result and the label data corresponding to the training data. The way the prediction result is computed depends on the type of machine learning model. For example, when the machine learning model of the vertical federation is a linear regression model, the first device adds the model output and the vertical federation intermediate result to obtain the prediction result. As another example, when the machine learning model is a neural network, the first model in the first device consists of the two parts Net K and Net c shown in Figure 3: the first device feeds the training data into the Net K part to obtain the model output N K, then feeds N K and the vertical federation intermediate result N j into the Net c part to obtain the prediction result Y out.
The first device computes the prediction loss from the prediction result and the label data corresponding to the training data. The prediction loss can be computed with a commonly used loss function, such as the cross-entropy loss; different loss functions may be chosen depending on the machine learning model being trained.
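For the linear-regression case mentioned above, the prediction and its loss might be sketched as follows; the squared-error loss is an illustrative choice, since the text leaves the loss function open.

```python
def linear_prediction_loss(local_output, intermediate_results, label):
    # Linear-regression case: the prediction result is the sum of the
    # local model output and the intermediate results received from
    # the other parties; squared error stands in for the chosen loss.
    y_out = local_output + sum(intermediate_results)
    return (y_out - label) ** 2
```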
Step S203: adding the prediction loss and the proximal optimization loss to obtain a total loss, and computing the gradient values of the parameters based on the total loss.
The first device adds the prediction loss and the proximal optimization loss to obtain a total loss. Specifically, the two losses may be added directly, or combined as a weighted sum with weights set as needed. In one implementation, the first device obtains the total loss as the prediction loss plus the product of the proximal optimization loss and an adjustment coefficient, where the coefficient can be preset and flexibly adjusted in each round of local iteration. For example, within one round of joint parameter update, the coefficient may be initialized to 0.1 and then increased as the local iteration round grows, so that later local iterations constrain the parameter changes more strongly. The first device then computes the parameter gradient values based on the total loss; the detailed computation is not repeated here.
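The weighted combination of the two losses can be sketched as follows; the linear growth schedule for the adjustment coefficient is an assumption consistent with the example of initializing it to 0.1 and increasing it with the local iteration round.

```python
def total_loss(prediction_loss, proximal_loss, local_round, mu0=0.1):
    # Adjustment coefficient grows with the local iteration round so
    # that later local iterations constrain parameter changes harder.
    mu = mu0 * local_round  # illustrative linear schedule
    return prediction_loss + mu * proximal_loss
```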
Further, in other implementations, after computing the prediction loss and the proximal optimization loss, the first device may compute the gradient of the prediction loss with respect to the parameters of the first model, compute the gradient of the proximal optimization loss with respect to those parameters, and then add the two gradient values, or take their weighted sum, to obtain the parameter gradient values.
Figure 5 is a schematic diagram of the interaction flow in which the first device and the second device jointly perform multiple rounds of joint parameter update in one implementation.
In this embodiment, the first device computes the gradient values of the first model's parameters from the prediction loss and the proximal optimization loss, then updates the parameters with those gradient values, so the parameters are updated in the direction that minimizes both the prediction loss and the proximal optimization loss. This not only improves the model's prediction accuracy but also constrains the amount of parameter change and avoids distortion from excessive changes, so the communication cost can be reduced by increasing the number of local iterations while the model's prediction accuracy is still guaranteed.
Further, based on the first and/or second embodiments above, a third embodiment of the model parameter update method of this application is proposed. In this embodiment, when the second device is the participant holding the label data, the vertical federation intermediate result is the gradient value, computed in the second device, of the prediction loss with respect to the output of the first model that the first device sent during the current round of joint parameter update, and step S20 includes:
Step S204: inputting the training data of the first device into the first model of the first device for processing, to obtain the model output of the first model in the current round of local iteration;
In this embodiment, the first device is a data provider participant without label data and the second device is the data application participant holding the label data. During the current round of joint parameter update, the first device feeds its training data into the first model, obtains an output, and sends that output to the second device as an intermediate result; the second device computes the gradient value of the prediction loss with respect to that output and sends it back to the first device as an intermediate result, which is the vertical federation intermediate result. In one round of local iteration of the current joint parameter update, the first device feeds its training data into the first model for processing to obtain the model output of that local iteration. In one embodiment, as shown in Figure 3, the first device feeds the training data into Net j and obtains the model output N j.
Step S205: computing the first sub-gradient value of the prediction loss with respect to the parameters from the model output and the vertical federation intermediate result;
The first device computes the gradient value of the prediction loss with respect to the parameters of the first model (hereinafter called the first sub-gradient value, for ease of distinction) from the model output and the vertical federation intermediate result, following the backpropagation method. Specifically, the first sub-gradient value can be computed by the chain rule as:

∂L/∂w = G(N j) · ∂N b/∂w

where w is a parameter of the first model, N j is the intermediate result sent to the second device during the current round of joint parameter update (that is, the model output of the first model before local iteration), G(N j) is the gradient value of the prediction loss with respect to N j returned by the second device, that is, the vertical federation intermediate result, and N b is the model output of the first model in the current round of local iteration.
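A minimal sketch of step S205's chain-rule computation, assuming a scalar model output per sample; G(N j) is the value received from the second device, and the local partial derivatives of N b with respect to each parameter are assumed to come from local backpropagation. Names are illustrative.

```python
def first_sub_gradient(g_nj, dNb_dw):
    # Chain rule: for each parameter w, the prediction-loss gradient
    # is G(N_j), the gradient returned by the second device, times
    # the local partial derivative of the output N_b w.r.t. w.
    return [g_nj * d for d in dNb_dw]
```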
Step S206: computing the second sub-gradient value of the proximal optimization loss with respect to the parameters, and adding the first sub-gradient value and the second sub-gradient value to obtain the gradient values of the parameters.
After computing the proximal optimization loss, the first device computes its gradient with respect to the parameters of the first model (hereinafter called the second sub-gradient value, for ease of distinction). The first device adds a parameter's first sub-gradient value and second sub-gradient value to obtain that parameter's gradient value. Specifically, when there are multiple parameters, each parameter has its own first and second sub-gradient values, and adding them per parameter yields the gradient value of each parameter.
In this embodiment, the first device computes the gradient values of the first model's parameters from the prediction loss and the proximal optimization loss, then updates the parameters with those gradient values, so the parameters are updated in the direction that minimizes both the prediction loss and the proximal optimization loss. This not only improves the model's prediction accuracy but also constrains the amount of parameter change and avoids distortion from excessive changes, so the communication cost can be reduced by increasing the number of local iterations while the model's prediction accuracy is still guaranteed.
Further, in step S206, the step of adding the first sub-gradient value and the second sub-gradient value to obtain the parameter's gradient value includes:
Step S2061: multiplying the second sub-gradient value by a preset adjustment coefficient and then adding the first sub-gradient value to obtain the parameter's gradient value.
In one implementation, an adjustment coefficient may be set in the first device to control how strongly the parameter changes are constrained in each round of local iteration. Specifically, the first device multiplies the second sub-gradient value by the adjustment coefficient and then adds the first sub-gradient value to obtain the parameter's gradient value. The first device may adjust the coefficient according to the local iteration round; for example, within one round of joint parameter update, the coefficient may be initialized to 0.1 and then increased as the local iteration round grows, so that later local iterations constrain the parameter changes more strongly.
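Step S2061 can be sketched as follows; function and argument names are illustrative.

```python
def combine_sub_gradients(first_sub, second_sub, adjust_coef=0.1):
    # Scale the proximal-loss gradient (second sub-gradient) by the
    # preset adjustment coefficient, then add the prediction-loss
    # gradient (first sub-gradient), per parameter.
    return [f + adjust_coef * s for f, s in zip(first_sub, second_sub)]
```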
Further, based on the first, second and/or third embodiments above, a fourth embodiment of the user risk prediction method of the present application is proposed. In this embodiment, the method is applied to a first device participating in vertical federated learning; the first device is communicatively connected to a second device participating in vertical federated learning, and the first device and the second device may be devices such as smart phones, personal computers and servers. The user risk prediction method includes the following steps:
Step A10: performing vertical federated learning jointly with the second device based on a proximal optimization loss to obtain a local risk prediction model, where the proximal optimization loss characterizes the change of the parameter values of the local model to be trained in the current local iteration relative to their values in a preset historical round of local iteration.
The first device may be a data application participant or a data provider participant. The first device is deployed with a first data set constructed from the data of each user under first data features and a first model (hereinafter also called the local model to be trained); the second device is deployed with a second data set constructed from the data of each user under second data features and a second model (hereinafter also called the other-end model to be trained). The two data sets share the same user dimension; the first data features and the second data features are both data features relevant to predicting user risk, and they differ from each other. The first model and the second model are two parts of one complete machine learning model; a commonly used machine learning model, such as a linear regression model or a neural network model, may be selected as needed, and the prediction result of the model is set to a data form that can characterize the user's degree of risk, such as a risk value. The first device and the second device jointly use the first data set and the second data set to train the first model and the second model; after training is completed, the two models can be used to jointly predict a user's risk. The risk may be, for example, the credit risk before a user takes out a loan, or the risk of delinquent repayment during a loan. For example, in one embodiment, the first device is deployed at a bank, and the first data features are banking-related features, such as a user's number of historical loans and historical defaults; the second device is deployed at an e-commerce platform, and the second data features are e-commerce-related features, such as a user's historical purchase count and amount. The first device and the second device use their respective data sets to perform vertical federated learning and train a model for predicting pre-loan credit risk.
Specifically, the first device performs vertical federated learning jointly with the second device based on the proximal optimization loss to obtain the local risk prediction model: the first device may carry out each round of local iteration within each round of joint parameter update according to the model parameter updating method of the first, second or third embodiments above to update the parameters of the first model, which is not detailed again here. After multiple rounds of joint parameter updates, the first device uses the first model with the finally updated parameters as the local risk prediction model.
Step A20: predicting the risk value of a user to be predicted using the local risk prediction model.
After obtaining the local risk prediction model, the first device may use it to predict the risk value of a user to be predicted. Specifically, the second device likewise performs each round of local iteration within each round of joint parameter update according to the model parameter updating method of the above embodiments to update the parameters of the second model; after multiple rounds of joint parameter updates, the second device uses the second model with the finally updated parameters as the other-end risk prediction model (where the other end refers to the second device). The first device may then use the local risk prediction model jointly with the other-end risk prediction model in the second device to predict the risk value of the user to be predicted, where the risk value may be a value indicating the degree of the user's risk.
In one embodiment, the second device may send the other-end risk prediction model to the first device. The first device inputs the user data of the user to be predicted under the first data features into the local risk prediction model to obtain one model output, inputs the user data under the second data features into the other-end risk prediction model to obtain another model output, and obtains the risk value of the user to be predicted from the two model outputs, for example by directly adding them. In another embodiment, if the first device is the data application participant that owns the label data, the first device inputs the user data of the user to be predicted under the first data features into the local risk prediction model to obtain one model output; the second device inputs the user data under the second data features into the other-end risk prediction model to obtain another model output and sends it to the first device; the first device then calculates the risk value of the user to be predicted from the two model outputs. For example, when the local risk prediction model of the first device includes the NetK and Netc parts shown in Fig. 3, the first device inputs each model output into the Netc part for processing to obtain the risk value of the user to be predicted. In another embodiment, if the first device is a data provider participant without label data and the second device is the data application participant that owns the label data, the first device inputs the user data of the user to be predicted under the first data features into the local risk prediction model to obtain one model output and sends that output to the second device; the second device inputs the user data under the second data features into the other-end risk prediction model to obtain another model output, calculates the risk value of the user to be predicted from the two model outputs, and returns the risk value to the first device.
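The simplest of these combinations, directly adding the two model outputs, can be sketched as follows. The linear party models, weights, and feature values are purely hypothetical stand-ins for the first and second models:

```python
import numpy as np

rng = np.random.default_rng(0)
w_bank = rng.normal(size=3)   # first device: banking-feature weights (hypothetical)
w_shop = rng.normal(size=2)   # second device: e-commerce-feature weights (hypothetical)

def local_output(w, x):
    # Each party computes its model output on its own feature slice only.
    return float(w @ x)

def joint_risk_value(x_bank, x_shop):
    # Combine the two partial outputs by direct addition, as in the first
    # embodiment above; a Netc-style combiner network could replace the sum.
    return local_output(w_bank, x_bank) + local_output(w_shop, x_shop)

user_bank = np.array([2.0, 0.0, 1.0])   # e.g. historical loan count, defaults
user_shop = np.array([15.0, 3.2])       # e.g. historical purchase count, amount
risk = joint_risk_value(user_bank, user_shop)
```

Neither party sees the other's raw features; only the scalar model outputs cross the boundary, which is what enables the exchanges described in the embodiments above.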
In this embodiment, during vertical federated learning with the second device, the first device adds a proximal optimization loss that characterizes the change of the parameter values of the first model in the current local iteration relative to their values in a preset historical round of local iteration. This proximal term constrains how much the parameters of the first model may change within local iterations, avoiding the distortion caused by excessively large parameter-value changes during local iteration, so that the communication cost of user risk prediction is reduced while the accuracy of user risk prediction is still guaranteed.
Further, in one embodiment, step A10 includes:
Step A101: receiving the vertical federated intermediate result of the current round of joint parameter update sent by the second device.
When performing a round of joint parameter update, the first device receives the vertical federated intermediate result of that round sent by the second device. Specifically, if the first device is the data application participant that owns the label data, the second device, in the current round of joint parameter update, inputs its training data into the second model for processing to obtain the model output, and sends that output to the first device as the intermediate result, i.e. the vertical federated intermediate result. If the first device is a data provider participant without label data, the first device inputs its training data into the first model for processing to obtain the model output and sends that output to the second device; the second device computes the gradient value of the prediction loss with respect to that output and sends this gradient value to the first device as the intermediate result, i.e. the vertical federated intermediate result.
Step A102: performing a preset number of rounds of local iterative updates on the parameters of the local model to be trained, based on the proximal optimization loss and the vertical federated intermediate result.
The first device performs a preset number of rounds of local iterative updates on the parameters of the local model to be trained based on the proximal optimization loss and the vertical federated intermediate result; each round of local iteration may follow the model parameter updating method of the first, second or third embodiments above, which is not detailed again here. The preset number of rounds may be a quantity set in advance as required.
Step A103: detecting whether the local model to be trained with updated parameters satisfies a preset model condition.
After performing the preset number of rounds of local iterations, the first device detects whether the local model to be trained with updated parameters satisfies a preset model condition. The preset model condition may be any condition set in advance, for example that the prediction loss has converged, that the number of rounds of joint parameter updates has reached a predetermined number, or that the joint parameter updating has run for a predetermined duration.
Step A104: if the condition is satisfied, using the local model to be trained with updated parameters as the local risk prediction model.
If the preset model condition is detected to be satisfied, the first device may use the local model to be trained with updated parameters as the local risk prediction model. Correspondingly, the second device uses the other-end model to be trained with updated parameters as the other-end risk prediction model.
Step A105: if the condition is not satisfied, returning to the step of receiving the vertical federated intermediate result of the current round of joint parameter update sent by the second device.
If the preset model condition is detected not to be satisfied, the first device returns to step A101 above, i.e., performs the next round of joint parameter update.
Further, in one embodiment, step A102 includes:
Step A1021: calculating the proximal optimization loss, and calculating the gradient value corresponding to the parameter based on the proximal optimization loss, the model output of the local model to be trained in the current local iteration, and the vertical federated intermediate result.
Step A1022: updating the parameter with the gradient value to complete the current local iteration.
Specifically, the first device may calculate the proximal optimization loss and the gradient value corresponding to each parameter according to the specific implementation of steps S10 and S20 in the first embodiment above, and may update the parameters from the gradient values according to the specific implementation of step S30 above, which is not detailed again in this embodiment.
Step A1023: detecting whether the number of local iteration rounds has reached the preset number of rounds.
Step A1024: if it has, executing the step of detecting whether the local model to be trained with updated parameters satisfies the preset model condition.
Step A1025: if it has not, returning to the step of calculating the proximal optimization loss and incrementing the number of local iteration rounds by 1.
After completing one round of local iteration, the first device detects whether the current number of local iteration rounds has reached the preset number. If it has, the first device executes step A103; if it has not, the first device increments the number of local iteration rounds by 1 and returns to step A1021, i.e., performs the next round of local iteration.
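The nested control flow of steps A101 through A105 and A1021 through A1025 can be sketched as below; the three callables are placeholders for the federated communication, the gradient-based parameter update, and the model-condition check described above, not details fixed by the embodiment:

```python
def run_joint_training(local_rounds, max_joint_rounds,
                       receive_intermediate, local_update, condition_met):
    for joint_round in range(max_joint_rounds):
        # Step A101: receive this round's vertical federated intermediate result.
        intermediate = receive_intermediate(joint_round)
        local_round = 0
        # Steps A1021-A1025: a preset number of local iterations.
        while local_round < local_rounds:
            local_update(intermediate, local_round)
            local_round += 1          # increment the local round counter by 1
        # Step A103: check the preset model condition.
        if condition_met(joint_round):
            return joint_round + 1    # Step A104: training finished
        # Step A105: otherwise fall through to the next joint round.
    return max_joint_rounds

# Toy run: the "model condition" is first met at the third joint round.
updates = []
rounds_used = run_joint_training(
    local_rounds=4,
    max_joint_rounds=10,
    receive_intermediate=lambda r: r,
    local_update=lambda inter, lr: updates.append((inter, lr)),
    condition_met=lambda r: r >= 2,
)
```

Each joint round costs one communication exchange, so raising `local_rounds` is what trades extra local computation for fewer exchanges, as discussed above.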
In addition, an embodiment of the present application also proposes a model parameter updating apparatus. Referring to Fig. 6, the apparatus is deployed on a first device participating in vertical federated learning, the first device being communicatively connected to a second device participating in vertical federated learning. The apparatus includes:
a first calculation module 10, configured to calculate a proximal optimization loss, where the proximal optimization loss characterizes the change of the parameter values of the parameters of the first model in the first device in the current local iteration relative to their values in a preset historical round of local iteration;
a second calculation module 20, configured to calculate the gradient value corresponding to the parameter based on the proximal optimization loss, the model output of the first model in the current local iteration, and the vertical federated intermediate result received from the second device; and
an updating module 30, configured to update the parameter with the gradient value to complete the current local iteration.
Further, the first calculation module 10 includes:
a first calculation unit, configured to perform element-wise subtraction between the parameter vector of the parameters of the first model in the first device in the current local iteration and the parameter vector in the preset historical round of local iteration to obtain a difference vector; and
a second calculation unit, configured to calculate the sum of squares of the elements of the difference vector and to obtain the proximal optimization loss based on that sum of squares.
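A minimal sketch of these two units, assuming the parameters are held in NumPy vectors (all names hypothetical):

```python
import numpy as np

def proximal_loss(current_params, historical_params):
    # First unit: element-wise subtraction yields the difference vector.
    diff = np.asarray(current_params) - np.asarray(historical_params)
    # Second unit: the proximal optimization loss is the sum of squares of
    # the difference vector's elements (its squared Euclidean norm).
    return float(np.sum(diff ** 2))

w_now = np.array([0.9, -0.1, 0.4])    # current local-iteration parameters
w_hist = np.array([1.0, 0.0, 0.5])    # parameters from the preset historical round
loss = proximal_loss(w_now, w_hist)
```

The loss is zero only when the two vectors coincide, so minimizing it pulls the current parameters back toward the historical round's values.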
Further, when the first device is the participant that owns the label data, the vertical federated intermediate result is the output of the model in the second device, and the second calculation module 20 includes:
a first processing unit, configured to input the training data of the first device into the first model in the first device for processing, to obtain the model output of the first model in the current local iteration;
a third calculation unit, configured to calculate a prediction result from the model output and the vertical federated intermediate result, and to calculate a prediction loss based on the prediction result and the label data corresponding to the training data; and
a fourth calculation unit, configured to add the prediction loss and the proximal optimization loss to obtain a total loss, and to calculate the gradient value corresponding to the parameter based on the total loss.
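For concreteness, the total-loss gradient of the fourth calculation unit might be computed as follows for a linear first model with a squared-error prediction loss; the model form, the loss choice, and the weighting factor `mu` are assumptions made for this sketch, not details fixed by the embodiment:

```python
import numpy as np

def total_loss_gradient(w, w_hist, x, partner_output, label, mu=1.0):
    # Prediction result: local model output plus the vertical federated
    # intermediate result (the second device's model output).
    pred = float(x @ w) + partner_output
    pred_grad = 2.0 * (pred - label) * x   # gradient of the squared-error prediction loss
    prox_grad = 2.0 * (w - w_hist)         # gradient of the proximal optimization loss
    return pred_grad + mu * prox_grad      # gradient of the total loss w.r.t. w

w = np.array([0.2, -0.3])
w_hist = np.zeros(2)
grad = total_loss_gradient(w, w_hist, x=np.array([1.0, 2.0]),
                           partner_output=0.5, label=1.0)
```

Because the total loss is a plain sum, its gradient decomposes into the two sub-gradients, which is what the sixth calculation unit below exploits.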
Further, when the second device is the participant that owns the label data, the vertical federated intermediate result is the gradient value of the prediction loss in the second device with respect to the output of the first model sent by the first device in the current round of joint parameter update, and the second calculation module 20 includes:
a second processing unit, configured to input the training data of the first device into the first model of the first device for processing, to obtain the model output of the first model in the current local iteration;
a fifth calculation unit, configured to calculate a first sub-gradient value of the prediction loss with respect to the parameter from the model output and the vertical federated intermediate result; and
a sixth calculation unit, configured to calculate a second sub-gradient value of the proximal optimization loss with respect to the parameter, and to add the first sub-gradient value and the second sub-gradient value to obtain the gradient value corresponding to the parameter.
Further, the sixth calculation unit is further configured to:
multiply the second sub-gradient value by a preset adjustment coefficient and then add the first sub-gradient value to obtain the gradient value corresponding to the parameter.
In addition, an embodiment of the present application also proposes a user risk prediction apparatus. The apparatus is deployed on a first device participating in vertical federated learning, the first device being communicatively connected to a second device participating in vertical federated learning. The apparatus includes:
a federated learning module, configured to perform vertical federated learning jointly with the second device based on a proximal optimization loss to obtain a local risk prediction model, where the proximal optimization loss characterizes the change of the parameter values of the local model to be trained in the current local iteration relative to their values in a preset historical round of local iteration; and
a prediction module, configured to predict the risk value of a user to be predicted using the local risk prediction model.
Further, the federated learning module includes:
a receiving unit, configured to receive the vertical federated intermediate result of the current round of joint parameter update sent by the second device;
a local iteration unit, configured to perform a preset number of rounds of local iterative updates on the parameters of the local model to be trained based on the proximal optimization loss and the vertical federated intermediate result;
a detection unit, configured to detect whether the local model to be trained with updated parameters satisfies a preset model condition;
a determination unit, configured to use the local model to be trained with updated parameters as the local risk prediction model if the condition is satisfied; and
a returning unit, configured to return to the step of receiving the vertical federated intermediate result of the current round of joint parameter update sent by the second device if the condition is not satisfied.
Further, the local iteration unit includes:
a calculation subunit, configured to calculate the proximal optimization loss, and to calculate the gradient value corresponding to the parameter based on the proximal optimization loss, the model output of the local model to be trained in the current local iteration, and the vertical federated intermediate result;
an updating subunit, configured to update the parameter with the gradient value to complete the current local iteration;
a detection subunit, configured to detect whether the number of local iteration rounds has reached the preset number of rounds;
an execution subunit, configured to execute the step of detecting whether the local model to be trained with updated parameters satisfies the preset model condition if it has; and
a returning subunit, configured to return to the step of calculating the proximal optimization loss and increment the number of local iteration rounds by 1 if it has not.
In addition, an embodiment of the present application also proposes a computer-readable storage medium on which a model parameter update program is stored; when executed by a processor, the model parameter update program implements the steps of the model parameter updating method described above. The present application also proposes a computer program product including a computer program that, when executed by a processor, implements the steps of the model parameter updating method described above. For the embodiments of the model parameter updating device, computer-readable storage medium and computer program product of the present application, reference may be made to the embodiments of the model parameter updating method of the present application, which are not repeated here.
In addition, an embodiment of the present application also proposes a computer-readable storage medium on which a user risk prediction program is stored; when executed by a processor, the user risk prediction program implements the steps of the user risk prediction method described above. The present application also proposes a computer program product including a computer program that, when executed by a processor, implements the steps of the user risk prediction method described above. For the embodiments of the user risk prediction device, computer-readable storage medium and computer program product of the present application, reference may be made to the embodiments of the user risk prediction method of the present application, which are not repeated here.
It should be noted that, herein, the terms "comprise", "include" or any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or apparatus including a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article or apparatus. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article or apparatus that includes that element.
The above serial numbers of the embodiments of the present application are for description only and do not imply any ranking of the embodiments.
From the description of the above embodiments, those skilled in the art will clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, and of course also by hardware alone, though in many cases the former is the better implementation. Based on this understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as a ROM/RAM, a magnetic disk or an optical disc) and includes several instructions for causing a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to execute the methods described in the embodiments of the present application.
The above are only preferred embodiments of the present application and do not thereby limit its patent scope. Any equivalent structural or process transformation made using the contents of the specification and drawings of the present application, whether applied directly or indirectly in other related technical fields, is likewise included within the patent protection scope of the present application.

Claims (20)

  1. A model parameter updating method, wherein the method is applied to a first device participating in vertical federated learning, the first device being communicatively connected to a second device participating in vertical federated learning, and the method comprises the following steps:
    calculating a proximal optimization loss, wherein the proximal optimization loss characterizes the change of the parameter values of parameters of a first model in the first device in a current local iteration relative to their values in a preset historical round of local iteration;
    calculating a gradient value corresponding to the parameters based on the proximal optimization loss, a model output of the first model in the current local iteration, and a vertical federated intermediate result received from the second device; and
    updating the parameters with the gradient value to complete the current local iteration.
  2. The model parameter updating method according to claim 1, wherein the step of calculating a proximal optimization loss, wherein the proximal optimization loss characterizes the change of the parameter values of the parameters of the first model in the first device in the current local iteration relative to their values in a preset historical round of local iteration, comprises:
    performing element-wise subtraction between the parameter vector of the parameters of the first model in the first device in the current local iteration and the parameter vector in the preset historical round of local iteration to obtain a difference vector; and
    calculating the sum of squares of the elements of the difference vector, and obtaining the proximal optimization loss based on the sum of squares.
  3. The model parameter updating method according to claim 1 or 2, wherein, when the first device is the participant holding the label data, the vertical federated intermediate result is the output of the model in the second device, and
    the step of calculating the gradient value corresponding to the parameter based on the proximal optimization loss, the model output of the first model in the current round of local iteration, and the vertical federated intermediate result received from the second device comprises:
    inputting the training data of the first device into the first model in the first device for processing, to obtain the model output of the first model in the current round of local iteration;
    calculating a prediction result according to the model output and the vertical federated intermediate result, and calculating a prediction loss based on the prediction result and the label data corresponding to the training data;
    adding the prediction loss and the proximal optimization loss to obtain a total loss, and calculating the gradient value corresponding to the parameter based on the total loss.
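A sketch of the label-holding party's gradient in claim 3, assuming (for illustration only) linear local models, a sigmoid prediction, and a cross-entropy prediction loss — the claim itself does not fix the model family or loss function, and `mu` is an assumed proximal coefficient:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def label_party_gradient(X_a, y, w_a, w_a_hist, z_b, mu=0.1):
    """Gradient at the label-holding first device (claim 3, sketched):
    its local model output is combined with the second device's output
    (the vertical federated intermediate result) to form the prediction,
    and the total loss adds a proximal term on the local parameters."""
    z_a = X_a @ w_a                            # model output of the first model
    pred = sigmoid(z_a + z_b)                  # prediction from both parties' outputs
    grad_pred = X_a.T @ (pred - y) / len(y)    # gradient of the cross-entropy prediction loss
    grad_prox = 2.0 * mu * (w_a - w_a_hist)    # gradient of mu * ||w_a - w_a_hist||^2
    return grad_pred + grad_prox               # gradient of the total loss
```

Under this sketch, initializing both parties' outputs at zero gives a prediction of 0.5 per sample, and the resulting gradient reduces to the familiar logistic-regression form.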
  4. The model parameter updating method according to claim 1 or 2, wherein, when the second device is the participant holding the label data, the vertical federated intermediate result is the gradient value, computed in the second device, of the prediction loss with respect to the output of the first model sent by the first device in the current round of joint parameter updating, and
    the step of calculating the gradient value corresponding to the parameter based on the proximal optimization loss, the model output of the model in the current round of local iteration, and the vertical federated intermediate result received from the second device comprises:
    inputting the training data of the first device into the first model of the first device for processing, to obtain the model output of the first model in the current round of local iteration;
    calculating a first sub-gradient value of the prediction loss with respect to the parameter according to the model output and the vertical federated intermediate result;
    calculating a second sub-gradient value of the proximal optimization loss with respect to the parameter, and adding the first sub-gradient value and the second sub-gradient value to obtain the gradient value corresponding to the parameter.
  5. The model parameter updating method according to claim 4, wherein the step of adding the first sub-gradient value and the second sub-gradient value to obtain the gradient value corresponding to the parameter comprises:
    multiplying the second sub-gradient value by a preset adjustment coefficient and then adding the first sub-gradient value, to obtain the gradient value corresponding to the parameter.
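Claims 4 and 5 together can be sketched as below, again assuming a linear local model for illustration; `upstream_grad` stands for the received intermediate result (the label party's gradient of the prediction loss with respect to the first model's output), and `mu` and `coeff` are an assumed proximal coefficient and the preset adjustment coefficient of claim 5:

```python
import numpy as np

def non_label_party_gradient(X_b, w_b, w_b_hist, upstream_grad, mu=0.1, coeff=1.0):
    """Gradient at the party without label data (claims 4-5, sketched):
    the received upstream gradient is back-propagated through the local
    model by the chain rule, then combined with the proximal sub-gradient."""
    # First sub-gradient: chain rule through the local output z = X_b @ w_b.
    grad_pred = X_b.T @ upstream_grad
    # Second sub-gradient: derivative of the proximal term mu * ||w_b - w_b_hist||^2.
    grad_prox = 2.0 * mu * (w_b - w_b_hist)
    # Claim 5: scale the proximal sub-gradient by the preset adjustment coefficient.
    return grad_pred + coeff * grad_prox
```

The adjustment coefficient lets the party trade off fidelity to the shared objective (first sub-gradient) against stability of its local parameters (second sub-gradient).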
  6. A user risk prediction method, wherein the method is applied to a first device participating in vertical federated learning, the first device is communicatively connected to a second device participating in vertical federated learning, and the method comprises the following steps:
    performing vertical federated learning jointly with the second device based on a proximal optimization loss to obtain a local risk prediction model, wherein the proximal optimization loss characterizes the change of the parameter value of a parameter of the local model to be trained in the current local iteration relative to the parameter value in a preset historical round of local iteration;
    predicting a risk value of a user to be predicted by using the local risk prediction model.
  7. The user risk prediction method according to claim 6, wherein the step of performing vertical federated learning jointly with the second device based on the proximal optimization loss to obtain the local risk prediction model comprises:
    receiving a vertical federated intermediate result of the current round of joint parameter updating sent by the second device;
    performing a preset number of rounds of local iterative updating on the parameters of the local model to be trained based on the proximal optimization loss and the vertical federated intermediate result;
    detecting whether the local model to be trained with the updated parameters satisfies a preset model condition;
    if so, taking the local model to be trained with the updated parameters as the local risk prediction model;
    if not, returning to the step of receiving the vertical federated intermediate result of the current round of joint parameter updating sent by the second device.
  8. The user risk prediction method according to claim 6 or 7, wherein the step of performing the preset number of rounds of local iterative updating on the parameters of the local model to be trained based on the proximal optimization loss and the vertical federated intermediate result comprises:
    calculating the proximal optimization loss, and calculating gradient values corresponding to the parameters based on the proximal optimization loss, the model output of the local model to be trained in the current round of local iteration, and the vertical federated intermediate result;
    updating the parameters by using the gradient values to complete the current round of local iteration;
    detecting whether the number of local iteration rounds reaches the preset number of rounds;
    if so, performing the step of detecting whether the local model to be trained with the updated parameters satisfies the preset model condition;
    if not, incrementing the number of local iteration rounds by 1 and returning to the step of calculating the proximal optimization loss.
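The inner loop of claim 8 can be sketched as a fixed number of gradient steps per joint round. This is an illustrative skeleton, not the patented procedure: `grad_fn` stands for any combined-gradient computation (prediction part plus proximal part, as in claims 3-5), and the learning rate `lr` is an assumption.

```python
import numpy as np

def local_iterations(w, w_hist, grad_fn, preset_rounds, lr=0.1):
    """Runs the preset number of local iteration rounds (claim 8, sketched).
    grad_fn(w, w_hist) returns the gradient of the total objective for the
    current parameters; each gradient step completes one local iteration."""
    for _ in range(preset_rounds):
        w = w - lr * grad_fn(w, w_hist)
    return w
```

With a purely proximal gradient `grad_fn = lambda w, wh: 2 * (w - wh)`, each step shrinks the distance to the historical parameters by a factor of `1 - 2 * lr`, which makes the stabilizing effect of the proximal term easy to see.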
  9. A model parameter updating apparatus, wherein the apparatus is deployed on a first device participating in vertical federated learning, the first device is communicatively connected to a second device participating in vertical federated learning, and the apparatus comprises:
    a first calculation module, configured to calculate a proximal optimization loss, wherein the proximal optimization loss characterizes the change of the parameter value of a parameter of a first model in the first device in the current round of local iteration relative to the parameter value in a preset historical round of local iteration;
    a second calculation module, configured to calculate a gradient value corresponding to the parameter based on the proximal optimization loss, the model output of the first model in the current round of local iteration, and a vertical federated intermediate result received from the second device;
    an updating module, configured to update the parameter by using the gradient value, so as to complete the current round of local iteration.
  10. A model parameter updating device, wherein the model parameter updating device comprises: a memory, a processor, and a model parameter updating program stored on the memory and executable on the processor, wherein the model parameter updating program, when executed by the processor, implements the steps of the model parameter updating method according to claim 1.
  11. A model parameter updating device, wherein the model parameter updating device comprises: a memory, a processor, and a model parameter updating program stored on the memory and executable on the processor, wherein the model parameter updating program, when executed by the processor, implements the steps of the model parameter updating method according to claim 2.
  12. A model parameter updating device, wherein the model parameter updating device comprises: a memory, a processor, and a model parameter updating program stored on the memory and executable on the processor, wherein the model parameter updating program, when executed by the processor, implements the steps of the model parameter updating method according to claim 3.
  13. A user risk prediction apparatus, wherein the apparatus is deployed on a first device participating in vertical federated learning, the first device is communicatively connected to a second device participating in vertical federated learning, and the apparatus comprises:
    a federated learning module, configured to perform vertical federated learning jointly with the second device based on a proximal optimization loss to obtain a local risk prediction model, wherein the proximal optimization loss characterizes the change of the parameter value of a parameter of the local model to be trained in the current local iteration relative to the parameter value in a preset historical round of local iteration;
    a prediction module, configured to predict a risk value of a user to be predicted by using the local risk prediction model.
  14. A user risk prediction device, wherein the user risk prediction device comprises: a memory, a processor, and a model parameter updating program stored on the memory and executable on the processor, wherein the model parameter updating program, when executed by the processor, implements the steps of the user risk prediction method according to claim 6.
  15. A user risk prediction device, wherein the user risk prediction device comprises: a memory, a processor, and a model parameter updating program stored on the memory and executable on the processor, wherein the model parameter updating program, when executed by the processor, implements the steps of the user risk prediction method according to claim 7.
  16. A user risk prediction device, wherein the user risk prediction device comprises: a memory, a processor, and a model parameter updating program stored on the memory and executable on the processor, wherein the model parameter updating program, when executed by the processor, implements the steps of the user risk prediction method according to claim 8.
  17. A computer-readable storage medium, wherein a model parameter updating program is stored on the computer-readable storage medium, and the model parameter updating program, when executed by a processor, implements the steps of the model parameter updating method according to any one of claims 1 to 5.
  18. A computer-readable storage medium, wherein a model parameter updating program is stored on the computer-readable storage medium, and the model parameter updating program, when executed by a processor, implements the steps of the user risk prediction method according to any one of claims 6 to 9.
  19. A computer program product, comprising a computer program, wherein the computer program, when executed by a processor, implements the steps of the model parameter updating method according to any one of claims 1 to 5.
  20. A computer program product, comprising a computer program, wherein the computer program, when executed by a processor, implements the steps of the user risk prediction method according to any one of claims 6 to 9.
PCT/CN2021/094936 2021-03-17 2021-05-20 Model parameter updating method, apparatus and device, storage medium, and program product WO2022193432A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110287041.3A CN113011603A (en) 2021-03-17 2021-03-17 Model parameter updating method, device, equipment, storage medium and program product
CN202110287041.3 2021-03-17

Publications (1)

Publication Number Publication Date
WO2022193432A1 true WO2022193432A1 (en) 2022-09-22

Family

ID=76409316

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/094936 WO2022193432A1 (en) 2021-03-17 2021-05-20 Model parameter updating method, apparatus and device, storage medium, and program product

Country Status (2)

Country Link
CN (1) CN113011603A (en)
WO (1) WO2022193432A1 (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116186782A (en) * 2023-04-17 2023-05-30 北京数牍科技有限公司 Federal graph calculation method and device and electronic equipment
CN116205313A (en) * 2023-04-27 2023-06-02 数字浙江技术运营有限公司 Federal learning participant selection method and device and electronic equipment
CN116610958A (en) * 2023-06-20 2023-08-18 河海大学 Unmanned aerial vehicle group reservoir water quality detection oriented distributed model training method and system
CN117151208A (en) * 2023-08-07 2023-12-01 大连理工大学 Asynchronous federal learning parameter updating method based on self-adaptive learning rate, electronic equipment and storage medium
CN117575291A (en) * 2024-01-15 2024-02-20 湖南科技大学 Federal learning data collaborative management method based on edge parameter entropy

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114330759B (en) * 2022-03-08 2022-08-02 富算科技(上海)有限公司 Training method and system for longitudinal federated learning model
WO2024036526A1 (en) * 2022-08-17 2024-02-22 华为技术有限公司 Model scheduling method and apparatus
CN116128072B (en) * 2023-01-20 2023-08-25 支付宝(杭州)信息技术有限公司 Training method, device, equipment and storage medium of risk control model

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109754105A (en) * 2017-11-07 2019-05-14 华为技术有限公司 A kind of prediction technique and terminal, server
CN111210003A (en) * 2019-12-30 2020-05-29 深圳前海微众银行股份有限公司 Longitudinal federated learning system optimization method, device, equipment and readable storage medium
CN111242316A (en) * 2020-01-09 2020-06-05 深圳前海微众银行股份有限公司 Longitudinal federated learning model training optimization method, device, equipment and medium
CN111860864A (en) * 2020-07-23 2020-10-30 深圳前海微众银行股份有限公司 Longitudinal federal modeling optimization method, device and readable storage medium
WO2020234984A1 (en) * 2019-05-21 2020-11-26 NEC Corporation Learning device, learning method, computer program, and recording medium


Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116186782A (en) * 2023-04-17 2023-05-30 北京数牍科技有限公司 Federal graph calculation method and device and electronic equipment
CN116205313A (en) * 2023-04-27 2023-06-02 数字浙江技术运营有限公司 Federal learning participant selection method and device and electronic equipment
CN116205313B (en) * 2023-04-27 2023-08-11 数字浙江技术运营有限公司 Federal learning participant selection method and device and electronic equipment
CN116610958A (en) * 2023-06-20 2023-08-18 河海大学 Unmanned aerial vehicle group reservoir water quality detection oriented distributed model training method and system
CN117151208A (en) * 2023-08-07 2023-12-01 大连理工大学 Asynchronous federal learning parameter updating method based on self-adaptive learning rate, electronic equipment and storage medium
CN117151208B (en) * 2023-08-07 2024-03-22 大连理工大学 Asynchronous federal learning parameter updating method based on self-adaptive learning rate, electronic equipment and storage medium
CN117575291A (en) * 2024-01-15 2024-02-20 湖南科技大学 Federal learning data collaborative management method based on edge parameter entropy
CN117575291B (en) * 2024-01-15 2024-05-10 湖南科技大学 Federal learning data collaborative management method based on edge parameter entropy

Also Published As

Publication number Publication date
CN113011603A (en) 2021-06-22

Similar Documents

Publication Publication Date Title
WO2022193432A1 (en) Model parameter updating method, apparatus and device, storage medium, and program product
CN112202928B (en) Credible unloading cooperative node selection system and method for sensing edge cloud block chain network
US10891161B2 (en) Method and device for virtual resource allocation, modeling, and data prediction
WO2022016964A1 (en) Vertical federated modeling optimization method and device, and readable storage medium
CN113408743B (en) Method and device for generating federal model, electronic equipment and storage medium
CN107230133B (en) Data processing method, equipment and computer storage medium
US20170206361A1 (en) Application recommendation method and application recommendation apparatus
US11748452B2 (en) Method for data processing by performing different non-linear combination processing
CN110837653B (en) Label prediction method, apparatus and computer readable storage medium
WO2022048195A1 (en) Longitudinal federation modeling method, apparatus, and device, and computer readable storage medium
CN111797999A (en) Longitudinal federal modeling optimization method, device, equipment and readable storage medium
WO2023103864A1 (en) Node model updating method for resisting bias transfer in federated learning
CN110889759A (en) Credit data determination method, device and storage medium
WO2023217127A1 (en) Causation determination method and related device
CN110795768A (en) Model learning method, device and system based on private data protection
CN110874638B (en) Behavior analysis-oriented meta-knowledge federation method, device, electronic equipment and system
CN112292696A (en) Determining action selection guidelines for an execution device
CN115270001A (en) Privacy protection recommendation method and system based on cloud collaborative learning
CN112100642A (en) Model training method and device for protecting privacy in distributed system
CN112861165A (en) Model parameter updating method, device, equipment, storage medium and program product
WO2022188534A1 (en) Information pushing method and apparatus
CN113592593B (en) Training and application method, device, equipment and storage medium of sequence recommendation model
CN114760308A (en) Edge calculation unloading method and device
CN111510473B (en) Access request processing method and device, electronic equipment and computer readable medium
CN107766944B (en) System and method for optimizing system function flow by utilizing API analysis

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21931017

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21931017

Country of ref document: EP

Kind code of ref document: A1