WO2022016964A1 - Vertical federated modeling optimization method and device, and readable storage medium - Google Patents
- Publication number
- WO2022016964A1 (PCT/CN2021/093407)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- network
- data
- search
- gradient
- output
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/04—Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks
- H04L63/0428—Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks wherein the data content is protected, e.g. by encrypting or encapsulating the payload
Definitions
- the present application relates to the technical field of artificial intelligence, and in particular, to a vertical federation modeling optimization method, device and readable storage medium.
- In vertical federated learning, when the participants' data features overlap little but their users overlap heavily, the portion of users and data with the same users but different data features is taken out to jointly train a machine learning model. For example, consider two participants A and B in the same region, where participant A is a bank and participant B is an e-commerce platform. A and B share many of the same users in that region, but their businesses differ, so the user data features they record are different; in particular, the features recorded by A and B may be complementary. In such a scenario, vertical federated learning can help A and B build a joint machine learning prediction model and provide better services to their customers.
- However, participants in vertical federated learning need to design their own model structures in advance when using vertical federated technology, and even slight differences in the designed model structure may greatly affect the performance of the overall vertical federated learning system.
- the participation threshold of vertical federated learning is relatively high, which limits the application scope of vertical federated learning in specific task areas.
- The main purpose of this application is to provide a vertical federated modeling optimization method, device, and readable storage medium, which aims to address the problem that current vertical federated learning participants must design their own model structures in advance when using vertical federated technology, resulting in a high threshold for participation in vertical federated learning.
- the present application provides an optimization method for vertical federation modeling.
- the method is applied to a label party participating in the vertical federation modeling, and the label party is connected to each data party participating in the vertical federation modeling.
- Each data party is respectively deployed with a first data set and a first search network constructed based on its own data features, and the method includes the following steps:
- The present application also provides a vertical federated modeling optimization method; the method is applied to the data parties participating in the vertical federated modeling, and each data party is respectively deployed with a first data set and a first search network constructed based on its own data features. The method includes the following steps:
- calculating, by the label party according to the second network output and the label data of the label party, the first gradient of the loss function relative to each of the first network outputs, and returning each first gradient to the corresponding data party;
- updating the search structure parameters and/or model parameters in the first search network according to the first gradient received from the label party.
- The present application also provides a vertical federated modeling optimization device, which includes: a memory, a processor, and a vertical federated modeling optimization program stored in the memory and executable on the processor; when executed by the processor, the program implements the steps of the vertical federated modeling optimization method described above.
- The present application also proposes a computer-readable storage medium on which a vertical federated modeling optimization program is stored; when the program is executed by a processor, it implements the steps of the vertical federated modeling optimization method described above.
- In this application, the label party receives the first network output sent by each data party, where the first network output is obtained by the data party inputting its first data set into its first search network; the label party fuses each first network output to obtain the second network output, and calculates the first gradient of the loss function relative to each first network output according to the second network output and the label data of the local end; each first gradient is subjected to differential privacy encryption processing to obtain each first encrypted gradient, and each first encrypted gradient is sent to the corresponding data party, so that the data party can update the search structure parameters and/or model parameters in its first search network according to the first encrypted gradient.
- Since the gradient sent by the label party to each data party is processed with differential privacy encryption, the data party cannot learn the original gradient, which prevents the data party from deriving the label party's label data and feature data from the gradient. This avoids leaking the label party's private data to the data parties and improves the data security of the label party during the vertical federated modeling process.
- The present application realizes that, in the vertical federated modeling process, each data party only needs to set up its own search network; the connections between the network units in the search network, that is, the model structure, are determined automatically by optimizing and updating the search structure parameters during vertical federated modeling. This realizes automatic vertical federated learning without spending large amounts of manpower and material resources on pre-setting the model structure, lowers the threshold for participating in vertical federated learning, enables vertical federated learning to be applied to a wider range of specific task fields, and broadens the application scope of vertical federated learning.
- Moreover, the data sent to the label party is only the output of the search network, and what the label party sends to the data party is the gradient after differential privacy processing, so the data security and model information security of each participant are guaranteed to a certain extent.
- FIG. 1 is a schematic structural diagram of a hardware operating environment involved in a solution according to an embodiment of the present application
- FIG. 2 is a schematic flowchart of the first embodiment of the vertical federated modeling optimization method of the present application
- FIG. 3 is a framework diagram of automatic vertical federated learning with differential-privacy-encrypted communication information involved in an embodiment of the application.
- FIG. 1 is a schematic diagram of a device structure of a hardware operating environment involved in the solution of the embodiment of the present application.
- The vertical federated modeling optimization device in this embodiment of the present application may be a device such as a smart phone, a personal computer, or a server, which is not specifically limited herein.
- the vertical federated modeling optimization device may include: a processor 1001 , such as a CPU, a network interface 1004 , a user interface 1003 , a memory 1005 , and a communication bus 1002 .
- the communication bus 1002 is used to realize the connection and communication between these components.
- the user interface 1003 may include a display screen (Display), an input unit such as a keyboard (Keyboard), and the optional user interface 1003 may also include a standard wired interface and a wireless interface.
- the network interface 1004 may include a standard wired interface and a wireless interface (eg, a WI-FI interface).
- the memory 1005 may be high-speed RAM memory, or may be non-volatile memory, such as disk memory.
- the memory 1005 may also be a storage device independent of the aforementioned processor 1001 .
- The structure shown in FIG. 1 does not constitute a limitation on the vertical federated modeling optimization device, which may include more or fewer components than shown, combine some components, or use a different component arrangement.
- the memory 1005 as a computer storage medium may include an operating system, a network communication module, a user interface module, and a vertical federation modeling optimization program.
- the operating system is a program that manages and controls the hardware and software resources of the device, and supports the operation of the vertical federation modeling optimization program and other software or programs.
- The user interface 1003 is mainly used for data communication with the client; the network interface 1004 is mainly used for establishing a communication connection with each data party participating in the vertical federated modeling, and each data party is respectively deployed with a first data set and a first search network constructed based on its own data features;
- the processor 1001 can be used to call the vertical federation modeling optimization program stored in the memory 1005, and do the following:
- The step of receiving the first network output sent by the data party, wherein the first network output is obtained by the data party inputting the first data set into the first search network, includes:
- the label side is deployed with an output network and a second data set and a second search network constructed based on the data features of the label side,
- the step of fusing each of the first network outputs to obtain the second network output includes:
- The processor 1001 may also be used to call the vertical federated modeling optimization program stored in the memory 1005 and perform the following operations:
- calculating a second gradient of the loss function relative to the target parameters in the second search network according to the second network output and the label data, and updating the target parameters according to the second gradient, wherein the target parameters are search structure parameters and/or model parameters in the second search network.
- differential privacy encryption processing includes clipping processing and Gaussian noise addition processing
- The step of performing differential privacy encryption processing on each of the first gradients to obtain each first encrypted gradient includes:
- each element in the noise array corresponds to each element in the first gradient one-to-one;
- the first encrypted gradient is obtained by adding Gaussian noise to the first gradient by using the noise array.
- the method further includes:
- the second preset threshold is set according to the privacy level and the modeling progress.
- The user interface 1003 is mainly used for data communication with the client; the network interface 1004 is mainly used for establishing a communication connection with the label party of the model, and each data party is respectively deployed with a first data set and a first search network constructed based on its own data features;
- the processor 1001 can be used to call the vertical federation modeling optimization program stored in the memory 1005, and do the following:
- calculating, by the label party according to the second network output and the label data of the label party, the first gradient of the loss function relative to each of the first network outputs, and returning each first gradient to the corresponding data party;
- updating the search structure parameters and/or model parameters in the first search network according to the first gradient received from the label party.
- The search structure parameters in the first search network of the data party include the weights corresponding to the connection operations between network units in the first search network, and the weights are updated according to the first gradient received from the label party.
- the processor 1001 can also be used to call the vertical federation modeling optimization program stored in the memory 1005, and perform the following operations:
- a reserved operation is selected from the connection operations, and the model formed by each reserved operation and the network units connected by the reserved operations is taken as the target model.
- FIG. 2 is a schematic flowchart of the first embodiment of the vertical federation modeling optimization method of the present application. It should be noted that although a logical order is shown in the flowcharts, in some cases, the steps shown or described may be performed in an order different from that herein.
- the vertical federated modeling optimization method of the present application is applied to the labeling party participating in the vertical federated learning.
- the labeling party is connected to each data party participating in the vertical federated learning.
- The data parties and the label party can be devices such as smart phones, personal computers, and servers.
- the vertical federated modeling optimization method includes:
- Step S10 receiving the first network output sent by the data party, wherein the first network output is obtained by the data party inputting the first data set into the first search network;
- the participants in the vertical federated learning are divided into two categories, one is the labeling party with label data, and the other is the data party that has no label data but has feature data.
- There is one label party, and there are one or more data parties.
- Each data party can deploy a dataset and a search network based on its own data features; if the tag party also has feature data, the tag party can also act as a data party and deploy the data set and search network based on its data features.
- In that case, the label party performs both the tasks of the label party and the tasks of the data party.
- To distinguish them, the data set and the search network of a data party are called the first data set and the first search network, and the data set and the search network of the label party are called the second data set and the second search network.
- the sample dimensions of the data sets of each participant are aligned, that is, the sample IDs of each data set are the same, but the data characteristics of each participant may be different.
- Each participant may use the encrypted sample alignment method in advance to construct a sample dimension-aligned data set, which will not be described in detail here.
- The search network refers to a network used for neural architecture search (NAS).
- The search network of each participant may be designed using the DARTS (Differentiable Architecture Search) method.
- The search network includes multiple units, each unit corresponds to a network layer, and connection operations are set between some units. Taking two units as an example, N types of connection operations can be preset between these two units, and a weight is defined for each connection operation; the weights are the search structure parameters of the search network, and the network layer parameters within the units are the model parameters of the search network.
- a network structure search is required to optimize and update the search structure parameters and model parameters. Based on the final updated search structure parameters, the final network structure can be determined, that is, which connection operation or operations to retain. Since the structure of the network is determined after a network search, each participant does not need to set the network structure of the model like designing a traditional vertical federated learning model, thus reducing the difficulty of designing the model.
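The weighted connection operations described above can be sketched as follows. This is an illustrative toy, not the patent's disclosed implementation: the three candidate operations, the choice of N = 3, and the softmax weighting over scalar activations are all assumptions in the style of differentiable architecture search.

```python
import math

# Toy sketch of a mixed connection between two network units: N candidate
# connection operations, each with a learnable weight alpha (the "search
# structure parameters"). The connection's output is the softmax-weighted
# sum of the candidate operations' outputs.

def softmax(alphas):
    exps = [math.exp(a) for a in alphas]
    total = sum(exps)
    return [e / total for e in exps]

# Three hypothetical candidate operations on a scalar activation
# (stand-ins for e.g. identity / a parameterized layer / no connection).
ops = [
    lambda x: x,        # identity connection
    lambda x: 2.0 * x,  # stand-in for a layer with model parameters
    lambda x: 0.0,      # "zero" operation, i.e. no connection
]

def mixed_op(x, alphas):
    """Continuous relaxation: weighted sum over all candidate operations."""
    weights = softmax(alphas)
    return sum(w * op(x) for w, op in zip(weights, ops))

print(mixed_op(1.0, [0.0, 0.0, 0.0]))  # equal weights: (1 + 2 + 0) / 3 = 1.0
```

Updating the alphas by gradient descent shifts weight toward the better connection operations; the final network structure keeps the operation(s) with the greatest weight.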
- the search network combination of each participant constitutes a task model, and the network output of each search network is fused to obtain the final output of the task model.
- The task model may further include an output network for fusing the network outputs of each search network; the output network is connected after the search networks of the participants, takes the output data of each search network as input data, and its output is used as the final output of the task model.
- the output network can be deployed on the label side, and the output network can use a fully connected layer or other complex neural network structure, which can vary according to the model prediction task; the output form of the output network can also be set according to the specific model prediction task. For example, when the model prediction task is image classification, the output of the output network is the class to which the input image belongs.
- When a data party wants to update the parameters in its search network, it needs the label data held by the label party to calculate the loss function and gradients; likewise, the label party needs the data sets and search networks in the data parties.
- The data parties and the label party can exchange the intermediate results needed to update the model parameters and search structure parameters in their respective search networks, and update their respective search networks based on the received intermediate results, thereby completing the update of the task model.
- The intermediate result can be the gradient of the parameters or the output data of the search network. When the participant is a data party, the intermediate result sent to the label party may be the output data of its search network; when the participant is the label party, the intermediate result sent to a data party may be the gradient it calculated.
- Each participant can jointly update parameters in multiple rounds.
- That is, in each round, each data party sends a network output to the label party, and the label party sends a gradient back to each data party.
- Each participant may update only the model structure parameters in its search network, only the model parameters, or both simultaneously; that is, the model structure parameters and/or model parameters are updated.
- Through multiple rounds of joint updates, the model structure parameters and model parameters in each participant's search network are updated many times. Specifically, which parameters each participant updates in each round of joint parameter updating can be uniformly set in advance.
- Each data party inputs its first data set into its first search network, obtains an output result after processing by the first search network, and obtains the first network output based on the output result.
- The data parties send their respective first network outputs to the label party.
- the data party can directly use the output result of the first search network (also referred to as the original output of the network) as the first network output, or encrypt the output result with an encryption algorithm, and use the encrypted result as the first network output.
- For example, the homomorphic encryption method or the differential privacy encryption method may be used for encryption.
- The label party receives the first network output sent by each data party.
- The participants may use different data sets in each round of joint parameter updating. Specifically, the participants can divide the total data set into multiple small training sets (also referred to as data batches) and use one small data set per round, or each participant can, before each round of joint parameter updating, sample a batch of data with replacement from the total data set to participate in that round.
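The two batching strategies just described can be sketched as follows; the function names, batch size, and toy sample IDs are illustrative and not taken from the patent.

```python
import random

# Minimal sketch of the two batching strategies: splitting the total data
# set into fixed small training sets (data batches), or sampling a batch
# with replacement before each round of joint parameter updates.

def split_into_batches(dataset, batch_size):
    """Partition the total data set into small training sets (data batches)."""
    return [dataset[i:i + batch_size] for i in range(0, len(dataset), batch_size)]

def sample_batch_with_replacement(dataset, batch_size, seed=None):
    """Sample one batch with replacement for the current round."""
    rng = random.Random(seed)
    return [rng.choice(dataset) for _ in range(batch_size)]

samples = list(range(10))              # toy sample IDs (aligned across parties)
print(split_into_batches(samples, 4))  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```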
- Step S20 fusing each of the first network outputs to obtain a second network output, and calculating the first gradient of the loss function relative to each of the first network outputs according to the second network output and the label data of the local end;
- the label side fuses each first network output to obtain the second network output.
- The label party can average the first network outputs to obtain the second network output; or, when the label party deploys an output network, the first network outputs can be spliced and input into the output network, and the second network output is obtained after processing by the output network.
- the method of splicing may be vector splicing.
- The label party calculates a loss function according to the second network output and the label data of the label party. The loss function can be the mean square error for a regression problem or the cross-entropy loss for a classification problem; the label party then calculates the first gradient of the loss function relative to each first network output.
- the method of calculating the gradient according to the loss function can refer to the chain rule and the gradient descent algorithm, and will not be described in detail here.
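As a toy illustration of step S20 under the averaging fusion mentioned above: scalar outputs and a squared-error loss are assumptions made for brevity (real search networks output vectors), and the function names are hypothetical.

```python
# Sketch of step S20 with averaging fusion: the label party averages the
# first network outputs into the second network output, computes a
# squared-error loss against its label data, and derives the first gradient
# of the loss with respect to each first network output by the chain rule.

def fuse_by_average(first_outputs):
    return sum(first_outputs) / len(first_outputs)

def first_gradients(first_outputs, label):
    """d(loss)/d(output_i) for loss = (fused - label)^2 with average fusion."""
    fused = fuse_by_average(first_outputs)  # the second network output
    dloss_dfused = 2.0 * (fused - label)    # outer gradient of the loss
    dfused_dout = 1.0 / len(first_outputs)  # averaging contributes 1/n
    return [dloss_dfused * dfused_dout for _ in first_outputs]

outs = [0.2, 0.6, 1.0]  # first network outputs from three data parties
grads = first_gradients(outs, label=1.0)
print(grads)            # each gradient equals 2 * (0.6 - 1.0) / 3
```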
- Step S30: performing differential privacy encryption processing on each of the first gradients to obtain each first encrypted gradient, and sending each of the first encrypted gradients to the corresponding data party, so that the data party can update the search structure parameters and/or model parameters in its first search network according to the first encrypted gradient.
- Differential privacy is a technique in cryptography that aims to maximize the accuracy of queries against a statistical database while minimizing the chance of identifying individual records. A common implementation of differential privacy is to add random noise to the data.
- the differential privacy encryption processing method may adopt the existing differential privacy encryption processing method, which will not be described in detail here.
- The label party sends each first encrypted gradient to the corresponding data party; that is, each first encrypted gradient is returned to the data party that sent the first network output from which that gradient was calculated.
- The data party updates the search structure parameters and/or model parameters in its first search network according to the first encrypted gradient. Specifically, following the chain rule and the gradient descent algorithm, the data party calculates, from the first encrypted gradient, the gradient of the loss function relative to the search structure parameters and/or model parameters in its search network, and updates the corresponding parameters according to that gradient.
- There are three cases. In the first, the gradient of the loss function relative to the search structure parameters is calculated according to the first encrypted gradient, and the search structure parameters are updated according to that gradient. In the second, the gradient of the loss function relative to the model parameters is calculated according to the first encrypted gradient, and the model parameters are updated according to that gradient. In the third, both gradients are calculated according to the first encrypted gradient, and both the search structure parameters and the model parameters are updated. At this point, a round of the joint parameter updating process is completed.
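A minimal sketch of how a data party might propagate the received gradient into its own parameters via the chain rule; the toy model out = w * x, the learning rate, and all values are assumptions for illustration only.

```python
# Chain-rule sketch of the data-party update: given the received gradient
# g = dL/d(out) and the toy first search network out = w * x,
# dL/dw = dL/d(out) * d(out)/dw = g * x, followed by one gradient-descent step.

def update_parameter(w, x, g, lr=0.1):
    """One gradient-descent step on w using the received gradient g = dL/dout."""
    dloss_dw = g * x  # chain rule through out = w * x
    return w - lr * dloss_dw

w = 0.5
print(update_parameter(w, x=2.0, g=-0.8))  # 0.5 - 0.1 * (-0.8 * 2.0) = 0.66
```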
- the participants can obtain the target model according to the search network after updating the parameters.
- The search structure parameters in the search network of a participant may include the weights corresponding to the connection operations between network units in the search network. That is, connection operations are set between network units, and each connection operation corresponds to a weight; note that connection operations are not necessarily set between every two network units. A participant can select reserved operations from the connection operations according to the search structure parameters in its updated search network. Specifically, for every two network units that have connection operations between them, there are multiple connection operations, and the one or more connection operations with the greatest weight may be selected from them as the reserved operations. After the reserved operations are determined, the model formed by the reserved operations and the network units they connect is used as the participant's target model. The participants can then use their respective target models to jointly complete specific model prediction tasks.
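The structure-derivation step above can be sketched as follows; the unit and operation names are hypothetical, and the `keep` parameter (how many operations are reserved per connection) is an assumption.

```python
# Sketch of deriving the target model structure from the updated search
# structure parameters: for each pair of connected units, keep the candidate
# connection operation(s) with the greatest weight.

def select_reserved_ops(edge_weights, keep=1):
    """edge_weights: {edge: {op_name: weight}}; keep the top-`keep` ops per edge."""
    reserved = {}
    for edge, weights in edge_weights.items():
        ranked = sorted(weights, key=weights.get, reverse=True)
        reserved[edge] = ranked[:keep]
    return reserved

alphas = {
    ("unit1", "unit2"): {"identity": 0.7, "conv3x3": 1.4, "zero": -0.5},
    ("unit2", "unit3"): {"identity": 0.9, "conv3x3": 0.1, "zero": 0.2},
}
print(select_reserved_ops(alphas))
# {('unit1', 'unit2'): ['conv3x3'], ('unit2', 'unit3'): ['identity']}
```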
- In this embodiment, the label party receives the first network output sent by each data party, where the first network output is obtained by the data party inputting its first data set into its first search network; the label party fuses the first network outputs to obtain the second network output, and calculates the first gradient of the loss function relative to each first network output according to the second network output and the label data of the local end; each first gradient is subjected to differential privacy encryption processing to obtain each first encrypted gradient, and each first encrypted gradient is sent to the corresponding data party, so that the data party can update the search structure parameters and/or model parameters in its first search network according to the first encrypted gradient.
- Since the gradient sent by the label party to each data party is processed with differential privacy encryption, the data party cannot learn the original gradient, which prevents the data party from deriving the label party's label data and feature data from the gradient. This avoids leaking the label party's private data to the data parties and improves the data security of the label party during the vertical federated modeling process.
- This embodiment realizes that, in the vertical federated modeling process, each data party only needs to set up its own search network; the connections between the network units in the search network, that is, the model structure, are determined automatically by optimizing and updating the search structure parameters during vertical federated modeling. This realizes automatic vertical federated learning without spending large amounts of manpower and material resources on pre-setting the model structure, lowers the threshold for participating in vertical federated learning, enables vertical federated learning to be applied to a wider range of specific task fields, and broadens the application scope of vertical federated learning.
- Moreover, the data sent to the label party is only the output of the search network, and what the label party sends to the data party is the gradient after differential privacy processing, so the data security and model information security of each participant are guaranteed to a certain extent.
- the step S10 may include:
- Step S101: receiving the first network output sent by the data party, where the first network output is obtained by the data party inputting the first data set into the first search network to obtain an original network output, and then performing differential privacy encryption processing on the original network output.
- the data party can input its first data set into its first search network for processing to obtain the original output of the network, and the original output of the network is the result directly output by the first search network.
- The data party performs differential privacy encryption processing on the original network output to obtain the first network output, and then sends the first network output to the label party. That is, the first network output received by the label party from each data party is the result of the data party's differential privacy encryption processing, not the original network output of the first search network.
- As a result, the label party cannot learn the original network output from the first network output, which prevents the label party from deriving the data party's feature data from the original network output, further prevents the data party's private data from leaking to the label party, and improves the data security of the data party.
- In an embodiment, the label party deploys an output network and a second data set and a second search network constructed based on the data features of the label party, and the step of fusing each of the first network outputs to obtain the second network output in step S20 includes:
- Step S201: inputting the second data set into the second search network to obtain a third network output;
- When the label party owns feature data, the label party can deploy the second data set and the second search network constructed based on its data features.
- the tag side may also deploy an output network for fusing the network outputs of the various search networks.
- the tag side can input the second data set into the second search network, and obtain the output of the third network after processing by the second search network.
- Step S202: splicing the third network output and each of the first network outputs, and then inputting the spliced result into the output network to obtain the second network output;
- each network output can be regarded as a vector form, and a common vector splicing method can be used for splicing each network output.
- After step S20, the method further includes:
- Step S40: calculating the second gradient of the loss function relative to the target parameters in the second search network according to the second network output and the label data, and updating the target parameters according to the second gradient, wherein the target parameters are search structure parameters and/or model parameters in the second search network.
- The label party can also calculate the second gradient of the loss function relative to the target parameters in the second search network, and update the target parameters according to the second gradient.
- the target parameters may be search structure parameters and/or model parameters in the second search network.
- In the first case, the gradient of the loss function relative to the search structure parameters is calculated, and the search structure parameters are updated according to that gradient; in the second case, the gradient of the loss function relative to the model parameters is calculated, and the model parameters are updated according to that gradient; in the third case, both gradients are calculated, and both the search structure parameters and the model parameters are updated.
- the step of performing differential privacy encryption processing on each of the first gradients in step S30 to obtain each of the first encrypted gradients includes:
- Step S301 performing clipping processing on the first gradient to obtain a first clipping gradient, wherein the second-order norm of the first clipping gradient is less than or equal to a first preset threshold;
- the differential privacy encryption processing may include two steps of clipping processing and adding Gaussian noise processing.
- the label side performs clipping processing on the first gradient to obtain the first clipping gradient, and the second-order norm of the first clipping gradient obtained after clipping is less than or equal to the first preset threshold.
- the first preset threshold is a threshold preset by the tag side.
- the first gradient is clipped to the first clipping gradient whose second-order norm is less than or equal to the first preset threshold, so that the variation of the calculated first clipping gradient is limited to a certain range, and the data side cannot deduce the original data of the label side from the first clipping gradient.
- the label side can adopt any clipping processing method that achieves this purpose.
- one clipping processing method is as follows: for each first gradient, the label side calculates the ratio of the second-order norm of the first gradient to the first preset threshold, takes the larger of this ratio and 1, and divides the first gradient by that larger value to obtain the first clipping gradient.
- the second-order norm of the first clipping gradient calculated according to the method is less than or equal to the first preset threshold.
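The ratio-based clipping just described can be sketched as:

```python
import numpy as np

def clip_gradient(grad, threshold):
    """Clip so the second-order (L2) norm is at most `threshold`.

    Ratio rule from the text: divide the gradient by max(1, ||grad||_2 / threshold).
    """
    ratio = np.linalg.norm(grad) / threshold
    return grad / max(1.0, ratio)

g = np.array([3.0, 4.0])         # ||g||_2 = 5
clipped = clip_gradient(g, 2.0)  # norm shrinks exactly to the threshold
```

A gradient whose norm is already within the threshold passes through unchanged, since the ratio is then at most 1.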
- the label party can set different first preset thresholds for different data parties: if the privacy level is higher, a smaller first preset threshold can be set, and if the privacy level is lower, a larger first preset threshold can be set.
- Step S302 generating a noise array that obeys the target Gaussian distribution, wherein the mean value of the target Gaussian distribution is 0, the mean square error is a second preset threshold, and each element in the noise array corresponds to an element in the first gradient.
- Step S303 using the noise array to add Gaussian noise to the first gradient to obtain a first encrypted gradient.
- the label side can generate a noise array that obeys the target Gaussian distribution, where the mean of the target Gaussian distribution is 0, the mean square error is the second preset threshold, and each element in the noise array corresponds to an element in the first gradient.
- the second preset threshold may be set according to specific needs, and the second preset threshold may be the square of the first preset threshold multiplied by the square of a coefficient. If the first gradient is in matrix form, the generated noise array is also in matrix form, and the matrix size of the noise array is the same as that of the first gradient.
- the label side uses the noise array to add Gaussian noise to each first gradient to obtain each first encrypted gradient. Specifically, for each first gradient, the label side adds the noise array to the first gradient, that is, adds each element in the first gradient to the element at the corresponding position in the noise array. Since the first encrypted gradient is the result obtained after clipping and adding noise, the data side cannot recover the original first gradient from the first encrypted gradient, and therefore cannot deduce the original data of the label side, thereby improving the data privacy of the label side.
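Both steps together can be sketched as follows; here `sigma` stands for the noise scale derived from the second preset threshold (a sketch under these assumptions, not the application's exact implementation):

```python
import numpy as np

def dp_encrypt_gradient(grad, clip_threshold, sigma, rng=None):
    """Clip to an L2 norm of at most `clip_threshold`, then add Gaussian noise.

    The noise array has mean 0, scale `sigma`, and the same shape as the gradient.
    """
    if rng is None:
        rng = np.random.default_rng()
    clipped = grad / max(1.0, np.linalg.norm(grad) / clip_threshold)
    noise = rng.normal(loc=0.0, scale=sigma, size=grad.shape)
    return clipped + noise

g = np.array([[3.0, 4.0], [0.0, 1.0]])   # gradient in matrix form
encrypted = dp_encrypt_gradient(g, clip_threshold=2.0, sigma=0.5)
```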
- the method also includes:
- Step S50 obtaining the privacy level and modeling progress of this vertical federation modeling
- Step S60 setting the second preset threshold according to the privacy level and the modeling progress.
- the tag side can set the second preset threshold during the vertical federation modeling process, that is, the tag side can use a different second preset threshold when parameters are jointly updated in each round.
- the label side can obtain the privacy level of this vertical federation modeling and the current modeling progress.
- benchmark thresholds corresponding to different privacy levels can be preset, and threshold change ranges corresponding to different modeling progress can be preset, wherein the threshold change range can be negative or positive, and the modeling progress can be the convergence speed of the loss function, or the rounds or duration of jointly updating parameters.
- the label side can determine the corresponding benchmark threshold according to the privacy level, determine the threshold change range according to the current modeling progress, and add the threshold change range to the benchmark threshold to obtain the second preset threshold.
- the correspondence between the privacy level and the reference threshold may be that the higher the level, the larger the reference threshold, and the lower the level, the smaller the reference threshold, so that the higher the privacy level, the more noise is added, and the lower the privacy level, the less noise is added. The noise size is thus set flexibly according to the privacy level, avoiding data distortion caused by excessive noise that would affect the prediction accuracy of the model.
- the relationship between the convergence speed and the threshold change range can be that the faster the convergence speed, the larger the threshold change range, and the slower the convergence speed, the smaller the threshold change range, so that when the convergence speed is slow and convergence is difficult, the second preset threshold can be reduced by a smaller (possibly negative) threshold change range, thereby reducing the noise, so as to promote the convergence of the loss function and ensure the prediction accuracy of the model.
- the relationship between the rounds and the threshold change range may be that the larger the round number, the smaller the threshold change range, and the smaller the round number, the larger the threshold change range, so that as the rounds of jointly updating parameters increase, the second preset threshold becomes smaller and smaller and the noise is gradually reduced, so as to promote the convergence of the loss function and ensure the prediction accuracy of the model.
- the relationship between the duration and the threshold change range may be that the longer the duration, the smaller the threshold change range, and the shorter the duration, the larger the threshold change range, so that as the duration of jointly updating parameters grows, the second preset threshold becomes smaller and smaller and the noise is gradually reduced, so as to promote the convergence of the loss function and ensure the prediction accuracy of the model.
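The scheduling described above can be sketched as follows; the level names, base thresholds, and decay values are illustrative assumptions, not values from this application:

```python
# Benchmark thresholds per privacy level (assumed values): higher level -> more noise.
BASE_THRESHOLD = {"low": 0.2, "medium": 0.5, "high": 1.0}

def second_preset_threshold(privacy_level, round_index, decay=0.05, initial_delta=0.3):
    """Benchmark threshold plus a round-dependent threshold change range.

    The change range shrinks (and may turn negative) as the number of
    joint-update rounds grows, so the noise is gradually reduced.
    """
    delta = initial_delta - decay * round_index
    return max(0.0, BASE_THRESHOLD[privacy_level] + delta)
```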
- A represents the label side
- B represents the data side
- i represents the index of a data side
- N is the number of the data side.
- A has feature data X_A and corresponding label data Y_A
- B_1, ..., B_N have feature data X_1, ..., X_N respectively.
- the feature data X_A, X_1, ..., X_N have data features of different distributions.
- Each participant has a search network, namely Net_A, Net_1, ..., Net_N, and the corresponding model parameters and search structure parameters are W_A, W_1, ..., W_N and α_A, α_1, ..., α_N.
- A also deploys an output network Net_out for computing Y_out.
- the clip(x) in the lower right corner of the figure indicates that x is clipped, and +N(0, σ²) indicates that Gaussian noise is added to the clipping result.
- a fourth embodiment of the vertical federated modeling optimization method of the present application is proposed.
- the method is applied to data parties participating in vertical federation modeling, and each data party is respectively deployed with a first data set and a first search network constructed based on its respective data features, and the method includes the following steps:
- Step A10 inputting the first data set into the first search network to obtain the original output of the network
- the participants in the vertical federated learning are divided into two categories, one is the labeling party with label data, and the other is the data party that has no label data but has feature data.
- the labeling party has one
- the data side has one or more.
- Each data party can deploy a dataset and a search network based on its own data features; if the tag party also has feature data, the tag party can also act as a data party and deploy the data set and search network based on its data features.
- in this case, the label party performs both the tasks of the label side and the tasks of the data side.
- for distinction, the data set and the search network of the data side are called the first data set and the first search network,
- and the data set and the search network of the label side are called the second data set and the second search network.
- the sample dimensions of the data sets of each participant are aligned, that is, the sample IDs of each data set are the same, but the data characteristics of each participant may be different.
- Each participant may use the encrypted sample alignment method in advance to construct a sample dimension-aligned data set, which will not be described in detail here.
- the search network refers to a network used for neural architecture search (NAS).
- the search network of each participant may be designed using a differentiable architecture search (Differentiable Architecture Search, micro-structure search) method.
- the search network includes multiple units, each unit corresponds to a network layer, and connection operations are provided between some units. Taking two units as an example, N types of connection operations can be preset between these two units, and a weight is defined for each connection operation; the weights are the search structure parameters of the search network, and the network layer parameters in the units are the model parameters of the search network.
- a network structure search is required to optimize and update the search structure parameters and model parameters. Based on the final updated search structure parameters, the final network structure can be determined, that is, which connection operation or operations to retain. Since the structure of the network is determined after a network search, each participant does not need to set the network structure of the model like designing a traditional vertical federated learning model, thus reducing the difficulty of designing the model.
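In the spirit of differentiable architecture search, the weighted connection operations between two units can be sketched as a softmax-weighted mixture; the candidate operations below are stand-ins for illustration, not the operations actually used by the embodiments:

```python
import numpy as np

def softmax(a):
    e = np.exp(a - a.max())
    return e / e.sum()

# Candidate connection operations between two units (illustrative stand-ins).
ops = [
    lambda x: x,                  # identity / skip connection
    lambda x: np.zeros_like(x),   # "none" (no connection)
    lambda x: 2.0 * x,            # stand-in for a parameterized layer
]

alpha = np.array([1.0, 0.1, 0.5])  # search structure parameters: one weight per op

def mixed_op(x):
    """Output between the two units: weighted sum of all candidate operations."""
    weights = softmax(alpha)
    return sum(w * op(x) for w, op in zip(weights, ops))
```

During the search, `alpha` is updated by gradient descent like any other parameter; once the search finishes, only the operation(s) with the largest weights are retained to fix the network structure.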
- the search network combination of each participant constitutes a task model, and the network output of each search network is fused to obtain the final output of the task model.
- the task model may further include an output network for fusing the network outputs of each search network, the output network is set after the search network connected to each participant, and the output data of each search network is used as input data, The output of the output network is used as the final output of the task model.
- the output network can be deployed on the label side, and the output network can use a fully connected layer or other complex neural network structure, which can vary according to the model prediction task; the output form of the output network can also be set according to the specific model prediction task. For example, when the model prediction task is image classification, the output of the output network is the class to which the input image belongs.
- when the data side wants to update the parameters in its search network, it needs the label data on the label side to calculate the loss function and gradients, and the label side in turn needs the outputs of the data sets and search networks on the data side.
- the data side and the label side can interact with each other to update the intermediate results of the model parameters and search structure parameters in the respective search networks, and update the model parameters and the search network based on the received intermediate results.
- the search structure parameters are used to update the respective search networks, thereby completing the update of the task model.
- the intermediate result can be the gradient of the parameters or the output data of the search network.
- when the participant is a data side, the intermediate result sent to the label side may be the output data of the search network at its local end; when the participant is the label side, the intermediate result sent to a data side may be the gradient calculated for that data side.
- Each participant can jointly update parameters in multiple rounds.
- the data side sends a network output to the label side, and the label side sends a gradient to the data side.
- Each participant can update only the model structure parameters in its respective search network, or only the model parameters,
- or update the model structure parameters and model parameters simultaneously, that is, update the model structure parameters and/or model parameters.
- the model structure parameters and model parameters in the search network of each participant are updated multiple times. Specifically, in each round of joint parameter update, which parameter each participant will update can be set uniformly in advance.
- each data party inputs its respective first data set into its respective first search network, and after processing by the first search network obtains an output result, which is the original network output.
- the participants may use different data sets in each round of joint update parameters. Specifically, the participants can divide the total data set into multiple small training sets (also referred to as data batches), and each round uses a small data set to participate in the joint update of parameters, or the participants can also jointly update the parameters in each round Before parameter update, a batch of data is sampled with replacement from the total data set to participate in the joint parameter update of this round.
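Sampling a data batch with replacement, as described above, can be sketched as follows (toy data; in vertical federated learning all participants would additionally need to agree on the same sample IDs each round, e.g. via a shared seed — an assumption not spelled out in the code):

```python
import numpy as np

rng = np.random.default_rng(0)
total_dataset = np.arange(30).reshape(10, 3)  # toy aligned dataset: 10 samples, 3 features

def sample_batch(data, batch_size, rng):
    """Sample a data batch with replacement from the total data set."""
    idx = rng.integers(0, len(data), size=batch_size)
    return data[idx]

batch = sample_batch(total_dataset, batch_size=4, rng=rng)
```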
- Step A20 performing differential privacy encryption processing on the original network output to obtain a first network output
- Step A30 sending the first network output to the label party participating in the vertical federation modeling, so that the label party fuses the first network output received from each data party to obtain the second network output, Calculate the first gradient of the loss function relative to each of the first network outputs according to the second network output and the label data of the label side, and return the first gradient to the corresponding data side;
- the data party performs differential privacy encryption processing on the original network output to obtain the first network output, and then sends the first network output to the label party.
- the differential privacy encryption processing method in this embodiment may adopt the existing differential privacy encryption processing method. That is, the first network output sent by each data party received by the tag party is the result of the differential privacy encryption process performed by the data party, not the original network output of the first search network.
- the label party cannot know the original network output based on the first network output,
- which prevents the label party from deriving the feature data of the data party from the original network output, further prevents the private data of the data party from leaking to the label party, and improves the data security of the data party.
- the method of performing differential privacy processing on the output of the first network by the data party may refer to the method of performing differential privacy processing on the first gradient by the tag side in the third embodiment.
- i represents the index of each data side, O_i represents the first network output of data side i, and the clipping result is O_i' = O_i / max(1, ||O_i||_2 / C), where C denotes the preset clipping threshold.
- the tag side receives the first network output sent by each data side, and fuses each first network output to obtain the second network output. Specifically, the tag side can average the outputs of each first network to obtain the second network output, or when the tag side deploys an output network, the output of each first network can be spliced and input into the output network, and processed by the output network. Get the second network output.
- the method of splicing may be vector splicing.
- the label side calculates a loss function according to the output of the second network and the label data of the label side.
- the loss function can be the mean square error of the regression problem or the cross entropy loss of the classification problem, etc., and calculates the loss function relative to the first network output. a gradient.
- the method of calculating the gradient according to the loss function can refer to the chain rule and the gradient descent algorithm, and will not be described in detail here.
- After calculating the first gradient corresponding to each first network output, the label side can send each first gradient to the corresponding data side, that is, each first gradient is returned to the data side that sent the first network output to which it corresponds.
- Step A40 Update search structure parameters and/or model parameters in the first search network according to the first gradient received from the tag side.
- After receiving the first gradient, the data party updates the search structure parameters and/or model parameters in its first search network according to the first gradient. Specifically, according to the chain rule and the gradient descent algorithm, the data party calculates from the first gradient the gradient of the loss function relative to the search structure parameters and/or model parameters in its search network, and updates the corresponding search structure parameters and/or model parameters according to that gradient.
- the first: calculate the gradient of the loss function relative to the search structure parameters according to the first gradient, and update the search structure parameters according to the gradient;
- the second: calculate the gradient of the loss function relative to the model parameters according to the first gradient, and update the model parameters according to the gradient;
- the third: calculate the gradient of the loss function relative to the search structure parameters according to the first gradient and update the search structure parameters according to that gradient, and also calculate the gradient of the loss function relative to the model parameters according to the first gradient
- and update the model parameters according to that gradient.
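As a toy illustration of the chain rule in step A40 (a real search network is far more complex), suppose the data side's network were a single linear layer O = X·W; the first gradient dL/dO received from the label side is then turned into a parameter gradient as follows:

```python
import numpy as np

X = np.array([[1.0, 2.0], [3.0, 4.0]])  # local first data set (2 samples, 2 features)
W = np.array([[0.1], [0.2]])            # local model parameters

O = X @ W                                # network output previously sent to the label side
dL_dO = np.array([[0.5], [-0.5]])       # first gradient received from the label side

# Chain rule: dL/dW = X^T @ dL/dO, followed by a plain gradient-descent step.
dL_dW = X.T @ dL_dO
learning_rate = 0.1
W_updated = W - learning_rate * dL_dW
```

The same pattern applies to search structure parameters: backpropagate the received gradient through the local network to each parameter and take a gradient step.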
- the participants can obtain the target model according to the search network after updating the parameters.
- Each participant can use their own target models to jointly complete specific model prediction tasks.
- the data party inputs the first data set into the first search network to obtain the original network output, performs differential privacy encryption processing on the original network output to obtain the first network output, and sends the first network output to the label party, so that the label side fuses the first network outputs received from each data side to obtain the second network output, and calculates the first gradient of the loss function relative to each first network output according to the second network output and the label side's label data.
- the data side updates the search structure parameters and/or model parameters in the first search network according to the first gradient received from the label side.
- the data side cannot know the original gradient, thus preventing the data side from deriving the label data and feature data of the label side according to the gradient, which further avoids the The private data in the label side is leaked to the data side, which improves the data security of the label side.
- this embodiment realizes that, in the vertical federated modeling process, each data party only needs to set up its own search network.
- the model structure, that is, the connections between the network units in the search network, is automatically determined by optimizing and updating the search structure parameters during vertical federation modeling, which realizes automatic vertical federated learning without spending
- a large amount of human and material resources to pre-set the model structure. This lowers the threshold for participating in vertical federated learning, so that vertical federated learning can be applied to a wider range of specific task fields, and the application scope of vertical federated learning is improved.
- the data sent to the tag side is the output of the search network, and the tag side sent to the data side is the gradient after differential privacy processing. To a certain extent, the data security and model information security of each participant are guaranteed.
- the search structure parameter in the first search network of the data party includes the weight corresponding to the connection operation between the network units in the first search network, and after the step A40, the method further includes:
- Step A50 according to the search structure parameters in the first search network after updating the parameters, select a reservation operation from each connection operation;
- step A60 a model formed by each of the reservation operations and the network units connected to each of the reservation operations is used as a target model.
- the search structure parameters in the search network of the data party may include weights corresponding to connection operations between network elements in the search network. That is, connection operations are set between network units, and each connection operation corresponds to a weight. It should be noted that a connection operation is not set between any two network units.
- the data side can select the retention operation from each connection operation according to the search structure parameters in the updated search network. Specifically, for every two network units that have connection operations, there are multiple connection operations between them, and one or more connection operations with a greater weight may be selected from the multiple connection operations as the reserved operation. After the reservation operation is determined, the model formed by each reservation operation and the network units connected by each reservation operation is used as the target model of the participant.
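Selecting the retained operations from the updated search structure parameters can be sketched as follows; the operation names and weights are illustrative assumptions:

```python
import numpy as np

op_names = ["skip", "conv3x3", "maxpool", "none"]  # illustrative candidate operations
alpha = np.array([0.2, 1.5, 0.3, -0.4])            # updated search structure parameters

def select_retained_ops(names, weights, k=1):
    """Keep the k connection operations with the greatest weights."""
    order = np.argsort(weights)[::-1][:k]          # indices sorted by descending weight
    return [names[i] for i in order]

retained = select_retained_ops(op_names, alpha)    # keeps the top-weighted operation
```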
- each participant may be a device deployed in a bank or other financial institution, and the participant stores user data recorded by each institution during business processing.
- Each institution can build a data set based on its own data characteristics, use its own data set to jointly conduct vertical federated learning, and enrich the feature dimensions through joint modeling to improve the prediction performance of the model.
- each participant can jointly build a user risk prediction model, which is used to predict the user's risk level in business scenarios such as credit business and insurance business.
- the data characteristics of each participant can select the risk characteristics related to the user's risk prediction according to actual experience, such as the user's deposit amount, the user's default times, and so on.
- Each participant uses their own data sets to jointly perform vertical federation modeling according to the method in the above-mentioned embodiment to obtain their own target models.
- each participant can jointly carry out risk prediction for users.
- the data party inputs the user data corresponding to the second risk feature of the target user at the local end into the target model of the local end, and obtains the output of the first model after processing by the target model.
- the data party sends the first model output to the label party.
- the label side receives the first model output sent by each data side.
- the labeler inputs the user data corresponding to the first risk feature of the target user at its local end into the target model of its local end, and after processing by the target model, the second model output is obtained.
- the tag side splices the output of each first model and the output of the second model, and inputs the splicing result into the output network of the tag side's local end. After processing by the output network, the output obtains the risk prediction result of the target user.
- the tag party can send the target user's risk prediction result to the data party, so that the data party can perform subsequent business processing according to the target user's risk prediction result, for example, Determine whether to lend to the target user according to the risk prediction result.
- each participant only needs to set up its own search network and does not need to spend a lot of manpower and material resources carefully designing a model structure, thereby lowering the threshold for participating in vertical federated learning and making it more convenient for banks and other financial institutions to carry out joint modeling through vertical federated learning, and then complete the risk prediction task through the risk prediction model obtained by joint modeling. Moreover, in the process of vertical federation modeling and of using the models for risk prediction after modeling, the participants do not need to directly exchange their own data sets and models, thus ensuring the security of user privacy data at each participant.
- the data party performs differential privacy encryption on the network output of the search network before sending it to the label party, which prevents the label party from deriving the original user data in the data party according to the network output, thus further improving the data security of the data party.
- the label side encrypts the gradient corresponding to the network output before sending it to the data side, which prevents the data side from deriving the original user data in the label side according to the gradient of the network output, thus further improving the data security of the label side.
- an embodiment of the present application also proposes a vertical federation modeling optimization device.
- the device is deployed on a label party participating in the vertical federation modeling.
- the label party is in communication connection with each data party participating in the vertical federation modeling.
- a first data set and a first search network constructed based on respective data characteristics are respectively deployed on each data party, and the device includes:
- a receiving module configured to receive the first network output sent by the data party, wherein the first network output is obtained by the data party inputting the first data set into the first search network;
- a computing module configured to fuse each of the first network outputs to obtain a second network output, and calculate the first gradient of the loss function relative to each of the first network outputs according to the second network output and the label data of the local end;
- a first differential privacy processing module configured to perform differential privacy encryption processing on each of the first gradients to obtain each first encrypted gradient, and send each of the first encrypted gradients to the corresponding data party, for the data party to
- update search structure parameters and/or model parameters in the first search network according to the first encrypted gradient.
- the receiving module is also used for:
- the tag side is deployed with an output network, a second data set and a second search network constructed based on the data characteristics of the tag side, and the computing module includes:
- an input unit configured to input the second data set into the second search network to obtain a third network output
- a splicing unit for splicing the third network output and each of the first network outputs and then inputting the output network into the output network to obtain the second network output;
- the device also includes:
- a first update module configured to calculate the second gradient of the loss function relative to the target parameter in the second search network according to the second network output and the label data, and update the target parameter according to the second gradient , wherein the target parameter is a search structure parameter and/or a model parameter in the second search network.
- differential privacy encryption processing includes clipping processing and adding Gaussian noise processing
- first differential privacy processing module includes:
- a clipping processing unit configured to perform clipping processing on the first gradient to obtain a first clipping gradient, wherein the second-order norm of the first clipping gradient is less than or equal to a first preset threshold;
- a generating unit configured to generate a noise array that obeys a target Gaussian distribution, wherein the mean value of the target Gaussian distribution is 0, the mean square error is a second preset threshold, and each element in the noise array corresponds to an element in the first gradient.
- the noise adding unit is used for adding Gaussian noise to the first gradient by using the noise array to obtain a first encrypted gradient.
- the device also includes:
- the acquisition module is used to acquire the privacy level and modeling progress of this vertical federation modeling
- a setting module configured to set the second preset threshold according to the privacy level and the modeling progress.
- an embodiment of the present application also proposes a vertical federation modeling optimization device.
- the device is deployed on data parties participating in vertical federation modeling, and each data party is respectively deployed with a first data set and a first search network constructed based on its respective data characteristics, the apparatus includes:
- an input module configured to input the first data set into the first search network to obtain the original output of the network
- a second differential privacy processing module configured to perform differential privacy encryption processing on the original network output to obtain the first network output
- a sending module configured to send the first network output to the label party participating in the vertical federation modeling, so that the label party fuses the first network outputs received from each data party to obtain the second network output, then calculates the first gradient of the loss function relative to each of the first network outputs according to the second network output and the label data of the label side, and returns the first gradient to the corresponding data side;
- a second update module configured to update the search structure parameters and/or model parameters in the first search network according to the first gradient received from the label party.
- the search structure parameters in the data party's first search network include weights corresponding to the connection operations between network units in the first search network, and the device further includes:
- a selection module configured to select a reserved operation from the connection operations according to the search structure parameters of the first search network after the parameter update;
- a determination module configured to take the model formed by each reserved operation and the network units connected by each reserved operation as a target model.
- the specific implementation of the vertical federated modeling optimization apparatus of the present application is basically the same as that of the above-mentioned embodiments of the vertical federated modeling optimization method, and will not be repeated here.
- an embodiment of the present application further provides a computer-readable storage medium on which a vertical federated modeling optimization program is stored; when the program is executed by a processor, the steps of the vertical federated modeling optimization method described above are implemented.
Abstract
A vertical federated modeling optimization method and device, and a readable storage medium. The method comprises: receiving first network outputs sent by each data party, the first network outputs being obtained by the data party inputting its first data set into its first search network (S10); fusing the first network outputs to obtain a second network output, and calculating, according to the second network output and the label data of the home terminal, a first gradient of a loss function with respect to each first network output (S20); and performing differential privacy encryption processing on each first gradient to obtain first encrypted gradients, and sending each first encrypted gradient to the corresponding data party, so that the data party updates the search structure parameters and/or model parameters in the first search network according to the first encrypted gradient (S30).
Description
This application claims priority to Chinese patent application No. 202010719397.5, filed on July 23, 2020, the entire contents of which are incorporated herein by reference.
The present application relates to the technical field of artificial intelligence, and in particular to a vertical federated modeling optimization method, device and readable storage medium.
With the development of artificial intelligence, the concept of "federated learning" was proposed to solve the problem of data silos, so that the parties to a federation can train a model and obtain model parameters without disclosing their own data, thereby avoiding the leakage of private data.
Vertical federated learning applies when the data features of the participants overlap little while their users overlap substantially: the users shared by the participants, together with their differing feature data, are taken out to jointly train a machine learning model. For example, consider two participants A and B in the same region, where participant A is a bank and participant B is an e-commerce platform. A and B share many users in that region, but because their businesses differ, the user data features they record are different and may well be complementary. In such a scenario, vertical federated learning can be used to help A and B build a joint machine learning prediction model and thereby provide better services to their customers.
However, at present, participants in vertical federated learning must design their respective model structures in advance, and even a slight difference in the designed structure can greatly affect the performance of the overall vertical federated learning system. This makes the threshold for participation high and limits the range of concrete tasks to which vertical federated learning can be applied.
The main purpose of this application is to provide a vertical federated modeling optimization method, device and readable storage medium, aiming to solve the problem that participants in vertical federated learning currently need to design their model structures in advance, which results in a high threshold for participation.
To achieve the above purpose, the present application provides a vertical federated modeling optimization method. The method is applied to a label party participating in vertical federated modeling, the label party is communicatively connected to each data party participating in the modeling, and each data party is respectively deployed with a first data set and a first search network constructed based on its own data features. The method includes the following steps:
receiving a first network output sent by the data party, wherein the first network output is obtained by the data party inputting the first data set into the first search network;
fusing each first network output to obtain a second network output, and calculating, according to the second network output and the label data of the local end, a first gradient of the loss function with respect to each first network output;
performing differential privacy encryption processing on each first gradient to obtain first encrypted gradients, and sending each first encrypted gradient to the corresponding data party, so that the data party updates the search structure parameters and/or model parameters in its first search network according to the first encrypted gradient.
To achieve the above purpose, the present application further provides a vertical federated modeling optimization method applied to the data parties participating in vertical federated modeling, wherein each data party is respectively deployed with a first data set and a first search network constructed based on its own data features. The method includes the following steps:
inputting the first data set into the first search network to obtain a raw network output;
performing differential privacy encryption processing on the raw network output to obtain a first network output;
sending the first network output to the label party participating in the vertical federated modeling, so that the label party fuses the first network outputs received from the data parties to obtain a second network output, calculates, according to the second network output and the label party's label data, a first gradient of the loss function with respect to each first network output, and returns the first gradient to the corresponding data party;
updating the search structure parameters and/or model parameters in the first search network according to the first gradient received from the label party.
To achieve the above purpose, the present application further provides a vertical federated modeling optimization device, which includes a memory, a processor, and a vertical federated modeling optimization program stored in the memory and runnable on the processor; when executed by the processor, the program implements the steps of the vertical federated modeling optimization method described above.
In addition, to achieve the above purpose, the present application further proposes a computer-readable storage medium storing a vertical federated modeling optimization program; when executed by a processor, the program implements the steps of the vertical federated modeling optimization method described above.
In the present application, the label party receives the first network output sent by each data party, the first network output being obtained by the data party inputting its first data set into its first search network; the label party fuses the first network outputs to obtain a second network output, calculates, according to the second network output and the local label data, a first gradient of the loss function with respect to each first network output, performs differential privacy encryption processing on each first gradient to obtain first encrypted gradients, and sends each first encrypted gradient to the corresponding data party, so that the data party updates the search structure parameters and/or model parameters in its first search network. Because the gradient sent by the label party to the data party has undergone differential privacy encryption, the data party cannot learn the original gradient and therefore cannot derive the label party's label data and feature data from it; private data on the label side is thus prevented from leaking to the data side, improving the label party's data security during vertical federated modeling.
Moreover, compared with existing vertical federated learning, in which each participant must spend considerable manpower and resources designing the model structure in advance, the present application only requires each data party to set up its own search network: the connections between network units in the search network, that is, the model structure, are determined automatically during vertical federated modeling by optimizing and updating the search structure parameters. This realizes automatic vertical federated learning without the cost of pre-designing model structures, lowers the threshold for participating in vertical federated learning, and allows it to be applied to a wider range of concrete task fields. Furthermore, during modeling the data party sends the label party only the output of its search network, and the label party sends the data party only differentially private gradients; the participants never directly exchange data sets or the models themselves, which protects, to a certain extent, the data security and model information security of every participant.
FIG. 1 is a schematic structural diagram of the hardware operating environment involved in the solution of an embodiment of the present application;
FIG. 2 is a schematic flowchart of the first embodiment of the vertical federated modeling optimization method of the present application;
FIG. 3 is a framework diagram of automatic vertical federated learning with differentially private communication involved in an embodiment of the application.
It should be understood that the specific embodiments described herein are only used to explain the present application and are not intended to limit it.
As shown in FIG. 1, FIG. 1 is a schematic diagram of the device structure of the hardware operating environment involved in the solution of the embodiment of the present application.
It should be noted that the vertical federated modeling optimization device in this embodiment may be a smart phone, a personal computer, a server, or similar equipment, which is not specifically limited herein.
As shown in FIG. 1, the vertical federated modeling optimization device may include a processor 1001 (for example, a CPU), a network interface 1004, a user interface 1003, a memory 1005, and a communication bus 1002. The communication bus 1002 is used to realize connection and communication among these components. The user interface 1003 may include a display (Display) and an input unit such as a keyboard (Keyboard); optionally, the user interface 1003 may also include standard wired and wireless interfaces. The network interface 1004 may optionally include standard wired and wireless interfaces (such as a WI-FI interface). The memory 1005 may be high-speed RAM memory or non-volatile memory, such as disk memory; optionally, the memory 1005 may also be a storage device independent of the aforementioned processor 1001.
Those skilled in the art can understand that the device structure shown in FIG. 1 does not constitute a limitation on the vertical federated modeling optimization device, which may include more or fewer components than shown, combine certain components, or arrange the components differently.
As shown in FIG. 1, the memory 1005, as a computer storage medium, may include an operating system, a network communication module, a user interface module, and a vertical federated modeling optimization program. The operating system is a program that manages and controls the hardware and software resources of the device and supports the operation of the vertical federated modeling optimization program and other software or programs.
When the device shown in FIG. 1 is the label party participating in vertical federated modeling, the user interface 1003 is mainly used for data communication with the client, and the network interface 1004 is mainly used to establish communication connections with the data parties participating in the modeling, each of which is respectively deployed with a first data set and a first search network constructed based on its own data features. The processor 1001 may be used to call the vertical federated modeling optimization program stored in the memory 1005 and perform the following operations:
receiving a first network output sent by the data party, wherein the first network output is obtained by the data party inputting the first data set into the first search network;
fusing each first network output to obtain a second network output, and calculating, according to the second network output and the label data of the local end, a first gradient of the loss function with respect to each first network output;
performing differential privacy encryption processing on each first gradient to obtain first encrypted gradients, and sending each first encrypted gradient to the corresponding data party, so that the data party updates the search structure parameters and/or model parameters in its first search network according to the first encrypted gradient.
Further, the step of receiving the first network output sent by the data party includes:
receiving the first network output sent by the data party, wherein the first network output is obtained by the data party inputting the first data set into the first search network to produce a raw network output and then performing differential privacy encryption processing on the raw network output.
Further, the label party is deployed with an output network, as well as a second data set and a second search network constructed based on the label party's data features,
and the step of fusing each first network output to obtain the second network output includes:
inputting the second data set into the second search network to obtain a third network output;
splicing the third network output with each first network output and inputting the result into the output network to obtain the second network output.
After the step of calculating the first gradient of the loss function with respect to each first network output according to the second network output and the local label data, the processor 1001 may also call the vertical federated modeling optimization program stored in the memory 1005 and perform the following operation:
calculating, according to the second network output and the label data, a second gradient of the loss function with respect to a target parameter in the second search network, and updating the target parameter according to the second gradient, wherein the target parameter is a search structure parameter and/or a model parameter of the second search network.
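The fusion step above can be sketched as follows: the per-party first network outputs (and the label party's own third output, if any) are concatenated along the feature axis and passed through an output network. The single fully connected layer with softmax below is a hypothetical stand-in, since the application leaves the output network's exact structure task-dependent; all names are illustrative.

```python
import numpy as np

def fuse_outputs(first_outputs, third_output=None):
    """Concatenate the per-party network outputs along the feature axis;
    the label party's own third output, if present, is appended last."""
    parts = list(first_outputs)
    if third_output is not None:
        parts.append(third_output)
    return np.concatenate(parts, axis=1)

def output_network(fused, weights, bias):
    """Minimal stand-in for the output network: one fully connected layer
    followed by softmax, as suggested for classification tasks."""
    logits = fused @ weights + bias
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))
    return exp / exp.sum(axis=1, keepdims=True)
```

For an image-classification task, each row of the softmax output would be the predicted class distribution forming the second network output.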
Further, the differential privacy encryption processing includes clipping processing and Gaussian noise addition, and the step of performing differential privacy encryption processing on each first gradient to obtain each first encrypted gradient includes:
clipping the first gradient to obtain a first clipped gradient, wherein the L2 norm of the first clipped gradient is less than or equal to a first preset threshold;
generating a noise array that obeys a target Gaussian distribution, wherein the mean of the target Gaussian distribution is 0, its standard deviation is a second preset threshold, and the elements of the noise array correspond one-to-one to the elements of the first gradient;
adding Gaussian noise to the first gradient using the noise array to obtain the first encrypted gradient.
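A minimal sketch of the clipping and noise steps, assuming the "second preset threshold" acts as the noise standard deviation and that, following the usual DP-SGD convention, the noise is added to the clipped gradient; function and parameter names are illustrative, not taken from the application.

```python
import numpy as np

def dp_encrypt_gradient(grad, clip_c, sigma, rng=None):
    """Clip grad so its L2 norm is at most clip_c (first preset threshold),
    then add element-wise Gaussian noise with mean 0 and standard deviation
    sigma (second preset threshold). The noise array has the same shape as
    the gradient, matching it element for element."""
    rng = rng or np.random.default_rng()
    norm = np.linalg.norm(grad)
    clipped = grad * min(1.0, clip_c / max(norm, 1e-12))
    noise = rng.normal(loc=0.0, scale=sigma, size=grad.shape)
    return clipped + noise
```

With sigma set to 0 the function reduces to pure norm clipping, which makes the clipping bound easy to verify in isolation.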
Further, before the step of generating the noise array that obeys the target Gaussian distribution, the method further includes:
acquiring the privacy level and modeling progress of the current vertical federated modeling;
setting the second preset threshold according to the privacy level and the modeling progress.
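The application does not fix the rule for deriving the second preset threshold from these two quantities. One plausible schedule, stated purely as an assumption, scales the noise standard deviation up with the privacy level and relaxes it as modeling progress advances, on the reasoning that late-stage gradients typically reveal less about the raw data:

```python
def noise_scale(privacy_level, progress, base=1.0):
    """Hypothetical schedule for the second preset threshold.
    privacy_level: positive number, higher means stronger privacy.
    progress: fraction of training completed, in [0, 1].
    The exact functional form is an assumption, not from the source."""
    return base * privacy_level * (1.0 - 0.5 * progress)
```

Any monotone rule in both arguments would serve the same purpose; the point is only that the threshold is recomputed per round rather than fixed.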
When the device shown in FIG. 1 is a data party participating in vertical federated modeling, the user interface 1003 is mainly used for data communication with the client, and the network interface 1004 is mainly used to establish a communication connection with the label party participating in the modeling; each data party is respectively deployed with a first data set and a first search network constructed based on its own data features. The processor 1001 may be used to call the vertical federated modeling optimization program stored in the memory 1005 and perform the following operations:
inputting the first data set into the first search network to obtain a raw network output;
performing differential privacy encryption processing on the raw network output to obtain a first network output;
sending the first network output to the label party participating in the vertical federated modeling, so that the label party fuses the first network outputs received from the data parties to obtain a second network output, calculates, according to the second network output and the label party's label data, a first gradient of the loss function with respect to each first network output, and returns the first gradient to the corresponding data party;
updating the search structure parameters and/or model parameters in the first search network according to the first gradient received from the label party.
Further, the search structure parameters in the data party's first search network include the weights corresponding to the connection operations between network units in the first search network. After the step of updating the search structure parameters and/or model parameters in the first search network according to the first gradient received from the label party, the processor 1001 may also call the vertical federated modeling optimization program stored in the memory 1005 and perform the following operations:
selecting a reserved operation from the connection operations according to the search structure parameters of the first search network after the parameter update;
taking the model formed by each reserved operation and the network units connected by each reserved operation as the target model.
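Discretizing the searched structure as described, keeping for each edge the connection operation with the largest learned weight, can be sketched as a DARTS-style argmax; the array layout and operation names here are illustrative assumptions.

```python
import numpy as np

def select_reserved_ops(alpha, op_names):
    """For each edge between network units, keep the connection operation
    with the largest search structure parameter (weight).
    alpha: array of shape (num_edges, num_ops) of learned weights.
    op_names: the num_ops candidate connection operations."""
    keep = np.argmax(alpha, axis=1)
    return [op_names[i] for i in keep]
```

The retained operations, together with the network units they connect, then constitute the target model.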
Based on the above structure, various embodiments of the vertical federated modeling optimization method are proposed.
Referring to FIG. 2, FIG. 2 is a schematic flowchart of the first embodiment of the vertical federated modeling optimization method of the present application. It should be noted that although a logical order is shown in the flowchart, in some cases the steps shown or described may be performed in a different order. The vertical federated modeling optimization method of the present application is applied to the label party participating in vertical federated learning; the label party is communicatively connected to each data party participating in the learning, and each data party is respectively deployed with a first data set and a first search network constructed based on its own data features. The data parties and the label party may be devices such as smart phones, personal computers and servers. In this embodiment, the vertical federated modeling optimization method includes:
Step S10, receiving the first network output sent by the data party, wherein the first network output is obtained by the data party inputting the first data set into the first search network;
In this embodiment, the participants in vertical federated learning fall into two categories: the label party, which owns the label data, and the data parties, which own feature data but no label data. In general there is one label party and one or more data parties. Each data party may deploy a data set and a search network constructed based on its own data features; if the label party also owns feature data, it may additionally act as a data party, deploying a data set and a search network built from its features and performing both the label party's tasks and a data party's tasks. To avoid unclear references, when the data party and the label party are described separately, the data party's data set and search network are called the first data set and first search network, and the label party's are called the second data set and second search network. The sample dimensions of the participants' data sets are aligned, that is, the sample IDs of the data sets are the same, but each participant's data features may differ. The participants may construct sample-aligned data sets in advance by encrypted sample alignment, which is not detailed here. A search network is a network used for neural architecture search (NAS); in this embodiment, each participant's search network may be a network designed in advance according to the DARTS (Differentiable Architecture Search) method.
The search network includes multiple units, each corresponding to a network layer, with connection operations set between some of the units. Taking two units as an example, the connection between them may be any of N preset connection operations, each of which is assigned a weight; these weights are the search structure parameters of the search network, while the network layer parameters inside the units are its model parameters. During model training, a network structure search is performed to optimize and update the search structure parameters and model parameters, and the final network structure, that is, which connection operation or operations to retain, is determined from the finally updated search structure parameters. Because the structure of the network is determined only after this search, the participants do not need to specify the model's network structure as they would when designing a traditional vertical federated learning model, which reduces the difficulty of designing the model.
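The weighted connection operations between two units can be illustrated with a DARTS-style mixed edge, in which the candidate operations are combined by a softmax over their search structure parameters. This is a sketch under the assumption that the operations are simple numpy callables; in practice they would be neural network layers.

```python
import numpy as np

def mixed_operation(x, alphas, ops):
    """Mixed edge between two units: the output is a softmax-weighted sum
    of all N candidate connection operations. The alphas are the search
    structure parameters that get updated during federated training."""
    w = np.exp(alphas - alphas.max())
    w = w / w.sum()
    return sum(wi * op(x) for wi, op in zip(w, ops))
```

As training shifts weight onto one operation, the mixed edge approaches the single retained operation chosen at discretization time.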
The combination of the participants' search networks constitutes a task model, and the network outputs of the search networks are fused to obtain the task model's final output. Further, the task model may also include an output network for fusing the network outputs of the search networks; the output network is connected after the participants' search networks, takes their output data as its input data, and its output serves as the final output of the task model. The output network may be deployed on the label party and may be a fully connected layer or another, more complex neural network structure, depending on the model's prediction task; the form of its output may likewise be set according to the specific prediction task. For example, when the prediction task is image classification, the output of the output network is the class to which the input image belongs.
The participants need to jointly optimize and update the task model, that is, to jointly update the model structure parameters and model parameters in their respective search networks, finally obtaining a task model that meets the prediction accuracy requirement. Specifically, during joint parameter updating, for a data party to update the parameters of its search network it needs the label data held by the label party to compute the loss function and gradients, while for the label party to compute the loss function and gradients it needs the data sets and search networks of the data parties. Because the data sets of the participants and the label data of the label party may all be private (for example, in joint modeling between banks the data often comes from users handling bank-related business), directly exchanging the data in the data sets, the model structures and the model parameters would leak data privacy between the participants. Therefore, in this embodiment, the data parties and the label party exchange only intermediate results used to update the model parameters and search structure parameters of their respective search networks, and each side updates its own search network based on the intermediate results it receives, thereby completing the update of the task model.
An intermediate result may be a parameter gradient or the output data of a search network. Specifically, when the participant is a data party, the intermediate result sent to the label party may be the output data of that party's search network; when the participant is the label party, the intermediate result sent to a data party may be the computed gradient corresponding to the output data that the data party sent.
The participants may perform multiple rounds of joint parameter updating. Within one round, a data party sends one network output to the label party, and the label party sends one gradient back to the data party. In a given round, the participants may update only the model structure parameters of their respective search networks, only the model parameters, or both at the same time, i.e., the model structure parameters and/or the model parameters. After multiple rounds of joint parameter updating, both the model structure parameters and the model parameters in each participant's search network have been updated many times. Which kind of parameter each participant updates in each round may be agreed uniformly in advance.
Specifically, in one round of joint parameter updating, each data party inputs its first data set into its first search network, obtains an output result through the processing of the first search network, and derives a first network output from that result. Each data party then sends its first network output to the label party. A data party may use the output result of the first search network (also called the raw network output) directly as the first network output, or may encrypt the output result with an encryption algorithm, for example homomorphic encryption or a differential privacy mechanism, and use the encrypted result as the first network output. The label party receives the first network outputs sent by the data parties.
It should be noted that a participant may use different data sets in different rounds of joint parameter updating. Specifically, a participant may divide its total data set into multiple small training sets (also called data batches) and use one batch per round, or may sample a batch of data with replacement from its total data set before each round and use that batch for the round's joint parameter update.
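The two batching strategies above can be sketched as follows. This is a minimal illustration in Python (NumPy), not part of the patented method itself; the assumption that all participants seed their generators identically, so that each round operates on the same aligned sample IDs, is ours.

```python
import numpy as np

def make_batches(num_samples, batch_size, rng):
    """Strategy 1: partition the aligned sample indices into data batches,
    one batch consumed per round of joint parameter updating."""
    idx = rng.permutation(num_samples)
    return [idx[i:i + batch_size] for i in range(0, num_samples, batch_size)]

def sample_with_replacement(num_samples, batch_size, rng):
    """Strategy 2: draw one batch with replacement before each round."""
    return rng.choice(num_samples, size=batch_size, replace=True)

# Assumption of this sketch: every participant uses the same seed so the
# sampled IDs agree across parties (the data sets are sample-aligned).
rng = np.random.default_rng(42)
batches = make_batches(10, 4, rng)
```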
Step S20, fusing the first network outputs to obtain a second network output, and computing, according to the second network output and the label data at the local end, a first gradient of the loss function with respect to each first network output;
The label party fuses the first network outputs to obtain the second network output. Specifically, the label party may average the first network outputs to obtain the second network output, or, when an output network is deployed at the label party, concatenate the first network outputs and feed the concatenation into the output network, whose processing yields the second network output. The concatenation may be vector concatenation. The label party then computes a loss function from the second network output and its label data; the loss function may be, for example, the mean squared error for a regression problem or the cross-entropy loss for a classification problem. It further computes the first gradient of the loss function with respect to each first network output. The gradients are computed following the chain rule and the gradient descent algorithm, which are not detailed here.
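As a concrete illustration of the averaging variant, the following sketch fuses the first network outputs by averaging, computes a cross-entropy loss against one-hot label data, and returns the first gradient of the loss with respect to each first network output. The shapes and the choice of cross-entropy are illustrative assumptions, not mandated by the method.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def fuse_and_grad(first_outputs, labels_onehot):
    """Label-party side: average the data parties' first network outputs into
    the second network output, compute a cross-entropy loss against the label
    data, and return dL/d(output_i) for each first network output."""
    n = len(first_outputs)
    fused = sum(first_outputs) / n                 # second network output (logits)
    probs = softmax(fused)
    batch = labels_onehot.shape[0]
    loss = -np.mean(np.sum(labels_onehot * np.log(probs + 1e-12), axis=1))
    grad_fused = (probs - labels_onehot) / batch   # dL/d(fused)
    # Chain rule for the averaging fusion: dL/d(output_i) = (1/n) * dL/d(fused).
    first_grads = [grad_fused / n for _ in range(n)]
    return loss, first_grads
```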
Step S30, performing differential privacy encryption on each first gradient to obtain first encrypted gradients, and sending each first encrypted gradient to the corresponding data party, so that the data party updates the search structure parameters and/or model parameters of its first search network according to the first encrypted gradient.
After computing the first gradient corresponding to each first network output, the label party applies differential privacy encryption to each first gradient to obtain the first encrypted gradients. Differential privacy is a technique from cryptography that aims to maximize the accuracy of queries against a statistical database while minimizing the chance of identifying the records in it; a common differential privacy method is to add randomized noise to the data. In this embodiment, any existing differential privacy mechanism may be used, and the details are not repeated here.
The label party sends each first encrypted gradient to the corresponding data party, that is, to the data party that sent the first network output from which that first encrypted gradient was computed. On receiving the first encrypted gradient, the data party updates the search structure parameters and/or model parameters of its first search network according to it. Specifically, following the chain rule and the gradient descent algorithm, the data party computes from the first encrypted gradient the gradient of the loss function with respect to the search structure parameters and/or model parameters of its search network, and updates those parameters accordingly. Three cases arise. First: compute from the first encrypted gradient the gradient of the loss function with respect to the search structure parameters and update the search structure parameters. Second: compute the gradient of the loss function with respect to the model parameters and update the model parameters. Third: do both, updating the search structure parameters and the model parameters. This completes one round of joint parameter updating.
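The data-party side of this chain-rule step can be sketched as follows, standing in a single linear layer out = X · W for the local search network; this simplification, and the learning rate, are illustrative assumptions only.

```python
import numpy as np

def local_update(X, W, received_grad, lr=0.1):
    """Data-party side: given the gradient G = dL/d(output) received from the
    label party, back-propagate through the local network and take one
    gradient-descent step. A single linear layer out = X @ W stands in for
    the first search network (an illustrative simplification)."""
    grad_W = X.T @ received_grad    # chain rule: dL/dW = X^T · dL/d(out)
    return W - lr * grad_W
```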
After multiple rounds of joint parameter updating, a participant can derive its target model from its updated search network. Specifically, the search structure parameters of a participant's search network may include weights corresponding to the connection operations between network units in the search network. That is, connection operations are defined between network units, each connection operation having an associated weight. It should be noted that connection operations are not defined between every pair of network units. A participant selects retained operations from the candidate connection operations according to the search structure parameters of its updated search network. Specifically, for each pair of network units between which connection operations exist, there are multiple candidate connection operations, and the one or more connection operations with the largest weights may be selected from them as the retained operations. Once the retained operations are determined, the model formed by the retained operations and the network units they connect is taken as the participant's target model. The participants may then use their respective target models to jointly perform the concrete model prediction task.
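The retained-operation selection above amounts to a per-edge top-k over the learned weights. A minimal sketch, with a hypothetical two-edge example:

```python
import numpy as np

def select_retained_ops(edge_weights, keep=1):
    """For each pair of connected units (an 'edge'), keep the `keep` candidate
    connection operations with the largest search structure weights."""
    retained = {}
    for edge, weights in edge_weights.items():
        order = np.argsort(weights)[::-1]          # indices by descending weight
        retained[edge] = sorted(order[:keep].tolist())
    return retained

# Hypothetical example: two edges, three candidate operations each.
arch = select_retained_ops({(0, 1): [0.1, 0.7, 0.2], (1, 2): [0.5, 0.3, 0.2]})
```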
In this embodiment, the label party receives the first network output sent by a data party, the first network output being obtained by the data party inputting the first data set into the first search network; the label party fuses the first network outputs to obtain the second network output, computes, from the second network output and its own label data, the first gradient of the loss function with respect to each first network output, applies differential privacy encryption to each first gradient to obtain the first encrypted gradients, and sends each first encrypted gradient to the corresponding data party, so that the data party updates the search structure parameters and/or model parameters of its first search network according to it. In this application, because the gradients the label party sends to the data parties have undergone differential privacy encryption, a data party cannot learn the original gradients and therefore cannot derive the label party's label data and feature data from them; leakage of the label party's private data to the data parties is avoided, and the data security of the label party during vertical federated modeling is improved. Moreover, compared with existing vertical federated learning, in which the participants must spend considerable manpower and resources designing the model structure in advance, this embodiment only requires each data party to set up its own search network: the connections between the network units of the search network, i.e., the model structure, are determined automatically during vertical federated modeling by optimizing and updating the search structure parameters. This realizes automatic vertical federated learning without costly manual design of the model structure, lowers the threshold for participating in vertical federated learning, allows vertical federated learning to be applied to a wider range of concrete task domains, and thus broadens its scope of application. Furthermore, during modeling, a data party sends the label party only the output of its search network, and the label party sends the data parties only differentially private gradients; the participants never directly exchange their data sets or the models themselves, which to a certain extent guarantees the data security and model information security of every participant.
Further, to further improve the data security of the data party, step S10 may include:
Step S101, receiving the first network output sent by the data party, where the first network output is obtained by the data party inputting the first data set into the first search network to obtain a raw network output, and applying differential privacy encryption to the raw network output.
The data party inputs its first data set into its first search network to obtain the raw network output, i.e., the result output directly by the first search network. The data party applies differential privacy encryption to the raw network output to obtain the first network output, and then sends the first network output to the label party. In other words, the first network outputs the label party receives from the data parties are the results of the data parties' differential privacy encryption, not the raw network outputs of the first search networks. The label party cannot recover the raw network outputs from the first network outputs, and therefore cannot derive the data parties' feature data from them; this further prevents the data parties' private data from leaking to the label party and improves the data security of the data parties.
Further, based on the first embodiment above, a second embodiment of the vertical federated modeling optimization method of the present application is proposed. In this embodiment, the label party deploys an output network as well as a second data set and a second search network constructed from the label party's data features, and the step of fusing the first network outputs to obtain the second network output in step S20 includes:
Step S201, inputting the second data set into the second search network to obtain a third network output;
In this embodiment, when the label party also holds feature data, it may deploy a second data set and a second search network constructed from its own data features, as well as an output network for fusing the network outputs of the search networks. During one round of joint parameter updating, the label party inputs the second data set into the second search network and obtains the third network output through the processing of the second search network.
Step S202, concatenating the third network output with the first network outputs and inputting the result into the output network to obtain the second network output;
When fusing the first network outputs, the label party concatenates the third network output with the first network outputs, that is, concatenates all of the network outputs, and feeds the concatenation into the output network, whose processing yields the second network output. Each network output can be regarded as a vector, and a common vector concatenation method may be used.
After step S20, the method further includes:
Step S40, computing, according to the second network output and the label data, a second gradient of the loss function with respect to target parameters of the second search network, and updating the target parameters according to the second gradient, where the target parameters are the search structure parameters and/or model parameters of the second search network.
After computing the second network output and obtaining the loss function from the second network output and the label data, the label party may also compute the second gradient of the loss function with respect to target parameters of the second search network, and update the target parameters according to the second gradient. The target parameters may be the search structure parameters and/or the model parameters of the second search network. Three cases arise. First: compute the gradient of the loss function with respect to the search structure parameters and update the search structure parameters. Second: compute the gradient of the loss function with respect to the model parameters and update the model parameters. Third: do both, updating the search structure parameters and the model parameters according to their respective gradients.
Further, based on the first and/or second embodiments above, a third embodiment of the vertical federated modeling optimization method of the present application is proposed. In this embodiment, the step of performing differential privacy encryption on each first gradient in step S30 to obtain the first encrypted gradients includes:
Step S301, clipping the first gradient to obtain a first clipped gradient, where the second-order (L2) norm of the first clipped gradient is less than or equal to a first preset threshold;
Further, in this embodiment, the differential privacy encryption may comprise two steps: clipping and adding Gaussian noise. Specifically, the label party clips the first gradient to obtain a first clipped gradient whose second-order (L2) norm is less than or equal to a first preset threshold, the first preset threshold being a threshold preset by the label party. Clipping the first gradient so that its second-order norm does not exceed the first preset threshold ensures that even when the first gradients computed across rounds of joint parameter updating vary widely, the variation of the first clipped gradients is confined to a bounded range, so that the data party cannot infer the label party's original data from them. The label party may adopt any clipping method that achieves this purpose.
Further, one clipping method is as follows: for each first gradient, the label party computes the ratio of the second-order norm of that first gradient to the first preset threshold, takes the larger of this ratio and 1, and divides the first gradient by that larger value to obtain the first clipped gradient. The second-order norm of a first clipped gradient computed in this way is less than or equal to the first preset threshold. Under this method, if i denotes the index of a data party and G_i the first gradient corresponding to data party i, the first clipped gradient is G_i' = G_i / max(1, ||G_i||_2 / D_A), where D_A is the first preset threshold set by the label party. It should be noted that, according to the privacy level the label party assigns to each data party, the label party may set different first preset thresholds for different data parties: a higher privacy level warrants a smaller first preset threshold, and a lower privacy level a larger one.
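The clipping rule G_i' = G_i / max(1, ||G_i||_2 / D_A) translates directly into code. A minimal sketch in Python (NumPy), treating the gradient as an array:

```python
import numpy as np

def clip_gradient(G, D_A):
    """Clip a first gradient G so that its second-order (L2) norm does not
    exceed the first preset threshold D_A:
        G' = G / max(1, ||G||_2 / D_A)
    Gradients already within the threshold pass through unchanged."""
    return G / max(1.0, np.linalg.norm(G) / D_A)
```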
Step S302, generating a noise array obeying a target Gaussian distribution, where the mean of the target Gaussian distribution is 0, its variance is a second preset threshold, and the elements of the noise array correspond one-to-one to the elements of the first gradient;
Step S303, adding Gaussian noise to the first gradient using the noise array, to obtain the first encrypted gradient.
The label party generates a noise array obeying the target Gaussian distribution, whose mean is 0 and whose variance is the second preset threshold, with the elements of the noise array corresponding one-to-one to the elements of the first gradient. The second preset threshold may be set according to specific needs; for example, it may be the square of the first preset threshold multiplied by the square of a coefficient. Since the first gradient is in matrix form, the generated noise array is also in matrix form, with the same matrix size as the first gradient.
The label party adds Gaussian noise to each first gradient using a noise array, obtaining the first encrypted gradients. Specifically, for each first gradient, the label party adds the noise array to the first gradient, i.e., each element of the first gradient is added to the element at the corresponding position of the noise array. Since the first encrypted gradient is the result of clipping followed by noise addition, the data party cannot recover the original first gradient from it, and therefore cannot derive the label party's original data, which improves the label party's data privacy.
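The noise-addition step can be sketched as follows; reading the second preset threshold as the variance of the target Gaussian distribution is our interpretation of the description above.

```python
import numpy as np

def add_gaussian_noise(G_clipped, sigma2, rng):
    """Add element-wise Gaussian noise N(0, sigma2) to a clipped first
    gradient. sigma2 is the second preset threshold, read here as the variance
    of the target Gaussian distribution; the noise array has the same shape as
    the gradient, one noise element per gradient element."""
    noise = rng.normal(loc=0.0, scale=np.sqrt(sigma2), size=G_clipped.shape)
    return G_clipped + noise
```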
Further, the method also includes:
Step S50, obtaining the privacy level and modeling progress of the current vertical federated modeling;
Step S60, setting the second preset threshold according to the privacy level and the modeling progress.
The label party may set the second preset threshold during the vertical federated modeling process, i.e., it may use a different second preset threshold in each round of joint parameter updating. Specifically, the label party obtains the privacy level of the current vertical federated modeling and the current modeling progress. Baseline thresholds corresponding to the different privacy levels, and threshold adjustments corresponding to the different stages of modeling progress, may be set in advance; the threshold adjustment may be negative or positive, and the modeling progress may be the convergence speed of the loss function, or the number or the duration of rounds of joint parameter updating.
After obtaining the privacy level of the current vertical federated modeling, the label party determines the corresponding baseline threshold according to the privacy level, determines the threshold adjustment according to the current modeling progress, and adds the adjustment to the baseline threshold to obtain the second preset threshold. The correspondence between privacy level and baseline threshold may be such that the higher the level, the larger the baseline threshold, and the lower the level, the smaller the baseline threshold, so that more noise is added at higher privacy levels and less at lower ones. The noise magnitude is thus set flexibly according to the privacy level, avoiding excessive noise that would distort the data and harm the model's prediction accuracy.
When the modeling progress is the convergence speed, the relationship between convergence speed and threshold adjustment may be that the faster the convergence, the larger the adjustment, and the slower the convergence, the smaller the adjustment, so that when convergence is slow and difficult, a smaller (possibly negative) adjustment reduces the second preset threshold and hence the noise, promoting convergence of the loss function and preserving the model's prediction accuracy. When the modeling progress is the number of rounds of joint parameter updating, the relationship between round count and threshold adjustment may be that the larger the round count, the smaller the adjustment, and the smaller the round count, the larger the adjustment, so that as the rounds accumulate and the loss function approaches convergence, the second preset threshold becomes smaller and smaller and the noise gradually decreases, promoting convergence of the loss function and preserving the model's prediction accuracy. When the modeling progress is the elapsed duration of joint parameter updating, the relationship between duration and threshold adjustment may likewise be that the longer the duration, the smaller the adjustment, and the shorter the duration, the larger the adjustment, so that as training runs longer and the loss function approaches convergence, the second preset threshold becomes smaller and smaller and the noise gradually decreases, promoting convergence of the loss function and preserving the model's prediction accuracy.
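One possible schedule implementing this baseline-plus-adjustment rule is sketched below, using the round count as the modeling progress; all concrete numbers (baselines, decay rate, floor) are hypothetical.

```python
def second_threshold(privacy_level, round_idx, decay=0.1, floor=0.1):
    """Illustrative schedule for the second preset threshold: a baseline chosen
    by privacy level (higher level -> larger baseline, i.e., more noise) plus a
    round-dependent adjustment that shrinks the threshold as rounds accumulate,
    so that noise decreases as the loss function approaches convergence."""
    baselines = {"low": 1.0, "medium": 2.0, "high": 4.0}
    adjustment = -decay * round_idx    # becomes smaller (more negative) each round
    return max(floor, baselines[privacy_level] + adjustment)
```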
Further, as shown in Fig. 3, an automatic vertical federated learning framework with differentially private communication is provided. A denotes the label party, B the data parties, i the index of a data party, and N the number of data parties. A holds feature data X_A and the corresponding label data Y_A, while B_1, ..., B_N hold feature data X_1, ..., X_N respectively. The feature data X_A, X_1, ..., X_N have differently distributed data features. Each participant has a search network, namely Net_A, Net_1, ..., Net_N, whose model parameters and search structure parameters are W_A, W_1, ..., W_N and α_A, α_1, ..., α_N respectively. A additionally deploys an output network Net_out for computing Y_out. In the lower-right corner of the figure, clip(x) denotes clipping x, and +N(0, σ²) denotes adding Gaussian noise to the clipped result.
Further, based on the first, second and/or third embodiments above, a fourth embodiment of the vertical federated modeling optimization method of the present application is proposed. In this embodiment, the method is applied to the data parties participating in vertical federated modeling, each data party deploying a first data set and a first search network constructed from its own data features, and the method includes the following steps:
Step A10, inputting the first data set into the first search network to obtain a raw network output;
In this embodiment, the participants in vertical federated learning fall into two categories: the label party, which holds label data, and the data parties, which hold feature data but no label data. In general, there is one label party and one or more data parties. Each data party deploys a data set and a search network constructed from its own data features; if the label party also holds feature data, it may additionally act as a data party, deploying a data set and a search network constructed from its data features and performing both the label party's and a data party's tasks. To avoid ambiguity in what follows, when the data party and the label party are described separately, the data party's data set and search network are called the first data set and the first search network, and the label party's data set and search network are called the second data set and the second search network. The sample dimensions of the participants' data sets are aligned, i.e., the sample IDs of the data sets are the same, while the data features of the participants may differ. The participants may construct such sample-aligned data sets in advance using encrypted sample alignment, which is not detailed here. A search network is a network used for neural architecture search (NAS); in this embodiment, each participant's search network may be designed in advance according to the DARTS (Differentiable Architecture Search) method.
The search network includes multiple units, each corresponding to one network layer, with connection operations set between some pairs of units. Taking two such units as an example, the connections between them may be N preset candidate connection operations, each with an associated weight; these weights are the search structure parameters of the search network, while the network-layer parameters inside the units are its model parameters. During model training, a network architecture search is performed to optimize and update both the search structure parameters and the model parameters; the final network structure, that is, which connection operation or operations to retain, is determined from the finally updated search structure parameters. Because the network structure is determined only after this search, the participants do not need to hand-design a model structure as in conventional vertical federated learning, which lowers the difficulty of model design.
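For illustration, the mixed connection operation described above (a weighted combination of N candidate operations between two network units, with the weights serving as search structure parameters) can be sketched in Python; the candidate operations and function names here are hypothetical placeholders, not part of the patent:

```python
import numpy as np

def softmax(x):
    """Normalize search structure parameters into operation weights."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

# Hypothetical candidate connection operations between two network units.
CANDIDATE_OPS = {
    "identity": lambda x: x,
    "double":   lambda x: 2.0 * x,
    "zero":     lambda x: np.zeros_like(x),
}

def mixed_op(x, alphas):
    """DARTS-style mixed operation: the edge output is the weighted sum of
    all candidate operations, weighted by softmax(alphas)."""
    w = softmax(alphas)
    return sum(wi * op(x) for wi, op in zip(w, CANDIDATE_OPS.values()))
```

With equal search structure parameters, each candidate operation contributes equally; as training updates the parameters, the strongest operation dominates and is ultimately retained.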
The participants' search networks together constitute a task model: the network outputs of the individual search networks are fused to produce the task model's final output. Further, the task model may include an output network for fusing the outputs of the search networks; this output network is connected after the participants' search networks, takes their output data as input, and its result serves as the final output of the task model. The output network may be deployed at the label party and may be a fully connected layer or another, more complex neural network structure, depending on the model prediction task; the form of its output may likewise be set according to the task. For example, when the prediction task is image classification, the output network's result is the category to which the input image belongs.
The participants must jointly optimize and update the task model, that is, jointly update the model structure parameters and model parameters in their respective search networks, ultimately obtaining a task model that meets the required prediction accuracy. Specifically, during joint parameter updating, a data party needs the label party's label data to compute the loss function and gradients for its own search network, while the label party needs the data parties' data sets and search networks to compute the loss function and gradients. Because both the data sets of the participants and the label data of the label party may be private data (for example, in joint modeling between banks the data typically comes from customers handling banking business), directly exchanging the data sets, model structures, and model parameters would leak data privacy between the participants. Therefore, in this embodiment, the data parties and the label party exchange only intermediate results used to update the model parameters and search structure parameters of their respective search networks, and each side updates its own search network based on the intermediate results it receives, thereby completing the update of the task model. An intermediate result may be the gradient of a parameter or the output data of a search network. Specifically, when a participant is a data party, the intermediate result it sends to the label party may be the output data of its search network; when the participant is the label party, the intermediate result it sends to a data party may be the computed gradient corresponding to the output data that the data party sent.
The participants may carry out multiple rounds of joint parameter updating. In one round, a data party sends a network output to the label party once and the label party sends a gradient back once; in each round the participants may update only the model structure parameters of their search networks, only the model parameters, or both simultaneously, that is, the model structure parameters and/or model parameters. After multiple rounds, both the model structure parameters and the model parameters in every participant's search network have been updated many times. Which parameters each participant updates in each round may be agreed uniformly in advance.
Specifically, in one round of joint parameter updating, each data party inputs its first data set into its first search network and obtains an output result through the processing of that network; this output result is the original network output.
It should be noted that a participant may use different data sets in different rounds of joint parameter updating. Specifically, a participant may divide its full data set into multiple small training sets (also called data batches) and use one of them in each round; alternatively, before each round, it may sample a batch of data with replacement from the full data set to participate in that round's joint parameter update.
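The two batching strategies described above can be sketched as follows; the function names are illustrative, not from the patent:

```python
import numpy as np

rng = np.random.default_rng(0)

def split_into_batches(dataset, batch_size):
    """Strategy 1: partition the full data set into fixed-size batches,
    using one batch per round of joint parameter updating."""
    return [dataset[i:i + batch_size] for i in range(0, len(dataset), batch_size)]

def sample_with_replacement(dataset, batch_size):
    """Strategy 2: draw a fresh batch with replacement before each round."""
    idx = rng.integers(0, len(dataset), size=batch_size)
    return dataset[idx]
```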
Step A20: performing differential privacy encryption processing on the original network output to obtain a first network output.
Step A30: sending the first network output to the label party participating in the vertical federated modeling, so that after the label party fuses the first network outputs received from the data parties to obtain a second network output, it computes the first gradient of the loss function with respect to each first network output according to the second network output and its label data, and returns each first gradient to the corresponding data party.
The data party applies differential privacy encryption processing to the original network output to obtain the first network output, and then sends the first network output to the label party. The differential privacy encryption in this embodiment may use an existing differential privacy mechanism. In other words, the first network output the label party receives from each data party is the result of the data party's differential privacy processing, not the original output of the first search network, and the label party cannot recover the original output from the first network output. This prevents the label party from deriving the data party's feature data from the original network output, further prevents the data party's private data from leaking to the label party, and improves the data party's data security.
Further, in one implementation, the manner in which the data party applies differential privacy processing to the first network output may follow the manner in which the label party applies differential privacy processing to the first gradient in the third embodiment above. Under this method, if i denotes the index of a data party and O_i denotes that data party's first network output, then clipping the first network output yields O_i' = O_i / max(1, ||O_i||_2 / C_i), where C_i is the clipping threshold set by data party i.
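A direct transcription of the clipping formula above, assuming the network output is a numeric vector (illustrative sketch):

```python
import numpy as np

def clip_output(O_i, C_i):
    """Scale the raw network output so its L2 norm is at most C_i:
    O_i' = O_i / max(1, ||O_i||_2 / C_i)."""
    return O_i / max(1.0, np.linalg.norm(O_i) / C_i)
```

An output whose norm is already within the threshold passes through unchanged; a larger output is scaled down to norm C_i exactly.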
The label party receives the first network outputs sent by the data parties and fuses them to obtain the second network output. Specifically, the label party may average the first network outputs, or, when an output network is deployed at the label party, concatenate the first network outputs (for example, by vector concatenation) and feed the result into the output network, whose processing yields the second network output. The label party then computes a loss function from the second network output and its label data; this loss may be the mean squared error for a regression problem, the cross-entropy loss for a classification problem, and so on. It then computes the first gradient of the loss function with respect to each first network output. Computing gradients from the loss function may follow the chain rule and the gradient descent algorithm, which are not detailed here.
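As a simplified illustration of the label party's fusion and first-gradient computation, the following sketch uses averaging fusion and a mean-squared-error loss (one of the loss choices mentioned above); all names are illustrative, not from the patent:

```python
import numpy as np

def fuse_and_grad(first_outputs, labels):
    """Fuse the data parties' first network outputs by averaging, compute
    the MSE loss against the label data, and return the gradient of the
    loss with respect to each party's first network output."""
    second = np.mean(first_outputs, axis=0)   # second network output
    residual = second - labels
    loss = np.mean(residual ** 2)
    # Chain rule: dL/d(second) = (2/n) * residual; d(second)/d(O_i) = 1/k.
    n, k = residual.size, len(first_outputs)
    grad_each = (2.0 / n) * residual / k
    return loss, [grad_each for _ in first_outputs]
```

Each returned gradient would then be sent back (after any differential privacy processing) to the data party that produced the corresponding first network output.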
After computing the first gradient corresponding to each first network output, the label party sends each first gradient to the corresponding data party, that is, each first gradient is returned to the data party that sent the first network output from which it was computed.
Step A40: updating the search structure parameters and/or model parameters in the first search network according to the first gradient received from the label party.
After receiving the first gradient, the data party updates the search structure parameters and/or model parameters of its first search network accordingly. Specifically, following the chain rule and the gradient descent algorithm, the data party computes from the first gradient the gradient of the loss function with respect to the search structure parameters and/or model parameters of its search network, and updates them correspondingly. There are thus three cases: first, computing the gradient of the loss function with respect to the search structure parameters from the first gradient and updating the search structure parameters; second, computing the gradient of the loss function with respect to the model parameters from the first gradient and updating the model parameters; third, doing both. This completes one round of joint parameter updating.
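For a concrete, simplified illustration of this chain-rule step, assume the data party's search network is a single linear layer O = XW; the gradient of the loss with respect to W then follows directly from the first gradient dL/dO received from the label party (this one-layer form is an assumption for illustration only):

```python
import numpy as np

def local_update(X, W, grad_from_label_side, lr=0.1):
    """Chain-rule update for a one-layer sketch of the search network.
    Forward: O = X @ W.  Given dL/dO from the label party,
    dL/dW = X.T @ (dL/dO), and W is updated by gradient descent."""
    grad_W = X.T @ grad_from_label_side
    return W - lr * grad_W
```

In the actual method, the same backward pass is applied through all layers, and the search structure parameters are updated in the same fashion.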
After multiple rounds of joint parameter updating, each participant can derive its target model from its updated search network. The participants can then use their respective target models to jointly perform a specific model prediction task.
In this embodiment, the data party inputs the first data set into the first search network to obtain the original network output, applies differential privacy encryption processing to the original network output to obtain the first network output, and sends the first network output to the label party; after fusing the first network outputs received from the data parties into a second network output, the label party computes the first gradient of the loss function with respect to each first network output from the second network output and its label data, and returns each first gradient to the corresponding data party; the data party updates the search structure parameters and/or model parameters of the first search network according to the first gradient received from the label party. Because the gradient the label party sends to the data party has undergone differential privacy encryption processing, the data party cannot learn the original gradient, which prevents the data party from deriving the label party's label data and feature data from the gradient, further prevents the label party's private data from leaking to the data party, and improves the label party's data security. Moreover, compared with existing vertical federated learning, in which participants must spend considerable manpower and material resources designing model structures in advance, this embodiment only requires each data party to set up its own search network: the connections between the network units of the search network, that is, the model structure, are determined automatically during vertical federated modeling by optimizing and updating the search structure parameters. This realizes automated vertical federated learning, removes the need to spend large amounts of manpower and material resources pre-designing model structures, lowers the barrier to participating in vertical federated learning, and allows vertical federated learning to be applied to a wider range of specific task fields, broadening its scope of application. Furthermore, during modeling the data party sends the label party only the search network's output, and the label party sends the data party only the differential-privacy-processed gradient; the participants never directly exchange their data sets or the models themselves, which to a certain extent protects each participant's data security and model information security.
Further, the search structure parameters of the data party's first search network include the weights corresponding to the connection operations between network units of the first search network. After step A40, the method further includes:
Step A50: selecting retained operations from the connection operations according to the search structure parameters of the first search network after parameter updating;
Step A60: taking the model formed by the retained operations and the network units they connect as the target model.
The search structure parameters of the data party's search network may include the weights corresponding to the connection operations between network units; that is, connection operations are set between network units, and each connection operation has an associated weight. It should be noted that connection operations are not set between every pair of network units. After multiple rounds of joint parameter updating, the data party can select retained operations from the connection operations according to the updated search structure parameters of its search network. Specifically, for each pair of network units joined by connection operations, there may be several candidate connection operations between them, and the one or more with the largest weights can be selected as the retained operations. Once the retained operations are determined, the model formed by the retained operations and the network units they connect serves as the participant's target model.
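The retained-operation selection can be sketched as follows, assuming each edge between two connected network units carries a vector of weights, one per candidate connection operation (edge and operation names are illustrative):

```python
import numpy as np

def select_retained_ops(alphas_per_edge, op_names, keep=1):
    """For each edge (pair of connected network units), keep the `keep`
    candidate connection operations with the largest search structure
    weights; the result defines the final (target) model architecture."""
    retained = {}
    for edge, alphas in alphas_per_edge.items():
        order = np.argsort(alphas)[::-1][:keep]   # indices of largest weights
        retained[edge] = [op_names[i] for i in order]
    return retained
```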
Further, in one implementation, the participants may be devices deployed at banks or other financial institutions, storing user data recorded by each institution in the course of business processing. Because different institutions handle different specific businesses, the features of their user data may differ; each institution can construct a data set based on its own data features, and the institutions can use these data sets to jointly conduct vertical federated learning, improving the model's prediction performance by enriching its feature space. Specifically, the participants may jointly build a user risk prediction model for predicting a user's level of risk in business scenarios such as credit and insurance. Each participant's data features may be risk features selected from practical experience as relevant to user risk prediction, for example, a user's deposit amount, number of defaults, and so on.
The participants use their respective data sets to jointly perform vertical federated modeling in the manner of the above embodiments and obtain their respective target models.
After obtaining their target models, the participants can jointly perform risk prediction for users. A data party inputs the user data corresponding to the target user's second risk features at its end into its local target model and, through the model's processing, obtains a first model output, which it sends to the label party. The label party receives the first model outputs sent by the data parties.
The label party inputs the user data corresponding to the target user's first risk features at its end into its local target model and, through the model's processing, obtains a second model output. The label party then concatenates the first model outputs with the second model output and feeds the result into its local output network, which, after processing, outputs the target user's risk prediction result.
Further, when the target user's risk prediction task is initiated by a data party, the label party may send the target user's risk prediction result to that data party for subsequent business processing, for example, deciding whether to grant the target user a loan based on the risk prediction result.
In this embodiment, each participant only needs to set up its own search network rather than spending considerable manpower and material resources carefully designing a model structure, which lowers the barrier to participating in vertical federated learning and makes it easier for banks and other financial institutions to perform joint modeling through vertical federated learning and then complete risk prediction tasks with the resulting risk prediction model. Moreover, during vertical federated modeling and the subsequent use of the models for risk prediction, the participants need not directly exchange their data sets or the models themselves, protecting the privacy of each participant's user data. Furthermore, the data party applies differential privacy encryption processing to its search network's output before sending it to the label party, preventing the label party from deriving the data party's original user data from the network output and further improving the data party's data security. Likewise, the label party applies differential privacy encryption processing to the gradient corresponding to the network output before sending it to the data party, preventing the data party from deriving the label party's original user data from that gradient and further improving the label party's data security.
In addition, an embodiment of the present application further provides a vertical federated modeling optimization apparatus. The apparatus is deployed at the label party participating in vertical federated modeling; the label party is communicatively connected to the data parties participating in vertical federated modeling, and each data party is deployed with a first data set and a first search network constructed from its own data features. The apparatus includes:
a receiving module, configured to receive a first network output sent by a data party, where the first network output is obtained by the data party inputting the first data set into the first search network;
a computing module, configured to fuse the first network outputs to obtain a second network output, and to compute, according to the second network output and the local label data, the first gradient of the loss function with respect to each first network output;
a first differential privacy processing module, configured to apply differential privacy encryption processing to each first gradient to obtain a first encrypted gradient, and to send each first encrypted gradient to the corresponding data party, so that the data party updates the search structure parameters and/or model parameters of its first search network according to the first encrypted gradient.
Further, the receiving module is also configured to:
receive the first network output sent by the data party, where the first network output is obtained by the data party inputting the first data set into the first search network to produce an original network output and applying differential privacy encryption processing to the original network output.
Further, the label party is deployed with an output network and with a second data set and a second search network constructed from the label party's data features, and the computing module includes:
an input unit, configured to input the second data set into the second search network to obtain a third network output;
a concatenation unit, configured to concatenate the third network output with the first network outputs and input the result into the output network to obtain the second network output.
The apparatus further includes:
a first update module, configured to compute, according to the second network output and the label data, a second gradient of the loss function with respect to a target parameter of the second search network, and to update the target parameter according to the second gradient, where the target parameter is a search structure parameter and/or a model parameter of the second search network.
Further, the differential privacy encryption processing includes clipping processing and Gaussian-noise addition processing, and the first differential privacy processing module includes:
a clipping processing unit, configured to clip the first gradient to obtain a first clipped gradient, where the L2 norm of the first clipped gradient is less than or equal to a first preset threshold;
a generating unit, configured to generate a noise array obeying a target Gaussian distribution, where the mean of the target Gaussian distribution is 0 and its standard deviation is a second preset threshold, and the elements of the noise array correspond one-to-one to the elements of the first gradient;
a noise addition unit, configured to add Gaussian noise to the first gradient using the noise array to obtain the first encrypted gradient.
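Combining the clipping unit and the noise addition unit above, a minimal sketch of the differential privacy encryption of the first gradient, assuming the first preset threshold is the clipping bound and the second preset threshold is the noise standard deviation:

```python
import numpy as np

rng = np.random.default_rng(42)

def dp_encrypt_gradient(grad, clip_threshold, noise_std):
    """Differential-privacy-style processing of the first gradient:
    (1) clip so the L2 norm is at most `clip_threshold` (first preset
        threshold);
    (2) add element-wise Gaussian noise with mean 0 and standard
        deviation `noise_std` (second preset threshold)."""
    clipped = grad / max(1.0, np.linalg.norm(grad) / clip_threshold)
    noise = rng.normal(loc=0.0, scale=noise_std, size=grad.shape)
    return clipped + noise
```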
Further, the apparatus also includes:
an acquisition module, configured to acquire the privacy level and modeling progress of the current vertical federated modeling;
a setting module, configured to set the second preset threshold according to the privacy level and the modeling progress.
In addition, an embodiment of the present application further provides a vertical federated modeling optimization apparatus. The apparatus is deployed at a data party participating in vertical federated modeling, and each data party is deployed with a first data set and a first search network constructed from its own data features. The apparatus includes:
an input module, configured to input the first data set into the first search network to obtain an original network output;
a second differential privacy processing module, configured to apply differential privacy encryption processing to the original network output to obtain a first network output;
a sending module, configured to send the first network output to the label party participating in vertical federated modeling, so that after the label party fuses the first network outputs received from the data parties to obtain a second network output, it computes the first gradient of the loss function with respect to each first network output according to the second network output and its label data, and returns each first gradient to the corresponding data party;
a second update module, configured to update the search structure parameters and/or model parameters of the first search network according to the first gradient received from the label party.
Further, the search structure parameters of the data party's first search network include weights corresponding to the connection operations between network units of the first search network, and the apparatus further includes:
a selection module, configured to select retained operations from the connection operations according to the search structure parameters of the first search network after parameter updating;
a determination module, configured to take the model formed by the retained operations and the network units they connect as the target model.
The extended details of specific implementations of the vertical federated modeling optimization apparatus of the present application are substantially the same as those of the embodiments of the vertical federated modeling optimization method described above and are not repeated here.
In addition, an embodiment of the present application further provides a computer-readable storage medium on which a vertical federated modeling optimization program is stored; when the vertical federated modeling optimization program is executed by a processor, the steps of the vertical federated modeling optimization method described above are implemented.
本申请纵向联邦建模优化设备和计算机可读存储介质的各实施例,均可参照本申请纵向联邦建模优化方法各实施例,此处不再赘述。For the embodiments of the vertical federation modeling and optimization device and the computer-readable storage medium of the present application, reference may be made to the embodiments of the vertical federated modeling and optimization method of the present application, which will not be repeated here.
上述本申请实施例序号仅仅为了描述,不代表实施例的优劣。The above-mentioned serial numbers of the embodiments of the present application are only for description, and do not represent the advantages or disadvantages of the embodiments.
通过以上的实施方式的描述,本领域的技术人员可以清楚地了解到上述实施例方法可借助软件加必需的通用硬件平台的方式来实现,当然也可以通过硬件,但很多情况下前者是更佳的实施方式。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质(如ROM/RAM、磁碟、光盘)中,包括若干指令用以使得一台终端设备(可以是手机,计算机,服务器,空调器,或者网络设备等)执行本申请各实施例所述的方法。From the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus a necessary general hardware platform, and of course hardware can also be used, but in many cases the former is better implementation. Based on this understanding, the technical solution of the present application can be embodied in the form of a software product in essence or the part that contributes to the prior art, and the computer software product is stored in a storage medium (such as ROM/RAM, magnetic disk, CD-ROM), including several instructions to enable a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, or a network device, etc.) to execute the methods described in the embodiments of this application.
以上仅为本申请的优选实施例,并非因此限制本申请的专利范围,凡是利用本申请说明书及附图内容所作的等效结构或等效流程变换,或直接或间接运用在其他相关的技术领域,均同理包括在本申请的专利保护范围内。The above are only the preferred embodiments of the present application, and are not intended to limit the scope of the patent of the present application. Any equivalent structure or equivalent process transformation made by using the contents of the description and drawings of the present application, or directly or indirectly applied in other related technical fields , are similarly included within the scope of patent protection of this application.
Claims (20)
- A vertical federated modeling optimization method, wherein the method is applied to a label party participating in vertical federated modeling, the label party is communicatively connected to each data party participating in the vertical federated modeling, and each data party deploys a first data set and a first search network constructed based on its own data features, the method comprising the following steps:
receiving the first network output sent by the data party, wherein the first network output is obtained by the data party inputting the first data set into the first search network;
fusing the first network outputs to obtain a second network output, and computing the first gradient of the loss function with respect to each first network output according to the second network output and the local label data;
performing differential privacy encryption on each first gradient to obtain each first encrypted gradient, and sending each first encrypted gradient to the corresponding data party, so that the data party updates the search structure parameters and/or model parameters in the first search network according to the first encrypted gradient.
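The label-party steps of claim 1 can be sketched as follows. This is only an illustrative sketch, not the claimed implementation: the summation fusion rule, the squared-error loss, and all function names are assumptions introduced for the example, and `dp_encrypt` stands in for the differential-privacy step detailed in claim 5.

```python
import numpy as np

def label_party_step(first_outputs, labels, dp_encrypt):
    """One label-party round: fuse outputs, compute per-party gradients, encrypt.

    first_outputs: list of arrays, one per data party (shape: [batch, dim]).
    labels: label array of shape [batch, dim].
    dp_encrypt: callable standing in for the differential-privacy step.
    """
    # Fuse the first network outputs into the second network output
    # (simple summation is assumed here purely for illustration).
    second_output = np.sum(first_outputs, axis=0)

    # Assumed squared-error loss L = 0.5 * ||second_output - labels||^2.
    # With summation fusion, dL/d(first_output_i) = second_output - labels
    # for every party i.
    residual = second_output - labels
    first_gradients = [residual.copy() for _ in first_outputs]

    # Differential-privacy encryption before returning the gradients.
    return [dp_encrypt(g) for g in first_gradients]
```

With an identity `dp_encrypt`, two parties each sending all-ones outputs against zero labels receive identical gradients equal to the fused residual.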
- The vertical federated modeling optimization method according to claim 1, wherein the step of receiving the first network output sent by the data party, the first network output being obtained by the data party inputting the first data set into the first search network, comprises:
receiving the first network output sent by the data party, wherein the first network output is obtained by the data party inputting the first data set into the first search network for processing to obtain a raw network output, and then performing differential privacy encryption on the raw network output.
- The vertical federated modeling optimization method according to claim 1, wherein the label party deploys an output network as well as a second data set and a second search network constructed based on the data features of the label party.
- The vertical federated modeling optimization method according to claim 3, wherein the step of fusing the first network outputs to obtain the second network output comprises:
inputting the second data set into the second search network to obtain a third network output;
concatenating the third network output with the first network outputs and inputting the result into the output network to obtain the second network output;
and wherein, after the step of computing the first gradient of the loss function with respect to each first network output according to the second network output and the local label data, the method further comprises:
computing a second gradient of the loss function with respect to a target parameter in the second search network according to the second network output and the label data, and updating the target parameter according to the second gradient, wherein the target parameter is a search structure parameter and/or a model parameter in the second search network.
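The concatenation-based fusion of claim 4 can be illustrated with a single linear layer playing the role of the output network. The weight matrix `W` and bias `b` are assumed stand-ins; the claim does not specify the output network's architecture.

```python
import numpy as np

def fuse_with_output_network(third_output, first_outputs, W, b):
    """Concatenate the label party's third network output with each data
    party's first network output along the feature axis, then pass the
    result through an assumed linear output network to obtain the second
    network output."""
    z = np.concatenate([third_output] + list(first_outputs), axis=1)
    return z @ W + b
```

The concatenated width is the sum of the parties' output widths, so `W` must have that many rows.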
- The vertical federated modeling optimization method according to any one of claims 1 to 4, wherein the differential privacy encryption comprises clipping and adding Gaussian noise, and the step of performing differential privacy encryption on each first gradient to obtain each first encrypted gradient comprises:
clipping the first gradient to obtain a first clipped gradient, wherein the L2 norm of the first clipped gradient is less than or equal to a first preset threshold;
generating a noise array obeying a target Gaussian distribution, wherein the mean of the target Gaussian distribution is 0, its standard deviation is a second preset threshold, and the elements of the noise array correspond one-to-one to the elements of the first gradient;
adding Gaussian noise to the first gradient using the noise array to obtain the first encrypted gradient.
- The vertical federated modeling optimization method according to claim 5, wherein before the step of generating the noise array obeying the target Gaussian distribution, the method further comprises:
acquiring the privacy level and modeling progress of the current vertical federated modeling;
setting the second preset threshold according to the privacy level and the modeling progress.
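Claim 6 only states that the noise scale depends on the privacy level and the modeling progress; the particular mapping below is an invented heuristic for illustration (more noise for stricter privacy, decaying as training progresses), not the claimed rule.

```python
def set_noise_std(privacy_level, progress, base=1.0):
    """Assumed heuristic: the second preset threshold (noise standard
    deviation) grows with the privacy level and decays with modeling
    progress. `privacy_level` is a positive number, `progress` >= 0."""
    return base * privacy_level / (1.0 + progress)
```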
- The vertical federated modeling optimization method according to claim 1, wherein if the label party also owns feature data, the label party also acts as a data party, deploys a data set and a search network constructed based on its data features, and performs both the tasks of the label party and the tasks of a data party.
- The vertical federated modeling optimization method according to claim 7, wherein the data set and search network of a data party are the first data set and the first search network, and the data set and search network of the label party are the second data set and the second search network.
- A vertical federated modeling optimization method, wherein the method is applied to the data parties participating in vertical federated modeling, each data party deploying a first data set and a first search network constructed based on its own data features, the method comprising the following steps:
inputting the first data set into the first search network to obtain a raw network output;
performing differential privacy encryption on the raw network output to obtain a first network output;
sending the first network output to the label party participating in the vertical federated modeling, so that the label party fuses the first network outputs received from the data parties to obtain a second network output, computes the first gradient of the loss function with respect to each first network output according to the second network output and the label party's label data, and returns each first gradient to the corresponding data party;
updating the search structure parameters and/or model parameters in the first search network according to the first gradient received from the label party.
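Claim 9's data-party loop can be sketched with a single linear layer standing in for the first search network. The transport stub `send_to_label_party`, the learning rate, and the linear model are all assumptions for illustration; only the model parameters are updated here, whereas the claim also allows updating search structure parameters.

```python
import numpy as np

def data_party_round(X, W, send_to_label_party, lr=0.1, noise_std=0.01, rng=None):
    """One data-party round: forward pass, differential-privacy noise on the
    raw output, exchange with the label party, local parameter update.

    send_to_label_party: assumed stub that transmits the first network output
    and returns the first gradient dL/d(first_output) computed remotely.
    """
    rng = np.random.default_rng() if rng is None else rng
    raw_output = X @ W  # first search network forward pass (linear stand-in)
    # Differential-privacy step on the raw output before it leaves the party.
    first_output = raw_output + rng.normal(0.0, noise_std, raw_output.shape)
    first_grad = send_to_label_party(first_output)
    # Chain rule through the linear layer: dL/dW = X^T @ dL/d(output).
    W -= lr * (X.T @ first_grad)
    return W
```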
- The vertical federated modeling optimization method according to claim 9, wherein the search structure parameters in the data party's first search network include the weights corresponding to the connection operations between network units in the first search network, and after the step of updating the search structure parameters and/or model parameters in the first search network according to the first gradient received from the label party, the method further comprises:
selecting retained operations from the connection operations according to the search structure parameters of the first search network after the parameters are updated;
taking the model formed by the retained operations and the network units connected by the retained operations as the target model.
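The selection in claim 10 mirrors the usual DARTS-style discretization: on each connection between network units, keep the candidate operation with the largest architecture weight. Representing the edges as a dictionary of weight arrays is an assumption made for this sketch.

```python
import numpy as np

def select_retained_ops(edge_weights):
    """For each connection (i, j) between network units, return the index of
    the candidate operation with the largest search-structure weight; these
    retained operations and the units they connect form the target model."""
    return {edge: int(np.argmax(w)) for edge, w in edge_weights.items()}
```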
- A vertical federated modeling optimization device, wherein the device comprises: a memory, a processor, and a vertical federated modeling optimization program stored in the memory and executable on the processor, the program, when executed by the processor, implementing the following steps:
receiving the first network output sent by the data party, wherein the first network output is obtained by the data party inputting the first data set into the first search network;
fusing the first network outputs to obtain a second network output, and computing the first gradient of the loss function with respect to each first network output according to the second network output and the local label data;
performing differential privacy encryption on each first gradient to obtain each first encrypted gradient, and sending each first encrypted gradient to the corresponding data party, so that the data party updates the search structure parameters and/or model parameters in the first search network according to the first encrypted gradient.
- The vertical federated modeling optimization device according to claim 11, wherein the step of receiving the first network output sent by the data party, the first network output being obtained by the data party inputting the first data set into the first search network, comprises:
receiving the first network output sent by the data party, wherein the first network output is obtained by the data party inputting the first data set into the first search network for processing to obtain a raw network output, and then performing differential privacy encryption on the raw network output.
- The vertical federated modeling optimization device according to claim 11, wherein the label party deploys an output network as well as a second data set and a second search network constructed based on the data features of the label party,
and the step of fusing the first network outputs to obtain the second network output comprises:
inputting the second data set into the second search network to obtain a third network output;
concatenating the third network output with the first network outputs and inputting the result into the output network to obtain the second network output;
and wherein, after the step of computing the first gradient of the loss function with respect to each first network output according to the second network output and the local label data, the following is further implemented:
computing a second gradient of the loss function with respect to a target parameter in the second search network according to the second network output and the label data, and updating the target parameter according to the second gradient, wherein the target parameter is a search structure parameter and/or a model parameter in the second search network.
- The vertical federated modeling optimization device according to any one of claims 11 to 13, wherein the differential privacy encryption comprises clipping and adding Gaussian noise, and the step of performing differential privacy encryption on each first gradient to obtain each first encrypted gradient comprises:
clipping the first gradient to obtain a first clipped gradient, wherein the L2 norm of the first clipped gradient is less than or equal to a first preset threshold;
generating a noise array obeying a target Gaussian distribution, wherein the mean of the target Gaussian distribution is 0, its standard deviation is a second preset threshold, and the elements of the noise array correspond one-to-one to the elements of the first gradient;
adding Gaussian noise to the first gradient using the noise array to obtain the first encrypted gradient.
- The vertical federated modeling optimization device according to claim 14, wherein before the step of generating the noise array obeying the target Gaussian distribution, the following is further implemented:
acquiring the privacy level and modeling progress of the current vertical federated modeling;
setting the second preset threshold according to the privacy level and the modeling progress.
- The vertical federated modeling optimization device according to claim 11, wherein if the label party also owns feature data, the label party also acts as a data party, deploys a data set and a search network constructed based on its data features, and performs both the tasks of the label party and the tasks of a data party.
- The vertical federated modeling optimization device according to claim 16, wherein the data set and search network of a data party are the first data set and the first search network, and the data set and search network of the label party are the second data set and the second search network.
- A vertical federated modeling optimization device, wherein the device comprises: a memory, a processor, and a vertical federated modeling optimization program stored in the memory and executable on the processor, the program, when executed by the processor, implementing the following steps:
inputting the first data set into the first search network to obtain a raw network output;
performing differential privacy encryption on the raw network output to obtain a first network output;
sending the first network output to the label party participating in the vertical federated modeling, so that the label party fuses the first network outputs received from the data parties to obtain a second network output, computes the first gradient of the loss function with respect to each first network output according to the second network output and the label party's label data, and returns each first gradient to the corresponding data party;
updating the search structure parameters and/or model parameters in the first search network according to the first gradient received from the label party.
- The vertical federated modeling optimization device according to claim 18, wherein the search structure parameters in the data party's first search network include the weights corresponding to the connection operations between network units in the first search network, and after the step of updating the search structure parameters and/or model parameters in the first search network according to the first gradient received from the label party, the following is further implemented:
selecting retained operations from the connection operations according to the search structure parameters of the first search network after the parameters are updated;
taking the model formed by the retained operations and the network units connected by the retained operations as the target model.
- A computer-readable storage medium, wherein a vertical federated modeling optimization program is stored on the computer-readable storage medium, and when the vertical federated modeling optimization program is executed by a processor, the steps of the vertical federated modeling optimization method according to any one of claims 1 to 10 are implemented.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010719397.5A CN111860864A (en) | 2020-07-23 | 2020-07-23 | Longitudinal federal modeling optimization method, device and readable storage medium |
CN202010719397.5 | 2020-07-23 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022016964A1 true WO2022016964A1 (en) | 2022-01-27 |
Family
ID=72949871
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2021/093407 WO2022016964A1 (en) | 2020-07-23 | 2021-05-12 | Vertical federated modeling optimization method and device, and readable storage medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN111860864A (en) |
WO (1) | WO2022016964A1 (en) |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114595835A (en) * | 2022-05-07 | 2022-06-07 | 腾讯科技(深圳)有限公司 | Model training method and device based on federal learning, equipment and storage medium |
CN114611008A (en) * | 2022-05-09 | 2022-06-10 | 北京淇瑀信息科技有限公司 | User service strategy determination method and device based on federal learning and electronic equipment |
CN114662705A (en) * | 2022-03-18 | 2022-06-24 | 腾讯科技(深圳)有限公司 | Federal learning method, device, electronic equipment and computer readable storage medium |
CN114742239A (en) * | 2022-03-09 | 2022-07-12 | 大连理工大学 | Financial insurance claim risk model training method and device based on federal learning |
CN114841145A (en) * | 2022-05-10 | 2022-08-02 | 平安科技(深圳)有限公司 | Text abstract model training method and device, computer equipment and storage medium |
CN114841373A (en) * | 2022-05-24 | 2022-08-02 | 中国电信股份有限公司 | Parameter processing method, device, system and product applied to mixed federal scene |
CN114881247A (en) * | 2022-06-10 | 2022-08-09 | 杭州博盾习言科技有限公司 | Longitudinal federal feature derivation method, device and medium based on privacy computation |
CN115346668A (en) * | 2022-07-29 | 2022-11-15 | 京东城市(北京)数字科技有限公司 | Training method and device of health risk grade evaluation model |
CN117454185A (en) * | 2023-12-22 | 2024-01-26 | 深圳市移卡科技有限公司 | Federal model training method, federal model training device, federal model training computer device, and federal model training storage medium |
CN116304644B (en) * | 2023-05-18 | 2024-06-11 | 腾讯科技(深圳)有限公司 | Data processing method, device, equipment and medium based on federal learning |
TWI852148B (en) | 2022-10-28 | 2024-08-11 | 財團法人工業技術研究院 | Data privacy protection method, server device and client device for federated learning |
Families Citing this family (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111860864A (en) * | 2020-07-23 | 2020-10-30 | 深圳前海微众银行股份有限公司 | Longitudinal federal modeling optimization method, device and readable storage medium |
CN112347476B (en) * | 2020-11-13 | 2024-02-02 | 脸萌有限公司 | Data protection method, device, medium and equipment |
CN112132270B (en) * | 2020-11-24 | 2021-03-23 | 支付宝(杭州)信息技术有限公司 | Neural network model training method, device and system based on privacy protection |
CN112700003A (en) * | 2020-12-25 | 2021-04-23 | 深圳前海微众银行股份有限公司 | Network structure search method, device, equipment, storage medium and program product |
CN112700013A (en) * | 2020-12-30 | 2021-04-23 | 深圳前海微众银行股份有限公司 | Parameter configuration method, device, equipment and storage medium based on federal learning |
CN112347500B (en) * | 2021-01-11 | 2021-04-09 | 腾讯科技(深圳)有限公司 | Machine learning method, device, system, equipment and storage medium of distributed system |
CN113011603A (en) * | 2021-03-17 | 2021-06-22 | 深圳前海微众银行股份有限公司 | Model parameter updating method, device, equipment, storage medium and program product |
CN113051239A (en) * | 2021-03-26 | 2021-06-29 | 北京沃东天骏信息技术有限公司 | Data sharing method, use method of model applying data sharing method and related equipment |
CN112799708B (en) * | 2021-04-07 | 2021-07-13 | 支付宝(杭州)信息技术有限公司 | Method and system for jointly updating business model |
CN115965093B (en) * | 2021-10-09 | 2024-10-11 | 抖音视界有限公司 | Model training method and device, storage medium and electronic equipment |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110633805A (en) * | 2019-09-26 | 2019-12-31 | 深圳前海微众银行股份有限公司 | Longitudinal federated learning system optimization method, device, equipment and readable storage medium |
CN110633806A (en) * | 2019-10-21 | 2019-12-31 | 深圳前海微众银行股份有限公司 | Longitudinal federated learning system optimization method, device, equipment and readable storage medium |
CN111210003A (en) * | 2019-12-30 | 2020-05-29 | 深圳前海微众银行股份有限公司 | Longitudinal federated learning system optimization method, device, equipment and readable storage medium |
CN111860864A (en) * | 2020-07-23 | 2020-10-30 | 深圳前海微众银行股份有限公司 | Longitudinal federal modeling optimization method, device and readable storage medium |
- 2020-07-23: Application CN202010719397.5A filed in China; published as CN111860864A (status: pending)
- 2021-05-12: International application PCT/CN2021/093407 filed; published as WO2022016964A1 (status: application filing)
Non-Patent Citations (1)
Title |
---|
CHAOYANG HE; MURALI ANNAVARAM; SALMAN AVESTIMEHR: "FedNAS: Federated Deep Learning via Neural Architecture Search", arXiv.org, Cornell University Library, 18 April 2020 (2020-04-18), Ithaca, NY 14853, XP081647715 * |
Cited By (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114742239A (en) * | 2022-03-09 | 2022-07-12 | 大连理工大学 | Financial insurance claim risk model training method and device based on federal learning |
CN114662705A (en) * | 2022-03-18 | 2022-06-24 | 腾讯科技(深圳)有限公司 | Federal learning method, device, electronic equipment and computer readable storage medium |
CN114595835B (en) * | 2022-05-07 | 2022-07-22 | 腾讯科技(深圳)有限公司 | Model training method and device based on federal learning, equipment and storage medium |
CN114595835A (en) * | 2022-05-07 | 2022-06-07 | 腾讯科技(深圳)有限公司 | Model training method and device based on federal learning, equipment and storage medium |
CN114611008A (en) * | 2022-05-09 | 2022-06-10 | 北京淇瑀信息科技有限公司 | User service strategy determination method and device based on federal learning and electronic equipment |
CN114611008B (en) * | 2022-05-09 | 2022-07-22 | 北京淇瑀信息科技有限公司 | User service strategy determination method and device based on federal learning and electronic equipment |
CN114841145B (en) * | 2022-05-10 | 2023-07-11 | 平安科技(深圳)有限公司 | Text abstract model training method, device, computer equipment and storage medium |
CN114841145A (en) * | 2022-05-10 | 2022-08-02 | 平安科技(深圳)有限公司 | Text abstract model training method and device, computer equipment and storage medium |
CN114841373A (en) * | 2022-05-24 | 2022-08-02 | 中国电信股份有限公司 | Parameter processing method, device, system and product applied to mixed federal scene |
CN114841373B (en) * | 2022-05-24 | 2024-05-10 | 中国电信股份有限公司 | Parameter processing method, device, system and product applied to mixed federal scene |
CN114881247A (en) * | 2022-06-10 | 2022-08-09 | 杭州博盾习言科技有限公司 | Longitudinal federal feature derivation method, device and medium based on privacy computation |
CN115346668A (en) * | 2022-07-29 | 2022-11-15 | 京东城市(北京)数字科技有限公司 | Training method and device of health risk grade evaluation model |
TWI852148B (en) | 2022-10-28 | 2024-08-11 | 財團法人工業技術研究院 | Data privacy protection method, server device and client device for federated learning |
CN116304644B (en) * | 2023-05-18 | 2024-06-11 | 腾讯科技(深圳)有限公司 | Data processing method, device, equipment and medium based on federal learning |
CN117454185A (en) * | 2023-12-22 | 2024-01-26 | 深圳市移卡科技有限公司 | Federal model training method, federal model training device, federal model training computer device, and federal model training storage medium |
CN117454185B (en) * | 2023-12-22 | 2024-03-12 | 深圳市移卡科技有限公司 | Federal model training method, federal model training device, federal model training computer device, and federal model training storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN111860864A (en) | 2020-10-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2022016964A1 (en) | Vertical federated modeling optimization method and device, and readable storage medium | |
WO2022007321A1 (en) | Longitudinal federal modeling optimization method, apparatus and device, and readable storage medium | |
CN113609521B (en) | Federated learning privacy protection method and system based on countermeasure training | |
CN110245510B (en) | Method and apparatus for predicting information | |
CN110633806B (en) | Longitudinal federal learning system optimization method, device, equipment and readable storage medium | |
WO2021120676A1 (en) | Model training method for federated learning network, and related device | |
WO2021022707A1 (en) | Hybrid federated learning method and architecture | |
WO2021159798A1 (en) | Method for optimizing longitudinal federated learning system, device and readable storage medium | |
WO2021083276A1 (en) | Method, device, and apparatus for combining horizontal federation and vertical federation, and medium | |
WO2022193432A1 (en) | Model parameter updating method, apparatus and device, storage medium, and program product | |
WO2021174877A1 (en) | Processing method for smart decision-based target detection model, and related device | |
WO2022156594A1 (en) | Federated model training method and apparatus, electronic device, computer program product, and computer-readable storage medium | |
WO2020119540A1 (en) | Group profile picture generation method and device | |
US20240176906A1 (en) | Methods, apparatuses, and systems for collaboratively updating model by multiple parties for implementing privacy protection | |
CN113627085A (en) | Method, apparatus, medium, and program product for optimizing horizontal federated learning modeling | |
CN111767411B (en) | Knowledge graph representation learning optimization method, device and readable storage medium | |
WO2022048195A1 (en) | Longitudinal federation modeling method, apparatus, and device, and computer readable storage medium | |
US20170228349A1 (en) | Combined predictions methodology | |
JP2023521120A (en) | Method and Apparatus for Evaluating Collaborative Training Models | |
CN105337841A (en) | Information processing method and system, client, and server | |
EP4386636A1 (en) | User data processing system, method and apparatus | |
US20240275828A1 (en) | Account registration session management operations using concurrent preliminary risk scoring | |
CN111177653B (en) | Credit evaluation method and device | |
CN113626866A (en) | Localized differential privacy protection method and system for federal learning, computer equipment and storage medium | |
CN112016698A (en) | Factorization machine model construction method and device and readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 21845772; Country of ref document: EP; Kind code of ref document: A1 |
| NENP | Non-entry into the national phase | Ref country code: DE |
| 122 | Ep: pct application non-entry in european phase | Ref document number: 21845772; Country of ref document: EP; Kind code of ref document: A1 |