WO2022007321A1 - Vertical federated modeling optimization method, apparatus, device, and readable storage medium - Google Patents

Vertical federated modeling optimization method, apparatus, device, and readable storage medium

Info

Publication number
WO2022007321A1
WO2022007321A1 PCT/CN2020/133430 CN2020133430W
Authority
WO
WIPO (PCT)
Prior art keywords
network
local
search
update
data
Prior art date
Application number
PCT/CN2020/133430
Other languages
English (en)
French (fr)
Inventor
梁新乐
刘洋
陈天健
Original Assignee
深圳前海微众银行股份有限公司
Priority date
Filing date
Publication date
Application filed by 深圳前海微众银行股份有限公司
Publication of WO2022007321A1 publication Critical patent/WO2022007321A1/zh

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 — Machine learning
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 — Computer-aided design [CAD]
    • G06F30/20 — Design optimisation, verification or simulation
    • G06F30/27 — Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model

Definitions

  • the present application relates to the technical field of artificial intelligence, and in particular, to a vertical federation modeling optimization method, apparatus, device, and readable storage medium.
  • In vertical federated learning, when the participants' data features overlap little but their users overlap heavily, the samples that share the same users but have different data features are taken out to jointly train a machine learning model. For example, consider two participants A and B in the same region, where A is a bank and B is an e-commerce platform. A and B share many of the same users in that area, but because their businesses differ, the user data features they record also differ; in particular, the features recorded by A and B may be complementary. In such a scenario, vertical federated learning can help A and B build a joint machine learning prediction model so that both can provide better services to their customers.
  • However, participants in vertical federated learning must design their own model structures in advance when using vertical federated technology, and even slight differences in the designed model structure may greatly affect the performance of the overall system.
  • As a result, the threshold for participating in vertical federated learning is relatively high, which limits its application scope in specific task areas.
  • The main purpose of this application is to provide a vertical federated modeling optimization method, apparatus, device, and readable storage medium, aiming to remove the need for participants in vertical federated learning to design their model structures in advance, which currently results in a high threshold for participation.
  • To achieve the above purpose, the present application provides a vertical federated modeling optimization method. The method is applied to the participants in vertical federated modeling, each of which deploys a data set and a search network constructed based on its own data features, and the method includes the following steps:
  • interacting with other participants, based on the local data set, on intermediate results used to update the model parameters and search structure parameters in the respective search networks, and updating the local search network based on the received intermediate results; and
  • obtaining the local target model based on the updated local search network.
  • Further, the data set of a participant includes a first data set and a second data set, and the step of interacting with other participants, based on the local data set, on intermediate results used to update the model parameters and search structure parameters in the respective search networks, and updating the local search network based on the received intermediate results, includes:
  • Further, the method is applied to a data application participant that has label data and deploys a back-end network, and the step of interacting with other participants, based on the local second data set, on second intermediate results used to update the search structure parameters in the respective initial update copies, and updating the local initial update copy based on the received second intermediate results to obtain the local secondary update copy, includes:
  • updating the search structure parameters in the local initial update copy according to the second gradient to obtain the local secondary update copy.
  • Further, the method is applied to a data providing participant, and the step of interacting with other participants, based on the local second data set, on second intermediate results used to update the search structure parameters in the respective initial update copies, and updating the local initial update copy based on the received second intermediate results to obtain the local secondary update copy, includes:
  • receiving the first gradient sent by the data application participant, and updating the search structure parameters in the local initial update copy according to the first gradient to obtain the local secondary update copy.
  • Further, the search structure parameters in a participant's search network include the weights corresponding to the connection operations between network units in the search network, and the step of obtaining the local target model based on the updated local search network includes:
  • selecting reserved operations from the connection operations according to the updated search structure parameters in the local search network; and
  • taking the model formed by the reserved operations and the network units they connect as the local target model.
  • Further, before the step of selecting reserved operations from the connection operations according to the updated search structure parameters in the local search network, the method further includes:
  • when a stop condition is met, performing the step of selecting reserved operations from the connection operations according to the updated search structure parameters in the local search network;
  • otherwise, performing again, based on the updated local search network, the step of: interacting with other participants, based on the local data set, on intermediate results used to update the model parameters and search structure parameters in the respective search networks, and updating the local search network based on the received intermediate results.
  • Further, the method is applied to a data application participant that has label data, and after the step of obtaining the local target model based on the updated local search network, the method further includes:
  • concatenating the output of the first model and the output of the second model, and inputting the result into the local back-end network to obtain the risk prediction result for the target user.
  • In addition, the present application provides a vertical federated modeling optimization apparatus.
  • The apparatus is deployed on a participant in vertical federated modeling.
  • The apparatus includes:
  • an interaction module, used to interact with other participants, based on the local data set, on intermediate results used to update the model parameters and search structure parameters in the respective search networks, and to update the local search network based on the received intermediate results; and
  • a determining module, used to obtain the local target model based on the updated local search network.
  • In addition, the present application also provides a vertical federated modeling optimization device; the device includes a memory, a processor, and a vertical federated modeling optimization program stored on the memory and runnable on the processor, and the program, when executed by the processor, implements the steps of the vertical federated modeling optimization method described above.
  • In addition, the present application also proposes a computer-readable storage medium on which a vertical federated modeling optimization program is stored; when executed by a processor, the program implements the steps of the vertical federated modeling optimization method described above.
  • In this application, each participant in vertical federated learning deploys a data set and a search network constructed based on its own data features; each participant uses its own data set to compute, and exchanges with the other participants, the intermediate results used to update the model parameters and search structure parameters in the respective search networks, updates its search network based on the received intermediate results, and obtains its target model based on the updated search network.
  • Compared with traditional vertical federated learning, where each participant must spend substantial manpower and material resources designing its model structure in advance, this application only requires each participant to set up its own search network: the connections between the network units in the search network are determined automatically by optimizing and updating the search structure parameters during vertical federated modeling. This realizes automated vertical federated learning without the expense of pre-designing model structures, lowers the threshold for participating in vertical federated learning, and allows vertical federated learning to be applied to a wider range of specific task fields.
  • Moreover, the participants never exchange their data sets or models directly; only the intermediate results used to update the model parameters and search structure parameters are exchanged, which ensures each participant's data security and model information security.
  • FIG. 1 is a schematic structural diagram of a hardware operating environment involved in a solution according to an embodiment of the present application
  • FIG. 2 is a schematic flowchart of the first embodiment of the vertical federated modeling optimization method of the present application
  • FIG. 3 is a schematic diagram of a participant jointly updating model parameters involved in an embodiment of the present application
  • FIG. 4 is a schematic diagram of participants jointly updating search structure parameters involved in an embodiment of the present application;
  • FIG. 5 is a schematic diagram of a participant jointly updating model parameters involved in an embodiment of the present application
  • FIG. 6 is a functional schematic block diagram of a preferred embodiment of the vertical federated modeling and optimization apparatus of the present application.
  • FIG. 1 is a schematic diagram of a device structure of a hardware operating environment involved in the solution of the embodiment of the present application.
  • The vertical federated modeling optimization device in this embodiment of the present application may be a device such as a smartphone, a personal computer, or a server, which is not specifically limited herein.
  • The vertical federated modeling optimization device may be a participant in vertical federated modeling, and each participant deploys a data set and a search network constructed based on its own data features.
  • the vertical federated modeling optimization device may include: a processor 1001 , such as a CPU, a network interface 1004 , a user interface 1003 , a memory 1005 , and a communication bus 1002 .
  • the communication bus 1002 is used to realize the connection and communication between these components.
  • The user interface 1003 may include a display (Display) and an input unit such as a keyboard (Keyboard); optionally, the user interface 1003 may also include standard wired and wireless interfaces.
  • the network interface 1004 may include a standard wired interface and a wireless interface (eg, a WI-FI interface).
  • the memory 1005 may be high-speed RAM memory, or may be non-volatile memory, such as disk memory.
  • the memory 1005 may also be a storage device independent of the aforementioned processor 1001 .
  • Those skilled in the art can understand that the structure shown in FIG. 1 does not constitute a limitation on the vertical federated modeling optimization device, which may include more or fewer components than shown, combine certain components, or use a different component arrangement.
  • the memory 1005 as a computer storage medium may include an operating system, a network communication module, a user interface module, and a vertical federation modeling optimization program.
  • the operating system is a program that manages and controls the hardware and software resources of the device, and supports the operation of the vertical federation modeling optimization program and other software or programs.
  • the user interface 1003 is mainly used for data communication with the client;
  • the network interface 1004 is mainly used for establishing communication connections with the other participants in vertical federated modeling;
  • and the processor 1001 can be used to call the vertical federated modeling optimization program stored in the memory 1005 and perform the following operations:
  • interacting with other participants, based on the local data set, on intermediate results used to update the model parameters and search structure parameters in the respective search networks, and updating the local search network based on the received intermediate results; and obtaining the local target model based on the updated local search network.
  • Further, the data set of a participant includes a first data set and a second data set, and the operation of interacting with other participants, based on the local data set, on intermediate results used to update the model parameters and search structure parameters in the respective search networks, and updating the local search network based on the received intermediate results, includes:
  • Further, when the vertical federated modeling optimization device is a data application participant with label data, the data application participant deploys a back-end network, and the operation of interacting with other participants, based on the local second data set, on second intermediate results used to update the search structure parameters in the respective initial update copies, and updating the local initial update copy based on the received second intermediate results to obtain the local secondary update copy, includes:
  • updating the search structure parameters in the local initial update copy according to the second gradient to obtain the local secondary update copy.
  • Further, when the device is a data providing participant, the operation of interacting with other participants, based on the local second data set, on second intermediate results used to update the search structure parameters in the respective initial update copies, and updating the local initial update copy based on the received second intermediate results to obtain the local secondary update copy, includes:
  • receiving the first gradient sent by the data application participant, and updating the search structure parameters in the local initial update copy according to the first gradient to obtain the local secondary update copy.
  • Further, the search structure parameters in a participant's search network include the weights corresponding to the connection operations between network units in the search network, and the operation of obtaining the local target model based on the updated local search network includes:
  • selecting reserved operations from the connection operations according to the updated search structure parameters in the local search network; and
  • taking the model formed by the reserved operations and the network units they connect as the local target model.
  • Further, the processor 1001 may also be used to call the vertical federated modeling optimization program stored in the memory 1005 and perform the following operations:
  • when a stop condition is met, performing the step of selecting reserved operations from the connection operations according to the updated search structure parameters in the local search network;
  • otherwise, performing again, based on the updated local search network, the step of: interacting with other participants, based on the local data set, on intermediate results used to update the model parameters and search structure parameters in the respective search networks, and updating the local search network based on the received intermediate results.
  • Further, when the method is applied to a data application participant that has label data, after the operation of obtaining the local target model based on the updated local search network, the processor 1001 can also be used to call the vertical federated modeling optimization program stored in the memory 1005 and perform the following operations:
  • concatenating the output of the first model and the output of the second model, and inputting the result into the local back-end network to obtain the risk prediction result for the target user.
  • FIG. 2 is a schematic flowchart of the first embodiment of the vertical federated modeling optimization method of the present application. It should be noted that although a logical order is shown in the flowchart, in some cases the steps shown or described may be performed in an order different from the one here.
  • the vertical federated modeling optimization method of this application is applied to the participants participating in the vertical federated learning. Each participant deploys a data set and a search network constructed based on their own data characteristics, and the participants can be devices such as smartphones, personal computers, and servers.
  • the vertical federated modeling optimization method includes:
  • Step S10: interacting with other participants, based on the local data set, on intermediate results used to update the model parameters and search structure parameters in the respective search networks, and updating the local search network based on the received intermediate results;
  • In this embodiment, the participants in vertical federated learning fall into two categories: data application participants, which hold label data, and data providing participants, which do not. There is one data application participant and one or more data providing participants.
  • Each participant deploys a data set and a search network constructed based on their own data characteristics.
  • the sample dimensions of the data sets of each participant are aligned, that is, the sample IDs of each data set are the same, but the data characteristics of each participant may be different.
  • Each participant may use the encrypted sample alignment method in advance to construct a sample dimension-aligned data set, which will not be described in detail here.
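  • The encrypted alignment protocol itself is not detailed here, but its effect — keeping only the samples whose IDs appear in every party's data set — can be sketched in plain Python. This is a hedged simplification: a real deployment would use a private-set-intersection style protocol so that non-overlapping IDs are never revealed in the clear.

```python
def align_samples(*id_sets):
    """Return the sorted intersection of all parties' sample IDs.

    In real vertical federated learning this intersection is computed
    with an encrypted sample-alignment protocol so that no party learns
    another party's non-overlapping IDs; here we only illustrate the
    resulting alignment.
    """
    common = set(id_sets[0])
    for ids in id_sets[1:]:
        common &= set(ids)
    return sorted(common)

# Hypothetical IDs for the bank (A) and e-commerce platform (B) example.
bank_ids = ["u1", "u2", "u3", "u5"]
ecommerce_ids = ["u2", "u3", "u4", "u5"]
aligned = align_samples(bank_ids, ecommerce_ids)
```

After alignment, both parties train only on rows `u2`, `u3`, and `u5`, each contributing its own feature columns for those users.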
  • The search network refers to a network used to perform neural architecture search (NAS).
  • The search network of each participant may be a network designed in advance according to the DARTS (Differentiable Architecture Search) method.
  • The search network includes multiple units; each unit corresponds to a network layer, and connection operations are set between some units. Taking two units as an example, N candidate connection operations may be preset between them, and each connection operation is assigned a weight; these weights are the search structure parameters of the search network, while the network layer parameters inside the units are its model parameters.
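  • The weighting of candidate connection operations can be sketched as a DARTS-style "mixed" edge: a softmax over the structure parameters weights every candidate operation, so the discrete architecture choice becomes differentiable. The toy operations below are illustrative assumptions; a real search space would use convolutions, pooling, skip connections, and so on.

```python
import math

def softmax(ws):
    """Normalize raw structure parameters into positive weights summing to 1."""
    exps = [math.exp(w) for w in ws]
    total = sum(exps)
    return [e / total for e in exps]

def mixed_op(x, ops, alphas):
    """DARTS-style mixed edge between two units: the output is the
    softmax-weighted sum of every candidate connection operation."""
    weights = softmax(alphas)
    return sum(w * op(x) for w, op in zip(weights, ops))

# Three toy candidate operations between two units (illustrative only).
candidates = [
    lambda x: x,        # identity / skip connection
    lambda x: 2.0 * x,  # a "heavier" transformation
    lambda x: 0.0,      # the "zero" (no connection) operation
]
alphas = [0.0, 0.0, 0.0]            # equal structure weights -> plain average
out = mixed_op(3.0, candidates, alphas)   # (3 + 6 + 0) / 3 = 3.0
```

As training pushes one alpha well above the others, the mixed output converges to that single operation's output, which is what later makes discretization (keeping only the top-weighted operation) sensible.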
  • During training, a network structure search is required to optimize and update the search structure parameters and model parameters. Based on the finally updated search structure parameters, the final network structure can be determined, that is, which connection operation or operations to keep. Because the network structure is determined by the search, participants do not need to hand-design the model's network structure as in traditional vertical federated learning, which reduces the difficulty of designing the model.
  • The data application participant may additionally deploy a back-end network chosen for the specific model prediction task. The back-end network is placed after the search networks of all participants, that is, the output data of every search network serves as its input.
  • The back-end network can be a single fully connected layer or a more complex neural network structure, and can vary with the model prediction task.
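  • A minimal back-end network in the sense above can be sketched as one fully connected unit over the concatenated search-network outputs, with a sigmoid producing, e.g., a risk score in (0, 1). The feature values, weights, and bias here are illustrative placeholders, not values from the application.

```python
import math

def back_end_network(first_output, second_output, weights, bias):
    """Concatenate the two search networks' outputs and apply a single
    fully connected unit followed by a sigmoid (the simplest back-end
    network mentioned above)."""
    features = first_output + second_output   # list concatenation ("splicing")
    z = sum(w * f for w, f in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))         # sigmoid -> score in (0, 1)

# Provider's search network emits two features, applier's emits one.
score = back_end_network([0.2, 0.5], [0.1], [1.0, 1.0, 1.0], 0.0)
```

For a harder prediction task, the single fully connected unit would simply be replaced by a deeper network; the concatenation interface stays the same.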
  • In this embodiment, the executing party may be either a data application participant or a data providing participant; it is hereinafter referred to as the local end.
  • The local end interacts with the other participants, based on the local data set, on the intermediate results used to update the model parameters and search structure parameters in the respective search networks; it updates the model parameters and search structure parameters in the local search network based on the received intermediate results, and updating those parameters is what updates the local search network.
  • That is, updating the model parameters and/or the search structure parameters of the search network constitutes updating the search network.
  • The search network of each participant includes model parameters and search structure parameters, which are initialized before joint training begins. During joint training, the participants update their respective model parameters and search structure parameters over multiple rounds.
  • An intermediate result can be a gradient of a parameter or the output data of a search network. When the participant is a data providing participant, the intermediate result it sends to the other party can be the output data of its search network; when the participant is a data application participant, the intermediate result it sends can be the computed gradient corresponding to the output data received from the data providing participant. Because intermediate results rather than the raw data in the data sets are transmitted, no participant discloses its data privacy, and each participant's data security is protected.
  • Each participant may jointly update parameters in multiple rounds.
  • In one implementation, in a round of joint parameter update, the participants jointly use their respective data sets to update the search structure parameters and the model parameters simultaneously.
  • Specifically, the data providing participant inputs its local data set into its local search network, obtains the network output (called the first network output in this paragraph) through the search network's processing, and sends the first network output to the data application participant. The data application participant inputs its local data set into its local search network and obtains the network output (called the second network output in this paragraph).
  • The data application participant then obtains the prediction result from the first and second network outputs: it can concatenate the two outputs and input the result into the back-end network, whose processing yields the prediction result. The data application participant computes the loss function from the prediction result and its local label data; the loss function can be, for example, the mean squared error for a regression problem or the cross-entropy loss for a classification problem. It computes the gradients of the loss with respect to its local model parameters and network structure parameters, as well as the gradient of the loss with respect to the first network output, and sends the latter gradient to the data providing participant.
  • The data providing participant receives the gradient of the first network output and, by the chain rule and the gradient descent algorithm, computes from it the gradients of the loss with respect to its local model parameters and network structure parameters, then updates those parameters accordingly. The data application participant likewise updates its local model parameters and network structure parameters according to the gradients it computed, which completes one round of joint parameter update.
  • In this process, the intermediate result sent by the data providing participant to the data application participant is the first network output, and the intermediate result sent by the data application participant to the data providing participant is the gradient corresponding to the first network output.
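  • This round-trip can be shrunk to a scalar sketch: each party's "search network" is h = w * x, and a plain sum stands in for the back-end network (both illustrative simplifications, not the application's actual architecture). Only the network output crosses from provider to applier, and only the gradient with respect to that output crosses back.

```python
def joint_update_round(x_A, y, w_A, x_B, w_B, lr=0.1):
    """One simplified round of the exchange: applier A holds (x_A, y),
    provider B holds x_B. B never sees the label y; A never sees B's
    raw feature x_B -- only h_B and the gradient w.r.t. h_B travel."""
    h_B = w_B * x_B                 # provider B: forward pass, send output to A
    h_A = w_A * x_A                 # applier A: forward pass on its own data
    y_hat = h_A + h_B               # combine outputs (stand-in back-end network)
    grad_out = y_hat - y            # d(0.5*(y_hat - y)^2) / d(y_hat)
    w_A_new = w_A - lr * grad_out * x_A  # A: chain rule for its own parameter
    # A sends grad_out (gradient w.r.t. h_B) to B; B applies the chain rule.
    w_B_new = w_B - lr * grad_out * x_B
    return w_A_new, w_B_new

# Repeating the round drives the joint prediction toward the label.
w_A, w_B = 0.0, 0.0
for _ in range(100):
    w_A, w_B = joint_update_round(1.0, 1.0, w_A, 1.0, w_B)
```

The same message pattern carries the structure-parameter gradients as well; the sketch omits them only to keep the chain rule visible.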
  • It should be noted that the participants may use different data in each round of joint parameter update. Specifically, a participant can divide its total data set into multiple small training sets (also called data batches) and use one batch per round, or, before each round of joint parameter update, sample a batch with replacement from the total data set to participate in that round's joint parameter update.
  • In another implementation, a participant's data set can be divided into a first data set and a second data set, where the first data set serves as a training set and the second as a validation set. A round of joint parameter update is then divided into two steps: in the first step, the participants jointly use their first data sets to update the model parameters in their respective search networks; in the second step, building on the first, they jointly use their second data sets to update the search structure parameters in their respective search networks.
  • Specifically, in the first step, the data providing participant inputs its first data set into its local search network, obtains the network output (called the first network output in this paragraph) through the search network's processing, and sends it to the data application participant. The data application participant inputs its first data set into its local search network and obtains the network output (called the second network output in this paragraph), then obtains the prediction result from the two outputs: since it deploys a back-end network, it can concatenate the first and second network outputs and input the result into the back-end network, whose processing yields the prediction result.
  • The data application participant computes the loss function from the prediction result and its local label data, computes the gradient of the loss with respect to its local model parameters and the gradient of the loss with respect to the first network output, and sends the latter to the data providing participant. The data providing participant receives that gradient and, by the chain rule and the gradient descent algorithm, computes from it the gradient of the loss with respect to its local model parameters and updates them accordingly; the data application participant likewise updates its local model parameters according to the gradients it computed.
  • In the second step, the data providing participant inputs its second data set into its local search network, obtains the network output (called the first network output in this paragraph) through the search network's processing, and sends it to the data application participant. The data application participant inputs its second data set into its local search network and obtains the network output (called the second network output in this paragraph), then obtains the prediction result from the two outputs: since it deploys a back-end network, it can concatenate the first and second network outputs and input the result into the back-end network, whose processing yields the prediction result.
  • The data application participant computes the loss function from the prediction result and its local label data, computes the gradient of the loss with respect to its local search structure parameters and the gradient of the loss with respect to the first network output, and sends the latter to the data providing participant. The data providing participant receives that gradient and, by the chain rule and the gradient descent algorithm, computes from it the gradient of the loss with respect to its local search structure parameters and updates them accordingly; the data application participant likewise updates its local search structure parameters according to the gradients it computed, which completes one round of joint parameter update.
  • In a round of joint parameter update, the participants first jointly use their first data sets to update the model parameters of their respective search networks, and then jointly use their second data sets to update the search structure parameters of their respective search networks, which reduces the likelihood of overfitting.
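  • The two-step schedule can be sketched on a toy scalar network h(x) = s·(w·x) + (1−s)·x with s = sigmoid(alpha): w plays the role of a model parameter and alpha of a search structure parameter gating two candidate operations. Step 1 updates w on the first (training) set; step 2 updates alpha on the second (validation) set. This is a single-party toy mirroring the bi-level split, not the application's multi-party, multi-layer networks.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def alternating_round(train, val, w, alpha, lr=0.1):
    """One round of the two-step schedule on a toy scalar network
    h(x) = s*(w*x) + (1-s)*x, s = sigmoid(alpha), loss 0.5*(h - y)^2."""
    # Step 1: update the model parameter w on the training set.
    x, y = train
    s = sigmoid(alpha)
    err = s * (w * x) + (1 - s) * x - y   # dLoss/dh
    w -= lr * err * s * x                 # chain rule: dh/dw = s*x
    # Step 2: update the structure parameter alpha on the validation set.
    x, y = val
    s = sigmoid(alpha)
    err = s * (w * x) + (1 - s) * x - y
    alpha -= lr * err * (w * x - x) * s * (1 - s)   # dh/dalpha
    return w, alpha
```

Splitting the two updates across the two data sets means the architecture (alpha) is scored on data the model parameters were not fitted to, which is exactly the overfitting guard the paragraph above describes.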
  • If the data application participant deploys a back-end network, then when it updates the model parameters of its local search network, it also computes the gradients of the model parameters in the back-end network and updates the back-end network according to those gradients.
  • Step S20: obtaining the local target model based on the updated local search network.
  • After updating the local search network, the local end obtains the local target model from the updated network. Specifically, after multiple rounds of joint parameter update, the local end may obtain the local target model from the local search network produced by the final round of updating.
  • In one embodiment, the local search network may be directly used as the local target model, with the search structure parameters therein also treated as model parameters of the local target model.
  • Further, step S20 includes:
  • Step S201: selecting reserved operations from the connection operations according to the updated search structure parameters in the local search network;
  • Step S202: taking the model formed by the reserved operations and the network units they connect as the local target model.
  • In this embodiment, the search structure parameters in a participant's search network may include the weights corresponding to the connection operations between network units in the search network. That is, connection operations are set between network units, and each connection operation corresponds to a weight. It should be noted that connection operations are not necessarily set between every pair of network units.
  • The local end may select reserved operations from the connection operations according to the updated search structure parameters in the local search network. Specifically, for every two network units joined by connection operations, there are multiple candidate connection operations between them, and the one or more connection operations with the greatest weights may be selected from them as the reserved operations.
  • The model formed by the reserved operations and the network units they connect is taken as the local target model. It should be noted that if the local end is a data application participant with a back-end network deployed, the local target model also includes the back-end network.
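  • The reserved-operation selection described above can be sketched as follows. This is a minimal illustration (the function name `select_reserved_ops` and the toy weights are made up), assuming each pair of connected network units carries several candidate connection operations, each with a search structure weight:

```python
def select_reserved_ops(edge_weights, k=1):
    """For each edge (pair of connected network units), keep the k
    candidate connection operations with the largest search structure
    weights; the remaining operations on that edge are discarded."""
    reserved = {}
    for edge, weights in edge_weights.items():
        ranked = sorted(range(len(weights)),
                        key=lambda i: weights[i], reverse=True)
        reserved[edge] = sorted(ranked[:k])
    return reserved

# Toy search structure parameters: three candidate ops per edge.
alpha = {
    ("unit0", "unit1"): [0.1, 0.7, 0.2],  # op 1 has the largest weight
    ("unit1", "unit2"): [0.5, 0.2, 0.3],  # op 0 has the largest weight
}
print(select_reserved_ops(alpha, k=1))
# {('unit0', 'unit1'): [1], ('unit1', 'unit2'): [0]}
```

  • Passing `k > 1` retains several high-weight operations per edge, matching the "one or more" reserved operations mentioned above.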
  • In this embodiment, each participant in vertical federated learning deploys a data set and a search network constructed based on its own data characteristics. Each participant uses its own data set to calculate, and exchanges with the other participants, the intermediate results used to update the model parameters and search structure parameters of the respective search networks, updates its search network based on the intermediate results it receives, and obtains its target model based on the updated search network.
  • The embodiment of the present application thus realizes that, in the vertical federated modeling process, each participant only needs to set up its own search network.
  • The connections between the network units in the search network, that is, the model structure, are determined automatically by optimizing and updating the search structure parameters during the vertical federated modeling process, which realizes automatic vertical federated learning without spending substantial human and material resources on pre-setting the model structure. This lowers the threshold for participating in vertical federated learning, enables vertical federated learning to be applied to a wider range of specific task fields, and broadens its scope of application.
  • Moreover, the participants do not directly exchange their data sets or their models; they only exchange the intermediate results used to update the model parameters and search structure parameters, thereby ensuring each participant's data security and model information security.
  • Further, the data sets of the participants include a first data set and a second data set, and step S10 includes:
  • Step S101: exchanging with the other participants, based on the local first data set, the first intermediate results used to update the model parameters in the respective search network copies, and updating the copy of the local search network based on the received first intermediate results to obtain the local initial update copy;
  • In this embodiment, the process by which the participants perform one round of joint parameter updating may be divided into three steps.
  • In the first step, each participant uses its first data set to jointly update the model parameters in its copy of the search network.
  • In the second step, on the basis of the first step, each participant uses its second data set to jointly update the search structure parameters in its copy, completing one network structure search.
  • In the third step, each participant takes the search structure parameters of the copy updated in the second step as the search structure parameters of its search network, and then uses its first data set to jointly update the model parameters of its search network, completing one model parameter update.
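  • The three steps above can be sketched, from a single participant's point of view, as follows. `SearchNet`, its methods, and the data set names are illustrative stand-ins, not the patent's actual interfaces; the stand-in only records which parameter group was updated with which data set:

```python
import copy

class SearchNet:
    """Minimal stand-in for a participant's search network."""
    def __init__(self):
        self.model_params = {"updates": []}
        self.arch_params = {"updates": []}

    def update_model_params(self, data):
        # In the real protocol this step exchanges network outputs and
        # gradients with the other participants; here we just log it.
        self.model_params["updates"].append(("model", data))

    def update_arch_params(self, data):
        self.arch_params["updates"].append(("arch", data))

def joint_update_round(net, first_data, second_data):
    # Step 1: update model parameters on a copy with the first data set.
    net_copy = copy.deepcopy(net)             # initial update copy
    net_copy.update_model_params(first_data)
    # Step 2: update the copy's search structure parameters with the
    # second data set (one network structure search).
    net_copy.update_arch_params(second_data)  # secondary update copy
    # Step 3: adopt the searched structure, then update the real
    # network's model parameters with the first data set.
    net.arch_params = net_copy.arch_params
    net.update_model_params(first_data)
    return net

net = joint_update_round(SearchNet(), "X_trn", "X_val")
print(net.arch_params["updates"])   # [('arch', 'X_val')]
print(net.model_params["updates"])  # [('model', 'X_trn')]
```

  • Note that the model-parameter update of step 1 stays on the copy and is discarded; only the searched structure parameters are carried back before step 3.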
  • Specifically, the local end exchanges with the other participants, based on the local first data set, the first intermediate results used to update the model parameters in the respective search network copies, and updates the copy of the local search network based on the received first intermediate results to obtain the local initial update copy.
  • Each participant may copy its current search network before a round of joint updating to obtain its search network copy.
  • The first intermediate result sent by the data providing participant to the data application participant may be the network output obtained by the data providing participant inputting its first data set into its search network copy, and the first intermediate result sent by the data application participant to the data providing participant may be the gradient corresponding to that network output.
  • Specifically, the data providing participant inputs its first data set into its search network copy, obtains a network output (called the first network output in this paragraph) through the processing of the copy, and sends the first network output to the data application participant. The data application participant inputs its first data set into its own search network copy and obtains a network output (called the second network output in this paragraph) through the processing of that copy.
  • The data application participant obtains the prediction result from the first network output and the second network output. Specifically, if the data application participant deploys a back-end network, it may splice the first network output and the second network output and input the result into the back-end network, obtaining the prediction result through the processing of that network.
  • The data application participant calculates the loss function according to the prediction result and its label data, calculates the gradient of the loss function with respect to its own model parameters, and calculates the gradient of the loss function with respect to the first network output, then sends the gradient corresponding to the first network output to the data providing participant.
  • The data providing participant receives the gradient of the first network output and, according to the chain rule and the gradient descent algorithm, calculates from it the gradient of the loss function with respect to the model parameters in its search network copy, then updates those model parameters according to the gradient to obtain its initial update copy. The data application participant likewise updates the model parameters in its own copy according to the gradients it calculated, obtaining its initial update copy.
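  • The exchange above can be checked numerically with scalar stand-ins for the two networks and the back-end network (all weights, inputs, labels, and the squared-error loss below are illustrative, not the patent's actual models). Note that party B only ever receives the gradient of the loss with respect to its own output:

```python
# Scalar sketch of one joint model-parameter update.
eta = 0.1  # learning rate

# --- Data providing party B: forward pass, sends only u_B ---
w_B, x_B = 0.5, 2.0
u_B = w_B * x_B                      # B's network output (first output)

# --- Data application party A: forward pass, loss on its labels ---
w_A, x_A, y = 0.3, 1.0, 1.0
u_A = w_A * x_A                      # A's network output (second output)
v_A, v_B = 0.8, 0.6                  # back-end network weights at A
y_hat = v_A * u_A + v_B * u_B        # spliced outputs -> prediction
dL_dyhat = y_hat - y                 # from L = 0.5 * (y_hat - y) ** 2

# A updates its own model parameter and sends only dL/du_B to B;
# B never sees A's data, labels, or back-end network.
dL_du_B = dL_dyhat * v_B
w_A -= eta * dL_dyhat * v_A * x_A    # chain rule through A's network

# --- Party B: chain rule through its own network, local update ---
w_B -= eta * dL_du_B * x_B

print(round(w_A, 4), round(w_B, 4))  # 0.3128 0.5192
```

  • The same message pattern (one network output forward, one gradient back) is reused in every step of the protocol; only the parameter group being updated changes.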
  • Step S102: exchanging with the other participants, based on the local second data set, the second intermediate results used to update the search structure parameters in the respective initial update copies, and updating the local initial update copy based on the received second intermediate results to obtain the local secondary update copy;
  • Specifically, the local end exchanges with the other participants, based on the local second data set, the second intermediate results used to update the search structure parameters in the respective initial update copies, and updates the local initial update copy based on the received second intermediate results to obtain the local secondary update copy.
  • The second intermediate result sent by the data providing participant to the data application participant may be the network output obtained by the data providing participant inputting its second data set into its initial update copy, and the second intermediate result sent by the data application participant to the data providing participant may be the gradient corresponding to that network output.
  • Specifically, the data providing participant inputs its second data set into its initial update copy, obtains a network output (called the first network output in this paragraph) through the processing of the initial update copy, and sends the first network output to the data application participant. The data application participant inputs its second data set into its own initial update copy and obtains a network output (called the second network output in this paragraph) through the processing of that copy.
  • The data application participant obtains the prediction result from the first network output and the second network output. Specifically, if the data application participant deploys a back-end network, it may splice the first network output and the second network output, input the result into the back-end network, and obtain the prediction result through the processing of that network. The data application participant calculates the loss function according to the prediction result and its label data, calculates the gradient of the loss function with respect to the search structure parameters in its initial update copy, and calculates the gradient of the loss function with respect to the first network output, then sends the gradient corresponding to the first network output to the data providing participant. The data providing participant receives the gradient of the first network output and, according to the chain rule and the gradient descent algorithm, calculates from it the gradient of the loss function with respect to the search structure parameters in its initial update copy, then updates those search structure parameters according to the gradient to obtain its secondary update copy. The data application participant likewise updates the search structure parameters in its initial update copy according to the gradients it calculated, obtaining its secondary update copy.
  • In this way, the process of finding the optimal model parameters is approximated, instead of training until the model converges so as to fully solve the inner optimization, thereby reducing the number of joint model parameter updates between the participants and improving the efficiency of vertical federated modeling.
  • Step S103: using the search structure parameters in the local secondary update copy to update the local search network to obtain the local initial update search network;
  • Step S104: exchanging with the other participants, based on the local first data set, the third intermediate results used to update the model parameters in the respective initial update search networks, and updating the local initial update search network based on the received third intermediate results to obtain the updated local search network.
  • Specifically, the local end first uses the search structure parameters in its secondary update copy to update the local search network, obtaining the local initial update search network.
  • That is, the participant replaces the search structure parameters in its current search network with the search structure parameters in its secondary update copy, so as to update its search network. Compared with the search network before this round of joint parameter updates, the initial update search network has changed search structure parameters while its model parameters remain unchanged; in other words, with the model parameters fixed, one network structure search has been completed, and the updated search structure parameters optimize the structure of the search network.
  • After obtaining the local initial update search network, the local end exchanges with the other participants, based on the local first data set, the third intermediate results used to update the model parameters in the respective initial update search networks, and updates the local initial update search network based on the received third intermediate results to obtain the updated local search network.
  • The third intermediate result sent by the data providing participant to the data application participant may be the network output obtained by inputting its first data set into its initial update search network, and the third intermediate result sent by the data application participant to the data providing participant may be the gradient corresponding to that network output.
  • Specifically, the data providing participant inputs its first data set into its initial update search network, obtains a network output (called the first network output in this paragraph) through the processing of the initial update search network, and sends the first network output to the data application participant. The data application participant inputs its first data set into its own initial update search network and obtains a network output (called the second network output in this paragraph) through the processing of that network. The data application participant obtains the prediction result from the first network output and the second network output.
  • If the data application participant deploys a back-end network, it may splice the first network output and the second network output and input the result into the back-end network, obtaining the prediction result through the processing of that network.
  • The data application participant calculates the loss function according to the prediction result and its label data, calculates the gradient of the loss function with respect to the model parameters in its initial update search network, and calculates the gradient of the loss function with respect to the first network output, then sends the gradient corresponding to the first network output to the data providing participant. The data providing participant receives the gradient of the first network output and, according to the chain rule and the gradient descent algorithm, calculates from it the gradient of the loss function with respect to the model parameters in its initial update search network, then updates those model parameters according to the gradient to obtain its updated search network. The data application participant likewise updates the model parameters in its initial update search network according to the gradients it calculated, obtaining its updated search network.
  • It should be noted that the participants may jointly perform multiple rounds of parameter updates, and the data used by the participants in each round may differ.
  • In this third step, the search structure parameters in the search network are kept unchanged while the model parameters of the search network are optimized and updated.
  • In this embodiment, each participant alternately updates the search structure parameters and the model parameters in its search network, using the first data set to update the model parameters and the second data set to update the search structure parameters.
  • Using different data sets to update the two kinds of parameters effectively avoids overfitting, thereby improving the success rate of joint modeling and the prediction accuracy of the resulting model.
  • Moreover, the participants never expose the data in their own data sets to each other, ensuring each participant's data security, and each participant only needs to send data three times and receive data three times in each round of jointly updating parameters.
  • In other embodiments, the process of jointly updating parameters in one round can also be divided into three steps as follows.
  • In the first step, each participant first copies the model parameters in its current search network to obtain a copy of the model parameters, and the participants then use their first data sets to jointly update the model parameters in their search networks.
  • In the second step, on the basis of the first step, each participant uses its second data set to jointly update the search structure parameters in its search network.
  • In the third step, each participant replaces the model parameters in the search network updated in the second step with the saved copy of the model parameters, and then uses its first data set to jointly update the model parameters in its search network.
  • Further, a third embodiment of the vertical federated modeling optimization method of the present application is proposed.
  • In this embodiment, the method is applied to a data application participant having label data and deployed with a back-end network, and step S102 includes:
  • Step S1021: receiving the first network output sent by the data providing participant, wherein the data providing participant inputs the second data set of the other end into the initial update copy of the other end to obtain the first network output;
  • In this embodiment, the execution subject is the data application participant (hereinafter referred to as the local end), which also deploys a back-end network.
  • The local end receives the first network output sent by the data providing participant, wherein the data providing participant inputs the second data set of the other end into the initial update copy of the other end to obtain the first network output.
  • Here, the other end refers to the data providing participant.
  • Specifically, the data providing participant inputs the second data set of the other end into the initial update copy of the other end, and obtains the first network output after processing by that initial update copy.
  • Step S1022: inputting the second data set of the local end into the initial update copy of the local end to obtain the second network output, and inputting the first network output and the second network output into the back-end network to obtain the third network output;
  • The local end inputs the second data set of the local end into the initial update copy of the local end to obtain the second network output, then inputs the first network output and the second network output into the back-end network to obtain the third network output, that is, the prediction result.
  • Specifically, the local end may splice the first network output and the second network output; the splicing method can be vector splicing or calculating a weighted average. The splicing result is input into the back-end network, and the third network output is obtained through the processing of the back-end network.
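  • The two splicing methods mentioned above can be sketched in pure Python as follows; the function name `splice` and the default weight of 0.5 are illustrative, and the weighted average assumes both outputs have the same length:

```python
def splice(u_first, u_second, method="concat", weight=0.5):
    """Combine the two network outputs before the back-end network.
    'concat' performs vector splicing; 'weighted' computes an
    element-wise weighted average of same-length vectors."""
    if method == "concat":
        return list(u_first) + list(u_second)
    if method == "weighted":
        return [weight * a + (1 - weight) * b
                for a, b in zip(u_first, u_second)]
    raise ValueError("unknown splicing method: " + method)

print(splice([1.0, 2.0], [3.0, 4.0]))                     # [1.0, 2.0, 3.0, 4.0]
print(splice([1.0, 2.0], [3.0, 4.0], method="weighted"))  # [2.0, 3.0]
```

  • Vector splicing preserves every feature from both parties, while the weighted average keeps the back-end network's input dimension fixed regardless of the number of providers.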
  • Step S1023: calculating, based on the third network output and the label data of the local end, the first gradient of the loss function with respect to the first network output and the second gradient of the loss function with respect to the search structure parameters in the initial update copy of the local end;
  • The local end calculates the loss function according to the third network output and the label data of the local end.
  • For the specific loss function calculation, reference may be made to existing loss function calculations for machine learning models, which will not be described in detail here.
  • The gradients can be calculated according to the chain rule and the gradient descent algorithm.
  • Step S1024: sending the first gradient to the data providing participant, so that the data providing participant can update the search structure parameters in the initial update copy of the other end according to the first gradient;
  • The local end sends the first gradient to the data providing participant. It should be noted that when there are multiple data providing participants, the local end calculates the gradient corresponding to the network output sent by each data providing participant and returns each gradient to the corresponding data providing participant.
  • The data providing participant updates the search structure parameters in the initial update copy of the other end according to the first gradient. Specifically, according to the chain rule and gradient descent, the data providing participant calculates from the first gradient the gradient corresponding to the search structure parameters in the initial update copy of the other end, updates those search structure parameters according to that gradient, and obtains the secondary update copy of the other end.
  • Step S1025: updating the search structure parameters in the local initial update copy according to the second gradient to obtain the local secondary update copy.
  • The local end updates the search structure parameters in the initial update copy of the local end according to the second gradient, and obtains the secondary update copy of the local end.
  • In this embodiment, the participants exchange the intermediate results used to update the search structure parameters of their respective search networks, so that each participant can complete the network structure search without exposing its own data, thereby ensuring data security.
  • Furthermore, the participants do not need to set up their model structures in advance, which lowers the threshold for participating in vertical federated learning.
  • The following example illustrates the process of one round of joint parameter updating.
  • The data application participant is denoted A, and the data providing participant is denoted B.
  • Net_A and Net_B denote the search networks of A and B respectively; W_A and W_B denote the model parameters of Net_A and Net_B; α_A and α_B denote the search structure parameters of Net_A and Net_B, respectively.
  • X_trn^A and X_val^A denote the first data set and the second data set of party A, respectively, and X_trn^B and X_val^B denote the first data set and the second data set of party B.
  • Y_trn denotes the label data corresponding to X_trn^A, and Y_val denotes the label data corresponding to X_val^A.
  • Party B inputs X_trn^B into Net_B to obtain the network output U_trn^B and transmits it to Party A.
  • Optionally, Party B may process U_trn^B with differential privacy or homomorphic encryption before sending it to Party A.
  • Party A inputs X_trn^A into Net_A to obtain the network output U_trn^A, splices U_trn^A with U_trn^B, and inputs the result into the back-end network to obtain Y_out^trn. Party A copies Net_A to obtain Net_A'. Based on Y_trn and Y_out^trn, Party A calculates the gradient of the loss function L with respect to W_A, ∂L/∂W_A, and the gradient with respect to U_trn^B, ∂L/∂U_trn^B, and takes the updated W_A as the model parameters of Net_A', i.e. calculates W_A' = W_A − η·∂L/∂W_A, where η is the learning rate.
  • Party A sends ∂L/∂U_trn^B to Party B.
  • Party B copies Net_B to obtain Net_B' and updates the model parameters W_B' of Net_B' according to the received gradient. Specifically, Party B calculates ∂L/∂W_B according to the chain rule and the gradient descent algorithm and computes W_B' = W_B − η·∂L/∂W_B.
  • Party B inputs X_val^B into Net_B' to obtain the network output U_val^B and transmits it to Party A.
  • Party A inputs X_val^A into Net_A' to obtain the network output U_val^A, splices U_val^A with U_val^B, and inputs the result into the back-end network to obtain Y_out^val. Based on Y_val and Y_out^val, Party A calculates the gradient of the loss function with respect to the search structure parameters α_A' in Net_A', ∂L/∂α_A', and the gradient with respect to U_val^B, ∂L/∂U_val^B. Party A sends ∂L/∂U_val^B to Party B.
  • Party A updates α_A' according to ∂L/∂α_A', i.e. calculates α_A' ← α_A' − η·∂L/∂α_A'. Party B updates the search structure parameters α_B' of Net_B' according to the received gradient; specifically, Party B calculates ∂L/∂α_B' according to the chain rule and the gradient descent algorithm and updates α_B' accordingly.
  • Party B inputs X_trn^B into Net_B to obtain the network output U_trn^B and transmits it to Party A;
  • Party A inputs X_trn^A into Net_A to obtain the network output U_trn^A, splices U_trn^A with U_trn^B, and inputs the result into the back-end network to obtain Y_out^trn. Based on Y_trn and Y_out^trn, Party A calculates the gradient of the loss function with respect to W_A, ∂L/∂W_A, and the gradient with respect to U_trn^B, ∂L/∂U_trn^B. Party A sends ∂L/∂U_trn^B to Party B.
  • Party A updates W_A according to ∂L/∂W_A, i.e. calculates W_A ← W_A − η·∂L/∂W_A. Party B updates W_B according to the received gradient; specifically, Party B calculates ∂L/∂W_B according to the chain rule and the gradient descent algorithm and updates W_B accordingly.
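  • The search-structure update in the second step above can be illustrated with scalars, where Party B's search structure parameter is (hypothetically) a mixing weight over two candidate connection operations; all numbers and the two candidate operations are made up for illustration:

```python
# Scalar sketch of the second step: Party B's search structure
# parameter a_B mixes two hypothetical candidate operations.
eta = 0.1
x_val_B = 2.0
a_B = 0.5
op1 = lambda x: x          # candidate op 1: identity
op2 = lambda x: x * x      # candidate op 2: square
U_val_B = a_B * op1(x_val_B) + (1 - a_B) * op2(x_val_B)  # B's output

# Party A forms the prediction and returns only dL/dU_val_B.
Y_val, U_val_A = 4.0, 0.5
Y_out = U_val_A + U_val_B            # trivial "back-end network": a sum
dL_dU_val_B = Y_out - Y_val          # from L = 0.5 * (Y_out - Y_val)**2

# Party B: chain rule dU/da_B = op1(x) - op2(x), then gradient step.
dU_da_B = op1(x_val_B) - op2(x_val_B)
a_B_new = a_B - eta * dL_dU_val_B * dU_da_B
print(round(a_B_new, 3))             # 0.4
```

  • Party B can perform this structure update knowing only the gradient of the loss with respect to its own output, never seeing Y_val or Party A's data.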
  • Further, a fourth embodiment of the vertical federated modeling optimization method of the present application is proposed.
  • In this embodiment, the method is applied to a data providing participant, and step S102 includes:
  • Step S1026: inputting the second data set of the local end into the initial update copy of the local end to obtain the first network output;
  • In this embodiment, the execution subject is the data providing participant (hereinafter referred to as the local end).
  • The data application participant taking part in vertical federated modeling also deploys a back-end network.
  • The local end inputs the second data set of the local end into the initial update copy of the local end to obtain the first network output. Specifically, the local end inputs its second data set into its initial update copy, and obtains the first network output after processing by that copy. It should be noted that there may be multiple data providing participants; this embodiment takes one data providing participant as an example for illustration.
  • Step S1027: sending the first network output to the data application participant that owns the label data, so that the data application participant inputs its second data set into its initial update copy to obtain the second network output, inputs the first network output and the second network output into the back-end network to obtain the third network output, calculates, based on the third network output and its label data, the first gradient of the loss function with respect to the first network output and the second gradient with respect to the search structure parameters in its initial update copy, and updates the search structure parameters in its initial update copy according to the second gradient, wherein the back-end network is deployed at the data application participant;
  • The local end sends the first network output to the data application participant.
  • The data application participant inputs the second data set of the other end into the initial update copy of the other end to obtain the second network output.
  • Here, the other end refers to the data application participant.
  • Specifically, the data application participant inputs the second data set of the other end into the initial update copy of the other end, and obtains the second network output after processing by that initial update copy.
  • The data application participant splices the first network output and the second network output; the splicing method can be vector splicing or calculating a weighted average. The splicing result is input into the back-end network, and the third network output is obtained through the processing of the back-end network.
  • The data application participant calculates the loss function according to the third network output and the label data of the other end, and calculates the first gradient of the loss function with respect to the first network output and the second gradient of the loss function with respect to the search structure parameters in the initial update copy of the other end.
  • The data application participant updates the search structure parameters in the initial update copy of the other end according to the second gradient, and obtains the secondary update copy of the other end.
  • The data application participant sends the first gradient to the data providing participant.
  • Step S1028: receiving the first gradient sent by the data application participant, and updating the search structure parameters in the local initial update copy according to the first gradient to obtain the local secondary update copy.
  • The local end receives the first gradient sent by the data application participant, and updates the search structure parameters in the initial update copy of the local end according to the first gradient to obtain the secondary update copy of the local end. Specifically, according to the chain rule and the gradient descent algorithm, the local end calculates from the first gradient the gradient corresponding to the search structure parameters in its initial update copy, and updates those search structure parameters according to that gradient.
  • In this embodiment, the participants exchange the intermediate results used to update the search structure parameters of their respective search networks, so that each participant can complete the network structure search without exposing its own data, thereby ensuring data security.
  • Furthermore, the participants do not need to set up their model structures in advance, which lowers the threshold for participating in vertical federated learning.
  • Further, before step S201, the method further includes:
  • Step S203: detecting whether a preset modeling stop condition is currently satisfied;
  • The preset modeling stop condition may be a condition set in advance according to specific needs; for example, modeling may stop when a maximum number of rounds is reached, when a maximum duration is reached, or when the model converges.
  • Detecting model convergence may consist of detecting whether the loss function of the model converges.
  • Since the data set is divided into a first data set and a second data set, and the model parameters and the search structure parameters are optimized separately, the model may be determined to have converged when it converges on the first data set or when it converges on the second data set.
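  • The stop conditions listed above can be sketched as a single check; the function name `should_stop`, the thresholds, and the convergence test (loss change below a tolerance for several consecutive rounds) are illustrative choices, not the patent's prescribed criteria:

```python
def should_stop(round_idx, elapsed_s, loss_history,
                max_rounds=100, max_seconds=3600.0,
                tol=1e-4, patience=3):
    """Preset modeling stop condition: maximum round reached, maximum
    duration reached, or loss converged (change below tol for
    `patience` consecutive rounds)."""
    if round_idx >= max_rounds or elapsed_s >= max_seconds:
        return True
    if len(loss_history) > patience:
        recent = loss_history[-(patience + 1):]
        deltas = [abs(recent[i + 1] - recent[i]) for i in range(patience)]
        return all(d < tol for d in deltas)
    return False

print(should_stop(5, 10.0, [0.9, 0.5, 0.3]))                # False
print(should_stop(5, 10.0, [0.3, 0.3, 0.3, 0.3]))           # True
print(should_stop(100, 10.0, [0.9]))                        # True
```

  • In the federated setting this check could be run on either the first-data-set loss or the second-data-set loss, in line with the paragraph above.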
  • Step S204: if the preset modeling stop condition is satisfied, executing the step of selecting reserved operations from the connection operations according to the updated search structure parameters in the local search network;
  • That is, step S201 and the subsequent operations may be performed to obtain the local target model, whereupon the local vertical federated modeling is complete.
  • Step S205: if the preset modeling stop condition is not satisfied, executing, based on the updated local search network, the step of exchanging with the other participants, based on the local data set, the intermediate results used to update the model parameters and search structure parameters in the respective search networks, and updating the local search network based on the received intermediate results.
  • That is, step S10 and the subsequent operations are continued, i.e., the next round of jointly updating parameters is performed.
  • Further, after step S20, the method further includes:
  • Step S30: receiving the first model output sent by the data providing participant, wherein the data providing participant inputs the user data corresponding to the second risk feature of the target user at the other end into the target model of the other end to obtain the first model output;
  • Each participant may be a device deployed in a bank or another financial institution, and each participant stores the user data recorded by its institution during business processing. Because the specific businesses of different institutions differ, the user data features of the participants may also differ.
  • Each institution can build a data set based on its own data features, and the institutions can use their data sets to jointly conduct vertical federated learning, enriching the feature dimensions of the model to improve its prediction performance.
  • The participants can jointly build a user risk prediction model, which is used to predict a user's risk level in business scenarios such as credit and insurance.
  • As data features, each participant can select, according to practical experience, risk features related to user risk prediction, such as the user's deposit amount and the user's number of defaults.
  • The participants use their own data sets to jointly perform vertical federated modeling according to the method in the above embodiments to obtain their respective target models.
  • After obtaining their respective target models, the participants can jointly carry out risk prediction for users.
  • The data application participant receives the first model output sent by the data providing participant.
  • The data providing participant inputs the user data corresponding to the second risk feature of the target user at the other end into the target model of the other end, and obtains the first model output after processing by that target model.
  • Here, the other end refers to the data providing participant.
  • Step S40: input the user data corresponding to the second risk feature of the target user at the local end into the local target model to obtain the second model output.
  • Step S50: splice the first model output and the second model output, then input the result into the back-end network of the local end to obtain the risk prediction result of the target user.
  • The data application participant inputs the user data corresponding to the second risk feature of the target user at the local end into the local target model, and obtains the second model output after processing by the local target model.
  • The data application participant then splices the first model output and the second model output.
  • The first model output and the second model output can be spliced by vector concatenation, or combined by a weighted average.
  • The splicing result is input into the back-end network at the data application participant's local end, and after processing by the back-end network, the output is the risk prediction result of the target user.
  • The data application participant can send the risk prediction result of the target user to the data providing participant, so that the data providing participant can carry out subsequent business processing according to the risk prediction result, for example, deciding whether to lend to the target user.
  • In this way, each participant only needs to set up its own search network and does not need to spend substantial manpower and material resources designing a carefully tuned model structure. This lowers the threshold for participating in vertical federated learning and makes it more convenient for banks and other financial institutions to carry out joint modeling through vertical federated learning and then complete risk prediction tasks with the jointly built risk prediction model. Moreover, during vertical federated modeling and during risk prediction with the resulting models, the participants never directly exchange their data sets or models, which protects the security of each participant's private user data.
  • An embodiment of the present application further proposes a vertical federated modeling optimization apparatus.
  • The apparatus is deployed on a participant in vertical federated modeling.
  • The apparatus includes:
  • the interaction module 10, configured to interact with other participants, based on the local data set, to exchange intermediate results for updating the model parameters and search structure parameters in the respective search networks, and to update the local search network based on the received intermediate results;
  • the determining module 20, configured to obtain the local target model based on the updated local search network.
  • the data sets of the participants include a first data set and a second data set
  • the interaction module 10 includes:
  • a first interaction unit, configured to interact with other participants, based on the first data set at the local end, to exchange first intermediate results for updating the model parameters in the respective search networks, and to update the copy of the local search network based on the received first intermediate results to obtain the initially updated copy at the local end;
  • a second interaction unit, configured to interact with other participants, based on the second data set at the local end, to exchange second intermediate results for updating the search structure parameters in the respective initially updated copies, and to update the initially updated copy at the local end based on the received second intermediate results to obtain the secondarily updated copy at the local end;
  • an update unit, configured to update the local search network with the search structure parameters in the local secondarily updated copy to obtain the initially updated search network at the local end;
  • a third interaction unit, configured to interact with other participants, based on the first data set at the local end, to exchange third intermediate results for updating the model parameters in the respective initially updated search networks, and to update the initially updated search network at the local end based on the received third intermediate results to obtain the updated local search network.
  • the device is deployed on a data application participant that holds label data;
  • the data application participant is deployed with a back-end network;
  • the second interaction unit includes:
  • a first receiving subunit, configured to receive the first network output sent by the data providing participant, where the data providing participant inputs the second data set of the other end into the initially updated copy of the other end to obtain the first network output;
  • a first input subunit, configured to input the second data set of the local end into the initially updated copy of the local end to obtain the second network output, and to input the first network output and the second network output into the back-end network to obtain the third network output;
  • a calculation subunit, configured to calculate, based on the third network output and the label data of the local end, the first gradient of the loss function relative to the first network output and the second gradient relative to the search structure parameters in the initially updated copy of the local end;
  • a first sending subunit, configured to send the first gradient to the data providing participant, so that the data providing participant can update the search structure parameters in the initially updated copy of the other end according to the first gradient;
  • an update subunit, configured to update the search structure parameters in the initially updated copy of the local end according to the second gradient to obtain the secondarily updated copy of the local end.
  • the apparatus is deployed on a data providing participant, and the second interaction unit includes:
  • a second input subunit, configured to input the second data set of the local end into the initially updated copy of the local end to obtain the first network output;
  • a second sending subunit, configured to send the first network output to the data application participant that holds the label data, so that the data application participant can input the second data set of the other end into the initially updated copy of the other end to obtain the second network output, input the first network output and the second network output into the back-end network to obtain the third network output, calculate, based on the third network output and the label data of the other end, the first gradient of the loss function relative to the first network output and the second gradient relative to the search structure parameters in the initially updated copy of the other end, and update the search structure parameters in the initially updated copy of the other end according to the second gradient, where the back-end network is deployed on the data application participant;
  • a second receiving subunit, configured to receive the first gradient sent by the data application participant, and to update the search structure parameters in the initially updated copy of the local end according to the first gradient to obtain the secondarily updated copy of the local end.
  • the search structure parameter in the search network of the participant includes the weight corresponding to the connection operation between the network elements in the search network
  • the determination module 20 includes:
  • a selection unit configured to select a reserved operation from each connection operation according to the updated search structure parameter in the local search network
  • a determination unit configured to use the model formed by each of the reservation operations and the network units connected to each of the reservation operations as a local target model.
  • the determining module 20 further includes:
  • a detection unit, configured to detect whether the preset modeling stop condition is currently met;
  • the determining module 20 is further configured to: if the preset modeling stop condition is satisfied, execute the step of selecting a reserved operation from each connection operation according to the updated search structure parameters in the local search network; and if the preset modeling stop condition is not satisfied, perform again, based on the updated local search network, the step of interacting with other participants based on the local data set to exchange intermediate results for updating the model parameters and search structure parameters in the respective search networks, and updating the local search network based on the received intermediate results.
  • the device is deployed on a data application participant that holds label data, and the device further includes:
  • a receiving module, configured to receive the first model output sent by the data providing participant, where the data providing participant inputs the user data corresponding to the second risk feature of the target user at the other end into the target model of the other end to obtain the first model output;
  • an input module, configured to input the user data corresponding to the second risk feature of the target user at the local end into the local target model to obtain the second model output;
  • a prediction module, configured to splice the first model output and the second model output and then input the result into the back-end network of the local end to obtain the risk prediction result of the target user.
  • the extended content of the specific implementation of the vertical federated modeling optimization apparatus of the present application is basically the same as that of the above-mentioned embodiments of the vertical federated modeling optimization method, and will not be repeated here.
  • An embodiment of the present application further provides a computer-readable storage medium on which a vertical federated modeling optimization program is stored; when the program is executed by a processor, the steps of the vertical federated modeling optimization method described above are implemented.
  • The methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, and of course also by hardware, but in many cases the former is the better implementation.
  • The technical solution of the present application, in essence or in the part that contributes over the prior art, can be embodied in the form of a software product stored in a storage medium (such as ROM/RAM, a magnetic disk, or a CD) and including several instructions that cause a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, etc.) to execute the methods described in the embodiments of this application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Geometry (AREA)
  • Computer Hardware Design (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A vertical federated modeling optimization method, apparatus, device, and readable storage medium. The method includes: a participant in vertical federated learning interacts with other participants, based on its local data set, to exchange intermediate results for updating the model parameters and search structure parameters in the respective search networks, and updates its local search network based on the received intermediate results (S10); and obtains its local target model based on the updated local search network (S20). With this method, participants do not need to determine their model structures in advance when modeling with vertical federation technology, which greatly lowers the threshold for participating in vertical federated learning.

Description

Vertical federated modeling optimization method, apparatus, device, and readable storage medium
This application claims priority to Chinese patent application No. 202010663980.9, filed on July 10, 2020 and entitled "Vertical federated modeling optimization method, apparatus, device, and readable storage medium", which is hereby incorporated by reference in its entirety.
Technical Field
This application relates to the technical field of artificial intelligence, and in particular to a vertical federated modeling optimization method, apparatus, device, and readable storage medium.
Background
With the development of artificial intelligence, the concept of "federated learning" has been proposed to solve the problem of data silos, so that the parties to a federation can train a model and obtain model parameters without handing over their own data, and the problem of data privacy leakage can be avoided.
In vertical federated learning, when the data features of the participants overlap little but their users overlap substantially, the users and data for which the users are the same but the data features differ are taken out to jointly train a machine learning model. For example, consider two participants A and B in the same region, where participant A is a bank and participant B is an e-commerce platform. A and B have many common users in the region, but their businesses differ, so the user data features they record are different; in particular, the user data features recorded by A and B may be complementary. In such a scenario, vertical federated learning can be used to help A and B build a joint machine learning prediction model and provide better services to their customers.
However, at present the participants in vertical federated learning need to design their respective model structures in advance when using vertical federation technology, and because even a slight difference in the designed model structure may greatly affect the overall performance of vertical federated learning, the threshold for participating in vertical federated learning is high, which limits the range of application of vertical federated learning to concrete task domains.
Summary
The main purpose of this application is to provide a vertical federated modeling optimization method, apparatus, device, and readable storage medium, aiming to solve the problem that the participants in vertical federated learning currently need to design their respective model structures in advance when using vertical federation technology, which makes the threshold for participating in vertical federated learning high.
To achieve the above purpose, this application provides a vertical federated modeling optimization method. The method is applied to the participants in vertical federated modeling, each participant being deployed with a data set and a search network built on the basis of its own data features, and the method includes the following steps:
interacting with other participants, based on the local data set, to exchange intermediate results for updating the model parameters and search structure parameters in the respective search networks, and updating the local search network based on the received intermediate results;
obtaining the local target model based on the updated local search network.
In an embodiment, the data set of a participant includes a first data set and a second data set, and the step of interacting with other participants based on the local data set to exchange intermediate results for updating the model parameters and search structure parameters in the respective search networks, and updating the local search network based on the received intermediate results, includes:
interacting with other participants, based on the first data set at the local end, to exchange first intermediate results for updating the model parameters in the respective search networks, and updating the copy of the local search network based on the received first intermediate results to obtain the initially updated copy at the local end;
interacting with other participants, based on the second data set at the local end, to exchange second intermediate results for updating the search structure parameters in the respective initially updated copies, and updating the initially updated copy at the local end based on the received second intermediate results to obtain the secondarily updated copy at the local end;
updating the local search network with the search structure parameters in the secondarily updated copy at the local end to obtain the initially updated search network at the local end;
interacting with other participants, based on the first data set at the local end, to exchange third intermediate results for updating the model parameters in the respective initially updated search networks, and updating the initially updated search network at the local end based on the received third intermediate results to obtain the updated local search network.
In an embodiment, the method is applied to a data application participant that holds label data, the data application participant is deployed with a back-end network, and the step of interacting with other participants based on the second data set at the local end to exchange second intermediate results for updating the search structure parameters in the respective initially updated copies, and updating the initially updated copy at the local end based on the received second intermediate results to obtain the secondarily updated copy at the local end, includes:
receiving the first network output sent by a data providing participant, where the data providing participant inputs the second data set of the other end into the initially updated copy of the other end to obtain the first network output;
inputting the second data set of the local end into the initially updated copy of the local end to obtain the second network output, and inputting the first network output and the second network output into the back-end network to obtain the third network output;
calculating, based on the third network output and the label data of the local end, the first gradient of the loss function relative to the first network output and the second gradient relative to the search structure parameters in the initially updated copy of the local end;
sending the first gradient to the data providing participant, so that the data providing participant updates the search structure parameters in the initially updated copy of the other end according to the first gradient;
updating the search structure parameters in the initially updated copy of the local end according to the second gradient to obtain the secondarily updated copy of the local end.
In an embodiment, the method is applied to a data providing participant, and the step of interacting with other participants based on the second data set at the local end to exchange second intermediate results for updating the search structure parameters in the respective initially updated copies, and updating the initially updated copy at the local end based on the received second intermediate results to obtain the secondarily updated copy at the local end, includes:
inputting the second data set of the local end into the initially updated copy of the local end to obtain the first network output;
sending the first network output to the data application participant that holds the label data, so that the data application participant inputs the second data set of the other end into the initially updated copy of the other end to obtain the second network output, inputs the first network output and the second network output into the back-end network to obtain the third network output, calculates, based on the third network output and the label data of the other end, the first gradient of the loss function relative to the first network output and the second gradient relative to the search structure parameters in the initially updated copy of the other end, and updates the search structure parameters in the initially updated copy of the other end according to the second gradient, where the back-end network is deployed on the data application participant;
receiving the first gradient sent by the data application participant, and updating the search structure parameters in the initially updated copy of the local end according to the first gradient to obtain the secondarily updated copy of the local end.
In an embodiment, the search structure parameters in a participant's search network include the weights corresponding to the connection operations between the network units in the search network, and the step of obtaining the local target model based on the updated local search network includes:
selecting a reserved operation from each connection operation according to the search structure parameters in the updated local search network;
taking the model formed by the reserved operations and the network units connected by the reserved operations as the local target model.
In an embodiment, before the step of selecting a reserved operation from each connection operation according to the search structure parameters in the updated local search network, the method further includes:
detecting whether the preset modeling stop condition is currently satisfied;
if the preset modeling stop condition is satisfied, executing the step of selecting a reserved operation from each connection operation according to the search structure parameters in the updated local search network;
if the preset modeling stop condition is not satisfied, performing again, based on the updated local search network, the step of interacting with other participants based on the local data set to exchange intermediate results for updating the model parameters and search structure parameters in the respective search networks, and updating the local search network based on the received intermediate results.
In an embodiment, the method is applied to a data application participant that holds label data, and after the step of obtaining the local target model based on the updated local search network, the method further includes:
receiving the first model output sent by a data providing participant, where the data providing participant inputs the user data corresponding to the second risk feature of the target user at the other end into the target model of the other end to obtain the first model output;
inputting the user data corresponding to the second risk feature of the target user at the local end into the local target model to obtain the second model output;
splicing the first model output and the second model output and then inputting the result into the back-end network of the local end to obtain the risk prediction result of the target user.
To achieve the above purpose, this application provides a vertical federated modeling optimization apparatus, deployed on a participant in vertical federated modeling, each participant being deployed with a data set and a search network built on the basis of its own data features, the apparatus including:
an interaction module, configured to interact with other participants, based on the local data set, to exchange intermediate results for updating the model parameters and search structure parameters in the respective search networks, and to update the local search network based on the received intermediate results;
a determining module, configured to obtain the local target model based on the updated local search network.
To achieve the above purpose, this application further provides a vertical federated modeling optimization device, including a memory, a processor, and a vertical federated modeling optimization program stored on the memory and executable on the processor, where the program, when executed by the processor, implements the steps of the vertical federated modeling optimization method described above.
In addition, to achieve the above purpose, this application further provides a computer-readable storage medium on which a vertical federated modeling optimization program is stored, where the program, when executed by a processor, implements the steps of the vertical federated modeling optimization method described above.
In this application, a data set and a search network built on the basis of each participant's own data features are deployed on each participant in vertical federated learning; the participants use their own data sets to compute and exchange with one another intermediate results for updating the model parameters and search structure parameters in their respective search networks, update their search networks based on the intermediate results they receive, and obtain their respective target models based on the updated search networks. Compared with existing vertical federated learning, in which the participants must manually spend substantial manpower and material resources designing model structures in advance, this application allows each participant merely to set up its own search network: the connections between the network units in the search network, that is, the model structure, are determined automatically during vertical federated modeling by optimizing and updating the search structure parameters. Automatic vertical federated learning is thus achieved without spending substantial manpower and material resources on presetting the model structure, which lowers the threshold for participating in vertical federated learning, allows vertical federated learning to be applied to a wider range of concrete task domains, and broadens its scope of application. Moreover, during vertical federated modeling in this application, the participants do not directly exchange their data sets or the models themselves; they exchange only the intermediate results used for updating the model parameters and search structure parameters, which protects each participant's data security and model information security.
Brief Description of the Drawings
FIG. 1 is a schematic structural diagram of the hardware operating environment involved in the embodiments of this application;
FIG. 2 is a schematic flowchart of the first embodiment of the vertical federated modeling optimization method of this application;
FIG. 3 is a schematic diagram of participants jointly updating model parameters according to an embodiment of this application;
FIG. 4 is a schematic diagram of participants jointly updating search structure parameters according to an embodiment of this application;
FIG. 5 is a schematic diagram of participants jointly updating model parameters according to an embodiment of this application;
FIG. 6 is a functional module diagram of a preferred embodiment of the vertical federated modeling optimization apparatus of this application.
The realization of the purpose, functional features, and advantages of this application will be further described with reference to the accompanying drawings in conjunction with the embodiments.
Detailed Description
It should be understood that the specific embodiments described here are only used to explain this application and are not intended to limit it.
As shown in FIG. 1, FIG. 1 is a schematic structural diagram of the device in the hardware operating environment involved in the embodiments of this application.
It should be noted that the vertical federated modeling optimization device in the embodiments of this application may be a smartphone, a personal computer, a server, or a similar device, and is not specifically limited here. The device may be a participant in vertical federated modeling, and each participant is deployed with a data set and a search network built on the basis of its own data features.
As shown in FIG. 1, the vertical federated modeling optimization device may include a processor 1001 such as a CPU, a network interface 1004, a user interface 1003, a memory 1005, and a communication bus 1002. The communication bus 1002 is used to implement connection and communication between these components. The user interface 1003 may include a display and an input unit such as a keyboard, and optionally may further include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (such as a Wi-Fi interface). The memory 1005 may be a high-speed RAM memory or a stable non-volatile memory, such as a disk memory; optionally, it may also be a storage device independent of the processor 1001.
Those skilled in the art can understand that the device structure shown in FIG. 1 does not constitute a limitation on the vertical federated modeling optimization device, which may include more or fewer components than shown, combine certain components, or arrange the components differently.
As shown in FIG. 1, the memory 1005, as a computer storage medium, may include an operating system, a network communication module, a user interface module, and a vertical federated modeling optimization program. The operating system is a program that manages and controls the hardware and software resources of the device and supports the running of the vertical federated modeling optimization program and other software or programs. In the device shown in FIG. 1, the user interface 1003 is mainly used for data communication with a client; the network interface 1004 is mainly used to establish communication connections with the other participants in vertical federated modeling; and the processor 1001 may be used to call the vertical federated modeling optimization program stored in the memory 1005 and perform the following operations:
interacting with other participants, based on the local data set, to exchange intermediate results for updating the model parameters and search structure parameters in the respective search networks, and updating the local search network based on the received intermediate results;
obtaining the local target model based on the updated local search network.
In an embodiment, the data set of a participant includes a first data set and a second data set, and the step of interacting with other participants based on the local data set to exchange intermediate results for updating the model parameters and search structure parameters in the respective search networks, and updating the local search network based on the received intermediate results, includes:
interacting with other participants, based on the first data set at the local end, to exchange first intermediate results for updating the model parameters in the respective search networks, and updating the copy of the local search network based on the received first intermediate results to obtain the initially updated copy at the local end;
interacting with other participants, based on the second data set at the local end, to exchange second intermediate results for updating the search structure parameters in the respective initially updated copies, and updating the initially updated copy at the local end based on the received second intermediate results to obtain the secondarily updated copy at the local end;
updating the local search network with the search structure parameters in the secondarily updated copy at the local end to obtain the initially updated search network at the local end;
interacting with other participants, based on the first data set at the local end, to exchange third intermediate results for updating the model parameters in the respective initially updated search networks, and updating the initially updated search network at the local end based on the received third intermediate results to obtain the updated local search network.
In an embodiment, when the vertical federated modeling optimization device is a data application participant that holds label data, the data application participant is deployed with a back-end network, and the step of interacting with other participants based on the second data set at the local end to exchange second intermediate results for updating the search structure parameters in the respective initially updated copies, and updating the initially updated copy at the local end based on the received second intermediate results to obtain the secondarily updated copy at the local end, includes:
receiving the first network output sent by a data providing participant, where the data providing participant inputs the second data set of the other end into the initially updated copy of the other end to obtain the first network output;
inputting the second data set of the local end into the initially updated copy of the local end to obtain the second network output, and inputting the first network output and the second network output into the back-end network to obtain the third network output;
calculating, based on the third network output and the label data of the local end, the first gradient of the loss function relative to the first network output and the second gradient relative to the search structure parameters in the initially updated copy of the local end;
sending the first gradient to the data providing participant, so that the data providing participant updates the search structure parameters in the initially updated copy of the other end according to the first gradient;
updating the search structure parameters in the initially updated copy of the local end according to the second gradient to obtain the secondarily updated copy of the local end.
In an embodiment, when the vertical federated modeling optimization device is a data providing participant, the step of interacting with other participants based on the second data set at the local end to exchange second intermediate results for updating the search structure parameters in the respective initially updated copies, and updating the initially updated copy at the local end based on the received second intermediate results to obtain the secondarily updated copy at the local end, includes:
inputting the second data set of the local end into the initially updated copy of the local end to obtain the first network output;
sending the first network output to the data application participant that holds the label data, so that the data application participant inputs the second data set of the other end into the initially updated copy of the other end to obtain the second network output, inputs the first network output and the second network output into the back-end network to obtain the third network output, calculates, based on the third network output and the label data of the other end, the first gradient of the loss function relative to the first network output and the second gradient relative to the search structure parameters in the initially updated copy of the other end, and updates the search structure parameters in the initially updated copy of the other end according to the second gradient, where the back-end network is deployed on the data application participant;
receiving the first gradient sent by the data application participant, and updating the search structure parameters in the initially updated copy of the local end according to the first gradient to obtain the secondarily updated copy of the local end.
In an embodiment, the search structure parameters in a participant's search network include the weights corresponding to the connection operations between the network units in the search network, and the step of obtaining the local target model based on the updated local search network includes:
selecting a reserved operation from each connection operation according to the search structure parameters in the updated local search network;
taking the model formed by the reserved operations and the network units connected by the reserved operations as the local target model.
In an embodiment, before the step of selecting a reserved operation from each connection operation according to the search structure parameters in the updated local search network, the processor 1001 may further be used to call the vertical federated modeling optimization program stored in the memory 1005 and perform the following operations:
detecting whether the preset modeling stop condition is currently satisfied;
if the preset modeling stop condition is satisfied, executing the step of selecting a reserved operation from each connection operation according to the search structure parameters in the updated local search network;
if the preset modeling stop condition is not satisfied, performing again, based on the updated local search network, the step of interacting with other participants based on the local data set to exchange intermediate results for updating the model parameters and search structure parameters in the respective search networks, and updating the local search network based on the received intermediate results.
In an embodiment, the method is applied to a data application participant that holds label data, and after the step of obtaining the local target model based on the updated local search network, the processor 1001 may further be used to call the vertical federated modeling optimization program stored in the memory 1005 and perform the following operations:
receiving the first model output sent by a data providing participant, where the data providing participant inputs the user data corresponding to the second risk feature of the target user at the other end into the target model of the other end to obtain the first model output;
inputting the user data corresponding to the second risk feature of the target user at the local end into the local target model to obtain the second model output;
splicing the first model output and the second model output and then inputting the result into the back-end network of the local end to obtain the risk prediction result of the target user.
Based on the above structure, embodiments of the vertical federated modeling optimization method are proposed.
Referring to FIG. 2, FIG. 2 is a schematic flowchart of the first embodiment of the vertical federated modeling optimization method of this application. It should be noted that although a logical order is shown in the flowchart, in some cases the steps shown or described may be performed in an order different from the one here. The method is applied to the participants in vertical federated learning; each participant is deployed with a data set and a search network built on the basis of its own data features, and a participant may be a smartphone, a personal computer, a server, or a similar device. In this embodiment, the vertical federated modeling optimization method includes:
Step S10: interact with other participants, based on the local data set, to exchange intermediate results for updating the model parameters and search structure parameters in the respective search networks, and update the local search network based on the received intermediate results.
In this embodiment, the participants in vertical federated learning fall into two categories: data application participants that hold label data, and data providing participants that do not. In general, there is one data application participant and one or more data providing participants. Each participant is deployed with a data set and a search network built on the basis of its own data features. The sample dimensions of the participants' data sets are aligned, that is, the sample IDs of the data sets are the same, but the data features of the participants may differ. The participants may construct sample-aligned data sets in advance using encrypted sample alignment, which is not described in detail here. The search network is a network used for network architecture search (NAS); in this embodiment, each participant's search network may be a network designed in advance according to the DARTS (Differentiable Architecture Search) method.
The search network includes multiple units, each unit corresponding to a network layer, and connection operations are set between some of the units. Taking two of the units as an example, the connection between them may consist of N preset candidate connection operations, each with a corresponding weight; these weights are the search structure parameters of the search network, while the network layer parameters inside the units are the model parameters of the search network. During model training, network architecture search is performed to optimize and update the search structure parameters and the model parameters, and the final network structure, that is, which connection operation or operations to retain, can be determined from the finally updated search structure parameters. Because the structure of the network is determined only after the network search, the participants do not need to design the network structure of the model as they would for a traditional vertical federated learning model, which reduces the difficulty of designing the model.
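A minimal sketch of the DARTS-style weighted mixture over candidate connection operations described above; the candidate operation set, the softmax weighting, and all values here are illustrative assumptions, not taken from the application:

```python
import numpy as np

def mixed_edge(x, ops, alphas):
    """Output of one edge: softmax(alpha)-weighted sum of candidate ops."""
    w = np.exp(alphas - alphas.max())
    w = w / w.sum()  # softmax over the candidate connection operations
    return sum(wk * op(x) for wk, op in zip(w, ops))

# Three hypothetical candidate connection operations between two units.
ops = [
    lambda x: x,                    # identity / skip connection
    lambda x: np.maximum(x, 0.0),   # ReLU-style transform
    lambda x: np.zeros_like(x),     # "zero" op, effectively pruning the edge
]
alphas = np.array([2.0, 0.5, -1.0])  # search structure parameters of this edge
x = np.array([-1.0, 2.0])
out = mixed_edge(x, ops, alphas)
```

As the search structure parameter of one operation grows relative to the others, the edge output approaches that single operation's output, which is what makes discrete architecture selection differentiable.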
Further, the data application participant may also deploy a back-end network set up for the specific model prediction task. The back-end network is connected after the search networks of all the participants, that is, it takes the output data of the search networks as its input data. The back-end network may use a fully connected layer or another, more complex neural network structure, depending on the specific prediction task.
In this embodiment, the executing entity may be a data application participant or a data providing participant. To distinguish the executing party from the other participants, the executing party is called the local end below.
The local end may interact with the other participants, based on the local data set, to exchange intermediate results for updating the model parameters and search structure parameters in the respective search networks, and update the model parameters and search structure parameters in the local search network based on the received intermediate results; updating the parameters updates the local search network. It should be noted that in the embodiments below, updating the model parameters and/or search structure parameters of a search network is what is meant by updating the search network. Each participant's search network includes model parameters and search structure parameters; before joint training the parameters are initialized, and during joint training the parties perform multiple rounds of updates to their respective model parameters and search structure parameters. What the participants exchange is not their data sets but the intermediate results used for updating the model parameters and search structure parameters in their respective search networks. That is, when a participant updates its own parameters, it needs the other participants' data; therefore, each participant can compute the intermediate results that the other participants need to update their parameters and transmit them to those parties, thereby helping them update their parameters. An intermediate result may be a gradient of a parameter, or the output data of a search network. Specifically, when a participant is a data providing participant, the intermediate result it sends may be the output data of its search network; when a participant is a data application participant, the intermediate result it sends may be the computed gradient corresponding to the output data sent by the data provider. Because intermediate results rather than the raw data in the data sets are transmitted, the participants do not leak their data privacy to one another, and each participant's data security is protected.
The participants can perform multiple rounds of jointly updating the parameters. In one implementation, one round may consist of the participants jointly using their data sets to update the search structure parameters and the model parameters at the same time. Specifically, the data providing participant inputs its data set into its search network, obtains a network output (called the first network output in this paragraph) after processing by the search network, and sends it to the data application participant; the data application participant inputs its data set into its search network and obtains a network output (called the second network output in this paragraph); the data application participant obtains a prediction result from the first and second network outputs; specifically, if it deploys a back-end network, it may splice the first and second network outputs and input the result into the back-end network, whose processing yields the prediction result. The data application participant computes the loss function from the prediction result and its label data; the loss function may be, for example, the mean squared error for a regression problem or the cross-entropy loss for a classification problem. It computes the gradients of the loss with respect to its own model parameters and search structure parameters, as well as the gradient of the loss with respect to the first network output, and sends the latter to the data providing participant. The data providing participant receives the gradient of the first network output and, following the chain rule and the gradient descent algorithm, computes from it the gradients of the loss with respect to its own model parameters and search structure parameters, and updates them accordingly; the data application participant likewise updates its model parameters and search structure parameters with the gradients it computed, completing one round of jointly updating the parameters. Here, the intermediate result that the data providing participant sends to the data application participant is the first network output, and the intermediate result that the data application participant sends to the data providing participant is the gradient corresponding to the first network output.
It should be noted that a participant may use different data sets in different rounds of joint parameter updating. Specifically, a participant may divide its full data set into multiple small training sets (also called data batches) and use one small data set per round, or it may sample, with replacement, a batch of data from the full data set before each round of joint parameter updating.
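The two batching strategies just described can be sketched as follows; the function names and the shared-seed coordination are our own assumptions (the application does not specify how the parties keep their per-round sample IDs aligned, but in vertical federation they must select the same samples each round):

```python
import random

def fixed_batches(sample_ids, batch_size):
    """Pre-split the aligned sample IDs into consecutive data batches."""
    return [sample_ids[i:i + batch_size]
            for i in range(0, len(sample_ids), batch_size)]

def sample_batch(sample_ids, batch_size, seed):
    """Draw one batch by sampling with replacement before a round.

    A shared seed (hypothetical) keeps all parties drawing identical IDs.
    """
    rng = random.Random(seed)
    return [rng.choice(sample_ids) for _ in range(batch_size)]

ids = list(range(10))
batches = fixed_batches(ids, 4)
round_batch = sample_batch(ids, 4, seed=42)
```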
In one round of jointly updating the parameters, the model parameters and the network structure of each participant are updated at the same time, and each participant needs to perform only one data send and one data receive; data communication is minimal and communication efficiency is high, which greatly improves the efficiency of vertical federated learning.
Further, in another implementation, a participant's data set may be divided into two sets, a first data set and a second data set; the first data set can serve as a training set and the second as a validation set. One round of jointly updating the parameters can then be divided into two steps: in the first step, the participants jointly use their respective first data sets to update the model parameters in their search networks, and building on that update, in the second step the participants jointly use their respective second data sets to update the search structure parameters in their search networks.
Specifically, in the first step, the data providing participant inputs its first data set into its search network, obtains a network output (called the first network output in this paragraph), and sends it to the data application participant; the data application participant inputs its first data set into its search network and obtains a network output (called the second network output in this paragraph); the data application participant obtains a prediction result from the first and second network outputs; specifically, if it deploys a back-end network, it may splice the first and second network outputs and input the result into the back-end network, whose processing yields the prediction result. The data application participant computes the loss function from the prediction result and its label data, computes the gradient of the loss with respect to its model parameters and the gradient of the loss with respect to the first network output, and sends the latter to the data providing participant. The data providing participant receives the gradient of the first network output and, following the chain rule and the gradient descent algorithm, computes from it the gradient of the loss with respect to its own model parameters and updates them; the data application participant likewise updates its own model parameters with the gradient it computed.
In the second step, the data providing participant inputs its second data set into its search network, obtains a network output (called the first network output in this paragraph), and sends it to the data application participant; the data application participant inputs its second data set into its search network and obtains a network output (called the second network output in this paragraph); the data application participant obtains a prediction result from the first and second network outputs; specifically, if it deploys a back-end network, it may splice the first and second network outputs and input the result into the back-end network, whose processing yields the prediction result. The data application participant computes the loss function from the prediction result and its label data, computes the gradient of the loss with respect to its search structure parameters and the gradient of the loss with respect to the first network output, and sends the latter to the data providing participant. The data providing participant receives the gradient of the first network output and, following the chain rule and the gradient descent algorithm, computes from it the gradient of the loss with respect to its own search structure parameters and updates them; the data application participant likewise updates its own search structure parameters with the gradient it computed, completing one round of parameter updating.
In one round of jointly updating the parameters, the participants first jointly update the model parameters of their search networks with their first data sets and then jointly update the search structure parameters with their second data sets, which reduces the possibility of overfitting.
It should be noted that if the data application participant deploys a back-end network, then when updating the model parameters of its search network, that party also computes the gradients of the model parameters of its back-end network and updates the back-end network according to those gradients.
Step S20: obtain the local target model based on the updated local search network.
After updating the local search network, the local end obtains the local target model from the updated local search network. Specifically, after multiple rounds of jointly updating the parameters, the local end may obtain the local target model from the search network produced by the last round of updating. In one implementation, the local search network may be used directly as the local target model, and its search structure parameters also serve as model parameters of the local target model.
Further, step S20 includes:
Step S201: select a reserved operation from each connection operation according to the search structure parameters in the updated local search network;
Step S202: take the model formed by the reserved operations and the network units connected by the reserved operations as the local target model.
The search structure parameters in a participant's search network may include the weights corresponding to the connection operations between the network units in the search network. That is, connection operations are set between network units, and each connection operation corresponds to a weight. It should be noted that connection operations are not set between every pair of network units. The local end can select reserved operations from the connection operations according to the search structure parameters in the updated local search network. Specifically, for every two network units with connection operations between them, there are multiple connection operations, and the one or more connection operations with the largest weights can be selected from them as the reserved operations.
After the reserved operations are determined, the model formed by the reserved operations and the network units they connect is used as the local target model. It should be noted that if the local end is a data application participant deployed with a back-end network, the local target model also includes the back-end network.
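Steps S201 and S202 can be sketched as follows, assuming the simple rule of keeping the single highest-weight operation per edge; the edge names and operation set are hypothetical:

```python
import numpy as np

def select_reserved_ops(edge_alphas, op_names):
    """Keep, for each connected pair of units, the candidate connection
    operation whose search-structure weight is largest."""
    return {edge: op_names[int(np.argmax(alphas))]
            for edge, alphas in edge_alphas.items()}

# Hypothetical edges of a tiny search network and their learned weights.
edge_alphas = {
    ("unit0", "unit1"): np.array([0.1, 1.7, -0.3]),
    ("unit1", "unit2"): np.array([2.2, 0.0, 0.4]),
}
op_names = ["skip", "conv3x3", "zero"]
reserved = select_reserved_ops(edge_alphas, op_names)
# reserved maps each edge to its retained operation
```

The retained operations plus the units they connect then form the discrete target model; a variant that keeps the top-k operations per edge would replace `argmax` with a partial sort.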
In this embodiment, a data set and a search network built on the basis of each participant's own data features are deployed on each participant in vertical federated learning; the participants use their own data sets to compute and exchange with one another intermediate results for updating the model parameters and search structure parameters in their respective search networks, update their search networks based on the intermediate results they receive, and obtain their respective target models based on the updated search networks. Compared with existing vertical federated learning, in which the participants must manually spend substantial manpower and material resources designing model structures in advance, this embodiment allows each participant merely to set up its own search network: the connections between the network units in the search network, that is, the model structure, are determined automatically during vertical federated modeling by optimizing and updating the search structure parameters. Automatic vertical federated learning is thus achieved without spending substantial manpower and material resources on presetting the model structure, which lowers the threshold for participating in vertical federated learning, allows vertical federated learning to be applied to a wider range of concrete task domains, and broadens its scope of application. Moreover, during vertical federated modeling in this embodiment, the participants do not directly exchange their data sets or the models themselves; they exchange only the intermediate results used for updating the model parameters and search structure parameters, which protects each participant's data security and model information security.
Further, based on the first embodiment above, a second embodiment of the vertical federated modeling optimization method of this application is proposed. In this embodiment, the data set of a participant includes a first data set and a second data set, and step S10 includes:
Step S101: interact with other participants, based on the first data set at the local end, to exchange first intermediate results for updating the model parameters in the respective search networks, and update the copy of the local search network based on the received first intermediate results to obtain the initially updated copy at the local end.
In this embodiment, one round of jointly updating the parameters can be divided into three steps. In the first step, the participants use their respective first data sets to jointly update the model parameters in the copies of their search networks; in the second step, building on the first, the participants use their respective second data sets to jointly update the search structure parameters in their copies, completing one network structure search; in the third step, each participant takes the search structure parameters of the copy updated in the second step as the search structure parameters of its own search network, and the participants then use their respective first data sets to jointly update the model parameters of their search networks, completing one model parameter update.
In the first step, the local end interacts with the other participants, based on the local first data set, to exchange first intermediate results for updating the model parameters in the respective search network copies, and updates the copy of the local search network based on the received first intermediate results to obtain the initially updated copy at the local end. Before a round of joint updating, each participant may first copy its current search network to obtain a copy. Specifically, the first intermediate result that a data providing participant sends to the data application participant may be the network output obtained by inputting its first data set into its search network, and the first intermediate result that the data application participant sends to the data providing participant may be the gradient corresponding to that network output.
That is, in the first step, the data providing participant inputs its first data set into its search network, obtains a network output (called the first network output in this paragraph), and sends it to the data application participant; the data application participant inputs its first data set into its search network and obtains a network output (called the second network output in this paragraph); the data application participant obtains a prediction result from the first and second network outputs; specifically, if it deploys a back-end network, it may splice the first and second network outputs and input the result into the back-end network, whose processing yields the prediction result. The data application participant computes the loss function from the prediction result and its label data, computes the gradient of the loss with respect to its model parameters and the gradient of the loss with respect to the first network output, and sends the latter to the data providing participant. The data providing participant receives the gradient of the first network output and, following the chain rule and the gradient descent algorithm, computes from it the gradient of the loss with respect to the model parameters in its search network copy and updates those model parameters, obtaining its initially updated copy; the data application participant likewise updates the model parameters in its copy with the gradient it computed, obtaining its own initially updated copy.
Step S102: interact with other participants, based on the second data set at the local end, to exchange second intermediate results for updating the search structure parameters in the respective initially updated copies, and update the initially updated copy at the local end based on the received second intermediate results to obtain the secondarily updated copy at the local end.
In the second step, the local end interacts with the other participants, based on the local second data set, to exchange second intermediate results for updating the search structure parameters in the respective initially updated copies, and updates the local initially updated copy based on the received second intermediate results to obtain the secondarily updated copy at the local end. Specifically, the second intermediate result that the data providing participant sends to the data application participant may be the network output obtained by inputting its second data set into its initially updated copy, and the second intermediate result that the data application participant sends to the data providing participant may be the gradient corresponding to that network output.
That is, in the second step, the data providing participant inputs its second data set into its initially updated copy, obtains a network output (called the first network output in this paragraph), and sends it to the data application participant; the data application participant inputs its second data set into its initially updated copy and obtains a network output (called the second network output in this paragraph); the data application participant obtains a prediction result from the first and second network outputs; specifically, if it deploys a back-end network, it may splice the first and second network outputs and input the result into the back-end network, whose processing yields the prediction result. The data application participant computes the loss function from the prediction result and its label data, computes the gradient of the loss with respect to the search structure parameters in its initially updated copy and the gradient of the loss with respect to the first network output, and sends the latter to the data providing participant. The data providing participant receives the gradient of the first network output and, following the chain rule and the gradient descent algorithm, computes from it the gradient of the loss with respect to the search structure parameters in its initially updated copy and updates those search structure parameters, obtaining its secondarily updated copy; the data application participant likewise updates the search structure parameters in its initially updated copy with the gradient it computed, obtaining its own secondarily updated copy.
By performing a single model parameter update in the first step, the process of finding the optimal model parameters is approximated, instead of training to convergence to fully solve the inner optimization; this reduces the number of joint model parameter updates between the participants and thus improves the efficiency of vertical federated modeling.
Step S103: update the local search network with the search structure parameters in the secondarily updated copy at the local end to obtain the initially updated search network at the local end.
Step S104: interact with other participants, based on the first data set at the local end, to exchange third intermediate results for updating the model parameters in the respective initially updated search networks, and update the initially updated search network at the local end based on the received third intermediate results to obtain the updated local search network.
In the third step, the local end first updates the local search network with the search structure parameters in the local secondarily updated copy to obtain the initially updated search network at the local end. Specifically, a participant replaces the search structure parameters in its current search network with those in its secondarily updated copy, thereby updating its search network. That is, compared with the search network before this round of joint updating, the initially updated search network has changed search structure parameters and unchanged model parameters; on the basis of unchanged model parameters, one network structure search has been completed and the search structure parameters have been updated, optimizing the structure of the search network.
After obtaining the initially updated search network, the local end interacts with the other participants, based on the local first data set, to exchange third intermediate results for updating the model parameters in the respective initially updated search networks, and updates the local initially updated search network based on the received third intermediate results to obtain the updated local search network. The third intermediate result that the data providing participant sends to the data application participant may be the network output obtained by inputting its first data set into its initially updated search network, and the third intermediate result that the data application participant sends to the data providing participant may be the gradient corresponding to that network output.
Specifically, the data providing participant inputs its first data set into its initially updated search network, obtains a network output (called the first network output in this paragraph), and sends it to the data application participant; the data application participant inputs its first data set into its initially updated search network and obtains a network output (called the second network output in this paragraph); the data application participant obtains a prediction result from the first and second network outputs; specifically, if it deploys a back-end network, it may splice the first and second network outputs and input the result into the back-end network, whose processing yields the prediction result. The data application participant computes the loss function from the prediction result and its label data, computes the gradient of the loss with respect to the model parameters in its initially updated search network and the gradient of the loss with respect to the first network output, and sends the latter to the data providing participant. The data providing participant receives the gradient of the first network output and, following the chain rule and the gradient descent algorithm, computes from it the gradient of the loss with respect to the model parameters in its initially updated search network and updates those model parameters, obtaining its updated search network; the data application participant likewise updates the model parameters in its initially updated search network with the gradient it computed, obtaining its own updated search network.
Further, the participants may jointly perform multiple rounds of parameter updating, and the data sets used by a participant may differ between rounds. In the third step, the search structure parameters of the search network are kept unchanged while the model parameters are optimized and updated.
In this embodiment, through multiple rounds of joint parameter updating, the participants alternately update the search structure parameters and the model parameters in their search networks; the model parameters are updated with the first data set and the search structure parameters with the second data set. Using different data sets to update the two kinds of parameters effectively avoids overfitting, which improves the success rate of joint modeling and the prediction accuracy of the resulting models. Moreover, during vertical federated learning the participants never expose the data in their data sets to one another, which protects each participant's data security; each round of joint parameter updating requires each participant to send data only three times and receive data only three times, so the communication volume is small and communication efficiency is high, making vertical federated learning efficient; and because the participants use search networks to participate in vertical federated learning, they do not need to determine their model structures in advance when modeling with vertical federation technology, which greatly lowers the threshold for participation and broadens the application of vertical federated learning in concrete task domains.
Further, in another implementation, one round of jointly updating the parameters may also be divided into three steps: in the first step, each participant first copies the model parameters of its current search network to obtain a model parameter copy, and the participants then use their respective first data sets to jointly update the model parameters in their search networks; building on that update, in the second step the participants use their respective second data sets to jointly update the search structure parameters in their search networks; in the third step, each participant first replaces the model parameters in the search network updated in the second step with the model parameter copy, and the participants then use their respective first data sets to jointly update the model parameters in their search networks. By copying the model parameters, it is likewise possible to keep the model parameters unchanged while updating the search structure parameters, and to keep the search structure parameters unchanged while updating the model parameters.
Further, based on the second embodiment above, a third embodiment of the vertical federated modeling optimization method of this application is proposed. In this embodiment, the method is applied to a data application participant that holds label data, the data application participant is deployed with a back-end network, and step S102 includes:
Step S1021: receive the first network output sent by the data providing participant, where the data providing participant inputs the second data set of the other end into the initially updated copy of the other end to obtain the first network output.
In this embodiment, the executing entity is the data application participant (called the local end below), and the data application participant is also deployed with a back-end network.
The local end receives the first network output sent by the data providing participant, where the data providing participant inputs the second data set of the other end into the initially updated copy of the other end to obtain the first network output. Here, the other end refers to the data providing participant. Specifically, the data providing participant inputs its second data set into its initially updated copy and obtains the first network output after processing by that copy. It should be noted that there may be multiple data providing participants; in this embodiment, one data providing participant is used as an example for the specific description.
Step S1022: input the second data set of the local end into the initially updated copy of the local end to obtain the second network output, and input the first network output and the second network output into the back-end network to obtain the third network output.
The local end inputs its second data set into its initially updated copy to obtain the second network output; specifically, the local second data set is input into the local initially updated copy, and the second network output is obtained after processing by that copy. The local end then inputs the first network output and the second network output into the back-end network to obtain the third network output, that is, the prediction result. Specifically, the local end may splice the first network output and the second network output, either by vector concatenation or by computing a weighted average, and input the splicing result into the back-end network, whose processing yields the third network output.
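The two splicing options named above, vector concatenation and weighted averaging, can be sketched as follows; the array shapes and the equal 0.5 weights are illustrative assumptions:

```python
import numpy as np

# Hypothetical search-network outputs for a batch of two samples.
U_first = np.array([[1.0, 2.0], [3.0, 4.0]])   # first network output (provider side)
U_second = np.array([[5.0, 6.0], [7.0, 8.0]])  # second network output (local side)

# Option 1: vector concatenation along the feature axis -> shape (2, 4).
concat = np.concatenate([U_second, U_first], axis=1)

# Option 2: weighted average (requires matching shapes) -> shape (2, 2).
weighted = 0.5 * U_second + 0.5 * U_first
```

The splicing result would then be fed to the back-end network; concatenation preserves both parties' features separately, while averaging keeps the back-end input dimension fixed regardless of the number of providers.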
Step S1023: calculate, based on the third network output and the label data of the local end, the first gradient of the loss function relative to the first network output and the second gradient relative to the search structure parameters in the initially updated copy of the local end.
The local end computes the loss function from the third network output and its label data; for the specific way of computing the loss function, reference may be made to existing loss function computations for machine learning models, which are not detailed here. The local end then computes the first gradient of the loss function with respect to the first network output and the second gradient of the loss function with respect to the search structure parameters in the local initially updated copy; the gradients may be computed according to the chain rule and the gradient descent algorithm.
Step S1024: send the first gradient to the data providing participant, so that the data providing participant updates the search structure parameters in the initially updated copy of the other end according to the first gradient.
The local end sends the first gradient to the data providing participant. It should be noted that when there are multiple data providing participants, the local end computes the gradient corresponding to the network output sent by each data providing participant and returns each gradient to the corresponding participant. After receiving the first gradient, the data providing participant updates the search structure parameters in its initially updated copy according to the first gradient; specifically, following the chain rule and the gradient descent algorithm, it computes from the first gradient the gradient corresponding to the search structure parameters in its initially updated copy, and updates those search structure parameters accordingly, obtaining its secondarily updated copy.
Step S1025: update the search structure parameters in the initially updated copy of the local end according to the second gradient to obtain the secondarily updated copy of the local end.
The local end updates the search structure parameters in its initially updated copy according to the second gradient, obtaining its secondarily updated copy.
In this embodiment, by exchanging intermediate results for updating the search structure parameters of their respective search networks, the participants can complete the network structure search without exposing their data; while guaranteeing data security, the participants do not need to preset their model structures, which lowers the threshold for participating in vertical federated learning.
The following example illustrates one round of jointly updating the parameters. The data application participant is denoted A and the data providing participant B; $Net_A$ and $Net_B$ denote the search networks of A and B, $W_A$ and $W_B$ the model parameters of $Net_A$ and $Net_B$, and $\alpha_A$ and $\alpha_B$ the search structure parameters of $Net_A$ and $Net_B$. $X_{trn}^A$ and $X_{val}^A$ denote A's first and second data sets, and $X_{trn}^B$ and $X_{val}^B$ denote B's first and second data sets. $Y_{trn}$ denotes the label data corresponding to $X_{trn}^A$, and $Y_{val}$ the label data corresponding to $X_{val}^A$. It should be noted that FIG. 3, FIG. 4, and FIG. 5 represent the search networks and the fully connected network only in the form of example diagrams; the drawn shapes do not represent the real network structures.
As shown in FIG. 3, the first step:
1. Party B inputs $X_{trn}^B$ into $Net_B$ to obtain the network output $U_{trn}^B$ and transmits it to party A. To further protect data privacy, party B may process $U_{trn}^B$ with differential privacy or homomorphic encryption before sending it to party A.
2. Party A inputs $X_{trn}^A$ into $Net_A$ to obtain the network output $U_{trn}^A$, splices $U_{trn}^A$ and $U_{trn}^B$, and inputs the result into the back-end network to obtain $Y_{trn}^{out}$. Party A copies $Net_A$ to obtain $Net_A'$, and based on $Y_{trn}$ and $Y_{trn}^{out}$ computes the gradient of the loss function with respect to $W_A$, namely $\nabla_{W_A}\mathcal{L}$, as well as the gradient with respect to $U_{trn}^B$, namely $\nabla_{U_{trn}^B}\mathcal{L}$. It then updates $W_A$ according to $\nabla_{W_A}\mathcal{L}$ and takes the updated value as the model parameters of $Net_A'$, that is, it computes $W_A' = W_A - \theta\,\nabla_{W_A}\mathcal{L}$, where $\theta$ is the learning rate.
3. Party A sends $\nabla_{U_{trn}^B}\mathcal{L}$ to party B.
4. Party B copies $Net_B$ to obtain $Net_B'$ and updates the model parameters $W_B'$ in $Net_B'$ according to the gradient $\nabla_{U_{trn}^B}\mathcal{L}$. Specifically, following the chain rule and the gradient descent algorithm, party B computes $W_B' = W_B - \theta\left(\frac{\partial U_{trn}^B}{\partial W_B}\right)^{\!\top}\nabla_{U_{trn}^B}\mathcal{L}$.
As shown in FIG. 4, the second step:
5. Party B inputs $X_{val}^B$ into $Net_B'$ to obtain the network output $U_{val}^B$ and transmits it to party A.
6. Party A inputs $X_{val}^A$ into $Net_A'$ to obtain the network output $U_{val}^A$, splices $U_{val}^A$ and $U_{val}^B$, and inputs the result into the back-end network to obtain $Y_{val}^{out}$. Based on $Y_{val}$ and $Y_{val}^{out}$, party A computes the gradient of the loss function with respect to the search structure parameters $\alpha_A'$ in $Net_A'$, namely $\nabla_{\alpha_A'}\mathcal{L}$, as well as the gradient with respect to $U_{val}^B$, namely $\nabla_{U_{val}^B}\mathcal{L}$. Party A sends $\nabla_{U_{val}^B}\mathcal{L}$ to party B.
7. Party A updates $\alpha_A'$ according to $\nabla_{\alpha_A'}\mathcal{L}$, that is, it computes $\alpha_A' \leftarrow \alpha_A' - \theta\,\nabla_{\alpha_A'}\mathcal{L}$. Party B updates the search structure parameters $\alpha_B'$ in $Net_B'$ according to $\nabla_{U_{val}^B}\mathcal{L}$. Specifically, following the chain rule and the gradient descent algorithm, party B computes $\alpha_B' \leftarrow \alpha_B' - \theta\left(\frac{\partial U_{val}^B}{\partial \alpha_B'}\right)^{\!\top}\nabla_{U_{val}^B}\mathcal{L}$.
8. Party A copies $\alpha_A'$ from $Net_A'$ into $\alpha_A$ of $Net_A$, and party B copies $\alpha_B'$ from $Net_B'$ into $\alpha_B$ of $Net_B$.
As shown in FIG. 5, the third step:
9. Party B inputs $X_{trn}^B$ into $Net_B$ to obtain the network output $U_{trn}^B$ and transmits it to party A.
10. Party A inputs $X_{trn}^A$ into $Net_A$ to obtain the network output $U_{trn}^A$, splices $U_{trn}^A$ and $U_{trn}^B$, and inputs the result into the back-end network to obtain $Y_{trn}^{out}$. Based on $Y_{trn}$ and $Y_{trn}^{out}$, party A computes the gradient of the loss function with respect to $W_A$, namely $\nabla_{W_A}\mathcal{L}$, as well as the gradient with respect to $U_{trn}^B$, namely $\nabla_{U_{trn}^B}\mathcal{L}$. Party A sends $\nabla_{U_{trn}^B}\mathcal{L}$ to party B.
11. Party A updates $W_A$ according to $\nabla_{W_A}\mathcal{L}$, that is, it computes $W_A \leftarrow W_A - \theta\,\nabla_{W_A}\mathcal{L}$. Party B updates $W_B$ according to $\nabla_{U_{trn}^B}\mathcal{L}$. Specifically, following the chain rule and the gradient descent algorithm, party B computes $W_B \leftarrow W_B - \theta\left(\frac{\partial U_{trn}^B}{\partial W_B}\right)^{\!\top}\nabla_{U_{trn}^B}\mathcal{L}$.
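Steps 1 to 4 of FIG. 3 can be checked end to end under a toy assumption of our own: both search networks and the back-end network are reduced to single linear maps with a mean-squared-error loss, so the gradient that A sends to B and B's chain-rule step can be written in closed form (all shapes and values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d_a, d_b, h = 8, 3, 2, 4
X_A, X_B = rng.normal(size=(n, d_a)), rng.normal(size=(n, d_b))
W_A, W_B = rng.normal(size=(d_a, h)), rng.normal(size=(d_b, h))
W_out = rng.normal(size=(2 * h, 1))   # back-end network, held by party A
y = rng.normal(size=(n, 1))           # label data, held by party A
theta = 0.1                           # learning rate

# Step 1: B computes its network output and sends it to A.
U_B = X_B @ W_B
# Step 2: A computes its own output, splices, and runs the back-end network.
U_A = X_A @ W_A
y_out = np.concatenate([U_A, U_B], axis=1) @ W_out
resid = 2.0 * (y_out - y) / n         # d(MSE)/d(y_out)
g_W_A = X_A.T @ (resid @ W_out[:h].T)       # A's local model-parameter gradient
g_U_B = resid @ W_out[h:].T                 # intermediate result sent to B
# Steps 3-4: B applies the chain rule locally; it never sees X_A or y.
g_W_B = X_B.T @ g_U_B
W_A_new, W_B_new = W_A - theta * g_W_A, W_B - theta * g_W_B
```

The only values crossing the boundary are `U_B` (B to A) and `g_U_B` (A to B), matching the first-network-output and first-gradient exchanges in the text.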
Further, on the basis of the second embodiment above, a fourth embodiment of the vertical federated modeling optimization method of the present application is proposed. In this embodiment, the method is applied to a data-providing participant, and step S102 includes:

Step S1026: inputting the local second dataset into the local initially-updated copy to obtain a first network output;

In this embodiment, the executing entity is a data-providing participant (hereinafter, the local side). The data application participant taking part in vertical federated modeling is additionally deployed with a downstream network.

The local side inputs the local second dataset into the local initially-updated copy to obtain the first network output. Specifically, the local side inputs its second dataset into its initially-updated copy, and the processing of that copy yields the first network output. It should be noted that there may be multiple data-providing participants; in this embodiment, a single data-providing participant is used for the concrete example.

Step S1027: sending the first network output to the data application participant that holds the label data, so that the data application participant inputs its own second dataset into its own initially-updated copy to obtain a second network output, inputs the first network output and the second network output into the downstream network to obtain a third network output, computes, based on the third network output and its label data, a first gradient of the loss function with respect to the first network output and a second gradient with respect to the search structure parameters in its initially-updated copy, and updates the search structure parameters in its initially-updated copy according to the second gradient, wherein the downstream network is deployed at the data application participant;

The local side sends the first network output to the data application participant.

The data application participant inputs its second dataset into its initially-updated copy to obtain the second network output; here, "its" refers to the data application participant's own side. Specifically, the data application participant inputs its second dataset into its initially-updated copy, and the processing of that copy yields the second network output.

The data application participant combines the first network output and the second network output; the combination may be vector concatenation or a weighted average. The combined result is fed into the downstream network, whose processing yields the third network output.

The data application participant computes the loss function based on the third network output and its label data, then computes the first gradient of the loss function with respect to the first network output as well as the second gradient of the loss function with respect to the search structure parameters in its initially-updated copy. It updates those search structure parameters according to the second gradient, obtaining its secondarily-updated copy.

The data application participant sends the first gradient to the data-providing participant.

Step S1028: receiving the first gradient sent by the data application participant, and updating the search structure parameters in the local initially-updated copy according to the first gradient to obtain the local secondarily-updated copy.

The local side receives the first gradient sent by the data application participant and updates the search structure parameters in the local initially-updated copy according to the first gradient, obtaining the local secondarily-updated copy. Specifically, following the chain rule and gradient descent, the local side computes from the first gradient the gradient corresponding to the search structure parameters in the local initially-updated copy and updates those parameters with it.

In this embodiment, the participants exchange the intermediate results used to update the search structure parameters of their respective search networks, so that every participant can complete the network structure search without exposing its own data. Data security is thus guaranteed while the participants no longer need to design their model structures in advance, lowering the barrier to participating in vertical federated learning.
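To make step S1028 concrete, here is a hand-computed sketch of the provider-side chain rule for a DARTS-style mixed operation $U = \sum_k \mathrm{softmax}(\alpha)_k\, O_k(x)$: given the first gradient $\partial L/\partial U$ received from the data application participant, the provider obtains $\partial L/\partial \alpha$ without ever seeing the loss itself. The operation set and all shapes are illustrative assumptions.

```python
import numpy as np

def softmax(a):
    e = np.exp(a - a.max())
    return e / e.sum()

def provider_alpha_grad(ops_out, alpha, dL_dU):
    """ops_out: (K, n) candidate-operation outputs; alpha: (K,) params.

    Chain rule for U = sum_k softmax(alpha)_k * O_k(x):
    dL/dalpha = J_softmax^T @ dL/dp, with dL/dp_k = <dL/dU, O_k(x)>.
    """
    p = softmax(alpha)
    dL_dp = ops_out @ dL_dU             # inner products with dL/dU
    J = np.diag(p) - np.outer(p, p)     # softmax Jacobian (symmetric)
    return J @ dL_dp

x = np.linspace(-1.0, 1.0, 5)
ops_out = np.stack([x, np.tanh(x), np.zeros_like(x)])  # identity/tanh/zero
alpha = np.zeros(3)
dL_dU = np.full(5, 0.1)                 # gradient received from party A
g = provider_alpha_grad(ops_out, alpha, dL_dU)
alpha_new = alpha - 0.5 * g             # gradient-descent update of alpha
```

Because the softmax Jacobian's rows sum to zero, the components of `g` always sum to zero, and the gradient agrees with a finite-difference check of the locally linear loss model.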
Further, in an implementation, before step S201, the method further includes:

Step S203: detecting whether a preset modeling stop condition is currently satisfied;

After a round of jointly updating parameters, it can be detected whether a preset modeling stop condition is currently satisfied. The preset stop condition may be set in advance according to actual needs, for example: stop when a maximum number of rounds is reached, stop when a maximum duration is reached, or stop when the model converges. Detecting model convergence may consist of checking whether the model's loss function has converged. As in some of the embodiments above, where the dataset is split into a first dataset and a second dataset and the model parameters and search structure parameters are optimized separately, convergence can be declared once it is detected on either the first dataset or the second dataset.

Step S204: if the preset modeling stop condition is satisfied, executing the step of selecting retained operations from the connection operations according to the search structure parameters in the updated local search network;

If the preset modeling stop condition is detected to be satisfied, step S201 and the subsequent operations can be executed to obtain the local target model, which completes local vertical federated modeling.

Step S205: if the preset modeling stop condition is not satisfied, executing again, based on the updated local search network, the step of exchanging with the other participants, based on the local dataset, the intermediate results used to update the model parameters and search structure parameters in the respective search networks, and updating the local search network based on the received intermediate results.

If the preset modeling stop condition is detected not to be satisfied, step S10 and the subsequent operations are executed again based on the updated local search network, i.e., the next round of joint parameter updating is performed.
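A minimal skeleton of this check might look as follows. The three thresholds and the stand-in "round" are illustrative; a real deployment would plug the actual federated round and its loss into the loop.

```python
import time

# Illustrative stop condition (step S203): stop on a maximum number of
# rounds, a maximum wall-clock duration, or loss convergence.
MAX_ROUNDS, MAX_SECONDS, TOL = 100, 3600.0, 1e-4

def should_stop(round_idx, start_time, loss_history):
    if round_idx >= MAX_ROUNDS:
        return True                      # maximum number of rounds reached
    if time.monotonic() - start_time >= MAX_SECONDS:
        return True                      # maximum duration reached
    if len(loss_history) >= 2 and abs(loss_history[-2] - loss_history[-1]) < TOL:
        return True                      # tiny loss change == convergence
    return False

start = time.monotonic()
losses = []
for r in range(10_000):
    losses.append(1.0 / (r + 1))         # stand-in for one federated round
    if should_stop(r + 1, start, losses):
        break                            # proceed to retained-op selection
```

If the check returns false, the loop body corresponds to running another round of the joint parameter update.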
Further, in an implementation, after step S20, the method further includes:

Step S30: receiving a first model output sent by a data-providing participant, wherein the data-providing participant inputs the user data corresponding to a target user's second risk features on its side into its target model to obtain the first model output;

Each participant may be a device deployed at a bank or another financial institution, storing the user data the institution has recorded in the course of its business. Since institutions differ in their concrete business, the features of each participant's user data may differ; each institution can build a dataset from its own data features and use these datasets to jointly perform vertical federated learning, improving the model's predictive performance by enriching the model's feature set. Specifically, the participants can jointly build a user risk prediction model for estimating a user's risk level in business scenarios such as credit and insurance. Each participant's data features may be risk features selected from practical experience as relevant to user risk prediction, for example, the user's deposit amount, the user's number of defaults, and so on.

The participants use their respective datasets to jointly perform vertical federated modeling as in the embodiments above, each obtaining its own target model.

Having obtained their target models, the participants can jointly perform risk prediction for a user.

Specifically, the data application participant receives the first model output sent by the data-providing participant, where the data-providing participant inputs the user data corresponding to the target user's second risk features on its side into its target model, and the processing of that model yields the first model output. "Its side" refers to the data-providing participant.

Step S40: inputting the user data corresponding to the target user's second risk features on the local side into the local target model to obtain a second model output;

Step S50: combining the first model output and the second model output and inputting the result into the local downstream network to obtain the target user's risk prediction result.

The data application participant inputs the user data corresponding to the target user's second risk features on the local side into the local target model, and the processing of that model yields the second model output. It then combines the first model output and the second model output — specifically, by vector concatenation or by a weighted average — and feeds the combined result into its local downstream network, whose processing outputs the target user's risk prediction result.

Further, when the risk prediction task for the target user is initiated by the data-providing participant, the data application participant can send the target user's risk prediction result to the data-providing participant for subsequent business processing, for example, deciding whether to grant the target user a loan according to the risk prediction result.

In this embodiment, each participant only needs to set up its own search network and need not spend substantial manpower and resources carefully designing a model structure, which lowers the barrier to participating in vertical federated learning and makes it easier for banks and other financial institutions to perform joint modeling through vertical federated learning and then carry out risk prediction tasks with the jointly built risk prediction model. Moreover, during vertical federated modeling, and during post-modeling risk prediction with the models, the participants never directly exchange their datasets or the models themselves, safeguarding the users' private data held by every participant.
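As a toy illustration of steps S30 to S50, the sketch below fuses the provider's model output with the local one and passes the result through a stand-in downstream network (a fixed logistic layer). All weights, shapes, and the 0.5/0.5 averaging coefficients are invented for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict_risk(out_provider, out_local, w_head, b_head, mode="concat"):
    """Fuse the two party outputs and apply a logistic head."""
    if mode == "concat":                 # vector concatenation
        fused = np.concatenate([out_provider, out_local])
    else:                                # weighted average (equal weights)
        fused = 0.5 * out_provider + 0.5 * out_local
        w_head = w_head[: fused.size]
    return float(sigmoid(fused @ w_head + b_head))

u_b = np.array([0.2, -0.1])  # first model output, from the provider (S30)
u_a = np.array([0.4, 0.3])   # second model output, computed locally (S40)
w = np.array([0.5, -0.2, 0.1, 0.8])
risk = predict_risk(u_b, u_a, w, b_head=-0.1)   # S50: risk in (0, 1)
```

Either fusion mode yields a probability-like score that the initiating party can threshold for a lending decision.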
In addition, an embodiment of the present application further proposes a vertical federated modeling optimization apparatus. Referring to Figure 6, the apparatus is deployed at a participant taking part in vertical federated modeling, each participant being deployed with a dataset built from its own data features and with a search network. The apparatus includes:

an interaction module 10, configured to exchange with the other participants, based on the local dataset, intermediate results used to update the model parameters and search structure parameters in the respective search networks, and to update the local search network based on the received intermediate results;

a determination module 20, configured to obtain the local target model based on the updated local search network.

Further, a participant's dataset includes a first dataset and a second dataset, and the interaction module 10 includes:

a first interaction unit, configured to exchange with the other participants, based on the local first dataset, first intermediate results used to update the model parameters in the respective search networks, and to update a copy of the local search network based on the received first intermediate results to obtain a local initially-updated copy;

a second interaction unit, configured to exchange with the other participants, based on the local second dataset, second intermediate results used to update the search structure parameters in the respective initially-updated copies, and to update the local initially-updated copy based on the received second intermediate results to obtain a local secondarily-updated copy;

an update unit, configured to update the local search network with the search structure parameters in the local secondarily-updated copy to obtain a local initially-updated search network;

a third interaction unit, configured to exchange with the other participants, based on the local first dataset, third intermediate results used to update the model parameters in the respective initially-updated search networks, and to update the local initially-updated search network based on the received third intermediate results to obtain the updated local search network.

Further, the apparatus is deployed at a data application participant that holds label data, the data application participant being deployed with a downstream network, and the second interaction unit includes:

a first receiving subunit, configured to receive the first network output sent by a data-providing participant, wherein the data-providing participant inputs its second dataset into its initially-updated copy to obtain the first network output;

a first input subunit, configured to input the local second dataset into the local initially-updated copy to obtain a second network output, and to input the first network output and the second network output into the downstream network to obtain a third network output;

a computing subunit, configured to compute, based on the third network output and the local label data, a first gradient of the loss function with respect to the first network output and a second gradient with respect to the search structure parameters in the local initially-updated copy;

a first sending subunit, configured to send the first gradient to the data-providing participant so that the data-providing participant updates the search structure parameters in its initially-updated copy according to the first gradient;

an updating subunit, configured to update the search structure parameters in the local initially-updated copy according to the second gradient to obtain the local secondarily-updated copy.

Further, the apparatus is deployed at a data-providing participant, and the second interaction unit includes:

a second input subunit, configured to input the local second dataset into the local initially-updated copy to obtain a first network output;

a second sending subunit, configured to send the first network output to the data application participant that holds the label data, so that the data application participant inputs its second dataset into its initially-updated copy to obtain a second network output, inputs the first network output and the second network output into a downstream network to obtain a third network output, computes, based on the third network output and its label data, a first gradient of the loss function with respect to the first network output and a second gradient with respect to the search structure parameters in its initially-updated copy, and updates the search structure parameters in its initially-updated copy according to the second gradient, wherein the downstream network is deployed at the data application participant;

a second receiving subunit, configured to receive the first gradient sent by the data application participant and to update the search structure parameters in the local initially-updated copy according to the first gradient to obtain the local secondarily-updated copy.

Further, the search structure parameters in a participant's search network include weights corresponding to the connection operations between network cells in the search network, and the determination module 20 includes:

a selection unit, configured to select retained operations from the connection operations according to the search structure parameters in the updated local search network;

a determination unit, configured to take the model formed by the retained operations and the network cells connected by the retained operations as the local target model.
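The selection and determination units above can be sketched as a simple argmax over the search structure weights. The candidate-operation names and the convention that a "zero" operation prunes its edge are borrowed from common differentiable-architecture-search practice and are not spelled out in the patent.

```python
import numpy as np

OPS = ["skip_connect", "conv_3x3", "zero"]   # illustrative candidate ops

def select_retained_ops(alpha):
    """alpha: (num_edges, num_ops) search structure parameters.

    For each connection between network cells, keep only the candidate
    operation with the largest weight and discard the rest.
    """
    retained = []
    for edge_idx, edge_weights in enumerate(alpha):
        k = int(np.argmax(edge_weights))
        if OPS[k] != "zero":             # a 'zero' op means: drop the edge
            retained.append((edge_idx, OPS[k]))
    return retained

alpha = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.3, 0.6],
                  [0.2, 0.7, 0.1]])
target_model = select_retained_ops(alpha)
# edge 0 keeps skip_connect, edge 1 is pruned, edge 2 keeps conv_3x3
```

The retained operations together with the cells they connect then form the local target model.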
Further, the determination module 20 further includes:

a detection unit, configured to detect whether a preset modeling stop condition is currently satisfied;

the determination module 20 is further configured to: if the preset modeling stop condition is satisfied, execute the step of selecting retained operations from the connection operations according to the search structure parameters in the updated local search network; and if the preset modeling stop condition is not satisfied, execute again, based on the updated local search network, the step of exchanging with the other participants, based on the local dataset, the intermediate results used to update the model parameters and search structure parameters in the respective search networks, and updating the local search network based on the received intermediate results.

Further, the apparatus is deployed at a data application participant that holds label data, and the apparatus further includes:

a receiving module, configured to receive a first model output sent by a data-providing participant, wherein the data-providing participant inputs the user data corresponding to a target user's second risk features on its side into its target model to obtain the first model output;

an input module, configured to input the user data corresponding to the target user's second risk features on the local side into the local target model to obtain a second model output;

a prediction module, configured to combine the first model output and the second model output and input the result into the local downstream network to obtain the target user's risk prediction result.

The extended details of the specific implementations of the vertical federated modeling optimization apparatus of the present application are essentially the same as those of the method embodiments above and are not repeated here.
In addition, an embodiment of the present application further proposes a computer-readable storage medium storing a vertical federated modeling optimization program which, when executed by a processor, implements the steps of the vertical federated modeling optimization method described above.

For the embodiments of the vertical federated modeling optimization device and the computer-readable storage medium of the present application, reference may be made to the embodiments of the vertical federated modeling optimization method of the present application; they are not repeated here.

It should be noted that, as used herein, the terms "comprise", "include" or any other variant thereof are intended to cover a non-exclusive inclusion, such that a process, method, article or apparatus that includes a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article or apparatus. Without further limitation, an element qualified by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article or apparatus that includes that element.

The serial numbers of the above embodiments of the present application are for description only and do not represent the superiority or inferiority of the embodiments.

Through the description of the above implementations, a person skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, and of course also by hardware, though in many cases the former is the better implementation. Based on this understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, can be embodied in the form of a software product stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc), including several instructions to cause a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, etc.) to execute the methods described in the embodiments of the present application.

The above are only preferred embodiments of the present application and do not thereby limit the patent scope of the present application; any equivalent structural or process transformation made using the contents of the specification and drawings of the present application, applied directly or indirectly in other related technical fields, is likewise included within the patent protection scope of the present application.

Claims (20)

  1. A vertical federated modeling optimization method, wherein the method is applied to a participant taking part in vertical federated modeling, each participant being deployed with a dataset built from its own data features and a search network, the method comprising the following steps:
    exchanging with the other participants, based on the local dataset, intermediate results used to update model parameters and search structure parameters in the respective search networks, and updating the local search network based on the received intermediate results;
    obtaining a local target model based on the updated local search network.
  2. The vertical federated modeling optimization method of claim 1, wherein a participant's dataset comprises a first dataset and a second dataset, and the step of exchanging with the other participants, based on the local dataset, intermediate results used to update the model parameters and search structure parameters in the respective search networks, and updating the local search network based on the received intermediate results comprises:
    exchanging with the other participants, based on the local first dataset, first intermediate results used to update the model parameters in the respective search networks, and updating a copy of the local search network based on the received first intermediate results to obtain a local initially-updated copy;
    exchanging with the other participants, based on the local second dataset, second intermediate results used to update the search structure parameters in the respective initially-updated copies, and updating the local initially-updated copy based on the received second intermediate results to obtain a local secondarily-updated copy;
    updating the local search network with the search structure parameters in the local secondarily-updated copy to obtain a local initially-updated search network;
    exchanging with the other participants, based on the local first dataset, third intermediate results used to update the model parameters in the respective initially-updated search networks, and updating the local initially-updated search network based on the received third intermediate results to obtain the updated local search network.
  3. The vertical federated modeling optimization method of claim 2, wherein the method is applied to a data application participant holding label data, the data application participant being deployed with a downstream network, and the step of exchanging the second intermediate results and updating the local initially-updated copy to obtain the local secondarily-updated copy comprises:
    receiving a first network output sent by a data-providing participant, wherein the data-providing participant inputs its second dataset into its initially-updated copy to obtain the first network output;
    inputting the local second dataset into the local initially-updated copy to obtain a second network output, and inputting the first network output and the second network output into the downstream network to obtain a third network output;
    computing, based on the third network output and the local label data, a first gradient of the loss function with respect to the first network output and a second gradient with respect to the search structure parameters in the local initially-updated copy;
    sending the first gradient to the data-providing participant so that the data-providing participant updates the search structure parameters in its initially-updated copy according to the first gradient;
    updating the search structure parameters in the local initially-updated copy according to the second gradient to obtain the local secondarily-updated copy.
  4. The vertical federated modeling optimization method of claim 2, wherein the method is applied to a data-providing participant, and the step of exchanging the second intermediate results and updating the local initially-updated copy to obtain the local secondarily-updated copy comprises:
    inputting the local second dataset into the local initially-updated copy to obtain a first network output;
    sending the first network output to the data application participant holding the label data, so that the data application participant inputs its second dataset into its initially-updated copy to obtain a second network output, inputs the first network output and the second network output into a downstream network to obtain a third network output, computes, based on the third network output and its label data, a first gradient of the loss function with respect to the first network output and a second gradient with respect to the search structure parameters in its initially-updated copy, and updates the search structure parameters in its initially-updated copy according to the second gradient, wherein the downstream network is deployed at the data application participant;
    receiving the first gradient sent by the data application participant, and updating the search structure parameters in the local initially-updated copy according to the first gradient to obtain the local secondarily-updated copy.
  5. The vertical federated modeling optimization method of any one of claims 1 to 4, wherein the search structure parameters in a participant's search network comprise weights corresponding to connection operations between network cells in the search network, and the step of obtaining the local target model based on the updated local search network comprises:
    selecting retained operations from the connection operations according to the search structure parameters in the updated local search network;
    taking the model formed by the retained operations and the network cells connected by the retained operations as the local target model.
  6. The vertical federated modeling optimization method of claim 5, wherein before the step of selecting retained operations from the connection operations according to the search structure parameters in the updated local search network, the method further comprises:
    detecting whether a preset modeling stop condition is currently satisfied;
    if the preset modeling stop condition is satisfied, executing the step of selecting retained operations from the connection operations according to the search structure parameters in the updated local search network;
    if the preset modeling stop condition is not satisfied, executing again, based on the updated local search network, the step of exchanging with the other participants, based on the local dataset, the intermediate results used to update the model parameters and search structure parameters in the respective search networks, and updating the local search network based on the received intermediate results.
  7. The vertical federated modeling optimization method of any one of claims 1 to 3, wherein the method is applied to a data application participant holding label data, and after the step of obtaining the local target model based on the updated local search network, the method further comprises:
    receiving a first model output sent by a data-providing participant, wherein the data-providing participant inputs user data corresponding to a target user's second risk features on its side into its target model to obtain the first model output;
    inputting user data corresponding to the target user's second risk features on the local side into the local target model to obtain a second model output;
    combining the first model output and the second model output and inputting the result into the local downstream network to obtain a risk prediction result for the target user.
  8. A vertical federated modeling optimization apparatus, wherein the apparatus is deployed at a participant taking part in vertical federated modeling, each participant being deployed with a dataset built from its own data features and a search network, the apparatus comprising:
    an interaction module, configured to exchange with the other participants, based on the local dataset, intermediate results used to update model parameters and search structure parameters in the respective search networks, and to update the local search network based on the received intermediate results;
    a determination module, configured to obtain a local target model based on the updated local search network.
  9. A vertical federated modeling optimization device, wherein the vertical federated modeling optimization device comprises: a memory, a processor, and a vertical federated modeling optimization program stored on the memory and executable on the processor, the vertical federated modeling optimization program, when executed by the processor, implementing the following steps:
    exchanging with the other participants, based on the local dataset, intermediate results used to update model parameters and search structure parameters in the respective search networks, and updating the local search network based on the received intermediate results;
    obtaining a local target model based on the updated local search network.
  10. The vertical federated modeling optimization device of claim 9, wherein a participant's dataset comprises a first dataset and a second dataset, and the vertical federated modeling optimization program, when executed by the processor, further implements the following steps:
    exchanging with the other participants, based on the local first dataset, first intermediate results used to update the model parameters in the respective search networks, and updating a copy of the local search network based on the received first intermediate results to obtain a local initially-updated copy;
    exchanging with the other participants, based on the local second dataset, second intermediate results used to update the search structure parameters in the respective initially-updated copies, and updating the local initially-updated copy based on the received second intermediate results to obtain a local secondarily-updated copy;
    updating the local search network with the search structure parameters in the local secondarily-updated copy to obtain a local initially-updated search network;
    exchanging with the other participants, based on the local first dataset, third intermediate results used to update the model parameters in the respective initially-updated search networks, and updating the local initially-updated search network based on the received third intermediate results to obtain the updated local search network.
  11. The vertical federated modeling optimization device of claim 10, wherein the device is applied at a data application participant holding label data, the data application participant being deployed with a downstream network, and the vertical federated modeling optimization program, when executed by the processor, further implements the following steps:
    receiving a first network output sent by a data-providing participant, wherein the data-providing participant inputs its second dataset into its initially-updated copy to obtain the first network output;
    inputting the local second dataset into the local initially-updated copy to obtain a second network output, and inputting the first network output and the second network output into the downstream network to obtain a third network output;
    computing, based on the third network output and the local label data, a first gradient of the loss function with respect to the first network output and a second gradient with respect to the search structure parameters in the local initially-updated copy;
    sending the first gradient to the data-providing participant so that the data-providing participant updates the search structure parameters in its initially-updated copy according to the first gradient;
    updating the search structure parameters in the local initially-updated copy according to the second gradient to obtain the local secondarily-updated copy.
  12. The vertical federated modeling optimization device of claim 10, wherein the device is applied at a data-providing participant, and the vertical federated modeling optimization program, when executed by the processor, further implements the following steps:
    inputting the local second dataset into the local initially-updated copy to obtain a first network output;
    sending the first network output to the data application participant holding the label data, so that the data application participant inputs its second dataset into its initially-updated copy to obtain a second network output, inputs the first network output and the second network output into a downstream network to obtain a third network output, computes, based on the third network output and its label data, a first gradient of the loss function with respect to the first network output and a second gradient with respect to the search structure parameters in its initially-updated copy, and updates the search structure parameters in its initially-updated copy according to the second gradient, wherein the downstream network is deployed at the data application participant;
    receiving the first gradient sent by the data application participant, and updating the search structure parameters in the local initially-updated copy according to the first gradient to obtain the local secondarily-updated copy.
  13. The vertical federated modeling optimization device of any one of claims 9 to 12, wherein the search structure parameters in a participant's search network comprise weights corresponding to connection operations between network cells in the search network, and the vertical federated modeling optimization program, when executed by the processor, further implements the following steps:
    selecting retained operations from the connection operations according to the search structure parameters in the updated local search network;
    taking the model formed by the retained operations and the network cells connected by the retained operations as the local target model.
  14. The vertical federated modeling optimization device of claim 13, wherein the vertical federated modeling optimization program, when executed by the processor, further implements the following steps:
    detecting whether a preset modeling stop condition is currently satisfied;
    if the preset modeling stop condition is satisfied, executing the step of selecting retained operations from the connection operations according to the search structure parameters in the updated local search network;
    if the preset modeling stop condition is not satisfied, executing again, based on the updated local search network, the step of exchanging with the other participants, based on the local dataset, the intermediate results used to update the model parameters and search structure parameters in the respective search networks, and updating the local search network based on the received intermediate results.
  15. A computer-readable storage medium, wherein a vertical federated modeling optimization program is stored on the computer-readable storage medium, the vertical federated modeling optimization program, when executed by a processor, implementing the following steps:
    exchanging with the other participants, based on the local dataset, intermediate results used to update model parameters and search structure parameters in the respective search networks, and updating the local search network based on the received intermediate results;
    obtaining a local target model based on the updated local search network.
  16. The computer-readable storage medium of claim 15, wherein a participant's dataset comprises a first dataset and a second dataset, and the vertical federated modeling optimization program, when executed by a processor, further implements the following steps:
    exchanging with the other participants, based on the local first dataset, first intermediate results used to update the model parameters in the respective search networks, and updating a copy of the local search network based on the received first intermediate results to obtain a local initially-updated copy;
    exchanging with the other participants, based on the local second dataset, second intermediate results used to update the search structure parameters in the respective initially-updated copies, and updating the local initially-updated copy based on the received second intermediate results to obtain a local secondarily-updated copy;
    updating the local search network with the search structure parameters in the local secondarily-updated copy to obtain a local initially-updated search network;
    exchanging with the other participants, based on the local first dataset, third intermediate results used to update the model parameters in the respective initially-updated search networks, and updating the local initially-updated search network based on the received third intermediate results to obtain the updated local search network.
  17. The computer-readable storage medium of claim 16, wherein the medium is applied at a data application participant holding label data, the data application participant being deployed with a downstream network, and the vertical federated modeling optimization program, when executed by a processor, further implements the following steps:
    receiving a first network output sent by a data-providing participant, wherein the data-providing participant inputs its second dataset into its initially-updated copy to obtain the first network output;
    inputting the local second dataset into the local initially-updated copy to obtain a second network output, and inputting the first network output and the second network output into the downstream network to obtain a third network output;
    computing, based on the third network output and the local label data, a first gradient of the loss function with respect to the first network output and a second gradient with respect to the search structure parameters in the local initially-updated copy;
    sending the first gradient to the data-providing participant so that the data-providing participant updates the search structure parameters in its initially-updated copy according to the first gradient;
    updating the search structure parameters in the local initially-updated copy according to the second gradient to obtain the local secondarily-updated copy.
  18. The computer-readable storage medium of claim 16, wherein the medium is applied at a data-providing participant, and the vertical federated modeling optimization program, when executed by a processor, further implements the following steps:
    inputting the local second dataset into the local initially-updated copy to obtain a first network output;
    sending the first network output to the data application participant holding the label data, so that the data application participant inputs its second dataset into its initially-updated copy to obtain a second network output, inputs the first network output and the second network output into a downstream network to obtain a third network output, computes, based on the third network output and its label data, a first gradient of the loss function with respect to the first network output and a second gradient with respect to the search structure parameters in its initially-updated copy, and updates the search structure parameters in its initially-updated copy according to the second gradient, wherein the downstream network is deployed at the data application participant;
    receiving the first gradient sent by the data application participant, and updating the search structure parameters in the local initially-updated copy according to the first gradient to obtain the local secondarily-updated copy.
  19. The computer-readable storage medium of any one of claims 15 to 18, wherein the search structure parameters in a participant's search network comprise weights corresponding to connection operations between network cells in the search network, and the vertical federated modeling optimization program, when executed by a processor, further implements the following steps:
    selecting retained operations from the connection operations according to the search structure parameters in the updated local search network;
    taking the model formed by the retained operations and the network cells connected by the retained operations as the local target model.
  20. The computer-readable storage medium of claim 19, wherein the vertical federated modeling optimization program, when executed by a processor, further implements the following steps:
    detecting whether a preset modeling stop condition is currently satisfied;
    if the preset modeling stop condition is satisfied, executing the step of selecting retained operations from the connection operations according to the search structure parameters in the updated local search network;
    if the preset modeling stop condition is not satisfied, executing again, based on the updated local search network, the step of exchanging with the other participants, based on the local dataset, the intermediate results used to update the model parameters and search structure parameters in the respective search networks, and updating the local search network based on the received intermediate results.
PCT/CN2020/133430 2020-07-10 2020-12-02 Vertical federated modeling optimization method, apparatus, device, and readable storage medium WO2022007321A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010663980.9A CN111797999A (zh) 2020-07-10 2020-07-10 Vertical federated modeling optimization method, apparatus, device, and readable storage medium
CN202010663980.9 2020-07-10

Publications (1)

Publication Number Publication Date
WO2022007321A1 true WO2022007321A1 (zh) 2022-01-13

Family

ID=72806940

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/133430 WO2022007321A1 (zh) 2020-07-10 2020-12-02 Vertical federated modeling optimization method, apparatus, device, and readable storage medium

Country Status (2)

Country Link
CN (1) CN111797999A (zh)
WO (1) WO2022007321A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114118312A (zh) * 2022-01-29 2022-03-01 华控清交信息科技(北京)有限公司 Vertical training method and apparatus for a GBDT model, electronic device and system
CN114866599A (zh) * 2022-04-29 2022-08-05 济南中科泛在智能计算研究院 Federated learning method, device and system based on optimal federated-party selection
CN115018318A (zh) * 2022-06-01 2022-09-06 航天神舟智慧系统技术有限公司 Social-area risk prediction and analysis method and system

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111797999A (zh) 2020-07-10 2020-10-20 深圳前海微众银行股份有限公司 Vertical federated modeling optimization method, apparatus, device, and readable storage medium
CN112418446B (zh) 2020-11-18 2024-04-09 脸萌有限公司 Model processing method, system, apparatus, medium, and electronic device
CN113807544B (zh) 2020-12-31 2023-09-26 京东科技控股股份有限公司 Training method and apparatus for a federated learning model, and electronic device
CN114091670A (zh) 2021-11-23 2022-02-25 支付宝(杭州)信息技术有限公司 Online model updating method and apparatus

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180018590A1 (en) * 2016-07-18 2018-01-18 NantOmics, Inc. Distributed Machine Learning Systems, Apparatus, and Methods
CN109886417A (zh) * 2019-03-01 2019-06-14 深圳前海微众银行股份有限公司 Model parameter training method, apparatus, device, and medium based on federated learning
CN110175671A (zh) * 2019-04-28 2019-08-27 华为技术有限公司 Neural network construction method, image processing method and apparatus
CN110874646A (zh) * 2020-01-16 2020-03-10 支付宝(杭州)信息技术有限公司 Exception handling method and apparatus for federated learning, and electronic device
CN111310204A (zh) * 2020-02-10 2020-06-19 北京百度网讯科技有限公司 Data processing method and apparatus
CN111340190A (zh) * 2020-02-23 2020-06-26 华为技术有限公司 Method and apparatus for constructing a network structure, and image generation method and apparatus
CN111797999A (zh) * 2020-07-10 2020-10-20 深圳前海微众银行股份有限公司 Vertical federated modeling optimization method, apparatus, device, and readable storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109492420B (zh) * 2018-12-28 2021-07-20 深圳前海微众银行股份有限公司 Model parameter training method, terminal, system, and medium based on federated learning
CN110633806B (zh) * 2019-10-21 2024-04-26 深圳前海微众银行股份有限公司 Vertical federated learning system optimization method, apparatus, device, and readable storage medium
CN111210003B (zh) * 2019-12-30 2021-03-19 深圳前海微众银行股份有限公司 Vertical federated learning system optimization method, apparatus, device, and readable storage medium
CN110874649B (zh) * 2020-01-16 2020-04-28 支付宝(杭州)信息技术有限公司 Federated learning execution method, system, client, and electronic device


Also Published As

Publication number Publication date
CN111797999A (zh) 2020-10-20


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20944624

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20944624

Country of ref document: EP

Kind code of ref document: A1