WO2023024349A1 - Vertical federated prediction optimization method, device, medium and computer program product - Google Patents


Info

Publication number
WO2023024349A1
Authority
WO
WIPO (PCT)
Prior art keywords
model
prediction
party
sample
federated
Prior art date
Application number
PCT/CN2021/139640
Other languages
English (en)
French (fr)
Inventor
万晟
高大山
鞠策
谭奔
杨强
Original Assignee
深圳前海微众银行股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳前海微众银行股份有限公司
Publication of WO2023024349A1 publication Critical patent/WO2023024349A1/zh

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/048 Activation functions
    • G06N3/08 Learning methods
    • G06N20/00 Machine learning

Definitions

  • The present application relates to the field of artificial intelligence in financial technology (Fintech), and in particular to a vertical federated prediction optimization method, device, medium, and computer program product.
  • An existing vertical federated prediction model must be constructed from the aligned samples shared among the participants in vertical federated learning modeling, so each participant holds only part of the vertical federated prediction model. In a vertical federated prediction scenario, for aligned samples the predictor usually combines the aligned samples scattered across the other parties to perform vertical federated prediction and obtain accurate predictions. For unaligned samples, the predictor must fall back to a separate local model and predict locally. As a result, while the predictor achieves high accuracy on aligned samples through federated prediction, it cannot predict unaligned samples based on the complete model, which lowers the overall sample prediction accuracy. Existing vertical federated prediction methods therefore suffer from low overall sample prediction accuracy.
  • The main purpose of this application is to provide a vertical federated prediction optimization method, device, medium, and computer program product, aiming to solve the technical problem in the prior art that the overall sample prediction accuracy of vertical federated prediction is low.
  • the present application provides a vertical federated prediction optimization method, the vertical federated prediction optimization method is applied to the first device, and the vertical federated prediction optimization method includes:
  • the intermediate sample features are sent to the second device, so that the second device can jointly perform model prediction on the intermediate sample features and the ID-matching sample corresponding to the sample to be predicted based on the vertical federated residual boosting model, obtaining the second-party model prediction result;
  • weighted aggregation is performed on the first-party model prediction result and the second-party model prediction result to obtain a target federated prediction result.
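The weighted aggregation step can be sketched as follows. The text does not fix the exact aggregation formula, so the normalised weighted average below, and the function name, are illustrative assumptions:

```python
import numpy as np

def weighted_aggregate(first_party_pred, second_party_pred, alpha_a, alpha_b):
    """Combine the two parties' prediction results by their model weights
    (assumed normalised weighted average; the patent text leaves the exact
    aggregation form unspecified)."""
    preds = np.asarray([first_party_pred, second_party_pred], dtype=float)
    weights = np.asarray([alpha_a, alpha_b], dtype=float)
    # Weighted sum of the two model outputs, normalised by the total weight.
    return float(np.dot(weights, preds) / weights.sum())

# Example: first party predicts 0.9 with weight 2.0,
# second party predicts 0.6 with weight 1.0.
combined = weighted_aggregate(0.9, 0.6, 2.0, 1.0)
print(combined)  # 0.8
```

The higher-weighted (more accurate) party's result dominates the target federated prediction result, which is the intent of the model weights described later in the text.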
  • the present application provides a vertical federated prediction optimization method, the vertical federated prediction optimization method is applied to the second device, and the vertical federated prediction optimization method includes:
  • the second-party model weight corresponding to the vertical federated residual boosting model is obtained, and the second-party model prediction result and the second-party model weight are sent to the first device, so that the first device generates a target federated prediction result based on the first-party model prediction result generated by the target prediction model for the sample to be predicted corresponding to the ID-matching sample, the first-party model weight corresponding to the target prediction model, the second-party model prediction result, and the second-party model weight, wherein the target prediction model is obtained by local iterative training of the first device.
  • the present application provides a vertical federated learning modeling optimization method, the vertical federated learning modeling optimization method is applied to the first device, and the vertical federated learning modeling optimization method includes:
  • based on the training sample labels, the training model prediction results corresponding to the intermediate training sample features, and the first-party initial model weight, the first-party model prediction loss corresponding to the target prediction model to be trained is calculated, and the target prediction model to be trained is iteratively optimized to obtain the target prediction model;
  • the present application provides a modeling optimization method for vertical federated learning, the modeling optimization method for vertical federated learning is applied to the second device, and the modeling optimization method for vertical federated learning includes:
  • the present application also provides a vertical federated prediction optimization device, the vertical federated prediction optimization device is a virtual device, and the vertical federated prediction optimization device is applied to the first device, and the vertical federated prediction optimization device includes:
  • the model prediction module is used to extract the sample to be predicted, and obtain the intermediate sample features generated by the feature extractor of the target prediction model performing feature extraction on the sample to be predicted, as well as the first-party model prediction result generated by the target prediction model performing model prediction on the sample to be predicted, wherein the target prediction model is obtained by local iterative training of the first device;
  • a sending module configured to send the intermediate sample features to a second device, so that the second device can jointly perform model prediction on the intermediate sample features and the ID-matching sample corresponding to the sample to be predicted based on the vertical federated residual boosting model, obtaining the second-party model prediction result;
  • a receiving module configured to obtain the first-party model weight corresponding to the target prediction model, and receive the second-party model prediction result sent by the second device and the second-party model weight corresponding to the vertical federated residual boosting model;
  • a weighted aggregation module configured to perform weighted aggregation on the first-party model prediction result and the second-party model prediction result based on the first-party model weight and the second-party model weight, to obtain a target federated prediction result.
  • the present application also provides a vertical federated prediction optimization device, the vertical federated prediction optimization device is a virtual device, and the vertical federated prediction optimization device is applied to the second device, and the vertical federated prediction optimization device includes:
  • a receiving and searching module configured to receive the intermediate sample feature sent by the first device, and search for an ID matching sample corresponding to the intermediate sample feature;
  • a model prediction module configured to jointly perform model prediction on the ID-matching sample and the intermediate sample features based on the vertical federated residual boosting model, obtaining a second-party model prediction result;
  • a sending module configured to obtain the second-party model weight corresponding to the vertical federated residual boosting model, and send the second-party model prediction result and the second-party model weight to the first device, so that the first device generates a target federated prediction result based on its first-party model prediction result, the first-party model weight, the second-party model prediction result, and the second-party model weight, wherein the target prediction model is obtained by local iterative training of the first device.
  • the present application also provides a vertical federated learning modeling optimization device, which is a virtual device applied to the first device, and the vertical federated learning modeling optimization device includes:
  • the first obtaining module is used to obtain the weight of the first-party initial model, and extract training samples and training sample labels corresponding to the training samples;
  • the second acquisition module is used to acquire the intermediate training sample features generated by the feature extractor of the target prediction model to be trained for feature extraction of the training samples;
  • an iterative optimization module configured to calculate the first-party model prediction loss corresponding to the target prediction model to be trained based on the training sample labels, the training model prediction results corresponding to the intermediate training sample features, and the first-party initial model weight, and to iteratively optimize the target prediction model to be trained to obtain the target prediction model;
  • a sending module configured to send the training sample labels, the intermediate training sample features, and the first-party model prediction loss to a second device, so that the second device can calculate the second-party model prediction loss and, based on the residual loss calculated from the second-party model prediction loss and the first-party model prediction loss, optimize the residual boosting model to be trained to obtain a vertical federated residual boosting model.
  • the present application also provides a vertical federated learning modeling optimization device, which is a virtual device applied to the second device, and the vertical federated learning modeling optimization device includes:
  • the receiving module is used to obtain the weight of the second-party initial model, and receive the intermediate training sample features, training sample labels and first-party model prediction loss sent by the first device;
  • the model prediction module is used to obtain the training sample ID-matching sample and, based on the residual boosting model to be trained, perform model prediction on the training sample ID-matching sample and the intermediate training sample features, obtaining the second-party training model prediction result;
  • a calculation module configured to calculate a second-party model prediction loss based on the training sample label, the second-party initial model weight, and the second-party training model prediction result;
  • an iterative optimization module configured to iteratively optimize the residual boosting model to be trained based on the residual loss generated from the first-party model prediction loss and the second-party model prediction loss, to obtain the vertical federated residual boosting model.
  • the present application also provides a vertical federated prediction optimization device, which is a physical device including: a memory, a processor, and a program of the vertical federated prediction optimization method stored on the memory and runnable on the processor; when the program of the vertical federated prediction optimization method is executed by the processor, the steps of the above-mentioned vertical federated prediction optimization method are realized.
  • the present application also provides a vertical federated learning modeling optimization device, which is a physical device including: a memory, a processor, and a program of the vertical federated learning modeling optimization method stored on the memory and runnable on the processor; when the program of the vertical federated learning modeling optimization method is executed by the processor, the steps of the above-mentioned vertical federated learning modeling optimization method are realized.
  • the present application also provides a medium, which is a readable storage medium storing a program implementing the vertical federated prediction optimization method; when the program is executed by a processor, the steps of the above-mentioned vertical federated prediction optimization method are realized.
  • the present application also provides a medium, which is a readable storage medium storing a program implementing the vertical federated learning modeling optimization method; when the program is executed by a processor, the steps of the above-mentioned vertical federated learning modeling optimization method are realized.
  • the present application also provides a computer program product, including a computer program; when the computer program is executed by a processor, the steps of the above-mentioned vertical federated prediction optimization method are realized.
  • the present application also provides a computer program product, including a computer program; when the computer program is executed by a processor, the steps of the above-mentioned vertical federated learning modeling optimization method are realized.
  • the present application provides a vertical federated prediction optimization method, device, medium, and computer program product.
  • in the prior art, for aligned samples the predictor combines the aligned samples scattered across the other participants to perform vertical federated prediction and obtain accurate predictions, while for unaligned samples the predictor performs local prediction based only on the part of the vertical federated prediction model it holds locally.
  • this application first extracts the sample to be predicted, obtains the intermediate sample features generated by the feature extractor of the target prediction model performing feature extraction on the sample to be predicted, and obtains the first-party model prediction result generated by the target prediction model performing model prediction on the sample to be predicted, wherein the target prediction model is obtained by local iterative training of the first device, thereby realizing local sample prediction on the sample to be predicted based on the locally and iteratively trained target prediction model.
  • in this way, the first device can independently perform accurate sample prediction on the sample to be predicted based on the target prediction model as a complete model. Further, the intermediate sample features are sent to the second device, so that the second device can jointly perform model prediction on the intermediate sample features and the ID-matching sample corresponding to the sample to be predicted; this provides more decision-making basis for the second device to perform model prediction based on the vertical federated residual boosting model, and improves the accuracy of the second-party model prediction result generated by the second device. The vertical federated residual boosting model is obtained by the second device performing residual learning based on vertical federated learning with the first device, using the vertical federated common samples, the model prediction loss of the first device's target prediction model on the vertical federated common samples, the intermediate common sample features corresponding to the vertical federated common samples, and the corresponding sample labels. Then, for the ID-matching samples aligned with the sample to be predicted, the second device performs model prediction on the ID-matching samples and the intermediate sample features based on the vertical federated residual boosting model, generating more accurate residual boosting information corresponding to the sample to be predicted, that is, the second-party model prediction result.
  • Fig. 1 is a schematic flow chart of the first embodiment of the vertical federated prediction optimization method of the present application;
  • Fig. 2 is a schematic flow chart of the second embodiment of the vertical federated prediction optimization method of the present application;
  • Fig. 3 is a schematic flow chart of the first embodiment of the vertical federated learning modeling optimization method of the present application;
  • Fig. 4 is a schematic flow chart of the second embodiment of the vertical federated learning modeling optimization method of the present application;
  • Fig. 5 is a schematic diagram of the device structure of the hardware operating environment involved in the vertical federated prediction optimization method in the embodiments of the present application;
  • Fig. 6 is a schematic diagram of the device structure of the hardware operating environment involved in the vertical federated learning modeling optimization method in the embodiments of the present application.
  • an embodiment of the present application provides a vertical federated prediction optimization method; the vertical federated prediction optimization method is applied to the first device, and the vertical federated prediction optimization method includes:
  • Step S10: extract the sample to be predicted, and obtain the intermediate sample features generated by the feature extractor of the target prediction model performing feature extraction on the sample to be predicted, as well as the first-party model prediction result generated by the target prediction model performing model prediction on the sample to be predicted, wherein the target prediction model is obtained by local iterative training of the first device.
  • the vertical federated prediction optimization method is applied to a vertical federated learning scenario, and both the first device and the second device are participants in the vertical federated learning scenario.
  • the first device has samples with sample labels
  • the second device has samples without sample labels
  • the first device is the predictor, which performs the prediction task; the second device is an auxiliary data provider, which provides residual boosting information to the first device to improve the accuracy of the prediction result generated by the first device.
  • in step S10, the sample to be predicted is extracted, and feature extraction is performed on it based on the feature extractor in the target prediction model to obtain the intermediate sample features; then, based on the classifier in the target prediction model, the intermediate sample features are passed through a fully connected layer to obtain the target fully connected layer output, which is converted into the first-party model prediction result through a preset activation function, wherein the first-party model prediction result may be a classification probability.
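The first-party forward pass described above (feature extractor, fully connected classifier, preset activation function) can be sketched as follows. The patent does not specify the extractor architecture, so a single linear layer with ReLU stands in for it here, and all names and shapes are illustrative assumptions:

```python
import numpy as np

def sigmoid(z):
    """Preset activation function converting the classifier output
    into a classification probability."""
    return 1.0 / (1.0 + np.exp(-z))

def predict_first_party(x, extractor_w, classifier_w, classifier_b):
    """Feature extractor -> fully connected classifier -> activation.
    Returns both the intermediate sample features (to be sent to the
    second device) and the first-party model prediction result."""
    # Feature extraction (a linear+ReLU layer stands in for the
    # unspecified extractor network).
    intermediate = np.maximum(extractor_w @ x, 0.0)
    # Fully connected layer over the intermediate sample features.
    logit = classifier_w @ intermediate + classifier_b
    return intermediate, sigmoid(logit)

rng = np.random.default_rng(0)
x = rng.normal(size=4)                    # sample to be predicted
extractor_w = rng.normal(size=(3, 4))     # extractor parameters
classifier_w = rng.normal(size=3)         # classifier parameters
features, prob = predict_first_party(x, extractor_w, classifier_w, 0.0)
print(features.shape, 0.0 <= prob <= 1.0)  # (3,) True
```

Note that the same call yields both outputs of step S10: the intermediate sample features for the second device and the local classification probability.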
  • before the step of obtaining the intermediate sample features generated by the feature extractor of the target prediction model performing feature extraction on the sample to be predicted, and the first-party model prediction result generated by the target prediction model performing model prediction on the sample to be predicted, the vertical federated prediction optimization method also includes:
  • Step A10 obtaining the first-party initial model weights, and extracting training samples and training sample labels corresponding to the training samples;
  • the number of the training samples is at least 1
  • the training sample label is the identification of the training samples
  • the first-party initial model weight is the initial value, preset in the first device, of the model weight representing the prediction accuracy of the model.
  • the model weight can be set equal to the logarithm of the ratio of the number of samples classified correctly by the prediction model to the number of samples classified incorrectly.
  • the initial value of the model weight can be set to 1.
  • Step A20 obtaining the features of the intermediate training samples generated by the feature extractor of the target prediction model to be trained for feature extraction of the training samples;
  • feature extraction is performed on the training samples to generate intermediate training sample features.
  • Step A30 based on the training sample label, the training model prediction result corresponding to the intermediate training sample features and the first-party initial model weight, by calculating the first-party model prediction loss corresponding to the target prediction model to be trained, iteratively optimizing the target prediction model to be trained to obtain the target prediction model;
  • the iterative training process of the target prediction model to be trained includes a plurality of iterative rounds, wherein one iterative round needs to be iteratively trained based on a preset number of training samples.
  • in step A30, based on the classifier and the preset activation function in the target prediction model to be trained, the intermediate training sample features are converted into training model prediction results, and the first-party model prediction loss is calculated based on the training sample labels, the training model prediction results, and the first-party initial model weight; it is then determined whether the first-party model prediction loss has converged. If the first-party model prediction loss converges, the target prediction model to be trained is taken as the target prediction model; if it does not converge, the target prediction model to be trained is updated through a preset model optimization method based on the gradient calculated from the first-party model prediction loss, the first-party initial model weight is updated based on the training model prediction results and the training sample labels, and execution returns to the step of extracting the training samples and the corresponding training sample labels for the next iteration.
  • the preset model optimization method includes a gradient descent method and a gradient ascent method, etc., and the first-party model prediction loss is calculated based on the training sample, the training model prediction result and the first-party initial model weight
  • the loss function is as follows (the formula did not survive extraction; it is reconstructed here in the AdaBoost-style exponential form consistent with the log-ratio model weights described above):

    L_A(Θ_A, α_A, X_A, Y) = (1/N_A) · Σ_{i=1}^{N_A} exp(-α_A · y_i · Θ_A(x_{A,i}))

    where L_A(Θ_A, α_A, X_A, Y) is the first-party model prediction loss; N_A is the number of training samples in one iteration round; Θ_A is the target prediction model to be trained; α_A is the first-party initial model weight; X_A is the training sample set composed of the N_A training samples; Y is the label set composed of the training sample labels corresponding to the N_A training samples; y_i is the i-th training sample label in a round of iteration; and x_{A,i} is the feature of the i-th training sample in a round of iteration.
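The following sketch evaluates a loss of this kind. The exponential functional form is an assumption (the original formula is missing from the text), chosen to match the log-ratio model weights; margins of y_i · Θ_A(x_{A,i}) with labels in {-1, +1} are likewise assumed:

```python
import numpy as np

def first_party_loss(alpha_a, margins):
    """Assumed exponential (AdaBoost-style) first-party prediction loss.
    margins[i] = y_i * Theta_A(x_{A,i}) with labels in {-1, +1};
    alpha_a is the first-party model weight."""
    return float(np.mean(np.exp(-alpha_a * np.asarray(margins))))

# With alpha_A = 1 and both samples correctly classified at margin 1,
# the loss is exp(-1):
print(first_party_loss(1.0, [1.0, 1.0]))
```

Correctly classified samples (positive margins) shrink the loss exponentially, while misclassified samples inflate it, which is what drives the convergence check in step A30.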
  • the first device counts the number of correctly classified samples and the number of incorrectly classified samples of the target prediction model during the iterative training process, obtaining the first party's number of correctly classified samples and number of incorrectly classified samples, and then generates the first-party model weight by calculating the ratio of the two:

    α_A = log(A / B)

    where α_A is the first-party model weight, A is the number of samples classified correctly by the first party, and B is the number of samples classified incorrectly by the first party.
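The model-weight calculation is a one-liner; the sketch below follows the log-ratio rule stated in the text (the function name is an assumption):

```python
import math

def model_weight(num_correct, num_wrong):
    """Party model weight: the logarithm of the ratio of correctly
    classified samples to incorrectly classified samples."""
    return math.log(num_correct / num_wrong)

# A model that classifies 90 samples correctly and 10 incorrectly:
print(round(model_weight(90, 10), 4))  # 2.1972  (log 9)
```

A model that is right as often as it is wrong gets weight log(1) = 0, so its contribution vanishes in the weighted aggregation, while more accurate models receive larger weights.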
  • Step A40: send the training sample labels, the intermediate training sample features, and the first-party model prediction loss to the second device, so that the second device can calculate the second-party model prediction loss based on the residual boosting model to be trained, the training sample ID-matching samples corresponding to the training samples, the intermediate training sample features, the sample labels, and the obtained second-party initial model weight, and, based on the residual loss calculated from the second-party model prediction loss and the first-party model prediction loss, optimize the residual boosting model to be trained to obtain the vertical federated residual boosting model.
  • the second-party initial model weight is the initial value, preset in the second device, of the model weight representing the prediction accuracy of the model, where the model weight can be set equal to the logarithm of the ratio of the number of samples the prediction model classifies correctly to the number it classifies incorrectly.
  • the initial value of the model weight can be set to 1.
  • in step A40, during the iterative training of the target prediction model, the training sample label corresponding to each training sample, the corresponding first-party model prediction loss, the corresponding training sample, and the corresponding intermediate training sample features are sent to the second device; the second device then extracts the training sample ID and, based on it, searches for the corresponding training sample ID-matching sample, inputs the training sample ID-matching sample into the residual boosting model to be trained for model prediction, and obtains the second-party training model prediction result corresponding to the training sample ID-matching sample.
  • the second device splices (concatenates) the training sample ID-matching sample and the corresponding intermediate training sample features so as to perform feature enhancement on the training sample ID-matching sample, obtaining a training feature-enhanced sample; model prediction is then performed on the training feature-enhanced sample based on the residual boosting model to be trained to obtain the second-party training model prediction result. Further, the second-party model prediction loss is calculated based on the second-party training model prediction result corresponding to the training sample ID-matching sample, the corresponding training sample label, and the obtained second-party initial model weight.
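The feature-enhancement step is a plain concatenation of the second party's local features with the first party's intermediate features; a minimal sketch (function name assumed):

```python
import numpy as np

def enhance_features(id_matching_sample, intermediate_features):
    """Splice (concatenate) the second party's ID-matching sample with
    the intermediate features received from the first device, producing
    the feature-enhanced sample fed to the residual boosting model."""
    return np.concatenate([np.asarray(id_matching_sample),
                           np.asarray(intermediate_features)])

# 2 local second-party features + 3 intermediate features from the
# first party -> a 5-dimensional feature-enhanced sample.
enhanced = enhance_features([0.2, 0.7], [1.5, -0.3, 0.9])
print(enhanced.shape)  # (5,)
```

This is what gives the second device "more decision-making basis": its model sees both its own features and a learned summary of the first party's features, without the raw first-party data ever leaving the first device.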
  • the process of calculating the prediction loss of the second-party model may specifically refer to the process of calculating the prediction loss of the first-party model by the first device, which will not be repeated here. Furthermore, based on the predicted loss of the first-party model and the predicted loss of the second-party model, the residual loss is calculated, and whether the residual loss is converged is judged.
  • if the residual loss converges, the residual boosting model to be trained is taken as the vertical federated residual boosting model; if the residual loss does not converge, the residual boosting model to be trained is updated based on the gradient calculated from the residual loss, the second-party initial model weight is updated based on the second-party training model prediction result corresponding to the training sample ID-matching sample and the corresponding sample label, and execution returns to the step in which the second device extracts the training sample ID for the next iteration.
  • the loss function by which the second device calculates the residual loss is as follows (the formula did not survive extraction; it is reconstructed here in the same exponential form, with the first-party per-sample prediction loss acting as the sample weight, in the usual boosting manner):

    L(Θ_B, α_B, X_B, Y) = (1/N_C) · Σ_{i=1}^{N_C} ℓ_{A,i} · exp(-α_B · y_i · Θ_B(x_{B,i}))

    where L(Θ_B, α_B, X_B, Y) is the residual loss; N_C is the number of training sample ID-matching samples in one iteration round; Θ_B is the residual boosting model to be trained; α_B is the second-party initial model weight; X_B is the training sample set composed of the N_C training sample ID-matching samples; Y is the label set composed of the training sample labels corresponding to the N_C training sample ID-matching samples; ℓ_{A,i} is the first-party model prediction loss corresponding to the i-th training sample ID-matching sample; and x_{B,i} is the i-th training sample ID-matching sample spliced with the intermediate training sample features corresponding to the i-th training sample ID in a round of iteration.
  • the second device counts the number of correctly classified samples and the number of incorrectly classified samples during the iterative training of the vertical federated residual boosting model, obtaining the second party's numbers of correctly and incorrectly classified samples, and then generates the second-party model weight by calculating the ratio of the two.
  • the specific process by which the second device generates the second-party model weight may refer to the above-mentioned specific process by which the first device generates the first-party model weight, and will not be repeated here.
  • Step S20: send the intermediate sample features to the second device, so that the second device can jointly perform model prediction on the intermediate sample features and the ID-matching sample corresponding to the sample to be predicted based on the vertical federated residual boosting model, obtaining the second-party model prediction result.
  • the vertical federated residual boosting model is obtained by the second device performing residual learning based on vertical federated learning with the first device, using the vertical federated common samples, the model prediction loss of the first device's target prediction model on the vertical federated common samples, the intermediate common sample features corresponding to the vertical federated common samples, and the corresponding sample labels.
  • the specific process by which the second device combines the model prediction loss of the first device's target prediction model on the vertical federated common samples, the intermediate common sample features corresponding to the vertical federated common samples, and the corresponding sample labels to perform residual learning based on vertical federated learning with the first device and obtain the vertical federated residual boosting model can refer to steps A10 to A30, and will not be repeated here.
• the vertical federated common sample is a sample in the second device that is ID-aligned with the first device, that is, a training-sample-ID matching sample corresponding to a training sample in the first device.
• the intermediate sample features are sent to the second device; the second device then splices the ID matching sample corresponding to the sample to be predicted with the corresponding intermediate sample features, so as to perform feature enhancement on the ID matching sample and obtain a feature-enhanced sample, inputs the feature-enhanced sample into the vertical federated residual boosting model to perform model prediction on it, and obtains the second-party model prediction result; the second device then sends the second-party model prediction result and the second-party model weight corresponding to the vertical federated residual boosting model to the first device.
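As a hedged sketch of the second device's side of step S20 (the linear stand-in model, function names, and dimensions are illustrative and not taken from the specification), the feature-enhancement-by-splicing followed by residual-model prediction can be written as:

```python
import numpy as np

def second_party_predict(id_matching_sample, intermediate_features, residual_model):
    # Feature enhancement: splice the ID matching sample with the first
    # party's intermediate sample features, then run the residual
    # boosting model on the enhanced sample.
    enhanced = np.concatenate([id_matching_sample, intermediate_features])
    return residual_model(enhanced)

# Stand-in for the trained vertical federated residual boosting model:
# a fixed linear layer followed by a sigmoid, producing a score in (0, 1).
rng = np.random.default_rng(0)
w = rng.normal(size=8)
residual_model = lambda x: float(1.0 / (1.0 + np.exp(-(w @ x))))

pred = second_party_predict(np.ones(5), np.zeros(3), residual_model)
```

The concatenation is the plain splicing described above; the weighted splicing variant mentioned later in the text would simply scale the two parts before concatenating.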
  • the longitudinal federated prediction optimization method further includes:
  • Step B10 sending the to-be-predicted sample ID corresponding to the to-be-predicted sample to the second device, so that the second device can search for an ID matching sample corresponding to the to-be-predicted sample ID;
  • sample ID to be predicted is a sample ID of the sample to be predicted.
  • Step B20 if the search failure information sent by the second device is received, taking the prediction result of the first-party model as the target prediction result;
• specifically, if the search failure information sent by the second device is received, this proves that the sample to be predicted is not an aligned sample between the first device and the second device, so the first-party model prediction result is used as the target prediction result; this achieves the purpose of independently performing sample prediction based on the target prediction model as a complete model.
  • Step B30 if the search failure information sent by the second device is not received, perform a step of sending the intermediate sample features to the second device.
• the step of sending the intermediate sample features to the second device is performed, so as to obtain the higher-accuracy residual boosting information calculated by the second device, where the residual boosting information is the second-party model prediction result output by the vertical federated residual boosting model in the second device.
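The branching logic of steps B10 to B30 can be sketched as follows; the dictionary lookup stands in for the second device's ID search, and all names are illustrative rather than from the specification:

```python
def target_prediction(sample_id, first_party_result, aligned_samples):
    # Step B10: send the to-be-predicted sample ID to the second device
    # (modelled here as a dictionary lookup standing in for the search).
    match = aligned_samples.get(sample_id)
    if match is None:
        # Step B20: search failure -- the sample is unaligned, so the
        # local complete model's result is the target prediction result.
        return first_party_result
    # Step B30: an aligned sample exists -- continue to step S20 and
    # send the intermediate sample features to the second device.
    return ("send_intermediate_features", match)
```

In the unaligned case the first device never blocks on the second device, which is exactly the availability property the embodiment claims.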
  • Step S30 obtaining the weight of the first-party model corresponding to the target prediction model, and receiving the prediction result of the second-party model sent by the second device and the weight of the second-party model corresponding to the vertical federated residual promotion model;
  • the weight of the first-party model is determined by the first device by calculating the number of correct first-party classification samples and the number of wrong first-party classification samples in the iterative training process of the target prediction model. The ratio is obtained.
  • Step S40 based on the weight of the first-party model and the weight of the second-party model, weighted aggregation is performed on the prediction result of the first-party model and the prediction result of the second-party model to obtain a target federated prediction result.
• the first-party model prediction result and the second-party model prediction result are weighted and aggregated through a preset aggregation rule to obtain the target federated prediction result.
  • the preset aggregation rules include summing and averaging, etc., thereby achieving the purpose of improving the accuracy of the sample prediction performed by the first device on the samples to be predicted by using the residual improvement information generated by the second device.
• the residual boosting information is generated by the vertical federated residual boosting model based on the feature-enhanced sample, and the feature-enhanced sample is generated by splicing the ID matching sample corresponding to the sample to be predicted with the corresponding intermediate sample features. Feature enhancement of the ID matching sample based on the intermediate sample features in the first device is thereby realized, so that the input to the vertical federated residual boosting model carries more feature information, the model has more decision-making basis when generating residual boosting information, and it can therefore output more accurate residual boosting information. On the basis of the accuracy of the sample prediction the second device performs on the aligned samples to be predicted, the accuracy of the sample prediction the first device performs on aligned samples is further improved.
• the target prediction model can be set as a binary classification model used as a recommendation model; that is, by performing binary classification on the sample to be predicted, it is determined whether to recommend the item corresponding to the sample to be predicted to the user. Because the embodiment of the present application, in the vertical federated prediction scenario, achieves both higher-accuracy prediction for aligned samples based on the residual boosting information and accurate local prediction for unaligned samples based on the complete model, the overall sample prediction accuracy of the vertical federated prediction is improved, and the overall recommendation accuracy of the recommendation model is improved accordingly.
• in the existing vertical federated prediction model, since the participants of joint vertical federated learning are all required to take part in model prediction, once data loss or transmission downtime occurs at any participant, sample prediction cannot be performed based on a complete model and complete sample data, which affects the accuracy of the sample prediction.
• because the target prediction model is held independently by the first device, even if data loss or downtime occurs on the second device, the first device can still rely on the target prediction model, as a complete model, to independently perform sample prediction on the samples to be predicted, thereby improving the accuracy of sample prediction when data loss or downtime occurs at a participant of vertical federated learning.
• the embodiment of the present application provides an optimization method for vertical federated prediction. Compared with the prior-art means in which, for unaligned samples in the vertical federated prediction scenario, the predictor performs local prediction based only on the partial vertical federated prediction model it holds locally, the embodiment of the present application first extracts the sample to be predicted, and obtains both the intermediate sample features generated by the feature extractor of the target prediction model performing feature extraction on the sample to be predicted, and the first-party model prediction result generated by the target prediction model performing model prediction on the sample to be predicted, wherein the target prediction model is obtained by local iterative training on the first device; the purpose of locally performing sample prediction based on the locally and iteratively trained target prediction model is thereby realized.
  • the first device can independently perform accurate sample prediction on the sample to be predicted based on the target prediction model as a complete model.
• the intermediate sample features are sent to the second device, so that the second device can jointly perform model prediction on the intermediate sample features and the ID matching sample corresponding to the sample to be predicted based on the vertical federated residual boosting model; this provides more decision-making basis for the second device's model prediction based on that model and improves the accuracy of the second-party model prediction result the second device generates.
• because the vertical federated residual boosting model is obtained by the second device performing residual learning, based on vertical federated learning with the first device, on the vertical federated common samples, combining the model prediction loss of the target prediction model in the first device on those samples, the intermediate common sample features corresponding to them, and the corresponding sample labels, the target federated prediction result can be obtained, for the ID matching sample aligned with the sample to be predicted, by weighted aggregation of the first-party model prediction result and the second-party model prediction result.
• therefore, if the sample to be predicted is an aligned sample between the first device and the second device, the first device can use the more accurate residual boosting information (the second-party model prediction result) that the second device generates, based on the vertical federated residual boosting model, from the ID matching sample aligned with the sample to be predicted and the intermediate sample features, to boost the accuracy of the first-party model prediction result output by the target prediction model. This realizes vertical federated prediction based on more accurate residual boosting information for aligned samples, and accurate local prediction based on the complete model for unaligned samples, overcoming the technical defect that the predictor, while making more accurate federated predictions on aligned samples, cannot perform sample prediction based on a complete model for unaligned samples, which lowers the overall sample prediction accuracy; the overall sample prediction accuracy of vertical federated prediction is thereby improved.
  • the vertical federated prediction optimization method is applied to the second device, and the vertical federated prediction optimization method includes:
  • Step C10 receiving the intermediate sample feature sent by the first device, and searching for an ID matching sample corresponding to the intermediate sample feature;
• the intermediate sample features are generated by the first device based on the feature extractor in the target prediction model for the sample to be predicted, and the ID matching sample is the sample in the second device whose sample ID matches that of the sample to be predicted.
• in step C10, the second device receives the intermediate sample features, which the first device generated by performing feature extraction on the sample to be predicted with the feature extractor of the target prediction model, together with the to-be-predicted sample ID corresponding to the sample to be predicted sent by the first device, and then searches for an ID matching sample according to the to-be-predicted sample ID.
  • the longitudinal federated prediction optimization method further includes:
  • Step D10 if the search is successful, perform the step of: based on the vertical federated residual promotion model, jointly perform model prediction on the ID matching sample and the intermediate sample characteristics, and obtain the second-party model prediction result;
• if the search succeeds, the second device has an aligned sample corresponding to the sample to be predicted, that is, an ID matching sample, and then performs the step of jointly performing model prediction on the ID matching sample and the intermediate sample features based on the vertical federated residual boosting model to obtain the second-party model prediction result; based on the vertical federated residual boosting model, the ID matching sample corresponding to the sample to be predicted, and the corresponding intermediate sample features, higher-accuracy residual boosting information corresponding to the sample to be predicted is generated.
• the residual boosting information is the second-party model prediction result; it is then sent to the first device, so that the first device can rely on this higher-accuracy residual boosting information to improve the accuracy of the sample prediction result the first device produces for the sample to be predicted.
• Step D20, if the search fails, feeding back search failure information to the first device, so that the first device, after receiving the search failure information, takes the first-party model prediction result generated for the sample to be predicted based on the target prediction model as the target prediction result.
• if the search fails, this proves that the second device does not have an aligned sample corresponding to the sample to be predicted; search failure information is then fed back to the first device, and the first device, after receiving it, can directly take the first-party model prediction result generated for the sample to be predicted based on the target prediction model as the target prediction result, thereby achieving the purpose of independently performing sample prediction on unaligned samples.
  • Step C20 based on the vertical federated residual promotion model, jointly perform model prediction on the ID matching samples and the intermediate sample features, and obtain a second-party model prediction result;
• the vertical federated residual boosting model is obtained by the second device performing residual learning, based on vertical federated learning with the first device, on the vertical federated common samples, combining the model prediction loss of the target prediction model in the first device on the vertical federated common samples, the intermediate common sample features corresponding to the vertical federated common samples, and the corresponding sample labels.
• the specific process by which the second device performs this residual learning with the first device to obtain the vertical federated residual boosting model can refer to the content in steps A10 to A40, and will not be repeated here.
• in step C20, feature enhancement is performed on the ID matching sample based on the intermediate sample features to obtain a feature-enhanced sample, and model prediction is then performed by inputting the feature-enhanced sample into the vertical federated residual boosting model, generating the second-party model prediction result.
• the step of jointly performing model prediction on the ID matching sample and the intermediate sample features based on the vertical federated residual boosting model to obtain the second-party model prediction result includes:
  • Step C21 splicing the features of the ID matching samples and the intermediate samples to obtain feature-enhanced samples
  • the ID matching samples and the intermediate sample features are concatenated to perform feature enhancement on the ID matching samples based on the intermediate sample features to obtain feature enhanced samples.
  • the ID matching samples may be weighted and concatenated with the features of the intermediate samples, so as to perform feature enhancement on the ID matching samples based on the features of the intermediate samples, to obtain feature-enhanced samples.
  • Step C22 based on the longitudinal federated residual promotion model, perform model prediction on the feature enhanced samples, and obtain the prediction result of the second party model.
  • step C22 data processing is performed on the feature-enhanced samples by inputting the feature-enhanced samples into the longitudinal federated residual boosting model.
• the data processing includes convolution, pooling, full connection, and the like; the fully connected layer output produced by the last fully connected layer in the vertical federated residual boosting model is then obtained, and converted into the second-party model prediction result through a preset activation function.
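The convolution, pooling, full connection, and activation pipeline of steps C21 and C22 can be sketched numerically as follows. This is a toy stand-in, not the patented network: the kernel, pooling stride, fully connected weights, and the choice of sigmoid as the preset activation are all illustrative assumptions.

```python
import numpy as np

def forward(x, conv_kernel, fc_weights):
    # Sketch of the residual boosting model's forward pass on a
    # feature-enhanced sample: convolution, pooling, full connection,
    # then a preset activation (sigmoid) on the last FC output.
    conv = np.convolve(x, conv_kernel, mode="valid")   # convolution (length 8 here)
    pooled = conv.reshape(-1, 2).max(axis=1)           # max pooling, stride 2
    fc_out = fc_weights @ pooled                       # fully connected layer output
    return 1.0 / (1.0 + np.exp(-fc_out))               # activation -> prediction

x = np.arange(10.0)                     # feature-enhanced sample (length 10)
conv_kernel = np.array([0.5, -0.5, 0.25])
pred = forward(x, conv_kernel, np.ones(4) * 0.1)
```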
• before the step of jointly performing model prediction on the ID matching sample and the intermediate sample features based on the vertical federated residual boosting model to obtain the second-party model prediction result, the vertical federated prediction optimization method further includes:
• Step E10, obtaining the second-party initial model weight, and receiving the intermediate training sample features, the training sample labels, and the first-party model prediction loss sent by the first device, wherein the first-party model prediction loss is calculated by the first device based on the training sample label and the first-party model prediction result produced by the target prediction model on the training sample corresponding to the training-sample-ID matching sample, and the intermediate training sample features are generated by the first device performing feature extraction on the training sample based on the feature extractor of the target prediction model;
  • the specific process for the first device to generate the prediction loss of the first-party model and the characteristics of the intermediate training samples can refer to the specific content in the above step A10 to step A30, which will not be repeated here.
• during the iterative training of the target prediction model, the first device stores the training sample IDs corresponding to all training samples, and sends the corresponding training sample labels, first-party model prediction losses, and intermediate training sample features to the second device.
  • Step E20 obtaining training sample ID matching samples, and based on the residual improvement model to be trained, performing model prediction on the training sample ID matching samples and the characteristics of the intermediate training samples, and obtaining the prediction result of the second-party training model;
• a training-sample-ID matching sample is extracted and spliced with the intermediate training sample features, so as to perform feature enhancement on the training-sample-ID matching sample based on the intermediate training sample features and obtain a training feature-enhanced sample; the training feature-enhanced sample is then input into the residual boosting model to be trained to perform model prediction, obtaining the second-party training model prediction result.
  • Step E30 calculating the second-party model prediction loss based on the training sample label, the second-party initial model weight and the second-party training model prediction result;
• the second device calculates the second-party model prediction loss based on the training sample label, the second-party initial model weight, and the second-party training model prediction result, with a loss function of the form L(Θ_B, α_B, X_B, Y), where:
• L(Θ_B, α_B, X_B, Y) is the residual loss;
• N_C is the number of training-sample-ID matching samples in one iteration;
• Θ_B is the residual boosting model to be trained;
• α_B is the second-party initial model weight;
• X_B is the training sample set composed of the N_C training-sample-ID matching samples;
• Y is the label set composed of the training sample labels corresponding to the N_C training-sample-ID matching samples;
• the remaining symbols denote the first-party model prediction loss corresponding to the i-th of the N_C training-sample-ID matching samples and the intermediate training sample features corresponding to the i-th training-sample-ID matching sample in one round of iterations.
• Step E40, iteratively optimizing the residual boosting model to be trained based on the residual loss generated from the first-party model prediction loss and the second-party model prediction loss, to obtain the vertical federated residual boosting model.
• the residual loss is calculated based on the first-party model prediction loss and the second-party model prediction loss, and it is then judged whether the residual loss has converged. If the residual loss has converged, the residual boosting model to be trained is taken as the vertical federated residual boosting model; if the residual loss has not converged, the residual boosting model to be trained is updated through a preset model optimization method based on the gradient calculated from the residual loss, the second-party initial model weight is updated based on the second-party training model prediction results corresponding to the training-sample-ID matching samples and the corresponding training sample labels, and execution returns to the step of obtaining a training-sample-ID matching sample for the next iteration.
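Since the specification does not reproduce the exact form of L(Θ_B, α_B, X_B, Y), the following is only a hedged sketch of steps E20 to E40 under explicit assumptions: a linear model stands in for Θ_B, the residual character of the loss is modelled by weighting each sample's squared error by the first-party prediction loss (so samples the first party got wrong dominate), and gradient descent is the preset optimization method.

```python
import numpy as np

def train_residual_model(X_B, y, first_party_loss, feats, lr=0.1, tol=1e-4, max_iter=500):
    # Step E20: training feature enhancement by splicing X_B with the
    # intermediate training sample features received from the first device.
    Z = np.hstack([X_B, feats])
    theta = np.zeros(Z.shape[1])
    prev = np.inf
    for _ in range(max_iter):
        p = 1.0 / (1.0 + np.exp(-(Z @ theta)))      # second-party training prediction
        w = first_party_loss                         # residual weighting (assumed form)
        loss = np.mean(w * ((p - y) ** 2))           # step E30: residual-style loss
        if abs(prev - loss) < tol:                   # step E40: convergence check
            break
        grad = Z.T @ (w * (p - y) * p * (1 - p)) / len(y)
        theta -= lr * grad                           # preset optimization: gradient descent
        prev = loss
    return theta, loss

# Toy data standing in for the aligned training batch.
rng = np.random.default_rng(1)
X_B = rng.normal(size=(32, 3))
feats = rng.normal(size=(32, 2))
y = (rng.random(32) > 0.5).astype(float)
fp_loss = rng.random(32) + 0.1
theta, loss = train_residual_model(X_B, y, fp_loss, feats)
```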
  • the specific process of calculating the residual loss based on the predicted loss of the first-party model and the predicted loss of the second-party model can refer to the specific content in the above-mentioned step A40, which will not be repeated here.
• the second device counts the numbers of correctly and incorrectly classified samples during the iterative training of the residual boosting model to be trained, obtaining the second-party correctly-classified sample count and the second-party incorrectly-classified sample count, and then generates the second-party model weight by calculating the ratio of the former to the latter.
  • the specific process of generating the weight of the second-party model by the second device may refer to the specific process of generating the weight of the first-party model by the first device, which will not be repeated here.
• Step C30, obtaining the second-party model weight corresponding to the vertical federated residual boosting model, and sending the second-party model prediction result and the second-party model weight to the first device, so that the first device can generate the target federated prediction result based on the first-party model prediction result it generates, using the target prediction model, for the sample to be predicted corresponding to the ID matching sample, the first-party model weight corresponding to the target prediction model, the second-party model prediction result, and the second-party model weight, wherein the target prediction model is obtained by local iterative training on the first device.
• the specific process by which the first device generates the first-party model prediction result for the sample to be predicted corresponding to the ID matching sample based on the target prediction model can refer to the specific steps in step S10, and will not be repeated here.
• the first device performs weighted aggregation on the first-party model prediction result and the second-party model prediction result through a preset aggregation rule to obtain the target federated prediction result, thereby using the higher-accuracy residual boosting information generated by the second device to optimize the first-party model prediction result for the sample to be predicted in the first device, so as to improve the accuracy of the first device's sample prediction result for the sample to be predicted.
• the embodiment of the present application provides an optimization method for vertical federated prediction. Compared with the prior-art means in which, for unaligned samples in the vertical federated prediction scenario, the predictor performs local prediction based only on the partial vertical federated prediction model it holds locally, the embodiment of the present application first receives the intermediate sample features sent by the first device, searches for the ID matching sample corresponding to the intermediate sample features, and jointly performs model prediction on the ID matching sample and the intermediate sample features based on the vertical federated residual boosting model to obtain the second-party model prediction result; the purpose of using the intermediate sample features sent by the first device to generate higher-accuracy residual boosting information corresponding to the sample to be predicted is thereby achieved.
• the target prediction model is obtained by local iterative training on the first device. For aligned samples, the higher-accuracy residual boosting information generated by the second device is used to optimize the first-party model prediction result generated by the first device, producing a target federated prediction result with higher sample prediction accuracy; and since the target prediction model is obtained by local iterative training on the first device, for unaligned samples the first device can also independently complete sample prediction locally based on the target prediction model as a complete model. This overcomes the technical defect that the predictor, while making higher-accuracy federated predictions on aligned samples, cannot perform sample prediction based on a complete model for unaligned samples, which lowers the overall sample prediction accuracy; the overall sample prediction accuracy of the vertical federated prediction is thereby improved.
• a vertical federated learning modeling optimization method is provided; the vertical federated learning modeling optimization method is applied to the first device and includes:
  • Step F10 obtaining the first-party initial model weights, and extracting training samples and training sample labels corresponding to the training samples;
  • the modeling and optimization method for vertical federated learning is applied to a vertical federated learning scenario, and both the first device and the second device are participants in the vertical federated learning scenario.
  • the first device has samples with sample labels
  • the second device has samples without sample labels
• the first device is a predictor, used to construct a prediction model; the second device is an auxiliary data provider, used to construct a vertical federated residual boosting model that provides residual boosting information to the first device, so as to improve the accuracy of the prediction results generated by the prediction model in the first device.
  • the number of training samples is at least 1
• the training sample label is the ground-truth label identifying the training sample
• the first-party initial model weight is a preset weight in the first device characterizing the prediction capability of the prediction model; the model weight can be set equal to the logarithm of the ratio of the number of samples the prediction model classifies correctly to the number it classifies incorrectly.
• the initial value of the model weight can be set to 1.
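The model weight rule just described (logarithm of the correct-to-incorrect ratio, initialised to 1) can be sketched as follows; the zero-count guard is an assumption added so the logarithm is always defined:

```python
import math

def party_model_weight(n_correct, n_wrong):
    # Model weight per the description above: log of the ratio of
    # correctly to incorrectly classified samples. Before any samples
    # are counted (and, as an assumed guard, whenever either count is
    # zero) the weight falls back to its initial value of 1.
    if n_correct == 0 or n_wrong == 0:
        return 1.0
    return math.log(n_correct / n_wrong)

w_first = party_model_weight(90, 10)  # a model that is right 9x as often as wrong
```

The second device computes its weight the same way from its own classification counts, which is why the two weights are directly comparable in the step S40 aggregation.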
  • Step F20 obtaining the features of the intermediate training samples generated by the feature extractor of the target prediction model to be trained for feature extraction of the training samples
  • feature extraction is performed on the training samples to obtain features of the intermediate training samples.
  • Step F30 based on the training sample label, the training model prediction result corresponding to the intermediate training sample features and the first-party initial model weight, by calculating the first-party model prediction loss corresponding to the target prediction model to be trained, iteratively optimizing the target prediction model to be trained to obtain the target prediction model;
• in step F30, the intermediate training sample features are fully connected through the classifier in the target prediction model to be trained to obtain the training fully connected layer output, the training fully connected layer output is converted into the training model prediction result, and the first-party model prediction loss is then calculated based on the training sample label, the training model prediction result, and the first-party initial model weight.
• it is then judged whether the first-party model prediction loss has converged. If it has converged, the target prediction model to be trained is taken as the target prediction model; if it has not converged, the target prediction model to be trained is updated through a preset model optimization method based on the gradient calculated from the first-party model prediction loss, the first-party initial model weight is updated, and execution returns to the step of extracting the training samples and the training sample labels corresponding to the training samples for the next iteration.
  • the preset model optimization method includes a gradient descent method and a gradient ascent method, etc.
• the calculation process of the first-party model prediction loss, based on the training sample label, the training model prediction result, and the first-party initial model weight, can refer to the specific content in steps A10 to A30, and will not be repeated here.
  • the step of obtaining the target prediction model includes:
  • Step F31 based on the classifier in the target prediction model to be trained, converting the features of the intermediate training samples into prediction results of the training model;
• the intermediate training sample features are fully connected to obtain the training fully connected layer output, and the training fully connected layer output is then converted into the training model prediction result.
  • Step F32 calculating the first-party model prediction loss based on the training sample label, the training model prediction result and the first-party initial model weight
• for the specific calculation process of step F32, reference may be made to the content in step A30, which will not be repeated here.
  • Step F33 based on the prediction result of the training model and the label of the training sample, updating the weight of the first-party initial model
• in step F33, based on the training model prediction results and the training sample labels, the current incorrectly-classified sample count and the current correctly-classified sample count corresponding to the target prediction model to be trained are updated, and the first-party initial model weight is then recalculated by computing the ratio of the current correctly-classified sample count to the current incorrectly-classified sample count.
  • for the process of recalculating the first-party initial model weight, please refer to the specific process by which the first device calculates the first-party model weight after step A30, which will not be repeated here.
  • Step F34 based on the first-party model prediction loss and the updated first-party initial model weight, iteratively optimize the target prediction model to be trained to obtain the target prediction model.
  • in step F34, it is judged whether the first-party model prediction loss has converged; if it has converged, the target prediction model to be trained is used as the target prediction model, and the updated first-party initial model weight is used as the first-party model weight; if the first-party model prediction loss has not converged, the target prediction model to be trained is updated through a preset model optimization method based on the gradient calculated from the first-party model prediction loss, and execution returns to the step of extracting the training sample and the corresponding training sample label, performing the next iteration based on the updated target prediction model to be trained and the updated first-party initial model weight.
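The convergence-checked training loop of step F34 can be sketched roughly as follows. The squared-error loss, linear model, learning rate, and tolerance are illustrative stand-ins for the first-party model prediction loss and the preset model optimization method (plain gradient descent here); none of these choices are fixed by the patent:

```python
import numpy as np

def train_until_convergence(X, y, lr=0.1, tol=1e-6, max_iters=1000):
    """Iteratively optimize model weights by gradient descent (a preset
    model optimization method), stopping once the prediction loss has
    converged, i.e. stops changing beyond a tolerance."""
    w = np.zeros(X.shape[1])
    prev_loss = np.inf
    for _ in range(max_iters):
        pred = X @ w
        loss = np.mean((pred - y) ** 2)
        if abs(prev_loss - loss) < tol:       # prediction loss is convergent:
            break                             # keep the current model
        grad = 2 * X.T @ (pred - y) / len(y)  # gradient calculated from the loss
        w -= lr * grad                        # update model, next iteration
        prev_loss = loss
    return w, loss

# Tiny hypothetical dataset exactly fit by w = [1, 2].
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([1.0, 2.0, 3.0])
w, final_loss = train_until_convergence(X, y)
```

The same skeleton applies to the second-device residual model loop in step R40, with the residual loss in place of the first-party prediction loss.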
  • the longitudinal federated learning modeling optimization method further includes:
  • Step G10 obtaining the number of correct first-party classification samples and the corresponding number of wrong first-party classification samples corresponding to the target prediction model
  • the number of correct samples classified by the first party is the number of training samples whose output classification labels of the training samples are consistent with the corresponding training sample labels in the iterative training process of the target prediction model
  • the number of first-party misclassified samples is the number of training samples whose output classification labels of the training samples are inconsistent with the corresponding training sample labels during the iterative training process of the target prediction model.
  • Step G20 generating a first-party model weight by calculating the ratio of the number of samples correctly classified by the first party to the number of samples incorrectly classified by the first party.
  • the specific calculation formula of step G20 is as follows:
  • α_A = A / B
  • where α_A is the first-party model weight, A is the number of samples correctly classified by the first party, and B is the number of samples incorrectly classified by the first party.
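The weight computation of step G20, the ratio of correctly classified to misclassified training samples, is a one-liner; the guard against zero misclassifications is an added assumption, since the patent does not state how an empty denominator is handled:

```python
def model_weight(num_correct, num_wrong):
    """Party model weight as the ratio of correctly classified training
    samples to misclassified training samples (step G20 for the first
    party, step T20 for the second party)."""
    if num_wrong == 0:
        # Assumption: undefined in the patent; fail loudly rather than divide by zero.
        raise ValueError("no misclassified samples: weight is undefined")
    return num_correct / num_wrong

# Hypothetical counts: 80 correct, 20 wrong.
alpha_a = model_weight(num_correct=80, num_wrong=20)
```

The same function serves for the second-party weight of step T20.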
  • Step F40 sending the training sample labels, the intermediate training sample features, and the first-party model prediction loss to the second device, so that the second device can calculate the second-party model prediction loss and optimize the residual promotion model to be trained based on the residual loss calculated from the second-party model prediction loss and the first-party model prediction loss, obtaining a vertical federated residual promotion model.
  • the second-party model prediction loss is calculated based on the residual promotion model to be trained, the training sample ID matching sample corresponding to the training sample, the intermediate training sample features, the sample label, and the obtained second-party initial model weight.
  • the training sample labels corresponding to all training samples, the corresponding first-party model prediction loss, the corresponding training sample IDs and the corresponding intermediate training sample features are sent to the second device; the second device then extracts each training sample ID, searches for the training sample ID matching sample corresponding to that ID, splices the training sample ID matching sample with the corresponding intermediate training sample features to obtain a training feature enhancement sample, and inputs the training feature enhancement sample into the residual improvement model to be trained to perform model prediction and obtain the second-party training model prediction result.
  • the second-party model prediction loss is calculated, and then the second device calculates the residual loss based on the first-party model prediction loss and the second-party model prediction loss, and iteratively optimizes the residual boosting model to be trained based on the residual loss to obtain the vertical federated residual boosting model.
  • the specific process of the second device constructing the vertical federated residual promotion model can refer to the specific content in step A10 to step A40, which will not be repeated here.
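The patent defers the exact residual-loss formula to step A40, which is not reproduced here. One plausible residual-boosting formulation, stated purely as an assumption, is that the second-party model is fitted to the per-sample residual left by the first-party model, with a squared-error loss:

```python
import numpy as np

def residual_loss(first_party_residuals, second_party_preds):
    """Residual loss for the residual promotion model to be trained. In
    residual boosting, the second-party model predicts what the
    first-party model missed, so its training target is the first-party
    per-sample residual. The squared-error form is an illustrative
    assumption; the patent defers the exact formula to step A40."""
    return float(np.mean((second_party_preds - first_party_residuals) ** 2))

# Hypothetical per-sample residuals left by the first-party model, and the
# second-party model's attempt to predict them.
loss = residual_loss(np.array([0.4, -0.2, 0.1]),
                     np.array([0.4, -0.2, 0.0]))
```

A residual loss near zero means the second-party model has captured almost everything the first-party model missed, which is the convergence condition checked in step R40.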
  • after the training sample labels, the intermediate training sample features and the first-party model prediction loss are sent to the second device, so that the second device can calculate the second-party model prediction loss based on the residual promotion model to be trained, the training sample ID matching sample corresponding to the training sample, the intermediate training sample features and the sample label, and can optimize the residual promotion model to be trained based on the residual loss calculated from the second-party model prediction loss and the first-party model prediction loss to obtain the vertical federated residual promotion model, the longitudinal federated learning modeling optimization method also includes:
  • Step H10 extract the sample to be predicted, and obtain the intermediate sample features generated by the feature extractor of the target prediction model performing feature extraction on the sample to be predicted, as well as the first-party model prediction result generated by the target prediction model performing model prediction on the sample to be predicted;
  • the samples to be predicted are extracted, and feature extraction is performed on them based on the feature extractor in the target prediction model to obtain the intermediate sample features; then, based on the classifier in the target prediction model, the intermediate sample features are fully connected to obtain the target fully connected layer output, and the target fully connected layer output is converted into the first-party model prediction result through a preset activation function, wherein the first-party model prediction result may be a classification probability.
  • Step H20 sending the intermediate sample feature to the second device, so that the second device can jointly perform model prediction on the intermediate sample feature and the ID matching sample corresponding to the sample to be predicted based on the longitudinal federated residual boosting model , to obtain the prediction result of the second-party model;
  • the intermediate sample features are sent to the second device, and then the second device splices the ID matching sample corresponding to the sample to be predicted and the corresponding intermediate sample features, so that the ID matching sample Perform feature enhancement to obtain feature-enhanced samples, and then input the feature-enhanced samples corresponding to the samples to be predicted into the vertical federated residual promotion model, perform model prediction on the feature-enhanced samples, and obtain second-party model prediction results, and then the second device Sending the prediction result of the second-party model and the weight of the second-party model corresponding to the vertical federated residual promotion model to the first device.
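The feature-enhancement step above, splicing the ID matching sample held by the second device with the intermediate sample features received from the first device, amounts to feature concatenation, expanding the feature dimension seen by the residual promotion model. The dimensions and values below are hypothetical:

```python
import numpy as np

def feature_enhance(id_matched_sample, intermediate_sample_features):
    """Feature enhancement: splice (concatenate) the second party's
    ID-matched sample features with the intermediate sample features sent
    by the first device, before the vertical federated residual promotion
    model runs its prediction on the enhanced sample."""
    return np.concatenate([id_matched_sample, intermediate_sample_features])

# Hypothetical dimensions: 2 local second-party features, 3 intermediate features.
enhanced = feature_enhance(np.array([0.7, 1.2]), np.array([0.2, -0.1, 0.5]))
```

The enhanced sample simply carries both parties' information side by side, which is why the text describes it as expanding the feature dimension.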
  • Step H30 receiving the prediction result of the second-party model sent by the second device and the weight of the second-party model corresponding to the vertical federated residual promotion model;
  • Step H40 based on the first-party model weights and the second-party model weights corresponding to the target prediction model, perform weighted aggregation of the first-party model prediction results and the second-party model prediction results to obtain the target federation forecast result.
  • the first-party model prediction result and the second-party model prediction result are weighted and aggregated through a preset aggregation rule to obtain the target federated prediction result.
  • the preset aggregation rules include summing and averaging, etc., thereby achieving the purpose of improving the accuracy of the sample prediction performed by the first device on the samples to be predicted by using the residual improvement information generated by the second device.
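The weighted aggregation of step H40 under a preset aggregation rule can be sketched as follows. The text names only summing and averaging as example rules; the weight-normalized form used for "average" is an assumption:

```python
def weighted_aggregate(pred_a, pred_b, weight_a, weight_b, rule="average"):
    """Weighted aggregation of the first-party and second-party model
    prediction results under a preset aggregation rule (summing or
    averaging, per the text). The normalization by total weight in the
    "average" rule is an illustrative assumption."""
    weighted_sum = weight_a * pred_a + weight_b * pred_b
    if rule == "sum":
        return weighted_sum
    if rule == "average":
        return weighted_sum / (weight_a + weight_b)
    raise ValueError(f"unknown aggregation rule: {rule}")

# Hypothetical predictions and party weights (e.g. from the correct/wrong ratios).
target = weighted_aggregate(0.9, 0.6, weight_a=4.0, weight_b=1.0)
```

With a larger first-party weight, the target federated prediction result leans toward the first-party model, which matches the intent of weighting each party by its classification accuracy.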
  • the embodiment of the present application provides a longitudinal federated learning modeling optimization method: the first-party initial model weight is obtained, the training samples and the corresponding training sample labels are extracted, and the feature extractor of the target prediction model to be trained performs feature extraction on the training samples to generate the intermediate training sample features; then, based on the training sample labels, the training model prediction results corresponding to the intermediate training sample features and the first-party initial model weight, the first-party model prediction loss corresponding to the target prediction model to be trained is calculated, and the target prediction model to be trained is iteratively optimized to obtain the target prediction model, thereby achieving the purpose of locally constructing the target prediction model as a complete model on the first device.
  • the training sample labels, the intermediate training sample features and the first-party model prediction loss are sent to the second device for the second device to calculate the second-party model prediction loss, and to optimize the residual promotion model to be trained based on the residual loss calculated from the second-party model prediction loss and the first-party model prediction loss, obtaining the vertical federated residual promotion model and realizing residual learning based on vertical federated learning.
  • the vertical federated residual promotion model built at the second device expands the feature dimension of the training sample ID matching samples in the second device, so that the prediction accuracy of the vertical federated residual promotion model is higher; then, based on the target prediction model of the first device and the vertical federated residual promotion model at the second device, accurate sample prediction based on higher-accuracy residual promotion information can be realized for aligned samples, while sample prediction can be performed locally and independently on unaligned samples based on the complete model. This lays the foundation for overcoming the technical defect that, when the predictor makes more accurate federated predictions for aligned samples, it cannot make sample predictions based on the complete model for unaligned samples, which makes the overall sample prediction accuracy lower.
  • a longitudinal federated learning modeling optimization method is provided; the longitudinal federated learning modeling optimization method is applied to the second device, and the longitudinal federated learning modeling optimization method includes:
  • Step R10 obtaining the weight of the second-party initial model, and receiving the intermediate training sample features, training sample labels and first-party model prediction loss sent by the first device;
  • the first-party model prediction loss is calculated by the first device based on the first-party model prediction result of the target prediction model on the training sample corresponding to the training sample ID matching sample and the training sample label, and the intermediate training sample features are obtained by the first device performing feature extraction on the training samples based on the feature extractor of the target prediction model.
  • for the specific process by which the first device generates the first-party model prediction loss and the intermediate training sample features, please refer to the specific content in steps A10 to A30, which will not be repeated here.
  • Step R20 obtaining training sample ID matching samples, and based on the residual improvement model to be trained, performing model prediction on the training sample ID matching samples and the characteristics of the intermediate training samples, and obtaining the prediction results of the second-party training model;
  • the training sample ID is extracted, and the training sample ID matching sample corresponding to the training sample ID is searched for; the training sample ID matching sample and the intermediate training sample features are then spliced, so that feature enhancement is performed on the training sample ID matching sample based on the intermediate training sample features to obtain a training feature enhancement sample, and the second-party training model prediction result is obtained by inputting the training feature enhancement sample into the residual improvement model to be trained to perform model prediction.
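The second-device side of step R20, looking up the training sample ID matching sample and splicing it with the received intermediate training sample features, might look like the following sketch. The dict-based local sample store and all names are hypothetical; the failure branch mirrors the search-failure feedback described later for the prediction phase:

```python
import numpy as np

def lookup_and_enhance(sample_id, local_samples, intermediate_features):
    """Look up the training-sample-ID matching sample in the second
    party's local data, then splice it with the intermediate training
    sample features received from the first device to form the training
    feature enhancement sample. Returns None when no ID-matched sample is
    held locally (search failure)."""
    matched = local_samples.get(sample_id)
    if matched is None:
        return None  # search failed: feed failure information back to the first device
    return np.concatenate([matched, intermediate_features])

# Hypothetical local store keyed by sample ID.
local = {"u42": np.array([1.5, -0.3])}
enhanced = lookup_and_enhance("u42", local, np.array([0.2, 0.8, 0.0]))
missing = lookup_and_enhance("u99", local, np.array([0.2, 0.8, 0.0]))
```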
  • Step R30 calculating the second-party model prediction loss based on the training sample label, the second-party initial model weight and the second-party training model prediction result;
  • for the specific process of the second device calculating the second-party model prediction loss in step R30, reference may be made to the specific content in step E30, which will not be repeated here.
  • Step R40 Based on the residual loss generated by the first-party model prediction loss and the second-party model prediction loss, iteratively optimize the residual boosting model to be trained to obtain the vertical federated residual boosting model.
  • the residual loss is calculated, and then it is judged whether the residual loss has converged; if the residual loss has converged, the residual promotion model to be trained is used as the vertical federated residual promotion model; if the residual loss has not converged, the residual promotion model to be trained is updated through a preset model optimization method based on the gradient calculated from the residual loss, the second-party initial model weight is updated based on the second-party training model prediction result corresponding to the training sample ID matching sample and the corresponding sample label, and execution returns to the step of obtaining the training sample ID matching sample for the next round of iteration.
  • the specific process of calculating the residual loss based on the predicted loss of the first-party model and the predicted loss of the second-party model can refer to the specific content in step A40, which will not be repeated here.
  • the longitudinal federated learning modeling optimization method also includes:
  • Step T10 obtaining the number of correct samples of the second party classification and the corresponding number of incorrect samples of the second party classification corresponding to the longitudinal federated residual boosting model
  • the number of samples correctly classified by the second party is the number of training sample ID matching samples whose output classification labels are consistent with the corresponding training sample labels during the iterative training of the longitudinal federated residual boosting model; the number of samples misclassified by the second party is the number of training sample ID matching samples whose output classification labels are inconsistent with the corresponding training sample labels during the iterative training of the vertical federated residual promotion model.
  • Step T20 generating a second-party model weight by calculating the ratio of the number of samples correctly classified by the second party to the number of samples incorrectly classified by the second party.
  • the specific calculation formula of step T20 is as follows:
  • α_B = A / B
  • where α_B is the second-party model weight, A is the number of samples correctly classified by the second party, and B is the number of samples incorrectly classified by the second party.
  • the longitudinal federated learning modeling optimization method also includes:
  • Step Y10 receiving the intermediate sample feature sent by the first device, and searching for an ID matching sample corresponding to the intermediate sample feature;
  • Step Y20 based on the longitudinal federated residual promotion model, jointly perform model prediction on the ID matching sample and the intermediate sample features, and obtain a second-party model prediction result;
  • feature enhancement is performed on the ID matching samples to obtain feature-enhanced samples, and then the feature-enhanced samples are input into the longitudinal federated residual boosting model to perform model prediction and generate the second-party model prediction result.
  • for a specific implementation manner of performing feature enhancement on the ID matching samples, reference may be made to the content in step C21, which will not be repeated here.
  • Step Y30 sending the second-party model prediction result and the second-party model weight corresponding to the vertical federated residual promotion model to the first device, so that the first device can generate the first-party model prediction result for the corresponding sample to be predicted based on the target prediction model;
  • for the specific process by which the first device generates the first-party model prediction result for the sample to be predicted corresponding to the training sample ID matching sample based on the target prediction model, reference may be made to the specific steps in step S10, which will not be repeated here.
  • the first device performs weighted aggregation on the first-party model prediction result and the second-party model prediction result through preset aggregation rules to obtain the target federated prediction result, thereby using the higher-accuracy residual improvement information generated by the second device to optimize the first-party model prediction result for the sample to be predicted in the first device, so as to improve the accuracy of the first device's sample prediction result for the sample to be predicted.
  • the embodiment of the present application provides a longitudinal federated learning modeling optimization method, that is, obtains the weight of the second-party initial model, and receives the intermediate training sample features, training sample labels and first-party model prediction loss sent by the first device.
  • the first-party model prediction loss is calculated by the first device based on the first-party model prediction result of the target prediction model on the training sample corresponding to the training sample ID matching sample and the training sample label, and the intermediate training sample features are obtained by the first device performing feature extraction on the training samples based on the feature extractor of the target prediction model; the training sample ID matching samples are then obtained, and model prediction is jointly performed on the training sample ID matching samples and the intermediate training sample features based on the residual improvement model to be trained to obtain the second-party training model prediction result.
  • the second-party model prediction loss is calculated based on the training sample label, the second-party initial model weight and the second-party training model prediction result, and the residual boosting model to be trained is iteratively optimized based on the residual loss generated from the first-party model prediction loss and the second-party model prediction loss to obtain the vertical federated residual boosting model.
  • in this way, the second device can generate more accurate residual boosting information, and the first-party model prediction result generated by the first device can be optimized to generate a more accurate target federated prediction result, laying the foundation for overcoming the technical defect that the predictor cannot make sample predictions based on the complete model for unaligned samples when making higher-accuracy federated predictions for aligned samples, which makes the overall sample prediction accuracy lower.
  • FIG. 5 is a schematic diagram of a device system of the hardware operating environment involved in the solution of the embodiment of the present application.
  • the vertical federated prediction optimization device may include: a processor 1001 , such as a CPU, a memory 1005 , and a communication bus 1002 .
  • the communication bus 1002 is used to realize connection and communication between the processor 1001 and the memory 1005 .
  • the memory 1005 can be a high-speed RAM memory, or a stable memory (non-volatile memory), such as a disk memory.
  • the memory 1005 may also be a storage device independent of the aforementioned processor 1001 .
  • the vertical federated prediction optimization device may also include a rectangular user interface, a network interface, a camera, an RF (Radio Frequency, radio frequency) circuit, a sensor, an audio circuit, a WiFi module, and the like.
  • the rectangular user interface may include a display screen (Display), an input sub-module such as a keyboard (Keyboard), and the optional rectangular user interface may also include a standard wired interface and a wireless interface.
  • the network interface may include a standard wired interface and a wireless interface (such as a WI-FI interface).
  • the vertical federated prediction optimization device system shown in Figure 5 does not constitute a limitation on the vertical federated prediction optimization device, and may include more or fewer components than those shown in the figure, combine certain components, or have a different arrangement of components.
  • the memory 1005 as a computer storage medium may include an operating system, a network communication module, and a vertical federation prediction optimization program.
  • the operating system is a program that manages and controls the hardware and software resources of the vertical federated prediction optimization device, and supports the operation of the vertical federated prediction optimization program and other software and/or programs.
  • the network communication module is used to realize the communication among the various components inside the memory 1005, as well as communicate with other hardware and software in the vertical federated prediction optimization system.
  • the processor 1001 is configured to execute the vertical federated prediction optimization program stored in the memory 1005 to implement the steps of the vertical federated prediction optimization method described in any one of the above.
  • FIG. 6 is a schematic diagram of a device system of the hardware operating environment involved in the solution of the embodiment of the present application.
  • the longitudinal federated learning modeling optimization device may include: a processor 1001 , such as a CPU, a memory 1005 , and a communication bus 1002 .
  • the communication bus 1002 is used to realize connection and communication between the processor 1001 and the memory 1005 .
  • the memory 1005 can be a high-speed RAM memory, or a stable memory (non-volatile memory), such as a disk memory.
  • the memory 1005 may also be a storage device independent of the aforementioned processor 1001 .
  • the longitudinal federated learning modeling optimization device may also include a rectangular user interface, a network interface, a camera, an RF (Radio Frequency, radio frequency) circuit, a sensor, an audio circuit, a WiFi module, and the like.
  • the rectangular user interface may include a display screen (Display), an input sub-module such as a keyboard (Keyboard), and the optional rectangular user interface may also include a standard wired interface and a wireless interface.
  • the network interface may include a standard wired interface and a wireless interface (such as a WI-FI interface).
  • the vertical federated learning modeling and optimization equipment system shown in Figure 6 does not constitute a limitation on the vertical federated learning modeling and optimization equipment, and may include more or fewer components than those shown in the illustration, combine certain components, or have a different arrangement of components.
  • the memory 1005 as a computer storage medium may include an operating system, a network communication module, and a longitudinal federated learning modeling optimization program.
  • the operating system is a program that manages and controls the hardware and software resources of the vertical federated learning modeling optimization device, and supports the operation of the vertical federated learning modeling optimization program and other software and/or programs.
  • the network communication module is used to realize the communication between various components inside the memory 1005, and communicate with other hardware and software in the longitudinal federated learning modeling optimization system.
  • the processor 1001 is configured to execute the vertical federated learning modeling optimization program stored in the memory 1005, so as to implement the steps of any of the above-mentioned longitudinal federated learning modeling optimization methods.
  • the embodiment of the present application also provides a vertical federated prediction optimization device, the vertical federated prediction optimization device is applied to the first device, and the vertical federated prediction optimization device includes:
  • the model prediction module is used to extract the sample to be predicted, and to obtain the intermediate sample features generated by the feature extractor of the target prediction model performing feature extraction on the sample to be predicted, as well as the first-party model prediction result generated by the target prediction model performing model prediction on the sample to be predicted, wherein the target prediction model is obtained by local iterative training of the first device;
  • a sending module configured to send the intermediate sample features to a second device, so that the second device can jointly perform model prediction on the intermediate sample features and the ID matching sample corresponding to the sample to be predicted based on the longitudinal federated residual boosting model to obtain the second-party model prediction result.
  • the vertical federated residual promotion model is obtained by the second device performing residual learning based on longitudinal federated learning with the first device, in combination with the vertical federated public samples, the model prediction loss of the target prediction model in the first device on the vertical federated public samples, the intermediate public sample features corresponding to the vertical federated public samples, and the corresponding sample labels;
  • a receiving module configured to obtain the first-party model weight corresponding to the target prediction model, and receive the second-party model prediction result sent by the second device and the second-party model weight corresponding to the vertical federated residual promotion model ;
  • a weighted aggregation module configured to perform weighted aggregation on the prediction results of the first-party model and the prediction results of the second-party model based on the weights of the first-party model and the weight of the second-party model to obtain a target federated prediction result .
  • the vertical federated prediction optimization device is also used for:
  • if the search for the ID matching sample fails, the first-party model prediction result is used as the target prediction result; if the search is successful, the step is performed: sending the intermediate sample features to the second device.
  • the vertical federated prediction optimization device is also used for:
  • based on the training sample label, the training model prediction result corresponding to the intermediate training sample features, and the first-party initial model weight, calculating the first-party model prediction loss corresponding to the target prediction model to be trained, and iteratively optimizing the target prediction model to be trained to obtain the target prediction model;
  • calculating the second-party model prediction loss based on the training sample ID matching sample, the intermediate training sample features, the sample label and the obtained second-party initial model weight, and optimizing the residual boosting model to be trained based on the residual loss calculated from the second-party model prediction loss and the first-party model prediction loss to obtain the vertical federated residual boosting model.
  • the specific implementation of the vertical federated prediction optimization device of the present application is basically the same as the embodiments of the vertical federated prediction optimization method described above, and will not be repeated here.
  • the embodiment of the present application also provides a vertical federated prediction optimization device, the vertical federated prediction optimization device is applied to the second device, and the vertical federated prediction optimization device includes:
  • a receiving and searching module configured to receive the intermediate sample feature sent by the first device, and search for an ID matching sample corresponding to the intermediate sample feature;
  • the model prediction module is configured to jointly perform model prediction on the ID matching sample and the intermediate sample features based on the vertical federated residual promotion model, and obtain a second-party model prediction result.
  • the vertical federation residual promotion model is obtained by the second device performing residual learning based on longitudinal federated learning with the first device, in combination with the vertical federated public samples, the model prediction loss of the target prediction model in the first device on the vertical federated public samples, the intermediate public sample features corresponding to the vertical federated public samples, and the corresponding sample labels;
  • a sending module configured to obtain the second-party model weight corresponding to the vertical federated residual promotion model, and to send the second-party model prediction result and the second-party model weight to the first device, for the first device to generate a target federated prediction result based on the first-party model prediction result of the target prediction model, the second-party model prediction result and the second-party model weight, wherein the target prediction model is obtained by local iterative training of the first device.
  • model prediction module is also used for:
  • model prediction is performed on the feature-enhanced samples to obtain the second-party model prediction result.
  • the vertical federated prediction optimization device is also used for:
  • if the search is successful, the following step is performed: based on the vertical federated residual promotion model, jointly performing model prediction on the ID matching sample and the intermediate sample features to obtain the second-party model prediction result;
  • if the search fails, search failure information is fed back to the first device, so that after receiving the search failure information, the first device takes the first-party model prediction result generated for the sample to be predicted based on the target prediction model as the target prediction result.
  • the vertical federated prediction optimization device is also used for:
  • obtain the second-party initial model weight, and receive the intermediate training sample features, the training sample labels, and the first-party model prediction loss sent by the first device, wherein the first-party model prediction loss is calculated by the first device based on the first-party model prediction result of the target prediction model on the training samples corresponding to the training-sample-ID matching samples and on the training sample labels, and the intermediate training sample features are obtained by the first device performing feature extraction on the training samples with the feature extractor of the target prediction model;
  • the specific implementation of the vertical federated prediction optimization device of the present application is basically the same as the embodiments of the vertical federated prediction optimization method described above, and will not be repeated here.
  • the embodiment of the present application also provides a vertical federated learning modeling optimization device, which is applied to the first device, and the vertical federated learning modeling optimization device includes:
  • the first obtaining module is used to obtain the weight of the first-party initial model, and extract training samples and training sample labels corresponding to the training samples;
  • the second acquisition module is used to acquire the intermediate training sample features generated by the feature extractor of the target prediction model to be trained for feature extraction of the training samples;
  • an iterative optimization module configured to calculate the first-party model prediction loss corresponding to the target prediction model to be trained, based on the training sample labels, the training model prediction results corresponding to the intermediate training sample features, and the first-party initial model weight, and to iteratively optimize the target prediction model to be trained to obtain the target prediction model;
  • a sending module configured to send the training sample labels, the intermediate training sample features, and the first-party model prediction loss to a second device, so that the second device can calculate a second-party model prediction loss and, based on the residual loss calculated from the second-party model prediction loss and the first-party model prediction loss, optimize the residual promotion model to be trained to obtain a vertical federated residual promotion model;
  • the second-party model prediction loss is calculated based on the residual promotion model to be trained, the training-sample-ID matching samples corresponding to the training samples, the intermediate training sample features, the sample labels, and the obtained second-party initial model weight.
  • the iterative optimization module is also used for:
  • the longitudinal federated learning modeling optimization device is also used for:
  • a first-party model weight is generated by calculating a ratio of the number of samples correctly classified by the first party to the number of samples incorrectly classified by the first party.
  • the longitudinal federated learning modeling optimization device is also used for:
  • the second device can jointly perform model prediction on the intermediate sample features and the ID matching samples corresponding to the samples to be predicted based on the vertical federated residual promotion model, and obtain the second-party model prediction result;
  • weighted aggregation is performed on the first-party model prediction results and the second-party model prediction results to obtain a target federated prediction result.
  • the specific implementation manner of the vertical federated learning modeling optimization device of the present application is basically the same as the above-mentioned embodiments of the vertical federated learning modeling optimization method, and will not be repeated here.
  • the embodiment of the present application also provides a vertical federated learning modeling optimization device, which is applied to the second device, and the vertical federated learning modeling optimization device includes:
  • the receiving module is configured to obtain the weight of the second-party initial model, and receive the intermediate training sample features, training sample labels and first-party model prediction loss sent by the first device.
  • the first-party model prediction loss is calculated by the first device based on the first-party model prediction result of the target prediction model on the training samples corresponding to the training-sample-ID matching samples and on the training sample labels, and
  • the intermediate training sample features are obtained by the first device performing feature extraction on the training samples with the feature extractor of the target prediction model;
  • the model prediction module is configured to obtain the training-sample-ID matching samples and, based on the residual promotion model to be trained, jointly perform model prediction on the training-sample-ID matching samples and the intermediate training sample features to obtain the second-party training model prediction result;
  • a calculation module configured to calculate a second-party model prediction loss based on the training sample label, the second-party initial model weight, and the second-party training model prediction result;
  • an iterative optimization module configured to iteratively optimize the residual promotion model to be trained, based on the residual loss generated from the first-party model prediction loss and the second-party model prediction loss, to obtain the vertical federated residual promotion model.
  • the longitudinal federated learning modeling optimization device is also used for:
  • the second-party model weight is generated by calculating the ratio of the number of samples correctly classified by the second party to the number of samples incorrectly classified by the second party.
  • the longitudinal federated learning modeling optimization device is also used for:
  • the target prediction model is obtained by local iterative training of the first device.
  • the specific implementation manner of the vertical federated learning modeling optimization device of the present application is basically the same as the above-mentioned embodiments of the vertical federated learning modeling optimization method, and will not be repeated here.
  • An embodiment of the present application provides a medium, which is a readable storage medium storing one or more programs; the one or more programs can be executed by one or more processors to implement the steps of the vertical federated prediction optimization method described in any one of the above.
  • the specific implementation manner of the readable storage medium of the present application is basically the same as the embodiments of the above-mentioned vertical federated prediction optimization method, and will not be repeated here.
  • An embodiment of the present application provides a medium, which is a readable storage medium storing one or more programs; the one or more programs can be executed by one or more processors to implement the steps of the vertical federated learning modeling optimization method described in any one of the above.
  • the specific implementation manner of the readable storage medium of the present application is basically the same as the embodiments of the above-mentioned longitudinal federated learning modeling and optimization method, and will not be repeated here.
  • An embodiment of the present application provides a computer program product, which includes one or more computer programs; the one or more computer programs can be executed by one or more processors to implement the steps of the vertical federated prediction optimization method described in any one of the above.
  • An embodiment of the present application provides a computer program product, which includes one or more computer programs; the one or more computer programs can be executed by one or more processors to implement the steps of the vertical federated learning modeling optimization method described in any one of the above.
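The weighted aggregation of the first-party and second-party model prediction results described in the modules above can be sketched as follows. The log-ratio weight mirrors the specification's statement that a model weight equals the logarithm of the ratio of correctly to incorrectly classified samples; the function names and the normalization in `aggregate` are illustrative assumptions, not the patent's literal implementation.

```python
import math

def model_weight(n_correct: int, n_wrong: int) -> float:
    # Model weight = log(correctly classified / incorrectly classified),
    # as the specification defines for both parties (assumed form; the
    # wrong-sample count is clamped to at least 1 to avoid division by zero).
    return math.log(n_correct / max(n_wrong, 1))

def aggregate(pred_a: float, weight_a: float,
              pred_b: float, weight_b: float) -> float:
    # Weighted aggregation of the first-party and second-party prediction
    # results into the target federated prediction result.
    return (weight_a * pred_a + weight_b * pred_b) / (weight_a + weight_b)

# First party: 90/10 correct/wrong; second party: 70/30 correct/wrong.
fused = aggregate(0.8, model_weight(90, 10), 0.6, model_weight(70, 30))
```

Because the first party classifies more samples correctly, its weight is larger and the fused result lands closer to its prediction of 0.8 than to the second party's 0.6.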


Abstract

The present application discloses a vertical federated prediction optimization method, device, medium, and computer program product, including: extracting a sample to be predicted, and generating, based on a locally and iteratively trained target prediction model, a first-party model prediction result and corresponding intermediate sample features for the sample to be predicted; sending the intermediate sample features to a second device, so that the second device jointly performs model prediction on the intermediate sample features and the ID matching sample corresponding to the sample to be predicted based on a vertical federated residual promotion model to obtain a second-party model prediction result; obtaining a first-party model weight, and receiving the second-party model prediction result and a second-party model weight sent by the second device; and performing weighted aggregation on the first-party model prediction result and the second-party model prediction result based on the first-party model weight and the second-party model weight to obtain a target federated prediction result.

Description

纵向联邦预测优化方法、设备、介质及计算机程序产品
优先权信息
本申请要求于2021年8月25日申请的、申请号为202110982929.9的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本申请涉及金融科技(Fintech)的人工智能技术领域,尤其涉及一种纵向联邦预测优化方法、设备、介质及计算机程序产品。
背景技术
随着金融科技,尤其是互联网科技金融的不断发展,越来越多的技术(如分布式、人工智能等)应用在金融领域,但金融业也对技术提出了更高的要求,如对金融业对应待办事项的分发也有更高的要求。
随着计算机软件和人工智能计算的不断发展,人工智能技术的应用也越来越广泛。目前,现有的纵向联邦预测模型需要基于各参与方之间的对齐样本进行纵向联邦学习建模而构建,且建模完成后,纵向联邦预测模型分散部署在各个参与方中,每一参与方均只持有部分纵向联邦预测模型。所以在纵向联邦预测场景中,对于对齐样本,预测方通常联合分散在其他参与方的对齐样本进行纵向联邦预测,以准确进行样本预测。而对于未对齐样本,预测方需要使用一个单独的本地模型在本地进行预测。所以,预测方在对对齐样本进行准确度更高的联邦预测的情况下,无法对未对齐样本进行基于完整模型的样本预测,使得整体的样本预测准确度变低。所以,现有的纵向联邦预测方法存在整体样本预测准确度低的问题。
发明内容
本申请的主要目的在于提供一种纵向联邦预测优化方法、设备、介质及计算机程序产品,旨在解决现有技术中纵向联邦预测的整体样本预测准确度低的技术问题。
为实现上述目的,本申请提供一种纵向联邦预测优化方法,所述纵向联邦预测优化方法应用于第一设备,所述纵向联邦预测优化方法包括:
提取待预测样本,并获取目标预测模型的特征提取器针对所述待预测样本进行特征提取生成的中间样本特征,以及所述目标预测模型针对所述待预测样本进行模型预测生成的第一方模型预测结果,其中,所述目标预测模型由所述第一设备本地迭代训练得到;
将所述中间样本特征发送至第二设备,以供所述第二设备基于纵向联邦残差提升模型对所述中间样本特征和所述待预测样本对应的ID匹配样本共同执行模型预测,获得第二方模型预测结果;
获取所述目标预测模型对应的第一方模型权重,并接收所述第二设备发送的第二方模型预测结果和所述纵向联邦残差提升模型对应的第二方模型权重;
基于所述第一方模型权重和所述第二方模型权重,对所述第一方模型预测结果和所述第二方模型预测结果进行加权聚合,获得目标联邦预测结果。
为实现上述目的,本申请提供一种纵向联邦预测优化方法,所述纵向联邦预测优化方法应用于第二设备,所述纵向联邦预测优化方法包括:
接收第一设备发送的中间样本特征,并查找与所述中间样本特征对应的ID匹配样本;
基于纵向联邦残差提升模型,对所述ID匹配样本和所述中间样本特征共同执行模型预测,获得第二方模型预测结果;
获取所述纵向联邦残差提升模型对应的第二方模型权重,并将所述第二方模型预测结果和所述第二方模型权重发送至所述第一设备,以供所述第一设备基于所述目标预测模型针对所述ID匹配样本对应的待预测样本生成的第一方模型预测结果、所述目标预测模型对应的第一方模型权重、所述第二方模型预测结果和所述第二方模型权重,生成目标联邦预测结果,其中,所述目标预测模型由所述第一设备本地迭代训练得到。
为实现上述目的,本申请提供一种纵向联邦学习建模优化方法,所述纵向联邦学习建模优化方法应用于第一设备,所述纵向联邦学习建模优化方法包括:
获取第一方初始模型权重,并提取训练样本和所述训练样本对应的训练样本标签;
获取待训练目标预测模型的特征提取器针对所述训练样本进行特征提取生成的中间训练样本特征;
基于所述训练样本标签、所述中间训练样本特征对应的训练模型预测结果和所述第一方初始模型权重,通过计算所述待训练目标预测模型对应的第一方模型预测损失,迭代优化所述待训练目标预测模型, 获得所述目标预测模型;
将所述训练样本标签、所述中间训练样本特征和所述第一方模型预测损失发送至第二设备,以供所述第二设备计算第二方模型预测损失,并基于所述第二方模型预测损失和所述第一方模型预测损失计算的残差损失,优化所述待训练残差提升模型,获得纵向联邦残差提升模型。
为实现上述目的,本申请提供一种纵向联邦学习建模优化方法,所述纵向联邦学习建模优化方法应用于第二设备,所述纵向联邦学习建模优化方法包括:
获取第二方初始模型权重,并接收第一设备发送的中间训练样本特征、训练样本标签和第一方模型预测损失;
获取训练样本ID匹配样本,并基于待训练残差提升模型,对所述训练样本ID匹配样本和所述中间训练样本特征共同执行模型预测,获得第二方训练模型预测结果;
基于所述训练样本标签、所述第二方初始模型权重和所述第二方训练模型预测结果,计算第二方模型预测损失;
基于所述第一方模型预测损失和所述第二方模型预测损失生成的残差损失,迭代优化所述待训练残差提升模型,获得所述纵向联邦残差提升模型。
本申请还提供一种纵向联邦预测优化装置,所述纵向联邦预测优化装置为虚拟装置,且所述纵向联邦预测优化装置应用于第一设备,所述纵向联邦预测优化装置包括:
模型预测模块,用于提取待预测样本,并获取目标预测模型的特征提取器针对所述待预测样本进行特征提取生成的中间样本特征,以及所述目标预测模型针对所述待预测样本进行模型预测生成的第一方模型预测结果,其中,所述目标预测模型由所述第一设备本地迭代训练得到;
发送模块,用于将所述中间样本特征发送至第二设备,以供所述第二设备基于纵向联邦残差提升模型对所述中间样本特征和所述待预测样本对应的ID匹配样本共同执行模型预测,获得第二方模型预测结果;
接收模块,用于获取所述目标预测模型对应的第一方模型权重,并接收所述第二设备发送的第二方模型预测结果和所述纵向联邦残差提升模型对应的第二方模型权重;
加权聚合模块,用于基于所述第一方模型权重和所述第二方模型权重,对所述第一方模型预测结果和所述第二方模型预测结果进行加权聚合,获得目标联邦预测结果。
本申请还提供一种纵向联邦预测优化装置,所述纵向联邦预测优化装置为虚拟装置,且所述纵向联邦预测优化装置应用于第二设备,所述纵向联邦预测优化装置包括:
接收查找模块,用于接收第一设备发送的中间样本特征,并查找与所述中间样本特征对应的ID匹配样本;
模型预测模块,用于基于纵向联邦残差提升模型,对所述ID匹配样本和所述中间样本特征共同执行模型预测,获得第二方模型预测结果;
发送模块,用于获取所述纵向联邦残差提升模型对应的第二方模型权重,并将所述第二方模型预测结果和所述第二方模型权重发送至所述第一设备,以供所述第一设备基于所述目标预测模型针对所述ID匹配样本对应的待预测样本生成的第一方模型预测结果、所述目标预测模型对应的第一方模型权重、所述第二方模型预测结果和所述第二方模型权重,生成目标联邦预测结果,其中,所述目标预测模型由所述第一设备本地迭代训练得到。
本申请还提供一种纵向联邦学习建模优化装置,所述纵向联邦学习建模优化装置为虚拟装置,且所述纵向联邦学习建模优化装置应用于第一设备,所述纵向联邦学习建模优化装置包括:
第一获取模块,用于获取第一方初始模型权重,并提取训练样本和所述训练样本对应的训练样本标签;
第二获取模块,用于获取待训练目标预测模型的特征提取器针对所述训练样本进行特征提取生成的中间训练样本特征;
迭代优化模块,用于基于所述训练样本标签、所述中间训练样本特征对应的训练模型预测结果和所述第一方初始模型权重,通过计算所述待训练目标预测模型对应的第一方模型预测损失,迭代优化所述待训练目标预测模型,获得所述目标预测模型;
发送模块,用于将所述训练样本标签、所述中间训练样本特征和所述第一方模型预测损失发送至第二设备,以供所述第二设备计算第二方模型预测损失,并基于所述第二方模型预测损失和所述第一方模型预测损失计算的残差损失,优化所述待训练残差提升模型,获得纵向联邦残差提升模型。
本申请还提供一种纵向联邦学习建模优化装置,所述纵向联邦学习建模优化装置为虚拟装置,且所述纵向联邦学习建模优化装置应用于第二设备,所述纵向联邦学习建模优化装置包括:
接收模块,用于获取第二方初始模型权重,并接收第一设备发送的中间训练样本特征、训练样本标 签和第一方模型预测损失;
模型预测模块,用于获取训练样本ID匹配样本,并基于待训练残差提升模型,对所述训练样本ID匹配样本和所述中间训练样本特征共同执行模型预测,获得第二方训练模型预测结果;
计算模块,用于基于所述训练样本标签、所述第二方初始模型权重和所述第二方训练模型预测结果,计算第二方模型预测损失;
迭代优化模块,用于基于所述第一方模型预测损失和所述第二方模型预测损失生成的残差损失,迭代优化所述待训练残差提升模型,获得所述纵向联邦残差提升模型。
本申请还提供一种纵向联邦预测优化设备,所述纵向联邦预测优化设备为实体设备,所述纵向联邦预测优化设备包括:存储器、处理器以及存储在所述存储器上并可在所述处理器上运行的所述纵向联邦预测优化方法的程序,所述纵向联邦预测优化方法的程序被处理器执行时可实现如上述的纵向联邦预测优化方法的步骤。
本申请还提供一种纵向联邦学习建模优化设备,所述纵向联邦学习建模优化设备为实体设备,所述纵向联邦学习建模优化设备包括:存储器、处理器以及存储在所述存储器上并可在所述处理器上运行的所述纵向联邦学习建模优化方法的程序,所述纵向联邦学习建模优化方法的程序被处理器执行时可实现如上述的纵向联邦学习建模优化方法的步骤。
本申请还提供一种介质,所述介质为可读存储介质,所述可读存储介质上存储有实现纵向联邦预测优化方法的程序,所述纵向联邦预测优化方法的程序被处理器执行时实现如上述的纵向联邦预测优化方法的步骤。
本申请还提供一种介质,所述介质为可读存储介质,所述可读存储介质上存储有实现纵向联邦学习建模优化方法的程序,所述纵向联邦学习建模优化方法的程序被处理器执行时实现如上述的纵向联邦学习建模优化方法的步骤。
本申请还提供一种计算机程序产品,包括计算机程序,所述计算机程序被处理器执行时实现如上述的纵向联邦预测优化方法的步骤。
本申请还提供一种计算机程序产品,包括计算机程序,所述计算机程序被处理器执行时实现如上述的纵向联邦学习建模优化方法的步骤。
本申请提供了一种纵向联邦预测优化方法、设备、介质及计算机程序产品,相比于现有技术采用的在纵向联邦预测场景中对于对齐样本,预测方联合分散在其他参与方的对齐样本进行纵向联邦预测,以准确进行样本预测,而对于未对齐样本,预测方基于本地持有的部分纵向联邦预测模型进行本地预测的技术手段,本申请首先提取待预测样本,并获取目标预测模型的特征提取器针对所述待预测样本进行特征提取生成的中间样本特征,以及所述目标预测模型针对所述待预测样本进行模型预测生成的第一方模型预测结果,其中,所述目标预测模型由所述第一设备本地迭代训练得到,进而实现了基于本地迭代训练的目标预测模型,对待预测样本在本地进行样本预测的目的,所以,对于若为未对齐样本的待预测样本,第一设备可基于作为完整模型的目标预测模型,独自对待预测样本进行准确的样本预测,进一步地,将所述中间样本特征发送至第二设备,以供所述第二设备基于纵向联邦残差提升模型对所述中间样本特征和所述待预测样本对应的ID匹配样本共同执行模型预测,为第二设备基于纵向联邦残差提升模型执行模型预测提供了更多的决策依据,提升了第二设备生成的第二方模型预测结果的准确度,其中,由于所述纵向联邦残差提升模型由所述第二设备基于纵向联邦公共样本,联合所述第一设备中所述目标预测模型在所述纵向联邦公共样本上的模型预测损失、所述纵向联邦公共样本对应的中间公共样本特征以及对应的样本标签,与第一设备进行基于纵向联邦学习的残差学习得到,进而对于待预测样本对齐的ID匹配样本,第二设备基于纵向联邦残差提升模型对ID匹配样本和中间样本特征执行模型预测可生成待预测样本对应的准确度更高的残差提升信息,也即第二方模型预测结果,进而获取所述目标预测模型对应的第一方模型权重,并接收所述第二设备发送的第二方模型预测结果和所述纵向联邦残差提升模型对应的第二方模型权重,并基于所述第一方模型权重和所述第二方模型权重,对所述第一方模型预测结果和所述第二方模型预测结果进行加权聚合,即可获得目标联邦预测结果,所以对于若为第一设备和第二设备之间的对齐样本的待预测样本,第一设备可借助第二设备基于纵向联邦残差提升模型针对待预测样本对齐的ID匹配样本和中间样本特征生成的准确度更高的残差提升信息(第二方模型预测结果),提升目标预测模型输出的第一方模型预测结果的准确度,实现对待预测样本进行基于准确度更高的残差提升信息的纵向联邦预测,所以,实现了在对对齐样本进行基于准确度更高的残差提升信息的纵向联邦预测的情况下,可基于完整模型,在本地独自对未对齐样本进行准确样本预测的目的,所以克服了预测方在对对齐样本进行准确度更高的联邦预测的情况下,无法对未对齐样本进行基于完整模型的样本预测,使得整体的样本预测准确度变低的技术缺陷,提升了纵向联邦预测的整体样本预测准确度。
附图说明
此处的附图被并入说明书中并构成本说明书的一部分,示出了符合本申请的实施例,并与说明书一起用于解释本申请的原理。
为了更清楚地说明本申请实施例或现有技术中的技术方案,下面将对实施例或现有技术描述中所需要使用的附图作简单地介绍,显而易见地,对于本领域普通技术人员而言,在不付出创造性劳动性的前提下,还可以根据这些附图获得其他的附图。
图1为本申请纵向联邦预测优化方法第一实施例的流程示意图;
图2为本申请纵向联邦预测优化方法第二实施例的流程示意图;
图3为本申请纵向联邦学习建模优化方法第一实施例的流程示意图;
图4为本申请纵向联邦学习建模优化方法第二实施例的流程示意图;
图5为本申请实施例中纵向联邦预测优化方法涉及的硬件运行环境的设备结构示意图;
图6为本申请实施例中纵向联邦学习建模优化方法涉及的硬件运行环境的设备结构示意图。
本申请目的实现、功能特点及优点将结合实施例,参照附图做进一步说明。
具体实施方式
应当理解,此处所描述的具体实施例仅用以解释本申请,并不用于限定本申请。
本申请实施例提供一种纵向联邦预测优化方法,在本申请纵向联邦预测优化方法的第一实施例中,参照图1,所述纵向联邦预测优化方法应用于第一设备,所述纵向联邦预测优化方法包括:
步骤S10,提取待预测样本,并获取目标预测模型的特征提取器针对所述待预测样本进行特征提取生成的中间样本特征,以及所述目标预测模型针对所述待预测样本进行模型预测生成的第一方模型预测结果,其中,所述目标预测模型由所述第一设备本地迭代训练得到。
在本实施例中,需要说明的是,所述纵向联邦预测优化方法应用于纵向联邦学习场景,第一设备和第二设备均为纵向联邦学习场景的参与方。其中,第一设备具备携带样本标签的样本,第二设备具备无样本标签的样本,且第一设备为预测方,用于执行预测任务,第二设备为辅助数据提供方,用于为第一设备提供残差提升信息,以提升第一设备生成的预测结果的准确度。
在一种可实施的方式中,所述目标预测模型可以为多层神经网络,也可以为深度因子分解机模型,用于对待预测样本进行分类。例如,假设目标预测模型最后一层全连接层的输出为z,激活函数为sigmoid函数,则分类结果P=sigmoid(z),其中,P为待预测样本属于预设样本类别的概率。
具体地,在步骤S10中,提取待预测样本,并基于目标预测模型中的特征提取器,对所述待预测样本进行特征提取,获得中间样本特征,进而基于所述目标预测模型中的分类器,对所述中间样本特征进行全连接,获得目标全连接层输出,进而通过预设激活函数,将所述目标全连接层输出转换为第一方模型预测结果,其中,所述第一方模型预测结果可以为分类概率。
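The prediction step described above (fully connect the intermediate sample features to obtain the output z, then convert z into a class probability with a sigmoid activation, P = sigmoid(z)) can be illustrated with a minimal sketch; the feature values, weights, and bias below are placeholders, not learned parameters.

```python
import math

def sigmoid(z: float) -> float:
    # Preset activation function converting the fully-connected output
    # into a probability in (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def predict(intermediate_features, fc_weights, fc_bias):
    # Classifier head: one fully-connected layer producing z, then
    # sigmoid converts z into the class probability P.
    z = sum(w * x for w, x in zip(fc_weights, intermediate_features)) + fc_bias
    return sigmoid(z)

p = predict([0.5, -1.2, 3.0], [0.4, 0.1, 0.2], -0.1)
```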
其中,在所述获取目标预测模型的特征提取器针对所述待预测样本进行特征提取生成的中间样本特征,以及所述目标预测模型针对所述待预测样本进行模型预测生成的第一方模型预测结果的步骤之前,所述纵向联邦预测优化方法还包括:
步骤A10,获取第一方初始模型权重,并提取训练样本和所述训练样本对应的训练样本标签;
在本实施例中,需要说明的是,所述训练样本的数量至少为1,所述训练样本标签为所述训练样本的标识,所述第一方初始模型权重为第一设备中预先设置的表示模型的预测准确度的模型权重的初始值。其中,可设置模型权重等于预测模型分类正确的样本数与分类错误的样本数的比值的对数值,可选地,模型权重的初始值可设置为1。
步骤A20,获取待训练目标预测模型的特征提取器针对所述训练样本进行特征提取生成的中间训练样本特征;
在本实施例中,基于所述待训练目标预测模型中的特征提取器,对所述训练样本进行特征提取,生成中间训练样本特征。
步骤A30,基于所述训练样本标签、所述中间训练样本特征对应的训练模型预测结果和所述第一方初始模型权重,通过计算所述待训练目标预测模型对应的第一方模型预测损失,迭代优化所述待训练目标预测模型,获得所述目标预测模型;
在本实施例中,需要说明的是,所述待训练目标预测模型的迭代训练过程包括多个迭代轮次,其中,一个迭代轮次需要基于预设数量的训练样本进行迭代训练。
具体地,在步骤A30中,基于所述待训练目标预测模型中的分类器和预设激活函数,将所述中间训练样本特征转换为训练模型预测结果,并基于所述训练样本、所述训练模型预测结果和所述第一方初始模型权重,计算第一方模型预测损失,进而判断所述第一方模型预测损失是否收敛。若所述第一方模型预测损失收敛,则将所述待训练目标预测模型作为所述目标预测模型;若所述第一方模型预测损失未收敛,则基于所述第一方模型预测损失计算的梯度,通过预设模型优化方法,更新所述待训练目标预测 模型,以及基于所述训练模型预测结果和所述训练样本标签,更新所述第一方初始模型权重,并返回执行步骤:提取训练样本和所述训练样本对应的训练样本标签,进行下一次迭代。其中,所述预设模型优化方法包括梯度下降法和梯度上升法等,所述基于所述训练样本、所述训练模型预测结果和所述第一方初始模型权重,计算第一方模型预测损失的损失函数如下
(第一方模型预测损失的计算公式见原文图片 PCTCN2021139640-appb-000001)
其中,L_A(θ_A, α_A, X_A, Y)为第一方模型预测损失,N_A为一轮迭代中训练样本的数量,θ_A为所述待训练目标预测模型,α_A为所述第一方初始模型权重,X_A为N_A个训练样本组成的训练样本集合,Y为N_A个训练样本对应的训练样本标签所组成的标签集合,y_i为一轮迭代中第i个训练样本标签,x_{A,i}为一轮迭代中第i个训练样本的特征。
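The iterative training described in this step (compute the loss, test whether it has converged, otherwise update the model with a preset optimization method such as gradient descent and iterate again) can be sketched generically. The quadratic toy objective, learning rate, and tolerance are illustrative assumptions, not the patent's prescribed optimizer.

```python
def train_until_converged(step, params, tol=1e-4, max_iters=1000):
    # Repeat a training step until the change in loss falls below `tol`
    # (the specification's convergence check on the prediction loss).
    prev_loss = float("inf")
    for _ in range(max_iters):
        params, loss = step(params)
        if abs(prev_loss - loss) < tol:
            break
        prev_loss = loss
    return params

# Toy example: minimize (x - 3)^2 with a gradient-descent step.
def gd_step(x, lr=0.1):
    loss = (x - 3.0) ** 2
    return x - lr * 2.0 * (x - 3.0), loss

x_opt = train_until_converged(gd_step, 0.0)
```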
进一步地,在步骤A30之后,第一设备统计所述目标预测模型在迭代训练过程中分类正确的样本数量以及分类错误的样本数量,获得第一方分类正确样本数和第一方分类错误样本数,进而通过计算所述第一方分类正确样本数和所述第一方分类错误样本数的比值,生成第一方模型权重,其中,计算所述第一方模型权重的具体过程如下:
α_A = log(A/B)
其中,α A为所述第一方模型权重,A为所述第一方分类正确样本数,B为所述第一方分类错误样本数。
步骤A40,将所述训练样本标签、所述中间训练样本特征和所述第一方模型预测损失发送至第二设备,以供所述第二设备基于待训练残差提升模型、所述训练样本对应的训练样本ID匹配样本、所述中间训练样本特征、所述样本标签和获取的第二方初始模型权重,计算第二方模型预测损失,并基于所述第二方模型预测损失和所述第一方模型预测损失计算的残差损失,优化所述待训练残差提升模型,获得所述纵向联邦残差提升模型。
在本实施例中,需要说明的是,所述第二方初始模型权重为第二设备中预先设置的表示模型的预测准确度的模型权重的初始值,其中,可设置模型权重等于预测模型分类正确的样本数与分类错误的样本数的比值的对数值,可选地,模型权重的初始值可设置为1。
具体地,在步骤A40中,将所述目标预测模型在迭代训练过程中每一训练样本对应的训练样标签、对应的第一方模型预测损失、对应的训练样本和对应的中间训练样本特征发送至所述第二设备,进而第二设备提取训练样本ID,并基于所述训练样本ID,查找所述待预测样本对应的ID匹配样本,进而第二设备通过将训练样本ID匹配样本输入待训练残差提升模型中执行模型预测,获得所述训练样本ID匹配样本对应的第二方训练模型预测结果。进而第二设备将所述训练样本ID对应的训练样本ID匹配样本和对应的中间训练样本特征进行拼接,以对所述训练样本ID匹配样本进行特征增强,获得训练特征增强样本。进而基于待训练残差提升模型对所述训练特征增强样本执行模型预测,获得第二方模型预测损失。进而基于所述训练样本ID匹配样本对应的第二方训练模型预测结果、对应的训练样本标签和获取的第二方初始模型权重,计算第二方模型预测损失。
其中,计算所述第二方模型预测损失的过程具体可参照第一设备计算第一方模型预测损失的过程,在此不再赘述。进而基于所述第一方模型预测损失和所述第二方模型预测损失,计算残差损失,并判断所述残差损失是否收敛。若所述残差损失收敛,则将所述待训练残差提升模型作为所述纵向联邦残差提升模型;若所述残差损失未收敛,则基于所述残差损失计算的梯度,更新所述待训练残差提升模型,以及基于所述训练样本ID匹配样本对应的第二方训练模型预测结果和对应的样本标签,更新所述第二方初始模型权重,并返回执行步骤:第二设备提取训练样本ID,进行下一次迭代。其中,所述第二设备计算残差损失的损失函数如下:
(残差损失的计算公式见原文图片 PCTCN2021139640-appb-000003)
其中,L(θ_B, α_B, X_B, Y)为所述残差损失,N_C为一轮迭代中训练样本ID匹配样本的数量,θ_B为所述待训练残差提升模型,α_B为所述第二方初始模型权重,X_B为N_C个训练样本ID匹配样本组成的训练样本集合,Y为N_C个训练样本ID匹配样本对应的训练样本标签所组成的标签集合;公式中其余符号(原文以图片形式给出)分别表示:一轮迭代中第i个训练样本标签、一轮迭代中第i个训练样本ID匹配样本的特征、N_C个所述训练样本ID匹配样本对应的第一方模型预测损失,以及一轮迭代中第i个训练样本ID匹配样本对应的中间训练样本特征。
进一步地,第二设备统计所述纵向联邦残差提升模型在迭代训练过程中分类正确的样本数量以及分类错误的样本数量,获得第二方分类正确样本数和第二方分类错误样本数,进而通过计算所述第二方分类正确样本数和所述第二方分类错误样本数的比值,生成第二方模型权重。其中,第二设备生成第二方模型权重的具体过程可参照上述第一设备生成第一方模型权重的具体过程,在此不再赘述。
步骤S20,将所述中间样本特征发送至第二设备,以供所述第二设备基于纵向联邦残差提升模型对所述中间样本特征和所述待预测样本对应的ID匹配样本共同执行模型预测,获得第二方模型预测结果。
在本实施例中,需要说明的是,所述纵向联邦残差提升模型由所述第二设备基于纵向联邦公共样本,联合所述第一设备中所述目标预测模型在所述纵向联邦公共样本上的模型预测损失、所述纵向联邦公共样本对应的中间公共样本特征以及对应的样本标签,与第一设备进行基于纵向联邦学习的残差学习得到。且所述第二设备基于纵向联邦公共样本,联合所述第一设备中所述目标预测模型在所述纵向联邦公共样本上的模型预测损失、所述纵向联邦公共样本对应的中间公共样本特征以及对应的样本标签,与第一设备进行基于纵向联邦学习的残差学习得到纵向联邦残差提升模型的具体过程可参照步骤A10至步骤A30,在此不再赘述。其中,所述纵向联邦公共样本为第二设备中与第一设备中ID对齐的样本,也即为所述第一设备中训练样本对应的训练样本ID匹配样本。
具体地,将所述中间样本特征发送至第二设备,进而第二设备将待预测样本对应的ID匹配样本以及对应的中间样本特征进行拼接,以对所述ID匹配样本进行特征增强,获得特征增强样本,进而通过将待预测样本对应的特征增强样本输入纵向联邦残差提升模型,对所述特征增强样本执行模型预测,获得第二方模型预测结果,进而第二设备将第二方模型预测结果和所述纵向联邦残差提升模型对应的第二方模型权重发送至第一设备。
其中,在所述将所述中间样本特征发送至第二设备的步骤之前,所述纵向联邦预测优化方法还包括:
步骤B10,将所述待预测样本对应的待预测样本ID发送至所述第二设备,以供所述第二设备查找所述待预测样本ID对应的ID匹配样本;
在本实施例中,需要说明的是,所述待预测样本ID为所述待预测样本的样本ID。
步骤B20,若接收到所述第二设备发送的查找失败信息,则将所述第一方模型预测结果作为目标预测结果;
在本实施例中,若接收到所述第二设备发送的查找失败信息,则将所述第一方模型预测结果作为目标预测结果。具体地,若接收到所述第二设备发送的查找失败信息,则证明所述待预测样本不为第一设备和第二设备之间的对齐样本,进而将所述第一方模型预测结果作为目标预测结果,以实现基于作为完整模型的目标预测模型,独自对待预测样本进行样本预测的目的。
步骤B30,若未接收到所述第二设备发送的查找失败信息,则执行步骤:将所述中间样本特征发送至第二设备。
在本实施例中,若未接收到所述第二设备发送的查找失败信息,则执行步骤:将所述中间样本特征发送至第二设备,以获取第二设备发送的基于所述中间样本特征计算的准确度更高的残差提升信息,其中,所述残差提升信息为第二设备中纵向联邦残差提升模型输出的第二方模型预测结果。
步骤S30,获取所述目标预测模型对应的第一方模型权重,并接收所述第二设备发送的第二方模型预测结果和所述纵向联邦残差提升模型对应的第二方模型权重;
在本实施例中,需要说明的是,所述第一方模型权重由第一设备通过计算所述目标预测模型在迭代训练过程中的第一方分类正确样本数与第一方分类错误样本数的比值得到。
步骤S40,基于所述第一方模型权重和所述第二方模型权重,对所述第一方模型预测结果和所述第二方模型预测结果进行加权聚合,获得目标联邦预测结果。
在本实施例中,具体地,基于所述第一方模型权重和所述第二方模型权重,通过预设聚合规则对所述第一方模型预测结果和所述第二方模型预测结果进行加权聚合,获得目标联邦预测结果。其中,所述预设聚合规则包括求和求平均等,进而实现了利用第二设备生成的残差提升信息,提升第一设备对待预测样本进行样本的预测的准确度的目的。
需要说明的是,由于残差提升信息是纵向联邦残差提升模型基于特征增强样本生成的,而特征增强样本是基于待预测样本对应的ID匹配样本和对应的中间样本特征进行拼接生成的,从而实现了基于第一设备中的中间样本特征,对ID匹配样本进行特征增强的目的,使得纵向联邦残差提升模型的输入具备更多的特征信息,纵向联邦残差提升模型决策生成残差提升信息的决策依据更多,纵向联邦残差提升模型可输出准确度更高的残差提升信息,因而基于第二设备生成准确度更高的残差提升信息,在利用残差提升信息提升第一设备对作为对齐样本的待预测样本进行样本预测的准确度的基础上,进一步提升了第一设备对对齐样本进行样本预测的准确度。
进一步地,所述目标预测模型可以设置为二分类模型,用于作为推荐模型,也即通过对待预测样本进行二分类,判定是否推荐待预测样本对应的物品或者是否向待预测样本对应的用户进行物品推荐,而由于本申请实施例实现了在对对齐样本进行准确度更高基于残差提升信息的纵向联邦预测的情况下,可基于完整模型,在本地独自对未对齐样本进行准确预测的目的,提升了纵向联邦预测的整体样本预测准确度,提升了推荐模型的整体推荐的准确度。
另外,对于现有的纵向联邦预测模型,由于需要联合纵向联邦学习的各参与方才能进行模型预测,一旦存在参与方发生数据缺失或者发生宕机的情况,则无法基于完整的模型和样本数据,对样本进行预测,进而影响样本预测的准确度。而在本申请的实施例中,由于目标预测模型由第一设备单独持有,即使第二设备发生数据缺失或者发生宕机,第一设备也可以依靠作为完整模型的目标预测模型单独对待预测样本执行样本预测,进而提升了当纵向联邦学习的参与方发生数据缺失或者宕机时样本预测的准确度。
本申请实施例提供了一种纵向联邦预测优化方法,相比于现有技术采用的在纵向联邦预测场景中对于未对齐样本,预测方基于本地持有的部分纵向联邦预测模型进行本地预测的技术手段,本申请实施例首先提取待预测样本,并获取目标预测模型的特征提取器针对所述待预测样本进行特征提取生成的中间样本特征,以及所述目标预测模型针对所述待预测样本进行模型预测生成的第一方模型预测结果,其中,所述目标预测模型由所述第一设备本地迭代训练得到,进而实现了基于本地迭代训练的目标预测模型,对待预测样本在本地进行样本预测的目的。所以,对于若为未对齐样本的待预测样本,第一设备可基于作为完整模型的目标预测模型,独自对待预测样本进行准确的样本预测。进一步地,将所述中间样本特征发送至第二设备,以供所述第二设备基于纵向联邦残差提升模型对所述中间样本特征和所述待预测样本对应的ID匹配样本共同执行模型预测,为第二设备基于纵向联邦残差提升模型执行模型预测提供了更多的决策依据,提升了第二设备生成的第二方模型预测结果的准确度,其中,由于所述纵向联邦残差提升模型由所述第二设备基于纵向联邦公共样本,联合所述第一设备中所述目标预测模型在所述纵向联邦公共样本上的模型预测损失、所述纵向联邦公共样本对应的中间公共样本特征以及对应的样本标签,与第一设备进行基于纵向联邦学习的残差学习得到,进而对于待预测样本对齐的ID匹配样本,第二设备基于纵向联邦残差提升模型对ID匹配样本和中间样本特征执行模型预测可生成待预测样本对应的准确度更高的残差提升信息,也即第二方模型预测结果,进而获取所述目标预测模型对应的第一方模型权重,并接收所述第二设备发送的第二方模型预测结果和所述纵向联邦残差提升模型对应的第二方模型权重,并基于所述第一方模型权重和所述第二方模型权重,对所述第一方模型预测结果和所述第二方模型预测结果进行加权聚合,即可获得目标联邦预测结果,所以对于若为第一设备和第二设备之间的对齐样本的待预测样本,第一设备可借助第二设备基于纵向联邦残差提升模型针对待预测样本对齐的ID匹配样本和中间样本特征生成的准确度更高的残差提升信息(第二方模型预测结果),提升目标预测模型输出的第一方模型预测结果的准确度,实现对待预测样本进行基于准确度更高的残差提升信息的纵向联邦预测,实现了在对对齐样本进行基于准确度更高的残差提升信息的纵向联邦预测的情况下,可基于完整模型在本地独自对未对齐样本进行准确样本预测的目的,克服了预测方在对对齐样本进行准确度更高的联邦预测的情况下,无法对未对齐样本进行基于完整模型的样本预测,使得整体的样本预测准确度变低的技术缺陷,提升了纵向联邦预测的整体样本预测准确度。
进一步地,参照图2,在本申请另一实施例中,所述纵向联邦预测优化方法应用于第二设备,所述纵向联邦预测优化方法包括:
步骤C10,接收第一设备发送的中间样本特征,并查找与所述中间样本特征对应的ID匹配样本;
在本实施例中,需要说明的是,所述中间样本特征为第一设备基于目标预测模型中的特征提取器对待预测样本进行特征提取生成,所述ID匹配样本为第二设备中与所述中间样本特征对应的待预测样本的样本ID一致的样本。
具体地,在步骤C10中,接收第一设备发送的基于目标预测模型的特征提取器对所述待预测样本进行特征提取生成的中间样本特征以及第一设备发送的待预测样本对应的待预测样本ID,进而依据所述待预测样本ID,查找ID匹配样本。
其中,在所述查找ID匹配样本的步骤之后,所述纵向联邦预测优化方法还包括:
步骤D10,若查找成功,则执行步骤:基于纵向联邦残差提升模型,对所述ID匹配样本和所述中间样本特征共同执行模型预测,获得第二方模型预测结果;
在本实施例中,若查找成功,则证明第二设备具备待预测样本对应的对齐样本,也即为ID匹配样本,进而执行步骤:基于纵向联邦残差提升模型,对所述ID匹配样本和所述中间样本特征共同执行模型预测,获得第二方模型预测结果,基于纵向联邦残差提升模型和待预测样本对应的ID匹配样本以及对应的中间样本特征,生成准确度更高的待预测样本对应的残差提升信息。其中,残差提升信息即为第二方模型预测结果,进而将残差提升信息发送至第一设备,第一设备即可实现基于准确度更高的残差提 升信息,提升第一设备对待预测样本的样本预测结果的准确度。
步骤D20,若查找失败,则向所述第一设备反馈查找失败信息,以供所述第一设备在接收所述查找失败信息后,将基于目标预测模型针对所述待预测样本生成的第一方模型预测结果作为目标预测结果。
在本实施例中,若查找失败,则证明第二设备不存在待预测样本对应的对齐样本,进而向所述第一设备反馈查找失败信息,进而第一设备在接收所述查找失败信息后,即可直接将基于目标预测模型针对所述待预测样本生成的第一方模型预测结果作为目标预测结果,以实现独自对未对齐样本进行样本预测的目的。
步骤C20,基于纵向联邦残差提升模型,对所述ID匹配样本和所述中间样本特征共同执行模型预测,获得第二方模型预测结果;
在本实施例中,需要说明的是,所述纵向联邦残差提升模型由所述第二设备基于纵向联邦公共样本,联合所述第一设备中目标预测模型在所述纵向联邦公共样本上的模型预测损失、所述纵向联邦公共样本对应的中间公共样本特征以及对应的样本标签,与第一设备进行基于纵向联邦学习的残差学习得到。且所述第二设备基于纵向联邦公共样本,联合所述第一设备中目标预测模型在所述纵向联邦公共样本上的模型预测损失、所述纵向联邦公共样本对应的中间公共样本特征以及对应的样本标签,与第一设备进行基于纵向联邦学习的残差学习得到纵向联邦残差提升模型的具体过程可参照上述步骤A10至步骤A40中的内容,在此不再赘述。
具体地,在步骤C20中,基于所述中间样本特征,对所述ID匹配样本进行特征增强,获得特征增强样本,进而通过将所述特征增强样本输入纵向联邦残差提升模型中执行模型预测,生成第二方模型预测结果。
其中,所述基于纵向联邦残差提升模型,对所述ID匹配样本和所述中间样本特征共同执行模型预测,获得第二方模型预测结果的步骤包括:
步骤C21,对所述ID匹配样本和所述中间样本特征进行拼接,获得特征增强样本;
在本实施例中,对所述ID匹配样本和所述中间样本特征进行拼接,以基于所述中间样本特征对所述ID匹配样本进行特征增强,获得特征增强样本。
在另一种可实施的方式中,可通过将所述ID匹配样本和所述中间样本特征进行加权拼接,以基于所述中间样本特征对所述ID匹配样本进行特征增强,获得特征增强样本。
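The feature-enhancement step described above (splicing the ID matching sample with the intermediate sample features received from the first party, optionally scaling each part first in the weighted-concatenation variant) can be sketched as follows; the function name and the weight values are illustrative assumptions.

```python
def feature_enhance(id_matched, intermediate, weights=None):
    # Splice the second party's ID matching sample with the first party's
    # intermediate sample features to obtain a feature-enhanced sample.
    # When `weights` is given, each part is scaled before concatenation
    # (the weighted-concatenation variant described above).
    if weights is None:
        return list(id_matched) + list(intermediate)
    w_id, w_mid = weights
    return [w_id * v for v in id_matched] + [w_mid * v for v in intermediate]

enhanced = feature_enhance([0.1, 0.2], [0.9, 0.8, 0.7])
weighted = feature_enhance([0.1, 0.2], [0.9, 0.8, 0.7], weights=(2.0, 0.5))
```

The enhanced sample carries the feature dimensions of both parties, which is what gives the residual promotion model more decision evidence than the ID matching sample alone.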
步骤C22,基于所述纵向联邦残差提升模型,对所述特征增强样本执行模型预测,获得所述第二方模型预测结果。
具体地,在步骤C22中,通过将所述特征增强样本输入纵向联邦残差提升模型,对所述特征增强样本进行数据处理。其中,数据处理的过程包括卷积、池化和全连接等,进而获得所述纵向联邦残差提升模型中的最后一层全连接层输出的全连接层输出结果,进而通过预设激活函数,将所述全连接层输出结果转换为第二方模型预测结果。
其中,在所述基于纵向联邦残差提升模型,对所述ID匹配样本和所述中间样本特征共同执行模型预测,获得第二方模型预测结果的步骤之前,所述纵向联邦预测优化方法还包括:
步骤E10,获取第二方初始模型权重,并接收所述第一设备发送的中间训练样本特征、训练样本标签和第一方模型预测损失,其中,所述第一方模型预测损失由所述第一设备基于目标预测模型在所述训练样本ID匹配样本对应的训练样本上的第一方模型预测结果和所述训练样本标签计算得到,所述中间训练样本特征由所述第一设备基于所述目标预测模型的特征提取器针对所述训练样本进行特征提取得到;
在本实施例中,需要说明的是,所述第一设备生成第一方模型预测损失和中间训练样本特征的具体过程可参照上述步骤A10至步骤A30中的具体内容,在此不再赘述。
另外,需要说明的是,所述第一设备需要将目标预测模型在迭代训练过程中所有训练样本对应的训练样本ID、对应的训练样本标签、对应的第一方模型预测损失和对应的中间训练样本特征均发送至第二设备。
步骤E20,获取训练样本ID匹配样本,并基于待训练残差提升模型,对所述训练样本ID匹配样本和所述中间训练样本特征共同执行模型预测,获得第二方训练模型预测结果;
具体地,在步骤E20中,提取训练样本ID匹配样本,进而将所述训练样本ID匹配样本与所述中间训练样本特征进行拼接,以基于所述中间训练样本特征,对所述训练样本ID匹配样本进行特征增强,获得训练特征增强样本,进而通过将所述训练特征增强样本输入待训练残差提升模型中执行模型预测,获得第二方训练模型预测结果。
步骤E30,基于所述训练样本标签、所述第二方初始模型权重和所述第二方训练模型预测结果,计算第二方模型预测损失;
在本实施例中,第二设备基于所述训练样本标签、所述第二方初始模型权重和所述第二方训练模型预测结果,计算第二方模型预测损失的损失函数如下:
(第二方模型预测损失的计算公式见原文图片 PCTCN2021139640-appb-000008)
其中,L(θ_B, α_B, X_B, Y)为所述残差损失,N_C为一轮迭代中训练样本ID匹配样本的数量,θ_B为所述待训练残差提升模型,α_B为所述第二方初始模型权重,X_B为N_C个训练样本ID匹配样本组成的训练样本集合,Y为N_C个训练样本ID匹配样本对应的训练样本标签所组成的标签集合;公式中其余符号(原文以图片形式给出)分别表示:一轮迭代中第i个训练样本标签、一轮迭代中第i个训练样本ID匹配样本的特征、N_C个所述训练样本ID匹配样本对应的第一方模型预测损失,以及一轮迭代中第i个训练样本ID匹配样本对应的中间训练样本特征。
步骤E40,基于所述第一方模型预测损失和所述第二方模型预测损失生成的残差损失,迭代优化所述待训练残差提升模型,获得所述纵向联邦残差提升模型。
具体地,在步骤E40中,基于所述第一方模型预测损失和所述第二方模型预测损失,计算残差损失,进而判断所述残差损失是否收敛。若所述残差损失收敛,则将所述待训练残差提升模型作为所述纵向联邦残差提升模型;若所述残差损失未收敛,则基于所述残差损失计算的梯度,通过预设模型优化方法更新所述待训练残差提升模型,以及基于所述训练样本ID匹配样本对应的第二方训练模型预测结果和对应的训练样本标签,更新所述第二方初始模型权重,并返回执行步骤:获取训练样本ID匹配样本,进行下一次迭代。其中,所述基于所述第一方模型预测损失和所述第二方模型预测损失,计算残差损失的具体过程可参照上述步骤A40中的具体内容,在此不再赘述。
进一步地,第二设备统计所述待训练残差提升模型在迭代训练过程中分类正确的样本数量以及分类错误的样本数量,获得第二方分类正确样本数和第二方分类错误样本数,进而通过计算所述第二方分类正确样本数和所述第二方分类错误样本数的比值,生成第二方模型权重。其中,第二设备生成第二方模型权重的具体过程可参照第一设备生成第一方模型权重的具体过程,在此不再赘述。
步骤C30,获取所述纵向联邦残差提升模型对应的第二方模型权重,并将所述第二方模型预测结果和所述第二方模型权重发送至所述第一设备,以供所述第一设备基于所述目标预测模型针对所述ID匹配样本对应的待预测样本生成的第一方模型预测结果、所述目标预测模型对应的第一方模型权重、所述第二方模型预测结果和所述第二方模型权重,生成目标联邦预测结果,其中,所述目标预测模型由所述第一设备本地迭代训练得到。
在本实施例中,需要说明的是,所述第一设备基于目标预测模型针对所述训练样本ID匹配样本对应的待预测样本生成第一方模型预测结果的具体过程可参照步骤S10中的具体步骤,在此不再赘述。
具体地,获取所述纵向联邦残差提升模型对应的第二方模型权重,并将所述第二方模型预测结果和所述第二方模型权重发送至第一设备,以供所述第一设备基于目标预测模型对应的第一方模型权重和所述第二方模型权重,通过预设聚合规则,对所述第一方模型预测结果和所述第二方模型预测结果进行加权聚合,获得目标联邦预测结果,进而实现了基于第二设备生成的准确度更高的残差提升信息,优化第一设备中对待预测样本的第一方模型预测结果,以提升第一设备对待预测样本的样本预测结果的准确度。
本申请实施例提供了一种纵向联邦预测优化方法,相比于现有技术采用的在纵向联邦预测场景中对于未对齐样本,预测方基于本地持有的部分纵向联邦预测模型进行本地预测的技术手段,本申请实施例首先接收第一设备发送的中间样本特征,并查找与所述中间样本特征对应的ID匹配样本,基于纵向联邦残差提升模型,对所述ID匹配样本和所述中间样本特征共同执行模型预测,获得第二方模型预测结果。实现了利用第一设备发送的中间样本特征,生成待预测样本对应的准确度更高的残差提升信息的目的。进而获取所述纵向联邦残差提升模型对应的第二方模型权重,并将所述第二方模型预测结果和所述第二方模型权重发送至所述第一设备,以供所述第一设备基于所述目标预测模型针对所述ID匹配样本对应的待预测样本生成的第一方模型预测结果、所述目标预测模型对应的第一方模型权重、所述第二方模型预测结果和所述第二方模型权重,生成目标联邦预测结果。其中,所述目标预测模型由所述第一设备本地迭代训练得到,进而实现了对于对齐样本,利用第二设备生成的准确度更高的残差提升信息,优化第一设备生成的第一方模型预测结果,以生成样本预测准确度更高的目标联邦预测结果,且由于所述目标预测模型由所述第一设备本地迭代训练得到,对于未对齐样本,第一设备也可基于作为完整模型的目标预测模型,在本地独自对待预测样本完成样本预测,所以克服了预测方在对对齐样本进行准确度更高的联邦预测的情况下,无法对未对齐样本进行基于完整模型的样本预测,使得整体的样本预测准确度变低的技术缺陷,提升了纵向联邦预测的整体样本预测准确度。
进一步地,参照图3,在本申请另一实施例中,还提供一种纵向联邦学习建模优化方法,所述纵向联邦学习建模优化方法应用于第一设备,所述纵向联邦学习建模优化方法包括:
步骤F10,获取第一方初始模型权重,并提取训练样本和所述训练样本对应的训练样本标签;
在本实施例中,需要说明的是,所述纵向联邦学习建模优化方法应用于纵向联邦学习场景,第一设备和第二设备均为纵向联邦学习场景的参与方。其中,第一设备具备携带样本标签的样本,第二设备具备无样本标签的样本,且第一设备为预测方,用于构建预测模型,第二设备为辅助数据提供方,用于构建为第一设备提供残差提升信息的纵向联邦残差提升模型,以提升第一设备中预测模型生成的预测结果的准确度。
另外,需要说明的是,所述训练样本的数量至少为1,所述训练样本标签为所述训练样本的标识,所述第一方初始模型权重为第一设备中预先设置的表示模型的预测准确度的模型权重的初始值。其中,可设置模型权重等于预测模型分类正确的样本数与分类错误的样本数的比值的对数值。可选地,模型权重的初始值可设置为1。
步骤F20,获取待训练目标预测模型的特征提取器针对所述训练样本进行特征提取生成的中间训练样本特征;
在本实施例中,基于待训练目标预测模型中的特征提取器,对所述训练样本进行特征提取,获得中间训练样本特征。
步骤F30,基于所述训练样本标签、所述中间训练样本特征对应的训练模型预测结果和所述第一方初始模型权重,通过计算所述待训练目标预测模型对应的第一方模型预测损失,迭代优化所述待训练目标预测模型,获得所述目标预测模型;
具体地,在步骤F30中,通过所述待训练目标预测模型中的分类器,对所述中间训练样本特征进行全连接,获得训练全连接层输出,进而通过预设激活函数,将所述训练全连接层输出转换为训练模型预测结果,进而基于所述训练样本、所述训练模型预测结果和所述第一方初始模型权重,计算第一方模型预测损失。然后判断所述第一方模型预测损失是否收敛,若所述第一方模型预测损失收敛,则将所述待训练目标预测模型作为所述目标预测模型;若所述第一方模型预测损失未收敛,则基于所述第一方模型预测损失计算的梯度,通过预设模型优化方法,更新所述待训练目标预测模型,以及基于所述训练模型预测结果和所述训练样本标签,更新所述第一方初始模型权重,并返回执行步骤:提取训练样本和所述训练样本对应的训练样本标签,进行下一次迭代。其中,所述预设模型优化方法包括梯度下降法和梯度上升法等,所述基于所述训练样本、所述训练模型预测结果和所述第一方初始模型权重,计算第一方模型预测损失的计算过程可参照上述步骤A10至步骤A30中的具体内容,在此不再赘述。
其中,所述基于所述训练样本标签、所述中间训练样本特征对应的训练模型预测结果和所述第一方初始模型权重,通过计算所述待训练目标预测模型对应的第一方模型预测损失,迭代优化所述待训练目标预测模型,获得所述目标预测模型的步骤包括:
步骤F31,基于所述待训练目标预测模型中的分类器,将所述中间训练样本特征转换为训练模型预测结果;
在本实施例中,基于所述待训练目标预测模型中的分类器,对所述中间训练样本特征进行全连接,获得训练全连接层输出,进而通过预设激活函数,将所述训练全连接层输出转换为训练模型预测结果。
步骤F32,基于所述训练样本标签、所述训练模型预测结果和所述第一方初始模型权重,计算第一方模型预测损失;
在本实施例中,需要说明的是,步骤F32的具体计算过程可参照步骤A30中的内容,在此不再赘述。
步骤F33,基于所述训练模型预测结果和所述训练样本标签,更新所述第一方初始模型权重;
具体地,在步骤F33中,基于所述训练模型预测结果和所述训练样本标签,更新所述待训练残差提升模型对应的当前分类错误样本数和当前分类正确样本数,进而通过计算所述当前分类正确样本数和所述当前分类错误样本数的比值,重新计算第一方初始模型权重。其中,重新计算第一方初始模型权重的过程具体可参照步骤A30之后第一设备计算第一方模型权重的具体过程,在此不再赘述。
步骤F34,基于所述第一方模型预测损失和更新后的第一方初始模型权重,迭代优化所述待训练目标预测模型,获得所述目标预测模型。
具体地,在步骤F34中,判断所述第一方模型预测损失是否收敛,若所述第一方模型预测损失收敛,则将所述待训练目标预测模型作为所述目标预测模型,以及将更新后的第一方初始模型权重作为所述第一方模型权重;若所述第一方模型预测损失未收敛,则基于所述第一方模型预测损失计算的梯度,通过预设模型优化方法,更新所述待训练目标预测模型,并返回执行步骤:提取训练样本和所述训练样本对应的训练样本标签,基于更新后的待训练目标预测模型和更新后的第一方初始模型权重,进行下一 轮迭代。
其中,在所述基于所述训练样本标签、所述中间训练样本特征对应的训练模型预测结果和所述第一方初始模型权重,通过计算所述待训练目标预测模型对应的第一方模型预测损失,迭代优化所述待训练目标预测模型,获得所述目标预测模型的步骤之后,所述纵向联邦学习建模优化方法还包括:
步骤G10,获取所述目标预测模型对应的第一方分类正确样本数和对应的第一方分类错误样本数;
在本实施例中,需要说明的是,所述第一方分类正确样本数为所述目标预测模型在迭代训练过程中针对训练样本的输出分类标签和对应的训练样本标签一致的训练样本的数量,所述第一方分类错误样本数为所述目标预测模型在迭代训练过程中针对训练样本的输出分类标签和对应的训练样本标签不一致的训练样本的数量。
步骤G20,通过计算所述第一方分类正确样本数和所述第一方分类错误样本数的比值,生成第一方模型权重。
在本实施例中,步骤G20的具体计算式如下:
α_A = log(A/B)
其中,α A为所述第一方模型权重,A为所述第一方分类正确样本数,B为所述第一方分类错误样本数。
步骤F40,将所述训练样本标签、所述中间训练样本特征和所述第一方模型预测损失发送至第二设备,以供所述第二设备计算第二方模型预测损失,并基于所述第二方模型预测损失和所述第一方模型预测损失计算的残差损失,优化所述待训练残差提升模型,获得纵向联邦残差提升模型。
在本实施例中,需要说明的是,所述第二方模型预测损失基于待训练残差提升模型、所述训练样本对应的训练样本ID匹配样本、所述中间训练样本特征、所述样本标签和获取的第二方初始模型权重进行计算得到。
具体地,将所述目标预测模型在迭代训练过程中所有训练样本对应的训练样本标签、对应的第一方模型预测损失、对应的训练样本ID和对应的中间训练样本特征发送至第二设备,进而第二设备提取训练样本ID,并基于所述训练样本ID,查找所述训练样本ID对应的训练样本ID匹配样本,并通过将所述训练样本ID匹配样本和所述中间训练样本特征共同对应的训练特征增强样本输入待训练残差提升模型中执行模型预测,获得第二方训练模型预测结果。进而基于所述训练样本ID匹配样本对应的第二方训练模型预测结果、对应的训练样本标签和获取的第二方初始模型权重,计算第二方模型预测损失,进而第二设备基于所述第一方模型预测损失和所述第二方模型预测损失,计算残差损失,并基于所述残差损失,迭代优化待训练残差提升模型,获得纵向联邦残差提升模型。其中,所述第二设备构建纵向联邦残差提升模型的具体过程可参照步骤A10至步骤A40中的具体内容,在此不再赘述。
其中,在所述将所述训练样本标签、所述中间训练样本特征和所述第一方模型预测损失发送至第二设备,以供所述第二设备基于待训练残差提升模型、所述训练样本对应的训练样本ID匹配样本、所述中间训练样本特征和所述样本标签,计算第二方模型预测损失,并基于所述第二方模型预测损失和所述第一方模型预测损失计算的残差损失,优化所述待训练残差提升模型,获得纵向联邦残差提升模型的步骤之后,所述纵向联邦学习建模优化方法还包括:
步骤H10,提取待预测样本,并获取目标预测模型的特征提取器针对所述待预测样本进行特征提取生成的中间样本特征,以及所述目标预测模型针对所述待预测样本进行模型预测生成的第一方模型预测结果;
在本实施例中,具体地,提取待预测样本,并基于目标预测模型中的特征提取器,对所述待预测样本进行特征提取,获得中间样本特征,进而基于所述目标预测模型中的分类器,对所述中间样本特征进行全连接,获得目标全连接层输出,进而通过预设激活函数,将所述目标全连接层输出转换为第一方模型预测结果,其中,所述第一方模型预测结果可以为分类概率。
步骤H20,将所述中间样本特征发送至第二设备,以供所述第二设备基于纵向联邦残差提升模型对所述中间样本特征和所述待预测样本对应的ID匹配样本共同执行模型预测,获得第二方模型预测结果;
在本实施例中,具体地,将所述中间样本特征发送至第二设备,进而第二设备将待预测样本对应的ID匹配样本以及对应的中间样本特征进行拼接,以对所述ID匹配样本进行特征增强,获得特征增强样本,进而通过将待预测样本对应的特征增强样本输入纵向联邦残差提升模型,对所述特征增强样本执行模型预测,获得第二方模型预测结果,进而第二设备将第二方模型预测结果和所述纵向联邦残差提升模型对应的第二方模型权重发送至第一设备。
步骤H30,接收所述第二设备发送的第二方模型预测结果和所述纵向联邦残差提升模型对应的第二方模型权重;
步骤H40,基于所述目标预测模型对应的第一方模型权重和所述第二方模型权重,对所述第一方模型预测结果和所述第二方模型预测结果进行加权聚合,获得目标联邦预测结果。
在本实施例中,具体地,基于所述第一方模型权重和所述第二方模型权重,通过预设聚合规则对所述第一方模型预测结果和所述第二方模型预测结果进行加权聚合,获得目标联邦预测结果。其中,所述预设聚合规则包括求和求平均等,进而实现了利用第二设备生成的残差提升信息,提升第一设备对待预测样本进行样本的预测的准确度的目的。
本申请实施例提供了一种纵向联邦学习建模优化方法,也即,获取第一方初始模型权重,并提取训练样本和所述训练样本对应的训练样本标签,进而获取待训练目标预测模型的特征提取器针对所述训练样本进行特征提取生成的中间训练样本特征,进而基于所述训练样本标签、所述中间训练样本特征对应的训练模型预测结果和所述第一方初始模型权重,通过计算所述待训练目标预测模型对应的第一方模型预测损失,迭代优化所述待训练目标预测模型,获得所述目标预测模型,实现了在第一设备本地构建作为完整模型的目标预测模型的目的。进而将所述训练样本标签、所述中间训练样本特征和所述第一方模型预测损失发送至第二设备,以供所述第二设备计算第二方模型预测损失,并基于所述第二方模型预测损失和所述第一方模型预测损失计算的残差损失,优化所述待训练残差提升模型,获得纵向联邦残差提升模型,实现了基于纵向联邦学习的残差学习。利用第一设备中训练样本对应的中间训练样本特征,在第二设备处构建的纵向联邦残差提升模型的目的,扩展了第二设备中训练样本ID匹配样本的特征维度,使得纵向联邦残差提升模型的预测准确度更高,进而基于第一设备的目标预测模型和第二设备处的纵向联邦残差提升模型,即可实现在对对齐样本进行基于准确度更高的残差提升信息的纵向联邦预测的情况下,可基于完整模型,在本地独自对未对齐样本进行准确样本预测的目的。为克服预测方在对对齐样本进行准确度更高的联邦预测的情况下,无法对未对齐样本进行基于完整模型的样本预测,使得整体的样本预测准确度变低的技术缺陷奠定了基础。
Further, referring to FIG. 4, another embodiment of the present application provides a vertical federated learning modeling optimization method, applied to the second device, the method including:
Step R10: obtaining a second-party initial model weight, and receiving the intermediate training sample features, training sample labels, and first-party model prediction loss sent by the first device;
In this embodiment, it should be noted that the first-party model prediction loss is computed by the first device from the training sample labels and the first-party model prediction results of the target prediction model on the training samples corresponding to the training-sample-ID-matched samples, and the intermediate training sample features are obtained by the first device through feature extraction on the training samples by the feature extractor of the target prediction model. The specific process by which the first device generates the first-party model prediction loss and the intermediate training sample features follows steps A10 through A30 and is not repeated here.
Specifically, a second-party initial model weight is obtained, and the training sample labels, corresponding first-party model prediction losses, and corresponding intermediate training sample features for all training samples used during the iterative training of the target prediction model are received from the first device.
Step R20: obtaining training-sample-ID-matched samples, and performing model prediction jointly on the training-sample-ID-matched samples and the intermediate training sample features based on the residual boosting model to be trained, obtaining second-party training model prediction results;
In this embodiment, specifically, the training sample IDs are extracted and the training-sample-ID-matched samples corresponding to those IDs are looked up. The ID-matched samples are then concatenated with the intermediate training sample features — feature-augmenting the ID-matched samples based on the intermediate training sample features — to obtain training feature-augmented samples, which are fed into the residual boosting model to be trained to perform model prediction, obtaining second-party training model prediction results.
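The ID lookup and feature augmentation described above can be sketched as follows. This is a minimal illustration; the `local_table` mapping and the flat-vector layout are assumptions for the sketch, not the patent's data format.

```python
import numpy as np

def build_augmented_samples(ids, intermediate_feats, local_table):
    """For each training sample ID, look up the second party's ID-matched
    local sample and concatenate it with the intermediate training sample
    features received from the first party, producing the training
    feature-augmented samples fed to the residual boosting model."""
    rows = []
    for sid, mid_feat in zip(ids, intermediate_feats):
        matched = local_table[sid]                        # ID-matched sample lookup
        rows.append(np.concatenate([matched, mid_feat]))  # feature augmentation
    return np.stack(rows)
```

The concatenation is what "expands the feature dimensionality" of the second party's samples, as the summary below this section puts it.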
Step R30: computing a second-party model prediction loss based on the training sample labels, the second-party initial model weight, and the second-party training model prediction results;
In this embodiment, it should be noted that the specific process by which the second device computes the second-party model prediction loss in step R30 follows step E30 and is not repeated here.
Step R40: iteratively optimizing the residual boosting model to be trained based on the residual loss generated from the first-party model prediction loss and the second-party model prediction loss, obtaining the vertical federated residual boosting model.
In this embodiment, specifically, a residual loss is computed from the first-party model prediction loss and the second-party model prediction loss, and it is then determined whether the residual loss has converged. If the residual loss has converged, the residual boosting model to be trained is taken as the vertical federated residual boosting model. If the residual loss has not converged, the residual boosting model to be trained is updated through a preset model optimization method using the gradient computed from the residual loss, the second-party initial model weight is updated based on the second-party training model prediction results corresponding to the training-sample-ID-matched samples and the corresponding sample labels, and execution returns to the step of obtaining training-sample-ID-matched samples for the next iteration. The specific computation of the residual loss from the first-party and second-party model prediction losses follows step A40 and is not repeated here.
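The convergence-checked iteration above follows a generic pattern, sketched below. The actual residual-loss computation follows step A40, which lies outside this excerpt, so one full training round (loss computation, gradient update, weight update) is abstracted behind `step_fn`; the tolerance-based stopping rule is an assumption.

```python
def iterate_until_converged(step_fn, tol=1e-4, max_rounds=1000):
    """Run training rounds until the residual loss converges.

    step_fn: performs one round (computes the residual loss, updates the
             residual boosting model and the second-party weight) and
             returns the round's residual loss.
    Stops when the residual loss change falls below `tol`, or after
    `max_rounds` rounds as a safety cap."""
    prev = float("inf")
    for _ in range(max_rounds):
        loss = step_fn()
        if abs(prev - loss) < tol:   # residual loss has converged
            break
        prev = loss
    return prev
```

Any concrete residual-learning round can be dropped in as `step_fn` without changing the loop.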
After the step of iteratively optimizing the residual boosting model to be trained based on the residual loss generated from the first-party and second-party model prediction losses, obtaining the vertical federated residual boosting model, the vertical federated learning modeling optimization method further includes:
Step T10: obtaining the number of correctly classified second-party samples and the number of misclassified second-party samples corresponding to the vertical federated residual boosting model;
In this embodiment, it should be noted that the second-party correctly-classified sample count is the number of training-sample-ID-matched samples for which, during iterative training, the classification label output by the vertical federated residual boosting model matches the corresponding training sample label; the second-party misclassified sample count is the number of training-sample-ID-matched samples for which the output classification label does not match the corresponding training sample label.
Step T20: generating the second-party model weight by computing the ratio of the second-party correctly-classified sample count to the second-party misclassified sample count.
In this embodiment, step T20 is computed as follows:
α_A = A / B
where α_A is the second-party model weight, A is the second-party correctly-classified sample count, and B is the second-party misclassified sample count.
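In code, the weight computation of step T20 is a one-line ratio; the zero-denominator caveat noted in the comment is our addition, not part of the formula.

```python
def party_model_weight(correct: int, wrong: int) -> float:
    """Model weight alpha_A = A / B: the ratio of correctly classified to
    misclassified ID-matched training samples, per the formula above.
    A real implementation would need to guard against wrong == 0
    (e.g. with a smoothing constant), which the formula does not specify."""
    return correct / wrong
```

A model that classifies 80 of 100 samples correctly thus receives weight 4.0, so more accurate party models contribute more to the weighted aggregation.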
After the step of iteratively optimizing the residual boosting model to be trained based on the residual loss generated from the first-party and second-party model prediction losses, obtaining the vertical federated residual boosting model, the vertical federated learning modeling optimization method further includes:
Step Y10: receiving the intermediate sample features sent by the first device, and looking up the ID-matched sample corresponding to the intermediate sample features;
In this embodiment, specifically, the second device receives the intermediate sample features — generated by feature extraction on the sample to be predicted by the feature extractor of the target prediction model — sent by the first device, along with the sample-to-be-predicted ID corresponding to the sample to be predicted, and looks up the ID-matched sample according to that ID.
Step Y20: performing model prediction jointly on the ID-matched sample and the intermediate sample features based on the vertical federated residual boosting model, obtaining a second-party model prediction result;
In this embodiment, the ID-matched sample is feature-augmented with the intermediate sample features, yielding a feature-augmented sample, which is then fed into the vertical federated residual boosting model to perform model prediction, generating the second-party model prediction result. The specific implementation of the feature augmentation of the ID-matched sample follows step C21 and is not repeated here.
Step Y30: sending the second-party model prediction result and the second-party model weight corresponding to the vertical federated residual boosting model to the first device, so that the first device generates the target federated prediction result based on the first-party model prediction result generated by the target prediction model for the sample to be predicted corresponding to the ID-matched sample, the first-party model weight corresponding to the target prediction model, the second-party model prediction result, and the second-party model weight, the target prediction model being obtained by local iterative training at the first device.
In this embodiment, it should be noted that the specific process by which the first device generates the first-party model prediction result for the sample to be predicted corresponding to the ID-matched sample, based on the target prediction model, follows step S10 and is not repeated here.
Specifically, the second-party model weight corresponding to the vertical federated residual boosting model is obtained, and the second-party model prediction result and the second-party model weight are sent to the first device, so that the first device weight-aggregates the first-party and second-party model prediction results according to a preset aggregation rule, using the first-party model weight corresponding to the target prediction model and the second-party model weight, to obtain the target federated prediction result. The more accurate residual boosting information generated by the second device thereby optimizes the first device's first-party model prediction result for the sample to be predicted, improving the accuracy of the first device's sample prediction.
An embodiment of the present application provides a vertical federated learning modeling optimization method: a second-party initial model weight is obtained, and the intermediate training sample features, training sample labels, and first-party model prediction loss sent by the first device are received. The first-party model prediction loss is computed by the first device from the training sample labels and the first-party model prediction results of the target prediction model on the training samples corresponding to the training-sample-ID-matched samples, and the intermediate training sample features are obtained by the first device through feature extraction on the training samples by the feature extractor of the target prediction model. The training-sample-ID-matched samples are then obtained, and model prediction is performed jointly on them and the intermediate training sample features based on the residual boosting model to be trained, yielding second-party training model prediction results. A second-party model prediction loss is computed from the training sample labels, the second-party initial model weight, and the second-party training model prediction results, and the residual boosting model to be trained is iteratively optimized based on the residual loss generated from the first-party and second-party model prediction losses, obtaining the vertical federated residual boosting model. In this way, the intermediate training sample features from the first device are used to feature-augment the training-sample-ID-matched samples in the second device, and the vertical federated residual boosting model is built jointly with the first device through residual learning based on vertical federated learning. For aligned samples, the second device can then generate more accurate residual boosting information that optimizes the first-party model prediction results generated by the first device, producing target federated prediction results of higher accuracy. This lays the foundation for overcoming the technical defect that a predicting party, while performing more accurate federated prediction on aligned samples, cannot perform complete-model prediction on unaligned samples, which lowers overall sample prediction accuracy.
Referring to FIG. 5, FIG. 5 is a schematic diagram of the device system of the hardware operating environment involved in the solutions of the embodiments of the present application.
As shown in FIG. 5, the vertical federated prediction optimization device may include a processor 1001 (for example, a CPU), a memory 1005, and a communication bus 1002. The communication bus 1002 implements connection and communication between the processor 1001 and the memory 1005. The memory 1005 may be high-speed RAM or stable non-volatile memory such as disk storage; optionally, the memory 1005 may also be a storage device independent of the processor 1001.
Optionally, the vertical federated prediction optimization device may further include a user interface, a network interface, a camera, an RF (Radio Frequency) circuit, sensors, an audio circuit, a WiFi module, and the like. The user interface may include a display and an input sub-module such as a keyboard, and may optionally include standard wired and wireless interfaces. The network interface may optionally include standard wired and wireless interfaces (such as a WI-FI interface).
Those skilled in the art will understand that the device system shown in FIG. 5 does not limit the vertical federated prediction optimization device, which may include more or fewer components than shown, combine certain components, or have a different component arrangement.
As shown in FIG. 5, the memory 1005, as a computer storage medium, may include an operating system, a network communication module, and a vertical federated prediction optimization program. The operating system manages and controls the hardware and software resources of the vertical federated prediction optimization device and supports the running of the vertical federated prediction optimization program and other software and/or programs. The network communication module implements communication among the components inside the memory 1005 as well as with other hardware and software in the vertical federated prediction optimization system.
In the vertical federated prediction optimization device shown in FIG. 5, the processor 1001 executes the vertical federated prediction optimization program stored in the memory 1005 to implement the steps of any of the vertical federated prediction optimization methods described above.
The specific implementation of the vertical federated prediction optimization device of the present application is substantially the same as the embodiments of the vertical federated prediction optimization method described above and is not repeated here.
Referring to FIG. 6, FIG. 6 is a schematic diagram of the device system of the hardware operating environment involved in the solutions of the embodiments of the present application.
As shown in FIG. 6, the vertical federated learning modeling optimization device may include a processor 1001 (for example, a CPU), a memory 1005, and a communication bus 1002. The communication bus 1002 implements connection and communication between the processor 1001 and the memory 1005. The memory 1005 may be high-speed RAM or stable non-volatile memory such as disk storage; optionally, the memory 1005 may also be a storage device independent of the processor 1001.
Optionally, the vertical federated learning modeling optimization device may further include a user interface, a network interface, a camera, an RF (Radio Frequency) circuit, sensors, an audio circuit, a WiFi module, and the like. The user interface may include a display and an input sub-module such as a keyboard, and may optionally include standard wired and wireless interfaces. The network interface may optionally include standard wired and wireless interfaces (such as a WI-FI interface).
Those skilled in the art will understand that the device system shown in FIG. 6 does not limit the vertical federated learning modeling optimization device, which may include more or fewer components than shown, combine certain components, or have a different component arrangement.
As shown in FIG. 6, the memory 1005, as a computer storage medium, may include an operating system, a network communication module, and a vertical federated learning modeling optimization program. The operating system manages and controls the hardware and software resources of the vertical federated learning modeling optimization device and supports the running of the vertical federated learning modeling optimization program and other software and/or programs. The network communication module implements communication among the components inside the memory 1005 as well as with other hardware and software in the vertical federated learning modeling optimization system.
In the vertical federated learning modeling optimization device shown in FIG. 6, the processor 1001 executes the vertical federated learning modeling optimization program stored in the memory 1005 to implement the steps of any of the vertical federated learning modeling optimization methods described above.
The specific implementation of the vertical federated learning modeling optimization device of the present application is substantially the same as the embodiments of the vertical federated learning modeling optimization method described above and is not repeated here.
An embodiment of the present application further provides a vertical federated prediction optimization apparatus, applied to a first device, the vertical federated prediction optimization apparatus including:
a model prediction module, configured to extract a sample to be predicted, and obtain the intermediate sample features generated by the feature extractor of a target prediction model through feature extraction on the sample to be predicted, as well as the first-party model prediction result generated by the target prediction model through model prediction on the sample to be predicted, wherein the target prediction model is obtained by local iterative training at the first device;
a sending module, configured to send the intermediate sample features to a second device, so that the second device performs model prediction jointly on the intermediate sample features and the ID-matched sample corresponding to the sample to be predicted based on a vertical federated residual boosting model, obtaining a second-party model prediction result, wherein the vertical federated residual boosting model is obtained by the second device through residual learning based on vertical federated learning with the first device, on the basis of vertical federated common samples, in combination with the model prediction loss of the first device's target prediction model on the vertical federated common samples, the intermediate common sample features corresponding to the vertical federated common samples, and the corresponding sample labels;
a receiving module, configured to obtain the first-party model weight corresponding to the target prediction model, and receive the second-party model prediction result sent by the second device and the second-party model weight corresponding to the vertical federated residual boosting model;
a weighted aggregation module, configured to perform weighted aggregation on the first-party model prediction result and the second-party model prediction result based on the first-party model weight and the second-party model weight, obtaining a target federated prediction result.
Optionally, the vertical federated prediction optimization apparatus is further configured to:
send the sample-to-be-predicted ID corresponding to the sample to be predicted to the second device, so that the second device looks up the ID-matched sample corresponding to the sample-to-be-predicted ID;
if lookup-failure information sent by the second device is received, take the first-party model prediction result as the target prediction result;
if no lookup-failure information sent by the second device is received, execute the step of sending the intermediate sample features to the second device.
Optionally, the vertical federated prediction optimization apparatus is further configured to:
obtain a first-party initial model weight, and extract training samples and the training sample labels corresponding to the training samples;
obtain the intermediate training sample features generated by the feature extractor of a target prediction model to be trained through feature extraction on the training samples;
based on the training sample labels, the training model prediction results corresponding to the intermediate training sample features, and the first-party initial model weight, iteratively optimize the target prediction model to be trained by computing the first-party model prediction loss corresponding to the target prediction model to be trained, obtaining the target prediction model;
send the training sample labels, the intermediate training sample features, and the first-party model prediction loss to the second device, so that the second device computes a second-party model prediction loss based on a residual boosting model to be trained, the training-sample-ID-matched samples corresponding to the training samples, the intermediate training sample features, the sample labels, and an obtained second-party initial model weight, and optimizes the residual boosting model to be trained based on the residual loss computed from the second-party and first-party model prediction losses, obtaining the vertical federated residual boosting model.
The specific implementation of the vertical federated prediction optimization apparatus of the present application is substantially the same as the embodiments of the vertical federated prediction optimization method described above and is not repeated here.
An embodiment of the present application further provides a vertical federated prediction optimization apparatus, applied to a second device, the vertical federated prediction optimization apparatus including:
a receiving-and-lookup module, configured to receive the intermediate sample features sent by a first device, and look up the ID-matched sample corresponding to the intermediate sample features;
a model prediction module, configured to perform model prediction jointly on the ID-matched sample and the intermediate sample features based on a vertical federated residual boosting model, obtaining a second-party model prediction result, wherein the vertical federated residual boosting model is obtained by the second device through residual learning based on vertical federated learning with the first device, on the basis of vertical federated common samples, in combination with the model prediction loss of the first device's target prediction model on the vertical federated common samples, the intermediate common sample features corresponding to the vertical federated common samples, and the corresponding sample labels;
a sending module, configured to obtain the second-party model weight corresponding to the vertical federated residual boosting model, and send the second-party model prediction result and the second-party model weight to the first device, so that the first device generates a target federated prediction result based on the first-party model prediction result generated by the target prediction model for the sample to be predicted corresponding to the ID-matched sample, the first-party model weight corresponding to the target prediction model, the second-party model prediction result, and the second-party model weight, wherein the target prediction model is obtained by local iterative training at the first device.
Optionally, the model prediction module is further configured to:
concatenate the ID-matched sample and the intermediate sample features, obtaining a feature-augmented sample;
perform model prediction on the feature-augmented sample based on the vertical federated residual boosting model, obtaining the second-party model prediction result.
Optionally, the vertical federated prediction optimization apparatus is further configured to:
if the lookup succeeds, execute the step of performing model prediction jointly on the ID-matched sample and the intermediate sample features based on the vertical federated residual boosting model, obtaining the second-party model prediction result;
if the lookup fails, feed lookup-failure information back to the first device, so that the first device, upon receiving the lookup-failure information, takes the first-party model prediction result generated by the target prediction model for the sample to be predicted as the target prediction result.
Optionally, the vertical federated prediction optimization apparatus is further configured to:
obtain a second-party initial model weight, and receive the intermediate training sample features, training sample labels, and first-party model prediction loss sent by the first device, wherein the first-party model prediction loss is computed by the first device from the training sample labels and the first-party model prediction results of the target prediction model on the training samples corresponding to the training-sample-ID-matched samples, and the intermediate training sample features are obtained by the first device through feature extraction on the training samples by the feature extractor of the target prediction model;
obtain the training-sample-ID-matched samples, and perform model prediction jointly on the training-sample-ID-matched samples and the intermediate training sample features based on a residual boosting model to be trained, obtaining second-party training model prediction results;
compute a second-party model prediction loss based on the training sample labels, the second-party initial model weight, and the second-party training model prediction results;
iteratively optimize the residual boosting model to be trained based on the residual loss generated from the first-party and second-party model prediction losses, obtaining the vertical federated residual boosting model.
The specific implementation of the vertical federated prediction optimization apparatus of the present application is substantially the same as the embodiments of the vertical federated prediction optimization method described above and is not repeated here.
An embodiment of the present application further provides a vertical federated learning modeling optimization apparatus, applied to a first device, the vertical federated learning modeling optimization apparatus including:
a first obtaining module, configured to obtain a first-party initial model weight, and extract training samples and the training sample labels corresponding to the training samples;
a second obtaining module, configured to obtain the intermediate training sample features generated by the feature extractor of a target prediction model to be trained through feature extraction on the training samples;
an iterative optimization module, configured to iteratively optimize the target prediction model to be trained, based on the training sample labels, the training model prediction results corresponding to the intermediate training sample features, and the first-party initial model weight, by computing the first-party model prediction loss corresponding to the target prediction model to be trained, obtaining the target prediction model;
a sending module, configured to send the training sample labels, the intermediate training sample features, and the first-party model prediction loss to a second device, so that the second device computes a second-party model prediction loss and, based on the residual loss computed from the second-party and first-party model prediction losses, optimizes the residual boosting model to be trained, obtaining a vertical federated residual boosting model, wherein the second-party model prediction loss is computed based on the residual boosting model to be trained, the training-sample-ID-matched samples corresponding to the training samples, the intermediate training sample features, the sample labels, and an obtained second-party initial model weight.
Optionally, the iterative optimization module is further configured to:
convert the intermediate training sample features into training model prediction results based on the classifier in the target prediction model to be trained;
compute the first-party model prediction loss based on the training sample labels, the training model prediction results, and the first-party initial model weight;
update the first-party initial model weight based on the training model prediction results and the training sample labels;
iteratively optimize the target prediction model to be trained based on the first-party model prediction loss and the updated first-party initial model weight, obtaining the target prediction model.
Optionally, the vertical federated learning modeling optimization apparatus is further configured to:
obtain the number of correctly classified first-party samples and the number of misclassified first-party samples corresponding to the target prediction model;
generate the first-party model weight by computing the ratio of the first-party correctly-classified sample count to the first-party misclassified sample count.
Optionally, the vertical federated learning modeling optimization apparatus is further configured to:
extract a sample to be predicted, and obtain the intermediate sample features generated by the feature extractor of the target prediction model through feature extraction on the sample to be predicted, as well as the first-party model prediction result generated by the target prediction model through model prediction on the sample to be predicted;
send the intermediate sample features to the second device, so that the second device performs model prediction jointly on the intermediate sample features and the ID-matched sample corresponding to the sample to be predicted based on the vertical federated residual boosting model, obtaining a second-party model prediction result;
receive the second-party model prediction result sent by the second device and the second-party model weight corresponding to the vertical federated residual boosting model;
perform weighted aggregation on the first-party and second-party model prediction results based on the first-party model weight corresponding to the target prediction model and the second-party model weight, obtaining a target federated prediction result.
The specific implementation of the vertical federated learning modeling optimization apparatus of the present application is substantially the same as the embodiments of the vertical federated learning modeling optimization method described above and is not repeated here.
An embodiment of the present application further provides a vertical federated learning modeling optimization apparatus, applied to a second device, the vertical federated learning modeling optimization apparatus including:
a receiving module, configured to obtain a second-party initial model weight, and receive the intermediate training sample features, training sample labels, and first-party model prediction loss sent by a first device, wherein the first-party model prediction loss is computed by the first device from the training sample labels and the first-party model prediction results of a target prediction model on the training samples corresponding to the training-sample-ID-matched samples, and the intermediate training sample features are obtained by the first device through feature extraction on the training samples by the feature extractor of the target prediction model;
a model prediction module, configured to obtain the training-sample-ID-matched samples, and perform model prediction jointly on the training-sample-ID-matched samples and the intermediate training sample features based on a residual boosting model to be trained, obtaining second-party training model prediction results;
a computing module, configured to compute a second-party model prediction loss based on the training sample labels, the second-party initial model weight, and the second-party training model prediction results;
an iterative optimization module, configured to iteratively optimize the residual boosting model to be trained based on the residual loss generated from the first-party and second-party model prediction losses, obtaining the vertical federated residual boosting model.
Optionally, the vertical federated learning modeling optimization apparatus is further configured to:
obtain the number of correctly classified second-party samples and the number of misclassified second-party samples corresponding to the vertical federated residual boosting model;
generate the second-party model weight by computing the ratio of the second-party correctly-classified sample count to the second-party misclassified sample count.
Optionally, the vertical federated learning modeling optimization apparatus is further configured to:
receive the intermediate sample features sent by the first device, and look up the ID-matched sample corresponding to the intermediate sample features;
perform model prediction jointly on the ID-matched sample and the intermediate sample features based on the vertical federated residual boosting model, obtaining a second-party model prediction result;
send the second-party model prediction result and the second-party model weight corresponding to the vertical federated residual boosting model to the first device, so that the first device generates a target federated prediction result based on the first-party model prediction result generated by the target prediction model for the sample to be predicted corresponding to the ID-matched sample, the first-party model weight corresponding to the target prediction model, the second-party model prediction result, and the second-party model weight, wherein the target prediction model is obtained by local iterative training at the first device.
The specific implementation of the vertical federated learning modeling optimization apparatus of the present application is substantially the same as the embodiments of the vertical federated learning modeling optimization method described above and is not repeated here.
An embodiment of the present application provides a medium, the medium being a readable storage medium storing one or more programs, which can further be executed by one or more processors to implement the steps of any of the vertical federated prediction optimization methods described above.
The specific implementation of the readable storage medium of the present application is substantially the same as the embodiments of the vertical federated prediction optimization method described above and is not repeated here.
An embodiment of the present application provides a medium, the medium being a readable storage medium storing one or more programs, which can further be executed by one or more processors to implement the steps of any of the vertical federated learning modeling optimization methods described above.
The specific implementation of the readable storage medium of the present application is substantially the same as the embodiments of the vertical federated learning modeling optimization method described above and is not repeated here.
An embodiment of the present application provides a computer program product including one or more computer programs, which can further be executed by one or more processors to implement the steps of any of the vertical federated prediction optimization methods described above.
The specific implementation of the computer program product of the present application is substantially the same as the embodiments of the vertical federated prediction optimization method described above and is not repeated here.
An embodiment of the present application provides a computer program product including one or more computer programs, which can further be executed by one or more processors to implement the steps of any of the vertical federated learning modeling optimization methods described above.
The specific implementation of the computer program product of the present application is substantially the same as the embodiments of the vertical federated learning modeling optimization method described above and is not repeated here.
The above are merely preferred embodiments of the present application and do not thereby limit its patent scope. Any equivalent structural or process transformation made using the contents of the specification and drawings of the present application, and any direct or indirect application in other related technical fields, is likewise included within the patent protection scope of the present application.

Claims (23)

  1. A vertical federated prediction optimization method, applied to a first device, the vertical federated prediction optimization method comprising:
    extracting a sample to be predicted, and obtaining intermediate sample features generated by a feature extractor of a target prediction model through feature extraction on the sample to be predicted, and a first-party model prediction result generated by the target prediction model through model prediction on the sample to be predicted, wherein the target prediction model is obtained by local iterative training at the first device;
    sending the intermediate sample features to a second device, so that the second device performs model prediction jointly on the intermediate sample features and an ID-matched sample corresponding to the sample to be predicted based on a vertical federated residual boosting model, to obtain a second-party model prediction result;
    obtaining a first-party model weight corresponding to the target prediction model, and receiving the second-party model prediction result sent by the second device and a second-party model weight corresponding to the vertical federated residual boosting model;
    performing weighted aggregation on the first-party model prediction result and the second-party model prediction result based on the first-party model weight and the second-party model weight, to obtain a target federated prediction result.
  2. The vertical federated prediction optimization method according to claim 1, wherein before the step of sending the intermediate sample features to the second device, the vertical federated prediction optimization method further comprises:
    sending a sample-to-be-predicted ID corresponding to the sample to be predicted to the second device, so that the second device looks up the ID-matched sample corresponding to the sample-to-be-predicted ID;
    if lookup-failure information sent by the second device is received, taking the first-party model prediction result as a target prediction result;
    if no lookup-failure information sent by the second device is received, executing the step of sending the intermediate sample features to the second device.
  3. The vertical federated prediction optimization method according to claim 1, wherein the vertical federated residual boosting model is obtained by the second device through residual learning based on vertical federated learning with the first device, on the basis of vertical federated common samples, in combination with the model prediction loss of the first device's target prediction model on the vertical federated common samples, intermediate common sample features corresponding to the vertical federated common samples, and corresponding sample labels.
  4. The vertical federated prediction optimization method according to claim 3, wherein before the step of obtaining the intermediate sample features generated by the feature extractor of the target prediction model through feature extraction on the sample to be predicted and the first-party model prediction result generated by the target prediction model through model prediction on the sample to be predicted, the vertical federated prediction optimization method further comprises:
    obtaining a first-party initial model weight, and extracting training samples and training sample labels corresponding to the training samples;
    obtaining intermediate training sample features generated by a feature extractor of a target prediction model to be trained through feature extraction on the training samples;
    based on the training sample labels, training model prediction results corresponding to the intermediate training sample features, and the first-party initial model weight, iteratively optimizing the target prediction model to be trained by computing a first-party model prediction loss corresponding to the target prediction model to be trained, to obtain the target prediction model;
    sending the training sample labels, the intermediate training sample features, and the first-party model prediction loss to the second device, so that the second device computes a second-party model prediction loss based on a residual boosting model to be trained, training-sample-ID-matched samples corresponding to the training samples, the intermediate training sample features, the sample labels, and an obtained second-party initial model weight, and optimizes the residual boosting model to be trained based on a residual loss computed from the second-party model prediction loss and the first-party model prediction loss, to obtain the vertical federated residual boosting model.
  5. A vertical federated prediction optimization method, applied to a second device, the vertical federated prediction optimization method comprising:
    receiving intermediate sample features sent by a first device, and looking up an ID-matched sample corresponding to the intermediate sample features;
    performing model prediction jointly on the ID-matched sample and the intermediate sample features based on a vertical federated residual boosting model, to obtain a second-party model prediction result;
    obtaining a second-party model weight corresponding to the vertical federated residual boosting model, and sending the second-party model prediction result and the second-party model weight to the first device, so that the first device generates a target federated prediction result based on a first-party model prediction result generated by a target prediction model for the sample to be predicted corresponding to the ID-matched sample, a first-party model weight corresponding to the target prediction model, the second-party model prediction result, and the second-party model weight, wherein the target prediction model is obtained by local iterative training at the first device.
  6. The vertical federated prediction optimization method according to claim 5, wherein the vertical federated residual boosting model is obtained by the second device through residual learning based on vertical federated learning with the first device, on the basis of vertical federated common samples, in combination with the model prediction loss of the first device's target prediction model on the vertical federated common samples, intermediate common sample features corresponding to the vertical federated common samples, and corresponding sample labels.
  7. The vertical federated prediction optimization method according to claim 5, wherein the step of performing model prediction jointly on the ID-matched sample and the intermediate sample features based on the vertical federated residual boosting model to obtain the second-party model prediction result comprises:
    concatenating the ID-matched sample and the intermediate sample features to obtain a feature-augmented sample;
    performing model prediction on the feature-augmented sample based on the vertical federated residual boosting model, to obtain the second-party model prediction result.
  8. The vertical federated prediction optimization method according to claim 5, wherein after the step of looking up the ID-matched sample, the vertical federated prediction optimization method further comprises:
    if the lookup succeeds, executing the step of performing model prediction jointly on the ID-matched sample and the intermediate sample features based on the vertical federated residual boosting model, to obtain the second-party model prediction result;
    if the lookup fails, feeding lookup-failure information back to the first device, so that the first device, upon receiving the lookup-failure information, takes the first-party model prediction result generated by the target prediction model for the sample to be predicted as a target prediction result.
  9. The vertical federated prediction optimization method according to claim 5, wherein before the step of performing model prediction jointly on the ID-matched sample and the intermediate sample features based on the vertical federated residual boosting model to obtain the second-party model prediction result, the vertical federated prediction optimization method further comprises:
    obtaining a second-party initial model weight, and receiving intermediate training sample features, training sample labels, and a first-party model prediction loss sent by the first device, wherein the first-party model prediction loss is computed by the first device from the training sample labels and the first-party model prediction results of the target prediction model on the training samples corresponding to the training-sample-ID-matched samples, and the intermediate training sample features are obtained by the first device through feature extraction on the training samples by the feature extractor of the target prediction model;
    obtaining training-sample-ID-matched samples, and performing model prediction jointly on the training-sample-ID-matched samples and the intermediate training sample features based on a residual boosting model to be trained, to obtain second-party training model prediction results;
    computing a second-party model prediction loss based on the training sample labels, the second-party initial model weight, and the second-party training model prediction results;
    iteratively optimizing the residual boosting model to be trained based on a residual loss generated from the first-party model prediction loss and the second-party model prediction loss, to obtain the vertical federated residual boosting model.
  10. A vertical federated learning modeling optimization method, applied to a first device, the vertical federated learning modeling optimization method comprising:
    obtaining a first-party initial model weight, and extracting training samples and training sample labels corresponding to the training samples;
    obtaining intermediate training sample features generated by a feature extractor of a target prediction model to be trained through feature extraction on the training samples;
    based on the training sample labels, training model prediction results corresponding to the intermediate training sample features, and the first-party initial model weight, iteratively optimizing the target prediction model to be trained by computing a first-party model prediction loss corresponding to the target prediction model to be trained, to obtain the target prediction model;
    sending the training sample labels, the intermediate training sample features, and the first-party model prediction loss to a second device, so that the second device computes a second-party model prediction loss and, based on a residual loss computed from the second-party model prediction loss and the first-party model prediction loss, optimizes a residual boosting model to be trained, to obtain a vertical federated residual boosting model.
  11. The vertical federated learning modeling optimization method according to claim 10, wherein the second-party model prediction loss is computed based on the residual boosting model to be trained, the training-sample-ID-matched samples corresponding to the training samples, the intermediate training sample features, the sample labels, and an obtained second-party initial model weight.
  12. The vertical federated learning modeling optimization method according to claim 10, wherein the step of iteratively optimizing the target prediction model to be trained, based on the training sample labels, the training model prediction results corresponding to the intermediate training sample features, and the first-party initial model weight, by computing the first-party model prediction loss corresponding to the target prediction model to be trained, to obtain the target prediction model comprises:
    converting the intermediate training sample features into training model prediction results based on a classifier in the target prediction model to be trained;
    computing the first-party model prediction loss based on the training sample labels, the training model prediction results, and the first-party initial model weight;
    updating the first-party initial model weight based on the training model prediction results and the training sample labels;
    iteratively optimizing the target prediction model to be trained based on the first-party model prediction loss and the updated first-party initial model weight, to obtain the target prediction model.
  13. The vertical federated learning modeling optimization method according to claim 10, wherein after the step of iteratively optimizing the target prediction model to be trained, based on the training sample labels, the training model prediction results corresponding to the intermediate training sample features, and the first-party initial model weight, by computing the first-party model prediction loss corresponding to the target prediction model to be trained, to obtain the target prediction model, the vertical federated learning modeling optimization method further comprises:
    obtaining a first-party correctly-classified sample count and a first-party misclassified sample count corresponding to the target prediction model;
    generating a first-party model weight by computing the ratio of the first-party correctly-classified sample count to the first-party misclassified sample count.
  14. The vertical federated learning modeling optimization method according to claim 10, wherein after the step of sending the training sample labels, the intermediate training sample features, and the first-party model prediction loss to the second device, so that the second device computes the second-party model prediction loss and, based on the residual loss computed from the second-party model prediction loss and the first-party model prediction loss, optimizes the residual boosting model to be trained, to obtain the vertical federated residual boosting model, the vertical federated learning modeling optimization method further comprises:
    extracting a sample to be predicted, and obtaining intermediate sample features generated by the feature extractor of the target prediction model through feature extraction on the sample to be predicted, and a first-party model prediction result generated by the target prediction model through model prediction on the sample to be predicted;
    sending the intermediate sample features to the second device, so that the second device performs model prediction jointly on the intermediate sample features and an ID-matched sample corresponding to the sample to be predicted based on the vertical federated residual boosting model, to obtain a second-party model prediction result;
    receiving the second-party model prediction result sent by the second device and a second-party model weight corresponding to the vertical federated residual boosting model;
    performing weighted aggregation on the first-party model prediction result and the second-party model prediction result based on a first-party model weight corresponding to the target prediction model and the second-party model weight, to obtain a target federated prediction result.
  15. A vertical federated learning modeling optimization method, applied to a second device, the vertical federated learning modeling optimization method comprising:
    obtaining a second-party initial model weight, and receiving intermediate training sample features, training sample labels, and a first-party model prediction loss sent by a first device;
    obtaining training-sample-ID-matched samples, and performing model prediction jointly on the training-sample-ID-matched samples and the intermediate training sample features based on a residual boosting model to be trained, to obtain second-party training model prediction results;
    computing a second-party model prediction loss based on the training sample labels, the second-party initial model weight, and the second-party training model prediction results;
    iteratively optimizing the residual boosting model to be trained based on a residual loss generated from the first-party model prediction loss and the second-party model prediction loss, to obtain a vertical federated residual boosting model.
  16. The vertical federated learning modeling optimization method according to claim 15, wherein the first-party model prediction loss is computed by the first device from the training sample labels and the first-party model prediction results of a target prediction model on the training samples corresponding to the training-sample-ID-matched samples, and the intermediate training sample features are obtained by the first device through feature extraction on the training samples by a feature extractor of the target prediction model.
  17. The vertical federated learning modeling optimization method according to claim 15, wherein after the step of iteratively optimizing the residual boosting model to be trained based on the residual loss generated from the first-party model prediction loss and the second-party model prediction loss, to obtain the vertical federated residual boosting model, the vertical federated learning modeling optimization method further comprises:
    obtaining a second-party correctly-classified sample count and a second-party misclassified sample count corresponding to the vertical federated residual boosting model;
    generating a second-party model weight by computing the ratio of the second-party correctly-classified sample count to the second-party misclassified sample count.
  18. The vertical federated learning modeling optimization method according to claim 15, wherein after the step of iteratively optimizing the residual boosting model to be trained based on the residual loss, to obtain the vertical federated residual boosting model, the vertical federated learning modeling optimization method further comprises:
    receiving intermediate sample features sent by the first device, and looking up an ID-matched sample corresponding to the intermediate sample features;
    performing model prediction jointly on the ID-matched sample and the intermediate sample features based on the vertical federated residual boosting model, to obtain a second-party model prediction result;
    sending the second-party model prediction result and a second-party model weight corresponding to the vertical federated residual boosting model to the first device, so that the first device generates a target federated prediction result based on a first-party model prediction result generated by a target prediction model for the sample to be predicted corresponding to the ID-matched sample, a first-party model weight corresponding to the target prediction model, the second-party model prediction result, and the second-party model weight, wherein the target prediction model is obtained by local iterative training at the first device.
  19. A vertical federated prediction optimization device, wherein the vertical federated prediction optimization device comprises a memory, a processor, and a program stored on the memory for implementing the vertical federated prediction optimization method,
    the memory being configured to store the program implementing the vertical federated prediction optimization method;
    the processor being configured to execute the program implementing the vertical federated prediction optimization method, so as to implement the steps of the vertical federated prediction optimization method according to any one of claims 1 to 9.
  20. A vertical federated learning modeling optimization device, wherein the vertical federated learning modeling optimization device comprises a memory, a processor, and a program stored on the memory for implementing the vertical federated learning modeling optimization method,
    the memory being configured to store the program implementing the vertical federated learning modeling optimization method;
    the processor being configured to execute the program implementing the vertical federated learning modeling optimization method, so as to implement the steps of the vertical federated learning modeling optimization method according to any one of claims 10 to 18.
  21. A medium, the medium being a readable storage medium, wherein a program implementing the vertical federated prediction optimization method is stored on the readable storage medium, and the program is executed by a processor to implement the steps of the vertical federated prediction optimization method according to any one of claims 1 to 9.
  22. A medium, the medium being a readable storage medium, wherein a program implementing the vertical federated learning modeling optimization method is stored on the readable storage medium, and the program is executed by a processor to implement the steps of the vertical federated learning modeling optimization method according to any one of claims 10 to 18.
  23. A computer program product, comprising a computer program, wherein the computer program, when executed by a processor, implements the steps of the vertical federated prediction optimization method according to any one of claims 1 to 9, or implements the steps of the vertical federated learning modeling optimization method according to any one of claims 10 to 18.
PCT/CN2021/139640 2021-08-25 2021-12-20 Vertical federated prediction optimization method, device, medium, and computer program product WO2023024349A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110982929.9 2021-08-25
CN202110982929.9A CN113688986A (zh) 2021-08-25 2021-08-25 Vertical federated prediction optimization method, device, medium, and computer program product

Publications (1)

Publication Number Publication Date
WO2023024349A1 true WO2023024349A1 (zh) 2023-03-02

Family

ID=78582579

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/139640 WO2023024349A1 (zh) 2021-08-25 2021-12-20 Vertical federated prediction optimization method, device, medium, and computer program product

Country Status (2)

Country Link
CN (1) CN113688986A (zh)
WO (1) WO2023024349A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116567652A (zh) * 2023-05-19 2023-08-08 ShanghaiTech University Omnidirectional-metasurface-assisted over-the-air computation enabled vertical federated learning method
CN116644372A (zh) * 2023-07-24 2023-08-25 Beijing Trusfort Technology Co., Ltd. Account type determination method and apparatus, electronic device, and storage medium

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113688986A (zh) 2021-08-25 2021-11-23 Shenzhen Qianhai WeBank Co., Ltd. Vertical federated prediction optimization method, device, medium, and computer program product

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109002861A (zh) * 2018-08-10 2018-12-14 Shenzhen Qianhai WeBank Co., Ltd. Federated modeling method, device, and storage medium
CN112001740A (zh) * 2020-06-19 2020-11-27 Nanjing University of Science and Technology Combined prediction method based on adaptive neural network
US20210089964A1 (en) * 2019-09-20 2021-03-25 Google Llc Robust training in the presence of label noise
CN112700010A (zh) * 2020-12-30 2021-04-23 Shenzhen Qianhai WeBank Co., Ltd. Feature completion method, apparatus, device, and storage medium based on federated learning
CN112785002A (zh) * 2021-03-15 2021-05-11 Shenzhen Qianhai WeBank Co., Ltd. Model construction optimization method, device, medium, and computer program product
CN113688986A (zh) * 2021-08-25 2021-11-23 Shenzhen Qianhai WeBank Co., Ltd. Vertical federated prediction optimization method, device, medium, and computer program product


Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116567652A (zh) * 2023-05-19 2023-08-08 ShanghaiTech University Omnidirectional-metasurface-assisted over-the-air computation enabled vertical federated learning method
CN116567652B (zh) * 2023-05-19 2024-02-23 ShanghaiTech University Omnidirectional-metasurface-assisted over-the-air computation enabled vertical federated learning method
CN116644372A (zh) * 2023-07-24 2023-08-25 Beijing Trusfort Technology Co., Ltd. Account type determination method and apparatus, electronic device, and storage medium
CN116644372B (zh) * 2023-07-24 2023-11-03 Beijing Trusfort Technology Co., Ltd. Account type determination method and apparatus, electronic device, and storage medium

Also Published As

Publication number Publication date
CN113688986A (zh) 2021-11-23

Similar Documents

Publication Publication Date Title
WO2023024349A1 (zh) Vertical federated prediction optimization method, device, medium, and computer program product
WO2019233421A1 (zh) 图像处理方法及装置、电子设备、存储介质
WO2021042543A1 (zh) Multi-turn dialogue semantic analysis method and system based on long short-term memory network
US9830526B1 (en) Generating image features based on robust feature-learning
US20190286986A1 (en) Machine Learning Model Training Method And Apparatus
EP3582150A1 (en) Method of knowledge transferring, information processing apparatus and storage medium
WO2021083276A1 (zh) Method, apparatus, device, and medium for combining horizontal and vertical federation
WO2019214344A1 (zh) System reinforcement learning method and apparatus, electronic device, and computer storage medium
WO2020005731A1 (en) Text entity detection and recognition from images
WO2022007321A1 (zh) Vertical federated modeling optimization method, apparatus, device, and readable storage medium
WO2021089012A1 (zh) Node classification method and apparatus for graph network model, and terminal device
US20230237277A1 (en) Aspect prompting framework for language modeling
US11048773B1 (en) Systems and methods for modeling item similarity and correlating item information
WO2023024350A1 (zh) Vertical federated prediction optimization method, device, medium, and computer program product
KR20220047228A Method and apparatus for generating an image classification model, electronic device, storage medium, computer program, roadside device, and cloud control platform
WO2021139465A1 (zh) Backward model selection method, device, and readable storage medium
CN112785002A (zh) Model construction optimization method, device, medium, and computer program product
CN114492601A (zh) Training method and apparatus for resource classification model, electronic device, and storage medium
CN115147680B (zh) Pre-training method, apparatus, and device for object detection model
CN117114063A (zh) Method for training a generative large language model and for processing image tasks
CN111161238A (zh) Image quality assessment method and apparatus, electronic device, and storage medium
US20240119266A1 (en) Method for Constructing AI Integrated Model, and AI Integrated Model Inference Method and Apparatus
CN117633621A (zh) Training method and apparatus for open-set classification model, electronic device, and storage medium
EP4332791A1 (en) Blockchain address classification method and apparatus
CN116307078A (zh) Account label prediction method and apparatus, storage medium, and electronic device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21954865

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE