CN113505896A - Longitudinal federated learning modeling optimization method, apparatus, medium, and program product - Google Patents

Longitudinal federated learning modeling optimization method, apparatus, medium, and program product

Info

Publication number: CN113505896A
Application number: CN202110858396.3A
Authority: CN (China)
Prior art keywords: party, sample, model, representation, data
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Other languages: Chinese (zh)
Inventor: 何元钦
Current Assignee: WeBank Co Ltd (the listed assignees may be inaccurate; Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list)
Original Assignee: WeBank Co Ltd
Application filed by WeBank Co Ltd

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00: Machine learning
    • G06N 20/20: Ensemble learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a longitudinal federated learning modeling optimization method, apparatus, medium, and program product, applied to a first device. The longitudinal federated learning modeling optimization method includes the following steps: acquiring first-party alignment sample data, and converting the first-party alignment sample data into first-party mapping characterization data based on a first-party characterization mapping model to be trained; receiving second-party sample characterization data sent by a second device, and constructing a first global characterization learning loss based on the first-party mapping characterization data and the second-party sample characterization data; optimizing the first-party characterization mapping model to be trained based on the first global characterization learning loss to obtain a first-party global characterization mapping model; and performing model fine-tuning training, based on a first-party label sample, on a sample prediction model built with the feature extraction model in the first-party global characterization mapping model, to obtain a first-party federated prediction model. The method and apparatus solve the technical problem that the application scenarios of longitudinal federated learning modeling are highly limited.

Description

Longitudinal federated learning modeling optimization method, apparatus, medium, and program product
Technical Field
The present application relates to the field of artificial intelligence in financial technology (Fintech), and in particular, to a method, apparatus, medium, and program product for longitudinal federal learning modeling optimization.
Background
With the continuous development of financial technology, especially internet technology, more and more technologies (such as distributed computing and artificial intelligence) are being applied in the financial field, and the financial industry in turn places higher requirements on these technologies.
With the continuous development of computer software, artificial intelligence, and big-data cloud services, longitudinal federated learning is currently applied in situations where the participants' samples overlap heavily while their features overlap little, i.e., different participants hold information about the same users from different fields or different angles. In a longitudinal federated learning scenario, a participant without sample labels must rely on the sample labels of other participants to calculate the loss or gradient, so at least one participant is usually required to hold abundant sample labels to ensure the accuracy of a model built by longitudinal federated learning. When the proportion of labeled samples among the overlapping samples of the participants is low, however, the accuracy of the longitudinal federated learning model cannot be guaranteed. Current longitudinal federated learning modeling methods are therefore limited to application scenarios in which the overlapping samples contain a high proportion of labeled samples, and the application scenarios of existing longitudinal federated learning modeling are highly limited.
Disclosure of Invention
The application mainly aims to provide a longitudinal federated learning modeling optimization method, device, medium, and program product, and aims to solve the technical problem that the application scenarios of longitudinal federated learning modeling in the prior art are highly limited.
In order to achieve the above object, the present application provides a longitudinal federated learning modeling optimization method, where the longitudinal federated learning modeling optimization method is applied to a first device, and the longitudinal federated learning modeling optimization method includes:
acquiring first party alignment sample data, and converting the first party alignment sample data into first party mapping representation data on the basis of a first party representation mapping model to be trained;
receiving second-party sample characterization data sent by second equipment, and constructing a first global characterization learning loss based on the first-party mapping characterization data and the second-party sample characterization data, wherein the second-party sample characterization data is obtained by performing feature extraction on second-party alignment sample data corresponding to the first-party alignment sample data by the second equipment;
optimizing the first party representation mapping model to be trained based on the first global representation learning loss to obtain a first party global representation mapping model;
and performing model fine-tuning training, based on a first-party label sample in the first-party alignment sample data, on a sample prediction model built with the first-party global feature extraction model in the first-party global representation mapping model, to obtain a first-party federated prediction model.
The application provides a longitudinal federated learning modeling optimization method, which is applied to a second device and comprises the following steps:
acquiring second-party alignment sample data, and performing feature extraction on the second-party alignment sample data based on a second-party feature extraction model to be trained to obtain second-party sample characterization data;
and sending the second-party sample characterization data to a first device, so that the first device constructs a first-party global representation mapping model based on the second-party sample characterization data and the first-party sample characterization data of the first-party alignment sample data corresponding to the second-party alignment sample data, and performs model fine-tuning training, based on a first-party label sample in the first-party alignment sample data, on a sample prediction model built with the first-party global feature extraction model in the first-party global representation mapping model, to obtain a first-party federated prediction model.
The application also provides a longitudinal federated learning modeling optimization apparatus; the apparatus is a virtual apparatus and is applied to a first device. The longitudinal federated learning modeling optimization apparatus includes:
a conversion module, configured to acquire first-party alignment sample data and convert the first-party alignment sample data into first-party mapping characterization data based on a first-party representation mapping model to be trained;
a loss construction module, configured to receive second-party sample characterization data sent by a second device, and construct a first global characterization learning loss based on the first-party mapping characterization data and the second-party sample characterization data, where the second-party sample characterization data is obtained by the second device performing feature extraction on second-party alignment sample data corresponding to the first-party alignment sample data;

an optimization module, configured to optimize the first-party representation mapping model to be trained based on the first global characterization learning loss to obtain a first-party global representation mapping model;

and a model fine-tuning module, configured to perform model fine-tuning training, based on a first-party label sample in the first-party alignment sample data, on a sample prediction model built with the first-party global feature extraction model in the first-party global representation mapping model, to obtain a first-party federated prediction model.
The application also provides a longitudinal federated learning modeling optimization apparatus; the apparatus is a virtual apparatus and is applied to a second device. The longitudinal federated learning modeling optimization apparatus includes:
a feature extraction module, configured to acquire second-party alignment sample data and perform feature extraction on the second-party alignment sample data based on a second-party feature extraction model to be trained, to obtain second-party sample characterization data;

and a sending module, configured to send the second-party sample characterization data to a first device, so that the first device constructs a first-party global characterization mapping model based on the second-party sample characterization data and the first-party sample characterization data of the first-party alignment sample data corresponding to the second-party alignment sample data, and performs model fine-tuning training, based on a first-party label sample in the first-party alignment sample data, on a sample prediction model built with the first-party global feature extraction model in the first-party global characterization mapping model, to obtain a first-party federated prediction model.
The present application further provides a longitudinal federated learning modeling optimization device, which is a physical device, the longitudinal federated learning modeling optimization device including: a memory, a processor, and a program of the longitudinal federated learning modeling optimization method stored in the memory and executable on the processor; when executed by the processor, the program implements the steps of the longitudinal federated learning modeling optimization method described above.
The present application also provides a medium, which is a readable storage medium, on which a program for implementing the longitudinal federated learning modeling optimization method is stored, and the program for implementing the longitudinal federated learning modeling optimization method implements the steps of the longitudinal federated learning modeling optimization method as described above when executed by a processor.
The present application also provides a computer program product, including a computer program which, when executed by a processor, implements the steps of the longitudinal federated learning modeling optimization method described above.
Compared with the prior-art approach in which a participant without sample labels must rely on the sample labels of other participants to calculate the loss or gradient for longitudinal federated learning modeling, the present application first acquires first-party alignment sample data and converts the first-party alignment sample data into first-party mapping characterization data based on a first-party characterization mapping model to be trained. It then receives second-party sample characterization data sent by a second device and constructs a first global characterization learning loss based on the first-party mapping characterization data and the second-party sample characterization data, where the second-party sample characterization data is obtained by the second device performing feature extraction on second-party alignment sample data corresponding to the first-party alignment sample data. The first-party characterization mapping model to be trained is then optimized based on the first global characterization learning loss to obtain a first-party global characterization mapping model. Because no sample labels are required while constructing the first-party global characterization mapping model, the model can be built from the unlabeled samples among the aligned samples of the first device and the second device, and it learns the global sample characterizations across both devices. Model fine-tuning training is then performed, based on the first-party label samples in the first-party alignment sample data, on a sample prediction model built with the first-party global feature extraction model in the first-party global characterization mapping model, to obtain a first-party federated prediction model. Since the first-party global characterization mapping model has already learned the global sample characterizations in the first device and the second device, a small number of first-party label samples suffice to fine-tune the sample prediction model, prompting it to learn the mapping from global sample characterizations to sample labels. The first-party federated prediction model can therefore be constructed from overlapping samples with a low proportion of sample labels between the first device and the second device, overcoming the technical defect that current longitudinal federated learning modeling methods are limited to application scenarios in which the overlapping samples contain a high proportion of labeled samples, and thereby reducing the limitation on the application scenarios of longitudinal federated learning modeling.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and together with the description, serve to explain the principles of the application.
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below; those skilled in the art can obviously obtain other drawings from these drawings without inventive effort.
FIG. 1 is a schematic flow chart diagram of a first embodiment of a longitudinal federated learning modeling optimization method of the present application;
fig. 2 is a structural framework diagram of a global representation mapping model trained between the first device and the second device in the longitudinal federated learning modeling optimization method of the present application;
FIG. 3 is a schematic flow chart of a second embodiment of the longitudinal federated learning modeling optimization method of the present application;
fig. 4 is a schematic device structure diagram of a hardware operating environment related to the longitudinal federated learning modeling optimization method in the embodiment of the present application.
The objectives, features, and advantages of the present application will be further described with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
The embodiment of the application provides a longitudinal federated learning modeling optimization method applied to a first device. In the first embodiment of the longitudinal federated learning modeling optimization method, referring to fig. 1, the longitudinal federated learning modeling optimization method includes:
step S10, obtaining first party alignment sample data, and converting the first party alignment sample data into first party mapping representation data based on a first party representation mapping model to be trained;
In this embodiment, it should be noted that the longitudinal federated learning modeling optimization method is applied to longitudinal federated learning, and the first device and the second device are both participants in longitudinal federated learning. The first-party alignment sample data includes at least one first-party alignment sample, and the second-party alignment sample data includes at least one second-party alignment sample; the first-party alignment samples and the second-party alignment samples are the overlapping samples between the first device and the second device, where a first-party alignment sample and its corresponding second-party alignment sample have the same sample ID but not entirely the same sample features. Before step S10 is performed, the first device and the second device carry out sample alignment, so that the first device obtains the first-party alignment sample data and the second device obtains the second-party alignment sample data.
Additionally, it should be noted that the first-party characterization mapping model to be trained includes a first-party feature extraction model to be trained and a first-party characterization conversion model to be trained. The first-party feature extraction model to be trained maps a first-party alignment sample into a first-party sample characterization by performing feature extraction on it, and the first-party characterization conversion model to be trained converts the first-party sample characterization into a first-party mapping characterization by performing characterization conversion on it. The first-party mapping characterization has the same characterization dimensions as the second-party sample characterization: for example, if the first-party mapping characterization is a 3 x 3 tensor, the second-party sample characterization is also a 3 x 3 tensor, and if the first-party mapping characterization is a 256-dimensional vector, the second-party sample characterization is also a 256-dimensional vector.
Additionally, it should be noted that the first-party characterization mapping model to be trained may also be a single model: the first-party alignment sample is the input of the model, the first-party sample characterization corresponding to the first-party alignment sample is an intermediate output of the model, and the first-party mapping characterization is the final output. For example, if the first-party characterization mapping model to be trained is a 1000-layer neural network, the output of any one of layers 1 to 999 may be taken as the first-party sample characterization, and the output of the 1000th layer as the first-party mapping characterization.
First-party alignment sample data is acquired and converted into first-party mapping characterization data based on the first-party characterization mapping model to be trained. Specifically, the first-party alignment sample data is acquired; feature extraction is performed on each first-party alignment sample in the first-party alignment sample data based on the first-party feature extraction model to be trained, yielding the first-party sample characterization corresponding to each first-party alignment sample; and characterization conversion is then performed on each first-party sample characterization based on the first-party characterization conversion model to be trained, yielding the first-party mapping characterization corresponding to each first-party sample characterization, that is, the first-party mapping characterization data, which includes at least one first-party mapping characterization.
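For concreteness, the two-part structure described above (a feature extraction model followed by a characterization conversion model) might look as follows. This is a minimal PyTorch sketch; the patent names no framework, and the class name, layer sizes, and parameter names (in_dim, repr_dim) are hypothetical:

```python
import torch
import torch.nn as nn

class FirstPartyCharacterizationMappingModel(nn.Module):
    """Feature extraction model followed by a characterization
    conversion model ("Head A" in Fig. 2); dimensions are assumptions."""

    def __init__(self, in_dim: int, repr_dim: int = 256):
        super().__init__()
        # Feature extraction model: aligned sample -> sample characterization f_A.
        self.feature_extractor = nn.Sequential(
            nn.Linear(in_dim, 512),
            nn.ReLU(),
            nn.Linear(512, repr_dim),
        )
        # Characterization conversion model: f_A -> mapping characterization z_A,
        # whose dimensions match the second party's sample characterizations.
        self.head = nn.Sequential(
            nn.Linear(repr_dim, repr_dim),
            nn.ReLU(),
            nn.Linear(repr_dim, repr_dim),
        )

    def forward(self, x_a: torch.Tensor):
        f_a = self.feature_extractor(x_a)  # first-party sample characterization
        z_a = self.head(f_a)               # first-party mapping characterization
        return f_a, z_a
```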
Wherein the first party representation mapping model to be trained comprises a first party feature extraction model to be trained and a first party representation conversion model to be trained,
the step of converting the first party alignment sample data into first party mapping representation data based on a first party representation mapping model to be trained comprises:
step S11, extracting the features of the first party alignment sample data based on the first party feature extraction model to be trained to obtain first party sample characterization data;
In this embodiment, feature extraction is performed on the first-party alignment sample data based on the first-party feature extraction model to be trained to obtain first-party sample characterization data. Specifically, each first-party alignment sample is input into the first-party feature extraction model to be trained for feature extraction, so that each first-party alignment sample is mapped into a preset first characterization space, yielding the first-party sample characterization corresponding to each first-party alignment sample. It should be noted that the characterization dimensions of all characterizations in the preset first characterization space are consistent, and each first-party sample characterization and each second-party mapping characterization lie in the preset first characterization space; the way each second-party mapping characterization is obtained is detailed in step A10 and is not repeated here.
After the step of performing feature extraction on the first-party alignment sample data based on the first-party feature extraction model to be trained to obtain first-party sample characterization data, the longitudinal federated learning modeling optimization method further includes:
step A10, sending the first party sample characterization data to a second device, so that the second device optimizes a second party characterization mapping model to be trained based on a second global characterization learning loss calculated by the second device based on the first party sample characterization data and second party mapping characterization data corresponding to the second party sample characterization data, and obtains a second party global characterization mapping model.
In this embodiment, it should be noted that a second-party characterization mapping model to be trained exists in the second device; the second-party characterization mapping model to be trained includes a second-party feature extraction model to be trained and a second-party characterization conversion model to be trained. The second-party sample characterization data includes at least one second-party sample characterization, and the second-party mapping characterization data includes at least one second-party mapping characterization.
Further, the second device performs feature extraction on each second-party alignment sample based on the second-party feature extraction model to be trained, obtaining the second-party sample characterization corresponding to each second-party alignment sample, and then performs characterization conversion on each second-party sample characterization based on the second-party characterization conversion model to be trained, obtaining the second-party mapping characterization corresponding to each second-party sample characterization.
In another embodiment, the second-party characterization mapping model to be trained may also be a single model: the second-party alignment sample is the input of the model, the second-party sample characterization corresponding to the second-party alignment sample is an intermediate output of the model, and the second-party mapping characterization is the final output.
The first-party sample characterization data is sent to the second device, so that the second device optimizes the second-party characterization mapping model to be trained based on a second global characterization learning loss calculated from the first-party sample characterization data and the second-party mapping characterization data corresponding to the second-party sample characterization data, obtaining the second-party global characterization mapping model. Specifically, the first-party sample characterization data is sent to the second device; the second device calculates the second global characterization learning loss based on each first-party sample characterization and the corresponding second-party mapping characterization, and optimizes the second-party characterization mapping model to be trained according to the model gradient calculated from the second global characterization learning loss, obtaining the second-party global characterization mapping model.
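A hedged sketch of the exchange in step A10: only intermediate characterizations leave each device, detached so that one party's loss does not backpropagate into the other party's model. The channel object with send/recv is a placeholder; the patent does not specify a transport:

```python
def exchange_characterizations(model_a, x_a, channel):
    # model_a: first-party characterization mapping model (see sketch above).
    f_a, z_a = model_a(x_a)
    # Send first-party sample characterizations for the second device's loss;
    # detach() keeps the second device's gradients out of model_a.
    channel.send("f_a", f_a.detach().cpu())
    # Receive second-party sample characterizations for the first device's loss.
    f_b = channel.recv("f_b")
    return z_a, f_b
```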
Step S12, converting the first party sample representation into the first party mapping representation data based on the first party representation conversion model to be trained.
In this embodiment, the first-party sample characterizations are converted into the first-party mapping characterization data based on the first-party characterization conversion model to be trained. Specifically, each first-party sample characterization is input into the first-party characterization conversion model to be trained for characterization conversion, so that each first-party sample characterization is mapped into a preset second characterization space, yielding the first-party mapping characterization corresponding to each first-party sample characterization. It should be noted that the characterization dimensions of all characterizations in the preset second characterization space are consistent, and each first-party mapping characterization and each second-party sample characterization lie in the preset second characterization space; the way each second-party sample characterization is generated is detailed in step A10 and is not repeated here.
Step S20, receiving second-party sample characterization data sent by a second device, and constructing a first global characterization learning loss based on the first-party mapping characterization data and the second-party sample characterization data, where the second-party sample characterization data is obtained by performing feature extraction on second-party alignment sample data corresponding to the first-party alignment sample data by the second device;
In this embodiment, it should be noted that, because the first device and the second device are both participants in longitudinal federated learning, for each first-party alignment sample there exists a corresponding second-party alignment sample in the second device with the same sample ID. There is therefore a one-to-one correspondence between the first-party sample characterizations and the second-party sample characterizations and, correspondingly, a one-to-one correspondence between the first-party mapping characterizations and the second-party sample characterizations.
Second-party sample characterization data sent by the second device is received, and the first global characterization learning loss is constructed based on the first-party mapping characterization data and the second-party sample characterization data, where the second-party sample characterization data is obtained by the second device performing feature extraction on the second-party alignment sample data corresponding to the first-party alignment sample data. Specifically, the second-party sample characterization data sent by the second device is received, and the first global characterization learning loss is constructed based on the similarity of data distribution between the first-party mapping characterization data and the second-party sample characterization data; the specific process by which the second device obtains the second-party sample characterization data is described in steps H10 to H20 and is not repeated here.
Wherein the first party mapping characterization data comprises at least a first party mapping characterization, the second party sample characterization data comprises at least a second party sample characterization,
the step of constructing a first global token learning penalty based on the first party mapped token data and the second party sample token data comprises:
step B10, constructing the first global representation learning loss by calculating the similarity loss between each first party mapping representation and the corresponding second party sample representation; and/or
In this embodiment, the first global characterization learning loss is constructed by calculating the similarity loss between each first-party mapping characterization and the corresponding second-party sample characterization. Specifically, the similarity loss between each first-party mapping characterization and the corresponding second-party sample characterization is calculated, and the similarity losses are then cumulatively summed to obtain the first global characterization learning loss. In an implementable manner, the calculation formula of the similarity loss is as follows:
L = \|f_i - z_j\|_2^2

where L is the similarity loss, f_i is the second-party sample characterization, and z_j is the corresponding first-party mapping characterization. Minimizing the similarity loss draws together the second-party sample characterization and the first-party mapping characterization that share the same sample ID. Consequently, a first-party characterization mapping model to be trained that is optimized with the similarity loss learns the characterization representation of the second-party alignment samples in the second device while learning the characterization representation of the first-party alignment samples local to the first device, which prompts it to learn the global characterization representation across the first device and the second device and enables longitudinal federated learning based on unlabeled samples between the two devices.
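Purely as an illustration, the per-pair similarity loss above can be written as a short function. This is a minimal PyTorch sketch (the patent names no framework); the ID-aligned batch layout and the L2 normalization are assumptions:

```python
import torch
import torch.nn.functional as F

def similarity_loss(z_a: torch.Tensor, f_b: torch.Tensor) -> torch.Tensor:
    # z_a: first-party mapping characterizations, shape (batch, dim).
    # f_b: second-party sample characterizations, shape (batch, dim);
    # rows are assumed aligned by sample ID.
    z = F.normalize(z_a, dim=1)  # normalization is an assumption,
    f = F.normalize(f_b, dim=1)  # not mandated by the patent
    return ((z - f) ** 2).sum(dim=1).mean()
```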
Step C20, calculating the contrastive learning loss corresponding to each first-party mapping characterization based on the similarity between each first-party mapping characterization and each second-party sample characterization, and constructing the first global characterization learning loss.

In this embodiment, the contrastive learning loss corresponding to each first-party mapping characterization is calculated based on the similarity between each first-party mapping characterization and each second-party sample characterization, and the contrastive learning losses are cumulatively summed to construct the first global characterization learning loss.
Wherein the step of calculating the contrastive learning loss corresponding to each first-party mapping characterization based on the similarity between each first-party mapping characterization and each second-party sample characterization includes performing the following steps for each first-party mapping characterization:

searching among the second-party sample characterizations for the sample characterization that has the same sample ID as the first-party mapping characterization, as the positive sample characterization corresponding to the first-party mapping characterization, and taking all second-party sample characterizations other than the positive sample characterization as the negative sample characterizations corresponding to the first-party mapping characterization; and then calculating the contrastive learning loss corresponding to the first-party mapping characterization according to the similarity between the first-party mapping characterization and the corresponding positive sample characterization and the similarities between the first-party mapping characterization and each corresponding negative sample characterization. In an implementable manner, the calculation formula of the contrastive learning loss is as follows:
L_N = -\log \frac{\exp(f(x)^\top f(x^+))}{\exp(f(x)^\top f(x^+)) + \sum_{j=1}^{N-1} \exp(f(x)^\top f(x_j^-))}

where L_N is the contrastive learning loss, f(x) is the first-party mapping characterization, f(x^+) is the positive sample characterization, f(x_j^-) is the j-th negative sample characterization, and N - 1 is the number of negative sample characterizations. Minimizing the contrastive learning loss draws together the first-party mapping characterization and the corresponding positive sample characterization while pushing apart the first-party mapping characterization and the corresponding negative sample characterizations. Consequently, a first-party characterization mapping model to be trained that is optimized with the contrastive learning loss learns the characterization representation of the second-party alignment samples in the second device while learning the characterization representation of the first-party alignment samples local to the first device, which prompts it to learn the global characterization representation across the first device and the second device and enables longitudinal federated learning based on unlabeled samples between the two devices.
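For illustration, the contrastive learning loss above matches the InfoNCE form and can be computed batch-wise when the positives are the ID-aligned rows of the other party's characterizations. A minimal PyTorch sketch follows; the temperature and normalization are assumptions not specified by the patent:

```python
import torch
import torch.nn.functional as F

def info_nce_loss(z: torch.Tensor, f: torch.Tensor,
                  temperature: float = 0.1) -> torch.Tensor:
    # z: one party's mapping characterizations, shape (batch, dim).
    # f: the other party's sample characterizations, shape (batch, dim).
    # For row i, f[i] is the positive (same sample ID) and every f[j],
    # j != i, is one of the N - 1 negatives.
    z = F.normalize(z, dim=1)
    f = F.normalize(f, dim=1)
    logits = z @ f.t() / temperature                    # pairwise similarities
    targets = torch.arange(z.size(0), device=z.device)  # positives on the diagonal
    return F.cross_entropy(logits, targets)
```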
Step S30, based on the first global representation learning loss, optimizing the first party representation mapping model to be trained to obtain a first party global representation mapping model;
In this embodiment, the first-party characterization mapping model to be trained is optimized based on the first global characterization learning loss to obtain the first-party global characterization mapping model. Specifically, it is determined whether the first-party characterization mapping model to be trained has reached a preset iterative-training end condition; if not, the first-party characterization mapping model to be trained is updated based on the model gradient calculated from the first global characterization learning loss, and the flow returns to the step of acquiring first-party alignment sample data; if so, the first-party characterization mapping model to be trained is taken as the first-party global characterization mapping model.
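Putting the pieces together, one possible first-party optimization loop under an assumed stopping rule (loss convergence or a maximum iteration count) is sketched below, reusing the model, exchange, and loss helpers above; the optimizer and all hyperparameter values are illustrative assumptions:

```python
import torch

def train_first_party(model, loader, channel, max_iters=10000, tol=1e-5):
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    prev_loss = float("inf")
    for step, x_a in enumerate(loader):
        z_a, f_b = exchange_characterizations(model, x_a, channel)
        loss = info_nce_loss(z_a, f_b)  # or similarity_loss(z_a, f_b)
        opt.zero_grad()
        loss.backward()
        opt.step()
        # Preset iterative-training end condition (assumed form).
        if step + 1 >= max_iters or abs(prev_loss - loss.item()) < tol:
            break
        prev_loss = loss.item()
    return model  # first-party global characterization mapping model
```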
Fig. 2 is a structural framework diagram of the global characterization mapping models trained between the first device and the second device, comprising the first global characterization mapping model and the second global characterization mapping model. Client A is the first device and client B is the second device; X_A is the first-party alignment sample and X_B is the second-party alignment sample; Model A is the first-party feature extraction model to be trained and Model B is the second-party feature extraction model to be trained; f_A is the first-party sample characterization and f_B is the second-party sample characterization; z_A is the first-party mapping characterization and z_B is the second-party mapping characterization; Head A is the first-party characterization conversion model to be trained and Head B is the second-party characterization conversion model to be trained; the loss computed in client A is the first global characterization learning loss, and the loss computed in client B is the second global characterization learning loss.
And step S40, performing model fine-tuning training, based on a first-party label sample in the first-party alignment sample data, on a sample prediction model built with the first-party global feature extraction model in the first-party global characterization mapping model, to obtain a first-party federated prediction model.
In this embodiment, it should be noted that the first-party global characterization mapping model includes a first-party global feature extraction model, which is the trained first-party feature extraction model. The sample prediction model is composed of the first-party global feature extraction model and a classification model: when the sample prediction model performs sample prediction, a sample first passes through the first-party global feature extraction model to generate a sample characterization, which then passes through the classification model to output a sample prediction result.
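As a sketch, the sample prediction model can be assembled from the trained global feature extraction model plus a classification head. The class name and dimensions are hypothetical; PyTorch is an assumption:

```python
import torch.nn as nn

class SamplePredictionModel(nn.Module):
    def __init__(self, global_feature_extractor: nn.Module,
                 repr_dim: int = 256, num_classes: int = 2):
        super().__init__()
        # Trained first-party global feature extraction model, reused as-is.
        self.feature_extractor = global_feature_extractor
        # Classification model mapping sample characterizations to labels.
        self.classifier = nn.Linear(repr_dim, num_classes)

    def forward(self, x):
        return self.classifier(self.feature_extractor(x))
```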
In addition, in a scenario where only one of the first device and the second device holds sample labels: when the first device is the label provider among the participants, a first-party label sample is a first-party alignment sample that has a sample label; when the first device is a feature provider among the participants, the second-party alignment sample corresponding to the first-party label sample in the second device has the sample label, and the first-party label sample itself does not.
Model fine-tuning training is performed, based on the first-party label samples in the first-party alignment sample data, on the sample prediction model built with the first-party global feature extraction model in the first-party global characterization mapping model, to obtain the first-party federated prediction model. Specifically, the first-party label samples in the first-party alignment sample data are obtained, and the sample prediction model built with the first-party global feature extraction model is iteratively trained and optimized under a preset model fine-tuning condition based on the first-party label samples, yielding the first-party federated prediction model. The preset model fine-tuning condition includes keeping the learning rate within a preset learning-rate range, the model update step size within a preset update-step range, the number of model training iterations within a preset iteration range, and the like. Because the sample prediction model only undergoes model fine-tuning training rather than being trained from scratch, only a small number of first-party label samples are needed to train it into the first-party federated prediction model.
The step of performing model fine-tuning training, based on a first-party label sample in the first-party alignment sample data, on a sample prediction model built with the first-party global feature extraction model in the first-party global characterization mapping model, to obtain a first-party federated prediction model includes:

Step D10, performing local training optimization on the sample prediction model under a preset model fine-tuning condition based on the first-party label sample, to obtain the first-party federated prediction model; and/or
In this embodiment, the first-party label sample has a corresponding real sample label.
Local training optimization is performed on the sample prediction model under the preset model fine-tuning condition based on the first-party label samples to obtain the first-party federated prediction model. Specifically, a first-party label sample in the first-party alignment sample data is obtained; sample prediction is performed on the first-party label sample based on the sample prediction model to obtain a sample prediction label; a label prediction loss is calculated based on the real sample label corresponding to the first-party label sample and the sample prediction label; and it is determined whether the sample prediction model has reached a preset iterative-training end condition. If not, the sample prediction model is updated under the preset model fine-tuning condition based on the model gradient calculated from the label prediction loss, and the flow returns to the step of obtaining a first-party label sample in the first-party alignment sample data; if so, the sample prediction model is taken as the first-party federated prediction model. The label prediction loss includes a cross-entropy loss, an L2 loss, and the like.
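A minimal sketch of this local fine-tuning loop, with the small learning rate and epoch count standing in for the preset model fine-tuning condition (both values are illustrative assumptions):

```python
import torch
import torch.nn.functional as F

def fine_tune_locally(pred_model, labeled_loader, epochs=5, lr=1e-4):
    # Small learning rate: the model is only fine-tuned, not trained from scratch.
    opt = torch.optim.Adam(pred_model.parameters(), lr=lr)
    for _ in range(epochs):
        for x, y in labeled_loader:
            loss = F.cross_entropy(pred_model(x), y)  # label prediction loss
            opt.zero_grad()
            loss.backward()
            opt.step()
    return pred_model  # first-party federated prediction model
```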
And step E10, performing federated learning training optimization based on longitudinal federated learning on the sample prediction model under the preset model fine-tuning condition, in combination with the first-party label sample and the second device, to obtain the first-party federated prediction model.
In this embodiment, it should be noted that, in the federated learning scenario of this embodiment, only one of the first device and the second device has sample labels.
Federated learning training optimization based on longitudinal federated learning is performed on the sample prediction model under the preset model fine-tuning condition, in combination with the first-party label samples and the second device, to obtain the first-party federated prediction model. Specifically, longitudinal federated learning modeling is performed on the first-party label samples and the sample prediction model, in combination with the second-party label samples corresponding to the first-party label samples in the second device and a second-party sample prediction model built with the second-party global feature extraction model, and the federated prediction loss corresponding to the sample prediction model is calculated. It is then determined whether the sample prediction model satisfies a preset federated-training end condition; if not, the sample prediction model is updated under the preset model fine-tuning condition based on the model gradient calculated from the federated prediction loss, and the flow returns to the step of obtaining a first-party label sample in the first-party alignment sample data; if so, the sample prediction model is taken as the first-party federated prediction model. The specific process of performing longitudinal federated learning modeling between the first device and the second device to calculate the federated prediction loss is prior art and is not repeated here. The preset federated-training end condition includes convergence of the federated prediction loss, reaching a preset maximum iteration threshold of the sample prediction model, and the like.
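The patent defers the joint fine-tuning protocol to prior art; purely as an illustration of the split-model style it alludes to, one step might look like the sketch below, where the first device holds the labels, receives the second party's characterizations, and returns their gradient. The concatenated joint classifier and the gradient exchange are assumptions, not the patent's specified method:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

joint_classifier = nn.Linear(2 * 256, 2)  # hypothetical joint head

def joint_fine_tune_step(feature_extractor, x_a, y, z_b_received, opt):
    # opt is assumed to cover feature_extractor and joint_classifier parameters.
    f_a = feature_extractor(x_a)                      # first-party characterization
    z_b = z_b_received.detach().requires_grad_(True)  # second-party characterization
    logits = joint_classifier(torch.cat([f_a, z_b], dim=1))
    loss = F.cross_entropy(logits, y)                 # federated prediction loss
    opt.zero_grad()
    loss.backward()
    opt.step()
    return z_b.grad  # sent back so the second device can update its own model
```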
Compared with the prior-art approach in which a participant without sample labels must rely on the sample labels of other participants to calculate the loss or gradient for longitudinal federated learning modeling, the embodiment of the application first acquires first-party alignment sample data and converts the first-party alignment sample data into first-party mapping characterization data based on a first-party characterization mapping model to be trained. It then receives second-party sample characterization data sent by a second device and constructs a first global characterization learning loss based on the first-party mapping characterization data and the second-party sample characterization data, where the second-party sample characterization data is obtained by the second device performing feature extraction on second-party alignment sample data corresponding to the first-party alignment sample data. The first-party characterization mapping model to be trained is then optimized based on the first global characterization learning loss to obtain a first-party global characterization mapping model. Because no sample labels are required while constructing the first-party global characterization mapping model, the model can be built from the unlabeled samples among the aligned samples of the first device and the second device, and it learns the global sample characterizations across both devices. Model fine-tuning training is then performed, based on the first-party label samples in the first-party alignment sample data, on a sample prediction model built with the first-party global feature extraction model in the first-party global characterization mapping model, to obtain a first-party federated prediction model. Since the first-party global characterization mapping model has already learned the global sample characterizations in the first device and the second device, a small number of first-party label samples suffice to fine-tune the sample prediction model, prompting it to learn the mapping from global sample characterizations to sample labels. The first-party federated prediction model can therefore be constructed from overlapping samples with a low proportion of sample labels between the first device and the second device, overcoming the technical defect that current longitudinal federated learning modeling methods are limited to application scenarios in which the overlapping samples contain a high proportion of labeled samples, and thereby reducing the limitation on the application scenarios of longitudinal federated learning modeling.
Further, referring to fig. 3, in another embodiment of the present application, the longitudinal federated learning modeling optimization method is applied to a second device, and the longitudinal federated learning modeling optimization method includes:
step F10, acquiring second-party alignment sample data, and performing feature extraction on the second-party alignment sample data based on a second-party feature extraction model to be trained to obtain second-party sample characterization data;
In this embodiment, second-party alignment sample data is obtained, and feature extraction is performed on the second-party alignment sample data based on the second-party feature extraction model to be trained to obtain second-party sample characterization data. Specifically, each second-party alignment sample is input into the second-party feature extraction model to be trained for feature extraction, so that each second-party alignment sample is mapped into a preset second characterization space, yielding the second-party sample characterization corresponding to each second-party alignment sample.
After the step of extracting features of the second-party alignment sample data based on the second-party feature extraction model to be trained to obtain second-party sample characterization data, the longitudinal federated learning modeling optimization method further includes:
Step H10, converting the second-party sample characterization data into second-party mapping characterization data based on a second-party characterization conversion model to be trained;
In this embodiment, the second-party sample characterization data is converted into second-party mapping characterization data based on the second-party characterization conversion model to be trained. Specifically, each second-party sample characterization is input into the second-party characterization conversion model to be trained for characterization conversion, so that each second-party sample characterization is mapped into the preset first characterization space, yielding the second-party mapping characterization corresponding to each second-party sample characterization.
Step H20, receiving first party sample characterization data sent by a first device, and constructing a second global characterization learning loss based on the first party sample characterization data and the second party mapping characterization data, wherein the first party sample characterization data is obtained by the first device performing feature extraction on first party alignment sample data corresponding to the second party alignment sample data;
In this embodiment, first-party sample characterization data sent by the first device is received, and a second global characterization learning loss is constructed based on the first-party sample characterization data and the second-party mapping characterization data, where the first-party sample characterization data is obtained by the first device performing feature extraction on the first-party alignment sample data corresponding to the second-party alignment sample data. Specifically, the first-party sample characterization data sent by the first device is received, and the second global characterization learning loss is constructed based on the similarity of data distribution between the second-party mapping characterization data and the first-party sample characterization data; the specific process by which the first device obtains the first-party sample characterization data is described in steps S11 to S12 and is not repeated here.
Wherein the first party sample characterization data comprises at least a first party sample characterization, the second party mapped characterization data comprises at least a second party mapped characterization,
the step of constructing a second global characterization learning loss based on the first party sample characterization data and the second party mapped characterization data comprises:
step Q10, constructing the second global characterization learning loss by calculating the similarity loss between each second party mapping characterization and the corresponding first party sample characterization; and/or
In this embodiment, the second global characterization learning loss is constructed by calculating the similarity loss between each second-party mapping characterization and the corresponding first-party sample characterization. Specifically, the second-party similarity loss between each second-party mapping characterization and the corresponding first-party sample characterization is calculated, and the second-party similarity losses are then cumulatively summed to obtain the second global characterization learning loss. In an implementable manner, the calculation formula of the second-party similarity loss is as follows:
L = \|f_i - z_j\|_2^2

where L is the second-party similarity loss, f_i is the first-party sample characterization, and z_j is the corresponding second-party mapping characterization. Minimizing the second-party similarity loss draws together the first-party sample characterization and the second-party mapping characterization that share the same sample ID. Consequently, a second-party characterization mapping model to be trained that is optimized with the similarity loss learns the characterization representation of the first-party alignment samples in the first device while learning the characterization representation of the second-party alignment samples local to the second device, which prompts it to learn the global characterization representation across the first device and the second device and enables longitudinal federated learning based on unlabeled samples between the two devices.
Step W10, calculating the second-party contrastive learning loss corresponding to each second-party mapping characterization based on the similarity between each second-party mapping characterization and each first-party sample characterization, and constructing the second global characterization learning loss.
In this embodiment, the second-party contrastive learning loss corresponding to each second-party mapping characterization is calculated based on the similarity between each second-party mapping characterization and each first-party sample characterization, and the second-party contrastive learning losses are cumulatively summed to obtain the second global characterization learning loss.
Wherein the step of calculating the second-party contrastive learning loss corresponding to each second-party mapping characterization based on the similarity between each second-party mapping characterization and each first-party sample characterization includes performing the following steps for each second-party mapping characterization:

searching among the first-party sample characterizations for the sample characterization that has the same sample ID as the second-party mapping characterization, as the target positive sample characterization corresponding to the second-party mapping characterization, and taking all first-party sample characterizations other than the target positive sample characterization as the target negative sample characterizations corresponding to the second-party mapping characterization; and then calculating the second-party contrastive learning loss corresponding to the second-party mapping characterization according to the similarity between the second-party mapping characterization and the corresponding target positive sample characterization and the similarities between the second-party mapping characterization and each corresponding target negative sample characterization. In an implementable manner, the calculation formula of the second-party contrastive learning loss is as follows:
L_N = -\log \frac{\exp(f(x)^\top f(x^+))}{\exp(f(x)^\top f(x^+)) + \sum_{j=1}^{N-1} \exp(f(x)^\top f(x_j^-))}

where L_N is the second-party contrastive learning loss, f(x) is the second-party mapping characterization, f(x^+) is the target positive sample characterization, f(x_j^-) is the j-th target negative sample characterization, and N - 1 is the number of target negative sample characterizations. Minimizing the second-party contrastive learning loss draws together the second-party mapping characterization and the corresponding target positive sample characterization while pushing apart the second-party mapping characterization and the corresponding target negative sample characterizations. Consequently, a second-party characterization mapping model to be trained that is optimized with the second-party contrastive learning loss learns the characterization representation of the corresponding first-party alignment samples in the first device while learning the characterization representation of the second-party alignment samples local to the second device, which prompts it to learn the global characterization representation across the first device and the second device and enables longitudinal federated learning based on unlabeled samples between the two devices.
Step H30, optimizing the second-party feature extraction model to be trained based on the second global representation learning loss to obtain a second-party global feature extraction model;
In this embodiment, the second-party feature extraction model to be trained is optimized based on the second global representation learning loss to obtain the second-party global feature extraction model. Specifically, it is determined whether the second-party feature extraction model to be trained reaches a preset iterative training end condition; if not, the second-party feature extraction model to be trained is updated based on the model gradient calculated from the second global representation learning loss, and execution returns to the step of acquiring second-party alignment sample data; if so, the second-party feature extraction model to be trained is taken as the second-party global feature extraction model.
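As an illustrative sketch of this optimization loop (PyTorch assumed; the model shapes are placeholders, next_second_party_batch and receive_first_party_reprs are hypothetical stand-ins for the local data pipeline and the inter-device exchange, and the loss function is the one sketched above):

```python
import torch
import torch.nn as nn

# Placeholder second-party models; real embodiments would use the parties' own architectures.
feature_extractor = nn.Sequential(nn.Linear(20, 64), nn.ReLU())  # second-party feature extraction model to be trained
repr_converter = nn.Linear(64, 32)                               # second-party representation conversion model
optimizer = torch.optim.Adam(
    list(feature_extractor.parameters()) + list(repr_converter.parameters()), lr=1e-3)

max_steps = 1000  # stands in for the preset iterative training end condition
for step in range(max_steps):
    x2 = next_second_party_batch()                # hypothetical: batch of second-party aligned samples
    z2 = repr_converter(feature_extractor(x2))    # second-party mapping representations
    z1 = receive_first_party_reprs()              # hypothetical: first-party sample representations
    loss = second_party_contrastive_loss(z2, z1)  # second global representation learning loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
# Once the end condition is met, feature_extractor serves as the second-party global feature extraction model.
```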
Step H40, performing model fine tuning training on a second-party sample prediction model provided with the second-party global feature extraction model based on a second-party label sample in the second-party alignment sample data to obtain a second-party federal prediction model.
In this embodiment, it should be noted that the second-party sample prediction model is composed of the second-party global feature extraction model and a second-party classification model. When the second-party sample prediction model performs sample prediction, a sample first passes through the second-party global feature extraction model to generate a sample representation, which then passes through the second-party classification model to output a sample prediction result.
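For orientation only, this composition might look like the following sketch (PyTorch assumed; the class name and layer width are placeholders rather than the disclosed design):

```python
import torch
import torch.nn as nn

class SecondPartySamplePredictionModel(nn.Module):
    """Sample prediction model: the global feature extraction model followed by a classification model."""
    def __init__(self, global_feature_extractor: nn.Module,
                 repr_dim: int, num_classes: int):
        super().__init__()
        self.feature_extractor = global_feature_extractor   # second-party global feature extraction model
        self.classifier = nn.Linear(repr_dim, num_classes)  # second-party classification model

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        representation = self.feature_extractor(x)  # the sample first generates a sample representation
        return self.classifier(representation)      # the classification model then outputs the prediction
```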
Model fine tuning training is performed on the second-party sample prediction model provided with the second-party global feature extraction model based on a second-party label sample in the second-party alignment sample data to obtain the second-party federal prediction model. Specifically, the second-party label sample in the second-party alignment sample data is obtained, and the second-party sample prediction model provided with the second-party global feature extraction model is iteratively trained and optimized under a preset model fine tuning condition based on the second-party label sample to obtain the second-party federal prediction model.
The step of performing model fine tuning training on a second-party sample prediction model provided with the second-party global feature extraction model based on a second-party label sample in the second-party alignment sample data to obtain a second-party federal prediction model comprises the following steps:
Step R10, based on the second-party label sample, carrying out local training optimization on the sample prediction model under a preset model fine tuning condition to obtain the second-party federal prediction model; and/or
In this embodiment, it should be noted that the second-party label sample has a corresponding second-party real sample label.
Local training optimization is performed on the sample prediction model under the preset model fine tuning condition based on the second-party label sample to obtain the second-party federal prediction model. Specifically, the second-party label sample in the second-party alignment sample data is obtained, sample prediction is performed on the second-party label sample based on the second-party sample prediction model to obtain a second-party sample prediction label, and a second-party label prediction loss is calculated based on the second-party real sample label corresponding to the second-party label sample and the second-party sample prediction label. It is then determined whether the second-party sample prediction model reaches a preset iterative training end condition; if not, the second-party sample prediction model is updated under the preset model fine tuning condition based on the model gradient calculated from the second-party label prediction loss, and execution returns to the step of obtaining the second-party label sample in the second-party alignment sample data; if so, the second-party sample prediction model is taken as the second-party federal prediction model. The second-party label prediction loss includes cross entropy loss, L2 loss, and the like.
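A minimal sketch of this local fine tuning, assuming PyTorch and the cross entropy variant of the label prediction loss (the small learning rate and epoch cap are one illustrative reading of the preset model fine tuning condition):

```python
import torch
import torch.nn.functional as F

def finetune_locally(model, labeled_loader, max_epochs: int = 5, lr: float = 1e-4):
    """Local training optimization of the sample prediction model on the
    second-party label samples; the epoch cap and reduced learning rate stand
    in for the preset model fine tuning condition."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(max_epochs):
        for x, y in labeled_loader:            # second-party label samples and real sample labels
            logits = model(x)                  # second-party sample prediction labels
            loss = F.cross_entropy(logits, y)  # second-party label prediction loss
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model  # taken as the second-party federal prediction model
```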
Step T10, carrying out federal learning training optimization based on longitudinal federal learning on the second-party sample prediction model in combination with the first device under the preset model fine tuning condition based on the second-party label sample, to obtain the second-party federal prediction model.
In this embodiment, it should be noted that, in the federal learning scenario of this embodiment, only one of the first device and the second device has sample labels.
Federal learning training optimization based on longitudinal federal learning is performed on the second-party sample prediction model in combination with the first device under the preset model fine tuning condition based on the second-party label sample to obtain the second-party federal prediction model. Specifically, longitudinal federal learning modeling is performed with the first device based on the second-party label sample and the second-party sample prediction model, in combination with the first-party label sample corresponding to the second-party label sample in the first device and a sample prediction model provided with a first-party global feature extraction model, so as to calculate a second-party federal prediction loss corresponding to the second-party sample prediction model. It is then determined whether the second-party sample prediction model meets a preset federal training end condition; if not, a model gradient is calculated based on the second-party federal prediction loss, the second-party sample prediction model is updated under the preset model fine tuning condition, and execution returns to the step of obtaining the second-party label sample in the second-party alignment sample data; if so, the second-party sample prediction model is taken as the second-party federal prediction model. The specific process of performing longitudinal federal learning modeling between the first device and the second device to calculate the federal prediction loss is known in the art and is not repeated herein. The preset federal training end condition includes the federal prediction loss converging, the second-party sample prediction model reaching a preset maximum iteration threshold, and the like. In this way, a second-party federal prediction model corresponding to the first-party prediction model in the first device is constructed at the second device. Since the second-party global feature extraction model is a global feature extraction model constructed based on unlabeled samples and has learned the global sample representations in the first device and the second device, the purpose of constructing the second-party global feature extraction model by combining the global samples in the first device and the second device is achieved; further, performing model fine tuning training with a small number of second-party label samples on the basis of the second-party global feature extraction model achieves the purpose of constructing the second-party federal prediction model by utilizing overlapped samples with a low proportion of sample labels between the first device and the second device.
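The joint longitudinal training step itself is treated above as known art; purely to fix ideas, one common split-learning style realization of such a step, driven by the label-holding second device, might look as follows (PyTorch assumed; receive_first_party_repr and send_gradient_to_first_device are hypothetical transport helpers, and feeding the classifier the concatenation of both parties' representations is an assumption, not necessarily the disclosed protocol):

```python
import torch
import torch.nn.functional as F

def federated_finetune_step(second_model, x2, y, optimizer):
    """One longitudinal federated fine tuning step on the second device."""
    h2 = second_model.feature_extractor(x2)               # second-party sample representation
    h1 = receive_first_party_repr().requires_grad_(True)  # hypothetical: first-party representation for the same sample IDs
    logits = second_model.classifier(torch.cat([h2, h1], dim=1))
    loss = F.cross_entropy(logits, y)                     # second-party federal prediction loss
    optimizer.zero_grad()
    loss.backward()                                       # also populates h1.grad
    optimizer.step()
    send_gradient_to_first_device(h1.grad)                # lets the first device update its own model
    return loss.item()
```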
Step F20, sending the second party sample representation data to a first device, so that the first device constructs a first party global representation mapping model based on the second party sample representation data and first party sample representation data of first party alignment sample data corresponding to the second party alignment sample data, and performs model fine tuning training on a sample prediction model with a first party global feature extraction model in the first party global representation mapping model based on a first party label sample in the first party alignment sample data to obtain a first party federal prediction model.
In this embodiment, specifically, the second party sample representation data are sent to the first device; the first device constructs a first global representation learning loss based on the similarity of data distribution between the first party mapping representation data and the second party sample representation data, optimizes the first party representation mapping model to be trained based on the first global representation learning loss to obtain the first party global representation mapping model, and performs model fine tuning training on a sample prediction model provided with a first party global feature extraction model in the first party global representation mapping model based on a first party label sample in the first party alignment sample data to obtain the first party federal prediction model. The specific process of obtaining the first party federal prediction model may refer to the contents of steps S10 to S40 and is not described in detail herein.
The embodiment of the application provides a longitudinal federal learning modeling optimization method: second party alignment sample data are acquired, feature extraction is performed on the second party alignment sample data based on a second party feature extraction model to be trained to obtain second party sample representation data, and the second party sample representation data are sent to the first device, so that the first device constructs a first party global representation mapping model based on the second party sample representation data and the first party sample representation data of the first party alignment sample data corresponding to the second party alignment sample data, and performs model fine tuning training on a sample prediction model provided with a first party global feature extraction model in the first party global representation mapping model based on a first party label sample in the first party alignment sample data to obtain a first party federal prediction model. The first party global representation mapping model learns the global sample features in the first device and the second device, and model fine tuning training based on a small number of first party label samples in the first party alignment sample data then prompts the sample prediction model to learn the mapping from the global sample representation to the sample label, yielding the first party federal prediction model. The purpose of constructing the first party federal prediction model by utilizing overlapped samples with a low proportion of sample labels between the first device and the second device is thus realized, which overcomes the technical defect that existing longitudinal federal learning modeling is limited to application scenarios with a high proportion of labeled samples among the overlapped samples, and reduces the limitation on the application scenarios of longitudinal federal learning modeling.
Referring to fig. 4, fig. 4 is a schematic device structure diagram of a hardware operating environment according to an embodiment of the present application.
As shown in fig. 4, the longitudinal federal learning modeling optimization device may include: a processor 1001, such as a CPU, a memory 1005, and a communication bus 1002. The communication bus 1002 is used for realizing connection communication between the processor 1001 and the memory 1005. The memory 1005 may be a high-speed RAM memory or a non-volatile memory (e.g., a magnetic disk memory). The memory 1005 may alternatively be a memory device separate from the processor 1001 described above.
Optionally, the longitudinal federal learning modeling optimization device may further include a user interface, a network interface, a camera, an RF (Radio Frequency) circuit, a sensor, an audio circuit, a WiFi module, and the like. The user interface may include a display screen (Display) and an input sub-module such as a keyboard (Keyboard), and may optionally also include a standard wired interface and a wireless interface. The network interface may optionally include a standard wired interface and a wireless interface (e.g., a WI-FI interface).
Those skilled in the art will appreciate that the configuration of the longitudinal federated learning modeling optimization device shown in FIG. 4 does not constitute a limitation of the device, which may include more or fewer components than those shown, combine certain components, or arrange the components differently.
As shown in fig. 4, the memory 1005, which is a type of computer storage medium, may include an operating system, a network communication module, and a longitudinal federal learning modeling optimization program. The operating system is a program for managing and controlling hardware and software resources of the longitudinal federated learning modeling optimization equipment and supports the operation of the longitudinal federated learning modeling optimization program and other software and/or programs. The network communication module is used for realizing communication among components in the memory 1005 and communication with other hardware and software in the longitudinal federal learning modeling optimization system.
In the longitudinal federated learning modeling optimization apparatus shown in fig. 4, the processor 1001 is configured to execute a longitudinal federated learning modeling optimization program stored in the memory 1005 to implement the steps of any one of the longitudinal federated learning modeling optimization methods described above.
The specific implementation of the longitudinal federated learning modeling optimization device in the application is basically the same as that of each embodiment of the longitudinal federated learning modeling optimization method, and is not described herein again.
The embodiment of the present application further provides a longitudinal federated learning modeling optimization apparatus, where the longitudinal federated learning modeling optimization apparatus is applied to a first device, and the longitudinal federated learning modeling optimization apparatus includes:
the conversion module is used for acquiring first party alignment sample data and converting the first party alignment sample data into first party mapping representation data based on a first party representation mapping model to be trained;
the loss construction module is configured to receive second-party sample characterization data sent by a second device, and construct a first global characterization learning loss based on the first-party mapping characterization data and the second-party sample characterization data, where the second-party sample characterization data is obtained by performing feature extraction on second-party alignment sample data corresponding to the first-party alignment sample data by the second device;
the optimization module is used for optimizing the first party representation mapping model to be trained based on the first global representation learning loss to obtain a first party global representation mapping model;
and the model fine-tuning module is used for carrying out model fine-tuning training on a sample prediction model with a first party global feature extraction model in the first party global representation mapping model based on a first party label sample in the first party alignment sample data to obtain a first party federal prediction model.
Optionally, the to-be-trained first party representation mapping model includes a to-be-trained first party feature extraction model and a to-be-trained first party representation conversion model, and the conversion module is further configured to:
performing feature extraction on the first party alignment sample data based on the first party feature extraction model to be trained to obtain first party sample characterization data;
and converting the first party sample representation into the first party mapping representation data based on the first party representation conversion model to be trained.
Optionally, the longitudinal federated learning modeling optimization apparatus is further configured to:
and sending the first party sample characterization data to second equipment, so that the second equipment optimizes a second party characterization mapping model to be trained based on a second global characterization learning loss calculated by the second equipment according to the first party sample characterization data and second party mapping characterization data corresponding to the second party sample characterization data, and a second party global characterization mapping model is obtained.
Optionally, the first party mapping representation data includes at least a first party mapping representation, the second party sample representation data includes at least a second party sample representation, and the loss construction module is further configured to:
constructing the first global representation learning loss by calculating the similarity loss between each first party mapping representation and the corresponding second party sample representation; and/or
And calculating the contrastive learning loss corresponding to each first party mapping representation based on the similarity between each first party mapping representation and each second party sample representation, so as to construct the first global representation learning loss.
Optionally, the model fine-tuning module is further configured to:
based on the first party label sample, carrying out local training optimization on the sample prediction model under a preset model fine tuning condition to obtain a first party federal prediction model; and/or
And carrying out federated learning training optimization based on longitudinal federated learning on the sample prediction model under the preset model fine tuning condition by combining the first party label sample and the second device to obtain the first party federated prediction model.
The specific implementation of the longitudinal federated learning modeling optimization device in the application is basically the same as that of each embodiment of the longitudinal federated learning modeling optimization method, and is not described herein again.
The embodiment of the present application further provides a longitudinal federated learning modeling optimization apparatus, where the longitudinal federated learning modeling optimization apparatus is applied to a second device, and the longitudinal federated learning modeling optimization apparatus includes:
the characteristic extraction module is used for acquiring second-party alignment sample data, and extracting the characteristics of the second-party alignment sample data based on a second-party characteristic extraction model to be trained to obtain second-party sample characterization data;
and the sending module is used for sending the second party sample characterization data to first equipment so that the first equipment can construct a first party global characterization mapping model based on the second party sample characterization data and first party sample characterization data of first party alignment sample data corresponding to the second party alignment sample data, and can perform model fine tuning training on a sample prediction model with a first party global feature extraction model in the first party global characterization mapping model based on the first party label sample in the first party alignment sample data to obtain a first party federal prediction model.
Optionally, the longitudinal federated learning modeling optimization apparatus is further configured to:
converting the second-party sample representation data into second-party mapping representation data based on a second-party representation conversion model to be trained;
receiving first party sample characterization data sent by first equipment, and constructing a second global characterization learning loss based on the first party sample characterization data and the second party mapping characterization data, wherein the first party sample characterization data is obtained by the first equipment through feature extraction on first party alignment sample data corresponding to the second party alignment sample data;
optimizing the second-party feature extraction model to be trained based on the second global representation learning loss to obtain a second-party global feature extraction model;
and performing model fine tuning training on a second party sample prediction model provided with the second party global feature extraction model based on a second party label sample in the second party alignment sample data to obtain a second party federal prediction model.
Optionally, the first-party sample characterization data includes at least a first-party sample characterization, the second-party mapping characterization data includes at least a second-party mapping characterization, and the longitudinal federated learning modeling optimization device is further configured to:
constructing the second global representation learning loss by calculating the similarity loss between each second party mapping representation and the corresponding first party sample representation; and/or
And calculating the contrastive learning loss corresponding to each second party mapping representation according to the similarity between each second party mapping representation and each first party sample representation, so as to construct the second global representation learning loss.
Optionally, the longitudinal federated learning modeling optimization apparatus is further configured to:
based on the second party label sample, carrying out local training optimization on the sample prediction model under a preset model fine tuning condition to obtain a second party federal prediction model; and/or
And carrying out federal learning training optimization based on longitudinal federal learning on the second party sample prediction model by combining the second party label sample and the first device under the preset model fine tuning condition to obtain the second party federal prediction model.
The specific implementation of the longitudinal federated learning modeling optimization device in the application is basically the same as that of each embodiment of the longitudinal federated learning modeling optimization method, and is not described herein again.
The present application provides a medium, which is a readable storage medium, and the readable storage medium stores one or more programs, and the one or more programs are further executable by one or more processors for implementing the steps of any one of the above methods for longitudinal federal learning modeling optimization.
The specific implementation manner of the readable storage medium of the application is basically the same as that of each embodiment of the longitudinal federated learning modeling optimization method, and is not described herein again.
The present application provides a computer program product, and the computer program product includes one or more computer programs, which can also be executed by one or more processors for implementing the steps of any one of the above methods for longitudinal federated learning modeling optimization.
The specific implementation of the computer program product of the present application is substantially the same as each embodiment of the above-described longitudinal federated learning modeling optimization method, and is not described herein again.
The above description is only a preferred embodiment of the present application, and not intended to limit the scope of the present application, and all modifications of equivalent structures and equivalent processes, which are made by the contents of the specification and the drawings, or which are directly or indirectly applied to other related technical fields, are included in the scope of the present application.

Claims (12)

1. A longitudinal federated learning modeling optimization method is applied to a first device and comprises the following steps:
acquiring first party alignment sample data, and converting the first party alignment sample data into first party mapping representation data on the basis of a first party representation mapping model to be trained;
receiving second-party sample characterization data sent by second equipment, and constructing a first global characterization learning loss based on the first-party mapping characterization data and the second-party sample characterization data, wherein the second-party sample characterization data is obtained by performing feature extraction on second-party alignment sample data corresponding to the first-party alignment sample data by the second equipment;
optimizing the first party representation mapping model to be trained based on the first global representation learning loss to obtain a first party global representation mapping model;
and performing model fine tuning training on a sample prediction model with a first party global feature extraction model in the first party global representation mapping model based on a first party label sample in the first party alignment sample data to obtain a first party federal prediction model.
2. The longitudinal federated learning modeling optimization method of claim 1, wherein the to-be-trained first-party token mapping model includes a to-be-trained first-party feature extraction model and a to-be-trained first-party token transformation model,
the step of converting the first party alignment sample data into first party mapping representation data based on a first party representation mapping model to be trained comprises:
performing feature extraction on the first party alignment sample data based on the first party feature extraction model to be trained to obtain first party sample characterization data;
and converting the first party sample representation into the first party mapping representation data based on the first party representation conversion model to be trained.
3. The longitudinal federated learning modeling optimization method of claim 2, wherein after the step of performing feature extraction on the first party alignment sample data based on the first party feature extraction model to be trained to obtain first party sample characterization data, the longitudinal federated learning modeling optimization method further comprises:
and sending the first party sample characterization data to second equipment, so that the second equipment optimizes a second party characterization mapping model to be trained based on a second global characterization learning loss calculated by the second equipment according to the first party sample characterization data and second party mapping characterization data corresponding to the second party sample characterization data, and a second party global characterization mapping model is obtained.
4. The method of claim 1, wherein the first party mapping characterization data includes at least a first party mapping characterization, the second party sample characterization data includes at least a second party sample characterization,
the step of constructing a first global token learning penalty based on the first party mapped token data and the second party sample token data comprises:
constructing the first global representation learning loss by calculating the similarity loss between each first party mapping representation and the corresponding second party sample representation; and/or
And calculating the contrastive learning loss corresponding to each first party mapping representation based on the similarity between each first party mapping representation and each second party sample representation, and constructing the first global representation learning loss.
5. The method according to claim 1, wherein the step of performing model fine-tuning training on a sample prediction model provided with a first-party global feature extraction model in the first-party global characterization mapping model based on first-party labeled samples in the first-party aligned sample data to obtain a first-party federated prediction model comprises:
based on the first party label sample, carrying out local training optimization on the sample prediction model under a preset model fine tuning condition to obtain a first party federal prediction model; and/or
And carrying out federated learning training optimization based on longitudinal federated learning on the sample prediction model under the preset model fine tuning condition by combining the first party label sample and the second device to obtain the first party federated prediction model.
6. A longitudinal federated learning modeling optimization method is applied to a second device and comprises the following steps:
acquiring second-party alignment sample data, and performing feature extraction on the second-party alignment sample data based on a second-party feature extraction model to be trained to obtain second-party sample characterization data;
and sending the second party sample representation data to first equipment, so that the first equipment constructs a first party global representation mapping model based on the second party sample representation data and first party sample representation data of first party alignment sample data corresponding to the second party alignment sample data, and performs model fine tuning training on a sample prediction model provided with a first party global feature extraction model in the first party global representation mapping model based on a first party label sample in the first party alignment sample data to obtain a first party federal prediction model.
7. The longitudinal federated learning modeling optimization method of claim 6, wherein after the step of performing feature extraction on the second-party alignment sample data based on the second-party feature extraction model to be trained to obtain second-party sample characterization data, the longitudinal federated learning modeling optimization method further comprises:
converting the second-party sample representation data into second-party mapping representation data based on a second-party representation conversion model to be trained;
receiving first party sample characterization data sent by first equipment, and constructing a second global characterization learning loss based on the first party sample characterization data and the second party mapping characterization data, wherein the first party sample characterization data is obtained by the first equipment through feature extraction on first party alignment sample data corresponding to the second party alignment sample data;
optimizing the second-party feature extraction model to be trained based on the second global representation learning loss to obtain a second-party global feature extraction model;
and performing model fine tuning training on a second party sample prediction model provided with the second party global feature extraction model based on a second party label sample in the second party alignment sample data to obtain a second party federal prediction model.
8. The method of claim 7, wherein the first-party sample characterization data includes at least a first-party sample characterization, the second-party mapped characterization data includes at least a second-party mapped characterization,
the step of constructing a second global characterization learning loss based on the first party sample characterization data and the second party mapped characterization data comprises:
constructing the second global representation learning loss by calculating the similarity loss between each second party mapping representation and the corresponding first party sample representation; and/or
And calculating the contrastive learning loss corresponding to each second party mapping representation according to the similarity between each second party mapping representation and each first party sample representation, and constructing the second global representation learning loss.
9. The method according to claim 7, wherein the step of performing model fine-tuning training on a second-party sample prediction model provided with the second-party global bureau feature extraction model based on a second-party label sample in the second-party alignment sample data to obtain a second-party federal prediction model comprises:
based on the second party label sample, carrying out local training optimization on the sample prediction model under a preset model fine tuning condition to obtain a second party federal prediction model; and/or
And carrying out federal learning training optimization based on longitudinal federal learning on the second square sample prediction model by combining the second square label sample and the first equipment under the preset model fine tuning condition to obtain the second square federal prediction model.
10. A longitudinal federated learning modeling optimization apparatus, characterized in that the longitudinal federated learning modeling optimization apparatus comprises: a memory, a processor, and a program stored on the memory for implementing the longitudinal federated learning modeling optimization method,
the memory is used for storing a program for realizing the longitudinal federal learning modeling optimization method;
the processor is configured to execute a program implementing the longitudinal federated learning modeling optimization method to implement the steps of the longitudinal federated learning modeling optimization method as recited in any one of claims 1 to 5 or 6 to 9.
11. A medium being a readable storage medium, characterized in that the readable storage medium has stored thereon a program implementing a longitudinal federated learning modeling optimization method, the program implementing the longitudinal federated learning modeling optimization method being executed by a processor to implement the steps of the longitudinal federated learning modeling optimization method as recited in any one of claims 1 to 5 or 6 to 9.
12. A program product being a computer program product comprising a computer program, wherein the computer program, when executed by a processor, implements the steps of the longitudinal federal learning modeling optimization method of any one of claims 1 to 5 or 6 to 9.
CN202110858396.3A 2021-07-28 2021-07-28 Longitudinal federated learning modeling optimization method, apparatus, medium, and program product Pending CN113505896A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110858396.3A CN113505896A (en) 2021-07-28 2021-07-28 Longitudinal federated learning modeling optimization method, apparatus, medium, and program product

Publications (1)

Publication Number Publication Date
CN113505896A true CN113505896A (en) 2021-10-15

Family

ID=78015015

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110858396.3A Pending CN113505896A (en) 2021-07-28 2021-07-28 Longitudinal federated learning modeling optimization method, apparatus, medium, and program product

Country Status (1)

Country Link
CN (1) CN113505896A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113988225A (en) * 2021-12-24 2022-01-28 支付宝(杭州)信息技术有限公司 Method and device for establishing representation extraction model, representation extraction and type identification
CN113988225B (en) * 2021-12-24 2022-05-06 支付宝(杭州)信息技术有限公司 Method and device for establishing representation extraction model, representation extraction and type identification
WO2023160069A1 (en) * 2022-02-24 2023-08-31 腾讯科技(深圳)有限公司 Machine learning model training method and apparatus, prediction method and apparatus therefor, device, computer readable storage medium and computer program product


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination