WO2023005133A1 - Federated learning modeling optimization method and device, and readable storage medium and program product - Google Patents


Info

Publication number
WO2023005133A1
Authority
WO
WIPO (PCT)
Prior art keywords
feature extraction
sample
public
target
extraction model
Application number
PCT/CN2021/141481
Other languages
French (fr)
Chinese (zh)
Inventor
何元钦
Original Assignee
深圳前海微众银行股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by 深圳前海微众银行股份有限公司
Publication of WO2023005133A1 publication Critical patent/WO2023005133A1/en

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 — Machine learning
    • G06N 20/20 — Ensemble learning

Definitions

  • the present application relates to the technical field of artificial intelligence in financial technology (Fintech), and in particular to a federated learning modeling optimization method, device, readable storage medium, and program product.
  • one of the prerequisites for horizontal federated learning is that the samples of the participants must be aligned in feature space.
  • such feature alignment usually requires the participants' samples to share the same data modality, for example that every participant's samples are images, or that they are all text. If the modalities differ, horizontal federated learning cannot be performed among those participants. Existing horizontal federated learning can therefore only be performed by combining samples of the same data modality held by different participants, which makes it strongly limited.
  • the main purpose of this application is to provide a federated learning modeling optimization method, device, readable storage medium and program product, aiming to solve the technical problem that horizontal federated learning in the prior art is strongly limited.
  • the present application provides a federated learning modeling optimization method, which is applied to a federated server, and the federated learning modeling optimization method includes:
  • the present application also provides a federated learning modeling optimization method, the federated learning modeling optimization method is applied to participant devices, and the federated learning modeling optimization method includes:
  • each of the initial global feature extraction models is trained by knowledge distillation against the corresponding target-modality aggregated sample representation, and by contrastive learning between the initial global feature extraction models, to obtain a target global feature extraction model corresponding to each initial global feature extraction model.
  • the present application also provides a federated learning modeling optimization device, the federated learning modeling optimization device is a virtual device, and the federated learning modeling optimization device is applied to a federated server, and the federated learning modeling optimization device includes:
  • the model distribution module is used to distribute the initial global feature extraction model corresponding to each data modality to the participant device corresponding to that data modality, so that the participant device, based on its local private training samples, performs contrastive learning training between the initial global feature extraction model and its local feature extraction model, optimizes the local feature extraction model to obtain a globally optimized local feature extraction model, and then performs feature extraction on the corresponding target-modality public samples in the federated public dataset based on the globally optimized local feature extraction model, obtaining target-modality public sample representations;
  • a selective aggregation module configured to receive the target-modality public sample representations sent by each participant device, and to perform selective aggregation based on data modality over those representations, obtaining the target-modality aggregated sample representation corresponding to each data modality;
  • the training module is used to obtain the public training samples corresponding to each data modality in the federated public dataset and, based on those public training samples, to train each initial global feature extraction model by knowledge distillation against the corresponding target-modality aggregated sample representation and by contrastive learning between the initial global feature extraction models, obtaining a target global feature extraction model corresponding to each initial global feature extraction model.
  • the present application also provides a federated learning modeling optimization device; the federated learning modeling optimization device is a virtual device applied to a participant device, and it includes:
  • the receiving module is used to receive the initial global feature extraction model issued by the federated server, and to obtain local private training samples;
  • a contrastive learning training module configured to optimize the local feature extraction model by performing contrastive learning training between the initial global feature extraction model and the local feature extraction model based on the local private training samples, obtaining a globally optimized local feature extraction model;
  • the feature extraction module is used to extract, from the federated public dataset, the target-modality public samples belonging to the data modality corresponding to the globally optimized local feature extraction model, and to perform feature extraction on those target-modality public samples based on the globally optimized local feature extraction model, obtaining target-modality public sample representations;
  • a sending module configured to send the target-modality public sample representations to the federated server, so that the federated server selectively aggregates the target-modality public sample representations of the participants based on data modality to obtain the target-modality aggregated sample representation corresponding to each data modality, obtains the public training samples corresponding to each data modality in the federated public dataset and, based on those public training samples, trains each initial global feature extraction model by knowledge distillation against the corresponding target-modality aggregated sample representation and by contrastive learning between the initial global feature extraction models, obtaining a target global feature extraction model corresponding to each initial global feature extraction model.
  • the present application also provides a federated learning modeling optimization device; the federated learning modeling optimization device is a physical device and includes a memory, a processor, and a program of the federated learning modeling optimization method that is stored in the memory and can run on the processor, where the steps of the above federated learning modeling optimization method are realized when the program is executed by the processor.
  • the present application also provides a readable storage medium on which a program for realizing the federated learning modeling optimization method is stored; when the program is executed by a processor, the steps of the above federated learning modeling optimization method are realized.
  • the present application also provides a computer program product including a computer program; when the computer program is executed by a processor, the steps of the above federated learning modeling optimization method are implemented.
  • This application provides a federated learning modeling optimization method, device, readable storage medium, and program product. Compared with the prior-art technique of performing horizontal federated learning by aligning the features of each participant, the present application first distributes the initial global feature extraction model corresponding to each data modality to the participant device corresponding to that data modality, so that the participant device, based on its local private training samples, performs contrastive learning training between the initial global feature extraction model and its local feature extraction model and optimizes the local feature extraction model, letting the local feature extraction model learn the model knowledge of the global model and yielding a globally optimized local feature extraction model. Based on the globally optimized local feature extraction model, feature extraction is then performed on the corresponding target-modality public samples in the federated public dataset to obtain target-modality public sample representations. The federated server receives the target-modality public sample representations sent by each participant device and performs selective aggregation based on data modality over them, obtaining the target-modality aggregated sample representation corresponding to each data modality, and finally trains each initial global feature extraction model by knowledge distillation against the corresponding aggregated representation and by contrastive learning between the models, so that participants holding samples of different data modalities can jointly perform horizontal federated learning, overcoming the strong limitations of horizontal federated learning in the prior art.
  • Fig. 1 is a schematic flow chart of the first embodiment of the federated learning modeling optimization method of the present application;
  • Fig. 2 is a schematic flow chart of the second embodiment of the federated learning modeling optimization method of the present application;
  • Fig. 3 is a schematic diagram of the interaction process when performing horizontal federated learning modeling in the federated learning modeling optimization method of the present application;
  • Fig. 4 is a schematic diagram of the device structure of the hardware operating environment involved in the federated learning modeling optimization method in the embodiment of the present application.
  • the embodiment of the present application provides a federated learning modeling optimization method, which is applied to a federated server.
  • the federated learning modeling optimization method includes:
  • Step S10: distributing the initial global feature extraction model corresponding to each data modality to the participant device corresponding to that data modality, so that the participant device, based on its local private training samples, performs contrastive learning training between the initial global feature extraction model and its local feature extraction model, optimizes the local feature extraction model to obtain a globally optimized local feature extraction model, and performs feature extraction on the corresponding target-modality public samples in the federated public dataset based on the globally optimized local feature extraction model, obtaining target-modality public sample representations;
  • the federated learning modeling optimization method is applied to horizontal federated learning, where the framework of horizontal federated learning includes a federated server and multiple participant devices. The federated server maintains a global model for each data modality, while the participant devices maintain their own local models; the participant devices together cover multiple data modalities, and each data modality corresponds to at least one participant device.
  • for example, data modality A corresponds to participant device a and participant device b, that is, participant devices a and b hold samples belonging to data modality A;
  • data modality B corresponds to participant device c, that is, participant device c holds samples belonging to data modality B.
  • each participant device holds its own local private training samples, where a local private training sample may have a corresponding local sample label, and the data modality of the local private training samples is the data modality of that participant device.
  • each participant device and the federated server hold the same federated public dataset, which contains public training samples belonging to the data modality of each participant device. The federated server maintains a corresponding initial global feature extraction model for each data modality covered by the participant devices, and each participant device maintains a local feature extraction model for its own data modality.
  • the initial global feature extraction model corresponding to each data modality is distributed to the participant devices of that data modality. Each participant device then obtains its locally held local private training samples and local feature extraction model and, by performing contrastive learning training between the initial global feature extraction model and the local feature extraction model, prompts the local feature extraction model to learn the model knowledge of the initial global feature extraction model, optimizing the local feature extraction model into a globally optimized local feature extraction model. The participant device then extracts, from the federated public dataset, the target-modality public samples belonging to its own data modality, and uses the globally optimized local feature extraction model to perform feature extraction on those samples, mapping them to target-modality public sample representations. For the specific process by which the participant device obtains the globally optimized local feature extraction model and the target-modality public sample representations, refer to steps A10 to A30 and their refinement steps, which are not repeated here.
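The participant-side feature-extraction step above can be sketched as follows. This is a minimal illustration: the linear map standing in for the real local feature extraction model, the L2 normalization of the representations, and the helper name `extract_public_representations` are illustrative assumptions, not details fixed by the application.

```python
import numpy as np

def extract_public_representations(local_weights, public_samples, modality):
    """Encode the target-modality public samples with the (globally
    optimized) local feature extractor and package the result for upload
    to the federated server. A linear map stands in for the real model;
    each row of `public_samples` is one sample."""
    reps = public_samples @ local_weights  # (n_samples, rep_dim)
    # L2-normalize so representations live on the unit sphere, a common
    # choice when they are later compared by dot-product similarity.
    reps = reps / np.linalg.norm(reps, axis=1, keepdims=True)
    return {"modality": modality, "representations": reps}

# A participant with 4 public samples of a 3-dim modality, 2-dim representations.
rng = np.random.default_rng(0)
payload = extract_public_representations(
    local_weights=rng.normal(size=(3, 2)),
    public_samples=rng.normal(size=(4, 3)),
    modality="image",
)
```

The payload carries both the representations and the data modality, which is what lets the federated server later group the uploads by modality.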
  • Step S20: receiving the target-modality public sample representations sent by each participant device, and performing selective aggregation based on data modality over those representations, obtaining the target-modality aggregated sample representation corresponding to each data modality;
  • the target-modality public sample representations sent by each participant device are received, and selective aggregation based on data modality is performed over them to obtain the target-modality aggregated sample representation of each data modality. Specifically, after receiving the target-modality public sample representations sent by each participant device, the representations corresponding to the same data modality are aggregated separately, yielding the target-modality aggregated sample representation corresponding to each data modality.
  • the step of performing selective aggregation based on data modality over the target-modality public sample representations, to obtain the target-modality aggregated sample representation corresponding to each data modality, includes:
  • Step S21: based on the correspondence between the participant devices and the data modalities, determining, among the target-modality public sample representations, the sample representations to be aggregated corresponding to each data modality;
  • one data modality corresponds to at least one sample representation to be aggregated. Since the target-modality public sample representations are sent to the federated server by the participant devices, they correspond one-to-one with the participant devices; each participant device has a data modality, and the data modalities of different participant devices may be the same or different.
  • based on the correspondence between the participant devices and the data modalities, the sample representations to be aggregated corresponding to each data modality are determined among the target-modality public sample representations of the participant devices. For example, suppose participant devices a1 and a2 correspond to data modality A, and participant devices b1 and b2 correspond to data modality B; then the target-modality public sample representations sent to the federated server by a1 and a2 are the sample representations to be aggregated for data modality A, and those sent by b1 and b2 are the sample representations to be aggregated for data modality B.
  • Step S22: aggregating the sample representations to be aggregated corresponding to each data modality, obtaining the target-modality aggregated sample representation corresponding to each data modality.
  • the sample representations to be aggregated corresponding to each data modality are aggregated separately to obtain the target-modality aggregated sample representation corresponding to each data modality.
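The selective aggregation can be sketched by grouping the uploaded representations by data modality and combining each group. Mean aggregation is used here purely as an illustrative choice, since the application does not fix a particular aggregation operator, and the helper name `selective_aggregate` is hypothetical.

```python
import numpy as np

def selective_aggregate(uploads):
    """Group uploaded public-sample representations by data modality and
    mean-aggregate each group into the target-modality aggregated sample
    representation. `uploads` is a list of (modality, representations)
    pairs, one per participant device; participants sharing a modality
    must cover the same public samples in the same order."""
    by_modality = {}
    for modality, reps in uploads:
        by_modality.setdefault(modality, []).append(np.asarray(reps))
    # Element-wise mean across the participants of each modality.
    return {m: np.mean(np.stack(group), axis=0)
            for m, group in by_modality.items()}

# Participants a1, a2 share modality "A"; participant c has modality "B".
uploads = [
    ("A", [[1.0, 0.0], [0.0, 1.0]]),
    ("A", [[3.0, 0.0], [0.0, 3.0]]),
    ("B", [[5.0, 5.0]]),
]
agg = selective_aggregate(uploads)
# agg["A"] → [[2., 0.], [0., 2.]]; agg["B"] → [[5., 5.]]
```

Only representations of the same modality are combined, so each data modality ends up with exactly one aggregated representation per public sample, matching the "selective" aggregation described above.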
  • Step S30: obtaining the public training samples corresponding to each data modality in the federated public dataset and, based on those public training samples, training each initial global feature extraction model by knowledge distillation against the corresponding target-modality aggregated sample representation and by contrastive learning between the initial global feature extraction models, obtaining a target global feature extraction model corresponding to each initial global feature extraction model.
  • the target-modality aggregated sample representation is used to guide the optimization of the corresponding initial global feature extraction model, so that the output of the initial global feature extraction model becomes as similar as possible to the corresponding target-modality aggregated sample representation.
  • the federated server can directly use the target-modality public samples selected by each participant device in the federated public dataset as public training samples.
  • for example, suppose participant device A selects multiple target-modality public samples of data modality a, denoted X1; participant device B selects multiple target-modality public samples of data modality a, denoted X2; and participant device C selects multiple target-modality public samples of data modality b, denoted X3. Then X1 and X2 can be directly used as the public training samples of the initial global feature extraction model corresponding to data modality a, and X3 can be directly used as the public training samples of the initial global feature extraction model corresponding to data modality b.
  • the target-modality aggregated sample representation is the aggregation result of the target-modality public sample representations corresponding to several target-modality public samples, so to a certain extent it represents the corresponding data modality rather than a single sample. Therefore, samples belonging to the data modality corresponding to an initial global feature extraction model can also be reselected from the federated public dataset as that model's public training samples, instead of necessarily using the target-modality public samples that each participant device selected in the federated public dataset as the public training samples.
  • the step of training each initial global feature extraction model by knowledge distillation against the corresponding target-modality aggregated sample representation and by contrastive learning between the initial global feature extraction models, to obtain the target global feature extraction model corresponding to each initial global feature extraction model, includes:
  • Step S31: mapping the public training samples corresponding to each data modality into predicted sample representations through the corresponding initial global feature extraction models;
  • the public training samples corresponding to each data modality are respectively mapped to predicted sample representations through the corresponding initial global feature extraction models. Specifically, each public training sample is input into the initial global feature extraction model corresponding to its data modality for feature extraction, so that each public training sample is mapped into a preset sample representation space, yielding the predicted sample representation corresponding to each public training sample.
  • Step S32: calculating the knowledge distillation loss between each predicted sample representation and the corresponding target-modality aggregated sample representation, and calculating the contrastive learning loss between the predicted sample representations;
  • specifically, the knowledge distillation loss is calculated based on the similarity between each predicted sample representation and the corresponding target-modality aggregated sample representation, and the contrastive learning loss is calculated based on the similarity between the predicted sample representations.
  • the step of calculating the knowledge distillation loss between each predicted sample representation and the corresponding target-modality aggregated sample representation, and calculating the contrastive learning loss between the predicted sample representations, includes:
  • Step S321: based on the sample labels corresponding to the public training samples, selecting, among the predicted sample representations, the positive sample representation and the corresponding negative sample representations for each predicted sample representation;
  • public training samples of different data modalities that represent the same thing have the same sample label and depict different aspects of that thing. Such public training samples of different modalities constitute a public training sample group; that is, samples belonging to the same public training sample group have the same sample label, and samples not belonging to the same group have different sample labels.
  • the positive sample representation and the corresponding negative sample representations for each predicted sample representation are selected among the predicted sample representations. Specifically, based on the sample labels corresponding to the public training samples, the public training sample group of each public training sample is determined, and then for each public training sample:
  • suppose the public training samples include first public training samples belonging to a first data modality and second public training samples belonging to a second data modality.
  • for each first public training sample, among the second public training samples, the sample having the same sample label as the first public training sample is determined as the first positive sample corresponding to that first public training sample, and the samples not having the same sample label are determined as its corresponding first negative samples; the predicted sample representation corresponding to the first positive sample is taken as the positive sample representation of the predicted sample representation of that first public training sample, and the predicted sample representations corresponding to the first negative samples are taken as its negative sample representations.
  • symmetrically, for each second public training sample, among the first public training samples, the sample having the same sample label as the second public training sample is determined as the second positive sample corresponding to that second public training sample, and the samples not having the same sample label are determined as its corresponding second negative samples; the predicted sample representation corresponding to the second positive sample is taken as the positive sample representation of the predicted sample representation of that second public training sample, and the predicted sample representations corresponding to the second negative samples are taken as its negative sample representations.
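The label-based selection of positive and negative samples described above can be sketched in a few lines. The helper name `select_pairs` is hypothetical, and for simplicity it returns all same-label samples of the other modality as positives rather than a single one.

```python
def select_pairs(first_labels, second_labels):
    """For each first-modality public training sample, pick as positives
    the second-modality samples sharing its sample label (same thing,
    different modality) and as negatives those with a different label.
    Returns, per first-modality sample, the index lists
    (positives, negatives) into the second modality."""
    pairs = []
    for label in first_labels:
        positives = [j for j, l in enumerate(second_labels) if l == label]
        negatives = [j for j, l in enumerate(second_labels) if l != label]
        pairs.append((positives, negatives))
    return pairs

# Two "things" (labels 0 and 1), each present in both modalities.
pairs = select_pairs(first_labels=[0, 1], second_labels=[0, 1, 1])
# pairs[0] → ([0], [1, 2]); pairs[1] → ([1, 2], [0])
```

Applying the same function with the modalities swapped gives the symmetric second-modality selection described in the following bullet.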
  • Step S322: calculating the contrastive learning loss corresponding to each initial global feature extraction model based on the predicted sample representations and the positive and negative sample representations corresponding to each predicted sample representation;
  • specifically, for each initial global feature extraction model the following steps are performed:
  • the contrastive learning loss takes the form
  • $L_N = -\log \frac{\exp\left(f(x)^\top f(x^+)\right)}{\exp\left(f(x)^\top f(x^+)\right) + \sum_{j=1}^{N-1} \exp\left(f(x)^\top f(x_j^-)\right)}$
  • where $L_N$ is the contrastive learning loss, $N-1$ is the number of negative sample representations, $f(x)$ is the predicted sample representation, $f(x^+)$ is the positive sample representation corresponding to the predicted sample representation, and $f(x_j^-)$ is the $j$-th negative sample representation corresponding to the predicted sample representation.
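A minimal numpy version of a contrastive learning loss built from these quantities is sketched below, scoring the positive pair against the negatives with exponentiated dot products; the exact similarity function and any temperature scaling are assumptions here, not specified by the application.

```python
import numpy as np

def contrastive_loss(pred, positive, negatives):
    """InfoNCE-style contrastive learning loss for one predicted sample
    representation: the positive pair's exponentiated dot-product score
    competes against the scores of the N-1 negative representations."""
    pos_score = np.exp(pred @ positive)
    neg_scores = np.exp(np.asarray(negatives) @ pred)
    return float(-np.log(pos_score / (pos_score + neg_scores.sum())))

pred = np.array([1.0, 0.0])
# Loss is low when the positive aligns with the prediction...
loss_aligned = contrastive_loss(pred, np.array([1.0, 0.0]),
                                [np.array([0.0, 1.0])])
# ...and high when a negative aligns with it instead.
loss_mismatch = contrastive_loss(pred, np.array([0.0, 1.0]),
                                 [np.array([1.0, 0.0])])
```

Minimizing this loss therefore pulls the predicted representation toward its positive and pushes it away from its negatives, which is exactly the behavior the following steps rely on.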
  • Step S323: calculating the knowledge distillation loss corresponding to each initial global feature extraction model based on the similarity between each predicted sample representation and the corresponding target-modality aggregated sample representation.
  • specifically, the knowledge distillation loss corresponding to each initial global feature extraction model is calculated from the similarity between the predicted sample representations and the corresponding target-modality aggregated sample representations.
  • in one embodiment, the cross entropy between each predicted sample representation and the corresponding target-modality aggregated sample representation is calculated to obtain the knowledge distillation loss corresponding to each initial global feature extraction model.
  • in another embodiment, the knowledge distillation loss corresponding to each initial global feature extraction model is calculated through an L2 loss function.
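Both variants of the knowledge distillation loss mentioned above (cross entropy and L2) can be sketched as follows. Softmaxing the representations before taking the cross entropy is an assumption made here so that the cross-entropy term is well defined on arbitrary real-valued representations; the function names are hypothetical.

```python
import numpy as np

def l2_distillation_loss(pred_reps, target_reps):
    """L2 variant: mean squared distance between each predicted sample
    representation and the corresponding target-modality aggregated one."""
    diff = np.asarray(pred_reps) - np.asarray(target_reps)
    return float(np.mean(np.sum(diff ** 2, axis=-1)))

def ce_distillation_loss(pred_reps, target_reps):
    """Cross-entropy variant: softmax both representations and let the
    global model match the aggregated representation's soft distribution."""
    def softmax(x):
        e = np.exp(x - x.max(axis=-1, keepdims=True))
        return e / e.sum(axis=-1, keepdims=True)
    p = softmax(np.asarray(target_reps))   # teacher distribution
    q = softmax(np.asarray(pred_reps))     # student distribution
    return float(np.mean(-np.sum(p * np.log(q + 1e-12), axis=-1)))

pred = [[0.9, 0.1], [0.2, 0.8]]
target = [[1.0, 0.0], [0.0, 1.0]]
l2 = l2_distillation_loss(pred, target)
ce = ce_distillation_loss(pred, target)
```

Either way, the loss shrinks as the predicted representations approach the aggregated representations, which is the guidance effect described above.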
  • Step S33: optimizing each initial global feature extraction model based on its corresponding knowledge distillation loss and contrastive learning loss, obtaining each target global feature extraction model.
  • when optimizing with the contrastive learning loss, for public training samples of different data modalities that represent the same thing, each initial global feature extraction model is prompted to output sample representations whose mutual similarity is greater than a first preset similarity threshold, that is, representations that are as similar as possible; for public training samples of different data modalities that represent different things, each initial global feature extraction model is prompted to output sample representations whose mutual similarity is less than a second preset similarity threshold, that is, representations that are as dissimilar as possible. In other words, optimizing the initial global feature extraction models with the contrastive learning loss shortens the distance between each predicted sample representation and its corresponding positive sample representation, and enlarges the distance between each predicted sample representation and its corresponding negative sample representations, where the first preset similarity threshold is greater than the second preset similarity threshold.
  • the knowledge distillation loss is a cross-entropy loss or an L2 loss; optimizing with it prompts the predicted sample representation output by each initial global feature extraction model to be as close as possible to the target-modality aggregated sample representation corresponding to that model, so that the similarity between the predicted sample representation and the corresponding target-modality aggregated sample representation is greater than a preset third similarity threshold, which in turn prompts the initial global feature extraction model to learn the knowledge of its corresponding data modality.
  • each initial global feature extraction model is optimized to obtain each target global feature extraction model. Specifically, the knowledge distillation loss and the contrastive learning loss corresponding to each initial global feature extraction model are aggregated into the total model loss of that model; it is then judged whether each initial global feature extraction model meets a preset training end condition. If the condition is met, each initial global feature extraction model is taken as the corresponding target global feature extraction model; if not, each initial global feature extraction model is updated based on the model gradient computed from its total model loss, and execution returns to the step of distributing the initial global feature extraction model corresponding to each data modality to the participant devices of that data modality. The preset training end condition includes the convergence of each total model loss, or the number of iterations of each initial global feature extraction model reaching a preset iteration threshold.
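The aggregation of the two losses and the preset training end condition (loss convergence or an iteration budget) can be sketched as a small server-side loop. `train_until_done` and its toy total loss are hypothetical stand-ins for the real models, losses, and gradients.

```python
import numpy as np

def train_until_done(init_weights, loss_and_grad,
                     loss_tol=1e-6, max_iters=100, lr=0.5):
    """Server-side loop sketch: compute the total model loss (distillation
    plus contrastive terms), check the end-of-training conditions
    (loss converged, or iteration budget reached), otherwise take a
    gradient step and repeat."""
    w = np.asarray(init_weights, dtype=float)
    prev = np.inf
    for it in range(max_iters):
        loss, grad = loss_and_grad(w)
        if abs(prev - loss) < loss_tol:   # total model loss has converged
            break
        prev = loss
        w = w - lr * grad                 # update the global model
    return w, it + 1

# Toy total loss: an L2 distillation term plus a fixed contrastive term.
target = np.array([1.0, 2.0])
def loss_and_grad(w):
    distill = np.sum((w - target) ** 2)   # pulls w toward the aggregate
    contrastive = 0.1                     # stands in for the InfoNCE part
    return distill + contrastive, 2 * (w - target)

w, iters = train_until_done(np.zeros(2), loss_and_grad)
```

When neither condition is met, the real method additionally redistributes the updated models to the participant devices before the next round, which this single-process sketch omits.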
• after step S30, the method also includes:
• each of the target global feature extraction models is sent to the participant device having the data modality corresponding to that target global feature extraction model, whereupon the participant device, based on its local private training samples, performs contrastive learning training between the target global feature extraction model and its local feature extraction model and optimizes the local feature extraction model to obtain the target local feature extraction model.
• for the specific implementation process, refer to the specific content of step A10 to step A20 and the detailed steps thereof, which will not be repeated here.
• the embodiment of the present application provides a federated learning modeling optimization method. Compared with the technical means in the prior art of aligning the features of each participant to perform horizontal federated learning, the embodiment of the present application first distributes the initial global feature extraction model corresponding to each data modality to the participant devices corresponding to that data modality, so that each participant device, based on its local private training samples, performs contrastive learning training between the initial global feature extraction model and its local feature extraction model to obtain a globally optimized local feature extraction model, and uses the globally optimized local feature extraction model to extract features from the corresponding target modal public samples in the federal public data set to obtain target modal public sample representations.
• the federation server receives the target modal public sample representations sent by each participant device and selectively aggregates them by data modality to obtain the target modal aggregated sample representation corresponding to each data modality, so that the target modal public sample representations belonging to the same data modality are aggregated separately.
• each initial global feature extraction model then undergoes knowledge distillation learning training based on each target modal aggregated sample representation, together with contrastive learning training between the initial global feature extraction models, to obtain the target global feature extraction model corresponding to each initial global feature extraction model. Because each target modal public sample representation is output by a globally optimized local feature extraction model that has been optimized on the local private training samples of a participant device, each initial global feature extraction model can, through knowledge distillation, indirectly combine the samples of multiple participants of the corresponding data modality for horizontal federated learning; at the same time, the initial global feature extraction models corresponding to the different data modalities can use contrastive learning to align samples of different data modalities in the feature space, thereby achieving the purpose of indirectly combining samples of different data modalities for horizontal federated learning.
• horizontal federated learning is thus no longer limited to samples of the same data modality held by different participants, which overcomes the technical defect in the prior art that horizontal federated learning can only be performed by combining samples of the same data modality across participants, so the limitations of horizontal federated learning are reduced.
  • the federated learning modeling optimization method is applied to participant devices, and the federated learning modeling optimization method includes:
  • Step A10 receiving the initial global feature extraction model issued by the federation server, and extracting local private training samples
• the initial global feature extraction model is the target global feature extraction model in the federation server before optimization.
• the local private training samples are training sample data privately owned by the participant device, that is, the private data of the participant device.
• Step A20: based on the local private training samples, performing contrastive learning training between the initial global feature extraction model and the local feature extraction model, and optimizing the local feature extraction model to obtain a globally optimized local feature extraction model;
  • the local feature extraction model is a feature extraction model locally maintained by the participant device, and the number of local private training samples is at least 1.
• based on the local private training samples, contrastive learning training is performed between the initial global feature extraction model and the local feature extraction model to optimize the local feature extraction model and obtain a globally optimized local feature extraction model. Specifically, the initial global feature extraction model is used to perform feature extraction on all local private training samples to obtain each first sample representation, and the local feature extraction model is used to perform feature extraction on all local private training samples to obtain each second sample representation; a contrastive learning loss is then calculated based on the similarity between each first sample representation and each second sample representation, and the local feature extraction model is optimized based on the contrastive learning loss to obtain the globally optimized local feature extraction model.
  • Step A21 mapping all the local private training samples to a first sample representation through the initial global feature extraction model, and mapping all the local private training samples to a second sample representation through the local feature extraction model;
• feature extraction is performed on all local private training samples through the initial global feature extraction model, so that all local private training samples are respectively mapped to corresponding first sample representations; likewise, feature extraction is performed on all local private training samples through the local feature extraction model, so that all local private training samples are respectively mapped to corresponding second sample representations.
  • Step A22 calculating a contrastive learning loss based on the similarity between each of the first sample representations and each of the second sample representations;
  • the contrastive learning loss is calculated. Specifically, the following steps are performed for each second sample representation:
  • the step of calculating the contrastive learning loss based on the similarity between each of the first sample representations and each of the second sample representations includes:
• Step A221: taking, among the first sample representations, the sample representation that corresponds to the same local private training sample as the second sample representation as the local positive sample representation corresponding to that second sample representation;
• Step A222: taking, among the first sample representations, the sample representations that do not correspond to the same local private training sample as the second sample representation as the local negative sample representations corresponding to that second sample representation;
• Step A223: calculating the contrastive learning loss based on the similarity between each second sample representation and its corresponding local positive sample representation, and the similarity between each second sample representation and its corresponding local negative sample representations.
• the contrastive learning loss may be calculated by the following formula:
• L_N = -log [ exp(f(x)^T f(x^+)) / ( exp(f(x)^T f(x^+)) + Σ_{j=1}^{N-1} exp(f(x)^T f(x_j^-)) ) ]
• where L_N is the contrastive learning loss, N-1 is the number of local negative sample representations, f(x) is the second sample representation, f(x^+) is the corresponding local positive sample representation, and f(x_j^-) is the j-th local negative sample representation.
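A minimal numeric sketch of this contrastive loss, assuming the dot product as the similarity measure (the standard InfoNCE form; function and variable names are illustrative):

```python
import math

def contrastive_loss(second_repr, positive_repr, negative_reprs):
    """InfoNCE-style loss for one second sample representation f(x):
    the numerator uses the local positive sample representation f(x+),
    the denominator additionally sums over the N-1 local negatives."""
    dot = lambda u, v: sum(a * b for a, b in zip(u, v))
    pos = math.exp(dot(second_repr, positive_repr))
    neg = sum(math.exp(dot(second_repr, n)) for n in negative_reprs)
    return -math.log(pos / (pos + neg))

f_x  = [1.0, 0.0]    # second sample representation
f_xp = [1.0, 0.0]    # local positive sample representation
negs = [[0.0, 1.0]]  # local negative sample representations
print(contrastive_loss(f_x, f_xp, negs))  # ≈ 0.3133
```

The loss shrinks as the second sample representation moves closer to its positive and away from its negatives, which is exactly the alignment behavior step A223 describes.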
• Step A23: optimizing the local feature extraction model based on the contrastive learning loss to obtain a globally optimized local feature extraction model.
• based on the contrastive learning loss, the local feature extraction model is optimized to obtain a globally optimized local feature extraction model. Specifically, it is judged whether the contrastive learning loss converges; if it converges, the local feature extraction model is used as the globally optimized local feature extraction model, and if it does not converge, the local feature extraction model is updated based on the model gradient calculated from the contrastive learning loss, and execution returns to the step of extracting local private training samples. This promotes the local feature extraction model to learn the model knowledge of the initial global feature extraction model, so that the outputs of the local feature extraction model and the initial global feature extraction model for the same sample are as similar as possible, realizing the global optimization of the corresponding local feature extraction model.
• the step of optimizing the local feature extraction model based on the contrastive learning loss to obtain a globally optimized local feature extraction model includes:
  • Step A231 converting each of the second sample representations into output classification labels corresponding to each of the local private training samples through a preset classification model
• the local private training samples have corresponding preset real labels, where a preset real label is an identifier of a local private training sample and can be used to represent information of the local private training sample such as category, attribute, and identity.
• each of the second sample representations is converted into the output classification label corresponding to each local private training sample. Specifically, each second sample representation is input into the preset classification model and fully connected to obtain a fully connected vector corresponding to each second sample representation; then, based on a preset activation function, each fully connected vector is converted into the output classification label corresponding to each local private training sample.
  • Step A232 calculating a classification loss based on each of the output classification labels and the preset real labels corresponding to each of the local private training samples;
• the classification loss is calculated. Specifically, the cross-entropy loss between each output classification label and the preset real label corresponding to the corresponding local private training sample is calculated, and the cross-entropy losses are then accumulated to obtain the classification loss.
• in another embodiment, step A232 includes: calculating the L2 loss between each output classification label and the preset real label corresponding to the corresponding local private training sample, and then accumulating the L2 losses to obtain the classification loss.
  • Step A233 calculating the total model loss based on the contrastive learning loss and the classification loss
• the contrastive learning loss and the classification loss are aggregated according to a preset aggregation rule to obtain the total model loss, where the preset aggregation rule includes summing or averaging.
  • Step A234 Optimizing the local feature extraction model based on the total model loss to obtain the globally optimized local feature extraction model.
• it is judged whether the total model loss converges; if it converges, the local feature extraction model is used as the globally optimized local feature extraction model, and if it does not converge, the model gradient is calculated based on the total model loss, the local feature extraction model is updated, and execution returns to the step of extracting local private training samples. This promotes the local feature extraction model to learn the model knowledge of the initial global feature extraction model, makes the outputs of the local feature extraction model and the initial global feature extraction model for the same sample as close as possible (realizing the global optimization of the corresponding local feature extraction model), and also makes the output of the local feature extraction model as close as possible to the preset real labels, which improves the performance of the local feature extraction model.
• Step A30: extracting, from the federal public data set, the target modal public samples belonging to the data modality corresponding to the globally optimized local feature extraction model, and performing feature extraction on the target modal public samples based on the globally optimized local feature extraction model to obtain target modal public sample representations;
• the target modal public samples belonging to the data modality corresponding to the globally optimized local feature extraction model are extracted from the federal public data set; that is, the public samples whose data modality matches that of the participant device are extracted from the federal public data set.
• Step A40: sending the target modal public sample representations to the federated server, so that the federated server selectively aggregates the target modal public sample representations by data modality to obtain the target modal aggregated sample representation corresponding to each data modality, obtains the public training samples corresponding to each data modality in the federal public data set, and, based on each public training sample, performs knowledge distillation learning training of each initial global feature extraction model based on each target modal aggregated sample representation, as well as contrastive learning training between the initial global feature extraction models, to obtain the target global feature extraction model corresponding to each initial global feature extraction model.
• the federated server selectively aggregates the target modal public sample representations by data modality to obtain the target modal aggregated sample representation corresponding to each data modality, obtains the public training samples corresponding to each data modality in the federal public data set, and, based on each public training sample, performs knowledge distillation learning training of each initial global feature extraction model based on each target modal aggregated sample representation. For the specific implementation process of obtaining the target global feature extraction model corresponding to each initial global feature extraction model, refer to the specific content in steps S10 to S30, which will not be repeated here.
• the participant device receives the target global feature extraction model issued by the federated server and extracts local private training samples; then, based on the local private training samples, contrastive learning training is performed between the target global feature extraction model and the local feature extraction model to optimize the local feature extraction model, so that the local feature extraction model learns the model knowledge of the target global feature extraction model, thereby obtaining the target local feature extraction model. The specific implementation process of obtaining the target local feature extraction model is the same as that of obtaining the globally optimized local feature extraction model; for details, refer to the specific content in step A10 to step A20, which will not be repeated here.
• as shown in FIG. 3, which is a schematic diagram of the interaction process when performing horizontal federated learning modeling in the federated learning optimization method of the present application: server is the federated server; Client is a participant device and N is the number of participant devices; base 1 and head 1 make up the local feature extraction model of Client 1, and base N and head N make up the local feature extraction model of Client N; classifier is the preset classification model; base ga and head ga make up the initial global feature extraction model in Client 1, that is, Model ga; base gb and head gb make up the initial global feature extraction model in Client N, that is, Model gb; Y 1 and Y N are preset real labels; X 1 is the local private training sample in Client 1 and X N is the local private training sample in Client N; X pub.a is the target modal public sample in Client 1 and X pub.b is the target modal public sample in Client N; and Z agg.a is the predicted sample representation.
• the embodiment of the present application provides a federated learning modeling optimization method: first, the initial global feature extraction model issued by the federated server is received and local private training samples are extracted; then, based on the local private training samples, contrastive learning training is performed between the initial global feature extraction model and the local feature extraction model to optimize the local feature extraction model and obtain a globally optimized local feature extraction model, which has the local feature extraction model learn the model knowledge of the global model, that is, realizes the global optimization of the local feature extraction model.
• next, the target modal public samples belonging to the data modality corresponding to the globally optimized local feature extraction model are extracted from the federal public data set, feature extraction is performed on the target modal public samples based on the globally optimized local feature extraction model to obtain target modal public sample representations, and the target modal public sample representations are sent to the federated server. The federated server selectively aggregates the target modal public sample representations by data modality to obtain the target modal aggregated sample representation corresponding to each data modality.
• in this way, each initial global feature extraction model can, through knowledge distillation, indirectly combine the samples of multiple participants of the corresponding data modality for horizontal federated learning; at the same time, the initial global feature extraction models corresponding to the different data modalities can use contrastive learning to align samples of different data modalities in the feature space, thereby achieving the purpose of indirectly combining samples of different data modalities for horizontal federated learning. Horizontal federated learning is thus no longer limited to samples of the same data modality held by different participants, which overcomes the technical defect in the prior art that horizontal federated learning can only be performed by combining samples of the same data modality across participants, so the limitations of horizontal federated learning are reduced.
  • FIG. 4 is a schematic diagram of a device structure of a hardware operating environment involved in the solution of the embodiment of the present application.
  • the federated learning modeling optimization device may include: a processor 1001, such as a CPU, a memory 1005, and a communication bus 1002.
  • the communication bus 1002 is used to realize connection and communication between the processor 1001 and the memory 1005 .
  • the memory 1005 can be a high-speed RAM memory, or a stable memory (non-volatile memory), such as a disk memory.
  • the memory 1005 may also be a storage device independent of the aforementioned processor 1001 .
• optionally, the federated learning modeling optimization device may also include a user interface, a network interface, a camera, an RF (Radio Frequency) circuit, a sensor, an audio circuit, a WiFi module, and the like.
• the user interface may include a display screen (Display) and an input sub-module such as a keyboard (Keyboard); the optional user interface may also include a standard wired interface and a wireless interface.
  • the network interface may include a standard wired interface and a wireless interface (such as a WI-FI interface).
• those skilled in the art can understand that the federated learning modeling optimization device structure shown in FIG. 4 does not constitute a limitation on the federated learning modeling optimization device, and the device may include more or fewer components than shown, or combine some components, or adopt a different component arrangement.
  • the memory 1005 as a computer storage medium may include an operating system, a network communication module, and a federated learning modeling optimization program.
  • the operating system is a program that manages and controls the hardware and software resources of the federated learning modeling optimization device, and supports the operation of the federated learning modeling optimization program and other software and/or programs.
  • the network communication module is used to realize the communication between various components inside the memory 1005, and communicate with other hardware and software in the federated learning modeling optimization system.
  • the processor 1001 is configured to execute the federated learning modeling optimization program stored in the memory 1005 to implement the steps of the federated learning modeling optimization method described in any one of the above.
  • the embodiment of the present application also provides a federated learning modeling optimization device, the federated learning modeling optimization device is applied to a federated server, and the federated learning modeling optimization device includes:
• the model distribution module is used to distribute the initial global feature extraction model corresponding to each data modality to the participant device corresponding to each data modality, so that the participant device, based on local private training samples, performs contrastive learning training between the initial global feature extraction model and the local feature extraction model, optimizes the local feature extraction model to obtain a globally optimized local feature extraction model, and, based on the globally optimized local feature extraction model, performs feature extraction on the corresponding target modal public samples in the federal public data set to obtain target modal public sample representations;
• a selective aggregation module configured to receive the target modal public sample representations sent by each participant device, and to selectively aggregate the target modal public sample representations by data modality to obtain the target modal aggregated sample representation corresponding to each data modality;
• the training module is used to obtain the public training samples corresponding to each data modality in the federal public data set and, based on each public training sample, to perform knowledge distillation learning training of each initial global feature extraction model based on each target modal aggregated sample representation, as well as contrastive learning training between the initial global feature extraction models, to obtain the target global feature extraction model corresponding to each initial global feature extraction model.
  • the training module is also used for:
  • each of the initial global feature extraction models is optimized to obtain each target global feature extraction model.
  • the training module is also used for:
  • the knowledge distillation losses corresponding to each of the initial global feature extraction models are calculated respectively.
  • the selective aggregation module is also used for:
• based on the correspondence between each participant device and each data modality, the sample representations to be aggregated corresponding to each data modality are determined among the target modal public sample representations sent by the participant devices;
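The modality-based selective aggregation can be sketched as follows, assuming averaging within each data modality as the aggregation rule (all names are illustrative):

```python
def selective_aggregate(representations, device_modality):
    """Group each participant device's target modal public sample
    representation by that device's data modality, then average within
    each modality to obtain the target modal aggregated sample
    representation per data modality."""
    by_modality = {}
    for device, repr_vec in representations.items():
        by_modality.setdefault(device_modality[device], []).append(repr_vec)
    aggregated = {}
    for modality, vecs in by_modality.items():
        dim = len(vecs[0])
        aggregated[modality] = [sum(v[i] for v in vecs) / len(vecs)
                                for i in range(dim)]
    return aggregated

reps = {"client1": [1.0, 3.0], "client2": [3.0, 1.0], "client3": [0.5, 0.5]}
mods = {"client1": "image", "client2": "image", "client3": "text"}
print(selective_aggregate(reps, mods))  # image → [2.0, 2.0], text → [0.5, 0.5]
```

Only representations of the same data modality are combined, which matches the "selective" aspect: each modality's aggregated representation is built exclusively from devices holding that modality.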
  • the specific implementation of the federated learning modeling optimization device of the present application is basically the same as the above embodiments of the federated learning modeling optimization method, and will not be repeated here.
  • the embodiment of the present application also provides a federated learning modeling optimization device, the federated learning modeling optimization device is applied to participant equipment, and the federated learning modeling optimization device includes:
  • the receiving module is used to receive the initial global feature extraction model issued by the federation server, and extract local private training samples;
• a contrastive learning training module configured to optimize the local feature extraction model by performing contrastive learning training between the initial global feature extraction model and the local feature extraction model based on the local private training samples, to obtain a globally optimized local feature extraction model;
• a feature extraction module used to extract, from the federal public data set, the target modal public samples belonging to the data modality corresponding to the globally optimized local feature extraction model, and to perform feature extraction on the target modal public samples based on the globally optimized local feature extraction model to obtain target modal public sample representations;
• a sending module configured to send the target modal public sample representations to the federated server, so that the federated server selectively aggregates the target modal public sample representations by data modality to obtain the target modal aggregated sample representation corresponding to each data modality, obtains the public training samples corresponding to each data modality in the federal public data set, and, based on each public training sample, performs knowledge distillation learning training of each initial global feature extraction model based on each target modal aggregated sample representation, as well as contrastive learning training between the initial global feature extraction models, to obtain the target global feature extraction model corresponding to each initial global feature extraction model.
  • the contrastive learning training module is also used for:
  • the local feature extraction model is optimized to obtain a globally optimized local feature extraction model.
  • the contrastive learning training module is also used for:
  • the local feature extraction model is optimized to obtain the globally optimized local feature extraction model.
  • the specific implementation of the federated learning modeling optimization device of the present application is basically the same as the above embodiments of the federated learning modeling optimization method, and will not be repeated here.
  • the embodiment of the present application provides a readable storage medium, and the readable storage medium stores one or more programs, and the one or more programs can also be executed by one or more processors to implement The steps of the federated learning modeling optimization method described in any one of the above.
  • the specific implementation manner of the readable storage medium of the present application is basically the same as the above embodiments of the federated learning modeling optimization method, and will not be repeated here.
  • the embodiment of the present application provides a computer program product, and the computer program product includes one or more computer programs, and the one or more computer programs can also be executed by one or more processors to implement The steps of the federated learning modeling optimization method described in any one of the above.

Abstract

Disclosed in the present application are a federated learning modeling optimization method and device, and a readable storage medium and a program product, which are applied to a federated server. The federated learning modeling optimization method comprises: distributing initial global feature extraction models corresponding to data modalities to participant devices corresponding to those data modalities, so that the participant devices obtain globally optimized local feature extraction models on the basis of local private training samples and by means of contrastive learning training, and generate target modal public sample representations according to the globally optimized local feature extraction models; receiving the target modal public sample representations sent by the participant devices, and aggregating the target modal public sample representations into target modal aggregated sample representations; and, according to training samples selected from a public data set, optimizing each initial global feature extraction model into a corresponding target global feature extraction model by means of knowledge distillation and contrastive learning.

Description

联邦学习建模优化方法、设备、可读存储介质及程序产品Federated learning modeling optimization method, device, readable storage medium and program product
优先权信息priority information
本申请要求于2021年7月28日申请的、申请号为202110860096.9的中国专利申请的优先权,其全部内容通过引用结合在本申请中。This application claims priority to a Chinese patent application with application number 202110860096.9 filed on July 28, 2021, the entire contents of which are incorporated herein by reference.
技术领域technical field
本申请涉及金融科技(Fintech)的人工智能技术领域,尤其涉及一种联邦学习建模优化方法、设备、可读存储介质及程序产品。The present application relates to the technical field of artificial intelligence in financial technology (Fintech), and in particular to a federated learning modeling optimization method, equipment, readable storage medium and program product.
背景技术Background technique
随着金融科技,尤其是互联网科技金融的不断发展,越来越多的技术(如分布式、人工智能等)应用在金融领域,但金融业也对技术提出了更高的要求,如对金融业对应待办事项的分发也有更高的要求。With the continuous development of financial technology, especially Internet technology finance, more and more technologies (such as distributed, artificial intelligence, etc.) The industry also has higher requirements for the distribution of to-do items.
随着计算机软件和人工智能、大数据云服务应用的不断发展,目前,在现有的横向联邦学习框架中,横向联邦学习的前提之一为各参与方的样本需要在特征本身上进行对齐,而在特征本身上进行对齐的前提通常要求各参与方的样本处于同一数据模态,例如,各参与方的样本均为图像或者均为文字等,但是,当各参与方的样本的数据模态不同时,则各参与方之间无法进行横向联邦学习,所以,现有的横向联邦学习只能联合不同参与方中同一数据模态的样本进行,现有的横向联邦学习的局限性较强。With the continuous development of computer software, artificial intelligence, and big data cloud service applications, at present, in the existing horizontal federated learning framework, one of the prerequisites for horizontal federated learning is that the samples of each participant need to be aligned on the features themselves. The premise of alignment on the feature itself usually requires the samples of each participant to be in the same data mode, for example, the samples of each participant are all images or text, etc. If it is different, horizontal federated learning cannot be performed among the participants. Therefore, the existing horizontal federated learning can only be performed by combining samples of the same data mode in different participants, and the existing horizontal federated learning has strong limitations.
Summary
The main purpose of the present application is to provide a federated learning modeling optimization method, device, readable storage medium, and program product, aiming to solve the technical problem in the prior art that horizontal federated learning is strongly limited.
To achieve the above purpose, the present application provides a federated learning modeling optimization method. The method is applied to a federation server and includes:
distributing the initial global feature extraction model corresponding to each data modality to the participant devices corresponding to that data modality, so that each participant device, based on its local private training samples, performs contrastive learning training between the initial global feature extraction model and its local feature extraction model to optimize the local feature extraction model and obtain a globally optimized local feature extraction model, and, based on the globally optimized local feature extraction model, performs feature extraction on the corresponding target-modality public samples in the federated public dataset to obtain target-modality public sample representations;
receiving the target-modality public sample representations sent by each participant device, and performing selective aggregation based on data modality on the target-modality public sample representations to obtain a target-modality aggregated sample representation corresponding to each data modality;
obtaining public training samples corresponding to each data modality from the federated public dataset, and, based on the public training samples, performing knowledge distillation learning training on each initial global feature extraction model guided by the corresponding target-modality aggregated sample representation, as well as contrastive learning training between the initial global feature extraction models, to obtain a target global feature extraction model corresponding to each initial global feature extraction model.
The present application further provides a federated learning modeling optimization method. The method is applied to a participant device and includes:
receiving the initial global feature extraction model issued by the federation server, and obtaining local private training samples;
based on the local private training samples, performing contrastive learning training between the initial global feature extraction model and a local feature extraction model to optimize the local feature extraction model and obtain a globally optimized local feature extraction model;
extracting, from the federated public dataset, target-modality public samples belonging to the data modality corresponding to the globally optimized local feature extraction model, and performing feature extraction on the target-modality public samples based on the globally optimized local feature extraction model to obtain target-modality public sample representations;
sending the target-modality public sample representations to the federation server, so that the federation server performs selective aggregation based on data modality on the target-modality public sample representations of all participants to obtain a target-modality aggregated sample representation corresponding to each data modality, obtains public training samples corresponding to each data modality from the federated public dataset, and, based on the public training samples, performs knowledge distillation learning training on each initial global feature extraction model guided by the corresponding target-modality aggregated sample representation, as well as contrastive learning training between the initial global feature extraction models, to obtain a target global feature extraction model corresponding to each initial global feature extraction model.
The present application further provides a federated learning modeling optimization apparatus. The apparatus is a virtual apparatus applied to a federation server and includes:
a model distribution module, configured to distribute the initial global feature extraction model corresponding to each data modality to the participant devices corresponding to that data modality, so that each participant device, based on its local private training samples, performs contrastive learning training between the initial global feature extraction model and its local feature extraction model to optimize the local feature extraction model and obtain a globally optimized local feature extraction model, and, based on the globally optimized local feature extraction model, performs feature extraction on the corresponding target-modality public samples in the federated public dataset to obtain target-modality public sample representations;
a selective aggregation module, configured to receive the target-modality public sample representations sent by each participant device and perform selective aggregation based on data modality on them to obtain a target-modality aggregated sample representation corresponding to each data modality;
a training module, configured to obtain public training samples corresponding to each data modality from the federated public dataset and, based on the public training samples, perform knowledge distillation learning training on each initial global feature extraction model guided by the corresponding target-modality aggregated sample representation, as well as contrastive learning training between the initial global feature extraction models, to obtain a target global feature extraction model corresponding to each initial global feature extraction model.
The present application further provides a federated learning modeling optimization apparatus. The apparatus is a virtual apparatus applied to a participant device and includes:
a receiving module, configured to receive the initial global feature extraction model issued by the federation server and obtain local private training samples;
a contrastive learning training module, configured to optimize the local feature extraction model by performing contrastive learning training between the initial global feature extraction model and the local feature extraction model based on the local private training samples, to obtain a globally optimized local feature extraction model;
a feature extraction module, configured to extract, from the federated public dataset, target-modality public samples belonging to the data modality corresponding to the globally optimized local feature extraction model, and to perform feature extraction on the target-modality public samples based on the globally optimized local feature extraction model to obtain target-modality public sample representations;
a sending module, configured to send the target-modality public sample representations to the federation server, so that the federation server performs selective aggregation based on data modality on the target-modality public sample representations of all participants to obtain a target-modality aggregated sample representation corresponding to each data modality, obtains public training samples corresponding to each data modality from the federated public dataset, and, based on the public training samples, performs knowledge distillation learning training on each initial global feature extraction model guided by the corresponding target-modality aggregated sample representation, as well as contrastive learning training between the initial global feature extraction models, to obtain a target global feature extraction model corresponding to each initial global feature extraction model.
The present application further provides a federated learning modeling optimization device. The device is a physical device including a memory, a processor, and a program of the federated learning modeling optimization method stored on the memory and executable on the processor. When the program of the federated learning modeling optimization method is executed by the processor, the steps of the federated learning modeling optimization method described above are implemented.
The present application further provides a readable storage medium storing a program implementing the federated learning modeling optimization method. When the program is executed by a processor, the steps of the federated learning modeling optimization method described above are implemented.
The present application further provides a computer program product, including a computer program. When the computer program is executed by a processor, the steps of the federated learning modeling optimization method described above are implemented.
The present application provides a federated learning modeling optimization method, device, readable storage medium, and program product. Compared with the prior-art approach of aligning participants' samples on the features themselves for horizontal federated learning, the present application first distributes the initial global feature extraction model corresponding to each data modality to the participant devices corresponding to that data modality. Based on its local private training samples, each participant device performs contrastive learning training between the initial global feature extraction model and its local feature extraction model to optimize the local feature extraction model, so that the local feature extraction model learns the model knowledge of the global model, yielding a globally optimized local feature extraction model. Based on the globally optimized local feature extraction model, the participant device performs feature extraction on the corresponding target-modality public samples in the federated public dataset to obtain target-modality public sample representations. The federation server receives the target-modality public sample representations sent by each participant device and performs selective aggregation based on data modality on them to obtain a target-modality aggregated sample representation corresponding to each data modality, thereby aggregating the representations belonging to the same data modality. The server then obtains public training samples corresponding to each data modality from the federated public dataset and, based on these public training samples, performs knowledge distillation learning training on each initial global feature extraction model guided by the corresponding target-modality aggregated sample representation, as well as contrastive learning training between the initial global feature extraction models, to obtain a target global feature extraction model corresponding to each initial global feature extraction model. Since each target-modality public sample representation is output by a globally optimized local feature extraction model obtained through optimization on a participant device's local private training samples, each initial global feature extraction model can, through knowledge distillation, indirectly combine the samples of multiple participants of the corresponding data modality for horizontal federated learning. At the same time, the initial global feature extraction models corresponding to the different data modalities can use contrastive learning to align samples of different data modalities in the feature space, thereby achieving the purpose of indirectly combining samples of different data modalities for horizontal federated learning. Horizontal federated learning is thus no longer limited to samples of the same data modality across different participants, overcoming the technical defect in the prior art that horizontal federated learning can only combine samples of the same data modality from different participants and is therefore strongly limited. The limitations of horizontal federated learning are accordingly reduced.
Brief Description of the Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and, together with the description, serve to explain the principles of the present application.
To describe the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, a person of ordinary skill in the art can obtain other drawings from these drawings without creative effort.
FIG. 1 is a schematic flowchart of a first embodiment of the federated learning modeling optimization method of the present application;
FIG. 2 is a schematic flowchart of a second embodiment of the federated learning modeling optimization method of the present application;
FIG. 3 is a schematic diagram of the interaction flow during horizontal federated learning modeling in the federated learning modeling optimization method of the present application;
FIG. 4 is a schematic structural diagram of a device in the hardware operating environment involved in the federated learning modeling optimization method in the embodiments of the present application.
The realization of the purpose, functional features, and advantages of the present application will be further described with reference to the accompanying drawings in conjunction with the embodiments.
Detailed Description
It should be understood that the specific embodiments described here are only intended to explain the present application, not to limit it.
An embodiment of the present application provides a federated learning modeling optimization method applied to a federation server. In a first embodiment of the federated learning modeling optimization method of the present application, referring to FIG. 1, the method includes:
Step S10: distributing the initial global feature extraction model corresponding to each data modality to the participant devices corresponding to that data modality, so that each participant device, based on its local private training samples, performs contrastive learning training between the initial global feature extraction model and its local feature extraction model to optimize the local feature extraction model and obtain a globally optimized local feature extraction model, and, based on the globally optimized local feature extraction model, performs feature extraction on the corresponding target-modality public samples in the federated public dataset to obtain target-modality public sample representations;
In this embodiment, it should be noted that the federated learning modeling optimization method is applied to horizontal federated learning. The horizontal federated learning framework includes a federation server and multiple participant devices, where the federation server maintains the global models and the participant devices maintain their own local models. The participant devices collectively correspond to multiple data modalities, and each data modality corresponds to at least one participant device. For example, suppose there are data modalities A and B: data modality A corresponds to participant devices a and b, that is, participant devices a and b hold samples belonging to data modality A, while data modality B corresponds to participant device c, that is, participant device c holds samples belonging to data modality B.
It should further be noted that each participant device owns its own local private training samples, which may have corresponding local sample labels and whose data modality is the data modality corresponding to that participant device. All participant devices and the federation server share the same federated public dataset, which holds public training samples belonging to each of the data modalities corresponding to the participant devices. The federation server maintains one initial global feature extraction model for each data modality corresponding to the participant devices, while each participant device maintains one local feature extraction model for its own data modality.
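The bookkeeping described above can be sketched as follows. This is a minimal illustrative example, not taken from the patent: the names `modality_to_participants`, `global_models`, and `distribute_models` are hypothetical, and plain strings stand in for real model objects.

```python
# Each data modality maps to the participant devices holding samples of that
# modality; the federation server keeps one global feature extraction model
# per modality and distributes it to exactly those devices (step S10 dispatch).
modality_to_participants = {
    "A": ["device_a", "device_b"],  # devices a and b hold samples of modality A
    "B": ["device_c"],              # device c holds samples of modality B
}

# One initial global feature extraction model per modality (strings as stand-ins).
global_models = {m: f"init_model_{m}" for m in modality_to_participants}

def distribute_models(modality_to_participants, global_models):
    """Return which model each participant device receives."""
    dispatch = {}
    for modality, devices in modality_to_participants.items():
        for device in devices:
            dispatch[device] = global_models[modality]
    return dispatch

dispatch = distribute_models(modality_to_participants, global_models)
```

Each device thus receives only the global model of its own modality, which is the precondition for the per-modality training and aggregation in the later steps.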
The initial global feature extraction model corresponding to each data modality is distributed to the participant devices corresponding to that data modality, so that each participant device, based on its local private training samples, performs contrastive learning training between the initial global feature extraction model and its local feature extraction model, optimizes the local feature extraction model, obtains a globally optimized local feature extraction model, and, based on the globally optimized local feature extraction model, performs feature extraction on the corresponding target-modality public samples in the federated public dataset to obtain target-modality public sample representations. Specifically, the initial global feature extraction model corresponding to each data modality is distributed to the participant devices of that data modality. Each participant device then obtains its locally held local private training samples and local feature extraction model and, by performing contrastive learning training between the initial global feature extraction model and the local feature extraction model, prompts the local feature extraction model to learn the model knowledge of the initial global feature extraction model, thereby optimizing the local feature extraction model to obtain a globally optimized local feature extraction model. The participant device then extracts, from the federated public dataset, the target-modality public samples belonging to its data modality and uses the globally optimized local feature extraction model to perform feature extraction on them, mapping the target-modality public samples to target-modality public sample representations. For the specific implementation process by which a participant device obtains the globally optimized local feature extraction model and the target-modality public sample representations, reference may be made to steps A10 to A30 and their refinement steps, which will not be repeated here.
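One common way to realize the contrastive learning training described above is an InfoNCE-style loss that treats the global model's representation of the same private sample as the positive and its representations of other samples as negatives. The sketch below is an illustrative assumption, not the patent's prescribed loss; `contrastive_loss` and the toy vectors are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    return dot / (nu * nv)

def contrastive_loss(local_reps, global_reps, temperature=0.5):
    """InfoNCE-style loss: for sample i, global_reps[i] is the positive and
    the other global representations are negatives. A low loss means the
    local model's representations agree, per sample, with the global model's."""
    loss = 0.0
    for i, z in enumerate(local_reps):
        sims = [math.exp(cosine(z, g) / temperature) for g in global_reps]
        loss += -math.log(sims[i] / sum(sims))
    return loss / len(local_reps)

# Toy representations of two private samples under the local and global models.
local = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.1], [0.1, 1.0]]     # global reps close to the local ones
misaligned = [[0.0, 1.0], [1.0, 0.0]]  # positives swapped with negatives
loss_aligned = contrastive_loss(local, aligned)
loss_misaligned = contrastive_loss(local, misaligned)
```

Minimizing such a loss with respect to the local model's parameters pulls the local representations toward the global model's, which is how the local model "learns the model knowledge" of the global model.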
Step S20: receiving the target-modality public sample representations sent by each participant device, and performing selective aggregation based on data modality on the target-modality public sample representations to obtain a target-modality aggregated sample representation corresponding to each data modality;
In this embodiment, the target-modality public sample representations sent by each participant device are received, and the representations corresponding to the same data modality among them are aggregated respectively to obtain the target-modality aggregated sample representation corresponding to each data modality. For example, suppose three target-modality public sample representations X1, X2, and X3 correspond to data modality A, and three target-modality public sample representations X4, X5, and X6 correspond to data modality B; then X1, X2, and X3 are selected and aggregated to obtain the target-modality aggregated sample representation Z1 corresponding to data modality A, and X4, X5, and X6 are selected and aggregated to obtain the target-modality aggregated sample representation Z2 corresponding to data modality B.
The step of performing selective aggregation based on data modality on the target-modality public sample representations to obtain the target-modality aggregated sample representation corresponding to each data modality includes:
Step S21: determining, based on the correspondence between the participant devices and the data modalities, the to-be-aggregated sample representations corresponding to each data modality among the target-modality public sample representations;
In this embodiment, it should be noted that each data modality corresponds to at least one to-be-aggregated sample representation. Since the target-modality public sample representations are sent to the federation server by the participant devices, there is a one-to-one correspondence between the target-modality public sample representations and the participant devices; each participant device has one data modality, and the data modalities of different participant devices may be the same or different. Specifically, based on the correspondence between the participant devices and the data modalities, the to-be-aggregated sample representations corresponding to each data modality are determined among the target-modality public sample representations of the participant devices. For example, suppose participant devices a1 and a2 correspond to data modality A, and participant devices b1 and b2 correspond to data modality B; then the target-modality public sample representations sent by participant devices a1 and a2 to the federation server are all to-be-aggregated sample representations corresponding to data modality A, and those sent by participant devices b1 and b2 are all to-be-aggregated sample representations corresponding to data modality B.
Step S22: aggregating the to-be-aggregated sample representations corresponding to each data modality respectively to obtain the target-modality aggregated sample representation corresponding to each data modality.
In this embodiment, the to-be-aggregated sample representations corresponding to each data modality are aggregated respectively to obtain the target-modality aggregated sample representation corresponding to each data modality. Specifically, based on a preset aggregation rule, the to-be-aggregated sample representations corresponding to each data modality are aggregated respectively, where the preset aggregation rule includes summation, averaging, and the like. For example, suppose the two to-be-aggregated sample representations corresponding to data modality A are a1 and a2, and the two corresponding to data modality B are b1 and b2; then a1 and a2 are aggregated into the target-modality aggregated sample representation corresponding to data modality A, and b1 and b2 are aggregated into the target-modality aggregated sample representation corresponding to data modality B.
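Steps S21 and S22 can be sketched together: group each representation by the sending device's data modality, then apply the aggregation rule per group. The sketch below assumes the averaging rule and uses hypothetical names (`selective_aggregate`, `participant_modality`); real representations would be model-output vectors rather than short lists.

```python
def selective_aggregate(representations, participant_modality):
    """Selective aggregation based on data modality (steps S21 + S22):
    group each participant's public-sample representation by the modality of
    the sending device, then average element-wise within each group."""
    groups = {}
    for device, rep in representations.items():
        groups.setdefault(participant_modality[device], []).append(rep)
    aggregated = {}
    for modality, reps in groups.items():
        n = len(reps)
        aggregated[modality] = [sum(vals) / n for vals in zip(*reps)]
    return aggregated

# Devices a1/a2 hold modality A, b1/b2 hold modality B (as in the example above).
participant_modality = {"a1": "A", "a2": "A", "b1": "B", "b2": "B"}
representations = {
    "a1": [1.0, 3.0], "a2": [3.0, 1.0],   # modality A group
    "b1": [0.0, 4.0], "b2": [2.0, 0.0],   # modality B group
}
aggregated = selective_aggregate(representations, participant_modality)
```

Swapping the averaging line for a plain element-wise sum gives the summation variant of the preset aggregation rule.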
Step S30: obtaining public training samples corresponding to each data modality from the federated public dataset, and, based on the public training samples, performing knowledge distillation learning training on each initial global feature extraction model guided by the corresponding target-modality aggregated sample representation, as well as contrastive learning training between the initial global feature extraction models, to obtain a target global feature extraction model corresponding to each initial global feature extraction model.
In this embodiment, it should be noted that the target-modality aggregated sample representation is used to guide the optimization of the corresponding initial global feature extraction model, so that the output of the initial global feature extraction model is as close as possible to the corresponding target-modality aggregated sample representation.
The public training samples corresponding to each data modality are obtained from the federated public data set, and based on these public training samples, knowledge distillation learning training guided by the target modality aggregated sample representations is performed on each initial global feature extraction model, together with contrastive learning training between the initial global feature extraction models, to obtain the target global feature extraction model corresponding to each initial global feature extraction model. Specifically, the public training samples corresponding to each data modality are obtained from the federated public data set; knowledge distillation learning training guided by the target modality aggregated sample representations is performed on each initial global feature extraction model, so as to transfer the model knowledge of the globally optimized local feature extraction model of each data modality on each participant device into the initial global feature extraction model of the corresponding data modality; and contrastive learning training is performed between the initial global feature extraction models, so as to encourage samples of different data modalities to be aligned in the feature space, thereby obtaining the target global feature extraction model corresponding to each initial global feature extraction model.
In addition, it should be noted that the federated server may directly use the target modality public samples selected by each participant device from the federated public data set as public training samples. For example, suppose participant device A selects multiple target modality public samples of data modality a, denoted X1; participant device B selects multiple target modality public samples of data modality a, denoted X2; and participant device C selects multiple target modality public samples of data modality b, denoted X3. Then X1 and X2 can be used directly as public training samples for the initial global feature extraction model corresponding to data modality a, and X3 can be used directly as public training samples for the initial global feature extraction model corresponding to data modality b.
In addition, it should be noted that although the target modality aggregated sample representation is used to guide the optimization of the corresponding initial global feature extraction model, so that the model's output is as close as possible to that representation, the target modality aggregated sample representation is the aggregation result of the target modality public sample representations of several target modality public samples. It therefore characterizes, to a certain extent, the corresponding data modality rather than any single sample. Accordingly, samples belonging to the data modality of an initial global feature extraction model may also be re-selected from the federated public data set as that model's public training samples; it is not necessary to use the target modality public samples selected by the participant devices from the federated public data set as the public training samples.
The step of performing, based on the public training samples, knowledge distillation learning training guided by the target modality aggregated sample representations on each initial global feature extraction model, as well as contrastive learning training between the initial global feature extraction models, to obtain the target global feature extraction model corresponding to each initial global feature extraction model, includes:
Step S31: map the public training samples corresponding to each data modality into predicted sample representations through the corresponding initial global feature extraction model;
In this embodiment, the public training samples corresponding to each data modality are mapped into predicted sample representations through the corresponding initial global feature extraction model. Specifically, each public training sample is input into the initial global feature extraction model corresponding to that sample's data modality, which performs feature extraction on the sample so as to map it into a preset sample representation space, thereby obtaining the predicted sample representation corresponding to each public training sample; that is, each public training sample is mapped to a corresponding predicted sample representation.
Step S32: calculate the knowledge distillation loss between each predicted sample representation and the corresponding target modality aggregated sample representation, and calculate the contrastive learning loss between the predicted sample representations;
In this embodiment, the knowledge distillation loss between each predicted sample representation and the corresponding target modality aggregated sample representation is calculated, as is the contrastive learning loss between the predicted sample representations. Specifically, the knowledge distillation loss is calculated based on the similarity between each predicted sample representation and the corresponding target modality aggregated sample representation, and the contrastive learning loss is calculated based on the similarity between the predicted sample representations.
The step of calculating the knowledge distillation loss between each predicted sample representation and the corresponding target modality aggregated sample representation, and calculating the contrastive learning loss between the predicted sample representations, includes:
Step S321: based on the sample label corresponding to each public training sample, select, from among the predicted sample representations, the positive sample representations and negative sample representations corresponding to each predicted sample representation;
In this embodiment, it should be noted that when public training samples of different modality data represent the same object, these public training samples of different modality data share the same sample label, and together they constitute a public training sample group. That is, samples belonging to the same public training sample group have the same sample label, and samples not belonging to the same public training sample group have different sample labels.
Based on the sample label corresponding to each public training sample, the positive sample representations and negative sample representations corresponding to each predicted sample representation are selected from among the predicted sample representations. Specifically, the public training sample group corresponding to each public training sample is determined based on its sample label, and then, for each public training sample:
the predicted sample representations of the other public training samples within the public training sample group of that public training sample are taken as the positive sample representations corresponding to the predicted sample representation of that public training sample, and the predicted sample representations of the samples outside that public training sample group are taken as its negative sample representations, thereby obtaining the positive sample representations and negative sample representations corresponding to each predicted sample representation, where the number of positive sample representations and the number of negative sample representations are each at least 1.
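The label-based selection rule above can be sketched as follows. This is a minimal example assuming each sample's group membership is encoded as a plain label value; the function name `split_pos_neg` is hypothetical and not taken from the application.

```python
def split_pos_neg(labels, index):
    """For the sample at `index`, return (positive, negative) index lists.

    Positives: other samples sharing the same label, i.e. members of the same
    public training sample group. Negatives: samples with a different label.
    """
    anchor_label = labels[index]
    positives = [i for i, lab in enumerate(labels)
                 if i != index and lab == anchor_label]
    negatives = [i for i, lab in enumerate(labels) if lab != anchor_label]
    return positives, negatives

# Samples 0 and 2 describe the same object in different modalities,
# so they share a label and form one public training sample group.
labels = ["cat", "dog", "cat", "dog"]
print(split_pos_neg(labels, 0))  # ([2], [1, 3])
```

Indexing by position rather than copying vectors keeps the selection step independent of the representation dimensionality.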
In another implementation, if the number of participant devices is 2 and the two participant devices correspond to different data modalities, the public training samples include first public training samples belonging to a first data modality and second public training samples belonging to a second data modality, and the step of selecting, based on the sample label corresponding to each public training sample, the positive sample representations and negative sample representations corresponding to each predicted sample representation includes:
determining, among the second public training samples, the samples having the same sample label as a first public training sample as the first positive samples corresponding to that first public training sample, and determining the samples not having the same sample label as its first negative samples; taking the predicted sample representations corresponding to the first positive samples as the positive sample representations of the predicted sample representation of the first public training sample, and taking the predicted sample representations corresponding to the first negative samples as its negative sample representations.
Likewise, determining, among the first public training samples, the samples having the same sample label as a second public training sample as the second positive samples corresponding to that second public training sample, and determining the samples not having the same sample label as its second negative samples; taking the predicted sample representations corresponding to the second positive samples as the positive sample representations of the predicted sample representation of the second public training sample, and taking the predicted sample representations corresponding to the second negative samples as its negative sample representations.
Step S322: based on each predicted sample representation and its corresponding positive sample representations and negative sample representations, calculate the contrastive learning loss corresponding to each initial global feature extraction model;
In this embodiment, the contrastive learning loss corresponding to each initial global feature extraction model is calculated based on each predicted sample representation and its corresponding positive sample representations and negative sample representations. Specifically, the following steps are performed for each initial global feature extraction model:
based on the similarity between each predicted sample representation output by the initial global feature extraction model and its corresponding positive sample representations, and the similarity between each predicted sample representation output by the initial global feature extraction model and its corresponding negative sample representations, the contrastive learning loss corresponding to that initial global feature extraction model is calculated, thereby obtaining the contrastive learning loss corresponding to each initial global feature extraction model. In one implementable manner, the contrastive learning loss is calculated as follows:
$$L_N = -\log\frac{\exp\left(f(x)^{T} f(x^{+})\right)}{\exp\left(f(x)^{T} f(x^{+})\right) + \sum_{j=1}^{N-1}\exp\left(f(x)^{T} f(x_{j}^{-})\right)}$$

where $L_N$ is the contrastive learning loss, $N-1$ is the number of negative sample representations, $f(x)$ is the predicted sample representation (the superscript $T$ denotes transposition), $f(x^{+})$ is the positive sample representation corresponding to the predicted sample representation, and $f(x_{j}^{-})$ is the $j$-th negative sample representation corresponding to the predicted sample representation.
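A direct numerical transcription of this loss can be sketched as follows, assuming each representation is a plain vector and the inner product measures similarity. The function names are hypothetical; this is an illustrative sketch, not the application's implementation.

```python
import math

def dot(u, v):
    """Inner product f(x)^T f(y) of two representation vectors."""
    return sum(a * b for a, b in zip(u, v))

def contrastive_loss(anchor, positive, negatives):
    """Contrastive learning loss for one predicted sample representation.

    anchor: predicted sample representation f(x)
    positive: its positive sample representation f(x+)
    negatives: the N-1 negative sample representations f(x_j^-)
    """
    pos_score = math.exp(dot(anchor, positive))
    neg_scores = sum(math.exp(dot(anchor, n)) for n in negatives)
    return -math.log(pos_score / (pos_score + neg_scores))

anchor = [1.0, 0.0]
loss_easy = contrastive_loss(anchor, [1.0, 0.0], [[-1.0, 0.0]])
loss_hard = contrastive_loss(anchor, [-1.0, 0.0], [[1.0, 0.0]])
assert loss_easy < loss_hard  # loss falls as the positive aligns with the anchor
```

The loss is minimized when the anchor is similar to its positive and dissimilar to its negatives, which is exactly the alignment behavior the training step relies on.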
Step S323: based on the similarity between each predicted sample representation and the corresponding target modality aggregated sample representation, calculate the knowledge distillation loss corresponding to each initial global feature extraction model.
In this embodiment, the knowledge distillation loss corresponding to each initial global feature extraction model is calculated based on the similarity between each predicted sample representation and the corresponding target modality aggregated sample representation. Specifically, based on the similarity between each predicted sample representation and the corresponding target modality aggregated sample representation, the cross entropy between each predicted sample representation and the corresponding target modality aggregated sample representation is calculated, yielding the knowledge distillation loss corresponding to each initial global feature extraction model.
In another implementation, the knowledge distillation loss corresponding to each initial global feature extraction model is calculated through an L2 loss function, based on the similarity between each predicted sample representation and the corresponding target modality aggregated sample representation.
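The L2 variant of the distillation loss admits a very short sketch. This assumes both representations are vectors of equal length; the function name `l2_distillation_loss` and the mean normalization are illustrative choices, not specified by the application.

```python
def l2_distillation_loss(predicted, target):
    """Mean squared (L2) distance between a predicted sample representation
    and its target modality aggregated sample representation."""
    assert len(predicted) == len(target)
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(predicted)

print(l2_distillation_loss([1.0, 2.0], [1.0, 4.0]))  # 2.0
```

Minimizing this loss pulls the model's output toward the aggregated representation, which is the guidance behavior described above.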
Step S33: based on the knowledge distillation loss and the contrastive learning loss corresponding to each initial global feature extraction model, optimize each initial global feature extraction model to obtain each target global feature extraction model.
In this embodiment, it should be noted that optimizing an initial global feature extraction model based on the contrastive learning loss drives the model, for public training samples of different data modalities that carry the same sample label, to output sample representations whose similarity is greater than a first preset similarity threshold, that is, representations that are as similar as possible, and, for public training samples of different data modalities that carry different sample labels, to output sample representations whose similarity is less than a second preset similarity threshold, that is, representations that are as dissimilar as possible. Optimization based on the contrastive learning loss therefore pulls the predicted sample representation output by the initial global feature extraction model closer to its corresponding positive sample representations and pushes it away from its corresponding negative sample representations, where the first preset similarity threshold is greater than the second preset similarity threshold.
Further, since the knowledge distillation loss is a cross-entropy loss or an L2 loss, it drives the predicted sample representation output by the initial global feature extraction model to be as close as possible to the target modality aggregated sample representation corresponding to that model, so that the similarity between the two is greater than a preset third similarity threshold. This achieves the purpose of driving the initial global feature extraction model to learn the model knowledge of the globally optimized local feature extraction models, for the corresponding data modality, on the participant devices.
Based on the knowledge distillation loss and the contrastive learning loss corresponding to each initial global feature extraction model, each initial global feature extraction model is optimized to obtain each target global feature extraction model. Specifically, the knowledge distillation loss and the contrastive learning loss corresponding to each initial global feature extraction model are aggregated to obtain the total global model loss corresponding to that model. It is then judged whether each initial global feature extraction model satisfies a preset training end condition. If so, each initial global feature extraction model is taken as the corresponding target global feature extraction model; if not, each initial global feature extraction model is updated based on the model gradient calculated from its total model loss, and execution returns to the step of distributing the initial global feature extraction model of each data modality to the participant devices of that data modality. The preset training end condition includes conditions such as the convergence of each total model loss, or each initial global feature extraction model reaching a preset iteration-count threshold. In this way, based on the federated public data set, data belonging to the same data modality are jointly used, through knowledge distillation, to construct the global model corresponding to each data modality, while contrastive learning simultaneously aligns, in the feature space, the sample representations generated by the global models for samples of different data modalities. This achieves the purpose of performing horizontal federated learning on samples of different data modalities, realizing the leap from horizontal federated learning based on samples of a single data modality to horizontal federated learning based on samples of multiple data modalities. It solves the data island problem among participant devices with different data modalities, further enriches the samples available for horizontal federated learning, and improves the effect of horizontal federated learning, so that the models built by horizontal federated learning achieve higher prediction accuracy.
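The outer optimization loop described above (aggregate the two losses for each model, stop on convergence or on an iteration-count threshold, otherwise update and repeat) can be sketched as follows. All names are hypothetical, and the model update is abstracted behind `step_fn`; this illustrates only the control flow, not the gradient computation.

```python
def train_until_done(models, step_fn, max_iters=100, tol=1e-4):
    """Optimize each model until its total loss converges or the preset
    iteration-count threshold is reached.

    models: dict name -> model state (opaque to this loop)
    step_fn(model) -> (updated_model, total_loss), where total_loss stands
    for the aggregated distillation loss plus contrastive loss.
    """
    prev = {name: float("inf") for name in models}
    for _ in range(max_iters):
        done = True
        for name, model in models.items():
            models[name], loss = step_fn(model)
            if abs(prev[name] - loss) > tol:  # this model has not converged
                done = False
            prev[name] = loss
        if done:  # every model's total loss has converged
            break
    return models

# Toy step: "model" is a scalar whose total loss decays geometrically.
models = {"image": 8.0, "text": 4.0}
trained = train_until_done(models, lambda m: (m * 0.5, m * 0.5))
```

In the application's setting, `step_fn` would also cover redistributing the updated models to the participant devices before the next round.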
Further, after step S30, the method also includes:
distributing each target global feature extraction model, according to its corresponding data modality, to the participant devices having that data modality, so that each participant device, based on its local private training samples, optimizes its local feature extraction model through contrastive learning training between the target global feature extraction model and the local feature extraction model, thereby obtaining a target local feature extraction model. For the specific implementation process by which the participant device optimizes the local feature extraction model in this way, reference may be made to steps A10 to A20 and their refinement steps, which will not be repeated here.
An embodiment of the present application provides a federated learning modeling optimization method. Compared with the technical means adopted in the prior art, which perform horizontal federated learning by aligning on the features of the participants themselves, this embodiment first distributes the initial global feature extraction model of each data modality to the participant devices of that data modality, so that each participant device, based on its local private training samples, optimizes its local feature extraction model through contrastive learning training between the initial global feature extraction model and the local feature extraction model. The local feature extraction model thereby learns the model knowledge of the global model, yielding a globally optimized local feature extraction model, and, based on this globally optimized local feature extraction model, feature extraction is performed on the corresponding target modality public samples in the federated public data set to obtain target modality public sample representations. The federated server receives the target modality public sample representations sent by the participant devices and performs selective, data-modality-based aggregation on them to obtain the target modality aggregated sample representation of each data modality, thereby achieving the purpose of separately aggregating the target modality public sample representations that belong to the same data modality. The server then obtains the public training samples of each data modality from the federated public data set and, based on these public training samples, performs knowledge distillation learning training guided by the target modality aggregated sample representations on each initial global feature extraction model, as well as contrastive learning training between the initial global feature extraction models, to obtain the target global feature extraction model corresponding to each initial global feature extraction model. Since every target modality public sample representation is output by a globally optimized local feature extraction model obtained through optimization on a participant device's local private training samples, each initial global feature extraction model can, through knowledge distillation, indirectly perform horizontal federated learning jointly over the samples of the multiple participants of the corresponding data modality; at the same time, the initial global feature extraction models of the data modalities can use contrastive learning to align samples of different data modalities in the feature space. This achieves the purpose of indirectly performing horizontal federated learning jointly over samples of different data modalities, so that horizontal federated learning is no longer limited to samples of the same data modality across different participants. It overcomes the technical defect in the prior art that horizontal federated learning can only be performed jointly over samples of the same data modality among different participants, which makes existing horizontal federated learning strongly limited, and therefore reduces the limitations of horizontal federated learning.
Further, referring to FIG. 2, in another embodiment of the present application, the federated learning modeling optimization method is applied to a participant device, and the federated learning modeling optimization method includes:
Step A10: receive the initial global feature extraction model issued by the federated server, and extract local private training samples;
In this embodiment, it should be noted that the initial global feature extraction model is the not-yet-optimized target global feature extraction model on the federated server, and the local private training samples are training sample data privately owned by the participant device, that is, private data of the participant device.
Step A20: based on the local private training samples, optimize the local feature extraction model through contrastive learning training between the initial global feature extraction model and the local feature extraction model, to obtain a globally optimized local feature extraction model;
In this embodiment, it should be noted that the local feature extraction model is a feature extraction model maintained locally by the participant device, and the number of local private training samples is at least 1.
Based on the local private training samples, the local feature extraction model is optimized through contrastive learning training between the initial global feature extraction model and the local feature extraction model, to obtain a globally optimized local feature extraction model. Specifically, the initial global feature extraction model is used to perform feature extraction on all local private training samples to obtain the first sample representations, and the local feature extraction model is used to perform feature extraction on all local private training samples to obtain the second sample representations. A contrastive learning loss is then calculated based on the similarity between each first sample representation and each second sample representation, and the local feature extraction model is optimized based on this contrastive learning loss to obtain the globally optimized local feature extraction model.
其中,所述基于所述本地私有训练样本,通过在所述初始全局特征提取模型和本地特征提取模型之间进行对比学习训练,优化所述本地特征提取模型,获得全局优化后的本地特征提取模型的步骤包括:Wherein, based on the local private training samples, by performing comparative learning and training between the initial global feature extraction model and the local feature extraction model, optimizing the local feature extraction model, and obtaining a globally optimized local feature extraction model The steps include:
步骤A21,通过所述初始全局特征提取模型将所有所述本地私有训练样本映射为第一样本表征,以及通过所述本地特征提取模型将所有所述本地私有训练样本映射为第二样本表征;Step A21, mapping all the local private training samples to a first sample representation through the initial global feature extraction model, and mapping all the local private training samples to a second sample representation through the local feature extraction model;
在本实施例中，通过所述初始全局特征提取模型对所有本地私有训练样本进行特征提取，将所有本地私有训练样本分别映射为对应的第一样本表征，以及通过所述本地特征提取模型对所有本地私有训练样本进行特征提取，将所有本地私有训练样本分别映射为对应的第二样本表征。In this embodiment, the initial global feature extraction model performs feature extraction on all local private training samples, mapping each local private training sample to its corresponding first sample representation, and the local feature extraction model performs feature extraction on all local private training samples, mapping each local private training sample to its corresponding second sample representation.
步骤A22,基于各所述第一样本表征和各所述第二样本表征之间的相似度,计算对比学习损失;Step A22, calculating a contrastive learning loss based on the similarity between each of the first sample representations and each of the second sample representations;
在本实施例中,基于各所述第一样本表征和各所述第二样本表征之间的相似度,计算对比学习损失,具体地,对于每一第二样本表征均执行以下步骤:In this embodiment, based on the similarity between each of the first sample representations and each of the second sample representations, the contrastive learning loss is calculated. Specifically, the following steps are performed for each second sample representation:
在各所述第一样本表征中确定与所述第二样本表征对应同一本地私有训练样本的目标样本表征，进而将所述目标样本表征作为所述第二样本表征对应的本地正样本表征，进而除所述目标样本表征之外的其他第一样本表征均作为所述第二样本表征对应的本地负样本表征，进而基于每一第二样本表征对应的本地正样本表征以及对应的本地负样本表征，计算对比学习损失。A target sample representation corresponding to the same local private training sample as the second sample representation is determined among the first sample representations and taken as the local positive sample representation corresponding to the second sample representation; the remaining first sample representations other than the target sample representation are taken as the local negative sample representations corresponding to the second sample representation. The contrastive learning loss is then calculated based on the local positive sample representation and the local negative sample representations corresponding to each second sample representation.
其中,所述基于各所述第一样本表征和各所述第二样本表征之间的相似度,计算对比学习损失的步骤包括:Wherein, the step of calculating the contrastive learning loss based on the similarity between each of the first sample representations and each of the second sample representations includes:
步骤A221,将各所述第一样本表征中与所述第二样本表征对应同一所述本地私有训练样本的样本表征作为所述第二样本表征对应的本地正样本表征;Step A221, taking the sample representation corresponding to the same local private training sample as the second sample representation in each of the first sample representations as the local positive sample representation corresponding to the second sample representation;
步骤A222，将各所述第一样本表征中不与所述第二样本表征对应同一所述本地私有训练样本的样本表征作为所述第二样本表征对应的本地负样本表征；Step A222, taking the sample representations among the first sample representations that do not correspond to the same local private training sample as the second sample representation as the local negative sample representations corresponding to the second sample representation;
步骤A223，基于各所述第二样本表征与各所述第二样本表征对应的本地正样本表征之间的相似度，以及各所述第二样本表征与各所述第二样本表征对应的本地负样本表征之间的相似度，计算所述对比学习损失。Step A223, calculating the contrastive learning loss based on the similarity between each second sample representation and its corresponding local positive sample representation, and the similarity between each second sample representation and its corresponding local negative sample representations.
在本实施例中，基于每一所述第二样本表征与对应的本地正样本表征之间的相似度，以及每一所述第二样本表征与对应的本地负样本表征之间的相似度，计算每一所述第二样本表征对应的单个样本对比学习损失，进而将各单个样本对比学习损失进行累加，得到所述对比学习损失，其中，计算所述对比学习损失的具体公式如下：In this embodiment, based on the similarity between each second sample representation and its corresponding local positive sample representation, and the similarity between each second sample representation and its corresponding local negative sample representations, a single-sample contrastive learning loss is calculated for each second sample representation; the single-sample losses are then accumulated to obtain the contrastive learning loss, where the specific formula for calculating the contrastive learning loss is as follows:
L_N = -log( exp(f(x)^T f(x^+)) / ( exp(f(x)^T f(x^+)) + Σ_{j=1}^{N-1} exp(f(x)^T f(x_j^-)) ) )

其中，L_N为所述对比学习损失，N-1为所述本地负样本表征的数量，f(x)^T为所述第二样本表征（的转置），f(x^+)为所述第二样本表征对应的本地正样本表征，f(x_j^-)为所述第二样本表征对应的第j个本地负样本表征。Here, L_N is the contrastive learning loss, N-1 is the number of local negative sample representations, f(x)^T is (the transpose of) the second sample representation, f(x^+) is the local positive sample representation corresponding to the second sample representation, and f(x_j^-) is the j-th local negative sample representation corresponding to the second sample representation.
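As an illustrative aid (not part of the application), the per-sample loss above can be sketched in Python; the function name and the use of NumPy arrays are assumptions, and row i of both representation matrices is assumed to come from the same local private training sample, so the diagonal pairs are the positives and all other first-representation rows are the N-1 negatives:

```python
import numpy as np

def contrastive_loss(first_reps, second_reps):
    """Illustrative InfoNCE-style loss matching the formula above.

    first_reps:  (N, d) outputs of the initial global model (f(x+), f(x_j^-)).
    second_reps: (N, d) outputs of the local model (f(x)).
    """
    logits = second_reps @ first_reps.T            # f(x)^T f(.) for every pair
    exp_logits = np.exp(logits)
    pos = np.diag(exp_logits)                      # exp(f(x)^T f(x+))
    per_sample = -np.log(pos / exp_logits.sum(axis=1))
    return per_sample.sum()                        # accumulate over samples
```

Minimizing this loss pulls each second sample representation toward the first sample representation of the same training sample and pushes it away from the others, which is the stated goal of making the two models' outputs for the same sample as close as possible.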
步骤A23,基于所述对比学习损失,优化所述本地特征提取模型,得到全局优化后的本地特征提取模型。Step A23: Optimizing the local feature extraction model based on the comparative learning loss to obtain a globally optimized local feature extraction model.
在本实施例中，基于所述对比学习损失，优化所述本地特征提取模型，得到全局优化后的本地特征提取模型，具体地，判断所述对比学习损失是否收敛，若收敛，则将所述本地特征提取模型作为全局优化后的本地特征提取模型，若未收敛，则基于所述对比学习损失计算的模型梯度，更新所述本地特征提取模型，并返回执行步骤：提取本地私有训练样本，实现了促使本地特征提取模型学习初始全局特征提取模型的模型知识的目的，使得本地特征提取模型与初始全局特征提取模型对于同一样本的输出尽可能的相近，实现了对应本地特征提取的全局优化。In this embodiment, the local feature extraction model is optimized based on the contrastive learning loss to obtain a globally optimized local feature extraction model. Specifically, it is judged whether the contrastive learning loss converges; if it converges, the local feature extraction model is taken as the globally optimized local feature extraction model; if it does not converge, the local feature extraction model is updated based on the model gradient calculated from the contrastive learning loss, and execution returns to the step of extracting local private training samples. This drives the local feature extraction model to learn the model knowledge of the initial global feature extraction model, so that the outputs of the local feature extraction model and the initial global feature extraction model for the same sample are as close as possible, realizing the global optimization of the local feature extraction.
其中,所述基于所述对比学习损失,优化所述本地特征提取模型,得到全局优化后的本地特征提取模型的步骤包括:Wherein, the step of optimizing the local feature extraction model based on the comparative learning loss to obtain a globally optimized local feature extraction model includes:
步骤A231,通过预设分类模型,将各所述第二样本表征分别转换为各所述本地私有训练样本对应的输出分类标签;Step A231, converting each of the second sample representations into output classification labels corresponding to each of the local private training samples through a preset classification model;
在本实施例中，需要说明的是，所述本地私有训练样本具备对应的预设真实标签，其中，所述预设真实标签为所述本地私有训练样本的标识，可用于表示本地私有训练样本的类别、属性以及身份等信息。In this embodiment, it should be noted that the local private training sample has a corresponding preset real label, where the preset real label is the identifier of the local private training sample and can be used to represent information such as the category, attributes and identity of the local private training sample.
通过预设分类模型，将各所述第二样本表征分别转换为各所述本地私有训练样本对应的输出分类标签，具体地，将各所述第二样本表征输入预设分类模型，分别对各所述第二样本表征进行全连接，获得各所述第二样本表征对应的全连接向量，进而基于预设激活函数，分别将各所述全连接向量分别转换为各所述本地私有训练样本对应的输出分类标签。Each second sample representation is converted, through the preset classification model, into the output classification label corresponding to each local private training sample. Specifically, each second sample representation is input into the preset classification model and passed through a fully-connected layer to obtain the fully-connected vector corresponding to that second sample representation; based on a preset activation function, each fully-connected vector is then converted into the output classification label corresponding to the respective local private training sample.
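The fully-connected layer plus activation of step A231 can be sketched as follows; this is a minimal illustration only, and the softmax activation, the weight shapes and the function name are assumptions rather than details taken from the application:

```python
import numpy as np

def classify(second_reps, weights, bias):
    """Sketch of step A231: a fully-connected layer over each second sample
    representation, followed by an activation (softmax assumed here) that
    yields the output classification label as class probabilities."""
    fc = second_reps @ weights + bias                 # fully-connected vectors
    exp = np.exp(fc - fc.max(axis=1, keepdims=True))  # numerically stable softmax
    return exp / exp.sum(axis=1, keepdims=True)       # one probability row per sample
```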
步骤A232,基于各所述输出分类标签和各所述本地私有训练样本对应的预设真实标签,计算分类损失;Step A232, calculating a classification loss based on each of the output classification labels and the preset real labels corresponding to each of the local private training samples;
在本实施例中，基于各所述输出分类标签和各所述本地私有训练样本对应的预设真实标签，计算分类损失，具体地，计算每一所述输出分类标签与对应的本地私有训练样本对应的预设真实标签之间的交叉熵损失，进而将各所述交叉熵损失进行累加，获得分类损失。In this embodiment, the classification loss is calculated based on each output classification label and the preset real label corresponding to each local private training sample. Specifically, the cross-entropy loss between each output classification label and the preset real label of the corresponding local private training sample is calculated, and the cross-entropy losses are accumulated to obtain the classification loss.
在另一种实施方式中,步骤A232包括:计算每一所述输出分类标签与对应的本地私有训练样本对应的预设真实标签之间的L2损失,进而将各L2损失进行累加,获得分类损失。In another implementation, step A232 includes: calculating the L2 loss between each of the output classification labels and the preset real label corresponding to the corresponding local private training sample, and then accumulating the L2 losses to obtain the classification loss .
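The two variants of step A232 (accumulated cross-entropy, or the alternative accumulated L2 loss) can be sketched together; one-hot true labels, the `kind` switch and the function name are assumptions made for illustration:

```python
import numpy as np

def classification_loss(pred_labels, true_labels, kind="cross_entropy"):
    """Accumulate a per-sample loss between output classification labels and
    preset real labels. pred_labels: (N, C) class probabilities;
    true_labels: (N, C) one-hot. kind selects cross-entropy or the L2 variant."""
    if kind == "cross_entropy":
        per_sample = -(true_labels * np.log(pred_labels + 1e-12)).sum(axis=1)
    else:  # L2 variant of the alternative embodiment
        per_sample = ((pred_labels - true_labels) ** 2).sum(axis=1)
    return per_sample.sum()  # accumulate over all local private training samples
```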
步骤A233,基于所述对比学习损失和所述分类损失,计算模型总损失;Step A233, calculating the total model loss based on the contrastive learning loss and the classification loss;
在本实施例中,基于预设聚合规则,将所述对比学习损失与所述分类损失进行聚合,得到模型总损失,其中,所述预设聚合规则包括求和以及求平均等。In this embodiment, based on a preset aggregation rule, the comparison learning loss and the classification loss are aggregated to obtain a total model loss, wherein the preset aggregation rule includes summing and averaging.
步骤A234,基于所述模型总损失,优化所述本地特征提取模型,得到所述全局优化后的本地特征提取模型。Step A234: Optimizing the local feature extraction model based on the total model loss to obtain the globally optimized local feature extraction model.
在本实施例中，具体地，判断所述模型总损失是否收敛，若收敛，则将所述本地特征提取模型作为全局优化后的本地特征提取模型，若未收敛，则基于所述模型总损失计算的模型梯度，更新所述本地特征提取模型，并返回执行步骤：提取本地私有训练样本，实现了促使本地特征提取模型学习初始全局特征提取模型的模型知识的目的，使得本地特征提取模型与初始全局特征提取模型对于同一样本的输出尽可能的相近，实现了对应本地特征提取的全局优化，同时还使得本地特征提取模型的输出尽可能的与预设真实标签相近，提升了本地特征提取模型的准确度。In this embodiment, specifically, it is judged whether the total model loss converges; if it converges, the local feature extraction model is taken as the globally optimized local feature extraction model; if it does not converge, the local feature extraction model is updated based on the model gradient calculated from the total model loss, and execution returns to the step of extracting local private training samples. This drives the local feature extraction model to learn the model knowledge of the initial global feature extraction model, so that the outputs of the local feature extraction model and the initial global feature extraction model for the same sample are as close as possible, realizing the global optimization of the local feature extraction; it also makes the output of the local feature extraction model as close as possible to the preset real label, improving the accuracy of the local feature extraction model.
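The optimization loop of steps A233–A234 (aggregate the total loss, test convergence, otherwise update on the gradient) can be sketched generically; plain gradient descent, the learning rate and the tolerance-based convergence test are assumptions, and `loss_fn`/`grad_fn` stand in for the total model loss (contrastive plus classification, sum aggregation assumed) and its gradient:

```python
import numpy as np

def optimize_local_model(params, grad_fn, loss_fn, lr=0.1, tol=1e-4, max_iter=200):
    """Sketch of steps A233-A234: compute the total model loss, stop when it
    has converged (change below tol), otherwise update the local feature
    extraction model parameters with a gradient step."""
    prev = np.inf
    for _ in range(max_iter):
        total = loss_fn(params)              # total loss on private samples
        if abs(prev - total) < tol:          # convergence test
            break
        params = params - lr * grad_fn(params)  # update the local model
        prev = total
    return params
```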
步骤A30,在联邦公有数据集中提取属于所述全局优化后的本地特征提取模型对应的数据模态的目标模态公有样本,并基于所述全局优化后的本地特征提取模型,对所述目标模态公有样本进行特征提取,得到目标模态公有样本表征;Step A30, extracting target modal public samples belonging to the data modal corresponding to the globally optimized local feature extraction model from the federal public data set, and based on the globally optimized local feature extraction model, extracting the target modal Feature extraction is performed on the public samples of the target modal to obtain the representation of the public sample of the target modal;
在本实施例中，具体地，在联邦公有数据集中提取属于所述全局优化后的本地特征提取模型对应的数据模态的目标模态公有样本，也即在联邦公有数据集中提取属于所述参与方设备对应的数据模态的目标模态公有样本，进而利用所述全局优化后的本地特征提取模型对所有目标模态公有样本进行特征提取，得到所有目标模态公有样本的目标模态公有样本表征。In this embodiment, specifically, the target modality public samples belonging to the data modality corresponding to the globally optimized local feature extraction model are extracted from the federated public data set, that is, the target modality public samples of the data modality corresponding to the participant device are extracted from the federated public data set; the globally optimized local feature extraction model then performs feature extraction on all target modality public samples to obtain the target modality public sample representations of all target modality public samples.
步骤A40，将所述目标模态公有样本表征发送至联邦服务器，以供所述联邦服务器对各所述目标模态公有样本表征进行基于数据模态的选择性聚合，获得各所述数据模态对应的目标模态聚合样本表征，并在所述联邦公有数据集中获取各所述数据模态对应的公有训练样本，基于各所述公有训练样本，分别对各所述初始全局特征提取模型进行基于各所述目标模态聚合样本表征的知识蒸馏学习训练，以及在各所述初始全局特征提取模型之间进行对比学习训练，获得各所述初始全局特征提取模型对应的目标全局特征提取模型。Step A40, sending the target modality public sample representations to the federated server, so that the federated server performs data-modality-based selective aggregation on the target modality public sample representations to obtain the target modality aggregated sample representation corresponding to each data modality, obtains the public training samples corresponding to each data modality from the federated public data set, and, based on the public training samples, performs knowledge distillation learning training on each initial global feature extraction model based on the corresponding target modality aggregated sample representation, as well as contrastive learning training between the initial global feature extraction models, to obtain the target global feature extraction model corresponding to each initial global feature extraction model.
在本实施例中，需要说明的是，所述联邦服务器对各所述目标模态公有样本表征进行基于数据模态的选择性聚合，获得各所述数据模态对应的目标模态聚合样本表征，并在所述联邦公有数据集中获取各所述数据模态对应的公有训练样本，基于各所述公有训练样本，分别对各所述初始全局特征提取模型进行基于各所述目标模态聚合样本表征的知识蒸馏学习训练，以及在各所述初始全局特征提取模型之间进行对比学习训练，获得各所述初始全局特征提取模型对应的目标全局特征提取模型的具体实现过程可参照步骤S10至步骤S30中的具体内容，在此不再赘述。In this embodiment, it should be noted that, for the specific implementation process in which the federated server performs data-modality-based selective aggregation on the target modality public sample representations to obtain the target modality aggregated sample representation corresponding to each data modality, obtains the public training samples corresponding to each data modality from the federated public data set, and, based on the public training samples, performs knowledge distillation learning training on each initial global feature extraction model based on the corresponding target modality aggregated sample representation as well as contrastive learning training between the initial global feature extraction models to obtain the target global feature extraction model corresponding to each initial global feature extraction model, reference may be made to the specific content of steps S10 to S30, which will not be repeated here.
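The server-side selective aggregation referenced above can be sketched as follows. Grouping representations by the data modality of the client that produced them follows the application; mean aggregation, the dictionary layout and the function name are assumptions made for illustration:

```python
import numpy as np

def selective_aggregate(client_reps, client_modality):
    """Sketch of data-modality-based selective aggregation: public sample
    representations are grouped by each client's data modality and averaged
    within each group, yielding one aggregated representation per modality."""
    by_modality = {}
    for client, reps in client_reps.items():
        by_modality.setdefault(client_modality[client], []).append(reps)
    return {m: np.mean(np.stack(rep_list), axis=0)
            for m, rep_list in by_modality.items()}
```

Each per-modality aggregate then serves as the distillation target for the initial global feature extraction model of that modality.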
进一步地，接收联邦服务器下发的目标全局特征提取模型，并提取本地私有训练样本，进而基于所述本地私有训练样本，通过在所述目标全局特征提取模型和本地特征提取模型之间进行对比学习训练，优化所述本地特征提取模型，以供所述本地特征提取模型学习目标全局特征提取模型的模型知识，进而得到目标本地特征提取模型，其中，得到目标本地特征提取模型的具体实现过程与得到全局优化后的本地特征提取模型的具体实现过程相同，具体可参照步骤A10至步骤A20中的具体内容，在此不再赘述。Further, the target global feature extraction model issued by the federated server is received and local private training samples are extracted; based on the local private training samples, the local feature extraction model is optimized through contrastive learning training between the target global feature extraction model and the local feature extraction model, so that the local feature extraction model learns the model knowledge of the target global feature extraction model, thereby obtaining the target local feature extraction model. The specific implementation process of obtaining the target local feature extraction model is the same as that of obtaining the globally optimized local feature extraction model; for details, reference may be made to the specific content of steps A10 to A20, which will not be repeated here.
进一步地，如图3所示为本申请联邦学习优化方法中进行横向联邦学习建模时的交互流程示意图，其中，server为联邦服务器，Client为参与方设备，N为参与方设备的数量，base 1和head 1为组成Client 1中的本地特征提取模型，base N和head N为组成Client N中的本地特征提取模型，classfier为所述预设分类模型，base g.a和Head g.a为组成Client 1中的初始全局特征提取模型，也即为Model g.a，base g.b和Head g.b为组成Client N中的初始全局特征提取模型，也即为Model g.b，Y 1和Y N为预设真实标签，X 1为Client 1中的本地私有训练样本，X N为Client N中的本地私有训练样本，X pub.a为Client 1中的目标模态公有样本，X pub.b为Client N中的目标模态公有样本，Z agg.a为所述第一公有训练样本对应的预测样本表征，Z agg.b为所述第二公有训练样本对应的预测样本表征，cross-entropy loss为所述交叉熵损失，Contrastive loss为所述对比学习损失，此时联邦服务器直接将各所述参与方设备在联邦公有数据集中选取的目标模态公有样本作为公有训练样本。Further, FIG. 3 is a schematic diagram of the interaction flow when performing horizontal federated learning modeling in the federated learning optimization method of the present application, where server is the federated server, Client is a participant device, N is the number of participant devices, base 1 and head 1 form the local feature extraction model in Client 1, base N and head N form the local feature extraction model in Client N, classfier is the preset classification model, base g.a and Head g.a form the initial global feature extraction model in Client 1, namely Model g.a, base g.b and Head g.b form the initial global feature extraction model in Client N, namely Model g.b, Y 1 and Y N are preset real labels, X 1 is a local private training sample in Client 1, X N is a local private training sample in Client N, X pub.a is a target modality public sample in Client 1, X pub.b is a target modality public sample in Client N, Z agg.a is the predicted sample representation corresponding to the first public training sample, Z agg.b is the predicted sample representation corresponding to the second public training sample, cross-entropy loss is the cross-entropy loss, and Contrastive loss is the contrastive learning loss; here the federated server directly takes the target modality public samples selected by each participant device from the federated public data set as the public training samples.
本申请实施例提供了一种联邦学习建模优化方法，也即，首先接收联邦服务器下发的初始全局特征提取模型，并提取本地私有训练样本，进而基于所述本地私有训练样本，通过在所述初始全局特征提取模型和本地特征提取模型之间进行对比学习训练，优化所述本地特征提取模型，获得全局优化后的本地特征提取模型，实现了促使本地特征提取模型学习联邦服务器下发的全局模型的模型知识的目的，实现了对本地特征提取模型的全局优化，进而在联邦公有数据集中提取属于所述全局优化后的本地特征提取模型对应的数据模态的目标模态公有样本，并基于所述全局优化后的本地特征提取模型，对所述目标模态公有样本进行特征提取，得到目标模态公有样本表征，进而将所述目标模态公有样本表征发送至联邦服务器，以供所述联邦服务器对各所述目标模态公有样本表征进行基于数据模态的选择性聚合，获得各所述数据模态对应的目标模态聚合样本表征，并在所述联邦公有数据集中获取各所述数据模态对应的公有训练样本，基于各所述公有训练样本，分别对各所述初始全局特征提取模型进行基于各所述目标模态聚合样本表征的知识蒸馏学习训练，以及在各所述初始全局特征提取模型之间进行对比学习训练，获得各所述初始全局特征提取模型对应的目标全局特征提取模型，而由于每一目标模态公有样本表征均为基于参与方设备的本地私有训练样本进行优化得到的全局优化后的本地特征提取模型输出的，进而使得每一初始全局特征提取模型均可以通过知识蒸馏间接联合对应数据模态的多个参与方的样本进行横向联邦学习，且同时各数据模态对应的初始全局特征提取模型可利用对比学习将不同数据模态的样本在特征空间上进行对齐，进而实现了间接联合不同数据模态的样本进行横向联邦学习的目的，使得横向联邦学习不再局限于不同参与方同一数据模态的样本之间进行，克服了现有技术中由于现有的横向联邦学习只能联合不同参与方中同一数据模态的样本进行，而导致现有的横向联邦学习的局限性较强的技术缺陷，所以，降低了横向联邦学习的局限性。The embodiment of the present application provides a federated learning modeling optimization method. First, the initial global feature extraction model issued by the federated server is received and local private training samples are extracted; based on the local private training samples, the local feature extraction model is optimized through contrastive learning training between the initial global feature extraction model and the local feature extraction model, obtaining a globally optimized local feature extraction model. This drives the local feature extraction model to learn the model knowledge of the global model issued by the federated server, realizing the global optimization of the local feature extraction model. Then, the target modality public samples belonging to the data modality corresponding to the globally optimized local feature extraction model are extracted from the federated public data set, and feature extraction is performed on the target modality public samples based on the globally optimized local feature extraction model to obtain the target modality public sample representations, which are sent to the federated server, so that the federated server performs data-modality-based selective aggregation on the target modality public sample representations to obtain the target modality aggregated sample representation corresponding to each data modality, obtains the public training samples corresponding to each data modality from the federated public data set, and, based on the public training samples, performs knowledge distillation learning training on each initial global feature extraction model based on the corresponding target modality aggregated sample representation as well as contrastive learning training between the initial global feature extraction models, to obtain the target global feature extraction model corresponding to each initial global feature extraction model. Since each target modality public sample representation is output by a globally optimized local feature extraction model obtained by optimization on the local private training samples of a participant device, each initial global feature extraction model can, through knowledge distillation, indirectly perform horizontal federated learning jointly over the samples of the multiple participants of the corresponding data modality; meanwhile, the initial global feature extraction models corresponding to the respective data modalities can use contrastive learning to align samples of different data modalities in the feature space, thereby achieving the purpose of indirectly performing horizontal federated learning jointly over samples of different data modalities. Horizontal federated learning is thus no longer limited to samples of the same data modality across different participants, which overcomes the technical defect in the prior art that existing horizontal federated learning can only be performed jointly over samples of the same data modality in different participants and is therefore strongly limited; accordingly, the limitations of horizontal federated learning are reduced.
参照图4,图4是本申请实施例方案涉及的硬件运行环境的设备结构示意图。Referring to FIG. 4 , FIG. 4 is a schematic diagram of a device structure of a hardware operating environment involved in the solution of the embodiment of the present application.
如图4所示，该联邦学习建模优化设备可以包括：处理器1001，例如CPU，存储器1005，通信总线1002。其中，通信总线1002用于实现处理器1001和存储器1005之间的连接通信。存储器1005可以是高速RAM存储器，也可以是稳定的存储器（non-volatile memory），例如磁盘存储器。存储器1005可选的还可以是独立于前述处理器1001的存储设备。As shown in FIG. 4, the federated learning modeling optimization device may include: a processor 1001, such as a CPU, a memory 1005, and a communication bus 1002. The communication bus 1002 is used to realize connection and communication between the processor 1001 and the memory 1005. The memory 1005 may be a high-speed RAM memory or a non-volatile memory, such as a disk memory. Optionally, the memory 1005 may also be a storage device independent of the aforementioned processor 1001.
可选地,该联邦学习建模优化设备还可以包括矩形用户接口、网络接口、摄像头、RF(Radio Frequency,射频)电路,传感器、音频电路、WiFi模块等等。矩形用户接口可以包括显示屏(Display)、输入子模块比如键盘(Keyboard),可选矩形用户接口还可以包括标准的有线接口、无线接口。网络接口可选的可以包括标准的有线接口、无线接口(如WI-FI接口)。Optionally, the federated learning modeling optimization device may also include a rectangular user interface, a network interface, a camera, an RF (Radio Frequency, radio frequency) circuit, a sensor, an audio circuit, a WiFi module, and the like. The rectangular user interface may include a display screen (Display), an input sub-module such as a keyboard (Keyboard), and the optional rectangular user interface may also include a standard wired interface and a wireless interface. Optionally, the network interface may include a standard wired interface and a wireless interface (such as a WI-FI interface).
本领域技术人员可以理解,图4中示出的联邦学习建模优化设备结构并不构成对联邦学习建模优化设备的限定,可以包括比图示更多或更少的部件,或者组合某些部件,或者不同的部件布置。Those skilled in the art can understand that the federated learning modeling and optimization device structure shown in Figure 4 does not constitute a limitation on the federated learning modeling and optimization device, and may include more or less components than those shown in the illustration, or combine some components, or different component arrangements.
如图4所示，作为一种计算机存储介质的存储器1005中可以包括操作系统、网络通信模块以及联邦学习建模优化程序。操作系统是管理和控制联邦学习建模优化设备硬件和软件资源的程序，支持联邦学习建模优化程序以及其它软件和/或程序的运行。网络通信模块用于实现存储器1005内部各组件之间的通信，以及与联邦学习建模优化系统中其它硬件和软件之间通信。As shown in FIG. 4, the memory 1005, as a computer storage medium, may include an operating system, a network communication module, and a federated learning modeling optimization program. The operating system is a program that manages and controls the hardware and software resources of the federated learning modeling optimization device, and supports the operation of the federated learning modeling optimization program and other software and/or programs. The network communication module is used to realize communication between the components inside the memory 1005, and communication with other hardware and software in the federated learning modeling optimization system.
在图4所示的联邦学习建模优化设备中,处理器1001用于执行存储器1005中存储的联邦学习建模优化程序,实现上述任一项所述的联邦学习建模优化方法的步骤。In the federated learning modeling optimization device shown in FIG. 4 , the processor 1001 is configured to execute the federated learning modeling optimization program stored in the memory 1005 to implement the steps of the federated learning modeling optimization method described in any one of the above.
本申请联邦学习建模优化设备具体实施方式与上述联邦学习建模优化方法各实施例基本相同,在此不再赘述。The specific implementation manners of the federated learning modeling optimization device of the present application are basically the same as the embodiments of the above federated learning modeling optimization method, and will not be repeated here.
本申请实施例还提供一种联邦学习建模优化装置,所述联邦学习建模优化装置应用于联邦服务器,所述联邦学习建模优化装置包括:The embodiment of the present application also provides a federated learning modeling optimization device, the federated learning modeling optimization device is applied to a federated server, and the federated learning modeling optimization device includes:
模型分发模块，用于将各数据模态对应的初始全局特征提取模型分发至各所述数据模态对应的参与方设备，以供所述参与方设备基于本地私有训练样本，通过在所述初始全局特征提取模型和本地特征提取模型之间进行对比学习训练，优化所述本地特征提取模型，获得全局优化后的本地特征提取模型，并基于所述全局优化后的本地特征提取模型，对联邦公有数据集中对应的目标模态公有样本进行特征提取，得到目标模态公有样本表征；The model distribution module is configured to distribute the initial global feature extraction model corresponding to each data modality to the participant devices corresponding to that data modality, so that each participant device, based on its local private training samples, optimizes its local feature extraction model through contrastive learning training between the initial global feature extraction model and the local feature extraction model to obtain a globally optimized local feature extraction model, and, based on the globally optimized local feature extraction model, performs feature extraction on the corresponding target modality public samples in the federated public data set to obtain target modality public sample representations;
选择性聚合模块，用于接收各所述参与方设备发送的目标模态公有样本表征，并对各所述目标模态公有样本表征进行基于数据模态的选择性聚合，获得各所述数据模态对应的目标模态聚合样本表征；The selective aggregation module is configured to receive the target modality public sample representations sent by each participant device, and perform data-modality-based selective aggregation on the target modality public sample representations to obtain the target modality aggregated sample representation corresponding to each data modality;
训练模块，用于在所述联邦公有数据集中获取各所述数据模态对应的公有训练样本，并基于各所述公有训练样本，分别对各所述初始全局特征提取模型进行基于各所述目标模态聚合样本表征的知识蒸馏学习训练，以及在各所述初始全局特征提取模型之间进行对比学习训练，获得各所述初始全局特征提取模型对应的目标全局特征提取模型。The training module is configured to obtain the public training samples corresponding to each data modality from the federated public data set, and, based on the public training samples, perform knowledge distillation learning training on each initial global feature extraction model based on the corresponding target modality aggregated sample representation, as well as contrastive learning training between the initial global feature extraction models, to obtain the target global feature extraction model corresponding to each initial global feature extraction model.
可选地,所述训练模块还用于:Optionally, the training module is also used for:
将各所述数据模态对应的公有训练样本通过对应的初始全局特征提取模型分别映射为预测样本表征;Mapping the public training samples corresponding to each of the data modalities into predicted sample representations through corresponding initial global feature extraction models;
计算各所述预测样本表征与对应的目标模态聚合样本表征之间的知识蒸馏损失,以及计算各所述预测样本表征之间的对比学习损失;calculating a knowledge distillation loss between each of the predicted sample representations and the corresponding target modal aggregation sample representation, and calculating a comparative learning loss between each of the predicted sample representations;
基于各所述初始全局特征提取模型对应的知识蒸馏损失以及对应的对比学习损失,优化各所述初始全局特征提取模型,得到各目标全局特征提取模型。Based on the knowledge distillation loss corresponding to each of the initial global feature extraction models and the corresponding comparative learning loss, each of the initial global feature extraction models is optimized to obtain each target global feature extraction model.
可选地,所述训练模块还用于:Optionally, the training module is also used for:
基于各所述公有训练样本对应的样本标签，在各所述预测样本表征中分别选取各所述预测样本表征对应的正样本表征和对应的负样本表征；Based on the sample labels corresponding to the public training samples, a corresponding positive sample representation and corresponding negative sample representations are selected for each predicted sample representation from among the predicted sample representations;
基于各所述预测样本表征与各所述预测样本表征对应的正样本表征以及对应的负样本表征,计算各所述初始全局特征提取模型对应的对比学习损失;Based on each of the predicted sample representations and each of the predicted sample representations corresponding to the positive sample representation and the corresponding negative sample representation, calculate the comparative learning loss corresponding to each of the initial global feature extraction models;
基于各所述预测样本表征与对应的目标模态聚合样本表征之间的相似度,分别计算各所述初始全局特征提取模型对应的知识蒸馏损失。Based on the similarity between each of the prediction sample representations and the corresponding target modal aggregation sample representations, the knowledge distillation losses corresponding to each of the initial global feature extraction models are calculated respectively.
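The similarity-based knowledge distillation loss of the training module can be sketched as follows; the application only requires a loss computed from the similarity between each predicted sample representation and the corresponding target modality aggregated sample representation, so the use of cosine similarity and the function name here are assumptions:

```python
import numpy as np

def distillation_loss(pred_reps, target_agg_reps):
    """Sketch of the knowledge-distillation loss: penalise dissimilarity
    between each predicted sample representation and the corresponding
    target-modality aggregated representation (cosine similarity assumed)."""
    cos = (pred_reps * target_agg_reps).sum(axis=1) / (
        np.linalg.norm(pred_reps, axis=1) * np.linalg.norm(target_agg_reps, axis=1))
    return float((1.0 - cos).sum())  # zero when the representations align
```

Minimizing this loss pulls the initial global feature extraction model's outputs toward the aggregated representations produced by the participants' locally optimized models, which is the distillation step the training module performs.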
可选地,所述选择性聚合模块还用于:Optionally, the selective aggregation module is also used for:
基于各所述参与方设备与各所述数据模态之间的对应关系,在各所述目标模态公有样本表征中确定各所述数据模态分别对应的各待聚合样本表征;Based on the corresponding relationship between each of the participant devices and each of the data modalities, determine the respective sample representations to be aggregated corresponding to each of the data modalities in the public sample representations of each of the target modalities;
将各所述数据模态分别对应的各待聚合样本表征分别进行聚合,获得各所述数据模态对应的目标模态聚合样本表征。Aggregating the respective sample representations to be aggregated corresponding to the respective data modalities to obtain the aggregated sample representations of target modalities corresponding to the respective data modalities.
本申请联邦学习建模优化装置的具体实施方式与上述联邦学习建模优化方法各实施例基本相同,在此不再赘述。The specific implementation of the federated learning modeling optimization device of the present application is basically the same as the above embodiments of the federated learning modeling optimization method, and will not be repeated here.
An embodiment of the present application further provides a federated learning modeling optimization apparatus, applied to a participant device, the apparatus comprising:
a receiving module, configured to receive the initial global feature extraction model delivered by the federated server, and to obtain local private training samples;
a contrastive learning training module, configured to optimize the local feature extraction model by performing contrastive learning training between the initial global feature extraction model and the local feature extraction model based on the local private training samples, to obtain a globally optimized local feature extraction model;
a feature extraction module, configured to extract, from a federated public dataset, target-modality public samples belonging to the data modality corresponding to the globally optimized local feature extraction model, and to perform feature extraction on the target-modality public samples based on the globally optimized local feature extraction model, to obtain target-modality public sample representations;
a sending module, configured to send the target-modality public sample representations to the federated server, so that the federated server performs data-modality-based selective aggregation on the target-modality public sample representations to obtain the target-modality aggregated sample representation corresponding to each data modality, obtains the public training samples corresponding to each data modality from the federated public dataset, and, based on the public training samples, performs knowledge distillation training of each initial global feature extraction model against the corresponding target-modality aggregated sample representations, as well as contrastive learning training between the initial global feature extraction models, to obtain the target global feature extraction model corresponding to each initial global feature extraction model.
Optionally, the contrastive learning training module is further configured to:
map all the local private training samples to first sample representations through the initial global feature extraction model, and map all the local private training samples to second sample representations through the local feature extraction model;
calculate a contrastive learning loss based on the similarities between the first sample representations and the second sample representations;
optimize the local feature extraction model based on the contrastive learning loss, to obtain the globally optimized local feature extraction model.
Optionally, the contrastive learning training module is further configured to:
take, among the first sample representations, the sample representation corresponding to the same local private training sample as a given second sample representation, as the local positive sample representation of that second sample representation;
take, among the first sample representations, the sample representations not corresponding to the same local private training sample as a given second sample representation, as the local negative sample representations of that second sample representation;
calculate the contrastive learning loss based on the similarity between each second sample representation and its local positive sample representation, and on the similarities between each second sample representation and its local negative sample representations.
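The local contrastive training above pairs the global model's representation of each private sample (first representation, the positive) against the global model's representations of the other samples in the batch (the negatives). A minimal sketch, assuming an NT-Xent/InfoNCE-style loss over cosine similarities; the exact loss form and temperature are illustrative, not fixed by the publication:

```python
import numpy as np

def local_contrastive_loss(first_reps, second_reps, temperature=0.5):
    """Contrastive loss between the initial global model's representations
    (first_reps) and the local model's representations (second_reps) of the
    same batch of private training samples: for each second representation,
    the first representation at the same index is the positive pair, and
    first representations of other samples are negatives."""
    def sim(a, b):
        return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8)

    n = len(second_reps)
    loss = 0.0
    for i in range(n):
        logits = np.array([sim(second_reps[i], first_reps[j]) / temperature
                           for j in range(n)])
        # Same-index pair is the positive; all other indices are negatives.
        loss += -np.log(np.exp(logits[i]) / np.exp(logits).sum())
    return loss / n
```

Minimising this loss pulls the local model's representations toward the global model's representations of the same samples while keeping different samples apart, which is how the local model is "globally optimized" without ever sharing private data.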
Optionally, the contrastive learning training module is further configured to:
convert each second sample representation into the output classification label of the corresponding local private training sample through a preset classification model;
calculate a classification loss based on the output classification labels and the preset ground-truth labels of the local private training samples;
calculate a total model loss based on the contrastive learning loss and the classification loss;
optimize the local feature extraction model based on the total model loss, to obtain the globally optimized local feature extraction model.
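The total-loss step above combines the contrastive term with a supervised classification term. A minimal sketch, assuming cross-entropy as the classification loss and a weighted sum as the combination rule; the weights `alpha`/`beta` and the function names are illustrative assumptions, since the publication only states that the total loss is computed from both terms:

```python
import numpy as np

def cross_entropy(pred_probs, true_idx):
    # Classification loss for one sample: negative log-likelihood of the true class.
    return -np.log(pred_probs[true_idx] + 1e-12)

def total_loss(contrastive, output_probs, true_labels, alpha=1.0, beta=1.0):
    """Total model loss = weighted contrastive learning loss plus weighted
    classification loss over the preset classifier's output distributions."""
    cls = float(np.mean([cross_entropy(p, y)
                         for p, y in zip(output_probs, true_labels)]))
    return alpha * contrastive + beta * cls
```

The local feature extraction model (and the preset classifier) would then be updated by gradient descent on this total loss.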
The specific implementation of the federated learning modeling optimization apparatus of the present application is substantially the same as the embodiments of the federated learning modeling optimization method described above, and is not repeated here.
An embodiment of the present application provides a readable storage medium storing one or more programs, the one or more programs being executable by one or more processors to implement the steps of the federated learning modeling optimization method described in any of the above.
The specific implementation of the readable storage medium of the present application is substantially the same as the embodiments of the federated learning modeling optimization method described above, and is not repeated here.
An embodiment of the present application provides a computer program product comprising one or more computer programs, the one or more computer programs being executable by one or more processors to implement the steps of the federated learning modeling optimization method described in any of the above.
The specific implementation of the computer program product of the present application is substantially the same as the embodiments of the federated learning modeling optimization method described above, and is not repeated here.
The above are merely preferred embodiments of the present application and are not intended to limit its patent scope. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present application, whether applied directly or indirectly in other related technical fields, is likewise included within the patent protection scope of the present application.

Claims (11)

  1. A federated learning modeling optimization method, applied to a federated server, the method comprising:
    distributing the initial global feature extraction model corresponding to each data modality to the participant devices corresponding to that data modality, so that each participant device, based on its local private training samples, optimizes its local feature extraction model by performing contrastive learning training between the initial global feature extraction model and the local feature extraction model, to obtain a globally optimized local feature extraction model, and performs feature extraction on the corresponding target-modality public samples in a federated public dataset based on the globally optimized local feature extraction model, to obtain target-modality public sample representations;
    receiving the target-modality public sample representations sent by the participant devices, and performing data-modality-based selective aggregation on the target-modality public sample representations, to obtain the target-modality aggregated sample representation corresponding to each data modality;
    obtaining the public training samples corresponding to each data modality from the federated public dataset, and, based on the public training samples, performing knowledge distillation training of each initial global feature extraction model against the corresponding target-modality aggregated sample representations, as well as contrastive learning training between the initial global feature extraction models, to obtain the target global feature extraction model corresponding to each initial global feature extraction model.
  2. The federated learning modeling optimization method of claim 1, wherein the step of performing, based on the public training samples, knowledge distillation training of each initial global feature extraction model against the corresponding target-modality aggregated sample representations and contrastive learning training between the initial global feature extraction models, to obtain the target global feature extraction model corresponding to each initial global feature extraction model, comprises:
    mapping the public training samples of each data modality to prediction sample representations through the corresponding initial global feature extraction model;
    calculating the knowledge distillation loss between each prediction sample representation and the corresponding target-modality aggregated sample representation, and calculating the contrastive learning loss between the prediction sample representations;
    optimizing each initial global feature extraction model based on its corresponding knowledge distillation loss and contrastive learning loss, to obtain each target global feature extraction model.
  3. The federated learning modeling optimization method of claim 2, wherein the step of calculating the knowledge distillation loss between each prediction sample representation and the corresponding target-modality aggregated sample representation, and calculating the contrastive learning loss between the prediction sample representations, comprises:
    selecting, from the prediction sample representations and based on the sample labels of the public training samples, a positive sample representation and a negative sample representation corresponding to each prediction sample representation;
    calculating the contrastive learning loss corresponding to each initial global feature extraction model based on each prediction sample representation and its corresponding positive and negative sample representations;
    calculating the knowledge distillation loss corresponding to each initial global feature extraction model based on the similarity between each prediction sample representation and the corresponding target-modality aggregated sample representation.
  4. The federated learning modeling optimization method of claim 1, wherein the step of performing data-modality-based selective aggregation on the target-modality public sample representations, to obtain the target-modality aggregated sample representation corresponding to each data modality, comprises:
    determining, among the target-modality public sample representations and based on the correspondence between the participant devices and the data modalities, the sample representations to be aggregated for each data modality;
    aggregating the sample representations to be aggregated for each data modality, to obtain the target-modality aggregated sample representation corresponding to that data modality.
  5. A federated learning modeling optimization method, applied to a participant device, the method comprising:
    receiving the initial global feature extraction model delivered by the federated server, and obtaining local private training samples;
    optimizing the local feature extraction model by performing contrastive learning training between the initial global feature extraction model and the local feature extraction model based on the local private training samples, to obtain a globally optimized local feature extraction model;
    extracting, from a federated public dataset, target-modality public samples belonging to the data modality corresponding to the globally optimized local feature extraction model, and performing feature extraction on the target-modality public samples based on the globally optimized local feature extraction model, to obtain target-modality public sample representations;
    sending the target-modality public sample representations to the federated server, so that the federated server performs data-modality-based selective aggregation on the target-modality public sample representations to obtain the target-modality aggregated sample representation corresponding to each data modality, obtains the public training samples corresponding to each data modality from the federated public dataset, and, based on the public training samples, performs knowledge distillation training of each initial global feature extraction model against the corresponding target-modality aggregated sample representations, as well as contrastive learning training between the initial global feature extraction models, to obtain the target global feature extraction model corresponding to each initial global feature extraction model.
  6. The federated learning modeling optimization method of claim 5, wherein the step of optimizing the local feature extraction model by performing contrastive learning training between the initial global feature extraction model and the local feature extraction model based on the local private training samples, to obtain a globally optimized local feature extraction model, comprises:
    mapping all the local private training samples to first sample representations through the initial global feature extraction model, and mapping all the local private training samples to second sample representations through the local feature extraction model;
    calculating a contrastive learning loss based on the similarities between the first sample representations and the second sample representations;
    optimizing the local feature extraction model based on the contrastive learning loss, to obtain the globally optimized local feature extraction model.
  7. The federated learning modeling optimization method of claim 6, wherein the step of calculating a contrastive learning loss based on the similarities between the first sample representations and the second sample representations comprises:
    taking, among the first sample representations, the sample representation corresponding to the same local private training sample as a given second sample representation, as the local positive sample representation of that second sample representation;
    taking, among the first sample representations, the sample representations not corresponding to the same local private training sample as a given second sample representation, as the local negative sample representations of that second sample representation;
    calculating the contrastive learning loss based on the similarity between each second sample representation and its local positive sample representation, and on the similarities between each second sample representation and its local negative sample representations.
  8. The federated learning modeling optimization method of claim 6, wherein the step of optimizing the local feature extraction model based on the contrastive learning loss, to obtain a globally optimized local feature extraction model, comprises:
    converting each second sample representation into the output classification label of the corresponding local private training sample through a preset classification model;
    calculating a classification loss based on the output classification labels and the preset ground-truth labels of the local private training samples;
    calculating a total model loss based on the contrastive learning loss and the classification loss;
    optimizing the local feature extraction model based on the total model loss, to obtain the globally optimized local feature extraction model.
  9. A federated learning modeling optimization device, comprising a memory, a processor, and a program stored on the memory for implementing the federated learning modeling optimization method,
    wherein the memory is configured to store the program implementing the federated learning modeling optimization method;
    and the processor is configured to execute the program implementing the federated learning modeling optimization method, so as to implement the steps of the federated learning modeling optimization method of any one of claims 1 to 4 or 5 to 8.
  10. A readable storage medium storing a program implementing a federated learning modeling optimization method, the program being executed by a processor to implement the steps of the federated learning modeling optimization method of any one of claims 1 to 4 or 5 to 8.
  11. A program product, being a computer program product comprising a computer program, wherein the computer program, when executed by a processor, implements the steps of the federated learning modeling optimization method of any one of claims 1 to 4 or 5 to 8.
PCT/CN2021/141481 2021-07-28 2021-12-27 Federated learning modeling optimization method and device, and readable storage medium and program product WO2023005133A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110860096.9A CN113516255A (en) 2021-07-28 2021-07-28 Federal learning modeling optimization method, apparatus, readable storage medium, and program product
CN202110860096.9 2021-07-28

Publications (1)

Publication Number Publication Date
WO2023005133A1 true WO2023005133A1 (en) 2023-02-02

Family

ID=78068749

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/141481 WO2023005133A1 (en) 2021-07-28 2021-12-27 Federated learning modeling optimization method and device, and readable storage medium and program product

Country Status (2)

Country Link
CN (1) CN113516255A (en)
WO (1) WO2023005133A1 (en)


Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113516255A (en) * 2021-07-28 2021-10-19 深圳前海微众银行股份有限公司 Federal learning modeling optimization method, apparatus, readable storage medium, and program product
CN113869528B (en) * 2021-12-02 2022-03-18 中国科学院自动化研究所 De-entanglement individualized federated learning method for consensus characterization extraction and diversity propagation
CN113988225B (en) * 2021-12-24 2022-05-06 支付宝(杭州)信息技术有限公司 Method and device for establishing representation extraction model, representation extraction and type identification
CN114330125A (en) * 2021-12-29 2022-04-12 新智我来网络科技有限公司 Knowledge distillation-based joint learning training method, device, equipment and medium
CN114612408B (en) * 2022-03-04 2023-06-06 拓微摹心数据科技(南京)有限公司 Cardiac image processing method based on federal deep learning
CN114510652B (en) * 2022-04-20 2023-04-07 宁波大学 Social collaborative filtering recommendation method based on federal learning
CN114819196B (en) * 2022-06-24 2022-10-28 杭州金智塔科技有限公司 Noise distillation-based federal learning system and method
CN115829028B (en) * 2023-02-14 2023-04-18 电子科技大学 Multi-mode federal learning task processing method and system
CN116229219B (en) * 2023-05-10 2023-09-26 浙江大学 Image encoder training method and system based on federal and contrast characterization learning

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180373988A1 (en) * 2017-06-27 2018-12-27 Hcl Technologies Limited System and method for tuning and deploying an analytical model over a target eco-system
CN111144579A (en) * 2019-12-30 2020-05-12 大连理工大学 Multi-mode Lu nation feature learning model based on non-negative matrix decomposition
CN112101578A (en) * 2020-11-17 2020-12-18 中国科学院自动化研究所 Distributed language relationship recognition method, system and device based on federal learning
CN112651511A (en) * 2020-12-04 2021-04-13 华为技术有限公司 Model training method, data processing method and device
CN113128701A (en) * 2021-04-07 2021-07-16 中国科学院计算技术研究所 Sample sparsity-oriented federal learning method and system
CN113516255A (en) * 2021-07-28 2021-10-19 深圳前海微众银行股份有限公司 Federal learning modeling optimization method, apparatus, readable storage medium, and program product


Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116168258A (en) * 2023-04-25 2023-05-26 之江实验室 Object classification method, device, equipment and readable storage medium
CN116522228A (en) * 2023-04-28 2023-08-01 哈尔滨工程大学 Radio frequency fingerprint identification method based on feature imitation federal learning
CN116522228B (en) * 2023-04-28 2024-02-06 哈尔滨工程大学 Radio frequency fingerprint identification method based on feature imitation federal learning
CN116757275A (en) * 2023-06-07 2023-09-15 京信数据科技有限公司 Knowledge graph federal learning device and method
CN116502709A (en) * 2023-06-26 2023-07-28 浙江大学滨江研究院 Heterogeneous federal learning method and device
CN116665319A (en) * 2023-07-31 2023-08-29 华南理工大学 Multi-mode biological feature recognition method based on federal learning
CN116665319B (en) * 2023-07-31 2023-11-24 华南理工大学 Multi-mode biological feature recognition method based on federal learning
CN117196070A (en) * 2023-11-08 2023-12-08 山东省计算中心(国家超级计算济南中心) Heterogeneous data-oriented dual federal distillation learning method and device
CN117196070B (en) * 2023-11-08 2024-01-26 山东省计算中心(国家超级计算济南中心) Heterogeneous data-oriented dual federal distillation learning method and device
CN117436133A (en) * 2023-12-22 2024-01-23 信联科技(南京)有限公司 Federal learning privacy protection method based on data enhancement
CN117436133B (en) * 2023-12-22 2024-03-12 信联科技(南京)有限公司 Federal learning privacy protection method based on data enhancement

Also Published As

Publication number Publication date
CN113516255A (en) 2021-10-19

Similar Documents

Publication Publication Date Title
WO2023005133A1 (en) Federated learning modeling optimization method and device, and readable storage medium and program product
WO2021083276A1 (en) Method, device, and apparatus for combining horizontal federation and vertical federation, and medium
WO2020094060A1 (en) Recommendation method and apparatus
CN102223453B (en) High performance queueless contact center
WO2019228494A1 (en) Method and device for determining type of wireless access point
WO2019214344A1 (en) System reinforcement learning method and apparatus, electronic device, and computer storage medium
WO2022028045A1 (en) Data processing method, apparatus, and device, and medium
WO2022022024A1 (en) Training sample construction method, apparatus, and device, and computer-readable storage medium
WO2022236824A1 (en) Target detection network construction optimization method, apparatus and device, and medium and product
CN110020022B (en) Data processing method, device, equipment and readable storage medium
WO2021258882A1 (en) Recurrent neural network-based data processing method, apparatus, and device, and medium
CN115147265B (en) Avatar generation method, apparatus, electronic device, and storage medium
CN110069715A (en) A kind of method of information recommendation model training, the method and device of information recommendation
WO2023024349A1 (en) Method for optimizing vertical federated prediction, device, medium, and computer program product
CN113656698B (en) Training method and device for interest feature extraction model and electronic equipment
CN112785002A (en) Model construction optimization method, device, medium, and computer program product
CN110633717A (en) Training method and device for target detection model
WO2021185427A1 (en) Generation of personalized recommendations
CN109086976B (en) Task allocation method for crowd sensing
KR101700030B1 (en) Method for visual object localization using privileged information and apparatus for performing the same
WO2021139483A1 (en) Forward model selection method and device, and readable storage medium
CN113361384A (en) Face recognition model compression method, device, medium, and computer program product
CN112966054A (en) Enterprise graph node relation-based ethnic group division method and computer equipment
CN112381236A (en) Data processing method, device, equipment and storage medium for federal transfer learning
CN115273148B (en) Pedestrian re-recognition model training method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21951705

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE