CN111898768A - Data processing method, device, equipment and medium - Google Patents
Data processing method, device, equipment and medium
- Publication number
- CN111898768A (application CN202010787792.7A)
- Authority
- CN
- China
- Prior art keywords
- participant
- model
- data
- preset
- prediction
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Software Systems (AREA)
- Medical Informatics (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The application discloses a data processing method, apparatus, device, and medium. The method comprises: determining, for each second participant, the data occupancy ratio of target data within that participant's second sample data, where target data is data consistent with the data characteristics of the first participant; determining a first participant prediction weight for each set of second sample data based on the corresponding data occupancy ratio; and determining a preset prediction model of the first participant through federated learning, based on the first participant prediction weights and the first sample data of the first participant. This addresses the technical problems in the prior art that federated training is inefficient and that the resulting federated model has poor prediction performance.
Description
Technical Field
The present application relates to the field of artificial intelligence technology for financial technology (Fintech), and in particular, to a data processing method, apparatus, device, and medium.
Background
With the continuous development of financial technology, especially internet-based financial technology, more and more technologies (such as distributed computing, blockchain, and artificial intelligence) are being applied to the financial field. At the same time, the financial industry places higher demands on these technologies, for example on data processing.
At present, no single participant can collect enough data to build an effective machine learning model on its own, and privacy protection requirements prevent participants from simply pooling their data. As a result, federated learning, in which different participants jointly train a model suited to one participant's task, has become a main way of building the required target machine learning model.
In the prior art, when a participant trains its own model jointly with other participants, every participant is given the same training weight or prediction weight. As a result, the invalid portion of a participant's training data still takes up training weight or prediction weight, so training is inefficient and the resulting target model predicts poorly.
Disclosure of Invention
The main purpose of the application is to provide a data processing method, apparatus, device, and medium that address the technical problems in the prior art that federated training is inefficient and that the resulting federated model has poor prediction performance.
To achieve the above object, the present application provides a data processing method applied to a first participant, where the first participant is in communication connection with a plurality of second participants, and the data processing method includes:
determining, for each second participant, the data occupancy ratio of target data within that participant's second sample data, the target data being data consistent with the data characteristics of the first participant;
determining a first participant prediction weight for each set of second sample data based on the corresponding data occupancy ratio;
determining a preset prediction model of the first participant through federated learning, based on the first participant prediction weights and the first sample data of the first participant.
Optionally, the step of determining, for each second participant, the data occupancy ratio of target data within that participant's second sample data includes:
receiving the data occupancy ratio sent by each second participant, where each second participant obtains it by inputting its second sample data into a corresponding preset domain classification model, which performs prediction processing on the second sample data and outputs the data occupancy ratio of the target data, i.e. the data consistent with the data characteristics of the first participant, within the second sample data;
the preset domain classification model is a first target model that predicts the proportion of data originating from each participant; it is obtained by iteratively training a first preset prediction model to be trained, through a first preset federated flow, on training sample data carrying preset participant source labels.
Optionally, the step of determining the preset prediction model of the first participant through federated learning, based on the first participant prediction weight and the first sample data of the first participant, includes:
iteratively training a second preset prediction model to be trained by executing a second preset federated flow, based on the first participant prediction weight and the first sample data of the first participant, to obtain a second target model;
setting the second target model as the preset prediction model of the first participant.
Optionally, the step of iteratively training a second preset prediction model to be trained by executing a second preset federated flow, based on the first participant prediction weight and the first sample data of the first participant, to obtain a second target model includes:
receiving the second model initial gradients sent by the second participants, where each second model initial gradient is determined by the corresponding second participant based on its second sample data while executing the second preset federated flow;
obtaining a second model update gradient for each second participant based on that participant's second model initial gradient and the corresponding first participant prediction weight;
determining replacement update model parameters based on the second model update gradients and on a first model update gradient, which the first participant determines from the first sample data while executing the second preset federated flow;
and iteratively updating the model parameters of the second preset prediction model to be trained based on the replacement update model parameters, to obtain the second target model.
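The four steps above can be sketched as a single update round. This is only an illustrative sketch under stated assumptions: the source does not give the formula for combining gradients, so the weighted average below, the implicit weight of 1 for the first participant's own gradient, the plain gradient-descent step, and the names `aggregate_update` and `learning_rate` are all hypothetical.

```python
from typing import Dict, List

Vector = List[float]


def scale(vec: Vector, factor: float) -> Vector:
    """Multiply every component of a gradient vector by a scalar."""
    return [factor * v for v in vec]


def add(a: Vector, b: Vector) -> Vector:
    """Component-wise sum of two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("gradient vectors must have the same length")
    return [x + y for x, y in zip(a, b)]


def aggregate_update(
    params: Vector,
    first_update_grad: Vector,
    second_initial_grads: Dict[str, Vector],
    prediction_weights: Dict[str, float],
    learning_rate: float = 0.1,
) -> Vector:
    """Return replacement model parameters for one training round.

    Assumption: each second model update gradient is the initial gradient
    scaled by that participant's prediction weight, and the first
    participant's own gradient counts with weight 1.
    """
    total = list(first_update_grad)
    total_weight = 1.0
    for name, initial_grad in second_initial_grads.items():
        weight = prediction_weights[name]
        # Second model update gradient = prediction weight * initial gradient.
        total = add(total, scale(initial_grad, weight))
        total_weight += weight
    averaged = scale(total, 1.0 / total_weight)
    # Plain gradient-descent step (an assumption, not stated in the source).
    return [p - learning_rate * g for p, g in zip(params, averaged)]
```

Calling `aggregate_update` once per round and feeding the returned parameters into the next round mirrors the iterative update described in the last step; a second participant whose data is mostly irrelevant to the first participant gets a small weight and therefore moves the parameters very little.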
Optionally, the first participant and the second participants are in communication connection through a preset intermediary;
the step of iteratively training a second preset prediction model to be trained by executing a second preset federated flow, based on the first participant prediction weight and the first sample data of the first participant, to obtain a second target model includes:
while executing the second preset federated flow, determining a first model update gradient based on the first sample data, and sending the first model update gradient to the preset intermediary;
receiving replacement update model parameters sent by the preset intermediary, which the intermediary determines based on the second model initial gradients, the corresponding first participant prediction weights, and the first model update gradient;
where each second participant determines its second model initial gradient based on its second sample data, encrypts it, and sends it to the preset intermediary;
and iteratively updating the model parameters of the second preset prediction model to be trained based on the replacement update model parameters, to obtain the second target model.
Optionally, the step of receiving replacement update model parameters sent by the preset intermediary, determined based on the second model initial gradients, the corresponding first participant prediction weights, and the first model update gradient, includes:
receiving, in encrypted form, the replacement update model parameters that the preset intermediary determines based on the second model initial gradients, the first participant prediction weights, and the first model update gradient.
Optionally, after the step of determining the preset prediction model of the first participant based on the first participant prediction weight and the first sample data of the first participant, the method includes:
acquiring data to be processed, and inputting the data to be processed into the preset prediction model;
and carrying out prediction processing on the data to be processed based on the preset prediction model to obtain a target prediction result.
The present application further provides a data processing apparatus applied to a first participant, where the first participant is in communication connection with a plurality of second participants, and the data processing apparatus includes:
a first determining module, configured to determine, for each second participant, the data occupancy ratio of target data within that participant's second sample data, the target data being data consistent with the data characteristics of the first participant;
a second determining module, configured to determine a first participant prediction weight for each set of second sample data based on the corresponding data occupancy ratio;
a third determining module, configured to determine a preset prediction model of the first participant through federated learning, based on the first participant prediction weights and the first sample data of the first participant.
Optionally, the first determining module includes:
a receiving unit, configured to receive the data occupancy ratio sent by each second participant, where each second participant obtains it by inputting its second sample data into a corresponding preset domain classification model, which performs prediction processing on the second sample data and outputs the data occupancy ratio of the target data, i.e. the data consistent with the data characteristics of the first participant, within the second sample data;
the preset domain classification model is a first target model that predicts the proportion of data originating from each participant; it is obtained by iteratively training a first preset prediction model to be trained, through a first preset federated flow, on training sample data carrying preset participant source labels.
Optionally, the sample data in the first participant is first sample data, and the third determining module includes:
a first execution unit, configured to iteratively train a second preset prediction model to be trained by executing a second preset federated flow, based on the first participant prediction weight and the first sample data of the first participant, to obtain a second target model;
a setting unit, configured to set the second target model as the preset prediction model of the first participant.
Optionally, the first execution unit includes:
a first receiving subunit, configured to receive the second model initial gradients sent by the second participants, where each second model initial gradient is determined by the corresponding second participant based on its second sample data while executing the second preset federated flow;
an obtaining subunit, configured to obtain a second model update gradient for each second participant based on that participant's second model initial gradient and the corresponding first participant prediction weight;
a determining subunit, configured to determine replacement update model parameters based on the second model update gradients and on a first model update gradient, which the first participant determines from the first sample data while executing the second preset federated flow;
a first updating subunit, configured to iteratively update the model parameters of the second preset prediction model to be trained based on the replacement update model parameters, to obtain the second target model.
Optionally, the first participant and the second participants are in communication connection through a preset intermediary;
the first execution unit further includes:
a sending subunit, configured to determine, while executing the second preset federated flow, a first model update gradient based on the first sample data, and to send the first model update gradient to the preset intermediary;
a second receiving subunit, configured to receive replacement update model parameters sent by the preset intermediary, which the intermediary determines based on the second model initial gradients, the corresponding first participant prediction weights, and the first model update gradient;
where each second participant determines its second model initial gradient based on its second sample data, encrypts it, and sends it to the preset intermediary;
a second updating subunit, configured to iteratively update the model parameters of the second preset prediction model to be trained based on the replacement update model parameters, to obtain the second target model.
Optionally, the second receiving subunit is configured to:
receive, in encrypted form, the replacement update model parameters that the preset intermediary determines based on the second model initial gradients, the first participant prediction weights, and the first model update gradient.
Optionally, the data processing apparatus further includes:
the first acquisition module is used for acquiring data to be processed and inputting the data to be processed into the preset prediction model;
and the second acquisition module is used for carrying out prediction processing on the data to be processed based on the preset prediction model to obtain a target prediction result.
The present application further provides a data processing device, which is a physical device and includes: a memory, a processor, and a program for the data processing method that is stored in the memory and executable on the processor; when executed by the processor, the program implements the steps of the data processing method described above.
The present application also provides a medium storing a program for implementing the above data processing method; when executed by a processor, the program implements the steps of the data processing method described above.
In the present application, the data occupancy ratio of target data consistent with the data characteristics of the first participant is determined within the second sample data of each second participant; a first participant prediction weight is determined for each set of second sample data based on the corresponding data occupancy ratio; and a preset prediction model of the first participant is determined through federated learning based on the first participant prediction weights and the first sample data of the first participant. In other words, introducing the data occupancy ratio yields, for second sample data with differing distributions, prediction weights that reflect how much of that data is actually effective for training the first participant's model, and the preset prediction model is then obtained through federated learning with these weights. In the prior art, by contrast, every participant is given the same training weight or prediction weight, so the invalid portion of a participant's data still takes up training weight or prediction weight. Because the federated model here is built on the training data of each participant that is actually effective, the efficiency loss caused by invalid data is avoided and federated training becomes more efficient; and because the training data is effective, the accuracy of the federated model improves, which in turn improves its prediction performance.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and together with the description, serve to explain the principles of the application.
To illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below; those skilled in the art can obtain other drawings from these without inventive effort.
FIG. 1 is a schematic flow chart diagram illustrating a first embodiment of a data processing method according to the present application;
FIG. 2 is a flowchart illustrating a detailed process of performing speech recognition on speech data to be recognized to obtain candidate results of the speech data to be recognized according to a first embodiment of the data processing method of the present application;
FIG. 3 is a schematic diagram of an apparatus configuration of a hardware operating environment according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a first scenario in the data processing method of the present application;
FIG. 5 is a diagram illustrating a second scenario of the data processing method of the present application;
FIG. 6 is a schematic diagram of a third scenario in the data processing method of the present application.
The objectives, features, and advantages of the present application will be further described with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
In a first embodiment of the data processing method, referring to fig. 1, the data processing method is applied to a first participant, where the first participant is in communication connection with a plurality of second participants, and the data processing method includes:
step S10, determining, for each second participant, the data occupancy ratio of target data within that participant's second sample data, the target data being data consistent with the data characteristics of the first participant;
step S20, determining a first participant prediction weight for each set of second sample data based on the corresponding data occupancy ratio;
and step S30, determining a preset prediction model of the first participant through federated learning, based on the first participant prediction weights and the first sample data of the first participant.
The method comprises the following specific steps:
step S10, determining, for each second participant, the data occupancy ratio of target data within that participant's second sample data, the target data being data consistent with the data characteristics of the first participant;
Overall, the preset prediction model is a second target model obtained by iteratively training a second preset prediction model to be trained through federated learning, based on the data of a plurality of second participants (second sample data) carrying the first participant prediction weights and on the first sample data of the first participant;
In this embodiment, after the preset prediction model is obtained, data to be processed is acquired and input into the preset prediction model, which is a trained model capable of predicting accurately on the data to be processed. For example, if the data to be processed includes a user's age, weight, blood glucose content, and insulin level, then after the data is input into the preset prediction model, the model can output a result indicating whether the user has diabetes.
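The prediction step in the diabetes example can be sketched as follows. The source does not say what kind of model the preset prediction model is; the logistic-regression form, the feature names, and the functions `predict_probability` and `predict_label` are assumptions made purely for illustration, and any coefficients passed in are placeholders rather than trained values.

```python
import math
from typing import Dict


def stable_sigmoid(z: float) -> float:
    """Logistic function written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_probability(
    features: Dict[str, float],
    coefficients: Dict[str, float],
    intercept: float,
) -> float:
    """Score one record with a (hypothetical) logistic prediction model."""
    missing = set(coefficients) - set(features)
    if missing:
        raise KeyError(f"missing features: {sorted(missing)}")
    z = intercept + sum(coefficients[name] * features[name] for name in coefficients)
    return stable_sigmoid(z)


def predict_label(
    features: Dict[str, float],
    coefficients: Dict[str, float],
    intercept: float,
    threshold: float = 0.5,
) -> bool:
    """Turn the probability into a yes/no target prediction result."""
    return predict_probability(features, coefficients, intercept) >= threshold
```

A record such as `{"age": 50.0, "weight": 80.0, "blood_glucose": 7.5, "insulin": 12.0}` would be passed as `features`, with `coefficients` and `intercept` taken from whatever model federated training produced.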
It should be noted that in this embodiment, federated learning includes horizontal federated learning and vertical federated learning, and the federated flow correspondingly includes a horizontal federated flow and a vertical federated flow; this embodiment is described using horizontal federated learning as the example.
It should be noted that the data of the first participant is the union of all of its labeled and unlabeled data. The data is denoted (X, Y), and X and Y obey the same conditional distribution across participants (including the first participant and the second participants), i.e., $P_i(Y \mid X) = P_j(Y \mid X)$ for any two participants i and j, while the marginal distributions of data from different participants differ, i.e., $P_i(X) \neq P_j(X)$. For example, the data accumulated by a hospital often comes from patients, the data accumulated by a nursing home comes from elderly people, whose health conditions vary widely, and most machine learning tasks of a physical examination center target young and middle-aged adults. When a hospital trains a machine learning model jointly with a nursing home and a physical examination center under existing approaches, all of these participants are given the same training weight or prediction weight; that is, the data characteristics of the nursing home and the physical examination center are assumed by default to match those of the hospital data. In reality the data of different participants is heterogeneous, so under equal weights much of the training data is of little use for the hospital's model, and the model obtained by federated training predicts poorly.
In this embodiment, the data occupancy ratio may be determined either by first identifying, in each set of second sample data, the target data consistent with the data characteristics of the first participant and then computing its share of the second sample data, or by determining the data occupancy ratio directly. That is, in addition to training its local model on local training data, the first participant determines the share of target (valid) data in the second sample data of the other, second participants, which improves the effectiveness of training the first participant's local model.
For example, suppose several second participants, such as participant A and participant C, jointly train the model of participant B (the first participant). To protect user privacy, and because the initial models are the same, participant A and participant C jointly train the model of participant B as follows. Target data in participant A that is valid for participant B is selected, giving the data occupancy ratio of participant A with respect to participant B, i.e., the probability that labeled data from participant A (data carrying preset feature labels) is judged to be labeled data of participant B (or to have the data characteristics of first participant B); this yields the weight applied to participant A's labeled data when training participant B's model. Likewise, target data in participant C that is valid for participant B is selected, giving the probability that labeled data from participant C is judged to be labeled data of participant B (or to have the data characteristics of first participant B), and hence the data occupancy ratio applied to participant C's second sample data when training participant B's model. Based on the data occupancy ratio of participant A and the data occupancy ratio of participant C, the model parameters of the first participant that need to be aggregated and updated can be obtained through joint calculation. For example, the data occupancy ratio of participant A with respect to participant B is multiplied by participant A's own labeled data, and the data occupancy ratio of participant C with respect to participant B is multiplied by participant C's own labeled data; the model parameters to be aggregated and updated for first participant B, for second participant A, and for second participant C can then be obtained through joint calculation according to a preset calculation formula, as shown in FIG. 4, FIG. 5, and FIG. 6.
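The A, B, C example can be made concrete with a small calculation. This sketch assumes that the data occupancy ratio of a second participant is the mean, over its samples, of the probability that a sample is judged to be data of participant B, and that the prediction weight is that ratio normalised across second participants; the source names neither formula, and the probability lists below are made-up illustrative numbers.

```python
from statistics import fmean
from typing import Dict, List


def occupancy_ratio(probabilities: List[float]) -> float:
    """Mean probability that a participant's samples match the first participant."""
    if not probabilities:
        raise ValueError("at least one sample probability is required")
    return fmean(probabilities)


def prediction_weights(ratios: Dict[str, float]) -> Dict[str, float]:
    """Assumed rule: weight = occupancy ratio, normalised to sum to 1."""
    total = sum(ratios.values())
    if total <= 0:
        raise ValueError("occupancy ratios must sum to a positive value")
    return {name: ratio / total for name, ratio in ratios.items()}


# Hypothetical per-sample probabilities of "looks like participant B's data".
probs_from_a = [0.9, 0.7, 0.8]
probs_from_c = [0.2, 0.4, 0.0]

ratios = {
    "A": occupancy_ratio(probs_from_a),
    "C": occupancy_ratio(probs_from_c),
}
weights = prediction_weights(ratios)
```

With these numbers participant A's data resembles participant B's far more than participant C's does, so A receives the larger weight when B's model is trained.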
The step of determining, for each second participant, the data occupancy ratio of target data within that participant's second sample data includes:
step S11, receiving the data occupancy ratio sent by each second participant, where each second participant obtains it by inputting its second sample data into a corresponding preset domain classification model, which performs prediction processing on the second sample data and outputs the data occupancy ratio of the target data, i.e. the data consistent with the data characteristics of the first participant, within the second sample data;
the preset domain classification model is a first target model that predicts the proportion of data originating from each participant; it is obtained by iteratively training a first preset prediction model to be trained, through a first preset federated flow, on training sample data carrying preset participant source labels.
In the present embodiment, the data occupation ratio, which is the probability that the labeled data from the second participant is judged to be the labeled data of the first participant (the labeled data is consistent with the data feature of the first participant), is determined by the preset domain classification model.
Overall, the preset domain classification model is obtained through horizontal federated learning on the training data of n preset participants. The output layer of the preset domain classifier, after a preset Sigmoid activation function, outputs an n-dimensional vector in which every component lies in the interval (0, 1); the ith component represents the probability that the input data comes from the ith participant, i.e., the data occupancy ratio.
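The output layer described above can be sketched as one linear layer followed by a per-dimension Sigmoid. The single-layer architecture and the names `domain_classifier_output`, `weights`, and `biases` are assumptions for illustration; the source only specifies the Sigmoid activation and the n-dimensional output.

```python
import math
from typing import List, Sequence


def sigmoid(z: float) -> float:
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def domain_classifier_output(
    features: Sequence[float],
    weights: Sequence[Sequence[float]],
    biases: Sequence[float],
) -> List[float]:
    """Return one probability per participant for a single input sample.

    `weights` has n rows (one per participant) and `biases` has n entries,
    so the result is an n-dimensional vector of independent Sigmoid outputs.
    """
    if len(weights) != len(biases):
        raise ValueError("weights and biases must describe the same n outputs")
    outputs = []
    for row, bias in zip(weights, biases):
        if len(row) != len(features):
            raise ValueError("each weight row must match the feature length")
        z = sum(w * x for w, x in zip(row, features)) + bias
        outputs.append(sigmoid(z))
    return outputs
```

Because each dimension passes through its own Sigmoid rather than a shared softmax, the n outputs need not sum to 1; a sample can look plausible for several participants at once.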
In this embodiment, the preset domain classification model is trained, that is, based on training sample data with preset participant source labels, a first target model is obtained by performing iterative training on a first preset prediction model to be trained by executing a first preset federal procedure, and then predicting the data source occupation ratio of each participant, so that after each second participant inputs corresponding second sample data into a corresponding preset domain classification model, the second sample data is subjected to prediction processing, and the data occupation ratio of target data in the second sample data, which is consistent with the data characteristics of the first participant, in the second sample data is obtained.
Specifically, in this embodiment, a first preset prediction model to be trained is iteratively trained by executing a first preset federal flow on participant data carrying preset participant source labels, so as to obtain a first target model that predicts the data source occupation ratio of each participant. For example, the first participant initiates a first federal learning task and sends the first preset prediction model to be trained of that task to each second participant. The first participant trains the first preset prediction model to be trained on its local training data to obtain its trained model parameters, and then executes the first preset horizontal federal flow to obtain first joint model parameters (which incorporate the model parameters of the second participants). After obtaining the first joint model parameters, the first participant continues to execute the first preset horizontal federal flow to obtain second joint model parameters, and continues in this way until the first target model is finally obtained. Because the implicit characteristics of each participant's data are learned while the first target model is being obtained, a preset Sigmoid activation function is set in the preset domain classification model so that the output value of the i-th dimension represents the probability, or data occupation ratio, that the input data belongs to the i-th participant.
Specifically, for example, suppose the plurality of second participants are participant A and participant C, and the first participant is participant B. For participant B, the data occupation ratio of the second sample data of participant A that is useful for training the model of participant B is obtained through the domain classification model, and likewise for the second sample data of participant C. For participant A, the data occupation ratios of the sample data of participant B and of participant C for training the model of participant A are obtained through the domain classification model. For participant C, the data occupation ratios of the sample data of participant A and of participant B for training the model of participant C are obtained through the preset domain classification model. That is, each participant inputs all of its local labeled data into the preset domain classification model to obtain a first output value, which represents the probability that the labeled sample $x_{ik}$ from participant $P_i$ is judged to be labeled data of participant $P_j$.
Step S20, determining a first participant prediction weight of each second sample data based on each of the data occupation ratios;
after the probability, that is, the data occupation ratio, is obtained, the prediction weight of the labeled sample $x_{ik}$ of participant $P_i$ for training the model of participant $P_j$ is calculated according to the calculation formula in the preset domain classification model.
Step S30, determining a preset prediction model of the first participant through federal learning based on the first participant prediction weight and the first sample data of the first participant.
That is, the preset prediction model of the first participant is determined through federated learning, specifically horizontal federated learning, based on the first participant prediction weight and the first sample data of the first participant.
After the step of determining a pre-set prediction model of the first participant based on the first participant prediction weight and the first sample data of the first participant, the method comprises:
step S40, acquiring data to be processed, and inputting the data to be processed into the preset prediction model;
and step S50, carrying out prediction processing on the data to be processed based on the preset prediction model to obtain a target prediction result.
In this embodiment, after a preset prediction model is obtained, to-be-processed data is obtained, and the to-be-processed data is input into the preset prediction model to obtain a target prediction result.
In this embodiment, the data occupation ratio of the target data in the second sample data of each second participant, that is, the data consistent with the data characteristics of the first participant, is determined; a first participant prediction weight of each second sample data is determined based on each data occupation ratio; and a preset prediction model of the first participant is determined through federal learning based on the first participant prediction weight and the first sample data of the first participant. In other words, by introducing the data occupation ratio, the application determines, for second sample data with different distributions, the prediction weight with which that data is actually effective for training the model of the first participant, and then obtains the preset prediction model of the first participant in a federated manner. In the prior art, by contrast, the training data of every participant is combined with the same training proportion or prediction proportion, so that invalid training data of a participant takes up part of that proportion. Because the federated model here is built on the training data of each participant that is actually effective, the efficiency interference caused by invalid data is avoided and federated training efficiency is improved; and because the training data is effective, the accuracy of the federated model, and therefore its prediction performance, is improved.
In another embodiment of the data processing method, referring to fig. 2, the step of determining the preset prediction model of the first participant through federal learning based on the first participant prediction weight and the first sample data of the first participant, where the sample data of the first participant is first sample data, includes:
step S31, performing iterative training on a second preset prediction model to be trained by executing a second preset federal flow based on the first participant prediction weight and the first sample data of the first participant to obtain a second target model;
step S32, setting the second target model as a preset prediction model of the first participant.
It should be noted that, in this embodiment, based on the prediction weight of the first participant and the first sample data of the first participant, a second preset prediction model to be trained is iteratively trained by executing a second preset federal procedure on an intermediate party or a certain participant, so as to obtain a second target model, and the second target model is set as the preset prediction model of the first participant.
Specifically, in this embodiment, there is a coordinator that communicates with the first participant and with each second participant. Before federated model training, the coordinator may initiate a second federal learning task and send the initial model of that task (the second preset prediction model to be trained) to each participant. The first participant trains the initial model on its local training data to obtain the model parameters trained by the first participant. Each second participant trains the initial model on its target data, weighted by the first participant prediction weight it has obtained, to obtain the model parameters trained by that second participant. The second preset horizontal federal flow is then executed: the coordinator combines the (encrypted) trained model parameters of the participants into first joint model parameters, and based on these the first participant continues training and executing the second preset horizontal federal flow until a preset end condition is reached, obtaining the second target model.
The process of executing the second preset horizontal federal flow may specifically be as follows. The second preset prediction model to be trained is iteratively trained, and it is judged whether the iteratively trained model meets a preset replacement updating condition, for example that 500 iterations of training have been performed. If the condition is met, the model variables of the trained and updated second preset prediction model to be trained are replaced and updated by executing the second preset horizontal federal flow, to obtain the replaced and updated second preset prediction model to be trained.
Specifically, the trained and updated model variables of the second preset prediction model to be trained are encrypted, to ensure security, and sent to the intermediate party associated with the first participant. The intermediate party aggregates the model variables sent by the first participant and the other participants to obtain aggregated model variables, and feeds these back in encrypted form to each participant. The first participant receives the aggregated model variables fed back by the intermediate party and replaces the trained and updated model variables of the second preset prediction model to be trained with the aggregated model variables, obtaining the replaced and updated second preset prediction model to be trained.
Iterative training and replacement updating of the replaced and updated second preset prediction model to be trained continue until the model meets a preset training completion condition, whereupon the second target model is obtained. The preset training completion condition may be that the number of training iterations reaches a certain number, such as 10,000 or 5,000, or that the corresponding loss function has converged.
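The alternation of local iterative training and replacement updating described above can be sketched as follows. This is only an illustration under stated assumptions: the model is a single scalar parameter fitted by least squares, the intermediate party is modelled as a plain unweighted average, encryption is omitted, and all function names and data values are hypothetical.

```python
from typing import List, Tuple

Dataset = List[Tuple[float, float]]


def local_step(w: float, data: Dataset, lr: float) -> float:
    """One gradient-descent step for y = w * x under mean squared error."""
    grad = sum(2.0 * (w * x - y) * x for x, y in data) / len(data)
    return w - lr * grad


def mse(w: float, data: Dataset) -> float:
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


def train_with_replacement_updates(datasets: List[Dataset],
                                   w0: float = 0.0,
                                   lr: float = 0.01,
                                   sync_every: int = 500,
                                   max_iters: int = 5000,
                                   tol: float = 1e-10) -> float:
    """Train locally; every `sync_every` iterations replace each local
    parameter with the aggregate (the replacement updating condition).
    Stop at `max_iters` or when the loss stops changing."""
    local = [w0] * len(datasets)
    prev_loss = float("inf")
    for it in range(1, max_iters + 1):
        local = [local_step(w, d, lr) for w, d in zip(local, datasets)]
        if it % sync_every == 0:
            # Stand-in for the intermediate party: unweighted average.
            aggregated = sum(local) / len(local)
            local = [aggregated] * len(local)
            loss = sum(mse(aggregated, d) for d in datasets) / len(datasets)
            if abs(prev_loss - loss) < tol:  # loss function has converged
                break
            prev_loss = loss
    return sum(local) / len(local)


# Two hypothetical participants whose data both follow y = 2 * x.
datasets = [[(1.0, 2.0), (2.0, 4.0)], [(1.0, 2.0), (3.0, 6.0)]]
w_final = train_with_replacement_updates(datasets)
```

The `sync_every=500` and `max_iters=5000` defaults mirror the example values given in the text; either stopping rule (iteration count or loss convergence) ends the loop.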
In addition, in this embodiment, the first participant may itself initiate the second federal learning task and send the initial model of that task to each second participant. The first participant trains the initial model on its local first sample data to obtain the model parameters trained by the first participant, while each second participant trains the initial model on its target data to obtain the model parameters trained by that second participant. The second preset horizontal federal flow is then executed: the first participant combines its own trained model parameters with those of each second participant to obtain first joint model parameters, and based on these continues training and executing the second preset horizontal federal flow until a preset federation end condition is reached, obtaining the second target model.
In this embodiment, a second preset prediction model to be trained is iteratively trained by executing a second preset federal process based on the first participant prediction weight and the first sample data of the first participant, so as to obtain a second target model; setting the second target model as a preset prediction model of the first participant. In the embodiment, the preset prediction model is accurately obtained.
In another embodiment of the data processing method, the step of performing iterative training on a second preset prediction model to be trained by executing a second preset federal process based on the first participant prediction weight and the first sample data of the first participant to obtain a second target model includes:
step B1, receiving a second model initial gradient sent by a second participant, wherein the second model initial gradient is determined by each second participant based on corresponding second sample data in the process of executing a second preset federal flow;
step B2, obtaining a second model updating gradient of each second participant based on the second model initial gradient and the prediction weight corresponding to the first participant;
step B3, determining replacement update model parameters based on the second model update gradient and the first model update gradient determined by the first participant based on the first sample data in executing a second preset federal procedure;
and step B4, performing iterative updating on the model parameters in the second preset prediction model to be trained based on the replacement updating model parameters to obtain a second target model.
In this embodiment, the model parameters may be obtained through gradients. A second model initial gradient sent by each second participant is first received, the second model initial gradient being determined by each second participant based on its corresponding second sample data while executing the second preset federal flow. After the second model initial gradient is obtained, a second model update gradient of each second participant is obtained based on the second model initial gradient and the prediction weight corresponding to the first participant. Specifically, each participant inputs all of its local labeled data (second sample data; the superscript lb in the notation denotes "labeled") into the domain classification model θ to obtain a first output value, which represents the probability that the labeled sample $x_{ik}$ from participant $P_i$ is judged to belong to the labeled data set of participant $P_j$.
According to the calculation formula in the domain classification model, the weight of the labeled sample $x_{ik}$ for training the model of participant $P_j$, namely the first participant prediction weight, is calculated, where $N$ is the total amount of labeled data of all participants and $N_j$ is the amount of labeled training data of $P_j$. After the first participant prediction weight is obtained, it is multiplied by the corresponding second model initial gradient to obtain the second model update gradient.
The replacement update model parameters are determined based on the second model update gradient and the first model update gradient, which the first participant determines based on the first sample data while executing the second preset federal flow. The model parameters in the second preset prediction model to be trained are then iteratively updated based on the replacement update model parameters, for example 500 times, to obtain the second target model.
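Steps B1 to B4 can be illustrated with the following sketch of a single update round. Several points are assumptions made for illustration: one scalar prediction weight per second participant is used (the text defines the weight per labeled sample), the gradients are combined by a plain average, and a fixed learning rate is applied. The function name and all numeric values are hypothetical.

```python
from typing import Dict, List


def weighted_update(params: List[float],
                    first_grad: List[float],
                    second_grads: Dict[str, List[float]],
                    weights: Dict[str, float],
                    lr: float = 0.1) -> List[float]:
    """One round of steps B1-B4 for the first participant."""
    # B1: `second_grads` holds the second model initial gradients received.
    # B2: scale each initial gradient by its first participant prediction
    #     weight to obtain the second model update gradient.
    update_grads = {
        pid: [weights[pid] * g for g in grad]
        for pid, grad in second_grads.items()
    }
    # B3: combine with the first model update gradient (assumed: average).
    n = 1 + len(update_grads)
    combined = [
        (first_grad[k] + sum(g[k] for g in update_grads.values())) / n
        for k in range(len(params))
    ]
    # B4: update the parameters of the model to be trained.
    return [p - lr * c for p, c in zip(params, combined)]


# Hypothetical values: participant B is the first participant,
# participants A and C are the second participants.
new_params = weighted_update(
    params=[1.0, 1.0],
    first_grad=[0.2, 0.0],
    second_grads={"A": [1.0, 2.0], "C": [4.0, 0.0]},
    weights={"A": 0.5, "C": 0.25},
)
```

A second participant whose data is judged unlikely to match the first participant receives a small weight, so its gradient contributes little to the update of the first participant's model.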
The first participant and the second participant are in communication connection through a preset intermediate party;
the step of performing iterative training on a second preset prediction model to be trained by executing a second preset federal flow based on the first participant prediction weight and the first sample data of the first participant to obtain a second target model includes:
step C1, in executing a second preset federal flow, determining a first model updating gradient based on the first sample data, and sending the first model updating gradient to a preset intermediate party;
step C2, receiving replacement update model parameters sent by the preset intermediate party, the replacement update model parameters being determined based on the second model initial gradient, the corresponding first participant prediction weight, and the first model update gradient;
the second participant determines a second model initial gradient based on the second sample data, encrypts the second model initial gradient and sends the second model initial gradient to the preset intermediate party;
and step C3, performing iterative updating on the model parameters in the second preset prediction model to be trained based on the replacement updating model parameters to obtain a second target model.
In this embodiment, the first participant and the second participant may perform gradient aggregation through the intermediate party to determine the replacement update model parameters, so as to finally obtain the second target model.
The step of receiving the replacement update model parameters sent by the preset intermediate party and determined based on the second model initial gradient, the corresponding first participant prediction weight and the first model update gradient includes:
step D1, receiving the replacement update model parameters that the preset intermediate party sends in encrypted form and that are determined based on the second model initial gradient, the corresponding first participant prediction weight and the first model update gradient.
In this embodiment, the first participant and the second participant may perform gradient aggregation through the intermediate party, and after the aggregation the intermediate party encrypts the replacement update model parameters that it feeds back, to improve security in the horizontal federation process.
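The application does not name the encryption scheme used between the participants and the intermediate party. As one possible illustration only, the sketch below substitutes pairwise additive masking, a secure aggregation technique in which the masks cancel in the sum so that the intermediate party sees only the aggregate. It assumes each second participant applies its prediction weight to its gradient before upload, and it simulates the pairwise shared secrets with a single seeded random generator; all names and values are hypothetical.

```python
import random
from typing import Dict, List


def pairwise_masks(ids: List[str], dim: int, seed: int = 0) -> Dict[str, List[float]]:
    """For every pair of participants draw one shared mask; one side adds
    it and the other subtracts it, so all masks sum to zero."""
    rng = random.Random(seed)
    masks = {pid: [0.0] * dim for pid in ids}
    for a in range(len(ids)):
        for b in range(a + 1, len(ids)):
            shared = [rng.uniform(-100.0, 100.0) for _ in range(dim)]
            for k in range(dim):
                masks[ids[a]][k] += shared[k]
                masks[ids[b]][k] -= shared[k]
    return masks


def mask_gradient(grad: List[float], mask: List[float]) -> List[float]:
    """What a participant uploads to the intermediate party."""
    return [g + m for g, m in zip(grad, mask)]


def aggregate(uploads: Dict[str, List[float]]) -> List[float]:
    """Intermediate party: average the masked uploads; the masks cancel."""
    n = len(uploads)
    dim = len(next(iter(uploads.values())))
    return [sum(v[k] for v in uploads.values()) / n for k in range(dim)]


# Hypothetical (already weighted) gradients of participants A, B and C.
gradients = {"A": [0.5, 1.0], "B": [0.2, 0.0], "C": [1.0, 0.0]}
masks = pairwise_masks(list(gradients), dim=2)
uploads = {pid: mask_gradient(g, masks[pid]) for pid, g in gradients.items()}
aggregated = aggregate(uploads)
```

The aggregated result equals the average of the unmasked gradients up to floating-point rounding, while each individual upload differs from the participant's true gradient.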
In this embodiment, a second model initial gradient sent by a second participant is received, where the second model initial gradient is determined based on corresponding second sample data by each second participant in a process of executing a second preset federation flow; acquiring a second model updating gradient of each second participant based on the second model initial gradient and the prediction weight corresponding to the first participant; determining a replacement update model parameter based on the second model update gradient and a first model update gradient determined by a first participant based on the first sample data in executing a second preset federal procedure; and iteratively updating the model parameters in the second preset to-be-trained prediction model based on the replacement update model parameters to obtain a second target model. In the embodiment, the second target model is accurately obtained.
Referring to fig. 3, fig. 3 is a schematic device structure diagram of a hardware operating environment according to an embodiment of the present application.
As shown in fig. 3, the data processing apparatus may include: a processor 1001, such as a CPU, a memory 1005, and a communication bus 1002. The communication bus 1002 is used for realizing connection communication between the processor 1001 and the memory 1005. The memory 1005 may be a high-speed RAM memory or a non-volatile memory (e.g., a magnetic disk memory). The memory 1005 may alternatively be a memory device separate from the processor 1001 described above.
Optionally, the data processing device may further include a user interface, a network interface, a camera, RF (radio frequency) circuitry, a sensor, audio circuitry, a WiFi module, and so forth. The user interface may comprise a display screen (Display) and an input sub-module such as a keyboard (Keyboard), and may optionally also comprise a standard wired interface and a wireless interface. The network interface may optionally include a standard wired interface and a wireless interface (e.g., a WI-FI interface).
Those skilled in the art will appreciate that the data processing device architecture shown in fig. 3 does not constitute a limitation of the data processing device and may include more or fewer components than shown, or some components may be combined, or a different arrangement of components.
As shown in fig. 3, a memory 1005, which is a kind of computer medium, may include therein an operating system, a network communication module, and a data processing program. An operating system is a program that manages and controls the hardware and software resources of the data processing device, supporting the operation of the data processing program as well as other software and/or programs. The network communication module is used to enable communication between components within the memory 1005, as well as with other hardware and software within the data processing system.
In the data processing apparatus shown in fig. 3, the processor 1001 is configured to execute a data processing program stored in the memory 1005, and implement the steps of the data processing method according to any one of the above.
The specific implementation of the data processing device of the present application is substantially the same as that of each embodiment of the data processing method, and is not described herein again.
The present application further provides a data processing apparatus, which is applied to a first participant, where the first participant is in communication connection with a plurality of second participants, and the data processing apparatus includes:
the first determining module is used for determining the data occupation ratio of target data consistent with the data characteristics of the first participant in the second sample data of each second participant;
a second determining module, configured to determine a first participant prediction weight of each second sample data based on each of the data occupancy rates;
and the third determining module is used for determining a preset prediction model of the first participant through federal learning based on the first participant prediction weight and the first sample data of the first participant.
Optionally, the first determining module includes:
the receiving unit is used for receiving the data occupation ratio sent by each second participant, wherein each second participant inputs its second sample data into a corresponding preset domain classification model, which performs prediction processing on the second sample data to obtain the data occupation ratio, that is, the proportion of the second sample data made up of target data whose data characteristics are consistent with those of the first participant;
the preset domain classification model is a first target model for predicting the data source occupation ratio of each participant, obtained by iteratively training a first preset prediction model to be trained, through executing a first preset federal flow, on training sample data carrying preset participant source labels.
Optionally, the sample data in the first participant is first sample data, and the third determining module includes:
the first execution unit is used for performing iterative training on a second preset prediction model to be trained by executing a second preset federal process on the basis of the first participant prediction weight and first sample data of the first participant to obtain a second target model;
a setting unit configured to set the second target model as a preset prediction model of the first participant.
Optionally, the first execution unit includes:
the first receiving subunit is configured to receive a second model initial gradient sent by a second party, where the second model initial gradient is determined by each second party based on corresponding second sample data in a process of executing a second preset federation flow;
the obtaining subunit is configured to obtain a second model update gradient of each second participant based on the second model initial gradient and the prediction weight corresponding to the first participant;
the determining subunit is configured to determine replacement update model parameters based on the second model update gradient and a first model update gradient, the first model update gradient being determined by the first participant based on the first sample data in executing the second preset federal flow;
and the first updating subunit is used for iteratively updating the model parameters in the second preset prediction model to be trained based on the replacement updated model parameters to obtain a second target model.
Optionally, the first participant and the second participant are in communication connection through a preset intermediate party;
the first execution unit further includes:
the sending subunit is configured to, in executing a second preset federal flow, determine a first model update gradient based on the first sample data, and send the first model update gradient to a preset intermediate party;
the second receiving subunit is configured to receive replacement update model parameters sent by the preset intermediate party, the replacement update model parameters being determined based on the second model initial gradient, the corresponding first participant prediction weight, and the first model update gradient;
the second participant determines a second model initial gradient based on the second sample data, encrypts the second model initial gradient and sends the second model initial gradient to the preset intermediate party;
and the second updating subunit is used for iteratively updating the model parameters in the second preset prediction model to be trained based on the replacement updating model parameters to obtain a second target model.
Optionally, the second receiving subunit is configured to implement:
and receiving a replacement update model parameter which is sent by a preset intermediate party in an encrypted manner and is determined on the basis of the second model initial gradient, the first participant prediction weight and the first model update gradient.
Optionally, the data processing apparatus further includes:
the first acquisition module is used for acquiring data to be processed and inputting the data to be processed into the preset prediction model;
and the second acquisition module is used for carrying out prediction processing on the data to be processed based on the preset prediction model to obtain a target prediction result.
The specific implementation of the data processing apparatus of the present application is substantially the same as that of the embodiments of the data processing method, and is not described herein again.
The present embodiment provides a medium, and the medium stores one or more programs, which can be further executed by one or more processors for implementing the steps of the data processing method described in any one of the above.
The specific implementation of the medium of the present application is substantially the same as that of each embodiment of the data processing method, and is not described herein again.
The above description is only a preferred embodiment of the present application, and not intended to limit the scope of the present application, and all modifications of equivalent structures and equivalent processes, which are made by the contents of the specification and the drawings, or which are directly or indirectly applied to other related technical fields, are included in the scope of the present application.
Claims (10)
1. A data processing method is applied to a first participant, the first participant is in communication connection with a plurality of second participants, and the data processing method comprises the following steps:
determining the data occupation ratio of target data consistent with the data characteristics of the first participant in the second sample data of each second participant;
determining a first participant prediction weight of each second sample data based on each data occupation ratio;
determining a preset prediction model of the first participant through federal learning based on the first participant prediction weight and the first sample data of the first participant.
2. The data processing method according to claim 1, wherein the step of determining the data occupation ratio of target data consistent with the data characteristics of the first participant in the second sample data of each second participant comprises:
receiving the data occupation ratio sent by each second participant, wherein each second participant inputs its second sample data into a corresponding preset domain classification model, which performs prediction processing on the second sample data to obtain the data occupation ratio, that is, the proportion of the second sample data made up of target data whose data characteristics are consistent with those of the first participant;
the preset domain classification model is a first target model for predicting the data source occupation ratio of each participant, obtained by iteratively training a first preset prediction model to be trained, through executing a first preset federal flow, on training sample data carrying preset participant source labels.
3. The data processing method of claim 1, wherein the sample data in the first participant is first sample data, and the step of determining the pre-set prediction model of the first participant through federal learning based on the first participant prediction weight and the first sample data of the first participant comprises:
performing iterative training on a second preset prediction model to be trained by executing a second preset federal flow on the basis of the first participant prediction weight and first sample data of the first participant to obtain a second target model;
setting the second target model as a preset prediction model of the first participant.
4. The data processing method of claim 3, wherein the step of iteratively training a second predetermined predictive model to be trained by executing a second predetermined federal procedure based on the first participant prediction weights and the first sample data of the first participant to obtain a second target model comprises:
receiving a second model initial gradient sent by a second participant, wherein the second model initial gradient is determined by each second participant based on corresponding second sample data in a process of executing a second preset federal flow;
acquiring a second model updating gradient of each second participant based on the second model initial gradient and the prediction weight corresponding to the first participant;
determining a replacement update model parameter based on the second model update gradient and a first model update gradient determined by a first participant based on the first sample data in executing a second preset federal procedure;
and iteratively updating the model parameters in the second preset to-be-trained prediction model based on the replacement update model parameters to obtain a second target model.
5. The data processing method of claim 4, wherein the first participant and the second participant are in communication connection through a preset intermediate party;
the step of performing iterative training on a second preset prediction model to be trained by executing a second preset federal flow based on the first participant prediction weight and the first sample data of the first participant to obtain a second target model includes:
in executing a second preset federal flow, determining a first model updating gradient based on the first sample data, and sending the first model updating gradient to a preset intermediate party;
receiving a replacement update model parameter which is sent by a preset intermediate party and is based on a second model initial gradient, corresponding to the first participant prediction weight, and determined by the first model update gradient;
the second participant determines a second model initial gradient based on the second sample data, encrypts the second model initial gradient and sends the second model initial gradient to the preset intermediate party;
and iteratively updating the model parameters in the second preset to-be-trained prediction model based on the replacement update model parameters to obtain a second target model.
6. The data processing method of claim 5, wherein the step of receiving the replacement update model parameters determined based on the second model initial gradient, the first participant prediction weight and the first model update gradient sent by the predetermined intermediary comprises:
and receiving a replacement update model parameter which is sent by a preset intermediate party in an encrypted manner and is determined on the basis of the second model initial gradient, the first participant prediction weight and the first model update gradient.
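The message flow of claims 5 and 6 can be pictured with a small in-memory simulation. This is a sketch under stated assumptions, not the patented protocol. The `Intermediary` class, its method names, and the sum-based combination rule are hypothetical, and encryption of the exchanged values is deliberately left out because the claims do not name a scheme.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

Vector = List[float]


@dataclass
class Intermediary:
    """Hypothetical preset intermediate party for a single training round.

    Assumption: the intermediary holds the first participant prediction
    weights. Encryption and decryption of messages are omitted.
    """

    weights: Dict[str, float]
    first_grad: Optional[Vector] = None
    second_grads: Dict[str, Vector] = field(default_factory=dict)

    def receive_first(self, grad: Vector) -> None:
        # First model update gradient sent by the first participant.
        self.first_grad = list(grad)

    def receive_second(self, participant_id: str, grad: Vector) -> None:
        # Second model initial gradient sent by one second participant.
        self.second_grads[participant_id] = list(grad)

    def replacement_update(self) -> Vector:
        # Assumed rule: first gradient plus weighted second gradients.
        if self.first_grad is None:
            raise ValueError("first model update gradient not received yet")
        combined = list(self.first_grad)
        for pid, grad in self.second_grads.items():
            w = self.weights[pid]
            combined = [c + w * g for c, g in zip(combined, grad)]
        return combined


def apply_update(params: Vector, update: Vector, learning_rate: float) -> Vector:
    """First participant applies the returned replacement update (assumed SGD step)."""
    return [p - learning_rate * u for p, u in zip(params, update)]


# One simulated round with made-up values.
hub = Intermediary(weights={"B": 0.75, "C": 0.25})
hub.receive_first([0.2, -0.4])
hub.receive_second("B", [0.4, 0.8])
hub.receive_second("C", [0.8, 0.0])
new_params = apply_update([1.0, 1.0], hub.replacement_update(), learning_rate=0.5)
```

Routing both gradient types through the intermediary means that, in this sketch, the first participant only ever sees the combined update and never a raw gradient from an individual second participant.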
7. The data processing method of claim 1, wherein after the step of determining the preset prediction model of the first participant through federal learning based on the first participant prediction weight and the first sample data of the first participant, the method further comprises:
acquiring data to be processed, and inputting the data to be processed into the preset prediction model;
and carrying out prediction processing on the data to be processed based on the preset prediction model to obtain a target prediction result.
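The prediction step of claim 7 might look like the following sketch. It assumes the preset prediction model is a binary logistic classifier, which the claim does not state. The parameter names and the 0.5 threshold are hypothetical.

```python
import math
from typing import Sequence


def predict_probability(
    params: Sequence[float], bias: float, features: Sequence[float]
) -> float:
    """Logistic prediction for one record of data to be processed."""
    z = bias + sum(p * x for p, x in zip(params, features))
    # Numerically stable sigmoid: avoids overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_label(
    params: Sequence[float],
    bias: float,
    features: Sequence[float],
    threshold: float = 0.5,
) -> int:
    """Target prediction result as a 0/1 label (threshold is an assumption)."""
    return int(predict_probability(params, bias, features) >= threshold)
```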
8. A data processing apparatus, applied to a first participant, the first participant being in communication connection with a plurality of second participants, the data processing apparatus comprising:
a first determining module, configured to determine, for each second participant, the data occupancy rate, within that second participant's second sample data, of target data consistent with the data characteristics of the first participant;
a second determining module, configured to determine a first participant prediction weight for each second sample data based on each of the data occupancy rates;
and a third determining module, configured to determine a preset prediction model of the first participant through federal learning based on the first participant prediction weights and the first sample data of the first participant.
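The first two modules of claim 8 can be sketched as two small functions. The claim says only that the weights are determined "based on" the data occupancy rates, so normalising the rates to sum to 1 is an assumption. The predicate `matches_first`, the record layout, and the participant IDs are hypothetical.

```python
from typing import Callable, Dict, Sequence, TypeVar

Sample = TypeVar("Sample")


def data_occupancy_rate(
    second_samples: Sequence[Sample],
    matches_first: Callable[[Sample], bool],
) -> float:
    """Fraction of a second participant's samples whose features are
    consistent with the first participant's data characteristics."""
    if not second_samples:
        return 0.0
    target = sum(1 for s in second_samples if matches_first(s))
    return target / len(second_samples)


def prediction_weights(rates: Dict[str, float]) -> Dict[str, float]:
    """Assumed rule: normalise occupancy rates so the weights sum to 1.
    Falls back to uniform weights when every rate is zero."""
    if not rates:
        return {}
    total = sum(rates.values())
    if total == 0:
        return {pid: 1.0 / len(rates) for pid in rates}
    return {pid: r / total for pid, r in rates.items()}


# Made-up example: the first participant's data covers the "senior" group only.
def is_senior(sample: dict) -> bool:
    return sample["group"] == "senior"


samples_b = [{"group": "senior"}, {"group": "senior"}, {"group": "adult"}, {"group": "child"}]
samples_c = [{"group": "senior"}, {"group": "adult"}, {"group": "adult"}, {"group": "child"}]

rates = {
    "B": data_occupancy_rate(samples_b, is_senior),
    "C": data_occupancy_rate(samples_c, is_senior),
}
weights = prediction_weights(rates)
```

With this rule, a second participant whose data looks more like the first participant's data receives a larger weight, which matches the intent the claim describes.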
9. A data processing device, comprising: a memory, a processor, and a program stored on the memory for implementing the data processing method, wherein
the memory is configured to store the program for implementing the data processing method;
the processor is configured to execute the program for implementing the data processing method, so as to implement the steps of the data processing method according to any one of claims 1 to 7.
10. A medium having stored thereon a program for implementing a data processing method, the program being executed by a processor to implement the steps of the data processing method according to any one of claims 1 to 7.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010787792.7A CN111898768A (en) | 2020-08-06 | 2020-08-06 | Data processing method, device, equipment and medium |
PCT/CN2021/094938 WO2022028045A1 (en) | 2020-08-06 | 2021-05-20 | Data processing method, apparatus, and device, and medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010787792.7A CN111898768A (en) | 2020-08-06 | 2020-08-06 | Data processing method, device, equipment and medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111898768A (en) | 2020-11-06 |
Family
ID=73246120
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010787792.7A Pending CN111898768A (en) | 2020-08-06 | 2020-08-06 | Data processing method, device, equipment and medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN111898768A (en) |
WO (1) | WO2022028045A1 (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112686388A (en) * | 2020-12-10 | 2021-04-20 | 广州广电运通金融电子股份有限公司 | Data set partitioning method and system under federated learning scene |
CN112750038A (en) * | 2021-01-14 | 2021-05-04 | 中国工商银行股份有限公司 | Transaction risk determination method and device and server |
CN113158223A (en) * | 2021-01-27 | 2021-07-23 | 深圳前海微众银行股份有限公司 | Data processing method, device, equipment and medium based on state transition kernel optimization |
WO2022028045A1 (en) * | 2020-08-06 | 2022-02-10 | 深圳前海微众银行股份有限公司 | Data processing method, apparatus, and device, and medium |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114334158B (en) * | 2022-03-07 | 2022-06-21 | 广州帝隆科技股份有限公司 | Monitoring management method and system based on Internet of things |
CN114548429B (en) * | 2022-04-27 | 2022-08-12 | 蓝象智联(杭州)科技有限公司 | Safe and efficient transverse federated neural network model training method |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180349788A1 (en) * | 2017-05-30 | 2018-12-06 | Adobe Systems Incorporated | Introspection network for training neural networks |
CN109284313A (en) * | 2018-08-10 | 2019-01-29 | 深圳前海微众银行股份有限公司 | Federal modeling method, equipment and readable storage medium storing program for executing based on semi-supervised learning |
CN110490335A (en) * | 2019-08-07 | 2019-11-22 | 深圳前海微众银行股份有限公司 | A kind of method and device calculating participant's contribution rate |
US20200027019A1 (en) * | 2019-08-15 | 2020-01-23 | Lg Electronics Inc. | Method and apparatus for learning a model to generate poi data using federated learning |
WO2020029590A1 (en) * | 2018-08-10 | 2020-02-13 | 深圳前海微众银行股份有限公司 | Sample prediction method and device based on federated training, and storage medium |
CN111091200A (en) * | 2019-12-20 | 2020-05-01 | 深圳前海微众银行股份有限公司 | Updating method, system, agent, server and storage medium of training model |
CN111340614A (en) * | 2020-02-28 | 2020-06-26 | 深圳前海微众银行股份有限公司 | Sample sampling method and device based on federal learning and readable storage medium |
CN111355739A (en) * | 2020-03-06 | 2020-06-30 | 深圳前海微众银行股份有限公司 | Data transmission method, device, terminal equipment and medium for horizontal federal learning |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111898768A (en) * | 2020-08-06 | 2020-11-06 | 深圳前海微众银行股份有限公司 | Data processing method, device, equipment and medium |
2020-08-06: CN application CN202010787792.7A (publication CN111898768A), status: Pending
2021-05-20: WO application PCT/CN2021/094938 (publication WO2022028045A1), status: Application Filing
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2022028045A1 (en) * | 2020-08-06 | 2022-02-10 | 深圳前海微众银行股份有限公司 | Data processing method, apparatus, and device, and medium |
CN112686388A (en) * | 2020-12-10 | 2021-04-20 | 广州广电运通金融电子股份有限公司 | Data set partitioning method and system under federated learning scene |
CN112750038A (en) * | 2021-01-14 | 2021-05-04 | 中国工商银行股份有限公司 | Transaction risk determination method and device and server |
CN112750038B (en) * | 2021-01-14 | 2024-02-02 | 中国工商银行股份有限公司 | Transaction risk determination method, device and server |
CN113158223A (en) * | 2021-01-27 | 2021-07-23 | 深圳前海微众银行股份有限公司 | Data processing method, device, equipment and medium based on state transition kernel optimization |
Also Published As
Publication number | Publication date |
---|---|
WO2022028045A1 (en) | 2022-02-10 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN111898768A (en) | Data processing method, device, equipment and medium | |
US20220327807A1 (en) | Continually Learning Audio Feedback Engine | |
CN111275207B (en) | Semi-supervision-based transverse federal learning optimization method, equipment and storage medium | |
CN111126574B (en) | Method, device and storage medium for training machine learning model based on endoscopic image | |
WO2021083276A1 (en) | Method, device, and apparatus for combining horizontal federation and vertical federation, and medium | |
Pérez et al. | Group decision making problems in a linguistic and dynamic context | |
CN108197652B (en) | Method and apparatus for generating information | |
CN110458217B (en) | Image recognition method and device, fundus image recognition method and electronic equipment | |
CN109564575A (en) | Classifying images using machine learning models | |
CN112700010B (en) | Feature completion method, device, equipment and storage medium based on federal learning | |
CN109313490A (en) | Eye gaze tracking using neural networks | |
US20170091652A1 (en) | Regularized model adaptation for in-session recommendations | |
KR102056806B1 (en) | Terminal and server providing a video call service | |
CN113095512A (en) | Federal learning modeling optimization method, apparatus, medium, and computer program product | |
CN110069715A (en) | A kind of method of information recommendation model training, the method and device of information recommendation | |
WO2019062405A1 (en) | Application program processing method and apparatus, storage medium, and electronic device | |
WO2021258882A1 (en) | Recurrent neural network-based data processing method, apparatus, and device, and medium | |
CN111126347B (en) | Human eye state identification method, device, terminal and readable storage medium | |
CN107273979A (en) | The method and system of machine learning prediction are performed based on service class | |
CN111428884A (en) | Federal modeling method, device and readable storage medium based on forward law | |
CN111428883A (en) | Federal modeling method, device and readable storage medium based on backward law | |
WO2021139483A1 (en) | Forward model selection method and device, and readable storage medium | |
CN112785144A (en) | Model construction method, device and storage medium based on federal learning | |
CN110781976A (en) | Extension method of training image, training method and related device | |
CN111815169A (en) | Business approval parameter configuration method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||