CN109165683A - Sample prediction method, apparatus and storage medium based on federated training - Google Patents
Sample prediction method, apparatus and storage medium based on federated training
- Publication number: CN109165683A (application number CN201810913869.3A)
- Authority
- CN
- China
- Prior art keywords
- sample
- training
- node
- split
- tree
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/243—Classification techniques relating to the number of classes
- G06F18/24323—Tree-organised classifiers
Abstract
The invention discloses a sample prediction method based on federated training, comprising the following steps: performing federated training on two aligned training samples using the XGBoost algorithm to construct a gradient boosting tree model, wherein the gradient boosting tree model comprises multiple regression trees, and each split node of a regression tree corresponds to one feature of the training samples; and performing joint prediction on a sample to be predicted based on the gradient boosting tree model, to determine the sample class of the sample to be predicted or to obtain its prediction score. The invention also discloses a sample prediction apparatus based on federated training and a computer-readable storage medium. The invention realizes federated training and modeling with training samples from different data parties, and then sample prediction based on the established model.
Description
Technical field
The present invention relates to the field of machine learning, and in particular to a sample prediction method, apparatus and computer-readable storage medium based on federated training.
Background technique
In the current information age, certain human behaviors, such as consumer behavior, can be represented by data. Big data analysis has grown out of this: behavior analysis models are built through machine learning, and people's behavior can then be classified, or predictions can be made from a user's behavioral features.
In existing machine learning practice, sample data is usually trained by a single party on its own, that is, single-party modeling. Based on the established mathematical model, the relatively important features in the sample feature set can be determined. However, many big data analysis scenarios span multiple domains. For example, a user has both consumption behavior and borrowing behavior: the consumption data is generated at a consumer service provider, while the borrowing data is generated at a financial service provider. If the financial service provider needs to predict a user's borrowing behavior from the user's consumption features, it must use the consumer service provider's consumption data together with its own borrowing data for machine learning in order to build the prediction model.
Therefore, for the above application scenarios, a new modeling approach is needed that jointly trains on the sample data of different data providers, so that both parties can participate in modeling together.
Summary of the invention
The main purpose of the present invention is to provide a sample prediction method, apparatus and computer-readable storage medium based on federated training, aiming to solve the technical problem that the prior art cannot jointly train on the sample data of different data providers, and therefore cannot let both parties participate in modeling and sample prediction together.
To achieve the above object, the present invention provides a sample prediction method based on federated training, comprising the following steps:
performing federated training on two aligned training samples using the XGBoost algorithm to construct a gradient boosting tree model, wherein the gradient boosting tree model comprises multiple regression trees, and each split node of a regression tree corresponds to one feature of the training samples;
performing joint prediction on a sample to be predicted based on the gradient boosting tree model, to determine the sample class of the sample to be predicted or to obtain its prediction score.
Optionally, the sample prediction method based on federated training includes:
before federated training, interacting and encrypting the IDs of the sample data using blind signatures and the RSA encryption algorithm;
identifying the intersection of the two parties' samples by comparing the two parties' encrypted ID strings, and taking that intersection as the aligned training samples.
Optionally, the two aligned training samples are a first training sample and a second training sample respectively.
The attributes of the first training sample include a sample ID and part of the sample features; the attributes of the second training sample include the sample ID, another part of the sample features, and the data label.
The first training sample is provided by a first data party and stored locally at the first data party; the second training sample is provided by a second data party and stored locally at the second data party.
Optionally, performing federated training on the two aligned training samples using the XGBoost algorithm to construct the gradient boosting tree model includes:
at the second data party, obtaining the first-order and second-order gradients of each training sample in the sample set corresponding to the current round of node splitting;
if the current round is the first node split in constructing a regression tree, encrypting the first-order and second-order gradients and sending them, together with the sample IDs of the sample set, to the first data party, so that the first data party computes, from the encrypted gradients, the gain of the split node under each partitioning method of its local training samples corresponding to those sample IDs;
if the current round is not the first node split in constructing the regression tree, sending the sample IDs of the sample set to the first data party, so that the first data party reuses the first-order and second-order gradients of the first split to compute the gain of the split node under each partitioning method of its local training samples corresponding to those sample IDs;
the second data party receiving and decrypting the encrypted gains of all split nodes returned by the first data party;
the second data party computing, from the first-order and second-order gradients, the gain of the split node under each partitioning method of its own local training samples corresponding to the sample IDs;
determining the globally best split node of the current round based on the gains of all split nodes computed by the two parties;
splitting the sample set of the current node according to the globally best split node of the current round, generating new nodes to construct a regression tree of the gradient boosting tree model.
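The gain that drives the split selection above is the standard XGBoost split gain. The following is a minimal single-party sketch of the computation; plain floats stand in for the homomorphically encrypted gradient sums the protocol actually exchanges, and all names are illustrative:

```python
def split_gain(g_left, h_left, g_right, h_right, lam=1.0, gamma=0.0):
    """Standard XGBoost gain of one candidate split, computed from the summed
    first-order (g) and second-order (h) gradients on each side."""
    def score(g, h):
        return g * g / (h + lam)
    return 0.5 * (score(g_left, h_left) + score(g_right, h_right)
                  - score(g_left + g_right, h_left + h_right)) - gamma

def best_local_split(features, grads, hess, lam=1.0, gamma=0.0):
    """Enumerate every (feature, threshold) partition of the local samples and
    return the one with maximal gain, i.e. minimal split loss."""
    best = (None, None, float("-inf"))  # (feature index, threshold, gain)
    G, H = sum(grads), sum(hess)
    for j in range(len(features[0])):
        for thr in sorted({row[j] for row in features}):
            gl = sum(g for row, g in zip(features, grads) if row[j] <= thr)
            hl = sum(h for row, h in zip(features, hess) if row[j] <= thr)
            gain = split_gain(gl, hl, G - gl, H - hl, lam, gamma)
            if gain > best[2]:
                best = (j, thr, gain)
    return best
```

In the encrypted case described above, the first data party would form the left/right gradient sums under additively homomorphic encryption and return the resulting gains for decryption at the second data party.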
Optionally, before the step of obtaining, at the second data party, the first-order and second-order gradients of each training sample in the sample set corresponding to the current round of node splitting, the method further includes:
when splitting a node, judging whether the current round of node splitting belongs to constructing the first regression tree;
if it belongs to the first regression tree, judging whether it is the first node split of that tree;
if it is the first split of the first regression tree, initializing, at the second data party, the first-order and second-order gradients of each training sample in the corresponding sample set; if it is a non-first split of the first regression tree, reusing the first-order and second-order gradients of the first split;
if the current round belongs to constructing a non-first regression tree, judging whether it is the first node split of that tree;
if it is the first split of a non-first regression tree, updating the first-order and second-order gradients according to the previous round of federated training; if it is a non-first split of a non-first regression tree, reusing the first-order and second-order gradients of the first split.
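The four cases above amount to a small decision procedure. A sketch, where the `state` dictionary and its `init`/`update` callbacks are hypothetical stand-ins for the second data party's bookkeeping:

```python
def gradients_for_round(tree_index, is_first_split, state):
    """Decide where the gradients for the current node split come from,
    following the four cases described above. `state` carries the gradients
    reused across the splits of one tree."""
    if is_first_split:
        if tree_index == 0:
            # First split of the first tree: initialise the gradients.
            state["g"], state["h"] = state["init"]()
        else:
            # First split of a later tree: update from the previous round.
            state["g"], state["h"] = state["update"]()
    # Any non-first split reuses the gradients of the tree's first split.
    return state["g"], state["h"]
```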
Optionally, the sample prediction method based on federated training further includes:
when a new node is generated while constructing a regression tree of the gradient boosting tree model, judging, at the second data party, whether the depth of the current regression tree has reached a preset depth threshold;
if the depth threshold is reached, stopping node splitting to obtain one regression tree of the gradient boosting tree model; otherwise, continuing with the next round of node splitting;
when node splitting stops, judging, at the second data party, whether the total number of regression trees has reached a preset number threshold;
if the number threshold is reached, stopping federated training; otherwise, continuing with the next round of federated training.
Optionally, the sample prediction method based on federated training further includes:
recording, at the second data party, information about the globally best split node determined in each round of node splitting;
wherein the recorded information includes: the provider of the corresponding sample data, the feature code of the corresponding sample data, and the gain.
Optionally, counting the average gain of the split nodes corresponding to the same feature in the gradient boosting tree model includes:
at the second data party, taking each globally best split node as a split node of the regression trees in the gradient boosting tree model, and computing the average gain of the split nodes that share the same feature code.
Optionally, performing joint prediction on the sample to be predicted based on the gradient boosting tree model, to determine its sample class or to obtain its prediction score, includes:
at the second data party, traversing the regression trees of the gradient boosting tree model;
if the attribute value of the currently traversed node is recorded at the second data party, determining the next node by comparing the data point of the local sample to be predicted with the attribute value of the current node;
if the attribute value of the currently traversed node is recorded at the first data party, sending a query request to the first data party, so that the first data party compares the data point of its local sample to be predicted with the attribute value of the current node, determines the next node, and returns the node information to the second data party;
after all regression trees of the gradient boosting tree model have been traversed, determining the sample class of the sample to be predicted from the labels of the sample data at its assigned nodes, or obtaining its prediction score from the weights of its assigned nodes.
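A sketch of the joint traversal described above, with an assumed node layout and a hypothetical `query_other_party` callback standing in for the query request to the first data party:

```python
def predict(sample_parts, trees, query_other_party):
    """Walk each regression tree; when a node's split attribute lives at the
    other party, query_other_party(node, sample_id) answers which child to
    take. Returns the summed leaf weights as the prediction score."""
    score = 0.0
    for root in trees:
        node = root
        while "weight" not in node:              # not yet at a leaf
            if node["owner"] == "B":             # attribute held locally
                go_left = (sample_parts["B"][node["feature"]]
                           <= node["threshold"])
            else:                                # attribute held by party A
                go_left = query_other_party(node, sample_parts["id"])
            node = node["left"] if go_left else node["right"]
        score += node["weight"]                  # leaf weight of this tree
    return score
```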
Further, to achieve the above object, the present invention also provides a sample prediction apparatus based on federated training, comprising a memory, a processor, and a sample prediction program stored on the memory and runnable on the processor, wherein the sample prediction program, when executed by the processor, implements the steps of the sample prediction method based on federated training described in any of the above.
Further, to achieve the above object, the present invention also provides a computer-readable storage medium storing a sample prediction program which, when executed by a processor, implements the steps of the sample prediction method based on federated training described in any of the above.
The present invention performs federated training on two aligned training samples using the XGBoost algorithm to construct a gradient boosting tree model, where the gradient boosting tree model is a set of regression trees: it contains multiple regression trees, and each split node of every regression tree corresponds to one feature of the training samples. Finally, joint prediction is performed on the sample to be predicted based on the gradient boosting tree model, to determine its sample class or obtain its prediction score. The present invention thus realizes federated training and modeling with the training samples of different data parties, and can then predict on samples whose features are held by multiple parties.
Brief description of the drawings
Fig. 1 is a schematic structural diagram of the hardware operating environment involved in an embodiment of the sample prediction apparatus based on federated training of the present invention;
Fig. 2 is a flow diagram of an embodiment of the sample prediction method based on federated training of the present invention;
Fig. 3 is a flow diagram of sample alignment in an embodiment of the sample prediction method based on federated training of the present invention;
Fig. 4 is a detailed flow diagram of an embodiment of step S10 in Fig. 2;
Fig. 5 is a schematic diagram of a training result in an embodiment of the sample prediction method based on federated training of the present invention.
The realization of the object, functional features and advantages of the present invention will be further described with reference to the accompanying drawings in conjunction with the embodiments.
Detailed description of the embodiments
It should be understood that the specific embodiments described herein are only intended to explain the present invention and are not intended to limit it.
The present invention provides a sample prediction apparatus based on federated training.
As shown in Fig. 1, Fig. 1 is a schematic structural diagram of the hardware operating environment involved in an embodiment of the sample prediction apparatus based on federated training of the present invention.
The sample prediction apparatus based on federated training of the present invention may be a PC, or a device with computing capability such as a server.
As shown in Fig. 1, the sample prediction apparatus based on federated training may include: a processor 1001 such as a CPU, a network interface 1004, a user interface 1003, a memory 1005, and a communication bus 1002, wherein the communication bus 1002 realizes the connection and communication among these components. The user interface 1003 may include a display and an input unit such as a keyboard, and optionally also a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (such as a Wi-Fi interface). The memory 1005 may be a high-speed RAM memory, or a stable non-volatile memory such as a disk memory; optionally, it may also be a storage device independent of the aforementioned processor 1001.
Those skilled in the art will understand that the structure of the sample prediction apparatus based on federated training shown in Fig. 1 does not limit the apparatus, which may include more or fewer components than illustrated, combine certain components, or arrange components differently.
As shown in Fig. 1, the memory 1005, as a computer storage medium, may include an operating system, a network communication module, a user interface module, and a sample prediction program.
In the sample prediction apparatus based on federated training shown in Fig. 1, the network interface 1004 is mainly used to connect to a background server and exchange data with it; the user interface 1003 is mainly used to connect to a client (user terminal) and exchange data with it; and the processor 1001 may be used to call the sample prediction program stored in the memory 1005 and perform the following operations:
performing federated training on two aligned training samples using the XGBoost algorithm to construct a gradient boosting tree model, wherein the gradient boosting tree model comprises multiple regression trees, and each split node of a regression tree corresponds to one feature of the training samples;
performing joint prediction on a sample to be predicted based on the gradient boosting tree model, to determine the sample class of the sample to be predicted or to obtain its prediction score.
Further, the processor 1001 calls the sample prediction program stored in the memory 1005 to also perform the following operations:
before federated training, interacting and encrypting the IDs of the sample data using blind signatures and the RSA encryption algorithm;
identifying the intersection of the two parties' samples by comparing the two parties' encrypted ID strings, and taking that intersection as the aligned training samples.
Further, the two aligned training samples are a first training sample and a second training sample respectively; the attributes of the first training sample include a sample ID and part of the sample features, and the attributes of the second training sample include the sample ID, another part of the sample features, and the data label; the first training sample is provided by a first data party and stored locally at the first data party, and the second training sample is provided by a second data party and stored locally at the second data party. The processor 1001 calls the sample prediction program stored in the memory 1005 to also perform the following operations:
at the second data party, obtaining the first-order and second-order gradients of each training sample in the sample set corresponding to the current round of node splitting;
if the current round is the first node split in constructing a regression tree, encrypting the first-order and second-order gradients and sending them, together with the sample IDs of the sample set, to the first data party, so that the first data party computes, from the encrypted gradients, the gain of the split node under each partitioning method of its local training samples corresponding to those sample IDs;
if the current round is not the first node split in constructing the regression tree, sending the sample IDs of the sample set to the first data party, so that the first data party reuses the first-order and second-order gradients of the first split to compute the gain of the split node under each partitioning method of its local training samples corresponding to those sample IDs;
the second data party receiving and decrypting the encrypted gains of all split nodes returned by the first data party;
the second data party computing, from the first-order and second-order gradients, the gain of the split node under each partitioning method of its own local training samples corresponding to the sample IDs;
determining the globally best split node of the current round based on the gains of all split nodes computed by the two parties;
splitting the sample set of the current node according to the globally best split node of the current round, generating new nodes to construct a regression tree of the gradient boosting tree model.
Further, the processor 1001 calls the sample prediction program stored in the memory 1005 to also perform the following operations:
when splitting a node, judging whether the current round of node splitting belongs to constructing the first regression tree;
if it belongs to the first regression tree, judging whether it is the first node split of that tree;
if it is the first split of the first regression tree, initializing, at the second data party, the first-order and second-order gradients of each training sample in the corresponding sample set; if it is a non-first split of the first regression tree, reusing the first-order and second-order gradients of the first split;
if the current round belongs to constructing a non-first regression tree, judging whether it is the first node split of that tree;
if it is the first split of a non-first regression tree, updating the first-order and second-order gradients according to the previous round of federated training; if it is a non-first split of a non-first regression tree, reusing the first-order and second-order gradients of the first split.
Further, the processor 1001 calls the sample prediction program stored in the memory 1005 to also perform the following operations:
at the first data party, computing, from the encrypted first-order and second-order gradients, the gain of the split node under each partitioning method of the local training samples corresponding to the sample IDs;
or, at the first data party, reusing the first-order and second-order gradients of the first split to compute the gain of the split node under each partitioning method of the local training samples corresponding to the sample IDs;
encrypting the gains of all split nodes and sending them to the second data party.
Further, the processor 1001 calls the sample prediction program stored in the memory 1005 to also perform the following operations:
when a new node is generated while constructing a regression tree of the gradient boosting tree model, judging, at the second data party, whether the depth of the current regression tree has reached a preset depth threshold;
if the depth threshold is reached, stopping node splitting to obtain one regression tree of the gradient boosting tree model; otherwise, continuing with the next round of node splitting;
when node splitting stops, judging, at the second data party, whether the total number of regression trees has reached a preset number threshold;
if the number threshold is reached, stopping federated training; otherwise, continuing with the next round of federated training.
Further, the processor 1001 calls the sample prediction program stored in the memory 1005 to also perform the following operations:
recording, at the second data party, information about the globally best split node determined in each round of node splitting;
wherein the recorded information includes: the provider of the corresponding sample data, the feature code of the corresponding sample data, and the gain.
Further, the processor 1001 calls the sample prediction program stored in the memory 1005 to also perform the following operations:
at the second data party, traversing the regression trees of the gradient boosting tree model;
if the attribute value of the currently traversed node is recorded at the second data party, determining the next node by comparing the data point of the local sample to be predicted with the attribute value of the current node;
if the attribute value of the currently traversed node is recorded at the first data party, sending a query request to the first data party, so that the first data party compares the data point of its local sample to be predicted with the attribute value of the current node, determines the next node, and returns the node information to the second data party;
after all regression trees of the gradient boosting tree model have been traversed, determining the sample class of the sample to be predicted from the labels of the sample data at its assigned nodes, or obtaining its prediction score from the weights of its assigned nodes.
Based on the hardware operating environment involved in the above apparatus embodiment, the following embodiments of the sample prediction method based on federated training of the present invention are proposed.
Referring to Fig. 2, Fig. 2 is a flow diagram of an embodiment of the sample prediction method based on federated training of the present invention. In this embodiment, the sample prediction method based on federated training comprises the following steps:
Step S10: performing federated training on two aligned training samples using the XGBoost algorithm to construct a gradient boosting tree model, wherein the gradient boosting tree model comprises multiple regression trees, and each split node of a regression tree corresponds to one feature of the training samples.
The XGBoost (eXtreme Gradient Boosting) algorithm is an improvement of the boosting approach of the GBDT (Gradient Boosting Decision Tree) algorithm. Its internal decision trees are regression trees, and the output of the algorithm is a set of regression trees, i.e. multiple regression trees. The basic idea of training is to traverse all partitioning methods of all features of the training samples (i.e. all possible node splits), select the partitioning method with minimal loss, obtain two leaves (i.e. split the node and generate new nodes), and then continue traversing until:
(1) the stop-splitting condition is met, in which case one regression tree is output;
(2) the stop-iterating condition is met, in which case the set of regression trees is output.
In this embodiment, the training samples used by the XGBoost algorithm are two independent training samples, i.e. each training sample belongs to a different data party. If the two training samples are viewed as one whole training sample, then, since they belong to different data parties, the whole training sample can be regarded as cut apart, each part holding different features of the same samples (the samples are vertically partitioned).
Further, since the two training samples belong to different data parties, sample alignment must be performed on the raw sample data provided by both parties in order to realize federated training and modeling.
In this embodiment, federated training means that the sample training process is completed jointly by the two data parties, and the split nodes of the regression trees contained in the finally trained gradient boosting tree model correspond to features of both parties' training samples.
In the XGBoost algorithm, when traversing all partitioning methods of all features of the training samples, the quality of a partitioning method is evaluated by its gain, and each split node selects the partitioning method with minimal loss. The gain of a split node can therefore serve as the basis for evaluating feature importance: the larger the gain of a split node, the smaller the split loss, and hence the more important the feature corresponding to that split node.
In this embodiment, since the trained gradient boosting tree model contains multiple regression trees, and different regression trees may split nodes on the same feature, the average gain of the split nodes corresponding to the same feature across all regression trees of the model is computed, and this average gain is used as the score of the corresponding feature.
Step S20: performing joint prediction on a sample to be predicted based on the gradient boosting tree model, to determine its sample class or to obtain its prediction score.
In this embodiment, the gradient boosting tree model obtained by XGBoost training enables joint prediction on the sample to be predicted, thereby classifying or scoring it.
This embodiment performs federated training on two aligned training samples using the XGBoost algorithm to construct a gradient boosting tree model, where the model is a set of regression trees: it contains multiple regression trees, and each split node of every regression tree corresponds to one feature of the training samples. Finally, joint prediction is performed on the sample to be predicted based on the gradient boosting tree model, to determine its sample class or obtain its prediction score. The present invention thus realizes federated training and modeling with the training samples of different data parties, and can then predict on samples whose features are held by multiple parties.
Further, to guarantee that the sample gradients used by the different data parties in the federated modeling process are consistent, the two data parties first align their samples before federated modeling; the specific process is shown in Fig. 3.
Sample alignment interacts and encrypts the sample IDs using blind signatures and the RSA encryption algorithm. By comparing the encrypted ID strings, the intersection and non-intersection parts of the two parties' samples are identified (the non-intersection parts remain invisible to each other), realizing privacy protection of the non-intersecting sample data; this is why the present invention encrypts the sample data during sample alignment.
Assume the sample IDs of data party A are X_A: {u1, u2, u3, u4} and the sample IDs of data party B are X_B: {u1, u2, u3, u5}; the blind signature of a value x is E(x); party B generates the RSA key (n, e, d), and party A obtains the public key (n, e). The following example procedure is then carried out:
(1) Party A encrypts its IDs: Y_A = {(r^e mod n) · E(u) | u ∈ X_A}, where r is a different random number generated for each distinct sample ID in X_A; party A then sends Y_A to party B.
(2) Party B encrypts the ID strings again: Z_A = {y^d mod n | y ∈ Y_A}; party B sends the doubly encrypted strings Z_A back to party A.
(3) Party A removes the blinding factor from each element of Z_A and hashes the result: D_A = {E(z · r^(-1) mod n) | z ∈ Z_A} = {E(E(u)^d mod n) | u ∈ X_A}.
(4) Party B encrypts its own IDs: Z_B = {E(E(u)^d mod n) | u ∈ X_B}, and sends Z_B to party A.
(5) Party A compares D_A with Z_B. If two encrypted strings are equal, the corresponding IDs are equal; the equal IDs form the intersection part of the samples ({u1, u2, u3}) and are retained. The unequal part ({u4, u5}) is only ever seen in encrypted form, so it remains invisible to both parties and can be discarded.
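The five-step exchange above can be sketched end to end. The following Python sketch models the blind signature E(x) as a full-domain SHA-256 hash into Z_n and uses deliberately small, insecure RSA parameters for illustration; the helper names and key sizes are illustrative and not part of the embodiment:

```python
import hashlib
import secrets
from math import gcd

# Toy RSA parameters for illustration only; real deployments use >= 2048-bit keys.
p, q = 1000003, 1000033
n = p * q
phi = (p - 1) * (q - 1)
e = 65537
d = pow(e, -1, phi)  # B's private exponent (Python 3.8+ modular inverse)

def E(x: str) -> int:
    """Full-domain hash standing in for the blind signature E(x): map an ID into Z_n."""
    return int.from_bytes(hashlib.sha256(x.encode()).digest(), "big") % n

def psi(ids_a, ids_b):
    """Party A holds ids_a and (n, e); party B holds ids_b and (n, e, d)."""
    # (1) A blinds each hashed ID with a fresh random r: y = (r^e mod n) * E(u) mod n
    blinded = []
    for u in ids_a:
        while True:
            r = secrets.randbelow(n - 2) + 2
            if gcd(r, n) == 1:
                break
        blinded.append((r, pow(r, e, n) * E(u) % n))
    # (2) B signs each blinded value: z = y^d = r * E(u)^d mod n
    signed = [(r, pow(y, d, n)) for r, y in blinded]
    # (3) A unblinds (divides out r) and hashes: D_A = E(E(u)^d mod n)
    d_a = [hashlib.sha256(str(z * pow(r, -1, n) % n).encode()).hexdigest()
           for r, z in signed]
    # (4) B signs and hashes its own IDs directly: Z_B = E(E(u)^d mod n)
    z_b = {hashlib.sha256(str(pow(E(u), d, n)).encode()).hexdigest(): u
           for u in ids_b}
    # (5) equal digests identify the intersection; everything else stays hidden
    return sorted(z_b[t] for t in d_a if t in z_b)

print(psi(["u1", "u2", "u3", "u4"], ["u1", "u2", "u3", "u5"]))  # ['u1', 'u2', 'u3']
```

The blinding factor r hides A's IDs from B in step (1), and B's private exponent d means A cannot forge signatures for IDs outside its own set, which is what keeps the non-intersecting parts mutually invisible.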
Further, for ease of describing the specific implementation of the joint training of the present invention, this embodiment is illustrated with two independent training samples.
In this embodiment, the first data party provides the first training sample, whose attributes include the sample ID and part of the sample features; the second data party provides the second training sample, whose attributes include the sample ID, the remaining sample features, and the data label.
Here, a sample feature is a feature that a sample exhibits or possesses; for example, if the samples are people, the corresponding sample features may be age, gender, income, education, and so on. The data label is used to classify the different samples; the classification result is determined from the features of each sample.
A major purpose of the federated training and modeling of the present invention is to achieve two-way privacy protection of both parties' sample data. Therefore, during federated training, the first training sample is stored locally at the first data party, and the second training sample is stored locally at the second data party. For example, the data in Table 1 below is provided by and stored locally at the first data party, and the data in Table 2 is provided by and stored locally at the second data party.
Table 1
As shown in Table 1, the attributes of the first training sample include the sample ID (X1–X5) and the Age, Gender, and Amount of given credit features.
Table 2
| Sample ID | Bill Payment | Education | Label |
| X1        | 3102         | 2         | 24    |
| X2        | 17250        | 3         | 14    |
| X3        | 14027        | 2         | 16    |
| X4        | 6787         | 1         | 10    |
| X5        | 280          | 1         | 26    |
As shown in Table 2 above, the attributes of the second training sample include the sample ID (X1–X5), the Bill Payment and Education features, and the data label (Label).
Further, referring to Fig. 4, Fig. 4 is a detailed flow diagram of one embodiment of step S10 in Fig. 2. Based on the above embodiment, in this embodiment, step S10 specifically includes:
Step S101: at the second data party, obtain the first-order gradient and the second-order gradient of each training sample in the sample set corresponding to the current round of node splitting;
XGBoost is a machine learning modeling method. A classifier (that is, a classification function) is needed to map sample data to one of a set of given categories, so that the model can be applied to data prediction. While the classifier is learning the classification rules, a loss function is used to judge the magnitude of the fitting error of the machine learning model.
In this embodiment, each time a node split is performed, the first-order gradient and the second-order gradient of each training sample in the sample set corresponding to the current round of node splitting are obtained at the second data party.
Note that training the gradient boosting tree model requires multiple rounds of federated training; each round of federated training generates one regression tree, and generating one regression tree requires multiple node splits.
Therefore, within each round of federated training, the first node split uses the initial set of training samples, while each subsequent node split uses the training samples in the sample sets of the new nodes produced by the previous split. Within the same round of federated training, every node split reuses the first-order and second-order gradients used by the first node split of that round. The next round of federated training then uses the result of the previous round to update the first-order and second-order gradients used in that previous round.
XGBoost supports custom loss functions: the first-order and second-order partial derivatives of the objective function are taken with respect to the custom loss, which yields the corresponding first-order and second-order gradients of the local sample data to be trained.
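As a minimal sketch of this gradient step, the following assumes a squared-error loss l(y, ŷ) = (ŷ − y)²/2, for which the first derivative is ŷ − y and the second derivative is 1; the function name and loss choice are illustrative, not taken from the embodiment:

```python
def first_second_order_gradients(y_true, y_pred):
    """Per-sample first-order (g_i) and second-order (h_i) gradients of the loss
    l(y, yhat) = (yhat - y)^2 / 2 with respect to the current prediction."""
    g = [yhat - y for y, yhat in zip(y_true, y_pred)]  # dl/dyhat
    h = [1.0 for _ in y_true]                          # d2l/dyhat2
    return g, h
```

For a logistic loss, g and h would instead be p − y and p(1 − p) with p the predicted probability; the protocol is unchanged because only the numeric g_i and h_i values are exchanged.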
Therefore, following the description of the XGBoost algorithm and the gradient boosting tree model in the above embodiment, constructing a regression tree requires determining split nodes, and a split node can be determined by its gain value. The gain value is computed as

gain = 1/2 · [ (Σ_{i∈I_L} g_i)² / (Σ_{i∈I_L} h_i + λ) + (Σ_{i∈I_R} g_i)² / (Σ_{i∈I_R} h_i + λ) − (Σ_{i∈I} g_i)² / (Σ_{i∈I} h_i + λ) ] − γ

where I_L is the sample set contained in the left child node after the current node is split, I_R is the sample set contained in the right child node, I = I_L ∪ I_R is the sample set of the current node, g_i is the first-order gradient of sample i, h_i is the second-order gradient of sample i, and λ and γ are constants.
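The gain formula above can be sketched directly in code; the function and parameter names below are illustrative:

```python
def split_gain(g, h, left_idx, lam=1.0, gamma=0.0):
    """XGBoost split gain for dividing sample set I into I_L (indices in left_idx)
    and I_R (the remaining indices). g, h: per-sample first- and second-order
    gradients; lam (lambda) and gamma are the regularization constants."""
    def score(idx):
        G = sum(g[i] for i in idx)  # sum of first-order gradients over the set
        H = sum(h[i] for i in idx)  # sum of second-order gradients over the set
        return G * G / (H + lam)
    right_idx = [i for i in range(len(g)) if i not in left_idx]
    return 0.5 * (score(left_idx) + score(right_idx) - score(range(len(g)))) - gamma
```

Each party evaluates this quantity for every candidate partitioning mode of its own features; only the resulting gain values need to cross the party boundary.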
Since the sample data to be trained resides partly at the first data party and partly at the second data party, the gain values of the candidate split nodes under every partitioning mode must be computed separately at the first data party and at the second data party for their respective sample data.
In this embodiment, because the first data party and the second data party have aligned their samples in advance, both parties share the same gradient characteristics; and because the data labels reside in the sample data of the second data party, the gain values of both parties' candidate split nodes under every partitioning mode are computed based on the first-order and second-order gradients of the second data party's sample data.
Step S102: if the current round of node splitting is the first node split in constructing a regression tree, encrypt the first-order gradient and the second-order gradient and send them, together with the sample IDs of the sample set, to the first data party, so that the first data party, based on the encrypted first-order and second-order gradients, computes the gain values of the split nodes of its local training samples corresponding to the sample IDs under every partitioning mode;
In this embodiment, to achieve two-way privacy protection of both parties' sample data during federated training, if the current round of node splitting is the first node split in constructing a regression tree, the second data party first encrypts the first-order and second-order gradients computed from its sample data and then sends them to the first data party.
At the first data party, the gain values of the split nodes of the first data party's local sample data under every partitioning mode are computed from the received first-order and second-order gradients using the gain formula above. Since the first-order and second-order gradients are encrypted, the computed gain values are themselves ciphertext values, so there is no need to encrypt the gain values again.
After the gain values of the split nodes under the various partitioning modes of the sample data have been computed, the node can be split to generate new nodes and thus construct the regression tree. In this embodiment, the construction of the regression trees of the gradient boosting tree model is preferably led by the second data party, which holds the data labels. Therefore, the first data party must send the gain values it computed for its local sample data's split nodes under every partitioning mode to the second data party.
Step S103: if the current round of node splitting is not the first node split in constructing the regression tree, send the sample IDs of the sample set to the first data party, so that the first data party, reusing the first-order and second-order gradients used in the first node split, computes the gain values of the split nodes of its local training samples corresponding to the sample IDs under every partitioning mode;
In this embodiment, if the current round of node splitting is not the first node split in constructing the regression tree, only the sample IDs of the sample set corresponding to the current split need to be sent to the first data party; the first data party then reuses the first-order and second-order gradients used in the first node split to compute the gain values of the split nodes of its local training samples corresponding to the received sample IDs under every partitioning mode.
Step S104: the second data party receives the encrypted gain values of all split nodes returned by the first data party and decrypts them;
Step S105: at the second data party, based on the first-order gradient and the second-order gradient, compute the gain values of the split nodes of the local training samples corresponding to the sample IDs under every partitioning mode;
At the second data party, the gain values of the split nodes of the second data party's local sample data to be trained under every partitioning mode are computed from the computed first-order and second-order gradients using the gain formula above.
Step S106: based on the gain values of all split nodes computed by the two parties, determine the globally best split node of the current round of node splitting;
Since the initial sample data of both parties has been aligned, the gain values of all split nodes computed by the two parties can be regarded as the gain values, under every partitioning mode, of the split nodes of the two parties' combined data samples. Therefore, by comparing the gain values, the split node with the largest gain value is taken as the globally best split node of the current round of node splitting.
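The comparison in step S106 amounts to a simple arg-max over the two parties' decrypted candidate lists; the dictionary layout below is an illustrative assumption, not the embodiment's wire format:

```python
def global_best_split(local_gains, remote_gains):
    """local_gains: party B's candidates; remote_gains: the decrypted candidates
    returned by party A. Each maps (feature_code, threshold) -> gain value.
    Returns the owner, feature, threshold, and gain of the globally best split."""
    candidates = [("B",) + item for item in local_gains.items()] + \
                 [("A",) + item for item in remote_gains.items()]
    site, (feature, thr), gain = max(candidates, key=lambda c: c[-1])
    return site, feature, thr, gain
```

Note the winner may come from either party, which is why the record kept by party B must include the provider of the feature.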
It should be noted that the sample feature corresponding to the globally best split node may belong either to the training sample of the first data party or to the training sample of the second data party.
Optionally, since the construction of the regression trees of the gradient boosting tree model is led by the second data party, the second data party needs to record, for each round of node splitting, the relevant information of the determined globally best split node. This information includes the provider of the corresponding sample data, and the feature code and gain value of the corresponding sample data.
For example, if data party A holds the feature f_i corresponding to the globally best split node, the record is (Site A, E_A(f_i), gain). Conversely, if data party B holds the feature f_i corresponding to the globally best split node, the record is (Site B, E_B(f_i), gain). Here, E_A(f_i) denotes data party A's encoding of feature f_i, and E_B(f_i) denotes data party B's encoding of feature f_i; the encoding allows feature f_i to be referred to without revealing its original feature data.
Optionally, when feature selection is performed in the above embodiment, each globally best split node is preferably taken as a split node of a regression tree in the gradient boosting tree model, and the average gain value of the split nodes corresponding to the same feature code is computed.
Step S107: based on the globally best split node of the current round of node splitting, split the sample set corresponding to the current node to generate new nodes, thereby constructing a regression tree of the gradient boosting tree model.
If the sample feature corresponding to the globally best split node of the current round belongs to the training sample of the first data party, the sample data of the current node being split belongs to the first data party. Correspondingly, if the sample feature corresponding to the globally best split node of the current round belongs to the training sample of the second data party, the sample data of the current node being split belongs to the second data party.
A node split produces new nodes (a left child node and a right child node), which builds up the regression tree. Through multiple rounds of node splitting, new nodes are generated continuously, yielding regression trees of ever greater depth; when node splitting stops, one regression tree of the gradient boosting tree model is obtained.
In this embodiment, since all data communicated between the two parties consists of encrypted intermediate model results, the training process does not leak the original feature data, and an encryption algorithm guarantees data privacy throughout training. Preferably, a partially homomorphic encryption algorithm supporting additive homomorphism is used.
Further, in one embodiment, depending on the node-splitting condition, the first-order and second-order gradients of the training samples used for node splitting are obtained as follows:
1. The current node split belongs to the construction of the first regression tree
1.1. If the current node split is the first node split in constructing the first regression tree, initialize, at the second data party, the first-order and second-order gradients of each training sample in the sample set corresponding to the current split;
1.2. If the current node split is not the first node split in constructing the first regression tree, reuse the first-order and second-order gradients used in the first node split.
2. The current node split belongs to the construction of a non-first regression tree
2.1. If the current node split is the first node split in constructing a non-first regression tree, update the first-order and second-order gradients according to the previous round of federated training;
2.2. If the current node split is not the first node split in constructing a non-first regression tree, reuse the first-order and second-order gradients used in the first node split.
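The four cases above collapse to one rule: (re)compute gradients at the first split of each tree, reuse the cached values otherwise. A minimal sketch, again assuming a squared-error loss and illustrative names:

```python
def gradients_for_split(is_first_split_of_tree, y, preds, cached):
    """First split of any regression tree: (re)compute g and h from the current
    predictions. For the first tree, preds is the initial prediction; for later
    trees, preds already reflects the previous round of federated training, which
    is what 'updating according to the previous round' means here. Non-first
    splits of the same tree simply reuse the cached (g, h)."""
    if is_first_split_of_tree:
        g = [p - t for t, p in zip(y, preds)]  # illustrative squared-error loss
        h = [1.0] * len(y)
        return g, h
    return cached
```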
Further, in one embodiment, to reduce the complexity of the regression trees, a depth threshold for regression trees is preset to limit node splitting.
In this embodiment, each time new nodes are generated while constructing a regression tree of the gradient boosting tree model, the second data party judges whether the depth of the current regression tree has reached the preset depth threshold. If it has, node splitting stops and one regression tree of the gradient boosting tree model is obtained; otherwise, the next round of node splitting continues.
It should be noted that node splitting may also be limited by stopping when a node cannot be split further, for example when the sample set corresponding to the current node can no longer be divided.
Further, in another embodiment, to avoid overfitting during training, a quantity threshold for regression trees is preset to limit the number of regression trees generated.
In this embodiment, when node splitting stops, the second data party judges whether the total number of regression trees has reached the preset quantity threshold. If it has, federated training stops; otherwise, the next round of federated training continues.
It should be noted that the number of regression trees generated may also be limited by stopping the construction of regression trees when nodes can no longer be split.
For a better understanding of the invention, the federated training and modeling process of the invention is illustrated below based on the sample data in Tables 1 and 2 of the above embodiment.
First round of federated training: training the first regression tree
(1) First round of node splitting
1.1. At the second data party, compute the first-order gradients (g_i) and second-order gradients (h_i) of the sample data in Table 2; encrypt g_i and h_i and send them to the first data party;
1.2. At the first data party, based on g_i and h_i, compute the gain values of the split nodes under all possible partitioning modes of the sample data in Table 1; send the gain values to the second data party;
Since in Table 1 the Age feature admits 5 partitioning modes of the sample data, the Gender feature admits 2, and the Amount of given credit feature admits 5, the sample data in Table 1 admits 12 partitioning modes in total; that is, the gain values of the split nodes corresponding to 12 partitioning modes must be computed.
1.3. At the second data party, compute the gain values of the split nodes under all possible partitioning modes of the sample data in Table 2;
Since in Table 2 the Bill Payment feature admits 5 partitioning modes of the sample data and the Education feature admits 3, the sample data in Table 2 admits 8 partitioning modes in total; that is, the gain values of the split nodes corresponding to 8 partitioning modes must be computed.
1.4. From the gain values of the split nodes corresponding to the 12 partitioning modes computed at the first data party and the gain values of the split nodes corresponding to the 8 partitioning modes computed at the second data party, select the feature with the maximum gain value as the globally best split node of the current round of node splitting;
1.5. Based on the globally best split node of the current round, split the sample data corresponding to the current node and generate new nodes, so as to construct a regression tree of the gradient boosting tree model.
1.6. Judge whether the depth of the current regression tree has reached the preset depth threshold; if it has, stop node splitting and obtain one regression tree of the gradient boosting tree model; otherwise, continue with the next round of node splitting;
1.7. Judge whether the total number of regression trees has reached the preset quantity threshold; if it has, stop federated training; otherwise, enter the next round of federated training.
(2) Second and third rounds of node splitting
2.1. Suppose the feature selected in the previous round of node splitting is Bill Payment ≤ 3102; this feature then serves as the split node (its corresponding samples are X1, X2, X3, X4, X5) and generates two new child nodes: the left node corresponds to the sample set with values less than or equal to 3102 (X1, X5), and the right node to the sample set with values greater than 3102 (X2, X3, X4). The sample sets (X1, X5) and (X2, X3, X4) are then taken as the new sample sets for the second and third rounds of node splitting respectively, so that the two new nodes are each split further to generate new nodes.
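The division in step 2.1 is a plain threshold partition of the node's sample set; sketched with an illustrative dict layout and the Bill Payment values from Table 2:

```python
def split_samples(samples, feature, thr):
    """Split a node's sample set on `feature <= thr` into left and right children."""
    left = {sid: s for sid, s in samples.items() if s[feature] <= thr}
    right = {sid: s for sid, s in samples.items() if s[feature] > thr}
    return left, right

table2 = {"X1": {"Bill Payment": 3102}, "X2": {"Bill Payment": 17250},
          "X3": {"Bill Payment": 14027}, "X4": {"Bill Payment": 6787},
          "X5": {"Bill Payment": 280}}
left, right = split_samples(table2, "Bill Payment", 3102)
print(sorted(left), sorted(right))  # ['X1', 'X5'] ['X2', 'X3', 'X4']
```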
2.2. Since the second and third rounds of node splitting belong to the same round of federated training, the sample gradient values used in the first round of node splitting are reused. Suppose the feature corresponding to one split node of this round is Amount of given credit ≤ 200; this feature serves as the split node (its corresponding samples are X1, X5) and generates two new child nodes, where the left node corresponds to sample X5 (≤ 200) and the right node to sample X1 (> 200). Similarly, the feature corresponding to the other split node of this round is Age ≤ 35; this feature serves as the split node (its corresponding samples are X2, X3, X4) and generates two new child nodes, where the left node corresponds to samples X2, X3 (≤ 35) and the right node to sample X4 (> 35). The specific implementation flow follows the first round of node splitting.
Second round of federated training: training the second regression tree
3.1. Since this round of node splitting belongs to the next round of federated training, the first-order and second-order gradients used in the previous round are updated with the result of the previous round of federated training; the second round of federated training then proceeds with node splitting to generate new nodes and construct the next regression tree. The specific implementation flow follows the construction of the previous regression tree.
3.2. As shown in Fig. 5, after two rounds of federated training, the sample data in Tables 1 and 2 of the above embodiment produces two regression trees. The first regression tree contains three split nodes: Bill Payment ≤ 3102, Amount of given credit ≤ 200, and Age ≤ 35. The second regression tree contains two split nodes: Bill Payment ≤ 6787 and Gender == 1.
3.3. Based on the two regression trees of the gradient boosting tree model shown in Fig. 5, the average gain values corresponding to the features of the sample data are: Bill Payment, (gain1 + gain4)/2; Education, 0; Age, gain3; Gender, gain5; Amount of given credit, gain2.
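The per-feature averaging in step 3.3 can be sketched as follows; the record format (feature code, gain) mirrors the split-node records kept by the second data party, with illustrative names:

```python
from collections import defaultdict

def feature_scores(split_records):
    """split_records: (feature_code, gain) pairs gathered from every split node
    of every regression tree; a feature's score is the mean gain of its splits."""
    total, count = defaultdict(float), defaultdict(int)
    for feature, gain in split_records:
        total[feature] += gain
        count[feature] += 1
    return {f: total[f] / count[f] for f in total}
```

With the Fig. 5 trees, Bill Payment appears in two trees and so receives (gain1 + gain4)/2, while a feature never used for a split (Education) simply has no entry, i.e. a score of 0.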
Further, in one embodiment of the sample prediction method based on federated training of the present invention, the specific implementation flow of joint prediction on a sample to be predicted includes:
(1) At the second data party, traverse the regression trees of the gradient boosting tree model;
(2) If the attribute value of the node currently being traversed is recorded at the second data party, determine the next node to traverse by comparing the data point of the local sample to be predicted with the attribute value of the current node;
(3) If the attribute value of the node currently being traversed is recorded at the first data party, initiate a query request to the first data party, so that the first data party determines the next node to traverse by comparing the data point of its local sample to be predicted with the attribute value of the current node, and returns that node information to the second data party;
(4) When the traversal of the regression trees of the gradient boosting tree model is complete, determine the sample category of the sample to be predicted based on the sample data labels corresponding to the node the sample belongs to, or obtain the prediction score of the sample to be predicted based on the weight value of that node.
In this embodiment, since the split-node records of the regression trees are stored at the second data party when the regression trees are generated, the joint prediction of the sample to be predicted is led by the second data party; specifically, the regression trees of the gradient boosting tree model are traversed to determine the node the sample to be predicted belongs to. That node is determined by comparing the data points of the sample to be predicted with the attribute values of the nodes being traversed.
Once the node the sample to be predicted belongs to has been determined, the sample category of the sample to be predicted can be determined based on the data labels of the training samples corresponding to that node, or the prediction score of the sample to be predicted can be obtained based on the weight value of that node.
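The joint traversal can be sketched as below. The nested-dict tree layout is an illustrative assumption; when a node's feature is owned by party A, the comparison here stands in for the query request of step (3), whose boolean outcome is all that party A returns:

```python
def joint_predict(tree, features_a, features_b):
    """tree: internal nodes are {"site": "A"|"B", "feature": f, "thr": t,
    "left": ..., "right": ...}; leaves are {"leaf": weight_or_label}.
    features_a / features_b: the feature values held locally by each party."""
    node = tree
    while "leaf" not in node:
        held = features_a if node["site"] == "A" else features_b
        # for site "A", this comparison models party A answering B's query
        node = node["left"] if held[node["feature"]] <= node["thr"] else node["right"]
    return node["leaf"]

# Illustrative tree echoing the first regression tree of Fig. 5 (weights assumed)
tree = {"site": "B", "feature": "Bill Payment", "thr": 3102,
        "left": {"site": "A", "feature": "Amount of given credit", "thr": 200,
                 "left": {"leaf": 0.7}, "right": {"leaf": 0.1}},
        "right": {"site": "A", "feature": "Age", "thr": 35,
                  "left": {"leaf": 0.3}, "right": {"leaf": -0.2}}}
print(joint_predict(tree, {"Amount of given credit": 150, "Age": 40},
                    {"Bill Payment": 280}))  # 0.7
```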
The present invention also provides a computer-readable storage medium.
The computer-readable storage medium of the present invention stores a sample prediction program which, when executed by a processor, implements the steps of the sample prediction method based on federated training as described in any of the above embodiments.
From the description of the above embodiments, those skilled in the art will clearly understand that the methods of the above embodiments can be implemented by software plus the necessary general-purpose hardware platform, and of course also by hardware, though in many cases the former is the preferable implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part contributing to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM) and includes instructions for causing a terminal (which may be a mobile phone, a computer, a server, a network device, or the like) to execute the methods described in the embodiments of the present invention.
The embodiments of the present invention have been described above with reference to the accompanying drawings, but the invention is not limited to the specific embodiments described, which are illustrative rather than restrictive. Under the inspiration of the present invention, those skilled in the art can devise many further forms without departing from the scope protected by the purpose of the invention and the claims; all equivalent structures or equivalent process transformations made using the contents of the description and drawings of the invention, whether applied directly or indirectly in other related technical fields, likewise fall within the protection of the present invention.
Claims (10)
1. A sample prediction method based on federated training, characterized in that the sample prediction method based on federated training comprises the following steps:
performing federated training on two aligned training samples using the XGBoost algorithm to construct a gradient boosting tree model, wherein the gradient boosting tree model comprises multiple regression trees, and a split node of a regression tree corresponds to one feature of a training sample;
based on the gradient boosting tree model, performing joint prediction on a sample to be predicted, so as to determine the sample category of the sample to be predicted or to obtain a prediction score of the sample to be predicted.
2. The sample prediction method based on federated training according to claim 1, characterized in that the sample prediction method based on federated training comprises:
before performing federated training, applying interactive encryption to the IDs of the sample data using blind signatures and the RSA encryption algorithm;
identifying the intersection part of the two parties' samples by comparing the two parties' encrypted ID strings, and taking the intersection part of the samples as the aligned training samples.
3. The sample prediction method based on federated training according to claim 2, characterized in that the two aligned training samples are a first training sample and a second training sample respectively;
the attributes of the first training sample comprise a sample ID and part of the sample features, and the attributes of the second training sample comprise the sample ID, the remaining sample features, and a data label;
the first training sample is provided by a first data party and stored locally at the first data party, and the second training sample is provided by a second data party and stored locally at the second data party.
4. The sample prediction method based on federated training according to claim 3, characterized in that performing federated training on the two aligned training samples using the XGBoost algorithm to construct the gradient boosting tree model comprises:
at the second data party, obtaining the first-order gradient and the second-order gradient of each training sample in the sample set corresponding to the current round of node splitting;
if the current round of node splitting is the first node split in constructing a regression tree, encrypting the first-order gradient and the second-order gradient and sending them, together with the sample IDs of the sample set, to the first data party, so that the first data party, based on the encrypted first-order gradient and second-order gradient, computes gain values of split nodes of its local training samples corresponding to the sample IDs under every partitioning mode;
if the current round of node splitting is not the first node split in constructing the regression tree, sending the sample IDs of the sample set to the first data party, so that the first data party, reusing the first-order gradient and second-order gradient used in the first node split, computes gain values of split nodes of its local training samples corresponding to the sample IDs under every partitioning mode;
the second data party receiving the encrypted gain values of all split nodes returned by the first data party and decrypting them;
at the second data party, based on the first-order gradient and the second-order gradient, computing gain values of split nodes of the local training samples corresponding to the sample IDs under every partitioning mode;
based on the gain values of all split nodes computed by the two parties, determining the globally best split node of the current round of node splitting;
based on the globally best split node of the current round of node splitting, splitting the sample set corresponding to the current node to generate new nodes, thereby constructing a regression tree of the gradient boosting tree model.
5. The federated-training-based sample prediction method of claim 4, wherein before the step of obtaining, on the second data party side, the first-order gradients and second-order gradients of each training sample in the sample set corresponding to the current round of node splitting, the method further comprises:
when performing node splitting, determining whether the current round of node splitting corresponds to constructing the first regression tree;
if the current round of node splitting corresponds to constructing the first regression tree, determining whether the current round is the first round of node splitting for constructing the first regression tree;
if the current round is the first round of node splitting for constructing the first regression tree, initializing, on the second data party side, the first-order gradients and second-order gradients of each training sample in the sample set corresponding to the current round of node splitting; if the current round is not the first round of node splitting for constructing the first regression tree, reusing the first-order gradients and second-order gradients used in the first round of node splitting;
if the current round of node splitting corresponds to constructing a regression tree other than the first, determining whether the current round is the first round of node splitting for constructing that regression tree;
if the current round is the first round of node splitting for constructing a regression tree other than the first, updating the first-order gradients and second-order gradients according to the previous round of federated training; if the current round is not the first round of node splitting for constructing a regression tree other than the first, reusing the first-order gradients and second-order gradients used in the first round of node splitting.
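The gradient bookkeeping in this claim can be sketched as follows, assuming a logistic loss (the patent does not fix a loss function; the function names and the cache layout are illustrative). Gradients are (re)computed only at the first round of node splitting for each tree, from the raw predictions produced by the trees built so far, and every later round of the same tree reuses the cached values.

```python
import math

def logistic_gradients(y, y_pred_raw):
    """First- and second-order gradients of the logistic loss at the
    current raw predictions (illustrative choice of loss)."""
    g, h = [], []
    for yi, fi in zip(y, y_pred_raw):
        p = 1.0 / (1.0 + math.exp(-fi))
        g.append(p - yi)         # first-order gradient
        h.append(p * (1.0 - p))  # second-order gradient
    return g, h

def gradients_for_split_round(round_idx, y, y_pred_raw, cache):
    """At the first round of node splitting for a tree, (re)compute the
    gradients (initialization for the first tree, update for later
    trees); every later round of the same tree reuses the cache."""
    if round_idx == 0:
        cache["g"], cache["h"] = logistic_gradients(y, y_pred_raw)
    return cache["g"], cache["h"]
```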
6. The federated-training-based sample prediction method of claim 4, wherein the federated-training-based sample prediction method further comprises:
when generating a new node to construct a regression tree of the gradient boosting tree model, determining, on the second data party side, whether the depth of the current regression tree reaches a preset depth threshold;
if the depth of the current regression tree reaches the preset depth threshold, stopping node splitting to obtain one regression tree of the gradient boosting tree model; otherwise, continuing with the next round of node splitting;
when node splitting stops, determining, on the second data party side, whether the total number of regression trees reaches a preset quantity threshold;
if the total number of regression trees reaches the preset quantity threshold, stopping the federated training; otherwise, continuing with the next round of federated training.
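The two stopping conditions above (per-tree depth threshold, total tree-count threshold) amount to the outer control flow sketched below. `build_tree_level` is a stand-in for one round of federated node splitting; the loop structure is an assumption about how the thresholds interact, not the patent's code.

```python
def federated_training_loop(build_tree_level, max_depth=3, max_trees=5):
    """Grow each regression tree until the preset depth threshold is
    reached, then keep adding trees until the preset quantity
    threshold is reached."""
    trees = []
    while len(trees) < max_trees:          # quantity threshold check
        depth, levels = 0, []
        while depth < max_depth:           # depth threshold check
            levels.append(build_tree_level(len(trees), depth))
            depth += 1                     # otherwise: next splitting round
        trees.append(levels)               # stop splitting: one tree done
    return trees                           # stop federated training
```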
7. The federated-training-based sample prediction method of claim 4, wherein the federated-training-based sample prediction method further comprises:
recording, on the second data party side, the relevant information of the globally optimal split node determined in each round of node splitting;
wherein the relevant information includes: the provider of the corresponding sample data, the feature code of the corresponding sample data, and the gain value.
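The per-round record described in this claim could be kept as a simple structure like the one below; the field names are illustrative, not the patent's, and note that storing a feature *code* rather than the feature itself keeps the other party's raw feature names private.

```python
from dataclasses import dataclass

@dataclass
class SplitRecord:
    """Record of the globally optimal split node for one round of node
    splitting, kept by the second data party (illustrative layout)."""
    provider: str      # which data party supplied the winning feature
    feature_code: int  # encoded feature id; raw feature name stays private
    gain: float        # gain value of the chosen split

history = []  # one entry appended per round of node splitting
history.append(SplitRecord(provider="first", feature_code=12, gain=0.83))
```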
8. The federated-training-based sample prediction method of claim 7, wherein performing associated prediction on a sample to be predicted based on the gradient boosting tree model, to determine the sample class of the sample to be predicted or to obtain a prediction score of the sample to be predicted, comprises:
traversing, on the second data party side, the regression trees of the gradient boosting tree model;
if the attribute value of the currently traversed node is recorded at the second data party, determining the next node to traverse by comparing the data point of the local sample to be predicted with the attribute value of the currently traversed node;
if the attribute value of the currently traversed node is recorded at the first data party, initiating a query request to the first data party; on the first data party side, determining the next node to traverse by comparing the data point of the local sample to be predicted with the attribute value of the currently traversed node, and returning the node information to the second data party;
when the traversal of the regression trees of the gradient boosting tree model is complete, determining the sample class of the sample to be predicted based on the labels of the sample data corresponding to the node to which the sample to be predicted belongs, or obtaining the prediction score of the sample to be predicted based on the weight value of the node to which it belongs.
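The joint traversal in this claim can be sketched as below. The node layout (dicts with `provider`/`feature`/`threshold`/`left`/`right`, leaves carrying `leaf`) and the `query_remote` callback, which models the query request to the first data party for a remotely held attribute, are assumptions for illustration.

```python
def predict_one_tree(tree, local_sample, query_remote):
    """Traverse a single regression tree whose split attributes are held
    partly by each party. `query_remote(node)` asks the first data party
    whether to take the left branch at a node it holds; the second data
    party only learns the branch direction, not the attribute value."""
    node = tree
    while "leaf" not in node:
        if node["provider"] == "second":   # attribute recorded locally
            go_left = local_sample[node["feature"]] <= node["threshold"]
        else:                              # recorded at the first data party
            go_left = query_remote(node)
        node = node["left"] if go_left else node["right"]
    return node["leaf"]                    # leaf weight -> prediction score
```

Summing `predict_one_tree` over all regression trees of the model would yield the sample's overall prediction score.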
9. A sample prediction apparatus based on federated training, comprising a memory, a processor, and a sample prediction program stored on the memory and executable on the processor, wherein the sample prediction program, when executed by the processor, implements the steps of the federated-training-based sample prediction method of any one of claims 1 to 8.
10. A computer-readable storage medium, having a sample prediction program stored thereon, wherein the sample prediction program, when executed by a processor, implements the steps of the federated-training-based sample prediction method of any one of claims 1 to 8.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810913869.3A CN109165683B (en) | 2018-08-10 | 2018-08-10 | Sample prediction method, device and storage medium based on federal training |
PCT/CN2019/080297 WO2020029590A1 (en) | 2018-08-10 | 2019-03-29 | Sample prediction method and device based on federated training, and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810913869.3A CN109165683B (en) | 2018-08-10 | 2018-08-10 | Sample prediction method, device and storage medium based on federal training |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109165683A true CN109165683A (en) | 2019-01-08 |
CN109165683B CN109165683B (en) | 2023-09-12 |
Family
ID=64895662
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810913869.3A Active CN109165683B (en) | 2018-08-10 | 2018-08-10 | Sample prediction method, device and storage medium based on federal training |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN109165683B (en) |
WO (1) | WO2020029590A1 (en) |
Cited By (33)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109670484A (en) * | 2019-01-16 | 2019-04-23 | 电子科技大学 | A kind of mobile phone individual discrimination method based on bispectrum feature and boosted tree |
CN110443378A (en) * | 2019-08-02 | 2019-11-12 | 深圳前海微众银行股份有限公司 | Feature correlation analysis method, device and readable storage medium storing program for executing in federation's study |
CN110717671A (en) * | 2019-10-08 | 2020-01-21 | 深圳前海微众银行股份有限公司 | Method and device for determining contribution degree of participants |
WO2020029590A1 (en) * | 2018-08-10 | 2020-02-13 | 深圳前海微众银行股份有限公司 | Sample prediction method and device based on federated training, and storage medium |
CN110796266A (en) * | 2019-10-30 | 2020-02-14 | 深圳前海微众银行股份有限公司 | Method, device and storage medium for implementing reinforcement learning based on public information |
CN110851869A (en) * | 2019-11-14 | 2020-02-28 | 深圳前海微众银行股份有限公司 | Sensitive information processing method and device and readable storage medium |
CN110944011A (en) * | 2019-12-16 | 2020-03-31 | 支付宝(杭州)信息技术有限公司 | Joint prediction method and system based on tree model |
CN110968886A (en) * | 2019-12-20 | 2020-04-07 | 支付宝(杭州)信息技术有限公司 | Method and system for screening training samples of machine learning model |
CN111242385A (en) * | 2020-01-19 | 2020-06-05 | 苏宁云计算有限公司 | Prediction method, device and system of gradient lifting tree model |
CN111309848A (en) * | 2020-01-19 | 2020-06-19 | 苏宁云计算有限公司 | Generation method and system of gradient lifting tree model |
CN111444956A (en) * | 2020-03-25 | 2020-07-24 | 平安科技(深圳)有限公司 | Low-load information prediction method and device, computer system and readable storage medium |
CN111598186A (en) * | 2020-06-05 | 2020-08-28 | 腾讯科技(深圳)有限公司 | Decision model training method, prediction method and device based on longitudinal federal learning |
CN111667075A (en) * | 2020-06-12 | 2020-09-15 | 杭州浮云网络科技有限公司 | Service execution method, device and related equipment |
CN111695697A (en) * | 2020-06-12 | 2020-09-22 | 深圳前海微众银行股份有限公司 | Multi-party combined decision tree construction method and device and readable storage medium |
CN111915019A (en) * | 2020-08-07 | 2020-11-10 | 平安科技(深圳)有限公司 | Federal learning method, system, computer device, and storage medium |
CN112183759A (en) * | 2019-07-04 | 2021-01-05 | 创新先进技术有限公司 | Model training method, device and system |
CN112199706A (en) * | 2020-10-26 | 2021-01-08 | 支付宝(杭州)信息技术有限公司 | Tree model training method and business prediction method based on multi-party safety calculation |
CN112464287A (en) * | 2020-12-12 | 2021-03-09 | 同济大学 | Multi-party XGboost safety prediction model training method based on secret sharing and federal learning |
CN112529101A (en) * | 2020-12-24 | 2021-03-19 | 深圳前海微众银行股份有限公司 | Method and device for training classification model, electronic equipment and storage medium |
CN112651458A (en) * | 2020-12-31 | 2021-04-13 | 深圳云天励飞技术股份有限公司 | Method and device for training classification model, electronic equipment and storage medium |
WO2021082634A1 (en) * | 2019-10-29 | 2021-05-06 | 支付宝(杭州)信息技术有限公司 | Tree model-based prediction method and apparatus |
CN112766514A (en) * | 2021-01-22 | 2021-05-07 | 支付宝(杭州)信息技术有限公司 | Method, system and device for joint training of machine learning model |
CN113392164A (en) * | 2020-03-13 | 2021-09-14 | 京东城市(北京)数字科技有限公司 | Method, main server, service platform and system for constructing longitudinal federated tree |
CN113554476A (en) * | 2020-04-23 | 2021-10-26 | 京东数字科技控股有限公司 | Training method and system of credit prediction model, electronic device and storage medium |
CN113642669A (en) * | 2021-08-30 | 2021-11-12 | 平安医疗健康管理股份有限公司 | Fraud prevention detection method, device and equipment based on feature analysis and storage medium |
CN113705727A (en) * | 2021-09-16 | 2021-11-26 | 四川新网银行股份有限公司 | Decision tree modeling method, prediction method, device and medium based on difference privacy |
CN113723477A (en) * | 2021-08-16 | 2021-11-30 | 同盾科技有限公司 | Cross-feature federal abnormal data detection method based on isolated forest |
CN113807544A (en) * | 2020-12-31 | 2021-12-17 | 京东科技控股股份有限公司 | Method and device for training federated learning model and electronic equipment |
CN113822311A (en) * | 2020-12-31 | 2021-12-21 | 京东科技控股股份有限公司 | Method and device for training federated learning model and electronic equipment |
EP3975089A1 (en) * | 2020-09-25 | 2022-03-30 | Beijing Baidu Netcom Science And Technology Co. Ltd. | Multi-model training method and device based on feature extraction, an electronic device, and a medium |
CN114362948A (en) * | 2022-03-17 | 2022-04-15 | 蓝象智联(杭州)科技有限公司 | Efficient federal derivative feature logistic regression modeling method |
WO2022144001A1 (en) * | 2020-12-31 | 2022-07-07 | 京东科技控股股份有限公司 | Federated learning model training method and apparatus, and electronic device |
CN113554476B (en) * | 2020-04-23 | 2024-04-19 | 京东科技控股股份有限公司 | Training method and system of credit prediction model, electronic equipment and storage medium |
Families Citing this family (23)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111414646B (en) * | 2020-03-20 | 2024-03-29 | 矩阵元技术(深圳)有限公司 | Data processing method and device for realizing privacy protection |
CN111402095A (en) * | 2020-03-23 | 2020-07-10 | 温州医科大学 | Method for detecting student behaviors and psychology based on homomorphic encrypted federated learning |
CN111461874A (en) * | 2020-04-13 | 2020-07-28 | 浙江大学 | Credit risk control system and method based on federal mode |
CN111666576B (en) * | 2020-04-29 | 2023-08-04 | 平安科技(深圳)有限公司 | Data processing model generation method and device, and data processing method and device |
CN111882054B (en) * | 2020-05-27 | 2024-04-12 | 杭州中奥科技有限公司 | Method for cross training of encryption relationship network data of two parties and related equipment |
CN113824546B (en) * | 2020-06-19 | 2024-04-02 | 百度在线网络技术(北京)有限公司 | Method and device for generating information |
CN111814985B (en) * | 2020-06-30 | 2023-08-29 | 平安科技(深圳)有限公司 | Model training method under federal learning network and related equipment thereof |
CN111898765A (en) * | 2020-07-29 | 2020-11-06 | 深圳前海微众银行股份有限公司 | Feature binning method, device, equipment and readable storage medium |
CN111914277B (en) * | 2020-08-07 | 2023-09-01 | 平安科技(深圳)有限公司 | Intersection data generation method and federal model training method based on intersection data |
US11914678B2 (en) | 2020-09-23 | 2024-02-27 | International Business Machines Corporation | Input encoding for classifier generalization |
CN112288094B (en) * | 2020-10-09 | 2022-05-17 | 武汉大学 | Federal network representation learning method and system |
CN112381307B (en) * | 2020-11-20 | 2023-12-22 | 平安科技(深圳)有限公司 | Meteorological event prediction method and device and related equipment |
CN113824677B (en) * | 2020-12-28 | 2023-09-05 | 京东科技控股股份有限公司 | Training method and device of federal learning model, electronic equipment and storage medium |
CN113807380B (en) * | 2020-12-31 | 2023-09-01 | 京东科技信息技术有限公司 | Training method and device of federal learning model and electronic equipment |
CN112749749B (en) * | 2021-01-14 | 2024-04-16 | 深圳前海微众银行股份有限公司 | Classification decision tree model-based classification method and device and electronic equipment |
CN112836830B (en) * | 2021-02-01 | 2022-05-06 | 广西师范大学 | Method for voting and training in parallel by using federated gradient boosting decision tree |
CN113807534B (en) * | 2021-03-08 | 2023-09-01 | 京东科技控股股份有限公司 | Model parameter training method and device of federal learning model and electronic equipment |
CN114882333A (en) * | 2021-05-31 | 2022-08-09 | 北京百度网讯科技有限公司 | Training method and device of data processing model, electronic equipment and storage medium |
CN113204443B (en) * | 2021-06-03 | 2024-04-16 | 京东科技控股股份有限公司 | Data processing method, device, medium and product based on federal learning framework |
CN113435537B (en) * | 2021-07-16 | 2022-08-26 | 同盾控股有限公司 | Cross-feature federated learning method and prediction method based on Soft GBDT |
CN113722987B (en) * | 2021-08-16 | 2023-11-03 | 京东科技控股股份有限公司 | Training method and device of federal learning model, electronic equipment and storage medium |
CN113657996A (en) * | 2021-08-26 | 2021-11-16 | 深圳市洞见智慧科技有限公司 | Method and device for determining feature contribution degree in federated learning and electronic equipment |
CN113722739B (en) * | 2021-09-06 | 2024-04-09 | 京东科技控股股份有限公司 | Gradient lifting tree model generation method and device, electronic equipment and storage medium |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101056166A (en) * | 2007-05-28 | 2007-10-17 | 北京飞天诚信科技有限公司 | A method for improving the data transmission security |
CN104009842A (en) * | 2014-05-15 | 2014-08-27 | 华南理工大学 | Communication data encryption and decryption method based on DES encryption algorithm, RSA encryption algorithm and fragile digital watermarking |
CN107704966A (en) * | 2017-10-17 | 2018-02-16 | 华南理工大学 | A kind of Energy Load forecasting system and method based on weather big data |
CN107767183A (en) * | 2017-10-31 | 2018-03-06 | 常州大学 | Brand loyalty method of testing based on combination learning and profile point |
US20180089587A1 (en) * | 2016-09-26 | 2018-03-29 | Google Inc. | Systems and Methods for Communication Efficient Distributed Mean Estimation |
CN107993139A (en) * | 2017-11-15 | 2018-05-04 | 华融融通(北京)科技有限公司 | A kind of anti-fake system of consumer finance based on dynamic regulation database and method |
CN108021984A (en) * | 2016-11-01 | 2018-05-11 | 第四范式(北京)技术有限公司 | Determine the method and system of the feature importance of machine learning sample |
TWM561279U (en) * | 2018-02-12 | 2018-06-01 | 林俊良 | Blockchain system and node server for processing strategy model scripts of financial assets |
CN108257105A (en) * | 2018-01-29 | 2018-07-06 | 南华大学 | A kind of light stream estimation for video image and denoising combination learning depth network model |
CN108375808A (en) * | 2018-03-12 | 2018-08-07 | 南京恩瑞特实业有限公司 | Dense fog forecasting procedures of the NRIET based on machine learning |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2018031597A1 (en) * | 2016-08-08 | 2018-02-15 | Google Llc | Systems and methods for data aggregation based on one-time pad based sharing |
CN107423339A (en) * | 2017-04-29 | 2017-12-01 | 天津大学 | Popular microblogging Forecasting Methodology based on extreme Gradient Propulsion and random forest |
CN107832581B (en) * | 2017-12-15 | 2022-02-18 | 百度在线网络技术(北京)有限公司 | State prediction method and device |
CN109165683B (en) * | 2018-08-10 | 2023-09-12 | 深圳前海微众银行股份有限公司 | Sample prediction method, device and storage medium based on federal training |
- 2018-08-10: CN application CN201810913869.3A filed (granted as CN109165683B, status Active)
- 2019-03-29: PCT application PCT/CN2019/080297 filed (published as WO2020029590A1, Application Filing)
Non-Patent Citations (5)
Title |
---|
H. BRENDAN MCMAHAN 等: "Communication-efficient learning of deep networks from decentralized data", 《ARTIFICIAL INTELLIGENCE AND STATISTICS》 * |
JAKUB 等: "Federated learning strategies for improving communication efficiency", 《ARXIV.ORG》 * |
STEPHEN HARDY 等: "Private federated learning on vertically partitioned data via entity resolution and additively homomorphic encryption", 《ARXIV.ORG》 * |
TIANQI CHEN 等: "XGBoost: A Scalable Tree Boosting System", 《KDD"16: PROCEEDINGS OF THE 22ND ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING》 * |
许裕栗 等: "Xgboost算法在区域用电预测中的应用", 《自动化仪表》 * |
Cited By (52)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020029590A1 (en) * | 2018-08-10 | 2020-02-13 | 深圳前海微众银行股份有限公司 | Sample prediction method and device based on federated training, and storage medium |
CN109670484B (en) * | 2019-01-16 | 2022-03-25 | 电子科技大学 | Mobile phone individual identification method based on bispectrum characteristics and lifting tree |
CN109670484A (en) * | 2019-01-16 | 2019-04-23 | 电子科技大学 | A kind of mobile phone individual discrimination method based on bispectrum feature and boosted tree |
CN112183759A (en) * | 2019-07-04 | 2021-01-05 | 创新先进技术有限公司 | Model training method, device and system |
CN112183759B (en) * | 2019-07-04 | 2024-02-13 | 创新先进技术有限公司 | Model training method, device and system |
CN110443378A (en) * | 2019-08-02 | 2019-11-12 | 深圳前海微众银行股份有限公司 | Feature correlation analysis method, device and readable storage medium storing program for executing in federation's study |
CN110443378B (en) * | 2019-08-02 | 2023-11-03 | 深圳前海微众银行股份有限公司 | Feature correlation analysis method and device in federal learning and readable storage medium |
CN110717671B (en) * | 2019-10-08 | 2021-08-31 | 深圳前海微众银行股份有限公司 | Method and device for determining contribution degree of participants |
CN110717671A (en) * | 2019-10-08 | 2020-01-21 | 深圳前海微众银行股份有限公司 | Method and device for determining contribution degree of participants |
WO2021082634A1 (en) * | 2019-10-29 | 2021-05-06 | 支付宝(杭州)信息技术有限公司 | Tree model-based prediction method and apparatus |
CN110796266A (en) * | 2019-10-30 | 2020-02-14 | 深圳前海微众银行股份有限公司 | Method, device and storage medium for implementing reinforcement learning based on public information |
CN110851869A (en) * | 2019-11-14 | 2020-02-28 | 深圳前海微众银行股份有限公司 | Sensitive information processing method and device and readable storage medium |
CN110851869B (en) * | 2019-11-14 | 2023-09-19 | 深圳前海微众银行股份有限公司 | Sensitive information processing method, device and readable storage medium |
CN110944011B (en) * | 2019-12-16 | 2021-12-07 | 支付宝(杭州)信息技术有限公司 | Joint prediction method and system based on tree model |
CN110944011A (en) * | 2019-12-16 | 2020-03-31 | 支付宝(杭州)信息技术有限公司 | Joint prediction method and system based on tree model |
CN110968886A (en) * | 2019-12-20 | 2020-04-07 | 支付宝(杭州)信息技术有限公司 | Method and system for screening training samples of machine learning model |
CN111242385A (en) * | 2020-01-19 | 2020-06-05 | 苏宁云计算有限公司 | Prediction method, device and system of gradient lifting tree model |
CN111309848A (en) * | 2020-01-19 | 2020-06-19 | 苏宁云计算有限公司 | Generation method and system of gradient lifting tree model |
CN113392164B (en) * | 2020-03-13 | 2024-01-12 | 京东城市(北京)数字科技有限公司 | Method for constructing longitudinal federal tree, main server, service platform and system |
CN113392164A (en) * | 2020-03-13 | 2021-09-14 | 京东城市(北京)数字科技有限公司 | Method, main server, service platform and system for constructing longitudinal federated tree |
CN111444956A (en) * | 2020-03-25 | 2020-07-24 | 平安科技(深圳)有限公司 | Low-load information prediction method and device, computer system and readable storage medium |
CN111444956B (en) * | 2020-03-25 | 2023-10-31 | 平安科技(深圳)有限公司 | Low-load information prediction method, device, computer system and readable storage medium |
CN113554476B (en) * | 2020-04-23 | 2024-04-19 | 京东科技控股股份有限公司 | Training method and system of credit prediction model, electronic equipment and storage medium |
CN113554476A (en) * | 2020-04-23 | 2021-10-26 | 京东数字科技控股有限公司 | Training method and system of credit prediction model, electronic device and storage medium |
CN111598186A (en) * | 2020-06-05 | 2020-08-28 | 腾讯科技(深圳)有限公司 | Decision model training method, prediction method and device based on longitudinal federal learning |
CN111695697B (en) * | 2020-06-12 | 2023-09-08 | 深圳前海微众银行股份有限公司 | Multiparty joint decision tree construction method, equipment and readable storage medium |
CN111695697A (en) * | 2020-06-12 | 2020-09-22 | 深圳前海微众银行股份有限公司 | Multi-party combined decision tree construction method and device and readable storage medium |
WO2021249086A1 (en) * | 2020-06-12 | 2021-12-16 | 深圳前海微众银行股份有限公司 | Multi-party joint decision tree construction method, device and readable storage medium |
CN111667075A (en) * | 2020-06-12 | 2020-09-15 | 杭州浮云网络科技有限公司 | Service execution method, device and related equipment |
CN111915019A (en) * | 2020-08-07 | 2020-11-10 | 平安科技(深圳)有限公司 | Federal learning method, system, computer device, and storage medium |
CN111915019B (en) * | 2020-08-07 | 2023-06-20 | 平安科技(深圳)有限公司 | Federal learning method, system, computer device, and storage medium |
EP3975089A1 (en) * | 2020-09-25 | 2022-03-30 | Beijing Baidu Netcom Science And Technology Co. Ltd. | Multi-model training method and device based on feature extraction, an electronic device, and a medium |
CN112199706A (en) * | 2020-10-26 | 2021-01-08 | 支付宝(杭州)信息技术有限公司 | Tree model training method and business prediction method based on multi-party safety calculation |
CN112464287A (en) * | 2020-12-12 | 2021-03-09 | 同济大学 | Multi-party XGboost safety prediction model training method based on secret sharing and federal learning |
CN112464287B (en) * | 2020-12-12 | 2022-07-05 | 同济大学 | Multi-party XGboost safety prediction model training method based on secret sharing and federal learning |
CN112529101A (en) * | 2020-12-24 | 2021-03-19 | 深圳前海微众银行股份有限公司 | Method and device for training classification model, electronic equipment and storage medium |
CN112651458B (en) * | 2020-12-31 | 2024-04-02 | 深圳云天励飞技术股份有限公司 | Classification model training method and device, electronic equipment and storage medium |
WO2022144001A1 (en) * | 2020-12-31 | 2022-07-07 | 京东科技控股股份有限公司 | Federated learning model training method and apparatus, and electronic device |
CN113822311A (en) * | 2020-12-31 | 2021-12-21 | 京东科技控股股份有限公司 | Method and device for training federated learning model and electronic equipment |
CN113807544A (en) * | 2020-12-31 | 2021-12-17 | 京东科技控股股份有限公司 | Method and device for training federated learning model and electronic equipment |
CN113822311B (en) * | 2020-12-31 | 2023-09-01 | 京东科技控股股份有限公司 | Training method and device of federal learning model and electronic equipment |
CN112651458A (en) * | 2020-12-31 | 2021-04-13 | 深圳云天励飞技术股份有限公司 | Method and device for training classification model, electronic equipment and storage medium |
CN113807544B (en) * | 2020-12-31 | 2023-09-26 | 京东科技控股股份有限公司 | Training method and device of federal learning model and electronic equipment |
CN112766514B (en) * | 2021-01-22 | 2021-12-24 | 支付宝(杭州)信息技术有限公司 | Method, system and device for joint training of machine learning model |
CN112766514A (en) * | 2021-01-22 | 2021-05-07 | 支付宝(杭州)信息技术有限公司 | Method, system and device for joint training of machine learning model |
CN113723477A (en) * | 2021-08-16 | 2021-11-30 | 同盾科技有限公司 | Cross-feature federal abnormal data detection method based on isolated forest |
CN113642669A (en) * | 2021-08-30 | 2021-11-12 | 平安医疗健康管理股份有限公司 | Fraud prevention detection method, device and equipment based on feature analysis and storage medium |
CN113642669B (en) * | 2021-08-30 | 2024-04-05 | 平安医疗健康管理股份有限公司 | Feature analysis-based fraud prevention detection method, device, equipment and storage medium |
CN113705727B (en) * | 2021-09-16 | 2023-05-12 | 四川新网银行股份有限公司 | Decision tree modeling method, prediction method, equipment and medium based on differential privacy |
CN113705727A (en) * | 2021-09-16 | 2021-11-26 | 四川新网银行股份有限公司 | Decision tree modeling method, prediction method, device and medium based on difference privacy |
CN114362948B (en) * | 2022-03-17 | 2022-07-12 | 蓝象智联(杭州)科技有限公司 | Federated derived feature logistic regression modeling method |
CN114362948A (en) * | 2022-03-17 | 2022-04-15 | 蓝象智联(杭州)科技有限公司 | Efficient federal derivative feature logistic regression modeling method |
Also Published As
Publication number | Publication date |
---|---|
WO2020029590A1 (en) | 2020-02-13 |
CN109165683B (en) | 2023-09-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109165683A (en) | Sample predictions method, apparatus and storage medium based on federation's training | |
CN109034398A (en) | Feature selection approach, device and storage medium based on federation's training | |
Reardon et al. | Income inequality and income segregation | |
CN105335409B (en) | A kind of determination method, equipment and the network server of target user | |
CN107871087A (en) | The personalized difference method for secret protection that high dimensional data is issued under distributed environment | |
CN102663047B (en) | Method and device for mining social relationship during mobile reading | |
CN111932386B (en) | User account determining method and device, information pushing method and device, and electronic equipment | |
CN109087079A (en) | Digital cash Transaction Information analysis method | |
CN107291815A (en) | Recommend method in Ask-Answer Community based on cross-platform tag fusion | |
CN111666460A (en) | User portrait generation method and device based on privacy protection and storage medium | |
CN107358116A (en) | A kind of method for secret protection in multi-sensitive attributes data publication | |
CN113449048B (en) | Data label distribution determining method and device, computer equipment and storage medium | |
CN111538916B (en) | Interest point recommendation method based on neural network and geographic influence | |
CN109376901A (en) | A kind of service quality prediction technique based on decentralization matrix decomposition | |
CN108416227A (en) | Big data platform secret protection evaluation method and device based on Dare Information Entropy | |
Chao | Construction model of E-commerce agricultural product online marketing system based on blockchain and improved genetic algorithm | |
CN109783805A (en) | A kind of network community user recognition methods and device | |
CN116186754A (en) | Federal random forest power data collaborative analysis method based on blockchain | |
CN112016954A (en) | Resource allocation method and device based on block chain network technology and electronic equipment | |
WO2019237840A1 (en) | Data set generating method and apparatus | |
CN112613601B (en) | Neural network model updating method, equipment and computer storage medium | |
Mecke et al. | Some distributions for I‐segments of planar random homogeneous STIT tessellations | |
CN108647334A (en) | A kind of video social networks homology analysis method under spark platforms | |
CN109472115B (en) | Large-scale complex network modeling method and device based on geographic information | |
Palestini et al. | A graph-based approach to inequality assessment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |