Summary of the Invention
In view of the above technical problem, the embodiments of this specification provide a method, an apparatus, and a device for fusing model predicted values. The technical solutions are as follows:
In one aspect, a method for fusing model predicted values is provided, including:
Based on a number of given samples, binning the predicted values of an online prediction model and the predicted values of an offline prediction model separately according to a set binning method, where each of the samples includes a first predicted value, a second predicted value, and a label of the sample, the first predicted value being predicted by the online prediction model and the second predicted value being predicted by the offline prediction model;
According to the binning result, converting the first predicted value of each sample into a first interval feature corresponding to the interval in which the first predicted value falls, and converting the second predicted value of each sample into a second interval feature corresponding to the interval in which the second predicted value falls;
Forming converted sample data from the first interval feature, the second interval feature, and the label corresponding to each sample, and training a model with the converted sample data, where the trained model is used to fuse the predicted value of the online prediction model and the predicted value of the offline prediction model to obtain a final predicted value.
In one aspect, a method for fusing model predicted values is provided, including:
Obtaining business data generated by a target user in a first time period, determining input features according to the business data, inputting them into an online prediction model, and outputting a first predicted value;
Obtaining a second predicted value that corresponds to the target user and was obtained with an offline prediction model, where the input features of the offline prediction model are determined according to business features generated by the target user in a second time period;
Obtaining the result of binning the first predicted values of the online prediction model and the second predicted values of the offline prediction model, and determining the first interval in which the first predicted value falls and the second interval in which the second predicted value falls;
According to the first interval and the second interval, fusing the first predicted value and the second predicted value with a model obtained by training in advance to obtain a final fused predicted value, where the fused predicted value is used to determine the label of the target user.
In one aspect, an apparatus for fusing model predicted values is provided, including:
A binning unit, configured to, based on a number of given samples, bin the predicted values of an online prediction model and the predicted values of an offline prediction model separately according to a set binning method, where each of the samples includes a first predicted value, a second predicted value, and a label of the sample, the first predicted value being predicted by the online prediction model and the second predicted value being predicted by the offline prediction model;
A feature conversion unit, configured to, according to the binning result, convert the first predicted value of each sample into a first interval feature corresponding to the interval in which the first predicted value falls, and convert the second predicted value of each sample into a second interval feature corresponding to the interval in which the second predicted value falls;
A training unit, configured to form converted sample data from the first interval feature, the second interval feature, and the label corresponding to each sample, and to train a model with the converted sample data, where the trained model is used to fuse the predicted value of the online prediction model and the predicted value of the offline prediction model to obtain a final predicted value.
In one aspect, an apparatus for fusing model predicted values is provided, including:
An online score prediction unit, configured to obtain business data generated by a target user in a first time period before a trigger moment, determine input features according to the business data, input them into an online prediction model, and output a first predicted value, where the online prediction model is used to predict a user's label;
An offline score obtaining unit, configured to obtain a second predicted value that corresponds to the target user and was obtained with an offline prediction model, where the input features of the offline prediction model are determined according to business features generated by the target user in a past second time period, and the offline prediction model is used to predict a user's label;
An interval determination unit, configured to determine, according to the result of binning performed in advance on the predicted values of the online prediction model and the predicted values of the offline prediction model, the first interval in which the first predicted value falls and the second interval in which the second predicted value falls;
A score fusion unit, configured to fuse, according to the first interval and the second interval, the first predicted value and the second predicted value with a model obtained by training in advance to obtain a final fused predicted value, where the fused predicted value is used to determine the label of the target user.
In one aspect, a computer device is provided, including:
A processor;
A memory for storing processor-executable instructions;
The processor is configured to:
Based on a number of given samples, bin the predicted values of an online prediction model and the predicted values of an offline prediction model separately according to a set binning method, where each of the samples includes a first predicted value, a second predicted value, and a label of the sample, the first predicted value being predicted by the online prediction model and the second predicted value being predicted by the offline prediction model;
According to the binning result, convert the first predicted value of each sample into a first interval feature corresponding to the interval in which the first predicted value falls, and convert the second predicted value of each sample into a second interval feature corresponding to the interval in which the second predicted value falls;
Form converted sample data from the first interval feature, the second interval feature, and the label corresponding to each sample, and train a model with the converted sample data, where the trained model is used to fuse the predicted value of the online prediction model and the predicted value of the offline prediction model to obtain a final predicted value.
In one aspect, a computer device is provided, including:
A processor;
A memory for storing processor-executable instructions;
The processor is configured to implement the following units:
An online score prediction unit, which obtains business data generated by a target user in a first time period before a trigger moment, determines input features according to the business data, inputs them into an online prediction model, and outputs a first predicted value, where the online prediction model is used to predict a user's label;
An offline score obtaining unit, which obtains a second predicted value that corresponds to the target user and was obtained with an offline prediction model, where the input features of the offline prediction model are determined according to business features generated by the target user in a past second time period, and the offline prediction model is used to predict a user's label;
An interval determination unit, which determines, according to the result of binning performed in advance on the predicted values of the online prediction model and the predicted values of the offline prediction model, the first interval in which the first predicted value falls and the second interval in which the second predicted value falls;
A score fusion unit, which fuses, according to the first interval and the second interval, the first predicted value and the second predicted value with a model obtained by training in advance to obtain a final fused predicted value, where the fused predicted value is used to determine the label of the target user.
The effects of the technical solutions provided by the embodiments of this specification include the following:
A model obtained through machine learning is used to fuse the predicted value of the online prediction model with the predicted value of the offline prediction model, and the final score obtained by the fusion is used to predict the user's label. This improves the accuracy of predicting the user's label while also meeting the business requirement for low latency.
It should be understood that the above general description and the following detailed description are merely exemplary and explanatory, and do not limit the embodiments of this specification.
In addition, no single embodiment of this specification needs to achieve all of the above effects.
Detailed Description
To help those skilled in the art better understand the technical solutions in the embodiments of this specification, these technical solutions are described in detail below with reference to the accompanying drawings of the embodiments. Obviously, the described embodiments are only some, rather than all, of the embodiments of this specification. All other embodiments obtained by those of ordinary skill in the art based on the embodiments in this specification shall fall within the scope of protection.
Referring to Fig. 1, in one embodiment of this specification, a method for fusing model predicted values is used to fuse the score obtained by an online prediction model with the score obtained by an offline prediction model. The method may include the following steps 101 to 104:
Step 101: Obtain business data generated by a target user in a first time period, determine input features according to the business data, input them into the online prediction model, and output a first predicted value.
Step 102: Obtain a second predicted value that corresponds to the target user and was obtained with the offline prediction model, where the input features of the offline prediction model are determined according to business features generated by the target user in a second time period.
Here, the online prediction model and the offline prediction model are both models built with machine learning algorithms to predict a user's label. The user labels that the two models need to predict may depend on the specific business. For example, for a network payment business, the user labels to be predicted may be divided into "high-risk user", "medium-risk user", "low-risk user", and so on. For an information recommendation business, the user labels to be predicted may be divided into "sports", "education", "finance", and so on. Both the online prediction model and the offline prediction model are trained with a certain number of training samples, and each of these training samples may include one or more pieces of behavior data generated by a sample user while participating in a specific business (such as a network payment business) and the determined label of the sample user. The online prediction model and the offline prediction model may be trained with the same batch of samples or with two different batches of samples; no restriction is imposed here.
In the embodiments of this specification, the offline prediction model may be run as a scheduled task. For example, offline score prediction is performed once at a specified moment or in a specified time period each day, and this offline score prediction may cover all users. The online prediction model may be triggered by an operation of a specific user. For example, a user clicking on a web page may trigger one score calculation by the online prediction model.
Compared with the online prediction model, the offline prediction model generally uses higher-dimensional feature data, the feature data may span a longer time, and a more complex algorithm may be used. As shown in Fig. 1, as a specific example, on day T the offline prediction model may obtain the business data (feature A) generated by each user on day T-1 while participating in a specific business, process the obtained business data (feature A) accordingly to obtain input features, input them into the offline prediction model, obtain the offline prediction score of each user (that is, the second predicted value referred to herein), and write it into database X. For the online prediction model, the online feature data (feature B) of users may be collected continuously and written into database Y, where the online feature data may be business data generated by a user in near real time while participating in a specific business. For example, if the trigger moment of online prediction is t1, the online feature data may be the business data generated in the interval t0 to t1 (for example, 3 minutes). It can be seen that after a request to initiate the prediction process for a user arrives, the scheduler needs to perform two tasks. The first is to read from database X the most recent second predicted value that corresponds to the target user and was calculated by the offline prediction model. The second is to read the online feature data of the target user from database Y in order to carry out the subsequent score prediction by the online prediction model.
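The two scheduler tasks described above can be illustrated with a minimal Python sketch. It is an illustration under assumptions rather than the implementation of this specification: the class name `Scheduler`, the in-memory dictionaries standing in for database X and database Y, and the averaging function used as a stand-in online model are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Scheduler:
    """Hypothetical scheduler; the dictionaries stand in for databases X and Y."""
    offline_scores: Dict[str, float]            # database X: user -> latest offline score
    online_features: Dict[str, List[float]]     # database Y: user -> near-real-time features
    online_model: Callable[[List[float]], float]

    def handle_request(self, user_id: str) -> Tuple[float, float]:
        # Task 1: read the most recent second predicted value (offline score).
        second_predicted = self.offline_scores[user_id]
        # Task 2: read the online feature data and run the online model on it.
        features = self.online_features[user_id]
        first_predicted = self.online_model(features)
        return first_predicted, second_predicted


# Toy usage: the "online model" here is just an average, purely for illustration.
scheduler = Scheduler(
    offline_scores={"u1": 0.25},
    online_features={"u1": [0.6, 0.72]},
    online_model=lambda feats: sum(feats) / len(feats),
)
first, second = scheduler.handle_request("u1")
```

Only a lookup is performed for the offline score at request time, which is consistent with the low-latency goal described in this specification: the expensive offline computation has already been done by the scheduled task.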
At this point, for any target user, one prediction score can be obtained from the online prediction model and one prediction score can be obtained from the offline prediction model.
Step 103: According to the result of binning performed in advance on the predicted values of the online prediction model and the predicted values of the offline prediction model, determine the first interval in which the first predicted value falls and the second interval in which the second predicted value falls.
Step 104: According to the first interval and the second interval, fuse the first predicted value and the second predicted value with a model obtained by training in advance to obtain a final fused predicted value, where the fused predicted value is used to determine the label of the target user.
In an optional embodiment, step 104 may specifically include:
Step 1041: Based on predetermined weights corresponding to the intervals obtained by binning, obtain a first weight corresponding to the first interval and a second weight corresponding to the second interval, where the parameters to be trained of the model include the weights corresponding to the intervals obtained by binning.
Step 1042: Determine the fused predicted value using the first weight and the second weight, where the fused predicted value is used to determine the label of the target user.
Since steps 103 and 104 rely on the binning result and on the weights corresponding to the intervals obtained by binning, a method for determining the fusion weights is introduced before steps 103 and 104 are described in detail. As shown in Fig. 2, in one embodiment, the method includes steps 201 to 203:
Step 201: Based on a number of given samples, bin the predicted values of the online prediction model and the predicted values of the offline prediction model separately according to a set binning method, where each of the samples includes a first predicted value, a second predicted value, and a label of the sample, the first predicted value being predicted by the online prediction model and the second predicted value being predicted by the offline prediction model.
The samples referred to in step 201 may be the same as the samples used to train the offline prediction model and/or the online prediction model, or they may be different samples; no restriction is imposed here.
In one embodiment, the set binning method may be an entropy-based binning method. An entropy-based binning method takes the value of the dependent variable into account during binning so that minimum entropy is reached after binning. An advantage of the entropy-based binning method is that it can show good discrimination in the high-score region. Of course, the set binning method may also be a Gini-based binning method, an equal-frequency binning method, or the like.
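As an illustration of the idea of entropy-based binning, the following Python sketch greedily picks cut points that minimize the weighted entropy of the labels on the two sides of each cut, and recurses on each side. This is only one common way to realize entropy-based binning and is an assumption, not the specific procedure of this specification; the function names and the `max_depth` stopping rule are hypothetical.

```python
import math
from collections import Counter
from typing import List, Optional, Sequence, Tuple


def entropy(labels: Sequence[int]) -> float:
    """Shannon entropy (base 2) of a list of labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_split(pairs: List[Tuple[float, int]]) -> Optional[float]:
    """Return the cut point with the lowest weighted label entropy, or None."""
    pairs = sorted(pairs)
    n = len(pairs)
    best_cut, best_h = None, None
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot cut between identical scores
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        h = (len(left) * entropy(left) + len(right) * entropy(right)) / n
        if best_h is None or h < best_h:
            best_cut, best_h = (pairs[i - 1][0] + pairs[i][0]) / 2, h
    return best_cut


def entropy_bins(scores: Sequence[float], labels: Sequence[int],
                 max_depth: int = 2) -> List[float]:
    """Recursively split the scores; returns the sorted inner cut points."""
    def recurse(pairs: List[Tuple[float, int]], depth: int) -> List[float]:
        # Stop when the depth budget is used up or the labels are already pure.
        if depth == 0 or len({y for _, y in pairs}) < 2:
            return []
        cut = best_split(pairs)
        if cut is None:
            return []
        left = [p for p in pairs if p[0] < cut]
        right = [p for p in pairs if p[0] >= cut]
        return recurse(left, depth - 1) + [cut] + recurse(right, depth - 1)

    return recurse(list(zip(scores, labels)), max_depth)


# Toy data: low scores carry label 0, high scores carry label 1.
cuts = entropy_bins([0.1, 0.2, 0.3, 0.7, 0.8, 0.9], [0, 0, 0, 1, 1, 1])
```

In this sketch the same function would be called once on the online scores and once on the offline scores, so that each model gets its own set of cut points.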
Step 202: According to the binning result, convert the first predicted value of each sample into a first interval feature corresponding to the interval in which the first predicted value falls, and convert the second predicted value of each sample into a second interval feature corresponding to the interval in which the second predicted value falls.
In one example, assume that the first predicted value and the second predicted value both lie between 0 and 1. After the predicted values of the online prediction model are binned, the resulting cut points are: 0, 0.1, 0.13, 0.15, 0.2, 0.3, 0.5, 1. After the predicted values of the offline prediction model are binned, the resulting cut points are: 0, 0.03, 0.05, 0.08, 0.09, 0.11, 0.13, 1. That is, after binning, the output values of the online prediction model and of the offline prediction model each fall into 7 intervals.
In one embodiment, a one-hot rule may be used to implement the feature conversion of step 202. Assume that the first predicted value of a sample is 0.17 and the second predicted value is 0.12. Since 0.17 lies in the 4th interval (0.15, 0.2) and 0.12 lies in the 6th interval (0.11, 0.13), the one-hot rule converts the first predicted value 0.17 into the first interval feature on-bin-0001000 ("on-bin" identifies the online prediction model) and converts the second predicted value 0.12 into the second interval feature off-bin-0000010 ("off-bin" identifies the offline prediction model). In the same way, the first predicted value and the second predicted value of each of the other samples can be converted into features.
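The one-hot conversion in the above example can be sketched in Python as follows. The cut points and the "on-bin" / "off-bin" prefixes are taken from the example; the function names are hypothetical, and assigning a value that lies exactly on a cut point to the lower interval is an assumption, since the text does not say how boundaries are handled.

```python
from bisect import bisect_left
from typing import Sequence

# Cut points from the example in the text.
ONLINE_CUTS = [0, 0.1, 0.13, 0.15, 0.2, 0.3, 0.5, 1]
OFFLINE_CUTS = [0, 0.03, 0.05, 0.08, 0.09, 0.11, 0.13, 1]


def interval_index(value: float, cuts: Sequence[float]) -> int:
    """0-based index of the interval containing value.

    Assumption: a value exactly on an inner cut point goes to the lower interval.
    """
    inner_cuts = cuts[1:-1]          # the outer bounds 0 and 1 are not decision points
    return bisect_left(inner_cuts, value)


def interval_feature(prefix: str, value: float, cuts: Sequence[float]) -> str:
    """One-hot string such as 'on-bin-0001000' for the interval containing value."""
    n_intervals = len(cuts) - 1
    idx = interval_index(value, cuts)
    bits = "".join("1" if i == idx else "0" for i in range(n_intervals))
    return f"{prefix}-{bits}"


first_feature = interval_feature("on-bin", 0.17, ONLINE_CUTS)
second_feature = interval_feature("off-bin", 0.12, OFFLINE_CUTS)
```

With the example values, `first_feature` is `on-bin-0001000` and `second_feature` is `off-bin-0000010`, matching the features given in the text.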
Step 203: Form converted sample data from the first interval feature, the second interval feature, and the label corresponding to each sample, and train a model with the converted sample data, where the trained model is used to fuse the predicted value of the online prediction model and the predicted value of the offline prediction model to obtain a final predicted value.
In addition to the first interval feature, the second interval feature, and the label of the sample, the converted sample data may also include other data. That is, the term "form" is open-ended rather than closed.
In the above example, before feature conversion, a given piece of sample data is, for example:
{ 0.17, 0.12, "medium-risk user" };
After feature conversion, the resulting new sample data is, for example:
{ 0001000, 0000010, "medium-risk user" }
The model to be trained here may be a linear model or a nonlinear model. In an embodiment that uses a linear model, the parameters to be trained of the model may include weights corresponding to the intervals obtained by binning, and these weights can be used to fuse the predicted value of the online prediction model and the predicted value of the offline prediction model to obtain a final predicted value. The model to be trained may be a logistic regression (LR) model, in which a weight is assigned to each interval obtained by binning and the weights are trained as parameters of the LR model, so that the value of each weight can finally be solved. Each weight can be regarded as a score for the corresponding interval. This score reflects a globally learned balance of importance both across the different model features (the online and offline models) and across the individual score intervals.
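To make the training step concrete, the following Python sketch fits a logistic regression model by plain gradient descent on concatenated one-hot interval features, so that each learned coefficient plays the role of the weight of one interval. It is a minimal sketch under assumptions: the labels are taken to be binary (0 or 1), the toy data uses only two online intervals and two offline intervals, and the function name, learning rate, and epoch count are hypothetical. A production system would more likely use an existing library implementation.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train_lr(rows: Sequence[Sequence[int]], labels: Sequence[int],
             lr: float = 0.5, epochs: int = 500) -> Tuple[List[float], float]:
    """Batch gradient descent for logistic regression.

    rows: one-hot interval features (online bits followed by offline bits).
    Returns (weights, bias); weights[j] is the learned weight of interval j.
    """
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias


# Toy data: [on-bin-1, on-bin-2, off-bin-1, off-bin-2]; the label follows the online bin.
rows = [[1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 0, 1], [0, 1, 1, 0]]
labels = [0, 1, 0, 1]
weights, bias = train_lr(rows, labels)
```

In this toy data the online interval alone determines the label, so the second online interval receives a larger weight than the first, while the two offline intervals receive nearly equal weights. This is the sense in which the learned weights balance the importance of the two models and of the individual intervals.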
Continuing with the above example, the following weights may finally be obtained:
Weight of interval (0, 0.1): on-bin-1 = 1.054,
...
Weight of interval (0.5, 1): on-bin-7 = 4.439;
Weight of interval (0, 0.03): off-bin-1 = 0.604,
...
Weight of interval (0.13, 1): off-bin-7 = 3.237.
Next, steps 103 and 104 are explained with the above specific example. Assume that for a target user, the first predicted value obtained from the online prediction model is 0.66 and the second predicted value obtained from the offline prediction model is 0.25. With reference to the above example, in step 103 the first interval in which the first predicted value 0.66 falls is determined to be (0.5, 1), and the second interval in which the second predicted value 0.25 falls is determined to be (0.13, 1). Then, in step 1041, based on the predetermined weights corresponding to the intervals obtained by binning, the first weight corresponding to the first interval (0.5, 1) is found to be 4.439, and the second weight corresponding to the second interval (0.13, 1) is found to be 3.237.
Finally, in step 1042, the final fused predicted value can be determined from the first weight and the second weight. In an optional embodiment, the first weight and the second weight may be summed and the sum used as the fused predicted value, that is, fused predicted value = 4.439 + 3.237 = 7.676. Of course, the fusion is not limited to summation; averaging, for example, may also be used. How the fused predicted value is then used may be determined according to the specific business.
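The lookup and summation in steps 103, 1041, and 1042 can be sketched in Python as follows. The cut points and the four weights are the ones given in the example; the weights of the intermediate intervals are not given in the text, so they are left as `None` here rather than filled in. The function names and the boundary handling (a value on a cut point goes to the lower interval) are assumptions.

```python
from bisect import bisect_left
from typing import List, Optional, Sequence

ONLINE_CUTS = [0, 0.1, 0.13, 0.15, 0.2, 0.3, 0.5, 1]
OFFLINE_CUTS = [0, 0.03, 0.05, 0.08, 0.09, 0.11, 0.13, 1]

# Only the first and last weights appear in the text; the rest are unknown.
ONLINE_WEIGHTS: List[Optional[float]] = [1.054, None, None, None, None, None, 4.439]
OFFLINE_WEIGHTS: List[Optional[float]] = [0.604, None, None, None, None, None, 3.237]


def lookup_weight(value: float, cuts: Sequence[float],
                  weights: Sequence[Optional[float]]) -> float:
    """Steps 103 and 1041: find the interval of value, then its trained weight."""
    idx = bisect_left(cuts[1:-1], value)
    weight = weights[idx]
    if weight is None:
        raise ValueError(f"weight for interval {idx + 1} is not given in the example")
    return weight


def fuse_by_sum(first_predicted: float, second_predicted: float) -> float:
    """Step 1042 in the optional embodiment: sum the two interval weights."""
    first_weight = lookup_weight(first_predicted, ONLINE_CUTS, ONLINE_WEIGHTS)
    second_weight = lookup_weight(second_predicted, OFFLINE_CUTS, OFFLINE_WEIGHTS)
    return first_weight + second_weight


fused = fuse_by_sum(0.66, 0.25)   # 4.439 + 3.237
```

At request time this step involves only two interval lookups and one addition, so the fusion itself adds very little latency on top of the online model's score calculation.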
The effects of the technical solutions provided by the embodiments of this specification include the following:
Weights obtained through machine learning are used to fuse the predicted value of the online prediction model with the predicted value of the offline prediction model, and the final score obtained by the fusion is used to predict the user's label. This improves the accuracy of predicting the user's label while also meeting the business requirement for low latency. In addition, entropy-based binning and a logistic regression model are used to effectively integrate the online model score and the offline model score, so that the comparability between online and offline scores is adjusted adaptively during machine learning.
Corresponding to the above method embodiments, the embodiments of this specification also provide an apparatus for fusing model predicted values.
Referring to Fig. 3, in one embodiment, for the training stage of the fusion weights, an apparatus 300 for determining fusion weights may include:
A binning unit 301, configured to: based on a number of given samples, bin the predicted values of the online prediction model and the predicted values of the offline prediction model separately according to a set binning method, where each of the samples includes a first predicted value, a second predicted value, and a label of the sample, the first predicted value being predicted by the online prediction model and the second predicted value being predicted by the offline prediction model;
A feature conversion unit 302, configured to: according to the binning result, convert the first predicted value of each sample into a first interval feature corresponding to the interval in which the first predicted value falls, and convert the second predicted value of each sample into a second interval feature corresponding to the interval in which the second predicted value falls;
A training unit 303, configured to: form converted sample data from the first interval feature, the second interval feature, and the label corresponding to each sample, and train a model with the converted sample data, where the trained model is used to fuse the predicted value of the online prediction model and the predicted value of the offline prediction model to obtain a final predicted value.
Referring to Fig. 4, in one embodiment, for the score fusion stage, an apparatus 400 for fusing model predicted values may include:
An online score prediction unit 401, configured to: obtain business data generated by a target user in a first time period before a trigger moment, determine input features according to the business data, input them into the online prediction model, and output a first predicted value, where the online prediction model is used to predict a user's label;
An offline score obtaining unit 402, configured to: obtain a second predicted value that corresponds to the target user and was obtained with the offline prediction model, where the input features of the offline prediction model are determined according to business features generated by the target user in a past second time period, and the offline prediction model is used to predict a user's label;
An interval determination unit 403, configured to: determine, according to the result of binning performed in advance on the predicted values of the online prediction model and the predicted values of the offline prediction model, the first interval in which the first predicted value falls and the second interval in which the second predicted value falls;
A score fusion unit 404, configured to: according to the first interval and the second interval, fuse the first predicted value and the second predicted value with a model obtained by training in advance to obtain a final fused predicted value, where the fused predicted value is used to determine the label of the target user.
In an optional embodiment, the score fusion unit 404 may include:
A weight determination subunit, which, based on predetermined weights corresponding to the intervals obtained by binning, obtains a first weight corresponding to the first interval and a second weight corresponding to the second interval;
A fusion subunit, which determines the fused predicted value using the first weight and the second weight, where the fused predicted value is used to determine the label of the target user.
In one embodiment, the fusion subunit may be configured to:
Sum the first weight and the second weight, and use the sum as the fused predicted value.
For the implementation of the functions and roles of the modules in the above apparatus, refer to the implementation of the corresponding steps in the above method; details are not repeated here.
The embodiments of this specification also provide a computer device (such as a server) that includes at least a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the foregoing method when executing the program.
Fig. 5 shows a more specific schematic diagram of the hardware structure of a computing device provided by the embodiments of this specification. The device may include a processor 1010, a memory 1020, an input/output interface 1030, a communication interface 1040, and a bus 1050. The processor 1010, the memory 1020, the input/output interface 1030, and the communication interface 1040 are communicatively connected to one another inside the device through the bus 1050.
The processor 1010 may be implemented as a general-purpose CPU (Central Processing Unit), a microprocessor, an application-specific integrated circuit (ASIC), one or more integrated circuits, or the like, and is used to execute the relevant programs to implement the technical solutions provided by the embodiments of this specification.
The memory 1020 may be implemented in the form of ROM (Read Only Memory), RAM (Random Access Memory), a static storage device, a dynamic storage device, or the like. The memory 1020 may store an operating system and other application programs. When the technical solutions provided by the embodiments of this specification are implemented in software or firmware, the relevant program code is stored in the memory 1020 and is called and executed by the processor 1010.
The input/output interface 1030 is used to connect an input/output module to implement information input and output. The input/output module may be configured in the device as a component (not shown in the figure) or may be externally connected to the device to provide the corresponding functions. Input devices may include a keyboard, a mouse, a touch screen, a microphone, various sensors, and the like, and output devices may include a display, a speaker, a vibrator, an indicator light, and the like.
The communication interface 1040 is used to connect a communication module (not shown in the figure) to implement communication between this device and other devices. The communication module may communicate in a wired manner (such as USB or a network cable) or in a wireless manner (such as a mobile network, Wi-Fi, or Bluetooth).
The bus 1050 includes a path that transfers information between the components of the device (such as the processor 1010, the memory 1020, the input/output interface 1030, and the communication interface 1040).
It should be noted that although only the processor 1010, the memory 1020, the input/output interface 1030, the communication interface 1040, and the bus 1050 are shown for the above device, in a specific implementation the device may also include other components necessary for normal operation. In addition, those skilled in the art will understand that the above device may also include only the components necessary to implement the solutions of the embodiments of this specification, and need not include all the components shown in the figure.
From the above description of the embodiments, those skilled in the art can clearly understand that the embodiments of this specification may be implemented by software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solutions of the embodiments of this specification, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product. The computer software product may be stored in a storage medium, such as ROM/RAM, a magnetic disk, or an optical disc, and includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the embodiments of this specification or in certain parts of the embodiments.
The systems, apparatuses, modules, or units described in the above embodiments may be implemented by a computer chip or an entity, or by a product having a certain function. A typical implementation device is a computer, which may take the form of a personal computer, a laptop computer, a cellular phone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an e-mail sending and receiving device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
The embodiments in this specification are described in a progressive manner. For identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. In particular, the apparatus embodiments are described relatively briefly because they are substantially similar to the method embodiments; for the relevant parts, refer to the description of the method embodiments. The apparatus embodiments described above are merely illustrative. The modules described as separate components may or may not be physically separate, and when the solutions of the embodiments of this specification are implemented, the functions of the modules may be implemented in one or more pieces of software and/or hardware. Some or all of the modules may also be selected according to actual needs to achieve the purpose of the solution of an embodiment. Those of ordinary skill in the art can understand and implement this without creative effort.
The above are only specific implementations of the embodiments of this specification. It should be noted that those of ordinary skill in the art may make several improvements and modifications without departing from the principles of the embodiments of this specification, and these improvements and modifications shall also be regarded as falling within the scope of protection of the embodiments of this specification.