CN108052979A - Method, apparatus and device for fusing model prediction values - Google Patents

Publication number: CN108052979A
Authority: CN (China)
Prior art keywords: predicted value, value, prediction model, interval, model
Legal status: Pending
Application number: CN201711353984.1A
Other languages: Chinese (zh)
Inventors: 方文静, 周俊
Current Assignee: Advanced New Technologies Co Ltd; Advantageous New Technologies Co Ltd
Original Assignee: Alibaba Group Holding Ltd
Application filed by Alibaba Group Holding Ltd
Priority to CN201711353984.1A
Publication of CN108052979A
Priority to TW107135970A (TWI718422B)
Priority to PCT/CN2018/111824 (WO2019114423A1)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/251 Fusion techniques of input or preprocessed data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A method, apparatus and device for fusing model prediction values are disclosed. The method includes: based on a given set of samples, binning the predicted values of an online prediction model and the predicted values of an offline prediction model separately, according to a set binning method; according to the binning result, converting the first predicted value of each sample into a first interval feature corresponding to the interval in which that first predicted value falls, and converting the second predicted value of each sample into a second interval feature corresponding to the interval in which that second predicted value falls; and forming converted sample data from the first interval feature, the second interval feature and the label of each sample, and training a model with the converted sample data, the trained model being used to fuse the predicted value of the online prediction model with the predicted value of the offline prediction model to obtain a final predicted value.

Description

Method, apparatus and device for fusing model prediction values
Technical field
This specification relates to the field of machine learning, and in particular to a method, apparatus and device for fusing model prediction values.
Background
A machine learning algorithm is an algorithm that automatically analyzes data to learn patterns and then uses those patterns to make predictions on unknown data. Such algorithms are widely used in many fields.
In practice, prediction models include online prediction models and offline prediction models. An offline prediction model is usually run as a scheduled task. Its advantage is that it can take in higher-dimensional features and use more complex algorithms, and so achieve more accurate predictions; however, because the features are numerous and the algorithm is complex, prediction is usually time-consuming. An online prediction model, by contrast, uses lower-dimensional features and simpler algorithms and so predicts more efficiently; its shortcoming is that its features are less rich and its accuracy is lower. Each kind of model thus has its own strengths, and how to fuse the two sensibly is a problem the industry urgently needs to solve.
Summary of the invention
To address the above technical problem, the embodiments of this specification provide a method, apparatus and device for fusing model prediction values. The technical solution is as follows:
In one aspect, a method for fusing model prediction values is proposed, including:
based on a given set of samples, binning the predicted values of an online prediction model and the predicted values of an offline prediction model separately, according to a set binning method, where each sample includes a first predicted value, a second predicted value and a label of the sample, the first predicted value being predicted by the online prediction model and the second predicted value being predicted by the offline prediction model;
according to the binning result, converting the first predicted value of each sample into a first interval feature corresponding to the interval in which that first predicted value falls, and converting the second predicted value of each sample into a second interval feature corresponding to the interval in which that second predicted value falls;
forming converted sample data from the first interval feature, the second interval feature and the label of each sample, and training a model with the converted sample data, the trained model being used to fuse the predicted value of the online prediction model with the predicted value of the offline prediction model to obtain a final predicted value.
In one aspect, a method for fusing model prediction values is proposed, including:
obtaining the business data generated by a target user in a first time period, determining input features from the business data, feeding them into an online prediction model, and outputting a first predicted value;
obtaining a second predicted value for the target user that was produced by an offline prediction model, where the input features of the offline prediction model are determined from the business features generated by the target user in a second time period;
obtaining the result of binning the first predicted values of the online prediction model and the second predicted values of the offline prediction model, and determining the first interval in which the first predicted value falls and the second interval in which the second predicted value falls;
according to the first interval and the second interval, fusing the first predicted value and the second predicted value with a pre-trained model to obtain a final fused prediction value, the fused prediction value being used to determine the label of the target user.
In one aspect, an apparatus for fusing model prediction values is proposed, including:
a binning unit that, based on a given set of samples, bins the predicted values of an online prediction model and the predicted values of an offline prediction model separately, according to a set binning method, where each sample includes a first predicted value, a second predicted value and a label of the sample, the first predicted value being predicted by the online prediction model and the second predicted value being predicted by the offline prediction model;
a feature conversion unit that, according to the binning result, converts the first predicted value of each sample into a first interval feature corresponding to the interval in which that first predicted value falls, and converts the second predicted value of each sample into a second interval feature corresponding to the interval in which that second predicted value falls;
a training unit that forms converted sample data from the first interval feature, the second interval feature and the label of each sample, and trains a model with the converted sample data, the trained model being used to fuse the predicted value of the online prediction model with the predicted value of the offline prediction model to obtain a final predicted value.
In one aspect, an apparatus for fusing model prediction values is proposed, including:
an online score prediction unit that obtains the business data generated by a target user in a first time period before a trigger time, determines input features from the business data, feeds them into an online prediction model, and outputs a first predicted value, the online prediction model being used to predict the label of a user;
an offline score obtaining unit that obtains a second predicted value for the target user that was produced by an offline prediction model, where the input features of the offline prediction model are determined from the business features generated by the target user in a past second time period, the offline prediction model being used to predict the label of a user;
an interval determination unit that, according to the result of binning performed in advance on the predicted values of the online prediction model and the predicted values of the offline prediction model, determines the first interval in which the first predicted value falls and the second interval in which the second predicted value falls;
a score fusion unit that, according to the first interval and the second interval, fuses the first predicted value and the second predicted value with a pre-trained model to obtain a final fused prediction value, the fused prediction value being used to determine the label of the target user.
In one aspect, a computer device is proposed, including:
a processor;
a memory for storing processor-executable instructions;
the processor being configured to:
based on a given set of samples, bin the predicted values of an online prediction model and the predicted values of an offline prediction model separately, according to a set binning method, where each sample includes a first predicted value, a second predicted value and a label of the sample, the first predicted value being predicted by the online prediction model and the second predicted value being predicted by the offline prediction model;
according to the binning result, convert the first predicted value of each sample into a first interval feature corresponding to the interval in which that first predicted value falls, and convert the second predicted value of each sample into a second interval feature corresponding to the interval in which that second predicted value falls;
form converted sample data from the first interval feature, the second interval feature and the label of each sample, and train a model with the converted sample data, the trained model being used to fuse the predicted value of the online prediction model with the predicted value of the offline prediction model to obtain a final predicted value.
In one aspect, a computer device is proposed, including:
a processor;
a memory for storing processor-executable instructions;
the processor being configured to:
perform online score prediction: obtain the business data generated by a target user in a first time period before a trigger time, determine input features from the business data, feed them into an online prediction model, and output a first predicted value, the online prediction model being used to predict the label of a user;
obtain an offline score: obtain a second predicted value for the target user that was produced by an offline prediction model, where the input features of the offline prediction model are determined from the business features generated by the target user in a past second time period, the offline prediction model being used to predict the label of a user;
determine intervals: according to the result of binning performed in advance on the predicted values of the online prediction model and the predicted values of the offline prediction model, determine the first interval in which the first predicted value falls and the second interval in which the second predicted value falls;
fuse scores: according to the first interval and the second interval, fuse the first predicted value and the second predicted value with a pre-trained model to obtain a final fused prediction value, the fused prediction value being used to determine the label of the target user.
The effects of the technical solution provided by the embodiments of this specification include:
The predicted value of the online prediction model and the predicted value of the offline prediction model are fused by a model obtained through machine learning, and the user's label is predicted from the final fused score. This improves the accuracy of predicting the user's label while still meeting the business's low-latency requirement.
It should be understood that the above general description and the following detailed description are only exemplary and explanatory and do not limit the embodiments of this specification.
In addition, no single embodiment of this specification needs to achieve all of the above effects.
Brief description of the drawings
To explain the embodiments of this specification or the technical solutions of the prior art more clearly, the drawings needed to describe the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some of the embodiments described in this specification; those of ordinary skill in the art can obtain other drawings from them.
Fig. 1 is a schematic flowchart of a method for fusing model prediction values provided by an embodiment of this specification;
Fig. 2 shows a process for determining fusion weights provided by an embodiment of this specification;
Fig. 3 is a schematic structural diagram of an apparatus for fusing model prediction values (weight training stage) provided by an embodiment of this specification;
Fig. 4 is a schematic structural diagram of an apparatus for fusing model prediction values (score fusion stage) provided by an embodiment of this specification;
Fig. 5 is a schematic structural diagram of a device for configuring the apparatus of the embodiments of this specification.
Detailed description
To help those skilled in the art better understand the technical solutions in the embodiments of this specification, these solutions are described in detail below with reference to the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of this specification, not all of them. All other embodiments obtained by those of ordinary skill in the art based on the embodiments in this specification shall fall within the scope of protection.
Referring to Fig. 1, in one embodiment of this specification, a method for fusing model prediction values is used to fuse the score obtained by the online prediction model with the score obtained by the offline prediction model. The method may include the following steps 101 to 104:
Step 101: Obtain the business data generated by a target user in a first time period, determine input features from the business data, feed them into the online prediction model, and output a first predicted value.
Step 102: Obtain a second predicted value for the target user that was produced by the offline prediction model, where the input features of the offline prediction model are determined from the business features generated by the target user in a second time period.
Here, the online prediction model and the offline prediction model are both models built with machine learning algorithms to predict a user's label. The user labels to be predicted by the two models depend on the specific business. For example, for a network payment business, the labels may be "high-risk user", "risk user", "low-risk user", and so on; for an information recommendation business, the labels may be "sports", "education", "finance and economics", and so on. Both models are trained on a certain number of training samples, each of which may include one or more items of behavioral data generated by a sample user while participating in a specific business (such as the network payment business), together with the label determined for that sample user. The same batch of samples may be used to train both models, or two different batches may be used; no limitation is imposed here.
In the embodiments of this specification, the offline prediction model can be run as a scheduled task, for example performing one offline score prediction at a specified time each day or in a specified time period, and that prediction can cover all users. The online prediction model can be triggered by the operation of a specific user; for example, a user clicking on a web page can trigger one score calculation by the online prediction model.
Compared with the online prediction model, the offline prediction model generally uses higher-dimensional feature data, that data may span a longer period, and a more complex algorithm may be used. As shown in Fig. 1, as a specific example, on day T the offline prediction model can obtain the business data (feature A) that each user generated on day T-1 while participating in a specific business, process it to obtain input features, feed those into the offline prediction model to obtain each user's offline prediction score (the second predicted value herein), and write the score to database X. For the online prediction model, the user's online feature data (feature B) can be collected continuously and written to database Y, where the online feature data may be near-real-time business data generated by the user while participating in the specific business. For example, if the online prediction is triggered at time t1, the online feature data can be the business data generated in the period from t0 to t1 (for example, 3 minutes). Thus, when a request arrives for a user who initiates the prediction process, the scheduler has two tasks: first, to read from database X the most recent second predicted value for the target user, as calculated by the offline prediction model; second, to read the target user's online feature data from database Y and carry out the online prediction model's score prediction.
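The two scheduler tasks described above can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the dictionaries standing in for database X and database Y, the user ID, the feature values and the `toy_online_model` function are all hypothetical placeholders.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical in-memory stand-ins for database X (offline scores) and
# database Y (near-real-time online features); all contents are invented.
database_x: Dict[str, float] = {"user_42": 0.25}
database_y: Dict[str, List[float]] = {"user_42": [1.0, 0.0, 3.0]}


def toy_online_model(features: List[float]) -> float:
    """Placeholder online model: squashes the feature sum into [0, 1)."""
    total = sum(features)
    return total / (1.0 + total)


def handle_request(
    user_id: str,
    online_model: Callable[[List[float]], float],
) -> Tuple[float, float]:
    """Return (first predicted value, second predicted value) for one user."""
    # Task 1: read the most recent offline score from database X.
    second_value = database_x[user_id]
    # Task 2: read the online features from database Y and score them now.
    first_value = online_model(database_y[user_id])
    return first_value, second_value


first, second = handle_request("user_42", toy_online_model)
```

The point of the split is that the expensive offline score is only looked up at request time, while the cheap online score is computed on the spot.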
At this point, for any target user, one prediction score has been obtained from the online prediction model and one from the offline prediction model.
Step 103: According to the result of binning performed in advance on the predicted values of the online prediction model and the predicted values of the offline prediction model, determine the first interval in which the first predicted value falls and the second interval in which the second predicted value falls.
Step 104: According to the first interval and the second interval, fuse the first predicted value and the second predicted value with a pre-trained model to obtain a final fused prediction value, where the fused prediction value is used to determine the label of the target user.
In an optional embodiment, step 104 may specifically include:
Step 1041: Based on the predetermined weight for each interval obtained by binning, obtain the first weight corresponding to the first interval and the second weight corresponding to the second interval. Here, the model's parameters to be trained include the weight for each interval obtained by binning.
Step 1042: Determine the fused prediction value from the first weight and the second weight, the fused prediction value being used to determine the label of the target user.
Because steps 103 and 104 rely on the binning result and on the weight for each interval obtained by binning, a method for determining the fusion weights is introduced before steps 103 and 104 are described in detail. As shown in Fig. 2, in one embodiment the method includes steps 201 to 203:
Step 201: Based on a given set of samples, bin the predicted values of the online prediction model and the predicted values of the offline prediction model separately, according to a set binning method, where each sample includes a first predicted value, a second predicted value and a label of the sample, the first predicted value being predicted by the online prediction model and the second predicted value being predicted by the offline prediction model.
The samples referred to in step 201 may be the same as those used to train the offline prediction model and/or the online prediction model, or they may be different samples; no limitation is imposed here.
In one embodiment, the set binning method may be entropy-based binning. Entropy-based binning takes the value of the dependent variable into account during binning, so that entropy is minimized after binning. Its benefit is that it shows better discrimination in the high-score region. Of course, the set binning method may also be Gini-based binning, equal-frequency binning, and so on.
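As a rough illustration of entropy-based binning, the sketch below greedily picks the cut point that minimizes the size-weighted label entropy of the two sides and then recurses into each side. The greedy recursion, the `max_depth` stopping rule and the toy scores and labels are assumptions made for this sketch; the text does not specify how the minimum-entropy cut points are searched for.

```python
import math
from collections import Counter
from typing import List, Optional, Sequence, Tuple


def entropy(labels: Sequence[str]) -> float:
    """Shannon entropy (in bits) of a label sequence."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_cut(scores: Sequence[float], labels: Sequence[str]) -> Optional[float]:
    """Cut point that most reduces the size-weighted entropy, or None."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    best: Tuple[float, Optional[float]] = (entropy(labels), None)
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot cut between identical scores
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / n
        if weighted < best[0]:
            best = (weighted, (pairs[i - 1][0] + pairs[i][0]) / 2)
    return best[1]


def entropy_bins(
    scores: Sequence[float], labels: Sequence[str], max_depth: int = 2
) -> List[float]:
    """Recursively split on the best cut; returns sorted interior cut points."""
    if max_depth == 0:
        return []
    cut = best_cut(scores, labels)
    if cut is None:
        return []
    low = [(s, lab) for s, lab in zip(scores, labels) if s <= cut]
    high = [(s, lab) for s, lab in zip(scores, labels) if s > cut]
    left_cuts = entropy_bins([s for s, _ in low], [lab for _, lab in low], max_depth - 1)
    right_cuts = entropy_bins([s for s, _ in high], [lab for _, lab in high], max_depth - 1)
    return left_cuts + [cut] + right_cuts


# Invented toy data: low scores are labelled "low", high scores "high".
toy_scores = [0.05, 0.1, 0.2, 0.6, 0.7, 0.9]
toy_labels = ["low", "low", "low", "high", "high", "high"]
cuts = entropy_bins(toy_scores, toy_labels)
```

On this toy data a single cut between 0.2 and 0.6 already separates the labels, so no further splits are made. The same routine would be run once on the online scores and once on the offline scores.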
Step 202: According to the binning result, convert the first predicted value of each sample into a first interval feature corresponding to the interval in which that first predicted value falls, and convert the second predicted value of each sample into a second interval feature corresponding to the interval in which that second predicted value falls.
In one example, assume that the first and second predicted values both lie between 0 and 1. After binning the predicted values of the online prediction model, the resulting cut points are: 0, 0.1, 0.13, 0.15, 0.2, 0.3, 0.5, 1. After binning the predicted values of the offline prediction model, the resulting cut points are: 0, 0.03, 0.05, 0.08, 0.09, 0.11, 0.13, 1. That is, binning yields 7 intervals for the output of each of the two models.
In one embodiment, one-hot encoding may be used for the feature conversion of step 202. Suppose a sample's first predicted value is 0.17 and its second predicted value is 0.12. Since 0.17 lies in the 4th interval (0.15, 0.2) and 0.12 lies in the 6th interval (0.11, 0.13), one-hot encoding converts the first predicted value 0.17 into the first interval feature on-bin-0001000 ("on-bin" identifies the online prediction model) and the second predicted value 0.12 into the second interval feature off-bin-0000010 ("off-bin" identifies the offline prediction model). The first and second predicted values of every other sample can be converted in the same way.
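The conversion in this example can be reproduced with a short sketch. The cut points and the "on-bin"/"off-bin" prefixes come from the text; the function names and the handling of values that land exactly on a cut point are assumptions, since the text does not say which side of an interval is closed.

```python
from bisect import bisect_left
from typing import Sequence

# Cut points taken from the example in the text.
ONLINE_CUTS = [0, 0.1, 0.13, 0.15, 0.2, 0.3, 0.5, 1]
OFFLINE_CUTS = [0, 0.03, 0.05, 0.08, 0.09, 0.11, 0.13, 1]


def bin_index(value: float, cuts: Sequence[float]) -> int:
    """0-based index of the interval that contains value.

    Assumption: a value equal to a cut point is placed in the lower
    interval, i.e. intervals are treated as (low, high].
    """
    if not cuts[0] <= value <= cuts[-1]:
        raise ValueError(f"value {value} lies outside the binned range")
    return max(bisect_left(cuts, value) - 1, 0)


def one_hot(value: float, cuts: Sequence[float], prefix: str) -> str:
    """Encode value as '<prefix>-<bits>' with a single 1 at its interval."""
    n_bins = len(cuts) - 1
    idx = bin_index(value, cuts)
    bits = "".join("1" if i == idx else "0" for i in range(n_bins))
    return f"{prefix}-{bits}"


first_feature = one_hot(0.17, ONLINE_CUTS, "on-bin")     # on-bin-0001000
second_feature = one_hot(0.12, OFFLINE_CUTS, "off-bin")  # off-bin-0000010
```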
Step 203: Form converted sample data from the first interval feature, the second interval feature and the label of each sample, and train a model with the converted sample data. The trained model is used to fuse the predicted value of the online prediction model with the predicted value of the offline prediction model to obtain a final predicted value.
Besides the first interval feature, the second interval feature and the label of the sample, the converted sample data may also include other data. That is, "form" here is open-ended rather than closed.
In the above example, before feature conversion, a sample is, for example:
{ 0.17, 0.12, "risk user" };
After feature conversion, the resulting new sample is, for example:
{ 0001000, 0000010, "risk user" }
The model to be trained here may be a linear model or a nonlinear model. In an embodiment that uses a linear model, the parameters to be trained may include a weight for each interval obtained by binning, and these weights can be used to fuse the predicted value of the online prediction model with the predicted value of the offline prediction model to obtain the final predicted value. The model to be trained may be a logistic regression (LR) model: a weight is assigned to each interval obtained by binning, the weights are trained as the parameters of the LR model, and the value of each weight is finally solved. Each weight can be regarded as a score for its interval, a score learned by globally balancing importance both across the features of the different models (online and offline) and across the score intervals.
Continuing the example above, the following weights may finally be obtained:
Weight of interval (0, 0.1): on-bin-1 = 1.054,
……
Weight of interval (0.5, 1): on-bin-7 = 4.439;
Weight of interval (0, 0.03): off-bin-1 = 0.604,
……
Weight of interval (0.13, 1): off-bin-7 = 3.237.
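The training step that produces such per-interval weights might look like the sketch below: a plain gradient-descent logistic regression over the concatenated one-hot features. Everything specific here is an assumption made for illustration: the binary label (1 for a risky user, 0 otherwise), the two-bins-per-model toy data, the learning rate and the epoch count. The learned numbers therefore will not match the weights listed above.

```python
import math
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[int], int]  # (one-hot features, binary label)


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train_lr(
    samples: Sequence[Sample], n_features: int, epochs: int = 500, lr: float = 0.5
) -> Tuple[List[float], float]:
    """Batch gradient descent on the logistic loss; returns (weights, bias)."""
    weights = [0.0] * n_features
    bias = 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in samples:
            err = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Invented toy data. Feature layout: [on-bin-1, on-bin-2, off-bin-1, off-bin-2].
toy_samples: List[Sample] = [
    ([1, 0, 1, 0], 0),
    ([1, 0, 1, 0], 0),
    ([0, 1, 0, 1], 1),
    ([0, 1, 0, 1], 1),
    ([0, 1, 1, 0], 1),
    ([1, 0, 0, 1], 0),
]
weights, bias = train_lr(toy_samples, n_features=4)
score = sigmoid(bias + weights[1] + weights[3])  # sample in on-bin-2 and off-bin-2
```

Because each sample activates exactly one online bin and one offline bin, each learned weight acts as the score of its interval, which is what the fusion step later looks up.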
Next, steps 103 and 104 are explained using the specific example above. Suppose that, for some target user, the first predicted value obtained from the online prediction model is 0.66 and the second predicted value obtained from the offline prediction model is 0.25. With reference to the example above, step 103 first determines that the first interval, in which the first predicted value 0.66 falls, is (0.5, 1), and that the second interval, in which the second predicted value 0.25 falls, is (0.13, 1). Then, in step 1041, based on the predetermined weight for each interval obtained by binning, the first weight corresponding to the first interval (0.5, 1) is found to be 4.439, and the second weight corresponding to the second interval (0.13, 1) is found to be 3.237.
Finally, in step 1042, the final fused prediction value can be determined from the first weight and the second weight. In an optional embodiment, the first weight and the second weight are summed and the sum is taken as the fused prediction value, i.e. fused prediction value = 4.439 + 3.237 = 7.676. Of course, fusion is not limited to summation; averaging, for example, may also be used. How the fused prediction value is ultimately used can be decided according to the specific business.
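Steps 103, 1041 and 1042 for this example can be sketched as a lookup followed by a sum. Only the weights quoted in the text are filled in; the weights of the intermediate intervals are elided in the source and are deliberately left out, so a score that falls into one of them raises a `KeyError` in this sketch. The function names and the boundary handling are assumptions.

```python
from bisect import bisect_left
from typing import Dict, Sequence

# Cut points from the example in the text.
ONLINE_CUTS = [0, 0.1, 0.13, 0.15, 0.2, 0.3, 0.5, 1]
OFFLINE_CUTS = [0, 0.03, 0.05, 0.08, 0.09, 0.11, 0.13, 1]

# 0-based interval index -> weight. Only the values given in the text
# appear here; the elided intervals are intentionally missing.
ONLINE_WEIGHTS: Dict[int, float] = {0: 1.054, 6: 4.439}
OFFLINE_WEIGHTS: Dict[int, float] = {0: 0.604, 6: 3.237}


def bin_index(value: float, cuts: Sequence[float]) -> int:
    """0-based interval index; cut-point values go to the lower interval."""
    if not cuts[0] <= value <= cuts[-1]:
        raise ValueError(f"value {value} lies outside the binned range")
    return max(bisect_left(cuts, value) - 1, 0)


def fuse(first_value: float, second_value: float) -> float:
    """Step 103: locate both intervals. Step 1041: look up both weights.
    Step 1042: sum them to get the fused prediction value."""
    first_weight = ONLINE_WEIGHTS[bin_index(first_value, ONLINE_CUTS)]
    second_weight = OFFLINE_WEIGHTS[bin_index(second_value, OFFLINE_CUTS)]
    return first_weight + second_weight


fused = fuse(0.66, 0.25)  # approximately 7.676, up to floating-point rounding
```

Swapping the last line of `fuse` for an average would give the alternative fusion mentioned in the text.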
The effects of the technical solution provided by the embodiments of this specification include:
The predicted value of the online prediction model and the predicted value of the offline prediction model are fused using weights obtained through machine learning, and the user's label is predicted from the final fused score. This improves the accuracy of predicting the user's label while still meeting the business's low-latency requirement. In addition, entropy-based binning together with the logistic regression model effectively integrates the online model score and the offline model score, so that the comparability between the online and offline scores is adjusted adaptively during machine learning.
Corresponding to the above method embodiments, the embodiments of this specification also provide an apparatus for fusing model prediction values.
Referring to Fig. 3, in one embodiment, in the training stage of the fusion weights, an apparatus 300 for determining fusion weights may include:
a binning unit 301, configured to: based on a given set of samples, bin the predicted values of the online prediction model and the predicted values of the offline prediction model separately, according to a set binning method, where each sample includes a first predicted value, a second predicted value and a label of the sample, the first predicted value being predicted by the online prediction model and the second predicted value being predicted by the offline prediction model;
a feature conversion unit 302, configured to: according to the binning result, convert the first predicted value of each sample into a first interval feature corresponding to the interval in which that first predicted value falls, and convert the second predicted value of each sample into a second interval feature corresponding to the interval in which that second predicted value falls;
a training unit 303, configured to: form converted sample data from the first interval feature, the second interval feature and the label of each sample, and train a model with the converted sample data, the trained model being used to fuse the predicted value of the online prediction model with the predicted value of the offline prediction model to obtain a final predicted value.
Referring to Fig. 4, in one embodiment, in the score fusion stage, an apparatus 400 for fusing model prediction values may include:
an online score prediction unit 401, configured to: obtain the business data generated by a target user in a first time period before a trigger time, determine input features from the business data, feed them into the online prediction model, and output a first predicted value, the online prediction model being used to predict the label of a user;
an offline score obtaining unit 402, configured to: obtain a second predicted value for the target user that was produced by the offline prediction model, where the input features of the offline prediction model are determined from the business features generated by the target user in a past second time period, the offline prediction model being used to predict the label of a user;
an interval determination unit 403, configured to: according to the result of binning performed in advance on the predicted values of the online prediction model and the predicted values of the offline prediction model, determine the first interval in which the first predicted value falls and the second interval in which the second predicted value falls;
a score fusion unit 404, configured to: according to the first interval and the second interval, fuse the first predicted value and the second predicted value with a pre-trained model to obtain a final fused prediction value, the fused prediction value being used to determine the label of the target user.
In an optional embodiment, the score fusion unit 404 may include:
A weight determination subunit, which, based on the predetermined weights corresponding to the intervals obtained by binning, obtains a first weight corresponding to the first interval and a second weight corresponding to the second interval;
A fusion subunit, which determines the fused prediction value from the first weight and the second weight, the fused prediction value being used to determine the label of the target user.
In one embodiment, the fusion subunit may be configured to:
Sum the first weight and the second weight, and use the summed result as the fused prediction value.
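The interval lookup and the summation performed by the score fusion unit can be sketched as below. This is a minimal sketch, not the patented implementation: the function name `fuse_predictions`, the representation of a binning result as a sorted list of cut points, and all numeric values in the usage note are hypothetical.

```python
import bisect


def fuse_predictions(first_pred, second_pred,
                     online_edges, offline_edges,
                     online_weights, offline_weights):
    """Fuse two prediction values by summing their interval weights.

    online_edges / offline_edges are sorted cut points from the earlier
    binning; online_weights / offline_weights hold one weight per interval.
    """
    # Interval determination: locate the interval each prediction falls in.
    first_interval = bisect.bisect_right(online_edges, first_pred)
    second_interval = bisect.bisect_right(offline_edges, second_pred)

    # Weight determination: look up the weight of each interval.
    first_weight = online_weights[first_interval]
    second_weight = offline_weights[second_interval]

    # Fusion: the summed weights serve as the fused prediction value.
    return first_weight + second_weight
```

For example, with cut points `[0.25, 0.5, 0.75]` for both models, online weights `[-1.0, -0.5, 0.5, 1.0]`, and offline weights `[-0.75, -0.25, 0.25, 0.75]`, a first prediction value of 0.8 falls in the last online interval and a second prediction value of 0.3 falls in the second offline interval, so the fused value is the sum of 1.0 and -0.25. How the fused value is mapped to a label (for instance, by comparing it with a threshold) is left open here.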
For the functions and effects of the modules in the above apparatuses, refer to the implementation of the corresponding steps in the above methods; details are not repeated here.
The embodiments of this specification also provide a computer device (for example, a server) that includes at least a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the foregoing method when executing the program.
Fig. 5 shows a more specific hardware structure of a computing device provided by the embodiments of this specification. The device may include a processor 1010, a memory 1020, an input/output interface 1030, a communication interface 1040, and a bus 1050. The processor 1010, the memory 1020, the input/output interface 1030, and the communication interface 1040 are communicatively connected to one another inside the device through the bus 1050.
The processor 1010 may be implemented as a general-purpose CPU (Central Processing Unit), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits, and executes the relevant programs to implement the technical solutions provided by the embodiments of this specification.
The memory 1020 may be implemented as ROM (Read Only Memory), RAM (Random Access Memory), a static storage device, a dynamic storage device, or the like. The memory 1020 may store an operating system and other application programs; when the technical solutions provided by the embodiments of this specification are implemented in software or firmware, the relevant program code is stored in the memory 1020 and is called and executed by the processor 1010.
The input/output interface 1030 connects input/output modules to provide information input and output. An input/output module may be configured in the device as a component (not shown) or may be attached externally to provide the corresponding function. Input devices may include a keyboard, a mouse, a touch screen, a microphone, and various sensors; output devices may include a display, a loudspeaker, a vibrator, and indicator lights.
The communication interface 1040 connects a communication module (not shown) so that this device can communicate with other devices. The communication module may communicate in a wired manner (for example, USB or a network cable) or in a wireless manner (for example, a mobile network, Wi-Fi, or Bluetooth).
The bus 1050 includes a path that transfers information between the components of the device (for example, the processor 1010, the memory 1020, the input/output interface 1030, and the communication interface 1040).
It should be noted that although only the processor 1010, the memory 1020, the input/output interface 1030, the communication interface 1040, and the bus 1050 are shown for the above device, in a specific implementation the device may also include other components necessary for normal operation. In addition, those skilled in the art will understand that the above device may include only the components necessary to implement the solutions of the embodiments of this specification, and need not include all the components shown in the figure.
From the above description of the embodiments, those skilled in the art can clearly understand that the embodiments of this specification can be implemented by software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solutions of the embodiments of this specification, in essence or in the part that contributes over the prior art, can be embodied in the form of a software product. The computer software product can be stored in a storage medium, such as ROM/RAM, a magnetic disk, or an optical disc, and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform the methods described in the embodiments of this specification or in certain parts of the embodiments.
The systems, apparatuses, modules, or units described in the above embodiments may be implemented by a computer chip or an entity, or by a product having a certain function. A typical implementing device is a computer, which may take the form of a personal computer, a laptop computer, a cellular phone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an e-mail sending and receiving device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
The embodiments in this specification are described in a progressive manner. For the parts that are the same or similar across embodiments, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. The apparatus embodiments in particular are described briefly because they are substantially similar to the method embodiments; for the relevant parts, refer to the description of the method embodiments. The apparatus embodiments described above are merely illustrative: the modules described as separate components may or may not be physically separate, and when the solutions of the embodiments of this specification are implemented, the functions of the modules may be realized in one or more pieces of software and/or hardware. Some or all of the modules may also be selected according to actual needs to achieve the purpose of the solution of an embodiment. Those of ordinary skill in the art can understand and implement this without creative effort.
The above are only specific implementations of the embodiments of this specification. It should be noted that those of ordinary skill in the art may make several improvements and refinements without departing from the principles of the embodiments of this specification, and these improvements and refinements shall also be regarded as falling within the protection scope of the embodiments of this specification.

Claims (14)

1. A method for fusing model prediction values, comprising:
Based on a given number of samples, binning the prediction values of an online prediction model and the prediction values of an offline prediction model separately according to a set binning method, wherein each of the samples includes a first prediction value, a second prediction value, and a sample label, the first prediction value being predicted by the online prediction model and the second prediction value being predicted by the offline prediction model;
According to the binning result, converting the first prediction value of each sample into a first interval feature corresponding to the interval in which that first prediction value falls, and converting the second prediction value of each sample into a second interval feature corresponding to the interval in which that second prediction value falls;
Forming converted sample data from the first interval feature, the second interval feature, and the label of each sample, and training a model with the converted sample data, the trained model being used to fuse the prediction value of the online prediction model and the prediction value of the offline prediction model to obtain a final prediction value.
2. The method according to claim 1, wherein the set binning method comprises: an entropy-based binning method, a Gini-based binning method, or an equal-frequency binning method.
3. The method according to claim 1, wherein the parameters of the model to be trained include weights corresponding to the intervals obtained by binning, the weights being used to fuse the prediction value of the online prediction model and the prediction value of the offline prediction model to obtain a final prediction value.
4. A method for fusing model prediction values, comprising:
Obtaining business data generated by a target user in a first time period, determining input features from the business data and inputting them into an online prediction model, and outputting a first prediction value;
Obtaining a second prediction value corresponding to the target user that was obtained with an offline prediction model, wherein the input features of the offline prediction model are determined from business features generated by the target user in a second time period;
Obtaining the result of binning the prediction values of the online prediction model and the prediction values of the offline prediction model, and determining respectively the first interval in which the first prediction value falls and the second interval in which the second prediction value falls;
According to the first interval and the second interval, fusing the first prediction value and the second prediction value using a model obtained by training in advance to obtain a final fused prediction value, the fused prediction value being used to determine the label of the target user.
5. The method according to claim 3, wherein fusing the first prediction value and the second prediction value using the model obtained by training in advance to obtain the final fused prediction value comprises:
Based on the predetermined weights corresponding to the intervals obtained by binning, obtaining a first weight corresponding to the first interval and a second weight corresponding to the second interval, wherein the parameters of the model to be trained include the weights corresponding to the intervals obtained by binning;
Determining the fused prediction value from the first weight and the second weight.
6. The method according to claim 5, wherein determining the fused prediction value from the first weight and the second weight comprises:
Summing the first weight and the second weight, and using the summed result as the fused prediction value.
7. An apparatus for fusing model prediction values, comprising:
A binning unit, which, based on a given number of samples, bins the prediction values of an online prediction model and the prediction values of an offline prediction model separately according to a set binning method, wherein each of the samples includes a first prediction value, a second prediction value, and a sample label, the first prediction value being predicted by the online prediction model and the second prediction value being predicted by the offline prediction model;
A feature conversion unit, which, according to the binning result, converts the first prediction value of each sample into a first interval feature corresponding to the interval in which that first prediction value falls, and converts the second prediction value of each sample into a second interval feature corresponding to the interval in which that second prediction value falls;
A training unit, which forms converted sample data from the first interval feature, the second interval feature, and the label of each sample, and trains a model with the converted sample data, the trained model being used to fuse the prediction value of the online prediction model and the prediction value of the offline prediction model to obtain a final prediction value.
8. The apparatus according to claim 7, wherein the set binning method comprises: an entropy-based binning method, a Gini-based binning method, or an equal-frequency binning method.
9. The apparatus according to claim 7, wherein the parameters of the model to be trained include weights corresponding to the intervals obtained by binning, the weights being used to fuse the prediction value of the online prediction model and the prediction value of the offline prediction model to obtain a final prediction value.
10. An apparatus for fusing model prediction values, comprising:
An online score prediction unit, which obtains business data generated by a target user in a first time period before a trigger moment, determines input features from the business data and inputs them into an online prediction model, and outputs a first prediction value, the online prediction model being used to predict the label of a user;
An offline score obtaining unit, which obtains a second prediction value corresponding to the target user that was obtained with an offline prediction model, wherein the input features of the offline prediction model are determined from business features generated by the target user in a past second time period, and the offline prediction model is used to predict the label of a user;
An interval determination unit, which, according to the result of binning performed in advance on the prediction values of the online prediction model and the prediction values of the offline prediction model, determines the first interval in which the first prediction value falls and the second interval in which the second prediction value falls;
A score fusion unit, which, according to the first interval and the second interval, fuses the first prediction value and the second prediction value using a model obtained by training in advance to obtain a final fused prediction value, the fused prediction value being used to determine the label of the target user.
11. The apparatus according to claim 10, wherein the score fusion unit comprises:
A weight determination subunit, which, based on the predetermined weights corresponding to the intervals obtained by binning, obtains a first weight corresponding to the first interval and a second weight corresponding to the second interval;
A fusion subunit, which determines the fused prediction value from the first weight and the second weight, the fused prediction value being used to determine the label of the target user.
12. The apparatus according to claim 11, wherein the fusion subunit is configured to:
Sum the first weight and the second weight, and use the summed result as the fused prediction value.
13. A computer device, comprising:
A processor;
A memory for storing processor-executable instructions;
The processor being configured to:
Based on a given number of samples, bin the prediction values of an online prediction model and the prediction values of an offline prediction model separately according to a set binning method, wherein each of the samples includes a first prediction value, a second prediction value, and a sample label, the first prediction value being predicted by the online prediction model and the second prediction value being predicted by the offline prediction model;
According to the binning result, convert the first prediction value of each sample into a first interval feature corresponding to the interval in which that first prediction value falls, and convert the second prediction value of each sample into a second interval feature corresponding to the interval in which that second prediction value falls;
Form converted sample data from the first interval feature, the second interval feature, and the label of each sample, and train a model with the converted sample data, the trained model being used to fuse the prediction value of the online prediction model and the prediction value of the offline prediction model to obtain a final prediction value.
14. A computer device, comprising:
A processor;
A memory for storing processor-executable instructions;
The processor being configured to:
Obtain business data generated by a target user in a first time period, determine input features from the business data and input them into an online prediction model, and output a first prediction value;
Obtain a second prediction value corresponding to the target user that was obtained with an offline prediction model, wherein the input features of the offline prediction model are determined from business features generated by the target user in a second time period;
Obtain the result of binning the prediction values of the online prediction model and the prediction values of the offline prediction model, and determine respectively the first interval in which the first prediction value falls and the second interval in which the second prediction value falls;
According to the first interval and the second interval, fuse the first prediction value and the second prediction value using a model obtained by training in advance to obtain a final fused prediction value, the fused prediction value being used to determine the label of the target user.
CN201711353984.1A 2017-12-15 2017-12-15 The method, apparatus and equipment merged to model predication value Pending CN108052979A (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN201711353984.1A CN108052979A (en) 2017-12-15 2017-12-15 The method, apparatus and equipment merged to model predication value
TW107135970A TWI718422B (en) 2017-12-15 2018-10-12 Method, device and equipment for fusing model prediction values
PCT/CN2018/111824 WO2019114423A1 (en) 2017-12-15 2018-10-25 Method and apparatus for merging model prediction values, and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711353984.1A CN108052979A (en) 2017-12-15 2017-12-15 The method, apparatus and equipment merged to model predication value

Publications (1)

Publication Number Publication Date
CN108052979A true CN108052979A (en) 2018-05-18

Family

ID=62132684

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711353984.1A Pending CN108052979A (en) 2017-12-15 2017-12-15 The method, apparatus and equipment merged to model predication value

Country Status (3)

Country Link
CN (1) CN108052979A (en)
TW (1) TWI718422B (en)
WO (1) WO2019114423A1 (en)

Also Published As

Publication number Publication date
WO2019114423A1 (en) 2019-06-20
TWI718422B (en) 2021-02-11
TW201928709A (en) 2019-07-16

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 1254023

Country of ref document: HK

TA01 Transfer of patent application right

Effective date of registration: 20200925

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Applicant after: Innovative advanced technology Co.,Ltd.

Address before: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Applicant before: Advanced innovation technology Co.,Ltd.

Effective date of registration: 20200925

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Applicant after: Advanced innovation technology Co.,Ltd.

Address before: A four-storey 847 mailbox in Grand Cayman Capital Building, British Cayman Islands

Applicant before: Alibaba Group Holding Ltd.

RJ01 Rejection of invention patent application after publication

Application publication date: 20180518
