Method and apparatus for calculating a user score
Technical field
The present invention relates to the field of computer technology, and more particularly to a method and apparatus for calculating a user score.
Background art
Currently, in user behavior research, a common approach is to take the payment amounts of a large number of normal users with their corresponding normal user scores, together with the payment amounts of a small number of abnormal users with their corresponding abnormal user scores, as a data set. A traditional regression method is applied to the data set to obtain an estimated value, and the estimated value is multiplied by the target user's payment amount to obtain the target user score.
In the course of implementing the present invention, the inventors found at least the following problems in the prior art:
First, because the data set contains a large number of normal-user payment amounts and only a small number of abnormal-user payment amounts, the data set is imbalanced. An estimated value calculated from such a data set is biased toward normal users and is therefore inaccurate, so the target user score is shifted and is also inaccurate.
Second, traditional regression analysis has the following problems. Firstly, the estimated value it produces is affected by extreme values, so the estimated value, and with it the target user score, is inaccurate. Secondly, traditional regression analysis requires the residuals to follow a normal distribution, but real data generally do not; because the actual distribution differs from the assumed one, the reliability of the calculation is hard to guarantee, its accuracy is low, and the target user score is again inaccurate. Thirdly, traditional regression analysis is mean regression: it characterizes only the central tendency of the conditional distribution and cannot fully describe the conditional distribution of the dependent variable.
Summary of the invention
In view of this, embodiments of the present invention provide a method and apparatus for calculating a user score, which can improve the accuracy of the estimated value and thereby the accuracy of the target user score calculation.
To achieve the above object, according to one aspect of the embodiments of the present invention, a method for calculating a user score is provided.
A method for calculating a user score according to an embodiment of the present invention comprises:
Obtaining a first data set and a second data set; wherein the first data set includes a plurality of first data, each first data including first sample user data and a corresponding first sample user score, and the second data set includes a plurality of second data, each second data including second sample user data and a corresponding second sample user score;
Sampling the first data set and the second data set using the bootstrap method to obtain sample data;
Performing a calculation on the sample data using a quantile regression algorithm to obtain an estimated value;
Calculating a target user score according to the estimated value and target user data.
In one embodiment, sampling the first data set and the second data set using the bootstrap method to obtain sample data comprises:
Extracting a first quantity of the first data from the first data set using the bootstrap method, and repeating the extraction multiple times;
Extracting a second quantity of the second data from the second data set using the bootstrap method, and repeating the extraction multiple times;
Taking the set of the first quantity of first data and the second quantity of second data obtained in each extraction as the sample data of that extraction.
In one embodiment, the ratio of the number of first data in the first data set to the number of second data in the second data set is greater than 10, and the ratio of the first quantity to the second quantity lies in the range [0.1, 10].
In one embodiment, performing a calculation on the sample data using a quantile regression algorithm to obtain an estimated value comprises:
Performing a calculation on the sample data of each extraction using the quantile regression algorithm to obtain a reference value for each extraction;
Adding the reference values of all extractions to obtain a total reference value;
Dividing the total reference value by the number of extractions to obtain the estimated value.
In one embodiment, calculating the target user score according to the estimated value and the target user data comprises:
Multiplying the estimated value by the target user data to obtain the target user score;
Wherein the target user data includes any one of: the user's IP, an attribute of the user terminal, the category of the product purchased by the user, the user's purchase time, and the user's payment amount.
To achieve the above object, according to another aspect of the embodiments of the present invention, an apparatus for calculating a user score is provided.
An apparatus for calculating a user score according to an embodiment of the present invention comprises:
An acquiring unit, configured to obtain a first data set and a second data set; wherein the first data set includes a plurality of first data, each first data including first sample user data and a corresponding first sample user score, and the second data set includes a plurality of second data, each second data including second sample user data and a corresponding second sample user score;
A sampling unit, configured to sample the first data set and the second data set using the bootstrap method to obtain sample data;
An estimated value computing unit, configured to perform a calculation on the sample data using a quantile regression algorithm to obtain an estimated value;
A target user score computing unit, configured to calculate a target user score according to the estimated value and target user data.
In one embodiment, the sampling unit includes:
A first subunit, configured to extract a first quantity of the first data from the first data set using the bootstrap method, and to repeat the extraction multiple times;
A second subunit, configured to extract a second quantity of the second data from the second data set using the bootstrap method, and to repeat the extraction multiple times;
An aggregation unit, configured to take the set of the first quantity of first data and the second quantity of second data obtained in each extraction as the sample data of that extraction.
In one embodiment, the ratio of the number of first data in the first data set to the number of second data in the second data set is greater than 10, and the ratio of the first quantity to the second quantity lies in the range [0.1, 10].
In one embodiment, the estimated value computing unit includes:
A reference value computing unit, configured to perform a calculation on the sample data of each extraction using the quantile regression algorithm to obtain a reference value for each extraction;
A summation unit, configured to add the reference values of all extractions to obtain a total reference value;
An estimated value computation subunit, configured to divide the total reference value by the number of extractions to obtain the estimated value.
In one embodiment, the target user score computing unit is specifically configured to:
Multiply the estimated value by the target user data to obtain the target user score;
Wherein the target user data includes any one of: the user's IP, an attribute of the user terminal, the category of the product purchased by the user, the user's purchase time, and the user's payment amount.
To achieve the above object, according to a further aspect of the embodiments of the present invention, an electronic device is provided.
An electronic device according to an embodiment of the present invention comprises: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method for calculating a user score provided by the embodiments of the present invention.
To achieve the above object, according to yet another aspect of the embodiments of the present invention, a computer-readable medium is provided.
A computer-readable medium according to an embodiment of the present invention stores a computer program which, when executed by a processor, implements the method for calculating a user score provided by the embodiments of the present invention.
An embodiment of the above invention has the following advantages or beneficial effects. The first data set and the second data set are obtained and sampled using the bootstrap method to obtain sample data. Because the bootstrap method is random, equal-probability resampling with replacement, the sample data is balanced, which improves the accuracy of the estimated value and of the target user score calculation. The sample data is then processed with a quantile regression algorithm; compared with traditional regression analysis, the estimated value is more robust to extreme values and more accurate, and the conditional distribution of the dependent variable is described more fully, which further improves the accuracy of the target user score calculation.
Further effects of the above non-conventional optional implementations are described below in conjunction with specific embodiments.
Brief description of the drawings
The drawings are provided for a better understanding of the present invention and do not unduly limit it. In the drawings:
Fig. 1 is a schematic diagram of the main flow of a method for calculating a user score according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of the main flow of a method for calculating a user score according to another embodiment of the present invention;
Fig. 3 is a schematic diagram of the main units of an apparatus for calculating a user score according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of the main units of an apparatus for calculating a user score according to another embodiment of the present invention;
Fig. 5 is a diagram of an exemplary system architecture to which embodiments of the present invention can be applied;
Fig. 6 is a schematic structural diagram of a computer system suitable for implementing a terminal device or server of an embodiment of the present invention.
Detailed description of the embodiments
Exemplary embodiments of the present invention are described below with reference to the drawings. Various details of the embodiments are included to aid understanding and should be regarded as merely exemplary. Those of ordinary skill in the art will therefore recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the present invention. Likewise, for clarity and conciseness, descriptions of well-known functions and structures are omitted from the following description.
It should be noted that, where no conflict arises, the embodiments of the present invention and the features in the embodiments may be combined with one another.
Currently, in user behavior research, a common approach is to take the payment amounts of a large number of normal users with their corresponding normal user scores, together with the payment amounts of a small number of abnormal users with their corresponding abnormal user scores, as a data set. A traditional regression method (mean regression) is applied to the data set to obtain an estimated value, and the estimated value is multiplied by the target user's payment amount to obtain the target user score. The target user is then served according to the target user score, for example by recommending products.
The prior art has the following problems:
First, because the data set contains a large number of normal-user payment amounts and only a small number of abnormal-user payment amounts, the data set is imbalanced. An estimated value calculated from such a data set is biased toward normal users and is therefore inaccurate, so the target user score is shifted and is also inaccurate.
Second, traditional regression analysis has the following problems:
Firstly, the estimated value obtained by traditional regression analysis is affected by extreme values (e.g., heteroscedasticity, outliers and high-leverage points), so the estimated value, and with it the target user score, is inaccurate.
Secondly, traditional regression analysis requires the residuals to follow a normal distribution, but real data generally do not. Because the actual distribution differs from the assumed one, the reliability of the calculation is hard to guarantee, its accuracy is low, and the target user score is also inaccurate.
Thirdly, traditional regression analysis is mean regression (it studies the conditional expectation of the dependent variable, that is, the influence of the independent variables on the conditional mean of the dependent variable). It characterizes only the central tendency of the conditional distribution and cannot fully describe the conditional distribution of the dependent variable.
To solve the problems of the prior art, an embodiment of the present invention provides a method for calculating a user score. The method may be executed by a server and, as shown in Fig. 1, comprises:
Step S101: obtain a first data set and a second data set; wherein the first data set includes a plurality of first data, each first data including first sample user data and a corresponding first sample user score, and the second data set includes a plurality of second data, each second data including second sample user data and a corresponding second sample user score.
Step S102: sample the first data set and the second data set using the bootstrap method to obtain sample data.
In this step, if the ratio of the number of first data in the first data set to the number of second data in the second data set is greater than 10, then in a specific implementation a first quantity of first data may be extracted from the first data set, a second quantity of second data may be extracted from the second data set using the bootstrap method, and the set of the first quantity of first data and the second quantity of second data is taken as the sample data. The reason is that the second data set is small: without the bootstrap method it is impossible to obtain a number of second data that is balanced with the number of first data, and without that balance the sample data is imbalanced, the estimated value is inaccurate, and the target user score is calculated poorly. Alternatively, in a specific implementation, the first quantity of first data may also be extracted from the first data set using the bootstrap method, the second quantity of second data extracted from the second data set using the bootstrap method, and the set of the first quantity of first data and the second quantity of second data taken as the sample data.
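The sampling in step S102 can be sketched as follows. This is a minimal illustration rather than the implementation of the embodiment: the function names `bootstrap_sample` and `build_sample`, the record layout (payment amount, score), the toy data and the fixed seed are all assumptions made for the example.

```python
import random

def bootstrap_sample(records, size, rng):
    """Draw `size` records uniformly at random, with replacement."""
    return [rng.choice(records) for _ in range(size)]

def build_sample(first_set, second_set, n_first, n_second, rng):
    """Combine bootstrap draws from both data sets into one sample."""
    return (bootstrap_sample(first_set, n_first, rng)
            + bootstrap_sample(second_set, n_second, rng))

rng = random.Random(0)  # fixed seed only so the example is repeatable

# Hypothetical records: (payment amount, user score)
first_set = [(100.0 + i, 10.0 + i) for i in range(50)]  # majority (normal) users
second_set = [(500.0 + i, 5.0) for i in range(3)]       # minority (abnormal) users

sample = build_sample(first_set, second_set, n_first=10, n_second=10, rng=rng)

# Because draws are made with replacement, 10 minority records can be
# obtained even though only 3 distinct ones exist.
n_minority = sum(1 for rec in sample if rec in second_set)
```

Drawing with replacement is what allows the small second data set to contribute as many records as the large first data set.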
Step S103: perform a calculation on the sample data using a quantile regression algorithm to obtain an estimated value.
In this step, in a specific implementation, an objective function is obtained according to the quantile regression algorithm and a quantile value is set. The quantile value and the first data and second data in the sample data are substituted into the objective function to obtain the estimated value at that quantile. Different quantile values may be set as needed to obtain estimated values at different quantiles. A target user score calculated from estimated values at different quantiles is more accurate and more comprehensive.
Step S104: calculate the target user score according to the estimated value and the target user data.
In this step, in a specific implementation, the estimated value is multiplied by the target user data to obtain the target user score. If estimated values are available at several quantiles, the estimated value at each quantile is multiplied by the target user data to obtain the target user score at that quantile. Because target user scores are obtained at different quantiles, the conditional distribution of the dependent variable can be described fully.
In this embodiment, the first data set and the second data set are obtained and sampled using the bootstrap method to obtain sample data. Because the bootstrap method is random, equal-probability resampling with replacement, the sample data is balanced, which improves the accuracy of the estimated value and of the target user score calculation. The sample data is then processed with the quantile regression algorithm; compared with traditional regression analysis, the estimated value is more robust to extreme values and more accurate, and the conditional distribution of the dependent variable is described more fully, which further improves the accuracy of the target user score calculation.
To solve the problems of the prior art, another embodiment of the present invention provides a method for calculating a user score. The method may be executed by a server and, as shown in Fig. 2, comprises:
Step S201: obtain a first data set and a second data set; wherein the first data set includes a plurality of first data, each first data including first sample user data and a corresponding first sample user score, and the second data set includes a plurality of second data, each second data including second sample user data and a corresponding second sample user score.
Step S202: extract a first quantity of the first data from the first data set using the bootstrap method, and repeat the extraction multiple times.
In this step, the bootstrap method (also called bootstrapping or bootstrap sampling, i.e., uniform sampling with replacement from a given training set) is used to resample the first data set randomly, with equal probability and with replacement, to obtain the first quantity of first data.
It should be noted that, because the first quantity of first data is extracted multiple times and sample data is obtained for each extraction, the method avoids the situation in which a single poor sample makes the estimated value, and hence the target user score, inaccurate. This ensures the stability of the target user score calculation.
Step S203: extract a second quantity of the second data from the second data set using the bootstrap method, and repeat the extraction multiple times.
In this step, the bootstrap method is used to resample the second data set randomly, with equal probability and with replacement, to obtain the second quantity of second data.
It should be noted that, because the second quantity of second data is extracted multiple times and sample data is obtained for each extraction, the method avoids the situation in which a single poor sample makes the estimated value, and hence the target user score, inaccurate. This ensures the stability of the target user score calculation.
Step S204: take the set of the first quantity of first data and the second quantity of second data obtained in each extraction as the sample data of that extraction.
In this step, in a specific implementation, the ratio of the number of first data in the first data set to the number of second data in the second data set is greater than 10, and the ratio of the first quantity to the second quantity lies in the range [0.1, 10].
It should be understood that, because the ratio of the number of first data in the first data set to the number of second data in the second data set is greater than 10, the first data set is much larger than the second data set. For example, data generated by normal users serves as the first data in the first data set, and data generated by abnormal users serves as the second data in the second data set (abnormal users are users who seek illegitimate benefits by improper means, such as users who place fake orders or users who make purchases with stolen payment credentials). It is well known that the number of normal users is very large and the number of abnormal users is very small by comparison, so the number of first data in the first data set is much larger than the number of second data in the second data set.
The prior art directly takes the set of the first data in the first data set and the second data in the second data set as the data set and applies a traditional regression method to it to obtain the estimated value. Because the first data in that data set is numerous and the second data is scarce, the data set is imbalanced. The estimated value is therefore biased toward normal users, is inaccurate and of limited validity, and in turn makes the target user score calculation inaccurate.
The embodiment of the present invention instead extracts the first quantity of first data from the first data set using the bootstrap method, extracts the second quantity of second data from the second data set using the bootstrap method, and takes the set of the two as the sample data, with the ratio of the first quantity to the second quantity lying in the range [0.1, 10]. The quantities of first data and second data in the sample data are therefore balanced, that is, the sample data is balanced. As a result, the estimated value is biased toward neither normal users nor abnormal users, its accuracy and validity are high, and the accuracy of the target user score calculation is improved.
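The two ratio conditions discussed above can be checked before sampling. The sketch below is illustrative only; the function name `ratios_ok` and the example counts are assumptions, not values prescribed by the embodiment.

```python
def ratios_ok(first_total, second_total, n_first, n_second):
    """Check both ratio conditions described in the text.

    first_total / second_total > 10   -> the source data sets are imbalanced
    0.1 <= n_first / n_second <= 10   -> the drawn sample is balanced
    """
    source_imbalanced = first_total / second_total > 10
    sample_balanced = 0.1 <= n_first / n_second <= 10
    return source_imbalanced and sample_balanced

# Hypothetical counts: 5000 majority records and 50 minority records.
balanced_draw = ratios_ok(5000, 50, n_first=500, n_second=500)  # sample ratio 1
skewed_draw = ratios_ok(5000, 50, n_first=5000, n_second=50)    # sample ratio 100
```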
This step is illustrated with a specific example. Suppose the first data set is the data generated by 10000 normal users (each item includes normal user data and a normal user score) and the second data set is the data generated by 100 abnormal users (each item includes abnormal user data and an abnormal user score). Using the bootstrap method, the data of 1000 normal users is extracted from the first data set, and the extraction is performed 5 times; using the bootstrap method, the data of 1000 abnormal users is extracted from the second data set, and the extraction is performed 5 times (note that, because the bootstrap method samples with replacement, the data of the 1000 abnormal users contains repeats). The set of the data of the 1000 normal users and the data of the 1000 abnormal users obtained in each extraction is taken as the sample data of that extraction, so 5 sample data are finally obtained.
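The example above (10000 normal records, 100 abnormal records, 1000 of each per extraction, 5 extractions) can be reproduced with a short sketch. The placeholder records, the constant names and the fixed seed are assumptions for illustration; `random.Random.choices` draws with replacement and with equal probability by default.

```python
import random

rng = random.Random(42)  # fixed seed only for repeatability

normal = [("normal", i) for i in range(10000)]    # placeholder records
abnormal = [("abnormal", i) for i in range(100)]  # placeholder records

ROUNDS, PER_CLASS = 5, 1000
samples = []
for _ in range(ROUNDS):
    # Each extraction draws 1000 records from each data set, with replacement.
    draw = rng.choices(normal, k=PER_CLASS) + rng.choices(abnormal, k=PER_CLASS)
    samples.append(draw)

# Only 100 distinct abnormal records exist, so every extraction of 1000
# abnormal records necessarily contains repeats.
distinct_abnormal = [
    len({rec for rec in s if rec[0] == "abnormal"}) for s in samples
]
```

Each of the 5 samples holds 2000 records, half from each class, which is the balance the embodiment relies on.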
Step S205: perform a calculation on the sample data of each extraction using the quantile regression algorithm to obtain a reference value for each extraction.
In this step, the sample data of each extraction is processed with the quantile regression algorithm, so that the influence of the independent variables on the range of the dependent variable can be described robustly at any quantile and the conditional distribution of the dependent variable can be described fully, which makes the method applicable to a wider range of scenarios.
In addition, in standard notation the quantile regression model can be written as:
$$Q_\tau(y \mid x) = \sum_{i=1}^{n} \beta_i(\tau)\, x_i$$
Wherein $Q_\tau(y \mid x)$ represents the dependent variable at quantile $\tau$, $\tau$ represents the quantile, $x$ is the independent variable, and $\beta_i$ is the estimated value ($i = 1, \dots, n$, where $n$ is the number of independent variables).
According to the quantile regression algorithm, the weighted sum of the absolute residuals is used as the objective function to be minimized. The objective function can thus be written as:
$$\hat{\beta}_j(\tau) = \arg\min_{\beta} \sum_{i} \rho_\tau \left| y^*_{ij} - x^*_{ij}\,\beta \right|$$
Wherein $\tau$ represents the quantile, which can be set manually ($0 < \tau < 1$);
$\rho_\tau$ represents the weight of the residual at the given quantile and is a known quantity (in the usual formulation, $\tau$ for a non-negative residual and $1 - \tau$ for a negative residual); $y^*_{ij} - x^*_{ij}\,\beta$ represents the residual at the given quantile;
$y^*$ is the dependent variable, $x^*$ is the independent variable, $i$ indexes the data within the sample data, and $j$ indexes the extraction;
$\hat{\beta}_j(\tau)$ represents the estimated value at the given quantile.
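The objective function can be evaluated directly, which helps show how the quantile changes the penalty on positive and negative residuals. The sketch below assumes the usual weighting (weight tau for a non-negative residual, 1 - tau for a negative one) and, for simplicity, a single independent variable with no intercept; the function names and the toy data are hypothetical.

```python
def check_loss(residual, tau):
    """Weighted absolute residual: weight tau if residual >= 0, else 1 - tau."""
    weight = tau if residual >= 0 else 1.0 - tau
    return weight * abs(residual)

def objective(beta, xs, ys, tau):
    """Sum of weighted absolute residuals y - beta * x over one sample."""
    return sum(check_loss(y - beta * x, tau) for x, y in zip(xs, ys))

# Toy data (assumed): x is the independent variable, y the dependent variable.
xs = [1.0, 2.0, 4.0]
ys = [1.0, 2.0, 8.0]

loss_beta_1 = objective(1.0, xs, ys, tau=0.5)  # residuals 0, 0, 4
loss_beta_2 = objective(2.0, xs, ys, tau=0.5)  # residuals -1, -2, 0
```

At tau = 0.5 both signs are weighted equally, so the objective reduces to half the sum of absolute residuals; at a higher tau, under-prediction (positive residuals) is penalized more heavily than over-prediction.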
Continuing the example given for step S204, this step is illustrated as follows:
The quantile is set to 0.5, and 0.5 together with each of the 5 sample data is substituted into the objective function, giving 5 reference values for quantile 0.5. Here each sample data consists of the first quantity of first data (in which the first sample user data is the independent variable and the corresponding first sample user score is the dependent variable) and the second quantity of second data (in which the second sample user data is the independent variable and the corresponding second sample user score is the dependent variable);
The quantile is set to 0.75, and 0.75 together with each of the 5 sample data is substituted into the objective function, giving 5 reference values for quantile 0.75.
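Obtaining one reference value per extraction and per quantile can be sketched as below. To stay self-contained, the sketch assumes a single, strictly positive independent variable (for example the payment amount) and no intercept; under that assumption the minimizer of the weighted absolute residuals is a weighted quantile of the ratios y / x with weights x. The function name `fit_quantile_slope` and the toy data are hypothetical, and a general multi-variable fit would normally rely on a linear-programming or statistics library instead.

```python
def fit_quantile_slope(xs, ys, tau):
    """Minimise sum(rho_tau(y - beta * x)) for one positive regressor.

    Since rho_tau(y - beta*x) = x * rho_tau(y/x - beta) when x > 0, the
    solution is the weighted tau-quantile of the ratios y/x, weighted by x.
    """
    pairs = sorted((y / x, x) for x, y in zip(xs, ys))
    total = sum(weight for _, weight in pairs)
    cumulative = 0.0
    for ratio, weight in pairs:
        cumulative += weight
        if cumulative >= tau * total:
            return ratio
    return pairs[-1][0]

# Two hypothetical extractions: (independent variable, dependent variable).
rounds = [
    ([100.0, 200.0, 300.0, 400.0], [10.0, 40.0, 90.0, 40.0]),
    ([100.0, 200.0, 300.0, 400.0], [20.0, 40.0, 60.0, 80.0]),
]

# One reference value per extraction for each quantile of interest.
reference_values = {
    tau: [fit_quantile_slope(xs, ys, tau) for xs, ys in rounds]
    for tau in (0.5, 0.75)
}
```

In the first extraction the ratios are 0.1, 0.2, 0.3 and 0.1, so the reference value rises from 0.1 at quantile 0.5 to 0.3 at quantile 0.75; in the second every ratio is 0.2, so both quantiles give 0.2.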
It should be noted that, compared with traditional regression analysis, processing the sample data with the quantile regression algorithm yields an estimated value that is more robust to extreme values and more accurate. Moreover, the quantile regression algorithm allows different quantiles to be set, giving estimated values at different quantiles from which target user scores at different quantiles are calculated. This describes the conditional distribution of the dependent variable more fully, so the target user score is calculated more accurately.
Step S206: add the reference values of all extractions to obtain a total reference value.
Continuing the example given for step S205, this step is illustrated as follows:
The 5 reference values for quantile 0.5 are added to obtain the total reference value for quantile 0.5 (assume the calculated total reference value for quantile 0.5 is 1);
The 5 reference values for quantile 0.75 are added to obtain the total reference value for quantile 0.75 (assume the calculated total reference value for quantile 0.75 is 2).
Step S207: divide the total reference value by the number of extractions to obtain the estimated value.
In this step, in a specific implementation, the number of extractions refers to the number of times the first quantity of first data is extracted from the first data set. Because the first quantity of first data and the second quantity of second data are extracted the same number of times, the number of extractions also refers to the number of times the second quantity of second data is extracted from the second data set.
Continuing the example given for step S206, this step is illustrated as follows:
The total reference value for quantile 0.5 (1) is divided by 5 to obtain the estimated value for quantile 0.5 (1/5 = 0.2);
The total reference value for quantile 0.75 (2) is divided by 5 to obtain the estimated value for quantile 0.75 (2/5 = 0.4).
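Steps S206 and S207 together amount to averaging the reference values over the extractions. A minimal sketch follows; the function name and the individual reference values are assumptions, chosen only so that the totals match the totals assumed in the text (1 and 2).

```python
def estimate_from_references(reference_values):
    """Estimated value = total reference value / number of extractions."""
    total = sum(reference_values)          # step S206
    return total / len(reference_values)   # step S207

# Hypothetical reference values for 5 extractions.
refs_q50 = [0.25, 0.25, 0.25, 0.125, 0.125]  # total 1
refs_q75 = [0.5, 0.5, 0.5, 0.25, 0.25]       # total 2

estimate_q50 = estimate_from_references(refs_q50)
estimate_q75 = estimate_from_references(refs_q75)
```

Averaging over several extractions damps the effect of any single unrepresentative sample, which is the stability argument made for repeating the extraction.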
In addition, it should be understood that, after the estimated value is obtained, it may also be checked in the following respects: basic assumptions, significance, goodness of fit, outliers, or practical meaning. Once the estimated value passes these checks, it is multiplied by the target user data to obtain the target user score.
Step S208: multiply the estimated value by the target user data to obtain the target user score; wherein the target user data includes any one of: the user's IP, an attribute of the user terminal, the category of the product purchased by the user, the user's purchase time, and the user's payment amount.
In this step, it should be noted that the attribute of the user terminal may be whether the user terminal is a commonly used terminal, and the user's purchase time may be classified as a daytime purchase (8:00 to 18:00) or a nighttime purchase (0:00 to 8:00 or 18:00 to 24:00). In addition, analysis of historical big data shows that a change in any one of the user's IP, the attribute of the user terminal, the category of the product purchased by the user, the user's purchase time, and the user's payment amount affects the target user score. Using such data as the target user data therefore allows the target user score to be calculated more accurately and the user to be analyzed more comprehensively.
Continuing the example given for step S207, this step is illustrated as follows:
Assuming the user's payment amount is 1000, the target user score is 200 (1000 × 0.2 = 200) at quantile 0.5 or 400 (1000 × 0.4 = 400) at quantile 0.75.
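The final multiplication can be sketched as below, using the estimated values 0.2 and 0.4 from the running example and a payment amount of 1000, for which the products are 200 and 400. The function name `target_scores` and the dictionary layout keyed by quantile are assumptions for illustration.

```python
def target_scores(estimates_by_quantile, target_value):
    """Multiply the estimated value at each quantile by the target user data."""
    return {
        tau: estimate * target_value
        for tau, estimate in estimates_by_quantile.items()
    }

# Estimated values from the running example and a payment amount of 1000.
scores = target_scores({0.5: 0.2, 0.75: 0.4}, 1000.0)
```

Keeping one score per quantile, rather than a single number, preserves the view of the conditional distribution that motivates the use of quantile regression.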
The method for calculating a user score has been described above with reference to Figs. 1 and 2; the apparatus for calculating a user score is described below with reference to Figs. 3 and 4.
To solve the problems of the prior art, an embodiment of the present invention provides an apparatus for calculating a user score. The apparatus may be implemented on a server and, as shown in Fig. 3, includes:
An acquiring unit 301, configured to obtain a first data set and a second data set; wherein the first data set includes a plurality of first data, each first data including first sample user data and a corresponding first sample user score, and the second data set includes a plurality of second data, each second data including second sample user data and a corresponding second sample user score.
A sampling unit 302, configured to sample the first data set and the second data set using the bootstrap method to obtain sample data.
An estimated value computing unit 303, configured to perform a calculation on the sample data using a quantile regression algorithm to obtain an estimated value.
A target user score computing unit 304, configured to calculate a target user score according to the estimated value and target user data.
Of the existing technology in order to solve the problems, such as, another embodiment of the present invention provides a kind of dresses for calculating user's score value
It sets, which can be executed by server, as shown in figure 4, the device includes:
Acquiring unit 401, for obtaining the first data set and the second data set;Wherein, first data set includes more
A first data, each first data include first sample user data and corresponding first sample user score value, and described second
Data set includes multiple second data, and each second data include the second sample of users data and corresponding second sample of users point
Value.
Sampling unit 402 is obtained for being sampled using bootstrap to first data set and second data set
To sample data.
When it is implemented, sampling unit 402 includes:
First sub-unit 4021, for extracting the institute of the first quantity from first data set using bootstrap
The first data are stated, and are extracted multiple.
Second sub-unit 4022, for extracting the second quantity from second data set using the bootstrap
Second data, and extract multiple.
Aggregation units 4023, for by first data and second quantity of first quantity extracted every time
Second data set as the sample data extracted every time.
In addition, in first data set in the first data bulk and second data set the second data bulk ratio
Greater than 10, the ratio range of first quantity and second quantity is [0.1,10].
Estimated value computing unit 403 is estimated for being calculated using quantile estimate algorithm the sample data
Evaluation.
In a specific implementation, the estimated value computing unit 403 includes:
A reference value computing unit 4031, configured to process the sample data of each extraction using the quantile regression algorithm to obtain a reference value for that extraction.
A summation unit 4032, configured to add up the reference values of all extractions to obtain a total reference value.
An estimated value computing subunit 4033, configured to divide the total reference value by the number of extractions to obtain the estimated value.
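The three subunits above can be sketched under a simplifying assumption that the source does not state: each record is a pair (x, y) with a single strictly positive numeric feature x, and the model is y ≈ β·x with no intercept. In that special case a quantile regression coefficient can be computed exactly as a weighted quantile of the ratios y/x. The function names and the quantile level `tau` are hypothetical.

```python
def quantile_coefficient(xs, ys, tau=0.5):
    """Fit y ~ beta * x (no intercept, all x > 0) at quantile level tau."""
    if any(x <= 0 for x in xs):
        raise ValueError("this sketch assumes strictly positive x values")
    # For x > 0 the pinball loss sum(rho_tau(y - beta * x)) equals
    # sum(x * rho_tau(y / x - beta)), so a minimizer is the x-weighted
    # tau-quantile of the ratios y / x.
    pairs = sorted((y / x, x) for x, y in zip(xs, ys))
    threshold = tau * sum(weight for _, weight in pairs)
    cumulative = 0.0
    for ratio, weight in pairs:
        cumulative += weight
        if cumulative >= threshold:
            return ratio
    return pairs[-1][0]

def averaged_estimate(samples, tau=0.5):
    """Sum the per-extraction reference values and divide by the extraction count."""
    total_reference = 0.0
    for sample in samples:
        xs = [x for x, _ in sample]
        ys = [y for _, y in sample]
        total_reference += quantile_coefficient(xs, ys, tau)
    return total_reference / len(samples)

# Two small synthetic extractions, used only to exercise the functions.
samples = [
    [(1.0, 1.0), (1.0, 2.0), (1.0, 3.0)],
    [(2.0, 4.0), (2.0, 8.0), (2.0, 12.0)],
]
estimate = averaged_estimate(samples)
```

With several features or an intercept, the closed form no longer applies and a general quantile regression solver would be needed; the averaging step stays the same.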
A target user score value computing unit 404, configured to calculate a target user score value according to the estimated value and target user data.
In a specific implementation, the target user score value computing unit 404 is specifically configured to:
Multiply the estimated value by the target user data to obtain the target user score value, wherein the target user data includes any one of: the user IP, an attribute of the user terminal, the category of product purchased by the user, the user's purchase time, and the user's payment amount.
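The multiplication performed by unit 404 can be illustrated with a short sketch. It assumes the chosen target user attribute is already numeric, such as a payment amount; the source does not describe how non-numeric attributes such as the user IP or product category would be encoded, so that step is left out. The function name and the example values are hypothetical.

```python
def target_user_score(estimated_value, target_value):
    """Multiply the estimated value by one numeric target user attribute."""
    return estimated_value * float(target_value)

# Example: an estimated value of 0.25 applied to a payment amount of 80.
score = target_user_score(0.25, 80)
```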
Fig. 5 shows an exemplary system architecture 500 to which the method or the apparatus for calculating a user score value according to the embodiment of the present invention can be applied.
As shown in Fig. 5, the system architecture 500 may include terminal devices 501, 502, 503, a network 504, and a server 505. The network 504 is the medium that provides communication links between the terminal devices 501, 502, 503 and the server 505. The network 504 may include various connection types, such as wired or wireless communication links, or fiber-optic cables.
A user may use the terminal devices 501, 502, 503 to interact with the server 505 through the network 504 in order to receive or send messages and the like. Various communication client applications may be installed on the terminal devices 501, 502, 503, such as shopping applications, web browser applications, search applications, instant messaging tools, mailbox clients, and social platform software (merely examples).
The terminal devices 501, 502, 503 may be various electronic devices that have a display screen and support web browsing, including but not limited to smartphones, tablet computers, laptop computers, and desktop computers.
The server 505 may be a server that provides various services, for example a back-end management server (merely an example) that supports shopping websites browsed by users on the terminal devices 501, 502, 503. The back-end management server may analyze and otherwise process received data such as information query requests, and feed the processing results (for example, targeted push information or product information, merely examples) back to the terminal devices.
It should be noted that the method for calculating a user score value provided by the embodiment of the present invention is generally executed by the server 505; accordingly, the apparatus for calculating a user score value is generally disposed in the server 505.
It should be understood that the numbers of terminal devices, networks, and servers in Fig. 5 are merely illustrative. There may be any number of terminal devices, networks, and servers, as required by the implementation.
Referring now to Fig. 6, a schematic structural diagram is shown of a computer system 600 suitable for implementing the terminal device of the embodiment of the present invention. The terminal device shown in Fig. 6 is only an example and should not impose any limitation on the functions or scope of use of the embodiment of the present invention.
As shown in Fig. 6, the computer system 600 includes a central processing unit (CPU) 601, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage section 608 into a random access memory (RAM) 603. The RAM 603 also stores the various programs and data required for the operation of the system 600. The CPU 601, the ROM 602, and the RAM 603 are connected to one another by a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
The following components are connected to the I/O interface 605: an input section 606 including a keyboard, a mouse, and the like; an output section 607 including a cathode ray tube (CRT) or liquid crystal display (LCD) and a speaker; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card or a modem. The communication section 609 performs communication processing via a network such as the Internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611, such as a magnetic disk, an optical disc, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 610 as needed, so that a computer program read from it can be installed into the storage section 608 as needed.
In particular, according to the embodiments disclosed in the present invention, the process described above with reference to the flowchart may be implemented as a computer software program. For example, the embodiments disclosed in the present invention include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for executing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 609, and/or installed from the removable medium 611. When the computer program is executed by the central processing unit (CPU) 601, the above-described functions defined in the system of the present invention are performed.
It should be noted that the computer-readable medium described in the present invention may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present invention, a computer-readable storage medium may be any tangible medium that contains or stores a program, where the program may be used by or in connection with an instruction execution system, apparatus, or device. Also in the present invention, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, and that medium can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. Program code contained on a computer-readable medium may be transmitted over any suitable medium, including but not limited to wireless, wire, optical cable, RF, and the like, or any suitable combination of the above.
The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functions, and operations of possible implementations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each block in a flowchart or block diagram may represent a unit, a program segment, or a portion of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in a block diagram or flowchart, and any combination of blocks in a block diagram or flowchart, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units described in the embodiments of the present invention may be implemented in software or in hardware. The described units may also be provided in a processor, which may for example be described as: a processor including an acquiring unit, a sampling unit, an estimated value computing unit, and a target user score value computing unit. The names of these units do not, in certain cases, limit the units themselves; for example, the sampling unit may also be described as "a unit that samples the first data set and the second data set using the bootstrap method to obtain sample data".
In another aspect, the present invention also provides a computer-readable medium, which may be included in the device described in the above embodiments, or may exist separately without being assembled into the device. The computer-readable medium carries one or more programs which, when executed by the device, cause the device to: obtain a first data set and a second data set, wherein the first data set includes a plurality of first data, each first data includes first sample user data and a corresponding first sample user score value, the second data set includes a plurality of second data, and each second data includes second sample user data and a corresponding second sample user score value; sample the first data set and the second data set using the bootstrap method to obtain sample data; process the sample data using a quantile regression algorithm to obtain an estimated value; and calculate a target user score value according to the estimated value and target user data.
According to the technical solution of the embodiment of the present invention, the first data set and the second data set are obtained and sampled using the bootstrap method to obtain sample data. Because the bootstrap method is random, equal-probability resampling with replacement, the sample data is balanced, which improves the accuracy of the estimated value and hence the accuracy of the target user score value calculation. The sample data is processed using the quantile regression algorithm; compared with traditional regression analysis, the resulting estimated value is more robust to extreme values and therefore more accurate, and it describes the conditional distribution of the dependent variable more comprehensively, which further improves the accuracy of the target user score value calculation.
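The robustness claim can be illustrated in the simplest possible setting, which is an assumption made here for illustration only: with a constant-only model, traditional mean regression reduces to the sample mean, and quantile regression at the 0.5 quantile reduces to the sample median. The data below is synthetic, with one extreme value.

```python
from statistics import mean, median

# Four ordinary observations and one extreme value.
ys = [0.9, 1.0, 1.1, 1.0, 50.0]

mean_estimate = mean(ys)      # pulled far upward by the extreme value
median_estimate = median(ys)  # stays with the bulk of the data
```

The median stays at the level of the ordinary observations while the mean is dominated by the single extreme value, which is the behavior the paragraph above attributes to quantile regression.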
The specific embodiments described above do not limit the scope of protection of the present invention. Those skilled in the art should understand that various modifications, combinations, sub-combinations, and substitutions may occur depending on design requirements and other factors. Any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.