CN108182597A - A kind of clicking rate predictor method based on decision tree and logistic regression - Google Patents
A kind of clicking rate predictor method based on decision tree and logistic regression
- Publication number
- CN108182597A; CN201711439302.9A
- Authority
- CN
- China
- Prior art keywords
- decision tree
- clicking rate
- data
- logistic regression
- predictor method
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0241—Advertisements
- G06Q30/0242—Determining effectiveness of advertisements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
Landscapes
- Engineering & Computer Science (AREA)
- Business, Economics & Management (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Accounting & Taxation (AREA)
- Development Economics (AREA)
- Software Systems (AREA)
- Finance (AREA)
- Strategic Management (AREA)
- General Physics & Mathematics (AREA)
- Game Theory and Decision Science (AREA)
- Evolutionary Computation (AREA)
- Marketing (AREA)
- Economics (AREA)
- Artificial Intelligence (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Data Mining & Analysis (AREA)
- General Business, Economics & Management (AREA)
- Medical Informatics (AREA)
- Entrepreneurship & Innovation (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention discloses a click-through rate (CTR) prediction method based on decision trees and logistic regression, comprising the following steps: obtaining feature data related to the impression information; establishing a CTR prediction model based on a cascade of decision trees and a probabilistic sparse linear classifier; generating real-time training data through an online joiner; and training the CTR prediction model with the real-time training data to obtain the latest CTR prediction model, which is used for CTR prediction. A model architecture based on the cascade of decision trees and a probabilistic sparse linear classifier is proposed, which further includes an online learning layer. An online joiner is also disclosed; it is a key component of the online learning layer and converts training data into a real-time data stream. Compared with existing CTR prediction methods, the method of the present invention improves performance by at least 10%.
Description
Technical field
The present invention relates to the field of computer technology, and in particular to a click-through rate (CTR) prediction method based on decision trees and logistic regression.
Background technology
Digital advertising is a multi-million-dollar industry that continues to grow every year. Most online advertising platforms allocate advertisements dynamically, adjusting to user feedback so as to show users the advertisements they are interested in. Machine learning plays an important role in deciding which advertisements to show to a user, and this recommendation-like approach can also improve the efficiency of advertisement delivery.
A 2007 paper by Varian, Edelman et al. introduced a pay-per-click bidding model whose effectiveness depends on the accuracy of click prediction. The volume of data generated during bidding is usually very large, and new features or elements are added frequently, so a prediction system needs good adaptability and the ability to process large amounts of data.
In a search advertising system, the user's query serves as the basis for selecting candidate advertisements. In an advertisement delivery system, however, the user does not actively enter anything, so when advertisements are shown to a user, a large number of advertisements can match the user's targeting conditions, such as geographic location, interest attributes, and identity information. To choose the most suitable advertisement from this large set, machine learning is needed to predict the click-through rate (CTR, Click-Through-Rate) of each advertisement, and the advertisement with the highest predicted CTR is then shown to the user.
Invention content
In view of the above deficiencies, the present invention provides a click-through rate prediction method based on decision trees and logistic regression. It proposes a prediction model that combines decision trees with logistic regression and improves prediction performance.
To achieve the above object, the embodiments of the present invention adopt the following technical solution:
A click-through rate prediction method based on decision trees and logistic regression, comprising the following steps:
Obtaining feature data related to the impression information;
Establishing a click-through rate prediction model based on a cascade of decision trees and a probabilistic sparse linear classifier;
Generating real-time training data through an online joiner;
Training the click-through rate prediction model with the real-time training data to obtain the latest click-through rate prediction model, which is used for click-through rate prediction.
According to one aspect of the present invention, the online joiner works as follows: it attaches labels to the data and trains on the incoming data in an online manner, joining impressions of the impression information with clicks on it by request ID. Each user request generates a unique request ID, through which impressions and clicks are matched.
According to one aspect of the present invention, generating real-time training data through the online joiner includes the following steps:
A user visits a website or app, and the user's related information is transmitted to the system;
The system ranks the relevant impression information and returns it to the user's device;
The data generated by the above process is recorded in the impression data stream;
When the user clicks the impression information he or she has seen, the click is recorded in the click data stream;
After the time window expires, the online joiner sends the joined impression data to the training data set.
According to one aspect of the present invention, an anomaly detection mechanism needs to be established when generating real-time training data through the online joiner.
According to one aspect of the present invention, the linear classifier is trained with an online learning method.
According to one aspect of the present invention, features are transformed using boosted decision trees.
According to one aspect of the present invention, with the boosted decision trees, each individual tree is treated as a categorical feature whose value is the index of the leaf.
According to one aspect of the present invention, the boosted decision trees are trained in batch mode.
According to one aspect of the present invention, a feature weight is added for each feature. When each tree node is constructed, the best feature is selected and split on; when a feature is used in multiple trees, the importance of the feature is obtained by summing its loss reductions over all trees.
According to one aspect of the present invention, the click-through rate prediction method based on decision trees and logistic regression includes: handling large amounts of training data by sampling.
Advantages of the present invention: the click-through rate prediction method based on decision trees and logistic regression according to the present invention includes the following steps: obtaining feature data related to the impression information; establishing a click-through rate prediction model based on a cascade of decision trees and a probabilistic sparse linear classifier; generating real-time training data through an online joiner; and training the click-through rate prediction model with the real-time training data to obtain the latest click-through rate prediction model for click-through rate prediction. A model architecture based on the cascade of decision trees and a probabilistic sparse linear classifier is proposed, which further includes an online learning layer. An online joiner is also disclosed; it is a key component of the online learning layer and converts training data into a real-time data stream. Compared with existing click-through rate prediction methods, the method of the present invention improves performance by at least 10%.
Description of the drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings required in the embodiments are briefly introduced below. Evidently, the drawings described below are only some embodiments of the present invention, and those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic diagram of the click-through rate prediction method based on decision trees and logistic regression according to the present invention;
Fig. 2 is a schematic diagram of the training data freshness test results of the present invention;
Fig. 3 is a schematic diagram of the test results of training the model with different learning rate schemes according to the present invention;
Fig. 4 is a schematic diagram of the influence of different numbers of features on the results according to the present invention;
Fig. 5 is a schematic diagram of the uniform sampling training results of the present invention;
Fig. 6 is a schematic diagram of the negative sampling training results of the present invention.
Specific embodiment
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Evidently, the described embodiments are only some, rather than all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
In this embodiment, normalized entropy (NE) and calibration are used as the main evaluation metrics. The numerator of NE is the cross entropy, which is also the cost function of logistic regression (LR), where $y_i$ is the label of sample $i$, taking the value +1 or -1, and $p_i$ is the predicted click probability of sample $i$. The denominator is the information entropy of the original samples, where $p$ is the probability (that is, the frequency) of positive samples; it measures the uncertainty of the original samples. Suppose the given training data set contains $N$ samples, each with a label $y_i \in \{-1, +1\}$ and a predicted click probability $p_i$, where $i = 1, 2, \ldots, N$, and the average empirical CTR is $p$. Then NE can be expressed as

$$\mathrm{NE} = \frac{-\frac{1}{N}\sum_{i=1}^{N}\left(\frac{1+y_i}{2}\log p_i + \frac{1-y_i}{2}\log(1-p_i)\right)}{-\left(p\log p + (1-p)\log(1-p)\right)}$$
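As an illustration of the metric described above, the following minimal sketch computes NE as the average log loss divided by the entropy of the empirical CTR. The function name `normalized_entropy` and the sample data are hypothetical, and the sketch assumes every prediction lies strictly between 0 and 1 and that the data contains both clicked and non-clicked samples.

```python
import math

def normalized_entropy(labels, predictions):
    """Average log loss divided by the entropy of the empirical CTR.

    labels: +1 (click) or -1 (no click) per sample
    predictions: predicted click probabilities, each strictly in (0, 1)
    """
    labels = list(labels)
    predictions = list(predictions)
    n = len(labels)
    # Numerator: average cross entropy (the logistic-regression loss).
    log_loss = 0.0
    for y, p in zip(labels, predictions):
        log_loss -= math.log(p) if y == 1 else math.log(1.0 - p)
    log_loss /= n
    # Denominator: entropy of the empirical CTR p (needs 0 < p < 1).
    ctr = sum(1 for y in labels if y == 1) / n
    background_entropy = -(ctr * math.log(ctr) + (1.0 - ctr) * math.log(1.0 - ctr))
    return log_loss / background_entropy

labels = [1, -1, -1, -1]
# Predicting the empirical CTR for every sample gives NE = 1, i.e. RIG = 0.
baseline = normalized_entropy(labels, [0.25] * 4)
# A model that scores the clicked sample higher gives NE below 1.
better = normalized_entropy(labels, [0.7, 0.1, 0.1, 0.1])
```

A lower NE means the predictions remove more of the uncertainty that is present when only the average CTR is known.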
NE is a basic component in calculating the relative information gain (RIG), with RIG = 1 - NE, which describes how much of the uncertainty of the samples is removed. Without the help of a model, the uncertainty about whether a sample is positive or negative is large, and it is hard to determine the label of a sample. With a model, a predicted click-through rate is available, which makes it easier to judge whether a sample is positive or negative, so the uncertainty decreases. As shown in Fig. 1, a click-through rate prediction method based on decision trees and logistic regression includes the following steps:
Step S1: Obtain feature data related to the impression information;
In a specific embodiment of step S1, a decision tree model is first used to filter and combine the features formed from the advertisement data and the user data, generating highly discriminative and more representative categorical features, that is, cross features. On the one hand, this greatly reduces the dimensionality of the feature vector, accelerates the convergence of machine learning, and improves prediction efficiency. On the other hand, because more discriminative features are used to predict the advertisement click-through rate, more accurate predictions can be obtained.
Feature information related to specific historical advertisements delivered within a predetermined historical period is obtained. A historical advertisement is an advertisement that has already been delivered within the predetermined historical period and displayed on a user interface in some form, such as a search engine's result list, an application's message notification bar, or an application's dialog interface. The predetermined historical period is a preset period during which the specific historical advertisement remains unchanged. Specifically, the specific historical advertisement refers to the currently delivered advertisement whose click-through rate is to be predicted. The related feature information of the historical advertisement includes, but is not limited to, any one or more of the following: the industry of the advertisement, advertisement size, advertisement text, advertisement images, historical impression count, historical click count, and position-normalized click-through rate.
The industry feature is obtained from the information registered when the advertisement is placed, or by extracting keywords from information such as its description. The advertisement size is obtained from the displayed dimensions. The advertisement text is obtained directly from the published information. The advertisement image feature is a descriptor characterizing the image, such as a feature vector extracted from the image by a suitable image feature extraction algorithm. The historical impression count is the number of times the advertisement was shown to users within the specific historical period. The historical click count is the number of user clicks after the advertisement was shown. The position-normalized click-through rate refers to the number of user clicks after the advertisement's display position has been evaluated by an algorithm and the optimal position selected for display.
Personalized feature information of the target user is obtained. The personalized feature information is feature information that is related to the target user and characterizes the user's own attributes. In a specific embodiment, the personalized feature information of the target user includes, but is not limited to, any one or more of the following:
Gender, province, occupation, income, school, age, education, blood type, zodiac sign, network access mode, online time, preferences, and relationship status.
Step S2: Establish a click-through rate prediction model based on a cascade of decision trees and a probabilistic sparse linear classifier;
In a specific embodiment of step S2, a model structure is proposed in which boosted decision trees are cascaded with a probabilistic sparse linear classifier.
In practical applications, the online learning model used in this embodiment is based on the Stochastic Gradient Descent (SGD) algorithm. After feature transformation, an advertisement creative is represented by a structured vector $x = (e_{i_1}, \ldots, e_{i_n})$, where $e_i$ denotes the $i$-th unit vector and $i_1, \ldots, i_n$ are the values of the $n$ input features. In the training stage, a binary label $y \in \{+1, -1\}$ indicates whether the creative was clicked. Given a labeled advertisement creative $(x, y)$, the linear combination of the weights is denoted $s(y, x, w)$, where $w$ is the weight vector of the linear click score.
In the Bayesian online learning scheme, the two key components are the likelihood and the prior. They are expressed in terms of $\Phi(t)$, the cumulative distribution function of the standard normal distribution, and $N(t)$, the density function of the standard normal distribution. Online training is realized through expectation propagation with moment matching. The model consists of the mean and variance of the approximate posterior distribution of the weight vector $w$, and the update equations use the auxiliary functions
$v(t) := N(t)/\Phi(t)$, $w(t) := v(t)\,[v(t) + t]$.
In the SGD algorithm, by contrast, the likelihood function is
$p(y \mid x, w) = \mathrm{sigmoid}(s(y, x, w))$
where $\mathrm{sigmoid}(t) = \exp(t)/(1 + \exp(t))$. This model is usually referred to as logistic regression (LR). Inference computes the derivative of the log-likelihood and takes a step along the gradient direction for each coordinate, where $g$ is the gradient of the log-likelihood with respect to all non-zero features.
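The SGD variant described above can be sketched as follows. This is a minimal illustration rather than the update formula of the embodiment: it assumes the score is $s(y, x, w) = y \sum_j w_{i_j}$ over the active (non-zero) features and uses the standard gradient of $\log \mathrm{sigmoid}(s)$. The feature names, the fixed step size `eta`, and the function names are hypothetical.

```python
import math

def sigmoid(t):
    # Equivalent to exp(t) / (1 + exp(t)).
    return 1.0 / (1.0 + math.exp(-t))

def score(y, active, w):
    """s(y, x, w): the label times the sum of the weights of the active features."""
    return y * sum(w.get(j, 0.0) for j in active)

def sgd_step(y, active, w, eta=0.1):
    """One gradient-ascent step on log sigmoid(s(y, x, w)).

    Only the weights of the non-zero (active) features are changed.
    """
    # Derivative of log sigmoid(y * sum_j w_j) with respect to each active w_j.
    g = y * (1.0 - sigmoid(score(y, active, w)))
    for j in active:
        w[j] = w.get(j, 0.0) + eta * g
    return w

w = {}
clicked = ["tree0_leaf1", "tree1_leaf0"]  # hypothetical active features
before = sigmoid(score(+1, clicked, w))
for _ in range(20):
    sgd_step(+1, clicked, w)
after = sigmoid(score(+1, clicked, w))
```

Storing the weights in a dictionary keyed by active feature keeps each update proportional to the number of non-zero features, which is what makes the sparse linear classifier cheap to train online.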
Specifically, the process of generating the decision tree model is summarized as follows. Let the data sample set be S. An attribute, such as the user's age, is first selected according to some strategy, and the samples are split on that attribute. For example, with age 30 as the boundary, samples older than 30 are placed in one set and samples younger than 30 in another. Each personalized feature of the user, such as gender, province, occupation, income, school, age, education, blood type, zodiac sign, network access mode, online time, preferences, and relationship status, is treated as an attribute and split on a quantized value. Likewise, each related feature of the specific historical advertisement, such as the industry of the advertisement, advertisement size, advertisement text, advertisement images, historical impression count, historical click count, and position-normalized click-through rate, is treated as an attribute and further split on the corresponding quantized value, until no further split is possible. This produces the leaf nodes of the decision tree, each of which represents a cross feature.
In practical applications, there are two simple ways to transform the input features of a linear classifier in order to improve accuracy. For continuous features, a simple way to learn a nonlinear transformation is to bin the feature and treat the bin index as a categorical feature. The linear classifier then effectively learns a piecewise-constant nonlinear mapping. Learning useful bin boundaries is important, and there are many ways to do so. The second simple and effective transformation is to build tuple input features. For categorical features, the brute-force approach is to take the Cartesian product, but its drawback is that useless combinations cannot be pruned. If the input features are continuous, joint binning can be performed, for example with a k-d tree.
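The two transformations described above can be sketched as follows, under the assumption that bin boundaries are already given. The boundary values, feature names, and helper names are hypothetical and only illustrate the idea.

```python
import bisect

def bin_index(value, boundaries):
    """Return the index of the bin that `value` falls into.

    `boundaries` must be sorted; values below the first boundary get bin 0.
    """
    return bisect.bisect_right(boundaries, value)

def cross(feature_a, feature_b):
    """Tuple (Cartesian-product) feature built from two categorical values."""
    return f"{feature_a}&{feature_b}"

AGE_BOUNDARIES = [18, 30, 45, 60]            # hypothetical bin edges
age_bin = bin_index(34, AGE_BOUNDARIES)      # 34 falls into [30, 45)
crossed = cross(f"age_bin={age_bin}", "industry=travel")
```

Because every pair of values produces its own crossed feature, the Cartesian product grows quickly, which is why the text notes that useless combinations cannot be pruned.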
Boosted decision trees are a powerful and convenient way to implement the nonlinear and tuple transformations just described. Each individual tree is treated as a categorical feature whose value is the index of the leaf that a sample falls into. For example, suppose a boosted tree model has 2 subtrees, where one subtree has 3 leaf nodes and the other has 2. If a sample ends up in the 2nd leaf node of subtree 1 and the 1st leaf node of subtree 2, the binary vector [0, 1, 0, 1, 0] is used as the input to the linear classifier, where the first 3 entries correspond to the leaf nodes of subtree 1 and the last 2 to the leaf nodes of subtree 2. The boosted decision trees used here follow the Gradient Boosting Machine (GBM), with the classical L2-TreeBoost algorithm. In each learning iteration, a new tree is created to model the residual of the previous trees. The tree-based transformation is a supervised feature encoding that converts a real-valued vector into a compact binary-valued vector. A traversal from the root node to a leaf node represents a rule on certain features, so fitting a linear classifier on the binary vector essentially learns a weight for each of these rules. The boosted decision trees are trained in batch mode, which can save training time significantly.
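The leaf-index encoding in the example above can be reproduced with a short sketch. The helper `encode_leaves` is hypothetical and assumes that the leaf reached in each tree is already known as a 0-based index.

```python
def encode_leaves(leaf_indices, leaves_per_tree):
    """Concatenate one one-hot block per tree.

    leaf_indices[k] is the 0-based index of the leaf reached in tree k.
    """
    vector = []
    for leaf, n_leaves in zip(leaf_indices, leaves_per_tree):
        block = [0] * n_leaves
        block[leaf] = 1
        vector.extend(block)
    return vector

# Subtree 1 has 3 leaves and subtree 2 has 2 leaves. The sample reaches the
# 2nd leaf of subtree 1 and the 1st leaf of subtree 2 (0-based: 1 and 0).
features = encode_leaves([1, 0], [3, 2])
print(features)  # [0, 1, 0, 1, 0]
```

Each position of the resulting vector corresponds to one root-to-leaf rule, so the weight that the linear classifier learns for that position is the weight of the rule.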
In practical applications, experiments were carried out to show the effect of using tree features as the input to the linear model. Two logistic regression models were compared, one with the tree feature transformation and one using the raw features directly, and the boosted decision trees alone were also compared. The results are shown in the following table:
Model | NE value |
Logistic regression + boosted decision trees | 96.58% |
Logistic regression | 99.43% |
Boosted decision trees | 100% |
The table shows that the NE value drops by nearly 3% after the feature transformation is applied, which is a clear improvement. It also shows that combining logistic regression with decision trees brings a larger gain than either model alone.
In practical applications, to keep the data as fresh as possible, the linear classifier is trained by online learning.
The influence of different learning rate schemes on SGD-based logistic regression was evaluated. For this purpose, the following schemes were considered:
1. Per-coordinate learning rate: the learning rate of feature i at iteration t, where α and β are tunable parameters.
2. Per-weight square-root learning rate, where n_{t,i} denotes the total number of training examples containing feature i up to iteration t.
3. Per-weight learning rate.
4. Global learning rate.
5. Constant learning rate:
η_{t,i} = α
The first three schemes set a separate learning rate for each feature, while the last two use the same learning rate for all features. The tunable parameters were optimized by grid search; the optimal values are listed in the following table:
The model was trained with each of the above learning rate schemes, and the experimental results are shown in Fig. 3. Scheme 1 achieves the best NE value and scheme 3 performs worst, while schemes 2 and 5 give similar results. The main reason scheme 4 performs poorly is probably the imbalance in the number of training examples per feature: because each training example contains different features, some features appear in many more training examples than others. Under scheme 4, the learning rate of features with few examples decreases too quickly, preventing their weights from converging to the optimum. Scheme 3 does not have this problem, but it still performs poorly because it decreases the learning rates of all features too quickly, so that training stops while the model has only converged to a suboptimal point. This explains why this scheme performs worst.
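The formulas for schemes 1 to 4 are not reproduced in this text. The sketch below therefore uses the forms that the scheme names commonly denote, and these are assumptions rather than the formulas of the embodiment: a per-coordinate rate that shrinks with the accumulated squared gradient of feature i, per-weight rates that shrink with the example count n_{t,i}, and a global rate that shrinks with the iteration count t. Only scheme 5 (η_{t,i} = α) is taken directly from the text.

```python
import math

def per_coordinate(alpha, beta, squared_gradient_sum):
    """Scheme 1 (assumed form): alpha / (beta + sqrt(sum of squared gradients))."""
    return alpha / (beta + math.sqrt(squared_gradient_sum))

def per_weight_sqrt(alpha, n_ti):
    """Scheme 2 (assumed form): alpha / sqrt(n_ti)."""
    return alpha / math.sqrt(n_ti)

def per_weight(alpha, n_ti):
    """Scheme 3 (assumed form): alpha / n_ti."""
    return alpha / n_ti

def global_rate(alpha, t):
    """Scheme 4 (assumed form): alpha / sqrt(t), shared by all features."""
    return alpha / math.sqrt(t)

def constant_rate(alpha):
    """Scheme 5: eta = alpha."""
    return alpha

# A rare feature seen in only 4 of the first 10,000 training examples.
alpha, t, n_ti = 0.1, 10_000, 4
rare_global = global_rate(alpha, t)                  # depends on t only
rare_per_weight_sqrt = per_weight_sqrt(alpha, n_ti)  # depends on n_ti only
```

Under these assumed forms, the global rate for the rare feature has already decayed with the total iteration count even though the feature itself has been updated only a handful of times, which is consistent with the explanation given above for the poor result of scheme 4.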
Step S3: Generate real-time training data through the online joiner;
A click prediction system is usually deployed in a dynamic bidding environment, so the data distribution changes over time, and the freshness of the training data strongly affects prediction performance. To verify this, the model was trained on the data of one particular day and then applied to the bidding of the following consecutive days. The final test results are shown in Fig. 2, where the horizontal axis is the number of days between the test data and the training data and the vertical axis is the NE value. The figure clearly shows that the NE value increases as the number of days increases, so the model must be retrained on the latest data after some period (no more than 7 days) to stay optimal; a scheduled task is used for this purpose. The time needed to train the boosted decision trees depends on various factors, including the number of trees, the number of leaves per tree, the CPU, and the memory. On a single CPU, training a boosted decision tree model can take more than 24 hours. In a production environment, such trees are therefore trained in parallel on multi-core machines with sufficient memory.
Fresher training data improves prediction accuracy. It also suggests a simple model architecture in which the linear classifier layer is trained online.
In practical applications, this embodiment proposes an experimental system that generates real-time training data and trains the linear classifier through online learning. This system is called the "online joiner" because its key operation is to join labels (click / no click) to the training inputs (advertisement creatives) in an online manner. During delivery, click labels can be obtained in real time, but because of data and network delays it cannot be known in real time that a user has not clicked an advertisement. An impression is therefore labeled as not clicked if no click arrives within a certain time window, and the question is how long this window should be.
If the window is too long, more memory is needed to buffer the creative information while waiting for clicks to occur; if it is too short, some legitimate clicks are lost. This affects the "click coverage", the fraction of all clicks that are successfully joined to impressions. The online joiner system must therefore strike a balance between recency and click coverage.
Without full click coverage, the real-time training set is biased: the empirical CTR is lower than the true value, because a fraction of the impressions labeled as not clicked would have been labeled as clicked had the waiting time been long enough. In practice, however, it was found that by adjusting the waiting window this bias is easily reduced while keeping the memory requirement manageable. In addition, this small bias can be measured and corrected. The main job of the online joiner is to join advertisement impressions and advertisement clicks by request ID. Each bid in the Silver Orange bidding system generates a unique request ID, through which impressions and clicks can be matched. The general flow of the online joiner is as follows. A user visits a website or app, and the user's related information is transmitted to the Silver Orange bidding system. The bidding system ranks the relevant advertisements and returns them to the user's device, and the data generated in this process is recorded in the impression data stream. When the user clicks an advertisement he or she has seen, the click is recorded in the click data stream. After the time window expires, the online joiner sends the joined impression data (labeled as clicked or not clicked) to the training data set. In this way the trainer continuously produces up-to-date models. The machine learning models thus form a tight closed loop in which changes in the feature distribution or in model performance can be captured, learned from, and corrected within a short time.
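The join-by-request-ID flow described above can be sketched as follows. This is a simplified in-memory illustration under stated assumptions: timestamps are plain numbers passed in by the caller, the class and method names (`OnlineJoiner`, `on_impression`, `on_click`, `flush`) are hypothetical, and a real system would read from the impression and click data streams instead.

```python
class OnlineJoiner:
    """Joins impressions to clicks by request ID within a waiting window."""

    def __init__(self, window_seconds):
        self.window = window_seconds
        # request_id -> [impression_time, features, clicked]
        self.pending = {}

    def on_impression(self, request_id, features, now):
        self.pending[request_id] = [now, features, False]

    def on_click(self, request_id, now):
        entry = self.pending.get(request_id)
        # A click that arrives after its impression was flushed is lost.
        if entry is not None and now - entry[0] <= self.window:
            entry[2] = True

    def flush(self, now):
        """Emit labelled examples whose waiting window has expired."""
        expired = [r for r, e in self.pending.items() if now - e[0] >= self.window]
        examples = []
        for request_id in expired:
            _, features, clicked = self.pending.pop(request_id)
            examples.append((features, +1 if clicked else -1))
        return examples

joiner = OnlineJoiner(window_seconds=600)
joiner.on_impression("req-1", {"ad": "A"}, now=0)
joiner.on_impression("req-2", {"ad": "B"}, now=5)
joiner.on_click("req-1", now=42)
training_examples = joiner.flush(now=700)
```

The `pending` dictionary is the buffer whose size grows with the waiting window, which is the memory cost that has to be balanced against click coverage.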
When using a system that generates real-time training data, an important consideration is to build protection mechanisms against anomalies that could corrupt the online learning system. For example, if for some reason the click data stream contains only stale data, the training data generated by the online joiner will contain very few clicks. As a result, the models produced by the real-time trainer will predict very low click-through rates, which in turn reduces the number of advertisement impressions in the bidding system. An anomaly detection mechanism helps avoid such problems; for example, when the distribution of the real-time training data changes suddenly, the online trainer can be automatically disconnected from the online joiner.
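One possible form of such a protection mechanism is sketched below. It is an assumption for illustration, not the detection method of the embodiment: the guard tracks the click share of the most recent joined examples and disconnects the trainer when that share deviates from a reference CTR by more than a relative tolerance. The class name and all thresholds are hypothetical.

```python
from collections import deque

class TrainingStreamGuard:
    """Stops online training when the click share of recent examples
    drifts far from a reference value."""

    def __init__(self, reference_ctr, tolerance, window_size):
        self.reference_ctr = reference_ctr
        self.tolerance = tolerance            # allowed relative deviation
        self.recent = deque(maxlen=window_size)
        self.connected = True

    def observe(self, label):
        """Record one joined example (+1 click, -1 no click)."""
        self.recent.append(1 if label == 1 else 0)
        if len(self.recent) == self.recent.maxlen:
            ctr = sum(self.recent) / len(self.recent)
            deviation = abs(ctr - self.reference_ctr) / self.reference_ctr
            if deviation > self.tolerance:
                self.connected = False        # disconnect the online trainer
        return self.connected

guard = TrainingStreamGuard(reference_ctr=0.1, tolerance=0.5, window_size=10)
for label in [1] + [-1] * 9:        # 10% clicks, matches the reference
    guard.observe(label)
still_connected = guard.connected
for label in [-1] * 10:             # click stream stalls: no clicks are joined
    guard.observe(label)
```

Once `connected` is false the guard stays disconnected until it is reset by an operator, so a corrupted stream cannot silently resume training.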
Step S4: Train the click-through rate prediction model with the real-time training data to obtain the latest click-through rate prediction model, which is used for click-through rate prediction.
In practical applications, the more trees the model contains, the longer prediction takes. This section studies the influence of the number of trees on prediction accuracy. The number of trees was increased from 1 to 2000; the training set was one full day of data and the test set was the data of the following day. The tests show that the NE value drops markedly as the number of trees increases up to 500 and remains essentially unchanged afterwards. More trees therefore do not always give better results, and overfitting often sets in at some point during training.
The number of features is another factor that affects prediction accuracy and computational performance. To better understand its influence, a feature weight is computed for each feature. When each tree node is constructed, the best feature is selected and split on so as to maximally reduce the squared error. Because a feature may be used in multiple trees, the importance of each feature is obtained by summing its total loss reduction over all trees.
Empirically, only a small number of features have a large influence on the model, while the influence of
most other features is negligible. We tested this observation by retaining only 10, 20, 50, 100 and 200 of
the features and assessing the influence of the feature count on the result. The results are shown in
Figure 4: the NE value drops markedly between 10 and 50 features and only slightly between 50 and 200
features. This confirms that the features with a large influence on the model account for only a small
proportion of all features.
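The importance calculation and the feature-count experiment can be sketched as follows. This is a hedged illustration: the input format (one list of `(feature, reduction)` pairs per tree) is hypothetical, the loss value of a split is assumed to be its reduction in squared error, and the retained features are assumed to be those with the highest importance.

```python
from collections import defaultdict


def feature_importance(trees):
    """Sum the squared-error reduction of every split, per feature, over all trees.

    trees: list of trees; each tree is a list of (feature_name, reduction)
    pairs, one pair per split node (hypothetical input format).
    Returns (feature, importance) pairs sorted by decreasing importance.
    """
    totals = defaultdict(float)
    for splits in trees:
        for feature, reduction in splits:
            totals[feature] += reduction
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


def top_k_features(trees, k):
    """Names of the k most important features, e.g. k = 10, 20, 50, 100, 200."""
    return [name for name, _ in feature_importance(trees)[:k]]
```

Retraining the model on `top_k_features(trees, k)` for each value of k and comparing the NE values would reproduce the kind of experiment described above.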
In practical applications a large amount of training data has to be processed. We present two data
sampling methods and evaluate their quality: uniform sampling and negative sample down-sampling. A boosted
decision tree model with 600 trees is used for the comparison.
Uniform sampling of the training data is a very common method, because it is simple to implement and the
resulting model can be used without modifying the sample data. In this section we evaluate different
sampling rates. For each set of sampled data we train a boosted model; the experimental results are shown
in Figure 5. As the figure shows, the more data is used, the better the model performs. However, with only
10% of the training data the NE value differs by just 0.02 from the value obtained with all the training
data, so it is not necessary to train on all the data in the experiments.
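Uniform sampling as described above can be illustrated in a few lines. The function name, the fixed seed and the in-memory list are assumptions made for the sketch; a production system would sample a data stream instead.

```python
import random


def uniform_sample(rows, rate, seed=0):
    """Keep each training row independently with probability `rate`.

    Clicks and non-clicks are treated alike, so the class balance of the
    sample matches that of the full data on average.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [row for row in rows if rng.random() < rate]
```

Calling `uniform_sample(rows, 0.1)` corresponds to the 10% setting discussed in the text.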
Many researchers have studied the class imbalance problem extensively, and the results show that such
imbalance can strongly affect the performance of the learned model. Below we use negative sample
down-sampling to address the class imbalance problem. As before, we compare the effect of several sampling
rates; the comparison is shown in Figure 6. As the figure shows, the model performs best at a sampling rate
of 0.025.
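Negative sample down-sampling can be sketched as below: every click is kept and each non-click is kept with probability `rate`. The function names and the `(features, label)` row format are hypothetical. The `recalibrate` helper is not described in this text; it is the usual correction that maps a probability predicted by a model trained on down-sampled negatives back to the original data distribution, and it is included only as an assumption about how such a model would be used.

```python
import random


def negative_downsample(rows, rate, seed=0):
    """Keep all positive rows and each negative row with probability `rate`.

    rows: iterable of (features, label) pairs, label 1 = click, 0 = no click.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [(x, y) for x, y in rows if y == 1 or rng.random() < rate]


def recalibrate(p, rate):
    """Map a prediction p made in the down-sampled space back to the
    original space, where `rate` is the negative sampling rate."""
    return p / (p + (1.0 - p) / rate)
```

With `rate=0.025`, the setting reported as best above, roughly one non-click in forty is kept while no click is discarded.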
Advantages of implementing the present invention: the click-through rate prediction method based on
decision trees and logistic regression of the present invention includes the following steps: obtaining
the relevant feature data of the impression information; establishing a click-through rate prediction
model based on a cascade of decision trees and a probabilistic sparse linear classifier; generating
real-time training data with the online joiner; and training the click-through rate prediction model with
the real-time training data to obtain the latest click-through rate prediction model, which is used to
predict the click-through rate. A model architecture based on the cascade of decision trees and a
probabilistic sparse linear classifier is proposed, which further includes an online learning layer. The
online joiner is also disclosed; it is a key component of the online learning layer and can convert the
training data into streaming data in real time. Compared with existing click-through rate prediction
methods, the click-through rate prediction method based on decision trees and logistic regression of the
present invention improves the effect by at least 10%.
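The cascade summarized above can be illustrated with a toy sketch. Everything named here is hypothetical: the two hand-written "trees" stand in for trained boosted decision trees, the feature names are invented, and plain stochastic gradient descent on logistic regression stands in for the probabilistic sparse linear classifier trained online.

```python
import math


# Hypothetical stand-ins for trained trees: each maps features to a leaf index.
def tree_a(x):
    return 0 if x["age"] < 30 else 1


def tree_b(x):
    if x["hour"] < 12:
        return 0
    return 1 if x["is_mobile"] else 2


TREES = [(tree_a, 2), (tree_b, 3)]  # (tree function, number of leaves)


def leaf_one_hot(x):
    """Treat each tree as a categorical feature whose value is the leaf index,
    and one-hot encode that index to obtain the linear classifier's input."""
    vec = []
    for tree, n_leaves in TREES:
        one_hot = [0.0] * n_leaves
        one_hot[tree(x)] = 1.0
        vec.extend(one_hot)
    return vec


def predict(weights, vec):
    """Logistic regression prediction of the click probability."""
    z = sum(w * v for w, v in zip(weights, vec))
    return 1.0 / (1.0 + math.exp(-z))


def sgd_update(weights, vec, label, lr=0.1):
    """One online learning step on a single labelled impression."""
    p = predict(weights, vec)
    return [w - lr * (p - label) * v for w, v in zip(weights, vec)]
```

In this arrangement the trees would be retrained in batch mode, while `sgd_update` would be called for every joined impression produced by the online joiner.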
The above is merely a specific embodiment of the present invention, but the protection scope of the
present invention is not limited thereto. Any change or replacement that a person skilled in the art can
readily conceive of within the technical scope disclosed by the present invention shall fall within the
protection scope of the present invention. Therefore, the protection scope of the present invention shall
be determined by the protection scope of the claims.
Claims (10)
1. A click-through rate prediction method based on decision trees and logistic regression, characterized
in that the method includes the following steps:
obtaining the relevant feature data of impression information;
establishing a click-through rate prediction model based on a cascade of decision trees and a
probabilistic sparse linear classifier;
generating real-time training data with an online joiner;
training the click-through rate prediction model with the real-time training data to obtain the latest
click-through rate prediction model, and using it to predict the click-through rate.
2. The click-through rate prediction method based on decision trees and logistic regression according to
claim 1, characterized in that the online joiner works as follows: labels are added to the data and the
input data is trained on in an online manner; impressions of the impression information are joined with
clicks on the impression information by request ID; a unique request ID is generated each time a user uses
the system, and impressions and clicks are matched by this ID.
3. The click-through rate prediction method based on decision trees and logistic regression according to
claim 2, characterized in that generating real-time training data with the online joiner includes the
following steps:
a user visits a website or app, and the relevant information of the user is transmitted to the system;
the system ranks the relevant impression information and returns it to the device of the user;
the data generated by the above process is recorded in the impression data stream;
when the user clicks the impression information that he or she sees, the click is recorded in the click
data stream;
after the time window has elapsed, the online joiner sends the joined impression data to the training
data set.
4. The click-through rate prediction method based on decision trees and logistic regression according to
claim 3, characterized in that an anomaly detection mechanism is established in the process of generating
real-time training data with the online joiner.
5. The click-through rate prediction method based on decision trees and logistic regression according to
claim 1, characterized in that the linear classifier is trained with an online learning method.
6. The click-through rate prediction method based on decision trees and logistic regression according to
claim 1, characterized in that boosted decision trees are used to transform the features.
7. The click-through rate prediction method based on decision trees and logistic regression according to
claim 6, characterized in that, in the boosted decision trees, each individual tree is treated as a
categorical feature whose value is the index of the leaf.
8. The click-through rate prediction method based on decision trees and logistic regression according to
claim 6, characterized in that the boosted decision trees are trained on the training data in batch mode.
9. The click-through rate prediction method based on decision trees and logistic regression according to
claim 6, characterized in that a feature weight is assigned to each feature; when each tree node is
constructed, the best feature is selected and split on; because a feature can be used in multiple trees,
the importance of each feature is obtained by summing the loss values over all the trees.
10. The click-through rate prediction method based on decision trees and logistic regression according to
any one of claims 1 to 9, characterized in that the method includes: processing a large amount of training
data using a sampling method.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711439302.9A CN108182597A (en) | 2017-12-27 | 2017-12-27 | A kind of clicking rate predictor method based on decision tree and logistic regression |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711439302.9A CN108182597A (en) | 2017-12-27 | 2017-12-27 | A kind of clicking rate predictor method based on decision tree and logistic regression |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108182597A (en) | 2018-06-19 |
Family
ID=62547435
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711439302.9A Withdrawn CN108182597A (en) | 2017-12-27 | 2017-12-27 | A kind of clicking rate predictor method based on decision tree and logistic regression |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108182597A (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP2960849A1 (en) * | 2014-06-26 | 2015-12-30 | Deutsche Telekom AG | Method and system for recommending an item to a user |
CN105808762A (en) * | 2016-03-18 | 2016-07-27 | 北京百度网讯科技有限公司 | Resource sequencing method and device |
CN107067274A (en) * | 2016-12-27 | 2017-08-18 | 北京掌阔移动传媒科技有限公司 | One DSP real time bid ad system based on blended learning model |
Non-Patent Citations (1)
Title |
---|
XINRAN et al.: "Practical Lessons from Predicting Clicks on Ads at Facebook", ACM * |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110933499A (en) * | 2018-09-19 | 2020-03-27 | 飞狐信息技术(天津)有限公司 | Video click rate estimation method and device |
CN110933499B (en) * | 2018-09-19 | 2021-12-24 | 飞狐信息技术(天津)有限公司 | Video click rate estimation method and device |
CN109003148A (en) * | 2018-09-30 | 2018-12-14 | 北京奇虎科技有限公司 | Advertisement sending method, device, server and readable storage medium storing program for executing |
CN109003148B (en) * | 2018-09-30 | 2023-10-31 | 三六零科技集团有限公司 | Advertisement pushing method, advertisement pushing device, server and readable storage medium |
CN109522506A (en) * | 2018-10-30 | 2019-03-26 | 广东原昇信息科技有限公司 | The dynamic prediction method of visitor's behavioral data conversion ratio |
CN111192071A (en) * | 2018-11-15 | 2020-05-22 | 北京嘀嘀无限科技发展有限公司 | Invoice amount estimation method and device and invoice probability model training method and device |
CN111192071B (en) * | 2018-11-15 | 2023-11-17 | 北京嘀嘀无限科技发展有限公司 | Method and device for estimating amount of bill, method and device for training bill probability model |
CN112055038A (en) * | 2019-06-06 | 2020-12-08 | 阿里巴巴集团控股有限公司 | Method for generating click rate estimation model and method for predicting click probability |
CN112055038B (en) * | 2019-06-06 | 2022-04-15 | 阿里巴巴集团控股有限公司 | Method for generating click rate estimation model and method for predicting click probability |
CN110245990A (en) * | 2019-06-19 | 2019-09-17 | 北京达佳互联信息技术有限公司 | Advertisement recommended method, device, electronic equipment and storage medium |
CN113723744A (en) * | 2021-07-12 | 2021-11-30 | 浙江德马科技股份有限公司 | Storage equipment management system, method, computer storage medium and server |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
WW01 | Invention patent application withdrawn after publication | Application publication date: 20180619 |