CN103761266A - Click rate predicting method and system based on multistage logistic regression - Google Patents

Click rate predicting method and system based on multistage logistic regression

Info

Publication number
CN103761266A
Authority
CN
China
Prior art keywords
model
clicking rate
logistic regression
logic
multilevel logic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201410001103.XA
Other languages
Chinese (zh)
Inventor
崔晶晶
林佳婕
李春华
受春柏
刘立娜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BEIJING GEO POLYMERIZATION NETWORK TECHNOLOGY Co Ltd
Original Assignee
BEIJING GEO POLYMERIZATION NETWORK TECHNOLOGY Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2014-01-02
Filing date
2014-01-02
Publication date
2014-04-30
Application filed by BEIJING GEO POLYMERIZATION NETWORK TECHNOLOGY Co Ltd
Priority to CN201410001103.XA
Publication of CN103761266A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241 Advertisements
    • G06Q30/0242 Determining effectiveness of advertisements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241 Advertisements
    • G06Q30/0277 Online advertisement

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Marketing (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Game Theory and Decision Science (AREA)
  • Human Resources & Organizations (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a click rate predicting method and system based on multistage logistic regression. The method includes: analyzing obtained click rate data to identify the factors affecting click rate, and selecting feature vectors from those factors to build a feature model; performing machine learning on the feature model with a multistage logistic regression model to obtain a prediction model; and using the prediction model to predict the click rate data to be predicted. Because the regression is multistage, the amount of calculation is reduced and calculation speed is increased while the dimensionality and the number of samples stay unchanged, which addresses the problems of large data volume and inaccurate prediction in current click rate prediction.

Description

Click rate prediction method and system based on multistage logistic regression
Technical field
The present invention relates to big data machine learning in the field of internet processing, and in particular to a method and system for click rate prediction based on multistage logistic regression.
Background art
As global informatization advances, internet applications are becoming more widespread, and internet advertising accounts for a growing share of advertising relative to traditional media. With the recent rise of online games, e-commerce, and affiliate networks that value long-tail traffic, the actual effect an advertiser obtains from web advertising is receiving increasing attention. Computing statistics on the click rate of ad links reveals which ads interest which users, so that the right ads can be shown to each user more accurately, raising the click rate of the ads, the delivery effect, and the number of page visits. The click rate, also called CTR (Click-through Rate) or CR (Clicks Ratio), is a ratio: the number of clicks on a link divided by the number of times the link is shown. For an ad link, the click rate usually reflects the delivery quality of the ad. If an advertising platform can predict the click rate of candidate ads from a user's browsing or search behavior, the content of the page, and so on, it can assess the quality of each ad placement and deliver the ads with higher predicted click rates, thereby improving the return on investment (ROI) of the advertising.
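As a minimal illustration of the ratio defined above, the following Python sketch computes a click rate from raw counts. The function name and the sample counts are hypothetical, and returning 0.0 for a link that was never shown is an assumption of this sketch, not something the text specifies.

```python
def click_through_rate(clicks: int, impressions: int) -> float:
    """Return clicks / impressions, or 0.0 when the link was never shown."""
    if clicks < 0 or impressions < 0:
        raise ValueError("counts must be non-negative")
    if impressions == 0:
        # Assumption of this sketch: an unshown link has click rate 0.
        return 0.0
    return clicks / impressions


# Hypothetical counts for one ad link over some period.
ctr = click_through_rate(clicks=30, impressions=1200)
print(f"{ctr:.3f}")  # 0.025
```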
Current methods for predicting click rate usually begin by selecting the factors that influence ad click rate in order to build a raw data model. Many factors affect the click rate of an ad, for example the ad, the media, and the audience. Each factor can in turn be subdivided into many aspects. Each aspect can be treated as one data dimension, and the sample data for each dimension is the actual click rate of that dimension over a period of time, so the multidimensional sample data is massive. As a result, click rate prediction faces an excessive amount of computation. To address this, current approaches generally reduce the number of dimensions or the number of samples in order to cut the computation. However, reducing the sample dimensions or the sample size degrades the accuracy of the prediction.
Summary of the invention
The invention provides a click rate prediction method and system based on multistage logistic regression. Multistage logistic regression reduces the amount of computation while the number of dimensions and the number of samples stay unchanged, which addresses the large data volume and inaccurate predictions of current click rate prediction.
According to one aspect of the present invention, a click rate prediction method based on multistage logistic regression is provided, the method comprising:
Feature extraction step: analyze the acquired click rate data to identify the factors that influence click rate, select feature vectors from those factors, and build a feature model;
Model training step: use a multistage logistic regression model to perform multistage logistic regression machine learning on the feature model, obtaining a prediction model;
Click rate prediction step: use the prediction model to predict the click rate data to be predicted.
According to another aspect of the present invention, a click rate prediction system based on multistage logistic regression is provided, the system comprising:
Feature extraction device: for analyzing the acquired click rate data to identify the factors that influence click rate, selecting feature vectors from those factors, and building a feature model;
Model training device: for using a multistage logistic regression model to perform multistage logistic regression machine learning on the feature model, obtaining a prediction model;
Click rate prediction device: for using the prediction model to predict the click rate data to be predicted.
Compared with the prior art, the multistage logistic regression method provided by the present invention improves the accuracy and efficiency of click rate prediction. Applied to advertising, it allows ads to be delivered more accurately. The click rate prediction method of the present invention is not limited to advertising and can also be applied to other big data retrieval and prediction fields.
Brief description of the drawings
Fig. 1 shows the method of click rate prediction by multistage logistic regression according to an embodiment of the present invention.
Detailed description of the embodiments
Embodiments of the present invention are described in detail below with reference to the drawing and examples, so that the technical problem solved by the invention, the technical means adopted, and the technical effect achieved can be fully understood. Note that, provided no conflict arises, the embodiments of the present invention and the features of each embodiment may be combined with one another, and the resulting technical solutions all fall within the scope of protection of the present invention.
Embodiment 1
As shown in Fig. 1, the click rate prediction method based on multistage logistic regression of this embodiment mainly comprises the following steps:
Feature extraction step: analyze the acquired click rate data to identify the factors that influence click rate, select feature vectors from those factors, and build a feature model;
Model training step: use a multistage logistic regression model to perform multistage logistic regression machine learning on the feature model, obtaining a prediction model; and
Click rate prediction step: use the prediction model to predict the click rate data to be predicted.
Many factors affect click rate; the most important are the ad, the media, and the audience. The present invention preferably builds the click rate feature model as follows:
μ(a,u,c) = p(click|a,u,c)
where a denotes the ad, u denotes the audience, and c denotes the media.
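One way to picture the feature model is to turn each (a, u, c) triple into a numeric feature vector x. The sketch below uses one-hot encoding, which is an assumption: the text does not say how the feature vectors are constructed, and the function names and the identifiers in the sample log are invented for illustration.

```python
from typing import Dict, Iterable, List, Tuple

Impression = Tuple[str, str, str]  # (ad id, audience segment, media id)


def build_vocabulary(impressions: Iterable[Impression]) -> Dict[Tuple[str, str], int]:
    """Assign one vector position to every (factor, value) pair seen in the data."""
    pairs = set()
    for ad, audience, media in impressions:
        pairs.update({("a", ad), ("u", audience), ("c", media)})
    return {pair: i for i, pair in enumerate(sorted(pairs))}


def to_feature_vector(impression: Impression,
                      vocab: Dict[Tuple[str, str], int]) -> List[float]:
    """One-hot encode one impression; values not in the vocabulary stay zero."""
    x = [0.0] * len(vocab)
    for pair in zip(("a", "u", "c"), impression):
        if pair in vocab:
            x[vocab[pair]] = 1.0
    return x


# Hypothetical log entries; the identifiers are made up for illustration.
log = [("ad1", "sports_fans", "news_site"), ("ad2", "sports_fans", "video_site")]
vocab = build_vocabulary(log)
x = to_feature_vector(log[0], vocab)
```

With this encoding the vector length equals the number of distinct factor values, which is why the sample data grows quickly as each factor is subdivided.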
In the model training step, the following multistage logistic regression model is preferably used:
p(click|a,u,c) = σ(w^T x)
where w denotes the n-dimensional feature weight vector (the parameters), w^T its transpose, and x the n-dimensional feature vector.
In the multistage logistic regression model, the following logistic regression function is preferably used:
σ(t) = 1 / (1 + e^{-t})
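The two formulas above can be sketched in a few lines of Python, taking σ to be the standard logistic function. The weights and feature values are hypothetical, and the branch on the sign of t is only a numerical-stability detail of this sketch.

```python
import math
from typing import Sequence


def sigmoid(t: float) -> float:
    """Logistic function 1 / (1 + e^(-t)), written to avoid overflow for large |t|."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    z = math.exp(t)
    return z / (1.0 + z)


def predict_click_probability(w: Sequence[float], x: Sequence[float]) -> float:
    """Return sigma(w^T x) for an n-dimensional weight vector w and feature vector x."""
    if len(w) != len(x):
        raise ValueError("w and x must have the same dimension")
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))


# Hypothetical weights and features, chosen only for illustration.
w = [0.8, -1.2, 0.5]
x = [1.0, 0.0, 1.0]
p = predict_click_probability(w, x)  # sigma(0.8 + 0.5) = sigma(1.3)
```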
The multistage logistic regression machine learning in the model training step comprises:
Self logistic regression calculation step: perform a self logistic regression calculation on the N-dimensional feature vectors in the feature model, obtaining the regression value of each dimension's feature vector;
where the size of N is determined by the characteristics of the specific data itself;
Intermediate logistic regression calculation step: select M of the first-stage regression values and perform the intermediate-stage calculation, where M<N;
This step may be performed multiple times as needed, each time taking the output of the previous stage as the input to the logistic regression of the next stage. Each logistic regression operation reduces the data dimension and thus the amount of computation required by the next operation.
Final logistic regression calculation step: take the intermediate values from the intermediate-stage regression as the input to the last-stage regression, finally obtaining the predicted click rate.
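The staged procedure above can be sketched as follows. This is one possible reading under stated assumptions, not the patented implementation: the text does not say how the M first-stage values are selected or how the intermediate stage groups them, so the sketch keeps the M values with the lowest training log loss and splits them into K interleaved groups (K, the function names, the gradient-descent settings, and the synthetic data are all hypothetical).

```python
import math
import random
from typing import Callable, List, Sequence

Vector = List[float]


def sigmoid(t: float) -> float:
    """Numerically stable logistic function 1 / (1 + e^(-t))."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    z = math.exp(t)
    return z / (1.0 + z)


def predict_lr(w: Vector, x: Sequence[float]) -> float:
    """sigma(w^T x + b); the bias b is stored as the last entry of w."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + w[-1])


def fit_lr(rows: Sequence[Sequence[float]], labels: Sequence[int],
           epochs: int = 200, step: float = 0.5) -> Vector:
    """Fit a logistic regression by plain batch gradient descent."""
    n = len(rows[0])
    w = [0.0] * (n + 1)
    for _ in range(epochs):
        grad = [0.0] * (n + 1)
        for x, y in zip(rows, labels):
            err = predict_lr(w, x) - y
            for j in range(n):
                grad[j] += err * x[j]
            grad[n] += err
        w = [wj - step * gj / len(rows) for wj, gj in zip(w, grad)]
    return w


def log_loss(probs: Sequence[float], labels: Sequence[int]) -> float:
    """Mean negative log-likelihood, clipped to avoid log(0)."""
    eps = 1e-12
    return -sum(y * math.log(max(p, eps)) + (1 - y) * math.log(max(1.0 - p, eps))
                for p, y in zip(probs, labels)) / len(labels)


def fit_stage(rows, labels, groups) -> List[Vector]:
    """Fit one regression per group of input columns and return the models."""
    return [fit_lr([[r[j] for j in g] for r in rows], labels) for g in groups]


def run_stage(models, groups, row) -> Vector:
    """Map one input row to one regression value per group."""
    return [predict_lr(w, [row[j] for j in g]) for w, g in zip(models, groups)]


def train_multistage(rows, labels, m: int, k: int) -> Callable[[Sequence[float]], float]:
    n = len(rows[0])
    if not 0 < k <= m < n:
        raise ValueError("need 0 < k <= m < n")
    # Stage 1: a separate regression on each of the N dimensions -> N values.
    g1 = [[j] for j in range(n)]
    s1 = fit_stage(rows, labels, g1)
    out1 = [run_stage(s1, g1, r) for r in rows]
    # Selection rule (assumption): keep the M values with the lowest log loss.
    keep = sorted(range(n), key=lambda j: log_loss([o[j] for o in out1], labels))[:m]
    sel = [[o[j] for j in keep] for o in out1]
    # Intermediate stage (assumption): K regressions over interleaved groups.
    g2 = [list(range(i, m, k)) for i in range(k)]
    s2 = fit_stage(sel, labels, g2)
    out2 = [run_stage(s2, g2, r) for r in sel]
    # Final stage: one regression over the K intermediate values.
    final = fit_lr(out2, labels)

    def predict(x: Sequence[float]) -> float:
        v1 = run_stage(s1, g1, x)
        v2 = run_stage(s2, g2, [v1[j] for j in keep])
        return predict_lr(final, v2)

    return predict


# Synthetic data (assumption): 6 features, only feature 0 drives clicks.
rng = random.Random(0)
X = [[rng.random() for _ in range(6)] for _ in range(200)]
y = [1 if row[0] > 0.5 else 0 for row in X]
predict = train_multistage(X, y, m=4, k=2)
p_low = predict([0.1, 0.5, 0.5, 0.5, 0.5, 0.5])
p_high = predict([0.9, 0.5, 0.5, 0.5, 0.5, 0.5])
```

In this sketch the input width shrinks from N = 6 to M = 4 to K = 2 before the final regression, which mirrors the claim that each stage hands a lower-dimensional input to the next.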
Embodiment 2
The click rate prediction system based on multistage logistic regression of this embodiment mainly comprises the following:
Feature extraction device: for analyzing the acquired click rate data to identify the factors that influence click rate, selecting feature vectors from those factors, and building a feature model;
Model training device: for using a multistage logistic regression model to perform multistage logistic regression machine learning on the feature model, obtaining a prediction model; and
Click rate prediction device: for using the prediction model to predict the click rate data to be predicted.
Many factors affect click rate; the most important are the ad, the media, and the audience. The present invention preferably builds the click rate feature model as follows:
μ(a,u,c) = p(click|a,u,c)
where a denotes the ad, u denotes the audience, and c denotes the media.
In the model training device, the following multistage logistic regression model is preferably used:
p(click|a,u,c) = σ(w^T x)
where w denotes the n-dimensional feature weight vector (the parameters), w^T its transpose, and x the n-dimensional feature vector.
In the multistage logistic regression model, the following logistic regression function is preferably used:
σ(t) = 1 / (1 + e^{-t})
The model training device further comprises a multistage logistic regression machine learning device, which comprises:
Self logistic regression calculation device: for performing a self logistic regression calculation on the N-dimensional feature vectors in the feature model, obtaining the regression value of each dimension's feature vector;
where the size of N is determined by the characteristics of the specific data itself;
Intermediate logistic regression calculation device: for selecting M of the first-stage regression values and performing the intermediate-stage calculation on the selected M values, where M<N;
This calculation may be performed multiple times as needed, each time taking the output of the previous stage as the input to the logistic regression of the next stage. Each logistic regression operation reduces the data dimension and thus the amount of computation required by the next operation.
Final logistic regression calculation device: for taking the intermediate values from the intermediate-stage regression as the input to the last-stage regression, finally obtaining the predicted click rate.
Although embodiments of the present invention are disclosed above, they are described only to aid understanding of the invention and are not intended to limit it. Any person skilled in the art to which the invention belongs may make modifications and changes in the form and details of implementation without departing from the spirit and scope disclosed by the invention, but the scope of patent protection of the invention remains defined by the appended claims.

Claims (8)

1. A click rate prediction method based on multistage logistic regression, characterized in that the method comprises the following steps:
Feature extraction step: analyze the acquired click rate data to identify the factors that influence click rate, select feature vectors from those factors, and build a feature model;
Model training step: use a multistage logistic regression model to perform multistage logistic regression machine learning on the feature model, obtaining a prediction model; and
Click rate prediction step: use the prediction model to predict the click rate data to be predicted.
2. The prediction method as claimed in claim 1, characterized in that the feature model in said feature extraction step is μ(a,u,c) = p(click|a,u,c), where a denotes the ad, u denotes the audience, c denotes the media, and p() is the multistage logistic regression model with p(click|a,u,c) = σ(w^T x), where w denotes the n-dimensional feature weight vector, x denotes the n-dimensional feature vector, and σ() is the logistic regression function with
σ(t) = 1 / (1 + e^{-t})
3. The prediction method as claimed in claim 1, characterized in that the multistage logistic regression machine learning in said feature extraction step comprises:
Self logistic regression calculation step: perform a self logistic regression calculation on the N-dimensional feature vectors in the feature model, obtaining the regression value of each dimension's feature vector;
Intermediate logistic regression calculation step: select M of the first-stage regression values and perform the intermediate-stage calculation, where M<N; and
Final logistic regression calculation step: take the intermediate values from the intermediate-stage regression as the input to the last-stage regression, finally obtaining the predicted click rate.
4. The prediction method as claimed in claim 5, characterized in that the intermediate-stage calculation in said intermediate logistic regression calculation step may be performed multiple times as needed, each time taking the output of the previous stage as the input to the logistic regression of the next stage.
5. A click rate prediction system based on multistage logistic regression, characterized in that the system comprises the following devices:
Feature extraction device: for analyzing the acquired click rate data to identify the factors that influence click rate, selecting feature vectors from those factors, and building a feature model;
Model training device: for using a multistage logistic regression model to perform multistage logistic regression machine learning on the feature model, obtaining a prediction model; and
Click rate prediction device: for using the prediction model to predict the click rate data to be predicted.
6. The prediction system as claimed in claim 5, characterized in that said feature model is μ(a,u,c) = p(click|a,u,c), where a denotes the ad, u denotes the audience, c denotes the media, and p() is the multistage logistic regression model with p(click|a,u,c) = σ(w^T x), where w denotes the n-dimensional feature weight vector, x denotes the n-dimensional feature vector, and σ() is the logistic regression function with σ(t) = 1 / (1 + e^{-t})
7. The prediction system as claimed in claim 5, characterized in that said model training device comprises a multistage logistic regression machine learning device, which comprises:
Self logistic regression calculation device: for performing a self logistic regression calculation on the N-dimensional feature vectors in the feature model, obtaining the regression value of each dimension's feature vector;
Intermediate logistic regression calculation device: for selecting M of the first-stage regression values and performing the intermediate-stage calculation, where M<N;
Final logistic regression calculation device: for taking the intermediate values from the intermediate-stage regression as the input to the last-stage regression, finally obtaining the predicted click rate.
8. The prediction system as claimed in claim 7, characterized in that said intermediate logistic regression calculation device may perform its calculation multiple times as needed, each time taking the output of the previous stage as the input to the logistic regression of the next stage.

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201410001103.XA | 2014-01-02 | 2014-01-02 | Click rate predicting method and system based on multistage logistic regression

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN201410001103.XA | 2014-01-02 | 2014-01-02 | Click rate predicting method and system based on multistage logistic regression

Publications (1)

Publication Number | Publication Date
CN103761266A (en) | 2014-04-30

Family

ID=50528503

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN201410001103.XA (Pending, CN103761266A (en)) | Click rate predicting method and system based on multistage logistic regression | 2014-01-02 | 2014-01-02

Country Status (1)

Country | Link
CN (1) | CN103761266A (en)

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN106688215A (en) * | 2014-06-27 | 2017-05-17 | 谷歌公司 | Automated click type selection for content performance optimization
CN106688215B (en) * | 2014-06-27 | 2020-07-14 | 谷歌有限责任公司 | Automatic click type selection for content performance optimization
CN106688215B8 (en) * | 2014-06-27 | 2020-08-18 | 谷歌有限责任公司 | Automatic click type selection for content performance optimization
CN105446802A (en) * | 2014-08-13 | 2016-03-30 | 阿里巴巴集团控股有限公司 | Operation execution method and device based on conversion rate
CN104268644A (en) * | 2014-09-23 | 2015-01-07 | 新浪网技术(中国)有限公司 | Method and device for predicting click frequency of advertisement at advertising position
CN104536983A (en) * | 2014-12-08 | 2015-04-22 | 北京掌阔技术有限公司 | Method and device for predicting advertisement click rate
CN105023170A (en) * | 2015-06-26 | 2015-11-04 | 深圳市腾讯计算机系统有限公司 | Processing method and device of click stream data
CN105808762A (en) * | 2016-03-18 | 2016-07-27 | 北京百度网讯科技有限公司 | Resource sequencing method and device
CN105824806A (en) * | 2016-06-13 | 2016-08-03 | 腾讯科技(深圳)有限公司 | Quality evaluation method and device for public accounts
CN105824806B (en) * | 2016-06-13 | 2018-10-23 | 腾讯科技(深圳)有限公司 | A kind of quality evaluating method and device of public's account
CN109840782A (en) * | 2017-11-24 | 2019-06-04 | 腾讯科技(深圳)有限公司 | Clicking rate prediction technique, device, server and storage medium
CN111339433B (en) * | 2020-05-21 | 2020-08-21 | 腾讯科技(深圳)有限公司 | Information recommendation method and device based on artificial intelligence and electronic equipment

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C53 Correction of patent for invention or patent application
CB03 Change of inventor or designer information

Inventor after: Cui Jingjing

Inventor after: Lin Jiajie

Inventor after: Li Chunhua

Inventor after: Shou Chunbai

Inventor before: Cui Jingjing

Inventor before: Lin Jiajie

Inventor before: Li Chunhua

Inventor before: Shou Chunbai

Inventor before: Liu Lina

COR Change of bibliographic data

Free format text: CORRECT: INVENTOR; FROM: CUI JINGJING LIN JIAJIE LI CHUNHUA SHOU CHUNBAI LIU LINA TO: CUI JINGJING LIN JIAJIE LI CHUNHUA SHOU CHUNBAI

RJ01 Rejection of invention patent application after publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20140430