CN106096629B - An ad click-through rate prediction method based on similarity relations between users - Google Patents

An ad click-through rate prediction method based on similarity relations between users

Info

Publication number
CN106096629B
Authority
CN
China
Prior art keywords
user
node
advertisement
state
users
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610380746.9A
Other languages
Chinese (zh)
Other versions
CN106096629A (en
Inventor
徐小龙
刘欣欣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CNLIVE Corp.
Original Assignee
Nanjing Post and Telecommunication University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Post and Telecommunication University
Priority to CN201610380746.9A
Publication of CN106096629A
Application granted
Publication of CN106096629B
Active legal status
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06F18/24155 Bayesian classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241 Advertisements

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Data Mining & Analysis (AREA)
  • Finance (AREA)
  • Strategic Management (AREA)
  • General Physics & Mathematics (AREA)
  • Game Theory and Decision Science (AREA)
  • Artificial Intelligence (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Business, Economics & Management (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present invention relates to an ad click-through rate prediction method based on similarity relations between users. Using data extracted from ad click logs, the method constructs the structure and parameters of a Bayesian network model, analyzes the similarity relations between users, and from these predicts each user's click-through rate for an ad, so that ads can ultimately be delivered accurately. Because the prediction rests on a Bayesian network model, it is comparatively accurate and its results are not without basis; redundant edges are eliminated when the model is created, which improves the reliability and validity of the model. In addition, while the model is being built, Bayesian network inference can be carried out by several methods to obtain indirect similar users, which gives the method considerable flexibility and choice. The method achieves a good ad click-through rate prediction effect.

Description

An ad click-through rate prediction method based on similarity relations between users
Technical field
The present invention relates to an ad click-through rate prediction method based on similarity relations between users, and belongs to the technical field of online ad delivery.
Background art
Advertising is both a means of conveying information to society and one of the important revenue sources of many companies. With the continued development of Internet advertising, and driven by the large profits involved, how to increase the revenue from ad delivery has become a research hotspot. Predicting the ad click-through rate makes it possible to judge effectively how likely a user is to click on an ad, so that ads can be targeted to that user and the revenue from the delivered ads improved. At present, ad delivery generally falls into two kinds: content-based delivery and targeting technology.
Content-based ad delivery follows a content-matching strategy: delivery centers on the search terms the user enters or on the content of the web page being browsed, the ad content is matched against the search terms or page content, and the matching ads are delivered. This mode matches ad content only and does not make accurate personalized recommendations for different users. Different users who search for the same term or browse the same page may see the same ads, but those ads are not necessarily what interests them, so this delivery mode performs poorly.
Targeting technology is a popular technique in ad delivery. It uses historical data to describe user characteristics and then delivers accurate ads to users according to those characteristics, which improves the user experience considerably; for this reason, most ad delivery today relies on targeting technology.
In practice, however, the number of ads is huge. Many users have no ad click records, or have clicked on very few ads, so the click records available for them in the historical data are sparse. Delivering ads directly from historical data then fails to identify accurately what the user is interested in, and the delivery effect is greatly diminished. Ad delivery in the prior art therefore often has difficulty predicting user interest well, and so cannot deliver the ads users are interested in very accurately.
Summary of the invention
The technical problem to be solved by the invention is to provide an ad click-through rate prediction method based on similarity relations between users that, by analyzing those similarity relations, accurately predicts a user's click-through rate for an ad and enables accurate ad delivery.
To solve the above technical problem, the invention adopts the following technical solution: an ad click-through rate prediction method based on similarity relations between users, comprising the following steps:
Step 001. According to the ad click logs on the server, for each user, obtain all of the user's search keywords within a preset screening period and the user's click-through rate, within that period, for each ad shown to the user; then proceed to step 002;
Step 002. For all users, obtain the similarity value on search keywords within the preset screening period between every pair of users; select the pairs of users whose similarity value is greater than a preset similarity threshold, each such pair forming a group of two users with a direct similarity relation; obtain the dependency relation between the two users of each group, and determine the direct similar users of each user from these dependency relations; then proceed to step 003;
Step 003. Build a Bayesian network model from the groups of two users with a direct similarity relation and the dependency relation between the two users of each group, where each user is represented by a user node and the dependency relation between the two users of a group is represented by a directed arrow between their user nodes; then proceed to step 004;
Step 004. For each user node in the Bayesian network model: if the user node has user parent nodes, obtain the posterior probability that the user node is in the ad-click state under each combination of ad-click and non-click states of its user parent nodes; that is, treating its user parent nodes as its direct similar users, obtain the posterior probability of the user node's ad-click state under each combination of ad-click and non-click states of those direct similar users. If the user node has no user parent node, obtain the probabilities of the user node's ad-click state and non-click state. Then proceed to step 005;
Step 005. According to the structure of the Bayesian network model and the ad-click and non-click probabilities of each user node that has no user parent node, for each user node in the model, obtain the posterior probability that the user node is in the ad-click state given that each other user node indirectly associated with it is in the ad-click state; select the posterior probabilities greater than a preset probability threshold, the two indirectly associated user nodes corresponding to each selected probability forming a group of two users with an indirect similarity relation; then proceed to step 006;
Step 006. Obtain the direct similar users and the indirect similar users of the target prediction user. Then obtain the posterior probability that the target prediction user is in the ad-click state under each combination of ad-click and non-click states of its direct similar users, and the posterior probability that the target prediction user is in the ad-click state given that each indirect similar user is in the ad-click state. That is, referring to the direct and indirect similar users of the target prediction user collectively as similar users, obtain, for each similar user, the corresponding posterior probability of the target prediction user's ad-click state; then proceed to step 007;
Step 007. Using each user's click-through rate, within the preset screening period, for each ad shown to that user, multiply the posterior probability of the target prediction user's ad-click state with respect to each similar user by that similar user's click-through rate for the target ad; sum the products and multiply the sum by a normalization factor. The resulting value is the predicted click-through rate of the target prediction user for the target ad.
As a preferred technical solution of the invention, step 001 specifically comprises the following steps:
Step 001-1. According to the ad click logs on the server, for each user, obtain all of the user's search keywords within the preset screening period, the number of times each ad was shown to the user within that period, and the number of times the user clicked on each ad shown to the user within that period; then proceed to step 001-2;
Step 001-2. For each user, from the number of times each ad was shown to the user within the preset screening period and the number of times the user clicked on each such ad within that period, obtain the user's click-through rate for each ad shown to the user within the period; then proceed to step 002.
As a preferred technical solution of the invention, step 002 specifically comprises the following steps:
Step 002-1. For all users, form all pairs of users. For each pair, from each user's search keywords within the preset screening period, obtain the ratio of the number of search keywords the two users have in common within the period to the number of all search keywords of the two users within the period, and take this ratio as the pair's similarity value on search keywords within the period. This yields the similarity value for every pair of users; then proceed to step 002-2;
Step 002-2. Select the pairs of users whose similarity value is greater than the preset similarity threshold, each such pair forming a group of two users with a direct similarity relation; then proceed to step 002-3;
Step 002-3. For the two users A and B of each group with a direct similarity relation, make the following determination within the preset screening period; then proceed to step 003;
If the ratio of the number of search keywords common to users A and B to the number of all search keywords of user A is greater than the ratio of that number to the number of all search keywords of user B, the dependency relation between A and B points from user A to user B: user A is a user parent node of user B, user B is a user child node of user A, and user A is a direct similar user of user B;
If the ratio of the number of search keywords common to users A and B to the number of all search keywords of user B is greater than the ratio of that number to the number of all search keywords of user A, the dependency relation between A and B points from user B to user A: user B is a user parent node of user A, user A is a user child node of user B, and user B is a direct similar user of user A;
If the ratio of the number of search keywords common to users A and B to the number of all search keywords of user B equals the ratio of that number to the number of all search keywords of user A, check whether a dependency relation already exists between users A and B. If so, do nothing further for A and B; otherwise, set the direction of the dependency relation between A and B at random.
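The pairing, thresholding, and direction rules of step 002 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name build_edges and the dictionary layout are hypothetical, and "the number of all search keywords of the two users" is read here as the size of the union of the two keyword sets, which the text does not state explicitly.

```python
from itertools import combinations
import random


def build_edges(keywords, threshold, rng=None):
    """Return directed (parent, child) edges between directly similar users.

    keywords:  dict mapping a user id to the set of that user's search keywords
    threshold: preset similarity threshold
    """
    rng = rng or random.Random(0)  # fixed seed only so the sketch is repeatable
    edges = set()
    for a, b in combinations(sorted(keywords), 2):
        common = keywords[a] & keywords[b]
        if not common:
            continue  # no shared keywords, so no direct similarity
        # Assumption: "all search keywords of the two users" means the union.
        similarity = len(common) / len(keywords[a] | keywords[b])
        if similarity <= threshold:
            continue
        ratio_a = len(common) / len(keywords[a])
        ratio_b = len(common) / len(keywords[b])
        if ratio_a > ratio_b:
            edges.add((a, b))  # A is the user parent node of B
        elif ratio_b > ratio_a:
            edges.add((b, a))  # B is the user parent node of A
        elif (a, b) not in edges and (b, a) not in edges:
            edges.add(rng.choice([(a, b), (b, a)]))  # tie: random direction
    return edges
```

Because the two ratios share a numerator, the user with fewer search keywords becomes the parent whenever the keyword counts differ.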
As a preferred technical solution of the invention, step 004 specifically comprises the following steps:
For each user node in the Bayesian network model, take the user node as the current user node and perform the following steps:
Step 004-1. Determine whether the current user node has a user parent node; if so, proceed to step 004-2; otherwise proceed to step 004-5;
Step 004-2. Obtain the number N of user parent nodes of the current user node. Each user node has two states, ad-click and non-click; define the ad-click state as 1 and the non-click state as 0. Combining the states of the current user node with the states of all its user parent nodes gives 2^(N+1) states; then proceed to step 004-3;
Step 004-3. For each state formed by combining the current user node with all its user parent nodes, and for each user node in that state: if the user node's state is 1, take all of that user node's search keywords within the preset screening period as its temporary keywords; if the user node's state is 0, take the difference, within the preset screening period, between all search keywords of all user nodes in the Bayesian network model and all search keywords of that user node as its temporary keywords; then proceed to step 004-4;
Step 004-4. For each state formed by combining the current user node with all its user parent nodes, obtain the ratio of the number of keywords in the intersection of the temporary keywords of all user nodes in the state to the number of keywords in the intersection of the temporary keywords of all user parent nodes of the current user node in the state, and take this ratio as the posterior probability of the current user node in that state. This yields the posterior probability of the current user node's ad-click state under each combination of ad-click and non-click states of its user parent nodes; that is, treating its user parent nodes as its direct similar users, the posterior probability of the current user node's ad-click state under each combination of states of those direct similar users;
Step 004-5. For the ad-click state of the current user node, take the ratio, within the preset screening period, of the number of all search keywords of the current user node to the number of all search keywords of all user nodes in the Bayesian network model as the probability of the current user node's ad-click state. For the non-click state, take the ratio of the number of search keywords in the difference between all search keywords of all user nodes in the model and all search keywords of the current user node to the number of all search keywords of all user nodes in the model as the probability of the current user node's non-click state. This yields the probabilities of the current user node's ad-click and non-click states.
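The keyword-set arithmetic of steps 004-3 to 004-5 can be sketched in Python as follows. This is an illustrative sketch only: the function names and the dictionary layout are hypothetical, and returning 0.0 when the parents' temporary keywords have an empty intersection is an assumption, since the text does not say how that case is handled.

```python
def temp_keywords(user, state, keywords, all_keywords):
    """Temporary keywords of a node: its own keywords if state is 1,
    otherwise every keyword in the network that the user did not search."""
    return keywords[user] if state == 1 else all_keywords - keywords[user]


def click_posterior(node, parent_states, keywords):
    """P(node = 1 | parent states), following steps 004-3 and 004-4.

    parent_states: non-empty dict mapping each parent id to its state (0 or 1)
    keywords:      dict mapping every user id to its set of search keywords
    """
    all_keywords = set().union(*keywords.values())
    parent_sets = [temp_keywords(p, s, keywords, all_keywords)
                   for p, s in parent_states.items()]
    denominator = set.intersection(*parent_sets)
    if not denominator:
        return 0.0  # assumption: the text does not cover an empty intersection
    numerator = denominator & keywords[node]  # node itself is in state 1
    return len(numerator) / len(denominator)


def click_prior(node, keywords):
    """P(node = 1) for a node without parents, following step 004-5."""
    all_keywords = set().union(*keywords.values())
    return len(keywords[node]) / len(all_keywords)
```

Calling click_posterior once for each of the 2^N parent state combinations fills in the conditional probability table of a node; the non-click probability in each row is one minus the returned value.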
As a preferred technical solution of the invention, in step 005 the Gibbs sampling method is used: according to the structure of the Bayesian network model and the ad-click and non-click probabilities of each user node that has no user parent node, for each user node in the model, obtain the posterior probability that the user node is in the ad-click state given that each other user node indirectly associated with it is in the ad-click state.
As a preferred technical solution of the invention, in step 005, using the Gibbs sampling method, for each user node in the Bayesian network model, take the user node as the current user node and obtain the posterior probability that the current user node is in the ad-click state given that each other user node indirectly associated with it is in the ad-click state, specifically through the following steps:
Step 005-1. Among the other user nodes indirectly associated with the current user node, take the user node in the ad-click state as the evidence variable e, the current user node as the target variable t, and the user nodes in the Bayesian network model other than the evidence variable and the target variable as non-evidence variables q. For each user node in the Bayesian network model, take the user node's user parent nodes, its user child nodes, and the user parent nodes of its user child nodes as its Markov blanket; then proceed to step 005-2;
Step 005-2. Initialize the states of all user nodes as the first sample: assign state 1 to the evidence variable and assign state 0 or 1 at random to each non-evidence variable; then proceed to step 005-3;
Step 005-3. Loop over the non-evidence variables. For each non-evidence variable q, using the posterior probability calculation of step 004, compute the posterior probabilities that q is 0 and that q is 1, conditioned on the states of the user nodes in its Markov blanket; then proceed to step 005-4;
Step 005-4. Generate a random number between 0 and the sum of the conditional probabilities that the current non-evidence variable q is 0 and that it is 1. If the random number is less than or equal to the conditional probability that q is 0, set the state of q to 0; if the random number is greater than that probability and less than the sum of the two conditional probabilities, set the state of q to 1. Updating the state of each non-evidence variable in this way gives a new sample; then proceed to step 005-5;
Step 005-5. Repeat steps 005-2 to 005-4, sampling continually to generate new samples. Count the number n of samples in which the target variable's state is 1, and compute the ratio of n to the number of sampling rounds s. This ratio is the posterior probability that the current user node is itself in state 1 given that a certain user node is known to be in state 1.
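The sampling loop of steps 005-1 to 005-5 can be sketched in Python as a generic Gibbs sampler over a binary Bayesian network. This is a hedged sketch rather than the patented procedure: the function name, the data layout, and the burn-in phase are assumptions added here; the conditional probability tables are passed in instead of being computed from keywords; the target variable is resampled together with the other non-evidence variables; and the chain is initialized once rather than at every repetition.

```python
import random


def gibbs_posterior(parents, cpt, target, evidence,
                    n_samples=5000, burn_in=500, seed=0):
    """Estimate P(target = 1 | evidence) by Gibbs sampling.

    parents:  dict node -> tuple of parent nodes (every node is a key)
    cpt:      dict node -> dict mapping a tuple of parent states
              to P(node = 1 | those parent states)
    evidence: dict node -> fixed state, e.g. {"A": 1}
    """
    rng = random.Random(seed)
    nodes = list(parents)
    children = {n: [c for c in nodes if n in parents[c]] for n in nodes}
    state = {n: rng.randint(0, 1) for n in nodes}  # random initial sample
    state.update(evidence)                         # evidence stays fixed
    free = [n for n in nodes if n not in evidence]

    def p_node(n, value):
        p1 = cpt[n][tuple(state[p] for p in parents[n])]
        return p1 if value == 1 else 1.0 - p1

    def weight(n, value):
        # Unnormalized P(n = value | Markov blanket of n).
        old = state[n]
        state[n] = value
        w = p_node(n, value)
        for c in children[n]:
            w *= p_node(c, state[c])
        state[n] = old
        return w

    hits = 0
    for i in range(burn_in + n_samples):
        for n in free:
            w0, w1 = weight(n, 0), weight(n, 1)
            r = rng.uniform(0.0, w0 + w1)  # random number in [0, p0 + p1]
            state[n] = 0 if r <= w0 else 1
        if i >= burn_in:                   # assumption: discard early samples
            hits += state[target]
    return hits / n_samples
```

For a chain A → B → C, calling gibbs_posterior with target "C" and evidence {"A": 1} estimates how likely C is to click given that the indirectly associated user A clicked; the estimate approaches the exact posterior as the number of samples grows.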
As a preferred technical solution of the invention, the normalization factor in step 007 is obtained as follows:
First, within the preset screening period, obtain the sum of the click-through rates of the target prediction user's similar users for each ad shown to them; then compute the ratio of 1 to that sum, which is the normalization factor.
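The weighted sum of step 007 and the normalization factor described above can be sketched in Python as follows. The function name and data layout are hypothetical, and the sketch takes one possible reading of the normalization: the click-through rates are summed over every ad shown to every similar user. Returning 0.0 when that sum is zero is also an assumption.

```python
def predict_ctr(posteriors, ctr, target_ad):
    """Predicted click-through rate of the target user for target_ad.

    posteriors: dict similar user -> posterior probability that the target
                user clicks, given that similar user
    ctr:        dict (user, ad) -> click-through rate of that user for that ad
    """
    # Normalization factor: 1 / (sum of the similar users' click-through rates).
    total = sum(rate for (user, _ad), rate in ctr.items() if user in posteriors)
    if total == 0:
        return 0.0  # assumption: no click data means no predicted clicks
    weighted = sum(p * ctr.get((user, target_ad), 0.0)
                   for user, p in posteriors.items())
    return weighted / total
```

A similar user who was never shown the target ad contributes nothing to the weighted sum, but that user's click-through rates for other ads still enter the normalization under this reading.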
Compared with the prior art, the ad click-through rate prediction method based on similarity relations between users according to the invention, by adopting the above technical solution, has the following technical effects. The method constructs the structure and parameters of a Bayesian network model from data extracted from ad click logs, analyzes the similarity relations between users, and from these predicts users' click-through rates for ads, ultimately enabling accurate ad delivery. Because the prediction rests on a Bayesian network model, it is comparatively accurate and its results are not without basis; redundant edges are eliminated when the model is created, which improves the reliability and validity of the model. Moreover, while the model is being built, Bayesian network inference can be carried out by several methods to obtain indirect similar users, which gives the method considerable flexibility and choice. As a result, while matching ad content to search content, the method also takes full account of users' interests and points of attention, combines the strengths of both ad delivery approaches, avoids the one-sidedness of a single delivery mode, and achieves a good prediction effect.
Brief description of the drawings
Fig. 1 is an architecture diagram of the ad click-through rate prediction method based on similarity relations between users designed by the invention;
Fig. 2 is a flow chart of the construction of the Bayesian network model in the ad click-through rate prediction method based on similarity relations between users designed by the invention;
Fig. 3 is a flow chart of constructing indirect similar-user relations with the Gibbs sampling method in the ad click-through rate prediction method based on similarity relations between users designed by the invention.
Detailed description of embodiments
Specific embodiments of the invention are described in further detail below with reference to the accompanying drawings.
The problem addressed by the invention is that, in the prior art, when a user has no click records or only a few, ads the user is interested in cannot be delivered very accurately, and ad delivery is not efficient enough. The ad click-through rate prediction method based on similarity relations between users uses the similarity of users' search behavior to establish direct similarity dependencies between users with a Bayesian network, and then, from these direct similarity relations, infers indirect similarity dependencies between users using Bayesian network inference. A given user's click-through rate for a given ad can thus be predicted. For a keyword searched by a user, all ads that might be delivered can be matched, ranked by their predicted click-through rates, and then delivered to the user in a targeted way according to the prediction. This effectively increases the revenue of the ad delivery provider and the ad delivery effect, and addresses the inefficiency of current ad delivery.
As shown in Fig. 1, in practical application the ad click-through rate prediction method based on similarity relations between users designed by the invention specifically comprises the following steps:
Step 001. According to the ad click logs on the server, for each user, obtain all of the user's search keywords within a preset screening period and the user's click-through rate, within that period, for each ad shown to the user; then proceed to step 002.
Step 001 specifically comprises the following steps:
Step 001-1. From the ad click logs on the server, filter out five fields: the user identifier, the ad identifier, the description of the user's search keywords, the number of times the ad was shown, and the number of times it was clicked. From these, for each user, obtain all of the user's search keywords within the preset screening period, the number of times each ad was shown to the user within that period, and the number of times the user clicked on each ad shown to the user within that period; then proceed to step 001-2.
Step 001-2. For each user, from the number of times each ad was shown to the user within the preset screening period and the number of times the user clicked on each such ad within that period, obtain the user's click-through rate for each ad shown to the user within the period; then proceed to step 002.
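Steps 001-1 and 001-2 amount to a simple aggregation over the five log fields, which can be sketched in Python as follows. The record layout (one tuple per log line, already restricted to the preset screening period) and the function name are assumptions for illustration; the actual log format is not given in the text.

```python
from collections import defaultdict


def click_rates(log_records):
    """Aggregate log records into per-user keywords and click-through rates.

    log_records: iterable of (user_id, ad_id, keyword, impressions, clicks)
                 tuples, assumed to fall inside the preset screening period
    Returns (keywords, ctr), where keywords maps a user to a set of search
    keywords and ctr maps (user, ad) to clicks divided by impressions.
    """
    keywords = defaultdict(set)
    shown = defaultdict(int)
    clicked = defaultdict(int)
    for user, ad, keyword, impressions, clicks in log_records:
        keywords[user].add(keyword)
        shown[(user, ad)] += impressions
        clicked[(user, ad)] += clicks
    # An ad that was never shown to a user has no defined click-through rate.
    ctr = {key: clicked[key] / shown[key] for key in shown if shown[key] > 0}
    return dict(keywords), ctr
```

The two returned dictionaries are the inputs that the later steps rely on: the keyword sets drive the similarity and probability calculations, and the click-through rates enter the final prediction.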
Step 002. is directed to all users, obtains and screens in the period on search key between all two two users default Similarity value, then choose corresponding two two users of each similarity value institute for being greater than default similarity threshold, respectively structure There are two users of direct similarity relation at each group, and obtain the dependence between two users of each group, according to each group Dependence between two users, determines the direct similar users of each user, and saves, subsequently into step 003.
Wherein, as shown in Fig. 2, step 002 specifically comprises the following steps:
Step 002-1. For all users, form every pair of users. For each pair, based on each user's search keywords within the preset screening period, compute the number of search keywords common to the two users as a proportion of the number of all search keywords of the two users within that period, and take this proportion as the pair's search-keyword similarity value for the preset screening period. After obtaining the similarity value for every pair of users in this way, proceed to step 002-2.
Step 002-2. Select each pair of users whose similarity value exceeds the preset similarity threshold; each such pair constitutes a group of two users with a direct similarity relation. Proceed to step 002-3.
Step 002-3. For the two users A and B in each group with a direct similarity relation, make the following determination within the preset screening period, then proceed to step 003.
If the number of search keywords common to users A and B, as a proportion of the number of all search keywords of user A, is greater than the same number as a proportion of the number of all search keywords of user B, then the dependency between A and B points from user A to user B: user A is the user parent node of user B, user B is a user child node of user A, and user A is a direct similar user of user B.
If the number of search keywords common to users A and B, as a proportion of the number of all search keywords of user B, is greater than the same number as a proportion of the number of all search keywords of user A, then the dependency between A and B points from user B to user A: user B is the user parent node of user A, user A is a user child node of user B, and user B is a direct similar user of user A.
If the number of search keywords common to users A and B, as a proportion of the number of all search keywords of user B, is equal to the same number as a proportion of the number of all search keywords of user A, check whether a dependency already exists between users A and B. If so, no further operation is performed for A and B; otherwise, a dependency between A and B is set at random.
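Steps 002-1 to 002-3 can be illustrated with a minimal Python sketch. It is not the patent's implementation: the function names, the sample data, and the reading of "all search keywords of the two users" as the set union of their keywords are assumptions made for illustration only.

```python
import random
from itertools import combinations


def similarity(kw_a, kw_b):
    """Shared keywords as a fraction of the two users' combined keywords.

    Assumption: "all search keywords of the two users" is read as the set union.
    """
    union = kw_a | kw_b
    return len(kw_a & kw_b) / len(union) if union else 0.0


def direct_edges(keywords, threshold, rng=None):
    """Return (parent, child) pairs for user pairs above the similarity threshold."""
    rng = rng or random.Random()
    edges = []
    for a, b in combinations(sorted(keywords), 2):
        kw_a, kw_b = keywords[a], keywords[b]
        if not kw_a or not kw_b or similarity(kw_a, kw_b) <= threshold:
            continue  # no direct similarity relation for this pair
        common = len(kw_a & kw_b)
        share_a = common / len(kw_a)  # common keywords as a share of A's keywords
        share_b = common / len(kw_b)  # common keywords as a share of B's keywords
        if share_a > share_b:
            edges.append((a, b))      # A points to B: A is B's parent node
        elif share_b > share_a:
            edges.append((b, a))      # B points to A: B is A's parent node
        else:
            # Tie: the text sets the direction at random when none exists yet.
            edges.append(rng.choice([(a, b), (b, a)]))
    return edges


# Hypothetical keyword sets for three users.
keywords = {
    "A": {"x", "y"},
    "B": {"x", "y", "z", "w"},
    "C": {"q"},
}
print(direct_edges(keywords, threshold=0.3))
```

The sketch does not check that the resulting graph is acyclic, which a Bayesian network requires; a real implementation would need to handle that separately.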
Step 003. Build a Bayesian network model from the groups of two users with a direct similarity relation and the dependency between the two users in each group. Each user is represented by a user node, and the dependency between two users with a direct similarity relation is represented by a directed arrow between their user nodes. Proceed to step 004.
Step 004. For each user node in the Bayesian network model: if the user node has user parent nodes, obtain the posterior probability that the user node is in the clicked-advertisement state under each combination of its parent nodes' two states (clicked the advertisement, did not click the advertisement). That is, treating the node's user parent nodes as its direct similar users, obtain the posterior probability that the user node clicks the advertisement under each combination of click and no-click states of those direct similar users, and save it. If the user node has no user parent node, obtain the probabilities of the node's clicked-advertisement state and not-clicked state, and save them. Proceed to step 005.
Step 004 specifically comprises the following steps:
For each user node in the Bayesian network model, take the user node as the current user node and perform the following steps:
Step 004-1. Determine whether the current user node has a user parent node. If so, proceed to step 004-2; otherwise, proceed to step 004-5.
Step 004-2. Obtain the number N of user parent nodes of the current user node. Each user node has two states, clicked the advertisement and did not click the advertisement; define the clicked state as 1 and the not-clicked state as 0. Combining the states of the current user node with those of all its user parent nodes yields 2^(N+1) states, each represented by N+1 binary digits (0 or 1). Proceed to step 004-3.
Step 004-3. For each state formed by combining the current user node and all its user parent nodes, and for each user node in that state: if the user node's state is 1, take all of that user node's search keywords within the preset screening period as its temporary keywords; if the user node's state is 0, take the set difference, within the preset screening period, between all search keywords of all user nodes in the Bayesian network model and all search keywords of that user node as its temporary keywords. Proceed to step 004-4.
Step 004-4. For each state formed by combining the current user node and all its user parent nodes, compute the ratio of the number of keywords in the intersection of the temporary keywords of all user nodes in the state to the number of keywords in the intersection of the temporary keywords of all user parent nodes of the current user node, and take this ratio as the posterior probability of the current user node in that state. This yields the posterior probability that the current user node clicks the advertisement under each combination of click and no-click states of its user parent nodes, that is, of its direct similar users; save it.
Step 004-5. For the clicked-advertisement state of the current user node, take, within the preset screening period, the ratio of the number of all search keywords of the current user node to the number of all search keywords of all user nodes in the Bayesian network model as the probability of that state. For the not-clicked state, take, within the preset screening period, the ratio of the number of search keywords in the set difference between all search keywords of all user nodes in the Bayesian network model and all search keywords of the current user node to the number of all search keywords of all user nodes in the model as the probability of that state. Save both probabilities.
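The keyword-set arithmetic of steps 004-1 to 004-5 can be sketched in Python as follows. This is an illustrative reading, not the patent's code: the function names and the table layout are hypothetical, and keyword counts are assumed to be counts of distinct keywords.

```python
from itertools import product


def temp_keywords(node, state, keywords, all_kw):
    """Step 004-3: own keywords for state 1, the complement for state 0."""
    return keywords[node] if state == 1 else all_kw - keywords[node]


def conditional_table(node, parents, keywords):
    """Map each tuple of parent states to {state of node: probability}."""
    all_kw = set().union(*keywords.values())
    if not parents:
        # Step 004-5: probabilities for a node without parent nodes.
        p_click = len(keywords[node]) / len(all_kw)
        p_no_click = len(all_kw - keywords[node]) / len(all_kw)
        return {(): {1: p_click, 0: p_no_click}}
    table = {}
    # Steps 004-2 to 004-4: enumerate the 2^N parent-state combinations and,
    # for each, both states of the node (2^(N+1) states in total).
    for parent_states in product((0, 1), repeat=len(parents)):
        parent_sets = [
            temp_keywords(p, s, keywords, all_kw)
            for p, s in zip(parents, parent_states)
        ]
        parent_common = set.intersection(*parent_sets)
        row = {}
        for state in (0, 1):
            joint = parent_common & temp_keywords(node, state, keywords, all_kw)
            # Assumption: an empty parent intersection gives probability 0.
            row[state] = len(joint) / len(parent_common) if parent_common else 0.0
        table[parent_states] = row
    return table


# Hypothetical keyword sets; A is the parent node of B.
keywords = {
    "A": {"x", "y"},
    "B": {"x", "y", "z", "w"},
    "C": {"q"},
}
prior_a = conditional_table("A", [], keywords)
cpt_b = conditional_table("B", ["A"], keywords)
print(prior_a)
print(cpt_b)
```

With these sample sets, the two probabilities in each row sum to 1 because the temporary keywords for states 1 and 0 partition the full keyword set.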
Step 005. Based on the structure of the Bayesian network model and on the clicked and not-clicked state probabilities of each user node that has no user parent node, obtain, for each user node in the model, the posterior probability that the user node clicks the advertisement given that each other user node indirectly associated with it is in the clicked-advertisement state. Select the pairs of indirectly associated user nodes whose posterior probability exceeds the preset probability threshold; each such pair constitutes a group of two users with an indirect similarity relation. Save these groups and proceed to step 006.
As shown in Fig. 3, step 005 uses Gibbs sampling. For each user node in the Bayesian network model, the user node is taken as the current user node, and the posterior probability that the current user node clicks the advertisement, given that each other user node indirectly associated with it is in the clicked-advertisement state, is obtained and saved. This specifically comprises the following steps:
Step 005-1. Among the other user nodes indirectly associated with the current user node, take the user node in the clicked-advertisement state as the evidence variable e, take the current user node as the target variable t, and take all user nodes in the Bayesian network model other than the evidence variable and the target variable as non-evidence variables q. For each user node in the Bayesian network model, take its user parent nodes, its user child nodes, and the user parent nodes of its user child nodes as the Markov blanket of that user node. Proceed to step 005-2.
Step 005-2. Initialize the states of all user nodes as the first sample: assign the evidence variable the state 1, and assign each non-evidence variable the state 0 or 1 at random. Proceed to step 005-3.
Step 005-3. Loop over the non-evidence variables. For each non-evidence variable q, use the posterior probabilities computed in step 004 to calculate the posterior probabilities that q is 0 and that q is 1, conditioned on the current states of the user nodes in its Markov blanket. Proceed to step 005-4.
Step 005-4. Generate a random number between 0 and the sum of the conditional probabilities that the current non-evidence variable q is 0 and is 1. If the random number is less than or equal to the conditional probability that q is 0, set the state of q to 0; if the random number is greater than the conditional probability that q is 0 and less than the sum of the two conditional probabilities, set the state of q to 1. Updating the state of each non-evidence variable in this way yields a new sample. Proceed to step 005-5.
Step 005-5. Repeat steps 005-2 through 005-4, sampling continuously to generate new samples. Count the number n of samples in which the state of the target variable is 1, and compute the ratio of n to the number of samples drawn, s. This ratio is the posterior probability that the current user node is itself in state 1 given that a particular user node is known to be in state 1.
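The sampling loop of steps 005-1 to 005-5 can be sketched as a standard Gibbs sampler. The sketch below rests on several assumptions that are not in the text: the names `gibbs_posterior`, `parents`, and `cpt` are hypothetical; the target variable is resampled together with the other non-evidence variables; the state is initialized once rather than on every repetition; and a burn-in phase is added. The conditional tables map each tuple of parent states to the probabilities of the node's states 0 and 1.

```python
import random


def gibbs_posterior(target, evidence, parents, cpt,
                    n_samples=20000, burn_in=1000, seed=0):
    """Estimate P(target = 1 | every evidence node = 1) by Gibbs sampling."""
    rng = random.Random(seed)
    nodes = list(parents)
    children = {n: [c for c in nodes if n in parents[c]] for n in nodes}
    # Step 005-2: evidence nodes are fixed to 1, all others start at random.
    state = {n: 1 if n in evidence else rng.randint(0, 1) for n in nodes}
    free = [n for n in nodes if n not in evidence]

    def local(n):
        """P(n = current state | current states of n's parents)."""
        key = tuple(state[p] for p in parents[n])
        return cpt[n][key][state[n]]

    hits = 0
    for i in range(burn_in + n_samples):
        for q in free:
            # Step 005-3: unnormalized weight of each state of q given its
            # Markov blanket (parents, children, and the children's parents).
            weights = []
            for value in (0, 1):
                state[q] = value
                w = local(q)
                for child in children[q]:
                    w *= local(child)
                weights.append(w)
            total = weights[0] + weights[1]
            if total == 0:
                state[q] = rng.randint(0, 1)  # degenerate case: pick at random
                continue
            # Step 005-4: draw in [0, total) and choose the state accordingly.
            state[q] = 0 if rng.random() * total < weights[0] else 1
        if i >= burn_in:
            hits += state[target]  # step 005-5: count samples with target = 1
    return hits / n_samples


# Hypothetical chain A -> B -> C with made-up probabilities.
parents = {"A": [], "B": ["A"], "C": ["B"]}
cpt = {
    "A": {(): {1: 0.3, 0: 0.7}},
    "B": {(1,): {1: 0.6, 0: 0.4}, (0,): {1: 0.4, 0: 0.6}},
    "C": {(1,): {1: 0.7, 0: 0.3}, (0,): {1: 0.3, 0: 0.7}},
}
estimate = gibbs_posterior("C", {"A"}, parents, cpt)
print(estimate)
```

For this chain the exact value is 0.6 × 0.7 + 0.4 × 0.3 = 0.54, so the printed estimate should land close to that figure.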
Step 006. Obtain the direct similar users and the indirect similar users of the target prediction user. Then obtain the posterior probability that the target prediction user clicks the advertisement under each combination of click and no-click states of its direct similar users, and the posterior probability that the target prediction user clicks the advertisement given that each indirect similar user is in the clicked-advertisement state. That is, referring to the direct and indirect similar users of the target prediction user collectively as similar users, obtain for each similar user the corresponding posterior probability that the target prediction user clicks the advertisement, and save it. Proceed to step 007.
Step 007. Using each user's click-through rate, within the preset screening period, for each advertisement shown to that user, compute for each similar user of the target prediction user the product of the corresponding posterior probability that the target prediction user clicks the advertisement and that similar user's click-through rate for the target advertisement. Sum these products and multiply the sum by a normalization factor; the result is the predicted click-through rate of the target prediction user for the target advertisement. The normalization factor is obtained as follows: within the preset screening period, sum the click-through rates of the similar users of the target prediction user over the advertisements shown to them, then take the ratio of 1 to this sum.
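The weighted sum of step 007 can be sketched in Python as follows. The function name, the data layout, and the sample numbers are hypothetical, and the sketch assumes a single posterior probability per similar user, which simplifies the per-combination posteriors described for direct similar users.

```python
def predict_ctr(posteriors, ctr, target_ad):
    """Predicted click-through rate of the target user for target_ad.

    posteriors: similar user -> posterior probability that the target user clicks
    ctr: similar user -> {advertisement: click-through rate}
    """
    # Sum of posterior probability times the similar user's CTR for the target ad.
    weighted = sum(
        posteriors[user] * ctr[user].get(target_ad, 0.0) for user in posteriors
    )
    # Normalization factor: 1 over the sum of all similar users' CTRs.
    total_ctr = sum(sum(ctr[user].values()) for user in posteriors)
    if total_ctr == 0:
        return 0.0  # assumption: no click history means a prediction of 0
    return weighted / total_ctr


# Hypothetical posteriors and click-through rates for two similar users.
posteriors = {"u1": 0.8, "u2": 0.5}
ctr = {
    "u1": {"ad1": 0.2, "ad2": 0.1},
    "u2": {"ad1": 0.4},
}
prediction = predict_ctr(posteriors, ctr, "ad1")
print(prediction)
```

Here the weighted sum is 0.8 × 0.2 + 0.5 × 0.4 = 0.36 and the click-through rates total 0.7, so the prediction is 0.36 / 0.7.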
Based on the above technical solution, advertisements matching a user's interests can be delivered accurately according to the obtained predicted click-through rate of the user for an advertisement, or according to the user's search keywords, thereby improving the effectiveness of advertisement delivery.
The advertisement click-through rate prediction method based on similarity relations between users designed in the above technical solution extracts data from advertisement click logs, constructs the structure and parameters of a Bayesian network model, analyzes the similarity relations between users, and on that basis predicts users' click-through rates for advertisements, ultimately achieving accurate advertisement delivery. The Bayesian network model is built with relatively high accuracy, so the results do not lack a basis, and redundant edges are eliminated when the model is created, which enhances the reliability and validity of the model. Moreover, Bayesian network inference can be carried out by a variety of methods to obtain indirect similar users, which gives the method relatively high flexibility and selectivity. The designed method therefore matches advertisement content to search content while fully considering the user's interests and points of attention, combines the advantages of the two advertisement delivery approaches, avoids the one-sidedness of a single delivery mode, and achieves a good prediction effect.
The embodiments of the present invention have been explained in detail above with reference to the accompanying drawings, but the present invention is not limited to the above embodiments; within the knowledge of a person of ordinary skill in the art, various changes may also be made without departing from the purpose of the present invention.

Claims (7)

1. An advertisement click-through rate prediction method based on similarity relations between users, characterized by comprising the following steps:
Step 001. According to the advertisement click logs on the server, obtain, for each user, all of the user's search keywords within the preset screening period and the user's click-through rate within that period for each advertisement shown to the user; then proceed to step 002;
Step 002. For all users, obtain the search-keyword similarity value between every pair of users within the preset screening period; select each pair of users whose similarity value exceeds the preset similarity threshold, each such pair constituting a group of two users with a direct similarity relation; obtain the dependency between the two users in each group, and determine the direct similar users of each user from these dependencies; then proceed to step 003;
Step 003. Build a Bayesian network model from the groups of two users with a direct similarity relation and the dependency between the two users in each group, wherein each user is represented by a user node and the dependency between two users with a direct similarity relation is represented by a directed arrow between their user nodes; then proceed to step 004;
Step 004. For each user node in the Bayesian network model: if the user node has user parent nodes, obtain the posterior probability that the user node is in the clicked-advertisement state under each combination of its parent nodes' two states (clicked the advertisement, did not click the advertisement), that is, treating the node's user parent nodes as its direct similar users, obtain the posterior probability that the user node clicks the advertisement under each combination of click and no-click states of those direct similar users; if the user node has no user parent node, obtain the probabilities of the node's clicked-advertisement state and not-clicked state; then proceed to step 005;
Step 005. Based on the structure of the Bayesian network model and on the clicked and not-clicked state probabilities of each user node that has no user parent node, obtain, for each user node in the model, the posterior probability that the user node clicks the advertisement given that each other user node indirectly associated with it is in the clicked-advertisement state; select the pairs of indirectly associated user nodes whose posterior probability exceeds the preset probability threshold, each such pair constituting a group of two users with an indirect similarity relation; then proceed to step 006;
Step 006. Obtain the direct similar users and the indirect similar users of the target prediction user; then obtain the posterior probability that the target prediction user clicks the advertisement under each combination of click and no-click states of its direct similar users, and the posterior probability that the target prediction user clicks the advertisement given that each indirect similar user is in the clicked-advertisement state; that is, referring to the direct and indirect similar users of the target prediction user collectively as similar users, obtain for each similar user the corresponding posterior probability that the target prediction user clicks the advertisement; then proceed to step 007;
Step 007. Using each user's click-through rate, within the preset screening period, for each advertisement shown to that user, compute for each similar user of the target prediction user the product of the corresponding posterior probability that the target prediction user clicks the advertisement and that similar user's click-through rate for the target advertisement; sum these products and multiply the sum by a normalization factor, the result being the predicted click-through rate of the target prediction user for the target advertisement.
2. The advertisement click-through rate prediction method based on similarity relations between users according to claim 1, characterized in that step 001 specifically comprises the following steps:
Step 001-1. According to the advertisement click logs on the server, obtain, for each user, all of the user's search keywords within the preset screening period, the number of times each advertisement was shown to the user within that period, and the number of times the user clicked each advertisement shown to the user within that period; then proceed to step 001-2;
Step 001-2. For each user, from the number of times each advertisement was shown to the user within the preset screening period and the number of times the user clicked each such advertisement within that period, obtain the user's click-through rate within that period for each advertisement shown to the user; then proceed to step 002.
3. The advertisement click-through rate prediction method based on similarity relations between users according to claim 1, characterized in that step 002 specifically comprises the following steps:
Step 002-1. For all users, form every pair of users; for each pair, based on each user's search keywords within the preset screening period, compute the number of search keywords common to the two users as a proportion of the number of all search keywords of the two users within that period, and take this proportion as the pair's search-keyword similarity value for the preset screening period; after obtaining the similarity value for every pair of users in this way, proceed to step 002-2;
Step 002-2. Select each pair of users whose similarity value exceeds the preset similarity threshold, each such pair constituting a group of two users with a direct similarity relation; then proceed to step 002-3;
Step 002-3. For the two users A and B in each group with a direct similarity relation, make the following determination within the preset screening period, then proceed to step 003;
If the number of search keywords common to users A and B, as a proportion of the number of all search keywords of user A, is greater than the same number as a proportion of the number of all search keywords of user B, then the dependency between A and B points from user A to user B: user A is the user parent node of user B, user B is a user child node of user A, and user A is a direct similar user of user B;
If the number of search keywords common to users A and B, as a proportion of the number of all search keywords of user B, is greater than the same number as a proportion of the number of all search keywords of user A, then the dependency between A and B points from user B to user A: user B is the user parent node of user A, user A is a user child node of user B, and user B is a direct similar user of user A;
If the number of search keywords common to users A and B, as a proportion of the number of all search keywords of user B, is equal to the same number as a proportion of the number of all search keywords of user A, check whether a dependency already exists between users A and B; if so, no further operation is performed for A and B; otherwise, a dependency between A and B is set at random.
4. The advertisement click-through rate prediction method based on similarity relations between users according to claim 1, characterized in that step 004 specifically comprises the following steps:
For each user node in the Bayesian network model, take the user node as the current user node and perform the following steps:
Step 004-1. Determine whether the current user node has a user parent node; if so, proceed to step 004-2; otherwise, proceed to step 004-5;
Step 004-2. Obtain the number N of user parent nodes of the current user node; each user node having two states, clicked the advertisement and did not click the advertisement, define the clicked state as 1 and the not-clicked state as 0, and thereby obtain the 2^(N+1) states formed by combining the states of the current user node with those of all its user parent nodes; then proceed to step 004-3;
Step 004-3. For each state formed by combining the current user node and all its user parent nodes, and for each user node in that state: if the user node's state is 1, take all of that user node's search keywords within the preset screening period as its temporary keywords; if the user node's state is 0, take the set difference, within the preset screening period, between all search keywords of all user nodes in the Bayesian network model and all search keywords of that user node as its temporary keywords; then proceed to step 004-4;
Step 004-4. For each state formed by combining the current user node and all its user parent nodes, compute the ratio of the number of keywords in the intersection of the temporary keywords of all user nodes in the state to the number of keywords in the intersection of the temporary keywords of all user parent nodes of the current user node, and take this ratio as the posterior probability of the current user node in that state; this yields the posterior probability that the current user node clicks the advertisement under each combination of click and no-click states of its user parent nodes, that is, of its direct similar users;
Step 004-5. For the clicked-advertisement state of the current user node, take, within the preset screening period, the ratio of the number of all search keywords of the current user node to the number of all search keywords of all user nodes in the Bayesian network model as the probability of that state; for the not-clicked state, take, within the preset screening period, the ratio of the number of search keywords in the set difference between all search keywords of all user nodes in the Bayesian network model and all search keywords of the current user node to the number of all search keywords of all user nodes in the model as the probability of that state; the probabilities of the clicked and not-clicked states of the current user node are thereby obtained.
5. The advertisement click-through rate prediction method based on similarity relations between users according to claim 1, characterized in that in step 005, Gibbs sampling is used: based on the structure of the Bayesian network model and on the clicked and not-clicked state probabilities of each user node that has no user parent node, the posterior probability that each user node in the Bayesian network model clicks the advertisement, given that each other user node indirectly associated with it is in the clicked-advertisement state, is obtained.
6. The advertisement click-through rate prediction method based on similarity relations between users according to claim 5, characterized in that in step 005, Gibbs sampling is used: for each user node in the Bayesian network model, the user node is taken as the current user node, and the posterior probability that the current user node clicks the advertisement, given that each other user node indirectly associated with it is in the clicked-advertisement state, is obtained, specifically comprising the following steps:
Step 005-1. Among the other user nodes indirectly associated with the current user node, take the user node in the clicked-advertisement state as the evidence variable e, take the current user node as the target variable t, and take all user nodes in the Bayesian network model other than the evidence variable and the target variable as non-evidence variables q; for each user node in the Bayesian network model, take its user parent nodes, its user child nodes, and the user parent nodes of its user child nodes as the Markov blanket of that user node; then proceed to step 005-2;
Step 005-2. Initialize the states of all user nodes as the first sample: assign the evidence variable the state 1, and assign each non-evidence variable the state 0 or 1 at random; then proceed to step 005-3;
Step 005-3. Loop over the non-evidence variables; for each non-evidence variable q, use the posterior probabilities computed in step 004 to calculate the posterior probabilities that q is 0 and that q is 1, conditioned on the current states of the user nodes in its Markov blanket; then proceed to step 005-4;
Step 005-4. Generate a random number between 0 and the sum of the conditional probabilities that the current non-evidence variable q is 0 and is 1; if the random number is less than or equal to the conditional probability that q is 0, set the state of q to 0; if the random number is greater than the conditional probability that q is 0 and less than the sum of the two conditional probabilities, set the state of q to 1; updating the state of each non-evidence variable in this way yields a new sample; then proceed to step 005-5;
Step 005-5. Repeat steps 005-2 through 005-4, sampling continuously to generate new samples; count the number n of samples in which the state of the target variable is 1, and compute the ratio of n to the number of samples drawn, s, this ratio being the posterior probability that the current user node is itself in state 1 given that a particular user node is known to be in state 1.
7. The advertisement click-through rate prediction method based on similarity relations between users according to claim 1, characterized in that the normalization factor in step 007 is obtained as follows:
Within the preset screening period, sum the click-through rates of the similar users of the target prediction user over the advertisements shown to them, then take the ratio of 1 to this sum as the normalization factor.
CN201610380746.9A 2016-06-01 2016-06-01 A kind of ad click rate prediction technique based on similarity relation between user Active CN106096629B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610380746.9A CN106096629B (en) 2016-06-01 2016-06-01 A kind of ad click rate prediction technique based on similarity relation between user

Publications (2)

Publication Number Publication Date
CN106096629A CN106096629A (en) 2016-11-09
CN106096629B true CN106096629B (en) 2019-07-16

Family

ID=57229983

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610380746.9A Active CN106096629B (en) 2016-06-01 2016-06-01 A kind of ad click rate prediction technique based on similarity relation between user

Country Status (1)

Country Link
CN (1) CN106096629B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106875208B (en) * 2016-12-27 2020-12-04 网易传媒科技(北京)有限公司 Method and device for determining display position of remarkable advertisement
CN107464141B (en) * 2017-08-07 2021-09-07 北京京东尚科信息技术有限公司 Method and device for information popularization, electronic equipment and computer readable medium
CN107516247A (en) * 2017-08-28 2017-12-26 天脉聚源(北京)科技有限公司 A kind of method and device for predicting advertisement played data
CN107613022B (en) * 2017-10-20 2020-10-16 阿里巴巴(中国)有限公司 Content pushing method and device and computer equipment
CN108062684B (en) * 2017-12-12 2021-01-22 北京奇艺世纪科技有限公司 Method and device for predicting click rate of advertisement
CN109146551A (en) * 2018-07-26 2019-01-04 深圳市元征科技股份有限公司 A kind of advertisement recommended method, server and computer-readable medium
CN109784537B (en) * 2018-12-14 2019-12-06 北京达佳互联信息技术有限公司 advertisement click rate estimation method and device, server and storage medium
CN111242239B (en) * 2020-01-21 2023-05-30 腾讯科技(深圳)有限公司 Training sample selection method, training sample selection device and computer storage medium
CN117408750B (en) * 2023-12-12 2024-03-19 广州宇中网络科技有限公司 Network advertisement delivery method based on big data analysis

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105160548A (en) * 2015-08-20 2015-12-16 北京奇虎科技有限公司 Method and apparatus for predicting advertisement click-through rate

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
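Web-Scale Bayesian Click-Through Rate Prediction for Sponsored Search Advertising in Microsoft's Bing Search Engine; Thore Graepel, et al.; International Conference on Machine Learning; 20090430; pp. 1-8
Internet advertisement click-through rate prediction based on probabilistic graphical models; Yue Kun et al.; Journal of East China Normal University (Natural Science Edition); 20130531; Vol. 2013, No. 3; pp. 15-25
Internet advertisement click-through rate prediction based on probabilistic graphical models; Fang Zhipeng; China Master's Theses Full-text Database, Information Science and Technology; 20150915; Vol. 2015, No. 09; I140-64
Advertisement click-through rate prediction method based on Bayesian networks and its implementation; Wang Chaolu; China Master's Theses Full-text Database, Information Science and Technology; 20140115; Vol. 2013, No. 01; I140-89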

Also Published As

Publication number Publication date
CN106096629A (en) 2016-11-09

Similar Documents

Publication Publication Date Title
CN106096629B (en) A kind of ad click rate prediction technique based on similarity relation between user
WO2020147594A1 (en) Method, system, and device for obtaining expression of relationship between entities, and advertisement retrieval system
US9391789B2 (en) Method and system for multi-level distribution information cache management in a mobile environment
AU2010266611B2 (en) Gathering information about connections in a social networking service
CN111444395B (en) Method, system and equipment for obtaining relation expression between entities and advertisement recall system
Strong Humanizing big data: Marketing at the meeting of data, social science and consumer insight
WO2011116129A2 (en) Systems and methods for interacting with messages, authors, and followers
US20080228537A1 (en) Systems and methods for targeting advertisements to users of social-networking and other web 2.0 websites and applications
US20080288347A1 (en) Advertising keyword selection based on real-time data
CN106789598B (en) Social relation chain-based public number message pushing method, device and system
WO2011112319A2 (en) Emotional targeting
US20110223571A1 (en) Emotional web
CN111382361A (en) Information pushing method and device, storage medium and computer equipment
US20160104074A1 (en) Recommending Bidded Terms
CN105447193A (en) Music recommending system based on machine learning and collaborative filtering
KR102322668B1 (en) Systme for providing multi-platform service for stimulating creative activity of contents creator
France et al. Characterizing viral videos: Methodology and applications
CN113742567A (en) Multimedia resource recommendation method and device, electronic equipment and storage medium
Demirel An examination of a campaign hashtag (# OptOutside) with google trends and twitter
CN109299368B (en) Method and system for intelligent and personalized recommendation of environmental information resources AI
Liu et al. Sequential heterogeneous attribute embedding for item recommendation
CN116049530A (en) Recall method, device, computer equipment and storage medium for popularization information
Li et al. An effective deep learning approach for personalized advertisement service recommend
CN110347923B (en) Traceable fast fission type user portrait construction method
CN112785328A (en) Content pushing method and device and computer storage medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: No. 66, New Model Road, Gulou District, Nanjing City, Jiangsu Province, 210000

Applicant after: Nanjing Post & Telecommunication Univ.

Address before: 210023 9 Wen Yuan Road, Qixia District, Nanjing, Jiangsu.

Applicant before: Nanjing Post & Telecommunication Univ.

GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20200422

Address after: 328, floor 3, building 7-10, No.88 Jianguo Road, Chaoyang District, Beijing 100020

Patentee after: CNLIVE Corp.

Address before: No. 66, New Model Road, Gulou District, Nanjing City, Jiangsu Province, 210000

Patentee before: NANJING UNIVERSITY OF POSTS AND TELECOMMUNICATIONS
