WO2015192667A1 - Method for recommending an advertisement and advertisement recommendation server - Google Patents

Method for recommending an advertisement and advertisement recommendation server

Info

Publication number
WO2015192667A1
Authority
WO
WIPO (PCT)
Prior art keywords
advertisements
advertisement
user
kth
click
Prior art date
Application number
PCT/CN2015/072573
Other languages
English (en)
French (fr)
Inventor
涂丹丹
张勇
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2015192667A1
Priority to US15/378,311 (published as US20170091805A1)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241 Advertisements
    • G06Q30/0242 Determining effectiveness of advertisements
    • G06Q30/0244 Optimization
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241 Advertisements
    • G06Q30/0251 Targeted advertisements
    • G06Q30/0255 Targeted advertisements based on user history
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241 Advertisements
    • G06Q30/0277 Online advertisement

Definitions

  • the present invention relates to the field of information processing, and in particular, to a method of recommending an advertisement and an advertisement recommendation server.
  • CBF Content-based Filtering
  • CF Collaborative Filtering
  • In the CBF algorithm, information retrieval or information filtering technology is mainly used, and an advertisement is recommended to the target user according to the relevance between the advertisement and the web page content. That is, an advertisement with a higher relevance to the web page content is considered to have a higher click probability. Therefore, the same advertisements are often recommended to all users on the same web page.
  • However, this algorithm does not take the user's interest into account, so the click probability prediction is not very accurate, and it is difficult to guarantee the click rate of the advertisement.
  • In the user-based CF algorithm, the similarity between users is calculated mainly from the users' historical advertisement click information; the target user's preference for an advertisement is then predicted from how users highly similar to the target user clicked on that advertisement, and advertisements are recommended to the target user according to the degree of preference.
  • In the item-based CF algorithm, the similarity between advertisements is calculated, the set of advertisements closest to the target advertisement is selected, and whether to recommend the target advertisement is determined according to the current user's preference for those closest advertisements. Both CF algorithms use the user's preferences to predict the click probability of an advertisement.
  • the CF algorithm improves the accuracy of the click probability prediction of the advertisement to a certain extent, the click rate of the advertisement can be improved, but the user frequently visits.
  • The advertisements recommended to the user by the CF algorithm are often similar to advertisements the user is already familiar with, and advertisements that the user is unfamiliar with but potentially interested in cannot be found, resulting in a low click rate and a poor user experience.
  • the embodiment of the invention provides a method for recommending an advertisement and an advertisement recommendation server, which can improve the click rate of the advertisement, thereby improving the user experience.
  • A method for recommending an advertisement, including: obtaining webpage access information and advertisement click information from a user Internet access log, wherein the webpage access information indicates n webpages accessed by m users, the advertisement click information indicates x advertisements clicked by the m users on the n webpages, and n, m, and x are positive integers greater than 1; and predicting, according to the webpage access information and the advertisement click information, the click probability of the x advertisements when the i-th user among the m users accesses the jth webpage
  • The determining a novelty factor corresponding to each of the x advertisements includes: determining, according to historical recommendation information, the novelty factor corresponding to each of the x advertisements, where the historical recommendation information indicates the history of each of the x advertisements being recommended to the i-th user.
  • The determining, according to the historical recommendation information, the novelty factor corresponding to each of the x advertisements includes: for the kth advertisement in the x advertisements, if the historical recommendation information indicates that the kth advertisement has not been recommended to the i-th user, determining that the novelty factor corresponding to the kth advertisement is a first value; if the historical recommendation information indicates that the kth advertisement has been recommended to the i-th user in the past, determining that the novelty factor corresponding to the kth advertisement is a second value; wherein the first value is greater than the second value, and k is a positive integer from 1 to x.
  • The determining that the novelty factor corresponding to the kth advertisement is a second value includes: determining that the kth advertisement was recommended to the i-th user q days ago, where q is a positive integer; determining the Ebbinghaus forgetting curve value corresponding to q days; and determining that the novelty factor corresponding to the kth advertisement is the difference between the first value and the Ebbinghaus forgetting curve value.
  • The determining the novelty factor corresponding to each of the x advertisements includes: for the kth advertisement of the x advertisements, determining the similarity between the kth advertisement and each of the other advertisements among the x advertisements; determining, according to those similarities, a similarity ranking corresponding to the kth advertisement and a dissimilarity ranking corresponding to the kth advertisement among the x advertisements; and weighting the similarity ranking and the dissimilarity ranking corresponding to the kth advertisement to obtain the novelty factor corresponding to the kth advertisement; wherein k is a positive integer from 1 to x.
  • The determining the novelty factor corresponding to each of the x advertisements includes: for the kth advertisement of the x advertisements, determining the diversity distance between the kth advertisement and each of the other advertisements among the x advertisements; and determining the novelty factor corresponding to the kth advertisement according to those diversity distances; wherein k is a positive integer from 1 to x.
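  • As a rough illustration of the diversity-distance variant, the sketch below assumes that each advertisement is described by a feature vector and takes the novelty factor of the kth advertisement to be its mean cosine distance to the other advertisements. The feature vectors, the choice of cosine distance, and the averaging are all assumptions made for illustration; the text does not specify how the diversity distance is computed or aggregated.

```python
import numpy as np

def diversity_novelty(ad_features: np.ndarray) -> np.ndarray:
    """Return one novelty factor per advertisement.

    ad_features: (x, d) array with one hypothetical feature vector per ad.
    Novelty of ad k is assumed to be its mean cosine distance to the
    other x - 1 ads.
    """
    norms = np.linalg.norm(ad_features, axis=1, keepdims=True)
    unit = ad_features / np.clip(norms, 1e-12, None)  # avoid division by zero
    distance = 1.0 - unit @ unit.T                    # pairwise cosine distance
    np.fill_diagonal(distance, 0.0)                   # ignore distance to itself
    x = ad_features.shape[0]
    return distance.sum(axis=1) / (x - 1)
```

Under these assumptions, an advertisement whose features differ from all the others receives the largest novelty factor.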
  • The determining, according to the click probabilities of the x advertisements and the novelty factors corresponding to the x advertisements, the p advertisements to be recommended to the i-th user among the x advertisements includes: weighting the click probability and the novelty factor corresponding to each of the x advertisements to determine a score for each advertisement; sorting the x advertisements in order of their scores to obtain the sorted x advertisements; and determining the top p advertisements among the sorted x advertisements as the p advertisements to be recommended to the i-th user.
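  • The weighted-score selection can be sketched as follows. The weight values `w_click` and `w_novelty` are hypothetical (the text only says that the two quantities are weighted), and the sketch assumes that a higher score is better, so the advertisements are sorted in descending order of score.

```python
def select_by_weighted_score(click_prob, novelty, p, w_click=0.7, w_novelty=0.3):
    """Return the indices of the top-p ads by weighted score.

    click_prob[k] and novelty[k] are the click probability and novelty
    factor of ad k. The default weights are illustrative assumptions.
    """
    scores = [w_click * c + w_novelty * n for c, n in zip(click_prob, novelty)]
    # Sort ad indices by score, highest first (assumed ordering).
    ranked = sorted(range(len(scores)), key=lambda k: scores[k], reverse=True)
    return ranked[:p]
```

For example, `select_by_weighted_score([0.9, 0.8, 0.3], [0.1, 1.0, 1.0], p=2)` ranks the second advertisement first, because its high novelty outweighs its slightly lower click probability.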
  • The determining, according to the click probabilities of the x advertisements and the novelty factors corresponding to the x advertisements, the p advertisements to be recommended to the i-th user among the x advertisements includes: sorting the x advertisements in descending order of click probability to obtain the sorted x advertisements; re-sorting the first q advertisements among the sorted x advertisements in descending order of novelty factor to obtain q re-sorted advertisements, where q is a positive integer and q is greater than p; and determining the first p of the q re-sorted advertisements as the p advertisements to be recommended to the i-th user.
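  • The two-stage selection (shortlist by click probability, then re-rank by novelty) can be sketched as below. The function name and the check that q does not exceed the number of advertisements are assumptions added for illustration.

```python
def select_two_stage(click_prob, novelty, p, q):
    """Shortlist the top-q ads by click probability, re-rank the shortlist
    by novelty factor, and return the indices of the first p ads."""
    if not (0 < p < q <= len(click_prob)):
        raise ValueError("expected 0 < p < q <= number of ads")
    by_click = sorted(range(len(click_prob)),
                      key=lambda k: click_prob[k], reverse=True)
    shortlist = by_click[:q]                       # first q ads by click probability
    by_novelty = sorted(shortlist,
                        key=lambda k: novelty[k], reverse=True)
    return by_novelty[:p]                          # first p ads after re-ranking
```

With `click_prob = [0.9, 0.8, 0.7, 0.1]`, `novelty = [0.2, 0.5, 0.9, 1.0]`, `p = 2`, and `q = 3`, the last advertisement is never considered despite having the highest novelty factor, because it does not reach the click-probability shortlist.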
  • The predicting the click probability of the x advertisements includes: generating a user-webpage access matrix, a user-advertisement click matrix, and an advertisement-webpage relevance matrix according to the webpage access information and the advertisement click information, wherein the element in row i and column j of the user-webpage access matrix represents the access record of the i-th user for the jth webpage, the element in row i and column k of the user-advertisement click matrix represents the click record of the i-th user for the kth advertisement, the element in row j and column k of the advertisement-webpage relevance matrix represents the degree of association between the jth webpage and the kth advertisement, and k is a positive integer from 1 to x; performing joint probability matrix decomposition on the user-webpage access matrix, the user-advertisement click matrix, and the advertisement-webpage relevance matrix to obtain the user implicit feature vector of the i-th user, the webpage implicit feature vector of the jth webpage, and the advertisement implicit feature vector of the kth advertisement; and determining, according to these three feature vectors, the click probability of the kth advertisement when the i-th user accesses the jth webpage.
  • An advertisement recommendation server, including: an obtaining unit, configured to obtain webpage access information and advertisement click information from a user Internet access log, where the webpage access information indicates n webpages accessed by m users, the advertisement click information indicates x advertisements clicked by the m users on the n webpages, and n, m, and x are positive integers greater than 1; a prediction unit, configured to predict, according to the webpage access information and the advertisement click information, the click probability of the x advertisements when the i-th user accesses the jth webpage, where i is a positive integer from 1 to m and j is a positive integer from 1 to n; a determining unit, configured to determine a novelty factor corresponding to each of the x advertisements, where the novelty factor corresponding to each advertisement represents the i-th user's degree of awareness of that advertisement; and a selection unit, configured to determine, among the x advertisements, p advertisements to be recommended to the i-th user
  • The determining unit is specifically configured to determine, according to historical recommendation information, the novelty factor corresponding to each of the x advertisements, where the historical recommendation information indicates the history of each of the x advertisements being recommended to the i-th user.
  • The determining unit is specifically configured to: for the kth advertisement in the x advertisements, if the historical recommendation information indicates that the kth advertisement has not been recommended to the i-th user, determine that the novelty factor corresponding to the kth advertisement is a first value; if the historical recommendation information indicates that the kth advertisement has been recommended to the i-th user in the past, determine that the novelty factor corresponding to the kth advertisement is a second value; wherein the first value is greater than the second value, and k is a positive integer from 1 to x.
  • The determining unit is specifically configured to: determine that the kth advertisement was recommended to the i-th user q days ago, where q is a positive integer; determine the Ebbinghaus forgetting curve value corresponding to q days; and determine that the novelty factor corresponding to the kth advertisement is the difference between the first value and the Ebbinghaus forgetting curve value.
  • The determining unit is specifically configured to: for the kth advertisement in the x advertisements, determine the similarity between the kth advertisement and each of the other advertisements among the x advertisements; determine, according to those similarities, a similarity ranking corresponding to the kth advertisement and a dissimilarity ranking corresponding to the kth advertisement among the x advertisements; and weight the similarity ranking and the dissimilarity ranking corresponding to the kth advertisement to obtain the novelty factor corresponding to the kth advertisement; wherein k is a positive integer from 1 to x.
  • The determining unit is specifically configured to: for the kth advertisement in the x advertisements, determine the diversity distance between the kth advertisement and each of the other advertisements among the x advertisements; and determine the novelty factor corresponding to the kth advertisement according to those diversity distances; wherein k is a positive integer from 1 to x.
  • The selecting unit is specifically configured to: weight the click probability and the novelty factor corresponding to each of the x advertisements to determine a score for each advertisement; sort the x advertisements in order of their scores to obtain the sorted x advertisements; and determine the top p advertisements among the sorted x advertisements as the p advertisements to be recommended to the i-th user.
  • The selecting unit is specifically configured to: sort the x advertisements in descending order of click probability to obtain the sorted x advertisements; re-sort the first q advertisements among the sorted x advertisements in descending order of novelty factor to obtain q re-sorted advertisements, where q is a positive integer and q is greater than p; and determine the first p of the q re-sorted advertisements as the p advertisements to be recommended to the i-th user.
  • The predicting unit is configured to: generate a user-webpage access matrix, a user-advertisement click matrix, and an advertisement-webpage relevance matrix according to the webpage access information and the advertisement click information, wherein the element in row i and column j of the user-webpage access matrix represents the access record of the i-th user for the jth webpage, the element in row i and column k of the user-advertisement click matrix represents the click record of the i-th user for the kth advertisement, the element in row j and column k of the advertisement-webpage relevance matrix represents the degree of association between the jth webpage and the kth advertisement, and k is a positive integer from 1 to x; and perform joint probability matrix decomposition on the user-webpage access matrix, the user-advertisement click matrix, and the advertisement-webpage relevance matrix to obtain the user implicit feature vector of the i-th user
  • The click probability of the x advertisements when the i-th user accesses the jth webpage is predicted according to the webpage access information and the advertisement click information, the novelty factor corresponding to each of the x advertisements is determined according to the historical recommendation information, and the p advertisements to be recommended to the i-th user are determined among the x advertisements according to the click probabilities and the novelty factors of the x advertisements, wherein the i-th user's degree of awareness of the p advertisements is lower than the i-th user's degree of awareness of the other advertisements among the x advertisements, and the click probability of the p advertisements is higher than that of the other advertisements among the x advertisements.
  • FIG. 1 is a schematic flow chart of a method of recommending an advertisement according to an embodiment of the present invention.
  • FIG. 2 is a schematic flow chart of a process of a method of recommending an advertisement according to an embodiment of the present invention.
  • FIG. 3 is a schematic diagram of an AdRec model in accordance with an embodiment of the present invention.
  • FIG. 4 is a schematic block diagram of an advertisement recommendation server in accordance with an embodiment of the present invention.
  • FIG. 5 is a schematic block diagram of an advertisement recommendation server according to an embodiment of the present invention.
  • FIG. 6 is a schematic block diagram of an advertisement recommendation system in accordance with an embodiment of the present invention.
  • Embodiments of the present invention can be applied to recommendation scenarios of various objects, such as recommendations of objects such as products, applications, or songs. Therefore, in the embodiment of the present invention, the advertisement may be a carrier of the recommended objects, and the information of the recommended object may be displayed through an advertisement page.
  • the method of the embodiment of the present invention can be performed by an advertisement recommendation server.
  • the advertisement recommendation server can store advertisements published by advertisers, manage advertisements issued by advertisers, and provide advertisement services to users. Specifically, the advertisement recommendation server may collect information such as a user's click record on the advertisement and a user's click record on the webpage, and may recommend an advertisement to the user based on the information.
  • FIG. 1 is a schematic flow chart of a method of recommending an advertisement according to an embodiment of the present invention.
  • the method of Figure 1 can be performed by an advertisement recommendation server.
  • The webpage access information indicates n webpages visited by m users, and the advertisement click information indicates x advertisements clicked by the m users on the n webpages, where n, m, and x are positive integers greater than 1.
  • According to the webpage access information and the advertisement click information, the click probability of the x advertisements when the i-th user among the m users accesses the jth webpage is predicted, where i is a positive integer from 1 to m, and j is a positive integer from 1 to n.
  • The click probability of the p advertisements is higher than the click probability of the advertisements other than the p advertisements among the x advertisements, where p is a positive integer and p ⁇ x.
  • The click probability of the x advertisements when the i-th user accesses the jth webpage is predicted according to the webpage access information and the advertisement click information, the novelty factor corresponding to each of the x advertisements is determined according to the historical recommendation information, and the p advertisements to be recommended to the i-th user are determined among the x advertisements according to the click probabilities and the novelty factors of the x advertisements, wherein the i-th user's degree of awareness of the p advertisements is lower than the i-th user's degree of awareness of the other advertisements among the x advertisements, and the click probability of the p advertisements is higher than that of the other advertisements among the x advertisements.
  • Existing approaches use information of only two dimensions to predict the click probability of an advertisement, such as information about the advertisement and the webpage, or information about the user and the advertisement.
  • As a result, the advertisements recommended to users are often similar to advertisements the users are already familiar with, and advertisements that are unfamiliar but potentially interesting to users are difficult to recommend.
  • The webpage access information indicates n webpages accessed by m users, and the advertisement click information indicates x advertisements clicked by the m users on the n webpages. Therefore, predicting the click probability according to the webpage access information and the advertisement click information means that information of three dimensions (user, webpage, and advertisement) is used to predict the click probability of the x advertisements, which improves the accuracy of the click probability prediction.
  • the novelty factor corresponding to each of the x advertisements is determined.
  • The p advertisements to be recommended to the i-th user are determined by considering both the accuracy of the click probability prediction and the novelty of the advertisement. This not only improves the accuracy of the click probability prediction but also avoids recommending the same type of advertisement to the user for a long time without considering the user's potential interests, thereby improving the click rate of the advertisement and the user experience.
  • the i-th user may be any one of m users
  • the j-th webpage may be any one of n webpages.
  • the above-mentioned x advertisements may be all advertisements or partial advertisements stored in the advertisement recommendation server.
  • A user-webpage access matrix, a user-advertisement click matrix, and an advertisement-webpage relevance matrix may be generated according to the webpage access information and the advertisement click information, where the element in row i and column j of the user-webpage access matrix represents the access record of the i-th user for the jth webpage, the element in row i and column k of the user-advertisement click matrix represents the click record of the i-th user for the kth advertisement, the element in row j and column k of the advertisement-webpage relevance matrix represents the degree of association between the jth webpage and the kth advertisement, and k is a positive integer from 1 to x.
  • Joint probability matrix decomposition may be performed on the user-webpage access matrix, the user-advertisement click matrix, and the advertisement-webpage relevance matrix to obtain the user implicit feature vector of the i-th user, the webpage implicit feature vector of the jth webpage, and the advertisement implicit feature vector of the kth advertisement.
  • The click probability of the kth advertisement when the i-th user accesses the jth webpage may be determined according to the user implicit feature vector of the i-th user, the webpage implicit feature vector of the jth webpage, and the advertisement implicit feature vector of the kth advertisement.
  • The webpages may be classified, and the user-webpage access matrix, the user-advertisement click matrix, and a matrix of advertisement click rates when a webpage and an advertisement appear at the same time may be generated.
  • web pages can be categorized by domain name.
  • the similarity information of the webpage and the advertisement may be extracted from the webpage access information and the advertisement click information.
  • the advertisement-page relevance matrix can be obtained based on the click rate matrix of the advertisement when the webpage and the advertisement appear at the same time, and the similarity information of the webpage and the advertisement.
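  • A minimal sketch of building these matrices from log records is shown below. The record layout `(user, page, ad, clicked)`, the 0-based indices, and the binary encoding of access and click records are hypothetical. The sketch only computes the co-occurrence click-rate matrix; how that matrix is combined with the webpage-advertisement similarity information to form the relevance matrix is not specified in the text and is left out.

```python
import numpy as np

def build_matrices(records, m, n, x):
    """Build input matrices from log records.

    records: iterable of (user, page, ad, clicked) tuples with 0-based
    indices; this layout is an assumption made for illustration.
    """
    access = np.zeros((m, n))    # user-webpage access matrix
    clicks = np.zeros((m, x))    # user-advertisement click matrix
    shown = np.zeros((n, x))     # times ad k was shown on page j
    clicked = np.zeros((n, x))   # times ad k was clicked on page j
    for user, page, ad, was_clicked in records:
        access[user, page] = 1.0
        shown[page, ad] += 1.0
        if was_clicked:
            clicks[user, ad] = 1.0
            clicked[page, ad] += 1.0
    # Click rate of each ad on each page; 0 where the pair never co-occurred.
    click_rate = np.divide(clicked, shown,
                           out=np.zeros_like(clicked), where=shown > 0)
    return access, clicks, click_rate
```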
  • The user-webpage access matrix, the user-advertisement click matrix, and the advertisement-webpage relevance matrix can be decomposed so as to obtain the click probability of each advertisement when the i-th user accesses the jth webpage.
  • The user-webpage access matrix and the user-advertisement click matrix reflect the user's interest, while the advertisement-webpage relevance matrix reflects the correlation between the webpage and the advertisement. It can be seen that, in this embodiment, both the user's interest and the correlation between the webpage and the advertisement are considered when the click probability of each advertisement is predicted. Therefore, the accuracy of the click probability prediction can be improved, thereby helping to ensure the click rate of the advertisement.
  • the user's access data to the webpage and the user's click data on the advertisement are very sparse. This phenomenon can also be called data sparseness. In this case, the accuracy of predicting the click probability of an advertisement using a CBF-based algorithm or a CF algorithm is greatly reduced.
  • The joint probability matrix decomposition algorithm is used to predict the click probability of the advertisement according to the user-webpage access matrix, the user-advertisement click matrix, and the advertisement-webpage relevance matrix. Although each of the three matrices may be a sparse matrix, the click probability is not predicted from only one of them, so the accuracy of the click probability prediction can be ensured even when the data is sparse.
  • A sparse matrix here refers to a matrix in which much of the data in its rows or columns is missing.
  • With maximization of the joint objective function as the goal, the user-webpage access matrix, the user-advertisement click matrix, and the advertisement-webpage relevance matrix may be decomposed based on the gradient descent method to obtain the user implicit feature vector of the i-th user, the webpage implicit feature vector of the jth webpage, and the advertisement implicit feature vector of the kth advertisement.
  • The click probability of the kth advertisement may be predicted according to the user implicit feature vector of the i-th user, the webpage implicit feature vector of the jth webpage, and the advertisement implicit feature vector of the kth advertisement.
  • Maximization of the joint posterior probability is used as the objective function, and, based on the gradient descent method, the user implicit feature vector of the i-th user, the webpage implicit feature vector of the jth webpage, and the advertisement implicit feature vector of the kth advertisement are obtained from the above three matrices.
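  • The joint decomposition can be illustrated with the simplified sketch below, in which the three matrices share the implicit (latent) factors of users, webpages, and advertisements. It assumes Gaussian noise and Gaussian priors, so that maximizing the posterior is equivalent to minimizing a regularized squared error; it treats every matrix entry as observed and uses full-batch gradient descent for brevity. The objective, the hyperparameters, and the function name are assumptions and not the exact formulation of the embodiment.

```python
import numpy as np

def joint_factorize(R, C, W, d=4, lr=0.05, reg=0.01, epochs=500, seed=0):
    """Jointly factorize R (m x n), C (m x x), and W (n x x).

    R ~ U P^T (user-webpage), C ~ U A^T (user-ad), W ~ P A^T (webpage-ad).
    Minimizes the sum of squared errors plus L2 penalties on U, P, and A.
    """
    rng = np.random.default_rng(seed)
    m, n = R.shape
    x = C.shape[1]
    U = 0.1 * rng.standard_normal((m, d))   # user implicit feature vectors
    P = 0.1 * rng.standard_normal((n, d))   # webpage implicit feature vectors
    A = 0.1 * rng.standard_normal((x, d))   # advertisement implicit feature vectors
    for _ in range(epochs):
        E_r = U @ P.T - R                   # residual of each reconstruction
        E_c = U @ A.T - C
        E_w = P @ A.T - W
        grad_U = E_r @ P + E_c @ A + reg * U
        grad_P = E_r.T @ U + E_w @ A + reg * P
        grad_A = E_c.T @ U + E_w.T @ P + reg * A
        U -= lr * grad_U
        P -= lr * grad_P
        A -= lr * grad_A
    return U, P, A
```

Because each factor matrix appears in two of the three reconstructions, information from one matrix constrains the factors used to reconstruct the others, which is the intuition behind using all three matrices when the data is sparse.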
  • According to the user implicit feature vector of the i-th user, the webpage implicit feature vector of the jth webpage, and the advertisement implicit feature vector of the kth advertisement, a first vector, a second vector, and a third vector may be determined respectively. The first vector may indicate the degree of interest of the i-th user in the jth webpage, the second vector may indicate the degree of interest of the i-th user in the kth advertisement, and the third vector may indicate the degree of association between the jth webpage and the kth advertisement.
  • the linear combination of the first vector, the second vector, and the third vector may be mapped to [0, 1], so that the click probability of the kth advertisement when the i th user accesses the jth web page may be obtained.
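  • The mapping to a click probability can be sketched as below. The sketch assumes that each of the three quantities is summarized by an inner product of the corresponding implicit feature vectors and that the logistic function is used to map the linear combination into the unit interval; the weights and the bias are hypothetical parameters, and the text does not state which mapping function is used.

```python
import numpy as np

def click_probability(u_i, p_j, a_k, weights=(1.0, 1.0, 1.0), bias=0.0):
    """Estimate the probability that user i clicks ad k while visiting page j."""
    user_page = float(np.dot(u_i, p_j))   # interest of user i in page j
    user_ad = float(np.dot(u_i, a_k))     # interest of user i in ad k
    page_ad = float(np.dot(p_j, a_k))     # association between page j and ad k
    z = (weights[0] * user_page
         + weights[1] * user_ad
         + weights[2] * page_ad
         + bias)
    return 1.0 / (1.0 + np.exp(-z))       # logistic function maps z into (0, 1)
```

Calling this function once for each of the x advertisements, with the same user and webpage vectors, yields the click probabilities that are later combined with the novelty factors.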
  • the kth advertisement can be any of the x advertisements.
  • For each of the x advertisements, the click probability when the i-th user accesses the jth webpage can be calculated according to the above process. In this way, the click probabilities of all x advertisements when the i-th user accesses the jth webpage are obtained.
  • The complexity of the recommendation algorithm is a factor that needs attention.
  • the overhead of the calculation process mainly comes from the gradient descent method.
  • the complexity of the algorithm increases linearly with the amount of data in the three matrices. Therefore, the present embodiment is suitable for the processing of large-scale data.
  • In step 130, for the kth advertisement in the x advertisements, if the historical recommendation information indicates that the kth advertisement has not been recommended to the i-th user, the novelty factor corresponding to the kth advertisement may be determined to be the first value; if the historical recommendation information indicates that the kth advertisement has been recommended to the i-th user in the past, the novelty factor corresponding to the kth advertisement may be determined to be the second value. The first value is greater than the second value, and k is a positive integer from 1 to x.
  • the kth advertisement may be any one of x advertisements.
  • Each ad can correspond to a novelty factor.
  • the novelty factor corresponding to each advertisement can be used to indicate the novelty of the advertisement for the i-th user.
  • For a given advertisement, the novelty factor when it has not been recommended to the i-th user is greater than the novelty factor when it has already been recommended to the i-th user.
  • The greater the novelty factor corresponding to an advertisement, the greater the novelty of the advertisement for the i-th user; in other words, the i-th user is unfamiliar with the advertisement or has not seen it.
  • Because the novelty factor of an advertisement that has not been recommended to the i-th user is greater than that of an advertisement that has already been recommended, the novelty of the recommended advertisements can be improved, which enhances the user experience.
  • the first value and the second value may be preset, for example, the first value may be preset to 1 and the second value may be preset to 0.5.
  • Alternatively, the second value may be determined based on the historical recommendation information and the Ebbinghaus forgetting curve.
  • In step 130, it may be determined that the kth advertisement was recommended to the i-th user q days ago, where q is a positive integer; the Ebbinghaus forgetting curve value corresponding to q days may then be determined, and the novelty factor corresponding to the kth advertisement may be determined as the difference between the first value and the Ebbinghaus forgetting curve value.
  • For example, the first value can be preset to 1, in which case the second value is 1 minus the Ebbinghaus forgetting curve value.
  • The novelty factor corresponding to the advertisement may thus be determined based on the Ebbinghaus forgetting curve. This can improve the accuracy of the novelty factor, thereby improving the novelty of the advertisements recommended to the user and improving the user experience. It should be noted that determining the novelty factor based on the Ebbinghaus forgetting curve value is only a preferred embodiment of the present invention. It can be understood that the inventive solution can also be implemented by replacing the Ebbinghaus forgetting curve value with another weighting value associated with q.
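  • The forgetting-curve variant above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the source does not give the curve values, so the exponential form `exp(-q / stability)` and the `stability` parameter are assumptions, as are the function names.

```python
import math

def retention(q_days, stability=5.0):
    """ASSUMED forgetting-curve value in (0, 1]: exp(-q / stability).

    The source only refers to "the Ebbinghaus forgetting curve value
    corresponding to q days"; this closed form is a stand-in.
    """
    return math.exp(-q_days / stability)

def novelty_factor(days_since_recommended, first_value=1.0):
    """Novelty factor of one advertisement for one user.

    days_since_recommended is None when the advertisement has never been
    recommended to the user; otherwise it is q, the number of days since
    the recommendation.
    """
    if days_since_recommended is None:
        return first_value  # never recommended: first value
    # Recommended q days ago: first value minus the forgetting-curve value.
    return first_value - retention(days_since_recommended)
```

  • With this choice, an advertisement recommended long ago regains novelty as the user forgets it, while a never-recommended advertisement keeps the first value.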
  • Alternatively, in step 130, for the kth advertisement among the x advertisements, the similarity between the kth advertisement and each of the other advertisements among the x advertisements may be determined.
  • The similarity ranking and the dissimilarity ranking corresponding to the kth advertisement among the x advertisements may be determined according to the similarity between the kth advertisement and each of the other advertisements among the x advertisements.
  • The similarity ranking and the dissimilarity ranking corresponding to the kth advertisement may be weighted to obtain the novelty factor corresponding to the kth advertisement, where k is a positive integer ranging from 1 to x.
  • The novelty factor corresponding to each advertisement may be determined according to Intra-list Similarity, an evaluation metric for recommendation lists.
  • The similarity between every two advertisements can be determined.
  • For example, the similarity between each pair of advertisements can be determined based on a cosine similarity algorithm or a Pearson similarity algorithm.
  • For each advertisement, the similarity between it and the other advertisements can be used to determine the similarity ranking RS and the dissimilarity ranking NRS corresponding to that advertisement among the x advertisements.
  • the similarity ranking and the dissimilarity ranking corresponding to the advertisement may then be weighted to obtain a novelty factor corresponding to the advertisement.
  • For example, the novelty factor of the advertisement = W*RS + (1-W)*NRS, where W is the weight value.
  • This embodiment can improve the accuracy of the novelty factor, thereby improving the novelty of the advertisement recommended to the user and improving the user experience.
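  • The two building blocks named above, a pairwise cosine similarity and the weighted combination W*RS + (1-W)*NRS, can be sketched as follows. The source does not say how RS and NRS are derived from the similarities, so this sketch takes them as given inputs; representing each advertisement as a numeric feature vector is also an assumption.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)

def novelty_from_rankings(rs, nrs, w):
    """Weighted combination W*RS + (1-W)*NRS.

    rs and nrs are the similarity ranking and dissimilarity ranking of one
    advertisement; how they are computed is not specified in the source,
    so they are supplied by the caller.
    """
    return w * rs + (1.0 - w) * nrs
```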
  • Alternatively, in step 130, for the kth advertisement among the x advertisements, the diversity distance between the kth advertisement and each of the other advertisements among the x advertisements may be determined, and the novelty factor corresponding to the kth advertisement may be determined according to these diversity distances, where k is a positive integer ranging from 1 to x.
  • the novelty factor corresponding to each of the x advertisements may be determined based on the principle of recommendation diversity.
  • The diversity distance between every two advertisements can be determined.
  • For example, the Jaccard diversity distance calculation method can be used to obtain the diversity distance between two advertisements.
  • For each advertisement, the diversity distance between it and each of the other advertisements can be calculated.
  • the novelty factor corresponding to the advertisement is determined according to the diversity distance between the advertisement and each of the other advertisements. For example, the diversity distance between the advertisement and each of the other advertisements can be summed to obtain a novelty factor corresponding to the advertisement.
  • This embodiment can improve the accuracy of the novelty factor, thereby improving the novelty of the advertisement recommended to the user and improving the user experience.
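  • The diversity-distance variant can be sketched as follows, using the summation mentioned above. Describing each advertisement by a set of features (for example keywords) is an assumption made for illustration; the source names the Jaccard distance but not the features it is computed on.

```python
def jaccard_distance(a, b):
    """Jaccard distance between two feature sets: 1 - |a & b| / |a | b|."""
    union = a | b
    if not union:
        return 0.0  # two empty sets are treated as identical
    return 1.0 - len(a & b) / len(union)

def diversity_novelty_factors(ad_features):
    """Map each advertisement to the sum of its distances to all other ads.

    ad_features: dict of advertisement id -> set of features (hypothetical
    representation).
    """
    return {
        ad: sum(jaccard_distance(feats, other_feats)
                for other, other_feats in ad_features.items() if other != ad)
        for ad, feats in ad_features.items()
    }

ads = {"a1": {"car", "suv"}, "a2": {"car", "sedan"}, "a3": {"shoes"}}
factors = diversity_novelty_factors(ads)
```

  • In this toy input, `a3` shares no feature with the other two advertisements, so it receives the largest novelty factor.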
  • The click probability and the novelty factor corresponding to each of the x advertisements may be weighted to determine a score corresponding to each of the x advertisements.
  • The x advertisements may be sorted in descending order of score to obtain the sorted x advertisements.
  • the top p advertisements of the sorted x advertisements may be determined as p advertisements to be recommended to the i-th user.
  • the click probability and the novelty factor may be weighted by a weighting algorithm to obtain a score corresponding to each advertisement.
  • a corresponding weight may be assigned to its click probability and novelty factor, and the click probability and novelty factor of the advertisement may be weighted by the assigned weight, thereby obtaining a score corresponding to the advertisement.
  • the x advertisements may be sorted in descending order of the score, and the first p advertisements of the sorted x advertisements are used as advertisements to be recommended to the i-th user. It can be seen that when determining the advertisement to be recommended to the ith user, both the click probability and the novelty factor are considered, so that the click rate of the advertisement can be improved and the user experience can be improved.
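  • The weighted-score selection described above can be sketched as follows. The function name, the dictionary inputs, and the default weight of 0.7 are illustrative assumptions; the two weights are assumed to sum to 1.

```python
def recommend_by_score(click_prob, novelty, p, w_click=0.7):
    """Return the p advertisements with the highest weighted score.

    click_prob, novelty: dicts of advertisement id -> value.
    w_click is the weight of the click probability; the novelty factor
    gets 1 - w_click (assumed convention).
    """
    w_novelty = 1.0 - w_click
    scores = {ad: w_click * click_prob[ad] + w_novelty * novelty[ad]
              for ad in click_prob}
    # Sort in descending order of score and keep the first p.
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:p]

click = {"a": 0.9, "b": 0.5, "c": 0.4}
nov = {"a": 0.1, "b": 1.0, "c": 0.5}
top_two = recommend_by_score(click, nov, p=2, w_click=0.5)
```

  • Here advertisement `b` overtakes `a` despite a lower click probability because its novelty factor is much higher.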
  • Alternatively, the x advertisements may be sorted in descending order of click probability to obtain the sorted x advertisements.
  • The first q advertisements among the sorted x advertisements may then be re-sorted in descending order of novelty factor to obtain the re-sorted q advertisements, where q is a positive integer and q is greater than p.
  • the first p advertisements among the re-sorted q advertisements may be determined as p advertisements to be recommended to the i-th user.
  • an advertisement recommendation list can be obtained based on the funnel-shaped filter weighting method described above.
  • q is preferably 2 times p. It can be seen that when determining the advertisement to be recommended to the ith user, both the click probability and the novelty factor are considered, so that the click rate of the advertisement can be improved and the user experience can be improved.
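  • The funnel-shaped filtering can be sketched as follows. The function name and the dictionary inputs are assumptions; q defaults to 2p, following the preference stated above.

```python
def recommend_by_funnel(click_prob, novelty, p, q=None):
    """Two-stage selection: shortlist by click probability, then by novelty.

    click_prob, novelty: dicts of advertisement id -> value.
    """
    if q is None:
        q = 2 * p  # preferred setting in the text
    # Stage 1: keep the q advertisements with the highest click probability.
    shortlist = sorted(click_prob, key=click_prob.get, reverse=True)[:q]
    # Stage 2: re-sort the shortlist by novelty factor and keep the first p.
    return sorted(shortlist, key=novelty.get, reverse=True)[:p]

click = {"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.6, "e": 0.1}
nov = {"a": 0.2, "b": 0.9, "c": 0.5, "d": 1.0, "e": 1.0}
picked = recommend_by_funnel(click, nov, p=2)
```

  • Advertisement `e` is as novel as `d` but never reaches the second stage, because its click probability does not place it among the first q = 4 advertisements.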
  • The webpage access information and the advertisement click information may be obtained from the user Internet access log in real time.
  • the advertisement click information may include the user's click information on the recommended p advertisements. That is to say, the user's click information of the recommended p advertisements will be fed back in real time, so that the real-time information can adaptively adjust the click probability of the advertisement, thereby further improving the accuracy of the click probability prediction of the advertisement.
  • FIG. 2 is a schematic flow chart of a process of a method of recommending an advertisement according to an embodiment of the present invention.
  • x advertisements, n, m and x are positive integers greater than one.
  • B can represent a user-web access matrix.
  • The element b ij of B (b ij ∈ [0, 1]) represents the access record of the user u i to the web page w j, and may be regarded as the degree of interest of the user u i in the web page w j.
  • b ij can be calculated from equation (1):
  • g(·) is the logistic function and is used for normalization.
  • f(u i , w j ) represents the number of times the user u i browses the web page w j .
  • C can represent a user-advertisement click matrix.
  • The element c ik in C represents the degree of interest of the user u i in the advertisement a k .
  • c ik can be obtained by equation (2):
  • f(u i , a k ) represents the number of times the user u i clicks on the advertisement a k .
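  • As a rough illustration of how counts might be turned into matrix entries, the sketch below applies the logistic function directly to each observed count. Equations (1) and (2) are not reproduced in this text, so the form g(f(...)) is an assumption, as are the function names and the sparse dictionary layout.

```python
import math

def g(z):
    """Logistic function, used for normalization."""
    return 1.0 / (1.0 + math.exp(-z))

def normalize_counts(counts):
    """Map observed counts to entries in (0, 1), keeping the matrix sparse.

    counts: dict of (row, column) -> number of visits or clicks.
    Pairs with a zero count are left out, so they stay unobserved.
    ASSUMPTION: the entry is g(count); the source equations are missing.
    """
    return {pair: g(n) for pair, n in counts.items() if n > 0}

# f(u_i, w_j): browse counts for (user index, page index) pairs.
browse_counts = {(0, 0): 3, (0, 1): 0, (1, 1): 1}
B = normalize_counts(browse_counts)
```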
  • R can represent an advertisement-page relevance matrix.
  • the element r jk in R represents the degree of association between the web page w j and the advertisement a k .
  • the same ad has different clickthrough rates when displayed on different pages. The more relevant an ad is to the content of a webpage, the more likely it is that the ad will be clicked.
  • The advertisement-webpage relevance matrix is determined by combining the click rate of the advertisement when it is displayed on the web page with the similarity between the web page and the advertisement, so that the accuracy of the advertisement-webpage relevance matrix can be improved.
  • d jk can represent the similarity between the web page w j and the advertisement a k
  • h jk represents the click rate of the advertisement a k on the web page w j .
  • d jk can be obtained according to the Probabilistic Latent Semantic Analysis (PLSA) method or the Latent Dirichlet Allocation (LDA) algorithm.
  • h jk may be equal to the number of clicks on the advertisement a k on the web page w j divided by the total number of times the advertisement a k is placed on the web page w j .
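  • The click rate h jk can be computed directly from click and placement counts, as in the short sketch below. The function name and the dictionary layout keyed by (page index, advertisement index) are assumptions for illustration.

```python
def click_rates(clicks, placements):
    """Compute h_jk = clicks on ad k on page j / placements of ad k on page j.

    clicks, placements: dicts of (page index, ad index) -> count.
    Pairs that were never placed are skipped to avoid division by zero.
    """
    return {pair: clicks.get(pair, 0) / shown
            for pair, shown in placements.items() if shown > 0}

h = click_rates(clicks={(0, 1): 5},
                placements={(0, 1): 100, (2, 0): 50, (1, 1): 0})
```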
  • the user's access history to the web page and the click history of the ad can reflect the user's interests or preferences.
  • the ad click rate is closely related to user interest and the relevance of the ad to the page.
  • The AdRec model combines user interest with the relevance between advertisements and web pages.
  • the following describes an advertisement a k in x advertisements as an example. It should be understood that the advertisement a k may be any of the x advertisements.
  • the three implicit feature vectors can be determined based on the AdRec model.
  • FIG. 3 is a schematic diagram of an AdRec model in accordance with an embodiment of the present invention. As shown in FIG. 3, the user-webpage access matrix and the user-advertisement click matrix share the user implicit feature vector U i , and the user-advertisement click matrix and the advertisement-webpage relevance matrix share the advertisement implicit feature vector A k .
  • the AdRec model is based on the following assumptions:
  • g(·) is the logistic function.
  • Equation (10) can be considered as an unconstrained optimization problem. Equation (11) is equivalent to equation (10).
  • The local minimum of equation (11) can be obtained based on the gradient descent method.
  • the gradient descent formulas for U i , W j , and A k are as follows:
  • The computational overhead of the gradient descent method comes mainly from evaluating the objective function E and the corresponding gradient descent formulas. Since the matrices B, C, and R are sparse matrices, the time complexity of the objective function in equation (10) is O(n B l + n C l + n R l), where n B , n C , and n R represent the numbers of non-zero elements in the matrices B, C, and R, respectively.
  • the time complexity of equations (12) to (14) can be derived. Therefore, the total time complexity of each iteration is O(n B l+n C l+n R l), that is, the time complexity of the algorithm increases linearly with the number of observations in the three sparse matrices. Therefore, embodiments of the present invention are applicable to the processing of large-scale data.
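  • The sketch below illustrates joint factorization of three sparse matrices that share latent vectors, trained by full-batch gradient descent. It is not a reproduction of equations (10) to (14), which are not included in this text: the squared-error objective with a logistic link, the weights `alpha` and `beta`, the regularization `lam`, and all function names are assumptions. Each matrix is stored as a dictionary of its observed entries, so one iteration costs time proportional to the number of observed entries times the latent dimension l.

```python
import math
import random

def g(z):
    """Logistic function g(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sq_error(obs, left, right):
    """Half the squared error over the observed entries only."""
    return 0.5 * sum((v - g(dot(left[r], right[c]))) ** 2
                     for (r, c), v in obs.items())

def objective(B, C, R, U, W, A, alpha, beta, lam):
    reg = sum(x * x for M in (U, W, A) for row in M for x in row)
    return (sq_error(C, U, A) + alpha * sq_error(B, U, W)
            + beta * sq_error(R, W, A) + 0.5 * lam * reg)

def add_gradient(obs, left, right, g_left, g_right, weight):
    """Accumulate gradients for one sparse matrix; cost O(len(obs) * l)."""
    for (r, c), v in obs.items():
        p = g(dot(left[r], right[c]))
        e = weight * (p - v) * p * (1.0 - p)  # chain rule through g
        for t in range(len(left[r])):
            g_left[r][t] += e * right[c][t]
            g_right[c][t] += e * left[r][t]

def gradient_step(B, C, R, U, W, A, alpha, beta, lam, lr):
    """One full-batch gradient descent update of U, W and A, in place."""
    gU = [[lam * x for x in row] for row in U]
    gW = [[lam * x for x in row] for row in W]
    gA = [[lam * x for x in row] for row in A]
    add_gradient(C, U, A, gU, gA, 1.0)    # user-ad clicks share U and A
    add_gradient(B, U, W, gU, gW, alpha)  # user-page visits share U and W
    add_gradient(R, W, A, gW, gA, beta)   # page-ad relevance shares W and A
    for M, G in ((U, gU), (W, gW), (A, gA)):
        for row, grad_row in zip(M, G):
            for t in range(len(row)):
                row[t] -= lr * grad_row[t]

def init(rows, l, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(l)] for _ in range(rows)]

# Toy data: 3 users, 3 pages, 2 ads, latent dimension l = 2 (made-up values).
rng = random.Random(0)
B = {(0, 0): 0.9, (0, 1): 0.6, (1, 1): 0.8, (2, 2): 0.7}  # user x page
C = {(0, 0): 0.9, (1, 1): 0.8, (2, 0): 0.6}               # user x ad
R = {(0, 0): 0.8, (1, 1): 0.7, (2, 0): 0.5}               # page x ad
U, W, A = init(3, 2, rng), init(3, 2, rng), init(2, 2, rng)
params = dict(alpha=0.5, beta=0.5, lam=0.01)

loss_before = objective(B, C, R, U, W, A, **params)
for _ in range(200):
    gradient_step(B, C, R, U, W, A, lr=0.1, **params)
loss_after = objective(B, C, R, U, W, A, **params)
```

  • Because the loops in `add_gradient` and `sq_error` visit only observed entries, the per-iteration cost in this sketch grows linearly with the number of observations, which mirrors the complexity argument above.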
  • an advertisement feature vector of each of the x advertisements can be obtained.
  • the click probability of x advertisements when the user u i accesses the web page w j can be obtained.
  • The novelty factor corresponding to the advertisement a k can be determined according to equation (16):
  • the click probability of each advertisement and its novelty factor may be assigned corresponding weights, and the click probability and novelty factor of the advertisement are weighted by the assigned weights to obtain a score corresponding to the advertisement.
  • the sum of the weight of the click probability of each advertisement and the weight of its novelty factor is 1.
  • Information about the p advertisements may be presented on the web page w j when the user u i accesses the web page w j .
  • the p advertisements to be recommended to the user u i may be determined by other means than the steps 206 and 207.
  • p advertisements to be recommended to the user u i may be obtained based on a funnel-shaped filtering weighting manner.
  • x advertisements can be sorted in descending order of click probability to obtain sorted x advertisements.
  • the top q advertisements in the sorted x advertisements can be reordered according to the novelty factor from the largest to the smallest, and the re-sorted q advertisements are obtained.
  • the first p advertisements of the re-sorted q advertisements can then be recommended to the user u i .
  • q can be twice as large as p.
  • The click probability of the x advertisements when the i-th user accesses the jth webpage is predicted according to the webpage access information and the advertisement click information, the novelty factor corresponding to each of the x advertisements is determined according to the historical recommendation information, and the p advertisements to be recommended to the i-th user are determined among the x advertisements according to the click probabilities and the novelty factors of the x advertisements. The i-th user's awareness of the p advertisements is lower than the i-th user's awareness of the other advertisements among the x advertisements, and the click probabilities of the p advertisements are higher than those of the other advertisements among the x advertisements.
  • The advertisement recommendation server 400 of FIG. 4 includes an obtaining unit 410, a prediction unit 420, a determining unit 430, and a selecting unit 440.
  • The obtaining unit 410 obtains webpage access information and advertisement click information from the user Internet access log, where the webpage access information indicates n webpages visited by m users, the advertisement click information indicates x advertisements clicked by the m users on the n webpages, and n, m, and x are positive integers greater than 1.
  • The prediction unit 420 predicts, according to the webpage access information and the advertisement click information, the click probability of the x advertisements when the i-th user accesses the jth webpage, where i is a positive integer from 1 to m and j is a positive integer from 1 to n.
  • the determining unit 430 determines a novelty factor corresponding to each of the x advertisements, and a novelty factor corresponding to each of the x advertisements is used to indicate the degree of knowledge of the advertisement by the i-th user.
  • The selecting unit 440 determines, according to the click probabilities and the novelty factors of the x advertisements, p advertisements among the x advertisements to be recommended to the i-th user, where the i-th user's awareness of the p advertisements is lower than the i-th user's awareness of the other advertisements among the x advertisements, the click probabilities of the p advertisements are higher than those of the other advertisements among the x advertisements, and p is a positive integer with p ≤ x.
  • In this embodiment, the click probability of the x advertisements when the i-th user accesses the jth webpage is predicted according to the webpage access information and the advertisement click information.
  • The determining unit 430 may determine, according to the historical recommendation information, the novelty factor corresponding to each of the x advertisements, where the historical recommendation information indicates the history of recommending each of the x advertisements to the i-th user.
  • If the historical recommendation information indicates that the kth advertisement has not been recommended to the i-th user, the determining unit 430 may determine that the novelty factor corresponding to the kth advertisement is the first value. If the historical recommendation information indicates that the kth advertisement has previously been recommended to the i-th user, the determining unit 430 determines that the novelty factor corresponding to the kth advertisement is the second value.
  • first value is greater than the second value
  • k is a positive integer ranging from 1 to x.
  • the determining unit 430 may determine that the kth advertisement is recommended to the i-th user q days ago, and q is a positive integer.
  • the determining unit 430 can determine the Ebbinghaus forgetting curve value corresponding to q days.
  • the determining unit 430 may determine that the novelty factor corresponding to the kth advertisement is a difference between the first value and the Ebbinghaus forgetting curve value.
  • the determining unit 430 may determine the similarity between the kth advertisement and the other advertisements other than the kth advertisement among the x advertisements.
  • The determining unit 430 may determine, according to the similarity between the kth advertisement and each of the other advertisements among the x advertisements, the similarity ranking and the dissimilarity ranking corresponding to the kth advertisement among the x advertisements.
  • the determining unit 430 may weight the similarity ranking corresponding to the kth advertisement and the dissimilarity ranking corresponding to the kth advertisement to obtain a novelty factor corresponding to the kth advertisement.
  • k is a positive integer from 1 to x.
  • the determining unit 430 may determine a diversity distance between the kth advertisement and the other advertisements other than the kth advertisement among the x advertisements.
  • The determining unit 430 may determine the novelty factor corresponding to the kth advertisement according to the diversity distance between the kth advertisement and each of the other advertisements among the x advertisements.
  • k is a positive integer from 1 to x.
  • The selecting unit 440 may weight the click probability and the novelty factor corresponding to each of the x advertisements to determine a score corresponding to each of the x advertisements, and may sort the x advertisements in descending order of score to obtain the sorted x advertisements. The selecting unit 440 can then determine the top p advertisements among the sorted x advertisements as the p advertisements to be recommended to the i-th user.
  • the selecting unit 440 may sort the x advertisements in order of decreasing click probability to obtain the sorted x advertisements.
  • the selecting unit 440 may sort the top q advertisements in the sorted x advertisements according to the novelty factor from the largest to the smallest, and obtain the re-sorted q advertisements, where q is a positive integer and q is greater than p.
  • the selection unit 440 may also determine the top p advertisements among the reordered q advertisements as p advertisements to be recommended to the ith user.
  • The prediction unit 420 may generate a user-webpage access matrix, a user-advertisement click matrix, and an advertisement-webpage relevance matrix according to the webpage access information and the advertisement click information, where the element in the i-th row and j-th column of the user-webpage access matrix represents the access record of the i-th user to the jth webpage, the element in the i-th row and k-th column of the user-advertisement click matrix represents the click record of the i-th user on the kth advertisement, and the element in the j-th row and k-th column of the advertisement-webpage relevance matrix represents the degree of association between the jth webpage and the kth advertisement.
  • k is a positive integer ranging from 1 to x.
  • The prediction unit 420 may perform joint probabilistic matrix factorization on the user-webpage access matrix, the user-advertisement click matrix, and the advertisement-webpage relevance matrix to obtain the user implicit feature vector of the i-th user, the webpage implicit feature vector of the jth webpage, and the advertisement implicit feature vector of the kth advertisement.
  • The prediction unit 420 may determine, according to the user implicit feature vector of the i-th user, the webpage implicit feature vector of the jth webpage, and the advertisement implicit feature vector of the kth advertisement, the click probability of the kth advertisement when the i-th user accesses the jth webpage.
  • For the other functions and operations of the advertisement recommendation server 400 of FIG. 4, reference may be made to the process of the method embodiments of FIG. 1 to FIG. 3, and details are not described herein again.
  • FIG. 5 is a schematic block diagram of an advertisement recommendation server according to an embodiment of the present invention.
  • the advertisement recommendation server 500 of FIG. 5 may include a memory 510 and a processor 520.
  • Memory 510 can include random access memory, flash memory, read only memory, programmable read only memory, nonvolatile memory or registers, and the like.
  • the processor 520 can be a Central Processing Unit (CPU).
  • Memory 510 is used to store executable instructions.
  • The processor 520 can execute the executable instructions stored in the memory 510 to: obtain webpage access information and advertisement click information from the user Internet access log, where the webpage access information indicates n webpages visited by m users, the advertisement click information indicates x advertisements clicked by the m users on the n webpages, and n, m, and x are positive integers greater than 1; predict, according to the webpage access information and the advertisement click information, the click probability of the x advertisements when the i-th user among the m users accesses the jth webpage among the n webpages, where i is a positive integer from 1 to m and j is a positive integer from 1 to n; determine the novelty factor corresponding to each of the x advertisements, where the novelty factor corresponding to each advertisement indicates the i-th user's awareness of the advertisement; and determine, according to the click probabilities and the novelty factors of the x advertisements, p advertisements among the x advertisements to be recommended to the i-th user, where the i-th user's awareness of the p advertisements is lower than the i-th user's awareness of the other advertisements among the x advertisements, the click probabilities of the p advertisements are higher than those of the other advertisements among the x advertisements, and p is a positive integer with p ≤ x.
  • The click probability of the x advertisements when the i-th user accesses the jth webpage is predicted according to the webpage access information and the advertisement click information, the novelty factor corresponding to each of the x advertisements is determined according to the historical recommendation information, and the p advertisements to be recommended to the i-th user are determined among the x advertisements according to the click probabilities and the novelty factors of the x advertisements. The i-th user's awareness of the p advertisements is lower than the i-th user's awareness of the other advertisements among the x advertisements, and the click probabilities of the p advertisements are higher than those of the other advertisements among the x advertisements.
  • The processor 520 may determine, according to the historical recommendation information, the novelty factor corresponding to each of the x advertisements, where the historical recommendation information indicates the history of recommending each of the x advertisements to the i-th user.
  • If the historical recommendation information indicates that the kth advertisement has not been recommended to the i-th user, the processor 520 may determine that the novelty factor corresponding to the kth advertisement is the first value. If the historical recommendation information indicates that the kth advertisement has previously been recommended to the i-th user, the processor 520 determines that the novelty factor corresponding to the kth advertisement is the second value.
  • first value is greater than the second value
  • k is a positive integer ranging from 1 to x.
  • The processor 520 may determine that the kth advertisement was recommended to the i-th user q days ago, where q is a positive integer. The processor 520 can determine the Ebbinghaus forgetting curve value corresponding to q days. The processor 520 may determine that the novelty factor corresponding to the kth advertisement is the difference between the first value and the Ebbinghaus forgetting curve value.
  • the processor 520 may determine the similarity between the kth advertisement and the other advertisements other than the kth advertisement among the x advertisements.
  • The processor 520 may determine, according to the similarity between the kth advertisement and each of the other advertisements among the x advertisements, the similarity ranking and the dissimilarity ranking corresponding to the kth advertisement among the x advertisements.
  • the processor 520 may weight the similarity ranking corresponding to the kth advertisement and the dissimilarity ranking corresponding to the kth advertisement to obtain a novelty factor corresponding to the kth advertisement.
  • k is a positive integer from 1 to x.
  • the processor 520 may determine a diversity distance between the kth advertisement and the other advertisements other than the kth advertisement among the x advertisements.
  • the processor 520 can determine the novelty factor corresponding to the kth advertisement according to the diversity distance between the kth advertisement and the other advertisements other than the kth advertisement among the x advertisements.
  • k is a positive integer from 1 to x.
  • The processor 520 may weight the click probability and the novelty factor corresponding to each of the x advertisements to determine a score corresponding to each of the x advertisements, and may sort the x advertisements in descending order of score to obtain the sorted x advertisements. The processor 520 can then determine the top p advertisements among the sorted x advertisements as the p advertisements to be recommended to the i-th user.
  • Alternatively, the processor 520 may sort the x advertisements in descending order of click probability to obtain the sorted x advertisements.
  • the processor 520 may sort the top q advertisements in the sorted x advertisements according to the novelty factor from the largest to the smallest, and obtain the reordered q advertisements, where q is a positive integer and q is greater than p.
  • the processor 520 may determine the top p advertisements among the reordered q advertisements as p advertisements to be recommended to the ith user.
  • The processor 520 may generate a user-webpage access matrix, a user-advertisement click matrix, and an advertisement-webpage relevance matrix according to the webpage access information and the advertisement click information, where the element in the i-th row and j-th column of the user-webpage access matrix represents the access record of the i-th user to the jth webpage, the element in the i-th row and k-th column of the user-advertisement click matrix represents the click record of the i-th user on the kth advertisement, and the element in the j-th row and k-th column of the advertisement-webpage relevance matrix represents the degree of association between the jth webpage and the kth advertisement; k is a positive integer from 1 to x.
  • The processor 520 may perform joint probabilistic matrix factorization on the user-webpage access matrix, the user-advertisement click matrix, and the advertisement-webpage relevance matrix to obtain the user implicit feature vector of the i-th user, the webpage implicit feature vector of the jth webpage, and the advertisement implicit feature vector of the kth advertisement.
  • The processor 520 can determine, according to the user implicit feature vector of the i-th user, the webpage implicit feature vector of the jth webpage, and the advertisement implicit feature vector of the kth advertisement, the click probability of the kth advertisement when the i-th user accesses the jth webpage.
  • For the other functions and operations of the advertisement recommendation server 500 of FIG. 5, reference may be made to the process of the method embodiments of FIG. 1 to FIG. 3, and details are not described herein again.
  • the advertisement recommendation system 600 of FIG. 6 includes an advertisement recommendation server 610 and a user equipment (User Equipment, UE) 620.
  • the UE 620 may be a terminal of various forms capable of accessing the Internet, such as a desktop computer, a tablet computer, or a mobile phone.
  • the advertisement recommendation server 610 can recommend an advertisement to the UE 620.
  • the advertisement recommendation server 610 may include a memory 610a and a processor 610b.
  • the memory 610a is for storing executable instructions.
  • The processor 610b can execute the executable instructions stored in the memory 610a to: obtain webpage access information and advertisement click information from the user Internet access log, where the webpage access information indicates n webpages visited by m users, the advertisement click information indicates x advertisements clicked by the m users on the n webpages, and n, m, and x are positive integers greater than 1; predict, according to the webpage access information and the advertisement click information, the click probability of the x advertisements when the i-th user among the m users accesses the jth webpage among the n webpages, where i is a positive integer from 1 to m and j is a positive integer from 1 to n; determine the novelty factor corresponding to each of the x advertisements, where the novelty factor corresponding to each advertisement indicates the i-th user's awareness of the advertisement; and determine, according to the click probabilities and the novelty factors of the x advertisements, p advertisements among the x advertisements to be recommended to the i-th user, where the i-th user's awareness of the p advertisements is lower than the i-th user's awareness of the other advertisements among the x advertisements, the click probabilities of the p advertisements are higher than those of the other advertisements among the x advertisements, and p is a positive integer with p ≤ x.
  • The processor 610b may determine, according to the historical recommendation information, the novelty factor corresponding to each of the x advertisements, where the historical recommendation information indicates the history of recommending each of the x advertisements to the i-th user.
  • If the historical recommendation information indicates that the kth advertisement has not been recommended to the i-th user, the processor 610b may determine that the novelty factor corresponding to the kth advertisement is the first value. If the historical recommendation information indicates that the kth advertisement has previously been recommended to the i-th user, the processor 610b determines that the novelty factor corresponding to the kth advertisement is the second value.
  • first value is greater than the second value
  • k is a positive integer ranging from 1 to x.
  • The processor 610b may determine that the k-th advertisement was recommended to the i-th user q days ago, where q is a positive integer.
  • The processor 610b may determine the Ebbinghaus forgetting curve value corresponding to q days.
  • The processor 610b may determine that the novelty factor corresponding to the k-th advertisement is the difference between the first value and the Ebbinghaus forgetting curve value.
  • For the k-th advertisement among the x advertisements, the processor 610b may determine the similarity between the k-th advertisement and each of the other advertisements among the x advertisements.
  • The processor 610b may determine, according to those similarities, the similarity ranking and the dissimilarity ranking corresponding to the k-th advertisement among the x advertisements.
  • The processor 610b may weight the similarity ranking and the dissimilarity ranking corresponding to the k-th advertisement to obtain the novelty factor corresponding to the k-th advertisement.
  • k is a positive integer from 1 to x.
  • the processor 610b may determine a diversity distance between the kth advertisement and the other advertisements other than the kth advertisement among the x advertisements.
  • the processor 610b may determine the novelty factor corresponding to the kth advertisement according to the diversity distance between the kth advertisement and the other advertisements other than the kth advertisement among the x advertisements.
  • k is a positive integer from 1 to x.
  • The processor 610b may weight the click probability and the novelty factor corresponding to each of the x advertisements to determine a score for each of the x advertisements, and may sort the x advertisements in descending order of score to obtain the sorted x advertisements. The processor 610b may then determine the top p advertisements among the sorted x advertisements as the p advertisements to be recommended to the i-th user.
  • The processor 610b may sort the x advertisements in descending order of click probability to obtain the sorted x advertisements.
  • The processor 610b may sort the top q advertisements among the sorted x advertisements in descending order of novelty factor to obtain the re-sorted q advertisements, where q is a positive integer and q is greater than p.
  • The processor 610b may determine the top p advertisements among the re-sorted q advertisements as the p advertisements to be recommended to the i-th user.
  • The processor 610b may generate, according to the webpage access information and the advertisement click information, a user-webpage access matrix, a user-advertisement click matrix, and an advertisement-webpage relevance matrix, where the element in row i, column j of the user-webpage access matrix represents the access record of the i-th user for the j-th webpage, the element in row i, column k of the user-advertisement click matrix represents the click record of the i-th user for the k-th advertisement, the element in row j, column k of the advertisement-webpage relevance matrix represents the degree of association between the j-th webpage and the k-th advertisement, and k is a positive integer from 1 to x.
  • The processor 610b may perform unified probabilistic matrix factorization on the user-webpage access matrix, the user-advertisement click matrix, and the advertisement-webpage relevance matrix to obtain the user latent feature vector of the i-th user, the webpage latent feature vector of the j-th webpage, and the advertisement latent feature vector of the k-th advertisement.
  • The processor 610b may determine, according to the user latent feature vector of the i-th user, the webpage latent feature vector of the j-th webpage, and the advertisement latent feature vector of the k-th advertisement, the click probability of the k-th advertisement when the i-th user visits the j-th webpage.
  • In the embodiments of the present invention, the click probabilities of the x advertisements when the i-th user visits the j-th webpage are predicted according to the webpage access information and the advertisement click information, the novelty factor corresponding to each of the x advertisements is determined according to the historical recommendation information, and the p advertisements to be recommended to the i-th user are determined among the x advertisements according to the click probabilities and the novelty factors of the x advertisements, where the i-th user's degree of awareness of the p advertisements is lower than that of the other advertisements among the x advertisements, and the click probabilities of the p advertisements are higher than those of the other advertisements among the x advertisements.
  • For the other functions and operations of the advertisement recommendation server 610, reference may be made to the processes of the method embodiments of FIG. 1 to FIG. 3 above. To avoid repetition, details are not described here again.
  • The disclosed systems, devices, and methods can be implemented in other ways.
  • For example, the device embodiments described above are merely illustrative.
  • For example, the division into units is only a division by logical function.
  • In an actual implementation there may be other ways of dividing them; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • each functional unit in each embodiment of the present invention may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • If the functions are implemented in the form of a software functional unit and sold or used as a standalone product, they may be stored in a computer-readable storage medium.
  • Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product that is stored in a storage medium and includes a number of instructions.
  • The instructions cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the embodiments of the present invention.
  • The foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Finance (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Game Theory and Decision Science (AREA)
  • General Business, Economics & Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Databases & Information Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Transfer Between Computers (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

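A method for recommending advertisements and an advertisement recommendation server. The method includes: obtaining webpage access information and advertisement click information (110), where the webpage access information indicates n webpages visited by m users and the advertisement click information indicates x advertisements clicked by the m users on the n webpages; predicting, according to the webpage access information and the advertisement click information, the click probabilities of the x advertisements when the i-th user of the m users visits the j-th webpage (120); determining the novelty factor corresponding to each of the x advertisements (130); and determining, according to the click probabilities and the novelty factors of the x advertisements, p advertisements among the x advertisements to be recommended to the i-th user (140). The method and the advertisement recommendation server can increase the click-through rate of advertisements and improve the user experience.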

Description

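Method for recommending advertisements and advertisement recommendation server
This application claims priority to Chinese Patent Application No. 201410268560.5, filed with the Chinese Patent Office on June 16, 2014 and entitled "Method for recommending advertisements and advertisement recommendation server", which is incorporated herein by reference in its entirety.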
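Technical Field
The present invention relates to the field of information processing, and in particular to a method for recommending advertisements and an advertisement recommendation server.
Background
Online internet advertising has become a major advertising channel alongside television and newspapers. Revenue from online advertising is closely tied to the click-through rate of advertisements, and raising the click-through rate is one effective way to increase advertising revenue. To raise the click-through rate, the probability that a user clicks an advertisement (hereinafter, the click probability of the advertisement) needs to be predicted before advertisements are recommended.
Currently, two kinds of algorithms are mainly used to predict click probabilities for recommending advertisements to users. One is the content-based filtering (CBF) recommendation algorithm; the other is the user-based or item-based collaborative filtering (CF) recommendation algorithm.
Specifically, a CBF-based algorithm mainly uses information retrieval or information filtering techniques to recommend advertisements to a target user according to the relevance between advertisements and webpage content. That is, the more relevant an advertisement is to the webpage content, the higher its click probability is assumed to be. As a result, the same advertisements tend to be recommended to users on the same webpage. However, this algorithm does not consider the users' interests, so the accuracy of its click probability prediction is low and the click-through rate is hard to guarantee.
A user-based CF algorithm computes the similarity between users mainly from their historical advertisement click information, predicts the target user's preference for advertisements from the clicks of users who are highly similar to the target user, and then makes recommendations to the target user according to that preference. An item-based CF algorithm computes the similarity between advertisements, selects the set of advertisements closest to a target advertisement, and decides whether to recommend the target advertisement according to the current user's preference for those closest advertisements. Both CF algorithms use user preference to predict click probabilities. Compared with CBF-based algorithms, CF algorithms improve the accuracy of click probability prediction to some extent and can raise the click-through rate. However, because users often visit webpages with similar content, the advertisements that a CF algorithm recommends to a user tend to closely resemble advertisements the user already knows, and advertisements that are unfamiliar to the user but of potential interest cannot be discovered. The result is a low click-through rate and a poor user experience.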
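Summary
Embodiments of the present invention provide a method for recommending advertisements and an advertisement recommendation server, which can increase the click-through rate of advertisements and thereby improve the user experience.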
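According to a first aspect, a method for recommending advertisements is provided, including: obtaining webpage access information and advertisement click information from the users' internet access logs, where the webpage access information indicates n webpages visited by m users, the advertisement click information indicates x advertisements clicked by the m users on the n webpages, and n, m, and x are positive integers greater than 1; predicting, according to the webpage access information and the advertisement click information, the click probabilities of the x advertisements when the i-th user of the m users visits the j-th webpage, where i is a positive integer from 1 to m and j is a positive integer from 1 to n; determining the novelty factor corresponding to each of the x advertisements, where the novelty factor corresponding to each advertisement indicates the i-th user's degree of awareness of that advertisement; and determining, according to the click probabilities of the x advertisements and the novelty factors corresponding to the x advertisements, p advertisements among the x advertisements to be recommended to the i-th user, where the i-th user's degree of awareness of the p advertisements is lower than the i-th user's degree of awareness of the other advertisements among the x advertisements, the click probabilities of the p advertisements are higher than those of the other advertisements among the x advertisements, and p is a positive integer with p ≤ x.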
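With reference to the first aspect, in a first possible implementation, determining the novelty factor corresponding to each of the x advertisements includes: determining the novelty factor corresponding to each of the x advertisements according to historical recommendation information, where the historical recommendation information indicates the history of recommending each of the x advertisements to the i-th user.
With reference to the first possible implementation of the first aspect, in a second possible implementation, determining the novelty factors according to the historical recommendation information includes: for the k-th advertisement among the x advertisements, if the historical recommendation information indicates that the k-th advertisement has not been recommended to the i-th user, determining that the novelty factor corresponding to the k-th advertisement is a first value; and if the historical recommendation information indicates that the k-th advertisement has been recommended to the i-th user in the past, determining that the novelty factor corresponding to the k-th advertisement is a second value; where the first value is greater than the second value and k is a positive integer from 1 to x.
With reference to the second possible implementation of the first aspect, in a third possible implementation, determining that the novelty factor corresponding to the k-th advertisement is the second value includes: determining that the k-th advertisement was recommended to the i-th user q days ago, where q is a positive integer; determining the Ebbinghaus forgetting curve value corresponding to q days; and determining that the novelty factor corresponding to the k-th advertisement is the difference between the first value and the Ebbinghaus forgetting curve value.
With reference to the first aspect, in a fourth possible implementation, determining the novelty factor corresponding to each of the x advertisements includes: for the k-th advertisement among the x advertisements, determining the similarity between the k-th advertisement and each of the other advertisements among the x advertisements; determining, according to those similarities, the similarity ranking and the dissimilarity ranking corresponding to the k-th advertisement among the x advertisements; and weighting the similarity ranking and the dissimilarity ranking corresponding to the k-th advertisement to obtain the novelty factor corresponding to the k-th advertisement; where k is a positive integer from 1 to x.
With reference to the first aspect, in a fifth possible implementation, determining the novelty factor corresponding to each of the x advertisements includes: for the k-th advertisement among the x advertisements, determining the diversity distance between the k-th advertisement and each of the other advertisements among the x advertisements; and determining the novelty factor corresponding to the k-th advertisement according to those diversity distances; where k is a positive integer from 1 to x.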
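With reference to the first aspect or any of the foregoing implementations, in a sixth possible implementation, determining the p advertisements to be recommended to the i-th user according to the click probabilities and the novelty factors corresponding to the x advertisements includes: weighting the click probability and the novelty factor corresponding to each of the x advertisements to determine a score for each of the x advertisements; sorting the x advertisements in descending order of score to obtain the sorted x advertisements; and determining the top p advertisements among the sorted x advertisements as the p advertisements to be recommended to the i-th user.
With reference to the first aspect or any of the first to fifth possible implementations, in a seventh possible implementation, determining the p advertisements to be recommended to the i-th user according to the click probabilities and the novelty factors corresponding to the x advertisements includes: sorting the x advertisements in descending order of click probability to obtain the sorted x advertisements; sorting the top q advertisements among the sorted x advertisements in descending order of novelty factor to obtain the re-sorted q advertisements, where q is a positive integer and q is greater than p; and determining the top p advertisements among the re-sorted q advertisements as the p advertisements to be recommended to the i-th user.
With reference to the first aspect or any of the foregoing implementations, in an eighth possible implementation, predicting the click probabilities of the x advertisements when the i-th user of the m users visits the j-th webpage includes: generating, according to the webpage access information and the advertisement click information, a user-webpage access matrix, a user-advertisement click matrix, and an advertisement-webpage relevance matrix, where the element in row i, column j of the user-webpage access matrix represents the access record of the i-th user for the j-th webpage, the element in row i, column k of the user-advertisement click matrix represents the click record of the i-th user for the k-th advertisement, the element in row j, column k of the advertisement-webpage relevance matrix represents the degree of association between the j-th webpage and the k-th advertisement, and k is a positive integer from 1 to x; performing unified probabilistic matrix factorization on the three matrices to obtain the user latent feature vector of the i-th user, the webpage latent feature vector of the j-th webpage, and the advertisement latent feature vector of the k-th advertisement; and determining, according to these three latent feature vectors, the click probability of the k-th advertisement when the i-th user visits the j-th webpage.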
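According to a second aspect, an advertisement recommendation server is provided, including: an obtaining unit, configured to obtain webpage access information and advertisement click information from the users' internet access logs, where the webpage access information indicates n webpages visited by m users, the advertisement click information indicates x advertisements clicked by the m users on the n webpages, and n, m, and x are positive integers greater than 1; a prediction unit, configured to predict, according to the webpage access information and the advertisement click information, the click probabilities of the x advertisements when the i-th user of the m users visits the j-th webpage, where i is a positive integer from 1 to m and j is a positive integer from 1 to n; a determining unit, configured to determine the novelty factor corresponding to each of the x advertisements, where the novelty factor corresponding to each advertisement indicates the i-th user's degree of awareness of that advertisement; and a selection unit, configured to determine, according to the click probabilities of the x advertisements and the novelty factors corresponding to the x advertisements, p advertisements among the x advertisements to be recommended to the i-th user, where the i-th user's degree of awareness of the p advertisements is lower than the i-th user's degree of awareness of the other advertisements among the x advertisements, the click probabilities of the p advertisements are higher than those of the other advertisements among the x advertisements, and p is a positive integer with p ≤ x.
With reference to the second aspect, in a first possible implementation, the determining unit is specifically configured to determine the novelty factor corresponding to each of the x advertisements according to historical recommendation information, where the historical recommendation information indicates the history of recommending each of the x advertisements to the i-th user.
With reference to the first possible implementation of the second aspect, in a second possible implementation, the determining unit is specifically configured to: for the k-th advertisement among the x advertisements, if the historical recommendation information indicates that the k-th advertisement has not been recommended to the i-th user, determine that the novelty factor corresponding to the k-th advertisement is a first value; and if the historical recommendation information indicates that the k-th advertisement has been recommended to the i-th user in the past, determine that the novelty factor corresponding to the k-th advertisement is a second value; where the first value is greater than the second value and k is a positive integer from 1 to x.
With reference to the second possible implementation of the second aspect, in a third possible implementation, the determining unit is specifically configured to: determine that the k-th advertisement was recommended to the i-th user q days ago, where q is a positive integer; determine the Ebbinghaus forgetting curve value corresponding to q days; and determine that the novelty factor corresponding to the k-th advertisement is the difference between the first value and the Ebbinghaus forgetting curve value.
With reference to the second aspect, in a fourth possible implementation, the determining unit is specifically configured to: for the k-th advertisement among the x advertisements, determine the similarity between the k-th advertisement and each of the other advertisements among the x advertisements; determine, according to those similarities, the similarity ranking and the dissimilarity ranking corresponding to the k-th advertisement among the x advertisements; and weight the similarity ranking and the dissimilarity ranking corresponding to the k-th advertisement to obtain the novelty factor corresponding to the k-th advertisement; where k is a positive integer from 1 to x.
With reference to the second aspect, in a fifth possible implementation, the determining unit is specifically configured to: for the k-th advertisement among the x advertisements, determine the diversity distance between the k-th advertisement and each of the other advertisements among the x advertisements; and determine the novelty factor corresponding to the k-th advertisement according to those diversity distances; where k is a positive integer from 1 to x.
With reference to the second aspect or any of the foregoing implementations, in a sixth possible implementation, the selection unit is specifically configured to: weight the click probability and the novelty factor corresponding to each of the x advertisements to determine a score for each of the x advertisements; sort the x advertisements in descending order of score to obtain the sorted x advertisements; and determine the top p advertisements among the sorted x advertisements as the p advertisements to be recommended to the i-th user.
With reference to the second aspect or any of the first to fifth possible implementations, in a seventh possible implementation, the selection unit is specifically configured to: sort the x advertisements in descending order of click probability to obtain the sorted x advertisements; sort the top q advertisements among the sorted x advertisements in descending order of novelty factor to obtain the re-sorted q advertisements, where q is a positive integer and q is greater than p; and determine the top p advertisements among the re-sorted q advertisements as the p advertisements to be recommended to the i-th user.
With reference to the second aspect or any of the foregoing implementations, in an eighth possible implementation, the prediction unit is specifically configured to: generate, according to the webpage access information and the advertisement click information, a user-webpage access matrix, a user-advertisement click matrix, and an advertisement-webpage relevance matrix, where the element in row i, column j of the user-webpage access matrix represents the access record of the i-th user for the j-th webpage, the element in row i, column k of the user-advertisement click matrix represents the click record of the i-th user for the k-th advertisement, the element in row j, column k of the advertisement-webpage relevance matrix represents the degree of association between the j-th webpage and the k-th advertisement, and k is a positive integer from 1 to x; perform unified probabilistic matrix factorization on the three matrices to obtain the user latent feature vector of the i-th user, the webpage latent feature vector of the j-th webpage, and the advertisement latent feature vector of the k-th advertisement; and determine, according to these three latent feature vectors, the click probability of the k-th advertisement when the i-th user visits the j-th webpage.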
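In the embodiments of the present invention, the click probabilities of the x advertisements when the i-th user visits the j-th webpage are predicted according to the webpage access information and the advertisement click information, the novelty factor corresponding to each of the x advertisements is determined according to the historical recommendation information, and the p advertisements to be recommended to the i-th user are determined among the x advertisements according to the click probabilities and the novelty factors of the x advertisements, where the i-th user's degree of awareness of the p advertisements is lower than that of the other advertisements among the x advertisements, and the click probabilities of the p advertisements are higher than those of the other advertisements among the x advertisements. Because information about users, webpages, and advertisements is jointly considered when predicting click probabilities, the accuracy of the click probability prediction can be improved; and because the novelty of advertisements is considered, recommending the same type of advertisement to a user for a long time without regard to the user's potential interests can be avoided. The click-through rate of advertisements can therefore be increased, which in turn improves the user experience.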
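Brief Description of the Drawings
To describe the technical solutions of the embodiments of the present invention more clearly, the accompanying drawings needed for the embodiments are briefly introduced below. The drawings described below are merely some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
FIG. 1 is a schematic flowchart of a method for recommending advertisements according to an embodiment of the present invention.
FIG. 2 is a schematic flowchart of the process of a method for recommending advertisements according to an embodiment of the present invention.
FIG. 3 is a schematic diagram of the AdRec model according to an embodiment of the present invention.
FIG. 4 is a schematic block diagram of an advertisement recommendation server according to an embodiment of the present invention.
FIG. 5 is a schematic block diagram of an advertisement recommendation server according to an embodiment of the present invention.
FIG. 6 is a schematic block diagram of an advertisement recommendation system according to an embodiment of the present invention.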
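Detailed Description
The technical solutions of the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
The embodiments of the present invention can be applied to recommendation scenarios for various objects, such as products, applications, or songs. In the embodiments of the present invention, an advertisement may therefore be the carrier of such a recommended object, and information about the recommended object may be displayed on an advertisement page.
The method of the embodiments of the present invention may be performed by an advertisement recommendation server. The advertisement recommendation server may store the advertisements published by advertisers, manage those advertisements, and provide advertisement services to users. Specifically, the advertisement recommendation server may collect information such as users' click records for advertisements and for webpages, and may recommend advertisements to users based on this information.
FIG. 1 is a schematic flowchart of a method for recommending advertisements according to an embodiment of the present invention. The method of FIG. 1 may be performed by an advertisement recommendation server.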
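110: Obtain webpage access information and advertisement click information from the users' internet access logs, where the webpage access information indicates n webpages visited by m users, the advertisement click information indicates x advertisements clicked by the m users on the n webpages, and n, m, and x are positive integers greater than 1.
120: Predict, according to the webpage access information and the advertisement click information, the click probabilities of the x advertisements when the i-th user of the m users visits the j-th webpage, where i is a positive integer from 1 to m and j is a positive integer from 1 to n.
130: Determine the novelty factor corresponding to each of the x advertisements according to historical recommendation information, where the historical recommendation information indicates the history of recommending each of the x advertisements to the i-th user, and the novelty factor of each advertisement indicates the i-th user's degree of awareness of that advertisement.
140: Determine, according to the click probabilities of the x advertisements and the novelty factors corresponding to the x advertisements, p advertisements among the x advertisements to be recommended to the i-th user, where the i-th user's degree of awareness of the p advertisements is lower than the i-th user's degree of awareness of the other advertisements among the x advertisements, the click probabilities of the p advertisements are higher than those of the other advertisements among the x advertisements, and p is a positive integer with p ≤ x.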
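In the embodiments of the present invention, the click probabilities of the x advertisements when the i-th user visits the j-th webpage are predicted according to the webpage access information and the advertisement click information, the novelty factor corresponding to each of the x advertisements is determined according to the historical recommendation information, and the p advertisements to be recommended to the i-th user are determined among the x advertisements according to the click probabilities and the novelty factors of the x advertisements, where the i-th user's degree of awareness of the p advertisements is lower than that of the other advertisements among the x advertisements, and the click probabilities of the p advertisements are higher than those of the other advertisements among the x advertisements. Because information about users, webpages, and advertisements is jointly considered when predicting click probabilities, the accuracy of the click probability prediction can be improved; and because the novelty of advertisements is considered, recommending the same type of advertisement to a user for a long time without regard to the user's potential interests can be avoided. The click-through rate of advertisements can therefore be increased, which in turn improves the user experience.
Specifically, existing advertisement recommendation algorithms all predict click probabilities from two-dimensional information, for example information relating advertisements to webpages or information relating users to advertisements. In addition, with the existing CBF-based or CF algorithms, the advertisements recommended to a user tend to closely resemble advertisements the user already knows, whereas advertisements that are unfamiliar to the user but of potential interest are unlikely to be recommended.
In the embodiments of the present invention, the webpage access information indicates n webpages visited by m users and the advertisement click information indicates x advertisements clicked by the m users on the n webpages. Predicting click probabilities from the webpage access information and the advertisement click information therefore uses information along three dimensions (users, webpages, and advertisements) to predict the click probabilities of the x advertisements, which can improve the accuracy of the prediction. In addition, the novelty factor corresponding to each of the x advertisements is determined according to the historical recommendation information, which indicates the history of recommending the x advertisements to the i-th user. When the p advertisements to be recommended to the i-th user are then determined according to the click probabilities and the novelty factors of the x advertisements, both the accuracy of the click probability prediction and the novelty of the advertisements are taken into account. This not only improves the accuracy of the click probability prediction; because novelty is considered, it also avoids recommending the same type of advertisement to a user for a long time without regard to the user's potential interests, so the click-through rate can be increased and the user experience improved.
It should be understood that, in the embodiments of the present invention, the i-th user may be any one of the m users and the j-th webpage may be any one of the n webpages.
Optionally, as an embodiment, the x advertisements may be all or some of the advertisements stored in the advertisement recommendation server.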
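Optionally, as another embodiment, in step 120, a user-webpage access matrix, a user-advertisement click matrix, and an advertisement-webpage relevance matrix may be generated according to the webpage access information and the advertisement click information, where the element in row i, column j of the user-webpage access matrix represents the access record of the i-th user for the j-th webpage, the element in row i, column k of the user-advertisement click matrix represents the click record of the i-th user for the k-th advertisement, the element in row j, column k of the advertisement-webpage relevance matrix represents the degree of association between the j-th webpage and the k-th advertisement, and k is a positive integer from 1 to x. Unified probabilistic matrix factorization may then be performed on the three matrices to obtain the user latent feature vector of the i-th user, the webpage latent feature vector of the j-th webpage, and the advertisement latent feature vector of the k-th advertisement. Finally, the click probability of the k-th advertisement when the i-th user visits the j-th webpage may be determined according to these three latent feature vectors.
Because the number of webpages is usually very large, the webpages may first be classified, and the webpage access information and the advertisement click information may then be converted into a user-webpage access matrix, a user-advertisement click matrix, and a matrix of advertisement click-through rates when a webpage and an advertisement appear together. For example, webpages may be classified by domain name. In addition, similarity information between webpages and advertisements may be extracted from the webpage access information and the advertisement click information. The advertisement-webpage relevance matrix can be obtained from the click-through rate matrix for co-occurring webpages and advertisements together with the webpage-advertisement similarity information.
Using the unified probabilistic matrix factorization (UPMF) algorithm, the user-webpage access matrix, the user-advertisement click matrix, and the advertisement-webpage relevance matrix can be factorized to obtain the click probabilities of the x advertisements when the i-th user visits the j-th webpage.
The user-webpage access matrix and the user-advertisement click matrix reflect the user's interests, while the advertisement-webpage relevance matrix reflects the relevance between webpages and advertisements. In this embodiment, the click probability of each advertisement is therefore predicted with both the user's interests and the webpage-advertisement relevance taken into account, which can improve the accuracy of the click probability prediction and thus help guarantee the click-through rate.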
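Currently, because the numbers of webpages and users are very large, the data on users' webpage visits and advertisement clicks is very sparse. This phenomenon is also called data sparsity. In this situation, the accuracy of click probability prediction with a CBF-based or CF algorithm drops sharply. In the embodiments of the present invention, the unified probabilistic matrix factorization algorithm predicts click probabilities from three matrices: the user-webpage access matrix, the user-advertisement click matrix, and the advertisement-webpage relevance matrix. Although all three matrices may be sparse, the prediction does not rely on any single one of them, so the accuracy of the click probability prediction can be maintained even when the data is sparse. A sparse matrix here means a matrix in which much of the row or column data is missing.
Specifically, when the i-th user visits the j-th webpage, for the k-th advertisement among the x advertisements, the user-webpage access matrix, the user-advertisement click matrix, and the advertisement-webpage relevance matrix may be factorized by gradient descent, with maximization of the joint posterior probability as the objective, to obtain the user latent feature vector of the i-th user, the webpage latent feature vector of the j-th webpage, and the advertisement latent feature vector of the k-th advertisement. The click probability of the k-th advertisement can then be predicted from these three latent feature vectors.
In more detail, from the three latent feature vectors obtained in this way, a first vector, a second vector, and a third vector can be determined: the first vector may represent the degree of interest of the i-th user in the j-th webpage, the second vector may represent the degree of interest of the i-th user in the k-th advertisement, and the third vector may represent the degree of association between the j-th webpage and the k-th advertisement. A linear combination of the first, second, and third vectors may be mapped to [0,1] to obtain the click probability of the k-th advertisement when the i-th user visits the j-th webpage.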
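The mapping from latent feature vectors to a click probability can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's own formula: each of the three terms is taken to be a logistic function of an inner product, and the combination is a convex combination with hypothetical equal weights, which keeps the result inside [0,1]. The names `click_probability` and `weights` are introduced only for this example.

```python
import math

def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def click_probability(u_i, w_j, a_k, weights=(1/3, 1/3, 1/3)):
    """Combine three interest/relevance terms into one value in [0, 1].

    Assumption: the weights are non-negative and sum to 1 (hypothetical choice).
    """
    terms = (
        sigmoid(dot(u_i, w_j)),  # interest of user i in webpage j
        sigmoid(dot(u_i, a_k)),  # interest of user i in advertisement k
        sigmoid(dot(w_j, a_k)),  # association between webpage j and advertisement k
    )
    return sum(wt * t for wt, t in zip(weights, terms))
```

With all-zero vectors every term is 0.5, so the sketch returns 0.5; vectors that point in the same direction push the value toward 1.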
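The k-th advertisement may be any one of the x advertisements. For each advertisement, its click probability when the i-th user visits the j-th webpage can be calculated by the foregoing process, which yields the click probabilities of the x advertisements when the i-th user visits the j-th webpage.
Currently, because the numbers of webpages and users are large, the complexity of a recommendation algorithm is a key concern. In this embodiment, the computational cost comes mainly from gradient descent, and the algorithm complexity grows linearly with the amount of data in the three matrices. This embodiment is therefore suitable for processing large-scale data.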
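Optionally, as another embodiment, in step 130, for the k-th advertisement among the x advertisements, if the historical recommendation information indicates that the k-th advertisement has not been recommended to the i-th user, the novelty factor corresponding to the k-th advertisement may be determined to be a first value; if the historical recommendation information indicates that the k-th advertisement has been recommended to the i-th user in the past, the novelty factor corresponding to the k-th advertisement may be determined to be a second value.
The first value is greater than the second value, and k is a positive integer from 1 to x.
Specifically, the k-th advertisement may be any one of the x advertisements. Each advertisement may correspond to one novelty factor, which can represent how novel the advertisement is to the i-th user. For each advertisement, the novelty factor when it has not been recommended to the i-th user is greater than the novelty factor when it has already been recommended to the i-th user. A larger novelty factor indicates that the advertisement is more novel to the i-th user; in other words, the i-th user is unfamiliar with the advertisement or has not seen it.
Because an advertisement that has not been recommended to the i-th user has a larger novelty factor than one that has, this embodiment can improve the novelty of the recommended advertisements and thus the user experience.
The first value and the second value may be preset; for example, the first value may be preset to 1 and the second value to 0.5. Alternatively, the second value may be obtained from the historical recommendation information and the Ebbinghaus forgetting curve.
Optionally, as another embodiment, in step 130, it may be determined that the k-th advertisement was recommended to the i-th user q days ago, where q is a positive integer; the Ebbinghaus forgetting curve value corresponding to q days may be determined; and the novelty factor corresponding to the k-th advertisement may be determined to be the difference between the first value and the Ebbinghaus forgetting curve value.
For example, the first value may be preset to 1, and the second value is then 1 minus the Ebbinghaus forgetting curve value.
For an advertisement that has been recommended to the i-th user, the novelty factor may be determined based on the Ebbinghaus forgetting curve. This can improve the accuracy of the novelty factor, and thus the novelty of the advertisements recommended to the user and the user experience. It should be noted that determining the novelty factor from the Ebbinghaus forgetting curve value is only one preferred implementation of the present invention; the solution can also be implemented by replacing the Ebbinghaus forgetting curve value with another weight value related to q.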
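The history-based novelty factor can be sketched as below. The first value of 1 comes from the example in the text; the exponential retention function is only a commonly used approximation of a forgetting curve and is an assumption here, as are the names `retention`, `strength`, and `novelty_factor`. The text does not specify which curve values are used.

```python
import math

FIRST_VALUE = 1.0  # novelty of an advertisement never recommended to the user

def retention(days, strength=5.0):
    """Hypothetical forgetting-curve value: fraction still remembered after `days`.

    Assumption: R = exp(-days / strength); `strength` is an illustrative parameter.
    """
    return math.exp(-days / strength)

def novelty_factor(days_since_recommended):
    """Return the novelty factor for one advertisement and one user.

    `days_since_recommended` is None if the advertisement was never recommended,
    otherwise the positive number of days q since the last recommendation.
    """
    if days_since_recommended is None:
        return FIRST_VALUE
    # Second value: first value minus the forgetting-curve value for q days.
    return FIRST_VALUE - retention(days_since_recommended)
```

Under this sketch an advertisement shown yesterday gets a low novelty factor, while one shown a month ago has been largely forgotten and is treated as almost new.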
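Optionally, as another embodiment, in step 130, for the k-th advertisement among the x advertisements, the similarity between the k-th advertisement and each of the other advertisements among the x advertisements may be determined. According to those similarities, the similarity ranking and the dissimilarity ranking corresponding to the k-th advertisement among the x advertisements may be determined. The similarity ranking and the dissimilarity ranking corresponding to the k-th advertisement may then be weighted to obtain the novelty factor corresponding to the k-th advertisement, where k is a positive integer from 1 to x.
Specifically, the novelty factor corresponding to each advertisement may be determined according to intra-list similarity, an evaluation metric of the domain classification system. For the x advertisements, the similarity between each pair of advertisements may be determined, for example with the cosine similarity algorithm or the Pearson similarity algorithm. For each advertisement, its similarities to the other advertisements can then be used to determine its similarity ranking RS and dissimilarity ranking NRS among the x advertisements. The two rankings may then be weighted to obtain the novelty factor corresponding to the advertisement. For example, novelty factor = W*RS + (1-W)*NRS, where W is a weight value.
This embodiment can improve the accuracy of the novelty factor, and thus the novelty of the advertisements recommended to the user and the user experience.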
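Optionally, as another embodiment, in step 130, for the k-th advertisement among the x advertisements, the diversity distance between the k-th advertisement and each of the other advertisements among the x advertisements is determined, and the novelty factor corresponding to the k-th advertisement is determined according to those diversity distances, where k is a positive integer from 1 to x.
Specifically, the novelty factors corresponding to the x advertisements may be determined based on the principle of recommendation diversity. For the x advertisements, the diversity distance between each pair of advertisements may be determined, for example using the Jaccard diversity distance.
For each advertisement, the diversity distance between it and every other advertisement can thus be calculated, and its novelty factor is determined from those distances. For example, the diversity distances between the advertisement and all other advertisements may be summed to obtain its novelty factor. This embodiment can improve the accuracy of the novelty factor, and thus the novelty of the advertisements recommended to the user and the user experience.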
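The diversity-distance variant can be sketched as follows, assuming each advertisement is represented by a set of keywords (the text does not say which features are compared) and using the standard Jaccard distance, 1 minus the ratio of intersection size to union size. The function names and the sample data are illustrative only.

```python
def jaccard_distance(a, b):
    """Jaccard distance between two sets: 1 - |a & b| / |a | b|."""
    union = a | b
    if not union:
        return 0.0  # two empty sets are treated as identical
    return 1.0 - len(a & b) / len(union)

def diversity_novelty(ads):
    """Sum each advertisement's Jaccard distance to every other advertisement.

    `ads` maps an advertisement id to its (assumed) keyword set.
    """
    return {
        ad_id: sum(
            jaccard_distance(features, other_features)
            for other_id, other_features in ads.items()
            if other_id != ad_id
        )
        for ad_id, features in ads.items()
    }

ads = {
    "a": {"car", "suv"},
    "b": {"car", "sedan"},
    "c": {"tea"},
}
novelty = diversity_novelty(ads)
```

In this sample, advertisement "c" shares no keywords with the others and therefore receives the largest summed distance, which is the behaviour the diversity-based novelty factor is meant to capture.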
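Optionally, as another embodiment, in step 140, the click probability and the novelty factor corresponding to each of the x advertisements may be weighted to determine a score for each of the x advertisements. The x advertisements may be sorted in descending order of score to obtain the sorted x advertisements, and the top p advertisements among the sorted x advertisements may be determined as the p advertisements to be recommended to the i-th user.
Specifically, a weighting algorithm may be used to weight the click probability and the novelty factor to obtain the score of each advertisement. For example, for each advertisement, weights may be assigned to its click probability and its novelty factor, and the two may be weighted with the assigned weights to obtain the advertisement's score. The x advertisements may then be sorted in descending order of score, and the top p of the sorted advertisements taken as the advertisements to be recommended to the i-th user. Both click probability and novelty are thus considered when choosing the advertisements to recommend to the i-th user, which can increase the click-through rate and improve the user experience.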
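The weighted-score selection can be sketched as below. The weight of 0.7 on the click probability is a hypothetical value; the only constraint taken from the description is that the two weights of an advertisement sum to 1. The function name and sample numbers are illustrative.

```python
def recommend_by_score(click_prob, novelty, p, w_click=0.7):
    """Rank advertisements by a weighted score and return the top p ids.

    `click_prob` and `novelty` map advertisement id -> value.
    `w_click` is an assumed weight; the novelty weight is 1 - w_click.
    """
    scores = {
        ad: w_click * click_prob[ad] + (1.0 - w_click) * novelty[ad]
        for ad in click_prob
    }
    ranked = sorted(scores, key=scores.get, reverse=True)  # descending score
    return ranked[:p]

click_prob = {"a": 0.9, "b": 0.8, "c": 0.3}
novelty = {"a": 0.1, "b": 1.0, "c": 1.0}
top = recommend_by_score(click_prob, novelty, p=2)
```

Here advertisement "b" overtakes "a" because it is nearly as likely to be clicked and far more novel, while "c" is novel but too unlikely to be clicked.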
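Optionally, as another embodiment, in step 140, the x advertisements may be sorted in descending order of click probability to obtain the sorted x advertisements. The top q advertisements among the sorted x advertisements may then be sorted in descending order of novelty factor to obtain the re-sorted q advertisements, where q is a positive integer and q is greater than p. The top p advertisements among the re-sorted q advertisements may be determined as the p advertisements to be recommended to the i-th user.
For example, the advertisement recommendation list may be obtained with this funnel-shaped filtering and weighting approach. Preferably, q is twice p. Both click probability and novelty are thus considered when choosing the advertisements to recommend to the i-th user, which can increase the click-through rate and improve the user experience.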
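The funnel-shaped selection can be sketched as a two-stage sort. The default q = 2p follows the preference stated in the text; the function name and the sample data are assumptions made for illustration.

```python
def recommend_by_funnel(click_prob, novelty, p, q=None):
    """Two-stage selection: shortlist by click probability, re-rank by novelty.

    `click_prob` and `novelty` map advertisement id -> value; q defaults to 2 * p.
    """
    if q is None:
        q = 2 * p
    # Stage 1: keep the q advertisements with the highest click probability.
    by_click = sorted(click_prob, key=click_prob.get, reverse=True)
    shortlist = by_click[:q]
    # Stage 2: within the shortlist, order by novelty factor, descending.
    reranked = sorted(shortlist, key=lambda ad: novelty[ad], reverse=True)
    return reranked[:p]

click_prob = {"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.6, "e": 0.1}
novelty = {"a": 0.2, "b": 0.9, "c": 0.5, "d": 1.0, "e": 1.0}
top = recommend_by_funnel(click_prob, novelty, p=2)
```

Advertisement "e" is as novel as "d" but never reaches the second stage, because its click probability keeps it out of the top-q shortlist.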
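Optionally, as another embodiment, in step 110, the webpage access information and the advertisement click information may be obtained from the users' internet access logs in real time. The advertisement click information may include the users' click information for the p recommended advertisements. That is, users' clicks on the p recommended advertisements are fed back in real time, so the click probabilities can be adjusted adaptively using real-time information, which further improves the accuracy of the click probability prediction.
The process of the embodiments of the present invention is described in detail below with a specific example. It should be understood that the example is intended only to help a person skilled in the art better understand the embodiments of the present invention, not to limit their scope.
FIG. 2 is a schematic flowchart of the process of a method for recommending advertisements according to an embodiment of the present invention.
201: Obtain webpage access information and advertisement click information from the users' internet access logs, where the webpage access information indicates n webpages visited by m users, the advertisement click information indicates x advertisements clicked by the m users on the n webpages, and n, m, and x are positive integers greater than 1.
202: Generate a user-webpage access matrix, a user-advertisement click matrix, and an advertisement-webpage relevance matrix according to the webpage access information and the advertisement click information.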
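(I) User-webpage access matrix
B may denote the user-webpage access matrix. An element $b_{ij}$ of B ($b_{ij} \in [0,1]$) represents the access record of user $u_i$ for webpage $w_j$ and can also be regarded as the degree of interest of user $u_i$ in webpage $w_j$. The more times a user browses a webpage, the more interested the user can be taken to be in its content. $b_{ij}$ can be calculated by formula (1):
$b_{ij} = g(f(u_i, w_j))$   (1)
where $g(\cdot)$ is the logistic function, used for normalization, and $f(u_i, w_j)$ denotes the number of times user $u_i$ has browsed webpage $w_j$.
(II) User-advertisement click matrix
C may denote the user-advertisement click matrix. An element $c_{ik}$ of C represents the degree of interest of user $u_i$ in advertisement $a_k$. A user clicking an advertisement indicates that the user is interested in it. $c_{ik}$ can be obtained by formula (2):
$c_{ik} = g(f(u_i, a_k))$   (2)
where $f(u_i, a_k)$ denotes the number of times user $u_i$ has clicked advertisement $a_k$.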
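Formulas (1) and (2) can be sketched as below for a sparse table of counts. Storing a separate 0/1 `observed` mask alongside the values is an assumption made here so that pairs with no recorded visit or click can later be excluded from the factorization; the function and variable names are illustrative.

```python
import math

def logistic(z):
    """Logistic function g(z) = 1 / (1 + e^(-z)), used for normalization."""
    return 1.0 / (1.0 + math.exp(-z))

def normalized_matrix(counts, n_rows, n_cols):
    """Build a normalized matrix from sparse counts.

    `counts` maps (row, col) -> number of visits or clicks.
    Returns (values, observed): values[r][c] = g(count) for recorded pairs,
    observed[r][c] = 1 where a positive count exists and 0 elsewhere.
    """
    values = [[0.0] * n_cols for _ in range(n_rows)]
    observed = [[0] * n_cols for _ in range(n_rows)]
    for (r, c), count in counts.items():
        if count > 0:
            values[r][c] = logistic(count)
            observed[r][c] = 1
    return values, observed

# User 0 visited webpage 1 three times; user 1 visited webpage 0 once.
B, I_B = normalized_matrix({(0, 1): 3, (1, 0): 1}, n_rows=2, n_cols=2)
```

The same helper can build the user-advertisement click matrix C by passing click counts keyed by (user, advertisement).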
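(III) Advertisement-webpage relevance matrix
R may denote the advertisement-webpage relevance matrix. An element $r_{jk}$ of R represents the degree of association between webpage $w_j$ and advertisement $a_k$. The same advertisement has different click-through rates when displayed on different webpages, and the more relevant the content of an advertisement is to a webpage, the more likely the advertisement is to be clicked. Here the advertisement-webpage relevance matrix is determined by combining the click-through rate of the advertisement when the webpage and the advertisement appear together with the similarity between the webpage and the advertisement, which can improve the accuracy of the matrix.
$r_{jk}$ can be obtained by formula (3):
$r_{jk} = \alpha d_{jk} + (1-\alpha) h_{jk}$   (3)
where $d_{jk}$ may denote the similarity between webpage $w_j$ and advertisement $a_k$, and $h_{jk}$ denotes the click-through rate of advertisement $a_k$ on webpage $w_j$.
$d_{jk}$ can be obtained with the probabilistic latent semantic analysis (PLSA) method or the latent Dirichlet allocation (LDA) algorithm.
$h_{jk}$ can be equal to the number of times advertisement $a_k$ was clicked on webpage $w_j$ divided by the total number of times advertisement $a_k$ was delivered on webpage $w_j$.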
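Formula (3) can be sketched directly. The similarity is assumed to be supplied by some topic model, the value of alpha is hypothetical, and treating an advertisement with no impressions as having a click-through rate of 0 is a choice made for this example only.

```python
def relevance(similarity, clicks, impressions, alpha=0.5):
    """Relevance r_jk = alpha * d_jk + (1 - alpha) * h_jk.

    `similarity` is d_jk, assumed to be precomputed (for example by a topic model).
    h_jk is clicks / impressions; with no impressions it is taken as 0 (assumption).
    """
    ctr = clicks / impressions if impressions > 0 else 0.0
    return alpha * similarity + (1.0 - alpha) * ctr

r = relevance(similarity=0.8, clicks=5, impressions=100, alpha=0.5)
```

With a similarity of 0.8 and 5 clicks out of 100 impressions, the relevance is 0.5 * 0.8 + 0.5 * 0.05.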
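203: Determine, according to the user-webpage access matrix, the user-advertisement click matrix, and the advertisement-webpage relevance matrix, the user latent feature vector of user $u_i$, the webpage latent feature vector of webpage $w_j$, and the advertisement latent feature vector of each of the x advertisements.
A user's webpage visit history and advertisement click history both reflect the user's interests or preferences, and the advertisement click-through rate is closely related to both the user's interests and the advertisement-webpage relevance. In this embodiment, the AdRec model is used to combine user interest with advertisement-webpage relevance.
The description below uses advertisement $a_k$ among the x advertisements as an example. It should be understood that $a_k$ may be any one of the x advertisements.
Specifically, the three latent feature vectors may be determined based on the AdRec model. FIG. 3 is a schematic diagram of the AdRec model according to an embodiment of the present invention. As shown in FIG. 3, the user-webpage access matrix and the user-advertisement click matrix share the user latent feature vector $U_i$, and the user-advertisement click matrix and the advertisement-webpage relevance matrix share the advertisement latent feature vector $A_k$.
The AdRec model is based on the following assumptions: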
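(I) Assume that the priors of $U_i$, $W_j$, and $A_k$ follow normal distributions and are mutually independent, that is:
[Equations (4) to (6), the prior distributions of $U_i$, $W_j$, and $A_k$: equation images PCTCN2015072573-appb-000001 to -000003]
(II) Given the user latent feature vector $U_i$ of user $u_i$ and the webpage latent feature vector $W_j$ of webpage $w_j$ (both $U_i$ and $W_j$ have dimension $l$), $b_{ij}$ follows a normal distribution with mean $g(U_i^T W_j)$ and variance $\sigma_B^2$, and the entries are mutually independent. The conditional probability distribution of the user-webpage access matrix B is as follows:
$$p(B \mid U, W, \sigma_B^2) = \prod_{i=1}^{m} \prod_{j=1}^{n} \left[ \mathcal{N}\left(b_{ij} \mid g(U_i^T W_j), \sigma_B^2\right) \right]^{I_{ij}^B} \qquad (7)$$
where $I_{ij}^B$ is an indicator function and $g(\cdot)$ is the logistic function.
When user $u_i$ has visited webpage $w_j$, $I_{ij}^B = 1$; otherwise, $I_{ij}^B = 0$.
$g(\cdot)$ takes the form $g(z) = 1/(1+e^{-z})$ and is used to map $U_i^T W_j$ to [0,1]. Because the UPMF algorithm introduces a probabilistic view, the value of every matrix element should lie in [0,1].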
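(III) $c_{ik}$ follows a normal distribution with mean $g(U_i^T A_k)$ and variance $\sigma_C^2$, and the entries are mutually independent. The conditional probability distribution of the user-advertisement click matrix C is as follows:
$$p(C \mid U, A, \sigma_C^2) = \prod_{i=1}^{m} \prod_{k=1}^{x} \left[ \mathcal{N}\left(c_{ik} \mid g(U_i^T A_k), \sigma_C^2\right) \right]^{I_{ik}^C} \qquad (8)$$
where $I_{ik}^C$ is an indicator function and $g(\cdot)$ is the logistic function.
When user $u_i$ has clicked advertisement $a_k$, $I_{ik}^C = 1$; otherwise, $I_{ik}^C = 0$.
$g(\cdot)$ takes the form described above and is used to map the value of $U_i^T A_k$ to [0,1].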
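(IV) $r_{jk}$ follows a normal distribution with mean $g(W_j^T A_k)$ and variance $\sigma_R^2$, and the entries are mutually independent. The conditional probability distribution of the advertisement-webpage relevance matrix R is as follows:
$$p(R \mid W, A, \sigma_R^2) = \prod_{j=1}^{n} \prod_{k=1}^{x} \left[ \mathcal{N}\left(r_{jk} \mid g(W_j^T A_k), \sigma_R^2\right) \right]^{I_{jk}^R} \qquad (9)$$
where $I_{jk}^R$ is an indicator function and $g(\cdot)$ is the logistic function.
When webpage $w_j$ is associated with advertisement $a_k$, that is, when $r_{jk}$ is greater than 0, $I_{jk}^R = 1$; otherwise, $I_{jk}^R = 0$.
$g(\cdot)$ takes the form described above and is used to map the value of $W_j^T A_k$ to [0,1].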
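(V) From equations (4) to (9), the posterior distribution of U, W, and A can be derived. The logarithm of the posterior distribution is as follows:
[Equation (10), the log-posterior: equation image PCTCN2015072573-appb-000022]
where T is a constant. Equation (10) can be treated as an unconstrained optimization problem, and equation (11) is equivalent to equation (10).
[Equation (11), the objective function E, with the definitions of its parameters: equation images PCTCN2015072573-appb-000023 and -000024]
A local minimum of equation (11) can be found by gradient descent. The gradient descent formulas for $U_i$, $W_j$, and $A_k$ are as follows:
[Equations (12) to (14), the gradients with respect to $U_i$, $W_j$, and $A_k$: equation images PCTCN2015072573-appb-000025 to -000027]
$U_i$, $W_j$, and $A_k$ can be obtained from formulas (12) to (14).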
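(VI) Time complexity analysis
The computational cost of gradient descent comes mainly from evaluating the objective function E and the corresponding gradient formulas. Because the matrices B, C, and R are sparse, the time complexity of the objective function in equation (10) can be $O(n_B l + n_C l + n_R l)$, where $n_B$, $n_C$, and $n_R$ denote the numbers of non-zero elements in B, C, and R, respectively.
The time complexity of equations (12) to (14) can be derived in the same way. The total time complexity of each iteration is therefore $O(n_B l + n_C l + n_R l)$; that is, the time complexity grows linearly with the number of observed entries in the three sparse matrices. The embodiments of the present invention can therefore be applied to large-scale data processing.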
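The joint factorization and its per-iteration cost can be sketched as follows. This is a minimal sketch that assumes a standard probabilistic-matrix-factorization objective (squared error on observed entries through the logistic function, plus an L2 penalty with a single hypothetical coefficient `lam`); it is not a reproduction of equations (10) to (14). Each iteration touches every observed entry once and does O(l) work for it, which matches the linear complexity stated above.

```python
import math
import random

def g(z):
    return 1.0 / (1.0 + math.exp(-z))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def init(rows, l, rng):
    # Small random starting vectors (assumed initialization).
    return [[rng.gauss(0.0, 0.1) for _ in range(l)] for _ in range(rows)]

def factorize(B, C, R, m, n, x, l=4, lam=0.01, lr=0.1, iters=300, seed=0):
    """Jointly factorize three sparse matrices given as {(row, col): value}.

    B: user-webpage, C: user-advertisement, R: webpage-advertisement.
    Returns (U, W, A, history) where history holds the objective per iteration.
    """
    rng = random.Random(seed)
    U, W, A = init(m, l, rng), init(n, l, rng), init(x, l, rng)
    pairs = [(B, U, W), (C, U, A), (R, W, A)]  # U and A are shared factors
    history = []
    for _ in range(iters):
        # Start each gradient with the L2 regularization term.
        grads = {id(M): [[lam * v for v in row] for row in M] for M in (U, W, A)}
        total = 0.5 * lam * sum(v * v for M in (U, W, A) for row in M for v in row)
        for entries, P, Q in pairs:
            gP, gQ = grads[id(P)], grads[id(Q)]
            for (r, c), target in entries.items():  # observed entries only
                pred = g(dot(P[r], Q[c]))
                err = pred - target
                total += 0.5 * err * err
                coeff = err * pred * (1.0 - pred)  # chain rule through g
                for d in range(l):
                    gP[r][d] += coeff * Q[c][d]
                    gQ[c][d] += coeff * P[r][d]
        history.append(total)
        for M in (U, W, A):
            for row, grow in zip(M, grads[id(M)]):
                for d in range(l):
                    row[d] -= lr * grow[d]
    return U, W, A, history

B = {(0, 0): 0.95, (0, 1): 0.73, (1, 1): 0.88}
C = {(0, 0): 0.73, (1, 1): 0.73}
R = {(0, 0): 0.6, (1, 1): 0.4, (0, 1): 0.1}
U, W, A, history = factorize(B, C, R, m=2, n=2, x=2)
```

Because the user vectors appear in both the B and C terms and the advertisement vectors in both the C and R terms, information from one sparse matrix can inform predictions about another, which is the sharing described for the AdRec model.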
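The advertisement latent feature vector of each of the x advertisements can be obtained by the foregoing process.
204: Predict, according to the user latent feature vector of user $u_i$, the webpage latent feature vector of webpage $w_j$, and the advertisement latent feature vector of each of the x advertisements, the click probabilities of the x advertisements when user $u_i$ visits webpage $w_j$.
The description below again uses advertisement $a_k$ as an example.
When user $u_i$ visits webpage $w_j$, the click probability of advertisement $a_k$ can be represented by a real number and obtained according to equation (15):
[Equation (15) and its symbols: equation images PCTCN2015072573-appb-000028 to -000034]
where $h(\cdot)$ is a function of three parameters: the first may represent the degree of interest of user $u_i$ in webpage $w_j$, the second may represent the degree of interest of user $u_i$ in advertisement $a_k$, and the third may represent the degree of association between advertisement $a_k$ and webpage $w_j$.
According to equation (15), the click probabilities of the x advertisements when user $u_i$ visits webpage $w_j$ can be obtained.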
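205: Determine the novelty factor corresponding to each of the x advertisements according to the historical recommendation information of the x advertisements.
The description below again uses advertisement $a_k$ as an example.
The novelty factor corresponding to advertisement $a_k$ can be determined according to equation (16):
[Equation (16) and the novelty factor symbol: equation images PCTCN2015072573-appb-000035 and -000036]
where q is a positive integer. The Ebbinghaus forgetting curve value corresponding to q can be obtained from the value of q.
In this way, the novelty factor corresponding to each of the x advertisements can be obtained according to equation (16).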
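206: Weight the click probabilities of the x advertisements and the novelty factors corresponding to the x advertisements to obtain a score for each of the x advertisements.
For example, weights may be assigned to each advertisement's click probability and novelty factor, and the two may be weighted with the assigned weights to obtain the advertisement's score. For each advertisement, the weight of its click probability and the weight of its novelty factor sum to 1.
207: Sort the x advertisements in descending order of score to obtain the sorted x advertisements.
208: When user $u_i$ visits webpage $w_j$, recommend the top p advertisements among the sorted x advertisements to user $u_i$, where p is a positive integer.
Specifically, when user $u_i$ visits webpage $w_j$, information about the p advertisements may be presented on webpage $w_j$.
In addition, after the click probabilities of the x advertisements and the novelty factors corresponding to the x advertisements are obtained, the p advertisements to be recommended to user $u_i$ may be determined in ways other than steps 206 and 207. For example, they may be obtained with a funnel-shaped filtering and weighting approach. Specifically, the x advertisements may be sorted in descending order of click probability to obtain the sorted x advertisements. The top q advertisements among the sorted x advertisements may then be re-sorted in descending order of novelty factor to obtain the re-sorted q advertisements, and the top p advertisements among the re-sorted q advertisements may be recommended to user $u_i$. For example, q may be twice p.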
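In the embodiments of the present invention, the click probabilities of the x advertisements when the i-th user visits the j-th webpage are predicted according to the webpage access information and the advertisement click information, the novelty factor corresponding to each of the x advertisements is determined according to the historical recommendation information, and the p advertisements to be recommended to the i-th user are determined among the x advertisements according to the click probabilities and the novelty factors of the x advertisements, where the i-th user's degree of awareness of the p advertisements is lower than that of the other advertisements among the x advertisements, and the click probabilities of the p advertisements are higher than those of the other advertisements among the x advertisements. Because information about users, webpages, and advertisements is jointly considered when predicting click probabilities, the accuracy of the click probability prediction can be improved; and because the novelty of advertisements is considered, recommending the same type of advertisement to a user for a long time without regard to the user's potential interests can be avoided. The click-through rate of advertisements can therefore be increased, which in turn improves the user experience.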
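FIG. 4 is a schematic block diagram of an advertisement recommendation server according to an embodiment of the present invention. The advertisement recommendation server 400 of FIG. 4 includes an obtaining unit 410, a prediction unit 420, a determining unit 430, and a selection unit 440.
The obtaining unit 410 obtains webpage access information and advertisement click information from the users' internet access logs, where the webpage access information indicates n webpages visited by m users, the advertisement click information indicates x advertisements clicked by the m users on the n webpages, and n, m, and x are positive integers greater than 1. The prediction unit 420 predicts, according to the webpage access information and the advertisement click information, the click probabilities of the x advertisements when the i-th user of the m users visits the j-th webpage, where i is a positive integer from 1 to m and j is a positive integer from 1 to n. The determining unit 430 determines the novelty factor corresponding to each of the x advertisements, where the novelty factor of each advertisement indicates the i-th user's degree of awareness of that advertisement. The selection unit 440 determines, according to the click probabilities of the x advertisements and the novelty factors corresponding to the x advertisements, p advertisements among the x advertisements to be recommended to the i-th user, where the i-th user's degree of awareness of the p advertisements is lower than that of the other advertisements among the x advertisements, the click probabilities of the p advertisements are higher than those of the other advertisements among the x advertisements, and p is a positive integer with p ≤ x.
In the embodiments of the present invention, the click probabilities of the x advertisements when the i-th user visits the j-th webpage are predicted according to the webpage access information and the advertisement click information, the novelty factor corresponding to each of the x advertisements is determined according to the historical recommendation information, and the p advertisements to be recommended to the i-th user are determined among the x advertisements according to the click probabilities and the novelty factors of the x advertisements, where the i-th user's degree of awareness of the p advertisements is lower than that of the other advertisements among the x advertisements, and the click probabilities of the p advertisements are higher than those of the other advertisements among the x advertisements. Because information about users, webpages, and advertisements is jointly considered when predicting click probabilities, the accuracy of the click probability prediction can be improved; and because the novelty of advertisements is considered, recommending the same type of advertisement to a user for a long time without regard to the user's potential interests can be avoided. The click-through rate of advertisements can therefore be increased, which in turn improves the user experience.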
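Optionally, as an embodiment, the determining unit 430 may determine the novelty factor corresponding to each of the x advertisements according to historical recommendation information, where the historical recommendation information indicates the history of recommending each of the x advertisements to the i-th user.
Optionally, as another embodiment, for the k-th advertisement among the x advertisements, if the historical recommendation information indicates that the k-th advertisement has not been recommended to the i-th user, the determining unit 430 may determine that the novelty factor corresponding to the k-th advertisement is a first value. If the historical recommendation information indicates that the k-th advertisement has been recommended to the i-th user in the past, the determining unit 430 determines that the novelty factor corresponding to the k-th advertisement is a second value.
The first value is greater than the second value, and k is a positive integer from 1 to x.
Optionally, as another embodiment, the determining unit 430 may determine that the k-th advertisement was recommended to the i-th user q days ago, where q is a positive integer. The determining unit 430 may determine the Ebbinghaus forgetting curve value corresponding to q days, and may determine that the novelty factor corresponding to the k-th advertisement is the difference between the first value and the Ebbinghaus forgetting curve value.
Optionally, as another embodiment, for the k-th advertisement among the x advertisements, the determining unit 430 may determine the similarity between the k-th advertisement and each of the other advertisements among the x advertisements. The determining unit 430 may determine, according to those similarities, the similarity ranking and the dissimilarity ranking corresponding to the k-th advertisement among the x advertisements, and may weight the two rankings to obtain the novelty factor corresponding to the k-th advertisement, where k is a positive integer from 1 to x.
Optionally, as another embodiment, for the k-th advertisement among the x advertisements, the determining unit 430 may determine the diversity distance between the k-th advertisement and each of the other advertisements among the x advertisements, and may determine the novelty factor corresponding to the k-th advertisement according to those diversity distances, where k is a positive integer from 1 to x.
Optionally, as another embodiment, the selection unit 440 may weight the click probability and the novelty factor corresponding to each of the x advertisements to determine a score for each of the x advertisements, and may sort the x advertisements in descending order of score to obtain the sorted x advertisements. The selection unit 440 may then determine the top p advertisements among the sorted x advertisements as the p advertisements to be recommended to the i-th user.
Optionally, as another embodiment, the selection unit 440 may sort the x advertisements in descending order of click probability to obtain the sorted x advertisements. The selection unit 440 may sort the top q advertisements among the sorted x advertisements in descending order of novelty factor to obtain the re-sorted q advertisements, where q is a positive integer and q is greater than p. The selection unit 440 may then determine the top p advertisements among the re-sorted q advertisements as the p advertisements to be recommended to the i-th user.
Optionally, as another embodiment, the prediction unit 420 may generate, according to the webpage access information and the advertisement click information, a user-webpage access matrix, a user-advertisement click matrix, and an advertisement-webpage relevance matrix, where the element in row i, column j of the user-webpage access matrix represents the access record of the i-th user for the j-th webpage, the element in row i, column k of the user-advertisement click matrix represents the click record of the i-th user for the k-th advertisement, the element in row j, column k of the advertisement-webpage relevance matrix represents the degree of association between the j-th webpage and the k-th advertisement, and k is a positive integer from 1 to x. The prediction unit 420 may perform unified probabilistic matrix factorization on the three matrices to obtain the user latent feature vector of the i-th user, the webpage latent feature vector of the j-th webpage, and the advertisement latent feature vector of the k-th advertisement. The prediction unit 420 may then determine, according to these three latent feature vectors, the click probability of the k-th advertisement when the i-th user visits the j-th webpage.
For the other functions and operations of the advertisement recommendation server 400 of FIG. 4, reference may be made to the processes of the method embodiments of FIG. 1 to FIG. 3 above. To avoid repetition, details are not described here again.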
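FIG. 5 is a schematic block diagram of an advertisement recommendation server according to an embodiment of the present invention. The advertisement recommendation server 500 of FIG. 5 may include a memory 510 and a processor 520.
The memory 510 may include a random access memory, a flash memory, a read-only memory, a programmable read-only memory, a non-volatile memory, a register, or the like. The processor 520 may be a central processing unit (CPU).
The memory 510 is configured to store executable instructions. The processor 520 may execute the executable instructions stored in the memory 510 to: obtain webpage access information and advertisement click information from the users' internet access logs, where the webpage access information indicates n webpages visited by m users, the advertisement click information indicates x advertisements clicked by the m users on the n webpages, and n, m, and x are positive integers greater than 1; predict, according to the webpage access information and the advertisement click information, the click probabilities of the x advertisements when the i-th user of the m users visits the j-th webpage, where i is a positive integer from 1 to m and j is a positive integer from 1 to n; determine the novelty factor corresponding to each of the x advertisements, where the novelty factor of each advertisement indicates the i-th user's degree of awareness of that advertisement; and determine, according to the click probabilities of the x advertisements and the novelty factors corresponding to the x advertisements, p advertisements among the x advertisements to be recommended to the i-th user, where the i-th user's degree of awareness of the p advertisements is lower than that of the other advertisements among the x advertisements, the click probabilities of the p advertisements are higher than those of the other advertisements among the x advertisements, and p is a positive integer with p ≤ x.
In the embodiments of the present invention, the click probabilities of the x advertisements when the i-th user visits the j-th webpage are predicted according to the webpage access information and the advertisement click information, the novelty factor corresponding to each of the x advertisements is determined according to the historical recommendation information, and the p advertisements to be recommended to the i-th user are determined among the x advertisements according to the click probabilities and the novelty factors of the x advertisements, where the i-th user's degree of awareness of the p advertisements is lower than that of the other advertisements among the x advertisements, and the click probabilities of the p advertisements are higher than those of the other advertisements among the x advertisements. Because information about users, webpages, and advertisements is jointly considered when predicting click probabilities, the accuracy of the click probability prediction can be improved; and because the novelty of advertisements is considered, recommending the same type of advertisement to a user for a long time without regard to the user's potential interests can be avoided. The click-through rate of advertisements can therefore be increased, which in turn improves the user experience.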
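Optionally, as an embodiment, the processor 520 may determine the novelty factor corresponding to each of the x advertisements according to historical recommendation information, where the historical recommendation information indicates the history of recommending each of the x advertisements to the i-th user.
Optionally, as another embodiment, for the k-th advertisement among the x advertisements, if the historical recommendation information indicates that the k-th advertisement has not been recommended to the i-th user, the processor 520 may determine that the novelty factor corresponding to the k-th advertisement is a first value. If the historical recommendation information indicates that the k-th advertisement has been recommended to the i-th user in the past, the processor 520 determines that the novelty factor corresponding to the k-th advertisement is a second value.
The first value is greater than the second value, and k is a positive integer from 1 to x.
Optionally, as another embodiment, the processor 520 may determine that the k-th advertisement was recommended to the i-th user q days ago, where q is a positive integer. The processor 520 may determine the Ebbinghaus forgetting curve value corresponding to q days, and may determine that the novelty factor corresponding to the k-th advertisement is the difference between the first value and the Ebbinghaus forgetting curve value.
Optionally, as another embodiment, for the k-th advertisement among the x advertisements, the processor 520 may determine the similarity between the k-th advertisement and each of the other advertisements among the x advertisements. The processor 520 may determine, according to those similarities, the similarity ranking and the dissimilarity ranking corresponding to the k-th advertisement among the x advertisements, and may weight the two rankings to obtain the novelty factor corresponding to the k-th advertisement, where k is a positive integer from 1 to x.
Optionally, as another embodiment, for the k-th advertisement among the x advertisements, the processor 520 may determine the diversity distance between the k-th advertisement and each of the other advertisements among the x advertisements, and may determine the novelty factor corresponding to the k-th advertisement according to those diversity distances, where k is a positive integer from 1 to x.
Optionally, as another embodiment, the processor 520 may weight the click probability and the novelty factor corresponding to each of the x advertisements to determine a score for each of the x advertisements, and may sort the x advertisements in descending order of score to obtain the sorted x advertisements. The processor 520 may then determine the top p advertisements among the sorted x advertisements as the p advertisements to be recommended to the i-th user.
Optionally, as another embodiment, the processor 520 may sort the x advertisements in descending order of click probability to obtain the sorted x advertisements. The processor 520 may sort the top q advertisements among the sorted x advertisements in descending order of novelty factor to obtain the re-sorted q advertisements, where q is a positive integer and q is greater than p. The processor 520 may determine the top p advertisements among the re-sorted q advertisements as the p advertisements to be recommended to the i-th user.
Optionally, as another embodiment, the processor 520 may generate, according to the webpage access information and the advertisement click information, a user-webpage access matrix, a user-advertisement click matrix, and an advertisement-webpage relevance matrix, where the element in row i, column j of the user-webpage access matrix represents the access record of the i-th user for the j-th webpage, the element in row i, column k of the user-advertisement click matrix represents the click record of the i-th user for the k-th advertisement, the element in row j, column k of the advertisement-webpage relevance matrix represents the degree of association between the j-th webpage and the k-th advertisement, and k is a positive integer from 1 to x. The processor 520 may perform unified probabilistic matrix factorization on the three matrices to obtain the user latent feature vector of the i-th user, the webpage latent feature vector of the j-th webpage, and the advertisement latent feature vector of the k-th advertisement. The processor 520 may then determine, according to these three latent feature vectors, the click probability of the k-th advertisement when the i-th user visits the j-th webpage.
For the other functions and operations of the advertisement recommendation server 500 of FIG. 5, reference may be made to the processes of the method embodiments of FIG. 1 to FIG. 3 above. To avoid repetition, details are not described here again.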
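FIG. 6 is a schematic block diagram of an advertisement recommendation system according to an embodiment of the present invention. The advertisement recommendation system 600 of FIG. 6 includes an advertisement recommendation server 610 and user equipment (UE) 620.
The UE 620 may be any form of terminal that can access the internet, such as a desktop computer, a tablet computer, or a mobile phone.
The advertisement recommendation server 610 may recommend advertisements to the UE 620.
Specifically, the advertisement recommendation server 610 may include a memory 610a and a processor 610b.
The memory 610a is configured to store executable instructions. The processor 610b may execute the executable instructions stored in the memory 610a to: obtain webpage access information and advertisement click information from the users' internet access logs, where the webpage access information indicates n webpages visited by m users, the advertisement click information indicates x advertisements clicked by the m users on the n webpages, and n, m, and x are positive integers greater than 1; predict, according to the webpage access information and the advertisement click information, the click probabilities of the x advertisements when the i-th user of the m users visits the j-th webpage, where i is a positive integer from 1 to m and j is a positive integer from 1 to n; determine the novelty factor corresponding to each of the x advertisements, where the novelty factor of each advertisement indicates the i-th user's degree of awareness of that advertisement; and determine, according to the click probabilities of the x advertisements and the novelty factors corresponding to the x advertisements, p advertisements among the x advertisements to be recommended to the i-th user, where the i-th user's degree of awareness of the p advertisements is lower than that of the other advertisements among the x advertisements, the click probabilities of the p advertisements are higher than those of the other advertisements among the x advertisements, and p is a positive integer with p ≤ x.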
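Optionally, as an embodiment, the processor 610b may determine the novelty factor corresponding to each of the x advertisements according to historical recommendation information, where the historical recommendation information indicates the history of recommending each of the x advertisements to the i-th user.
Optionally, as an embodiment, for the k-th advertisement among the x advertisements, if the historical recommendation information indicates that the k-th advertisement has not been recommended to the i-th user, the processor 610b may determine that the novelty factor corresponding to the k-th advertisement is a first value. If the historical recommendation information indicates that the k-th advertisement has been recommended to the i-th user in the past, the processor 610b determines that the novelty factor corresponding to the k-th advertisement is a second value.
The first value is greater than the second value, and k is a positive integer from 1 to x.
Optionally, as another embodiment, the processor 610b may determine that the k-th advertisement was recommended to the i-th user q days ago, where q is a positive integer. The processor 610b may determine the Ebbinghaus forgetting curve value corresponding to q days, and may determine that the novelty factor corresponding to the k-th advertisement is the difference between the first value and the Ebbinghaus forgetting curve value.
Optionally, as another embodiment, for the k-th advertisement among the x advertisements, the processor 610b may determine the similarity between the k-th advertisement and each of the other advertisements among the x advertisements. The processor 610b may determine, according to those similarities, the similarity ranking and the dissimilarity ranking corresponding to the k-th advertisement among the x advertisements, and may weight the two rankings to obtain the novelty factor corresponding to the k-th advertisement, where k is a positive integer from 1 to x.
Optionally, as another embodiment, for the k-th advertisement among the x advertisements, the processor 610b may determine the diversity distance between the k-th advertisement and each of the other advertisements among the x advertisements, and may determine the novelty factor corresponding to the k-th advertisement according to those diversity distances, where k is a positive integer from 1 to x.
Optionally, as another embodiment, the processor 610b may weight the click probability and the novelty factor corresponding to each of the x advertisements to determine a score for each of the x advertisements, and may sort the x advertisements in descending order of score to obtain the sorted x advertisements. The processor 610b may then determine the top p advertisements among the sorted x advertisements as the p advertisements to be recommended to the i-th user.
Optionally, as another embodiment, the processor 610b may sort the x advertisements in descending order of click probability to obtain the sorted x advertisements. The processor 610b may sort the top q advertisements among the sorted x advertisements in descending order of novelty factor to obtain the re-sorted q advertisements, where q is a positive integer and q is greater than p. The processor 610b may determine the top p advertisements among the re-sorted q advertisements as the p advertisements to be recommended to the i-th user.
Optionally, as another embodiment, the processor 610b may generate, according to the webpage access information and the advertisement click information, a user-webpage access matrix, a user-advertisement click matrix, and an advertisement-webpage relevance matrix, where the element in row i, column j of the user-webpage access matrix represents the access record of the i-th user for the j-th webpage, the element in row i, column k of the user-advertisement click matrix represents the click record of the i-th user for the k-th advertisement, the element in row j, column k of the advertisement-webpage relevance matrix represents the degree of association between the j-th webpage and the k-th advertisement, and k is a positive integer from 1 to x. The processor 610b may perform unified probabilistic matrix factorization on the three matrices to obtain the user latent feature vector of the i-th user, the webpage latent feature vector of the j-th webpage, and the advertisement latent feature vector of the k-th advertisement. The processor 610b may then determine, according to these three latent feature vectors, the click probability of the k-th advertisement when the i-th user visits the j-th webpage.
In the embodiments of the present invention, the click probabilities of the x advertisements when the i-th user visits the j-th webpage are predicted according to the webpage access information and the advertisement click information, the novelty factor corresponding to each of the x advertisements is determined according to the historical recommendation information, and the p advertisements to be recommended to the i-th user are determined among the x advertisements according to the click probabilities and the novelty factors of the x advertisements, where the i-th user's degree of awareness of the p advertisements is lower than that of the other advertisements among the x advertisements, and the click probabilities of the p advertisements are higher than those of the other advertisements among the x advertisements. Because information about users, webpages, and advertisements is jointly considered when predicting click probabilities, the accuracy of the click probability prediction can be improved; and because the novelty of advertisements is considered, recommending the same type of advertisement to a user for a long time without regard to the user's potential interests can be avoided. The click-through rate of advertisements can therefore be increased, which in turn improves the user experience.
For the other functions and operations of the advertisement recommendation server 610, reference may be made to the processes of the method embodiments of FIG. 1 to FIG. 3 above. To avoid repetition, details are not described here again.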
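A person of ordinary skill in the art may be aware that the units and algorithm steps of the examples described with reference to the embodiments disclosed herein can be implemented by electronic hardware or by a combination of computer software and electronic hardware. Whether these functions are performed by hardware or software depends on the particular application and design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each particular application, but such implementation should not be considered beyond the scope of the present invention.
It may be clearly understood by a person skilled in the art that, for convenience and brevity of description, for the specific working processes of the foregoing system, apparatus, and unit, reference may be made to the corresponding processes in the foregoing method embodiments, and details are not described here again.
In the several embodiments provided in this application, it should be understood that the disclosed system, apparatus, and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative. The division into units is merely a division by logical function, and other divisions are possible in actual implementation; for instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the displayed or discussed mutual couplings, direct couplings, or communication connections may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit.
If the functions are implemented in the form of a software functional unit and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on such an understanding, the technical solution of the present invention essentially, or the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present invention. The foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The foregoing descriptions are merely specific implementations of the present invention, but the protection scope of the present invention is not limited thereto. Any variation or replacement readily conceivable by a person skilled in the art within the technical scope disclosed in the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.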

Claims (18)

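  1. An advertisement recommendation method, characterized by comprising:
    obtaining web page access information and advertisement click information from a log of users' Internet access, wherein the web page access information indicates n web pages accessed by m users, the advertisement click information indicates x advertisements clicked by the m users on the n web pages, and n, m, and x are all positive integers greater than 1;
    predicting, according to the web page access information and the advertisement click information, click probabilities of the x advertisements when an ith user of the m users accesses a jth web page, wherein i is a positive integer ranging from 1 to m, and j is a positive integer ranging from 1 to n;
    determining novelty factors respectively corresponding to the x advertisements, wherein the novelty factor corresponding to each of the x advertisements represents the ith user's degree of awareness of that advertisement; and
    determining, among the x advertisements and according to the click probabilities of the x advertisements and the novelty factors respectively corresponding to the x advertisements, p advertisements to be recommended to the ith user, wherein the ith user's awareness of the p advertisements is lower than the ith user's awareness of the advertisements other than the p advertisements among the x advertisements, the click probabilities of the p advertisements are higher than the click probabilities of the advertisements other than the p advertisements among the x advertisements, and p is a positive integer with p ≤ x.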
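  2. The method according to claim 1, wherein the determining of the novelty factors respectively corresponding to the x advertisements comprises:
    determining, according to historical recommendation information, the novelty factors respectively corresponding to the x advertisements, wherein the historical recommendation information indicates the history of recommending each of the x advertisements to the ith user.
  3. The method according to claim 2, wherein the determining, according to the historical recommendation information, of the novelty factors respectively corresponding to the x advertisements comprises:
    for a kth advertisement of the x advertisements,
    if the historical recommendation information indicates that the kth advertisement has not been recommended to the ith user, determining that the novelty factor corresponding to the kth advertisement is a first value; and
    if the historical recommendation information indicates that the kth advertisement was previously recommended to the ith user, determining that the novelty factor corresponding to the kth advertisement is a second value;
    wherein the first value is greater than the second value, and k is a positive integer ranging from 1 to x.
  4. The method according to claim 3, wherein the determining that the novelty factor corresponding to the kth advertisement is the second value comprises:
    determining that the kth advertisement was recommended to the ith user q days ago, wherein q is a positive integer;
    determining an Ebbinghaus forgetting curve value corresponding to the q days; and
    determining that the novelty factor corresponding to the kth advertisement is the difference between the first value and the Ebbinghaus forgetting curve value.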
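  5. The method according to claim 1, wherein the determining of the novelty factors respectively corresponding to the x advertisements comprises:
    for a kth advertisement of the x advertisements,
    determining the similarity between the kth advertisement and each of the other advertisements among the x advertisements;
    determining, according to the similarity between the kth advertisement and each of the other advertisements among the x advertisements, a similarity ranking corresponding to the kth advertisement and a dissimilarity ranking corresponding to the kth advertisement among the x advertisements; and
    weighting the similarity ranking corresponding to the kth advertisement and the dissimilarity ranking corresponding to the kth advertisement to obtain the novelty factor corresponding to the kth advertisement;
    wherein k is a positive integer ranging from 1 to x.
  6. The method according to claim 1, wherein the determining of the novelty factors respectively corresponding to the x advertisements comprises:
    for a kth advertisement of the x advertisements,
    determining the diversity distance between the kth advertisement and each of the other advertisements among the x advertisements; and
    determining, according to the diversity distance between the kth advertisement and each of the other advertisements among the x advertisements, the novelty factor corresponding to the kth advertisement;
    wherein k is a positive integer ranging from 1 to x.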
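  7. The method according to any one of claims 1 to 6, wherein the determining, among the x advertisements and according to the click probabilities respectively corresponding to the x advertisements and the novelty factors respectively corresponding to the x advertisements, of the p advertisements to be recommended to the ith user comprises:
    weighting the click probability corresponding to each of the x advertisements and the novelty factor corresponding to that advertisement to determine scores respectively corresponding to the x advertisements;
    sorting the x advertisements in descending order of the scores corresponding to the x advertisements to obtain the sorted x advertisements; and
    determining the first p advertisements of the sorted x advertisements as the p advertisements to be recommended to the ith user.
  8. The method according to any one of claims 1 to 6, wherein the determining, among the x advertisements and according to the click probabilities respectively corresponding to the x advertisements and the novelty factors respectively corresponding to the x advertisements, of the p advertisements to be recommended to the ith user comprises:
    sorting the x advertisements in descending order of click probability to obtain the sorted x advertisements;
    re-sorting the first q advertisements of the sorted x advertisements in descending order of novelty factor to obtain the re-sorted q advertisements, wherein q is a positive integer and q is greater than p; and
    determining the first p advertisements of the re-sorted q advertisements as the p advertisements to be recommended to the ith user.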
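  9. The method according to any one of claims 1 to 8, wherein the predicting, according to the web page access information and the advertisement click information, of the click probabilities of the x advertisements when the ith user of the m users accesses the jth web page comprises:
    generating a user-web page access matrix, a user-advertisement click matrix, and an advertisement-web page relevance matrix according to the web page access information and the advertisement click information, wherein the element in row i, column j of the user-web page access matrix represents the ith user's access record for the jth web page, the element in row i, column k of the user-advertisement click matrix represents the ith user's click record for a kth advertisement, the element in row j, column k of the advertisement-web page relevance matrix represents the relevance between the jth web page and the kth advertisement, and k is a positive integer ranging from 1 to x;
    performing joint probabilistic matrix factorization on the user-web page access matrix, the user-advertisement click matrix, and the advertisement-web page relevance matrix to obtain a user latent feature vector of the ith user, a web page latent feature vector of the jth web page, and an advertisement latent feature vector of the kth advertisement; and
    determining, according to the user latent feature vector of the ith user, the web page latent feature vector of the jth web page, and the advertisement latent feature vector of the kth advertisement, the click probability of the kth advertisement when the ith user accesses the jth web page.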
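  10. An advertisement recommendation server, characterized by comprising:
    an obtaining unit, configured to obtain web page access information and advertisement click information from a log of users' Internet access, wherein the web page access information indicates n web pages accessed by m users, the advertisement click information indicates x advertisements clicked by the m users on the n web pages, and n, m, and x are all positive integers greater than 1;
    a prediction unit, configured to predict, according to the web page access information and the advertisement click information, click probabilities of the x advertisements when an ith user of the m users accesses a jth web page, wherein i is a positive integer ranging from 1 to m, and j is a positive integer ranging from 1 to n;
    a determining unit, configured to determine novelty factors respectively corresponding to the x advertisements, wherein the novelty factor corresponding to each of the x advertisements represents the ith user's degree of awareness of that advertisement; and
    a selection unit, configured to determine, among the x advertisements and according to the click probabilities of the x advertisements and the novelty factors respectively corresponding to the x advertisements, p advertisements to be recommended to the ith user, wherein the ith user's awareness of the p advertisements is lower than the ith user's awareness of the advertisements other than the p advertisements among the x advertisements, the click probabilities of the p advertisements are higher than the click probabilities of the advertisements other than the p advertisements among the x advertisements, and p is a positive integer with p ≤ x.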
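  11. The advertisement recommendation server according to claim 10, wherein the determining unit is specifically configured to:
    determine, according to historical recommendation information, the novelty factors respectively corresponding to the x advertisements, wherein the historical recommendation information indicates the history of recommending each of the x advertisements to the ith user.
  12. The advertisement recommendation server according to claim 11, wherein, in determining the novelty factors respectively corresponding to the x advertisements according to the historical recommendation information, the determining unit is specifically configured to:
    for a kth advertisement of the x advertisements,
    if the historical recommendation information indicates that the kth advertisement has not been recommended to the ith user, determine that the novelty factor corresponding to the kth advertisement is a first value; and
    if the historical recommendation information indicates that the kth advertisement was previously recommended to the ith user, determine that the novelty factor corresponding to the kth advertisement is a second value;
    wherein the first value is greater than the second value, and k is a positive integer ranging from 1 to x.
  13. The advertisement recommendation server according to claim 12, wherein, in determining that the novelty factor corresponding to the kth advertisement is the second value, the determining unit is specifically configured to:
    determine that the kth advertisement was recommended to the ith user q days ago, wherein q is a positive integer;
    determine an Ebbinghaus forgetting curve value corresponding to the q days; and
    determine that the novelty factor corresponding to the kth advertisement is the difference between the first value and the Ebbinghaus forgetting curve value.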
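  14. The advertisement recommendation server according to claim 10, wherein, in determining the novelty factors respectively corresponding to the x advertisements, the determining unit is specifically configured to:
    for a kth advertisement of the x advertisements,
    determine the similarity between the kth advertisement and each of the other advertisements among the x advertisements;
    determine, according to the similarity between the kth advertisement and each of the other advertisements among the x advertisements, a similarity ranking corresponding to the kth advertisement and a dissimilarity ranking corresponding to the kth advertisement among the x advertisements; and
    weight the similarity ranking corresponding to the kth advertisement and the dissimilarity ranking corresponding to the kth advertisement to obtain the novelty factor corresponding to the kth advertisement;
    wherein k is a positive integer ranging from 1 to x.
  15. The advertisement recommendation server according to claim 10, wherein, in determining the novelty factors respectively corresponding to the x advertisements, the determining unit is specifically configured to:
    for a kth advertisement of the x advertisements,
    determine the diversity distance between the kth advertisement and each of the other advertisements among the x advertisements; and
    determine, according to the diversity distance between the kth advertisement and each of the other advertisements among the x advertisements, the novelty factor corresponding to the kth advertisement;
    wherein k is a positive integer ranging from 1 to x.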
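  16. The advertisement recommendation server according to any one of claims 10 to 15, wherein the selection unit is specifically configured to:
    weight the click probability corresponding to each of the x advertisements and the novelty factor corresponding to that advertisement to determine scores respectively corresponding to the x advertisements;
    sort the x advertisements in descending order of the scores corresponding to the x advertisements to obtain the sorted x advertisements; and
    determine the first p advertisements of the sorted x advertisements as the p advertisements to be recommended to the ith user.
  17. The advertisement recommendation server according to any one of claims 10 to 15, wherein the selection unit is specifically configured to:
    sort the x advertisements in descending order of click probability to obtain the sorted x advertisements;
    re-sort the first q advertisements of the sorted x advertisements in descending order of novelty factor to obtain the re-sorted q advertisements, wherein q is a positive integer and q is greater than p; and
    determine the first p advertisements of the re-sorted q advertisements as the p advertisements to be recommended to the ith user.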
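  18. The advertisement recommendation server according to any one of claims 10 to 17, wherein the prediction unit is specifically configured to:
    generate a user-web page access matrix, a user-advertisement click matrix, and an advertisement-web page relevance matrix according to the web page access information and the advertisement click information, wherein the element in row i, column j of the user-web page access matrix represents the ith user's access record for the jth web page, the element in row i, column k of the user-advertisement click matrix represents the ith user's click record for a kth advertisement, the element in row j, column k of the advertisement-web page relevance matrix represents the relevance between the jth web page and the kth advertisement, and k is a positive integer ranging from 1 to x;
    perform joint probabilistic matrix factorization on the user-web page access matrix, the user-advertisement click matrix, and the advertisement-web page relevance matrix to obtain a user latent feature vector of the ith user, a web page latent feature vector of the jth web page, and an advertisement latent feature vector of the kth advertisement; and
    determine, according to the user latent feature vector of the ith user, the web page latent feature vector of the jth web page, and the advertisement latent feature vector of the kth advertisement, the click probability of the kth advertisement when the ith user accesses the jth web page.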
PCT/CN2015/072573 2014-06-16 2015-02-09 Advertisement recommendation method and advertisement recommendation server WO2015192667A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US15/378,311 US20170091805A1 (en) 2014-06-16 2016-12-14 Advertisement Recommendation Method and Advertisement Recommendation Server

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201410268560.5A CN104090919B (zh) 2014-06-16 2014-06-16 Advertisement recommendation method and advertisement recommendation server
CN201410268560.5 2014-06-16

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US15/378,311 Continuation US20170091805A1 (en) 2014-06-16 2016-12-14 Advertisement Recommendation Method and Advertisement Recommendation Server

Publications (1)

Publication Number Publication Date
WO2015192667A1 true WO2015192667A1 (zh) 2015-12-23

Family

ID=51638635

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/072573 WO2015192667A1 (zh) 2014-06-16 2015-02-09 Advertisement recommendation method and advertisement recommendation server

Country Status (3)

Country Link
US (1) US20170091805A1 (zh)
CN (1) CN104090919B (zh)
WO (1) WO2015192667A1 (zh)

Also Published As

Publication number Publication date
CN104090919B (zh) 2017-04-19
CN104090919A (zh) 2014-10-08
US20170091805A1 (en) 2017-03-30

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 15809898; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 15809898; Country of ref document: EP; Kind code of ref document: A1)