WO2014043699A1 - System and method for estimating audience interest - Google Patents

System and method for estimating audience interest

Info

Publication number
WO2014043699A1
Authority
WO
WIPO (PCT)
Prior art keywords
user
computer
interest
distribution
interest distribution
Prior art date
Application number
PCT/US2013/060156
Other languages
English (en)
Inventor
Xiaohan ZHANG
Foster Provost
Kiril TSEMEKHMAN
Original Assignee
New York University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by New York University filed Critical New York University
Priority to US14/428,265 priority Critical patent/US10599981B2/en
Publication of WO2014043699A1 publication Critical patent/WO2014043699A1/fr

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/04 Inference or reasoning models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00 Computing arrangements based on specific mathematical models
    • G06N7/01 Probabilistic graphical models, e.g. probabilistic networks

Definitions

  • the present disclosure relates generally to the prediction of audience interest in a web page, and more specifically, to exemplary embodiments of systems, methods and computer-accessible medium for estimating and/or determining an audience interest based on audience web behavior.
  • a goal of many different online enterprises can be to understand the visitors to particular websites and webpages. (See, e.g., Reference 8). Understanding one property of online visitors, the interests of these visitors to a particular website or webpage (e.g., the audience interests of that website or webpage), can be especially beneficial to a variety of online players. Knowledge of audience interests can facilitate website operators to optimize their content and navigation, create better content for their audience, improve site
  • audience interests can be a key goal of many players in the online advertising industry, where advertisers can associate brand advertisements with the interests of website visitors. For example, Procter & Gamble may want to place Olay advertisements on webpages whose audience interests include the category "beauty".
  • Behavioral targeting ("BT") procedures (see, e.g., Reference 5) can analyze historical user behavior in an attempt to deliver relevant advertisements to the user.
  • BT aims to increase advertising revenue through maximizing proxy measures such as the click through rate ("CTR") (e.g., the percentage of browsers who click on an advertisement, out of the total number of browsers who are shown the advertisement) or conversions.
  • Other procedures can extract quasi-social networks from users' browsing behavior for the purpose of improving brand advertising targeting. Similar to BT, user interests can be modeled from users' browsing behavior.
  • Contextual targeting ("CT") procedures (see, e.g., References 4, 24) aim to place advertisements that match the content of the websites, so as to increase revenue of both publishers and ad-networks, and also to improve user experience.
  • previous methods propose to integrate behavioral targeting into contextual advertising to improve the relevance of advertisements retrieved.
  • CT does not model/profile user interests, but focuses on content of websites;
  • CT focuses specifically on the interests represented explicitly on the webpages, rather than the more general interests of the audience, and
  • the goal of CT can be to maximize advertising revenue
  • Content filtering can build profiles for items (e.g., actors, directors, and genres for movies, etc.) and users (e.g., demographic information, and information through explicit user feedback) in order to recommend items similar to those items a given user may have liked in the past.
  • previous methods describe a learning-driven client-side keyword-based personalization approach for search advertising. (See, e.g., Reference 3). They can allow advertisers to customize existing search advertising campaigns based on users' prior behavior, while facilitating users to opt out from server-side storage of their behavioral history.
  • previous work describes predictive bilinear regression models which can be used to combine both profiles of contents (e.g., popularity and freshness) and profiles of users (e.g., demographic information, and summary of online activities) in order to provide personalized recommendations of new items to users.
  • Systems, methods and computer-accessible mediums can be provided that can determine an audience interest distribution(s) of content(s) by, for example, receiving first information related to a web behavior(s) of a user(s), determining second information related to a user interest distribution(s) of the user(s) based on the first information, and determining the audience interest distribution(s) of the content(s) based on the second information.
  • the audience interest distribution can be determined based on a probabilistic model(s) of the second information.
  • the probabilistic model(s) can include a maximum likelihood estimator.
  • the content(s) can include a webpage(s).
  • the behavior can include a web behavior of the user(s), and can include substantially anonymous web behavior.
  • the behavior can also include visits by the user(s) to a webpage(s).
  • the second information can be determined based on a plurality of topical interest categories associated with the webpage(s).
  • the user interest distribution can include further information related to inherent preferences by the user(s) for a particular topic(s) of interest.
  • the user interest distribution(s) can include a plurality of user interest distributions, and the audience interest distribution can be determined using a weighted mean of the user interest distributions.
  • the weighted mean can be based on an expected number of views of the content(s).
  • the user interest distribution(s) can be modeled using a matrix(s), and each row vector of the matrix(s) can represent the user's user interest distribution and each column of the matrix(s) can represent a category's audience interest for all users.
  • the audience interest distribution(s) can be determined based on a multinomial distribution model of the second information.
  • the second information can be determined by inferring the user interest distribution(s) based on an inference model.
  • the inference model can be an estimation of the user's inherent interest distribution based on the behavior(s) of the user(s).
  • the inference model can be generated by probabilistically modeling visits of the user(s) to a plurality of websites.
  • the behavior of the user(s) can be modeled using a bipartite graph(s).
  • the behavior(s) can also exclude information related to the content(s).
  • exemplary systems, methods and computer-accessible mediums that can determine an audience interest distribution(s) of content(s) by, for example, receiving first information related to a user interest distribution(s) of the user(s), and determine the audience interest distribution(s) of the content(s) based on the second information.
  • Figure 1 is an exemplary image of an exemplary website forum for the Montreal Canadiens;
  • Figure 2 is an exemplary image of the Sports Illustrated website
  • Figure 3 is a set of exemplary images of other exemplary websites
  • Figure 4 is a representation of an exemplary user interest model according to an exemplary embodiment of the present disclosure.
  • Figure 5 is a representation of an exemplary audience interest model, according to an exemplary embodiment of the present disclosure
  • Figure 6 is a representation of a structure of the exemplary model according to an exemplary embodiment of the present disclosure
  • Figure 7 is a representation of the structure of the exemplary aggregation model according to an exemplary embodiment of the present disclosure
  • Figure 8 is an exemplary representation of the structure of the exemplary inference model according to an exemplary embodiment of the present disclosure.
  • Figure 9A is a graph illustrating an exemplary histogram of a number of categories per webpage according to an exemplary embodiment of the present disclosure
  • Figure 9B is a graph illustrating an exemplary histogram of a number of webpages per category
  • Figure 10A is a graph illustrating an exemplary precision recall curve with full range according to an exemplary embodiment of the present disclosure
  • Figure 10B is a graph illustrating an exemplary precision recall curve with magnified range according to an exemplary embodiment of the present disclosure
  • Figure 11A is a graph illustrating a further exemplary precision recall curve with full range according to an exemplary embodiment of the present disclosure;
  • Figure 11B is a graph illustrating a further exemplary precision recall curve with magnified range according to an exemplary embodiment of the present disclosure
  • Figures 12A-12C are exemplary images from exemplary websites
  • Figures 13A-13C are further exemplary images from exemplary websites
  • Figures 14A-14C are even further exemplary images from exemplary websites;
  • Figure 15A is a full graph of an exemplary model according to an exemplary embodiment of the present disclosure.
  • Figure 15B is an inference graph of the exemplary model according to an exemplary embodiment of the present disclosure.
  • Figure 16 is an aggregation graph of the exemplary model according to an exemplary embodiment of the present disclosure.
  • FIG. 17 is an illustration of an exemplary block diagram of an exemplary system in accordance with certain exemplary embodiments of the present disclosure.
  • "Website", "site", "webpage" and "page" can all be used as a general term for web content. If not otherwise stated, it can range from an individual Uniform Resource Locator ("URL") (e.g., http://money.cnn.com/…), a URL prefix (e.g., money.cnn.com/quote) or a collection of related URLs (e.g., a general domain like money.cnn.com), depending on the context and application. It is possible to consider individual URLs, unless stated otherwise, but the exemplary model can be applied to any form of website aggregation.
  • the exemplary systems, methods and computer-accessible mediums can estimate the distribution of a target website's audience interests based on users' online behavior.
  • the exemplary systems, methods and computer-accessible mediums can use a probabilistic model to estimate the audience interest distribution and an evaluation framework can be used to evaluate various aspects of the problem.
  • the exemplary model can function using the following exemplary procedures: a) estimate user interest distribution ("UID") from users' web behavior; and/or b) estimate the expected audience interest distribution ("AID") for a website based on the UIDs of the audience.
  • a marketer such as Procter & Gamble
  • AudienceMedia can find an inventory from the webpage http://www.forumice.com/… whose screen shot is shown in Figure 1.
  • their reaction can be to not place the Olay advertisement on the webpage because of lack of association with the brand.
  • the exemplary audience interest model can determine that the AID of the aforementioned webpage can contain the following categories: hockey, beauty, and style.
  • the AID output can indicate the potential audience of the webpage for the Olay brand, and AudienceMedia can make the decision to bid for the inventory through ad exchanges. The exemplary systems, methods and computer-accessible mediums, according to an exemplary embodiment of the present disclosure, can determine that almost half of NHL fans, and visitors to the Sports Illustrated website, can be women (see, e.g., Figures 2 and 3).
  • This example shows one of the advantages of the exemplary AID model according to an exemplary embodiment of the present disclosure, which can provide certain audience insights to website operators and/or advertisers.
  • An exemplary data-driven model of the exemplary distribution of interests of a website's audience that can take advantage of the increasing availability of massive data on users' online behavior is illustrated below.
  • the audience interest is modeled as distributed across some set of predefined exemplary categories. These exemplary categories are taken as input to the exemplary model, and the exemplary modeling can use the fact that there can exist a "seed" set of labeled websites.
  • Such websites can be labeled, e.g., by humans, by text classification methods (see, e.g., Reference 19), or by some combination of the two. (See, e.g., References 15, 2, 24).
  • a website's labels can be identified as representing some subset of the interests of the visitors to the websites.
  • the exemplary model can estimate the distribution of audience interests for one website based on massive data about the audience's visitations to other websites.
  • the exemplary generative model can provide a crisp interpretation of audience interest. For example, a UID is the probability (e.g., estimated) that any particular user will visit a website with a certain topic (e.g., category).
  • a site's AID is the expected user interest distribution for a randomly drawn visitor to the site.
  • Estimating the AID from data can be important for the following reasons.
  • Contextual categorization of websites can be expensive and/or error prone at large scale. More specifically, human (e.g., "manual") categorization can be very expensive and time-consuming, and can simply be unrealistic for large websites, and for applications such as online advertising.
  • Automated classification, for example via text classification and natural language processing, can be error-prone (e.g., accurate for certain categories and types of pages, not so accurate for many others).
  • Commercial systems for contextual classification use a combination of both manual and automated classifications and charge for the service accordingly.
  • the exemplary systems, methods and computer-accessible mediums according to an exemplary embodiment of the present disclosure can facilitate the AID to be used to predict website contextual categories on a massive scale, and with sufficient accuracy.
  • contextual categorization can provide only a narrow view of user interests. It can be assumed that users visit a webpage (e.g., website) because their own interests can be aligned with at least some topic represented on the page; this can be an assumption in the exemplary generative model.
  • audiences can generally have other interests that may not be directly represented in the contextual categorization.
  • the AID for a particular "hockey" website can show significant audience interest in "style" and "beauty", possibly unlike other hockey sites. This can be important both to website operators and to advertisers (e.g., sports magazines have overlooked the fact that almost half of NHL fans can be women).
  • each user's interest distribution is constant during the modeling period.
  • the modeling periods in the exemplary empirical study can be fairly short due to the massive size of associated data. Specifically, if the modeling period is 24 hours, and a user actually shifted from being interested in "football" to being interested in "dining out", the model can consider the user to have a single interest distribution with substantial probabilities on both "football" and "dining out".
  • the exemplary systems, methods and computer-accessible mediums can estimate the user interests of websites. This can be based on massive data of anonymous web users' visitations to
  • a quantitative evaluation can be based on an assumption that a user visits a website because of his/her interest in at least one of the topics (e.g., categories) of the content on the website. This assumption can also be the explicit basis for work on "behavioral targeting". (See, e.g., References 33, 6). Therefore, there can be an overlap between the contextual category distribution ("CCD") of a website (e.g., which can be estimated from the content) and the AID (e.g., which is estimated from audience behavior).
  • An exemplary evaluation test can be formatted for predictive modeling research. For example, given a set of contextually labeled websites, the contextual categories for each website can be
  • how well the estimated AID for each website is included in the website's actual (e.g., held out) categories can be measured. The results show that the AID can be quite accurate in predicting these known audience interests.
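  • A minimal Python sketch of one way such a held-out check could be scored follows. The thresholding rule, the function name and the toy numbers are assumptions made for illustration and are not taken from the disclosure: categories whose estimated AID probability reaches a threshold are treated as predicted, and precision and recall are computed against the held-out contextual categories.

```python
def precision_recall(aid, held_out, threshold):
    """Score one website.

    aid      -- list of estimated audience interest probabilities, one per category
    held_out -- set of category indices that were hidden from the model
    """
    # Hypothetical decision rule: predict every category at or above the threshold.
    predicted = {k for k, p in enumerate(aid) if p >= threshold}
    true_positives = len(predicted & held_out)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(held_out) if held_out else 0.0
    return precision, recall


# Toy AID over three categories: 0 = hockey, 1 = beauty, 2 = style.
aid = [0.6, 0.3, 0.1]
held_out = {0, 2}
scores = precision_recall(aid, held_out, threshold=0.2)
```

  Sweeping the threshold from high to low would trace out a precision recall curve of the kind referred to in the figure list.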
  • Such exemplary modeling can be "privacy friendly": (i) the exemplary model and modeling does not rely on any knowledge of the identities of the users; they can be anonymized arbitrarily, as long as it is possible to relate multiple website visitations to a single anonymized web user. In addition, no demographics or other user-level data are needed or required to be used; (ii) the exemplary model and modeling does not need to know the content of the websites either, except for the contextual classifications and the estimated AID; and (iii) after the AID is calculated or determined, even the anonymized user visitation data can be disposed of. Unlike behavioral targeting, where a representation of users' interests must be maintained (e.g., anonymously, for conscientious behavioral targeters), AID-based ad targeting does not need to store "profiles" of users.
  • a goal is to use behavioral data to estimate the distribution of interests of visitors to websites and webpages.
  • FIG. 4 shows a representation of an exemplary (e.g., collected) web behavior model of users.
  • a dotted line 405 from user 410 to page 415 can indicate that user 410 may have visited page 415 before.
  • CCD can represent the topics of a page extracted from its content.
  • UID can represent the topics that a user is interested in. Since a user can visit a hockey webpage (e.g., page w) and a style/beauty page (e.g., page W), the user, to some degree, is interested in the three topics (e.g., hockey, style, and beauty). Then, user N's 410 "UID" can include the three topics, reflecting the fact that the user can have an interest in the topics.
  • Figure 5 illustrates that the audience interests can be calculated for pages. Notice that the pages on the right-hand side can be a different set from the pages from Figure 4.
  • the rectangular nodes 505 in the middle can represent hidden categories, and are ignored for explanation purposes. If page w 510, "the Habs", which is a page about hockey, is looked at, it is expected that the audience is interested in not only "hockey", but also topics like "beauty" and "style". The reason is that user N, who is one of the visitors to page w, can be interested in "beauty" and "style". This can assist in explaining the reason that AID
  • the exemplary model is a two-stage generative model, the structure of which is illustrated in Figure 6.
  • the generative nature of the exemplary model is based on the assumption that users' website visitation behavior is generated by user-specific distributions of interest in the different topical interest categories that can be present in websites' content.
  • the category choices generally can be unobserved, and therefore are latent in the exemplary model; these latent choices are represented by the solid lines 605 and the lighter-shaded rectangles 610 in Figure 6.
  • the observed data are represented by the dotted lines 615 and the darker-shaded ovals 625 and circles 620.
  • the exemplary "aggregation" stage, or aggregation model, can calculate each website's expected audience interest distribution ($\beta_w$) based on: i) known or estimated user interest distributions ($\gamma_n$), and ii) users' expected visitation behaviors based on website interest categories (1, ..., K).
  • the exemplary aggregation model can assume that the user interest distributions are known; they generally may not be known, so they can be inferred from the data.
  • the "inference" stage, or inference model, estimates each user's inherent interest distribution ($\gamma_n$) based on users' observed browsing behavior (e.g., indicated by dotted lines 615 from users to websites as shown in Figure 6).
  • each user can have a certain number of web activities (e.g., visiting sites) in the exemplary time frame of evaluation. Each such activity can form a link between a user and a site, and can add to the number of visits between them.
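  • As a minimal sketch of how such user-to-site links could be accumulated, the following Python fragment builds a bipartite visit graph from a log of (user, site) pairs. The log format, the function name and the identifiers in the toy data are assumptions for illustration only.

```python
from collections import Counter, defaultdict


def build_visit_graph(visit_log):
    """Return {user: Counter({site: number_of_visits})} from (user, site) pairs."""
    graph = defaultdict(Counter)
    for user, site in visit_log:
        # Each activity adds one to the edge weight between the user and the site.
        graph[user][site] += 1
    return graph


# Hypothetical anonymized log: only opaque user ids and site ids are needed.
log = [("u1", "habs_forum"), ("u1", "habs_forum"), ("u1", "style_blog"), ("u2", "style_blog")]
graph = build_visit_graph(log)
total_visits_u1 = sum(graph["u1"].values())
```

  The per-user total corresponds to the total number of visits by that user over the modeling period, and each edge weight to the number of visits between one user and one site.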
  • each user n can have an interest distribution $\gamma_n = \langle \gamma_n^1, \ldots, \gamma_n^k, \ldots, \gamma_n^K \rangle$, which is a categorical distribution over all possible categories, with k representing an arbitrary category.
  • Subscripts can be used to indicate the current observed object that is of concern (e.g., the dark-shaded ovals and circles as shown in Figure 6), where users are indicated as n and websites are indicated as w, and superscripts indicate varying elements, for example, categories k for an interest distribution of user n over all categories.
  • the interest distribution for a user n can determine her inherent preferences in particular categories (e.g., category k). Under this exemplary assumption, such preferences can lead to her visiting sites whose content contains such categories.
  • This exemplary probability can be specified by the probability mass function as, for example: $p(X = x \mid \gamma_n) = \prod_{k=1}^{K} (\gamma_n^k)^{I(x = k)}$
  • $I(x = k)$ can be an indicator function where, for example: $I(x = k) = 1$ if $x = k$, and $I(x = k) = 0$ otherwise.
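  • The role of the indicator function in a categorical probability mass function can be sketched in a few lines of Python. This assumes the standard categorical form (a product of per-category probabilities raised to an indicator); the names `gamma_n`, `indicator` and `categorical_pmf`, and the toy probabilities, are illustrative assumptions.

```python
def indicator(x, k):
    """Return 1 if the drawn category x equals k, else 0."""
    return 1 if x == k else 0


def categorical_pmf(x, gamma_n):
    """P(X = x) as a product over categories k of gamma_n[k] ** I(x == k)."""
    p = 1.0
    for k, prob in enumerate(gamma_n):
        p *= prob ** indicator(x, k)
    return p


# Hypothetical interest distribution over K = 3 categories
# (0 = hockey, 1 = beauty, 2 = style).
gamma_n = [0.5, 0.3, 0.2]
probs = [categorical_pmf(k, gamma_n) for k in range(len(gamma_n))]
```

  Because every factor with a zero exponent equals one, the product simply selects the probability of the drawn category.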
  • N: The total number of users.
  • X: A random variable with various meanings depending on the context.
  • $\gamma_n^k$, $\gamma_n^j$: User n's interest distribution over category k or j.
  • $\Gamma$: The N by K matrix representing user interest distributions for all users.
  • CCD: Shortened form of Contextual Category Distribution for a website.
  • UID: Shortened form of User Interest Distribution for a user.
  • each of the N users can have their own interest distribution $\gamma_n$; thus, as a whole, the set of interest distributions for all users can be represented as an N by K matrix $\Gamma$, where, for example: $\Gamma = \begin{bmatrix} \gamma_1^1 & \cdots & \gamma_1^K \\ \vdots & \ddots & \vdots \\ \gamma_N^1 & \cdots & \gamma_N^K \end{bmatrix}$, with each row vector representing user n's interest distribution, and each column representing each category's audience interest component across all users.
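  • A small Python sketch of this matrix layout follows, using a plain list of lists so that it has no dependencies. The toy values and helper names are assumptions for illustration; the only property taken from the text is that rows are per-user distributions and columns are per-category components across users.

```python
# Hypothetical N = 3 users by K = 3 categories (0 = hockey, 1 = beauty, 2 = style).
Gamma = [
    [0.6, 0.2, 0.2],
    [0.1, 0.5, 0.4],
    [1.0, 0.0, 0.0],
]


def user_row(Gamma, n):
    """Interest distribution of user n (one row of the matrix)."""
    return list(Gamma[n])


def category_column(Gamma, k):
    """Interest in category k across all users (one column of the matrix)."""
    return [row[k] for row in Gamma]


def rows_are_distributions(Gamma, tol=1e-9):
    """Check that every row is non-negative and sums to one."""
    return all(min(row) >= 0.0 and abs(sum(row) - 1.0) < tol for row in Gamma)


beauty_column = category_column(Gamma, 1)
```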
  • each site can contain only one category of content, and each category only belongs to one site.
  • the sites are represented by the categories, which can be 1, ..., j, ..., J, where J = K, the total number of categories.
  • the number of visits from user n to all sites can be modeled as a multinomial distribution on counts. Scalar $L_n$ can be the total number of visits that user n pays to all sites as defined above.
  • Random variable $X_n^j$ can be the number of visits from user n to category j (e.g., in this case, also site j), and $\gamma_n^j$ can be the probability of user n visiting category j as defined above.
  • the probability of user n having $L_n^1, \ldots, L_n^j, \ldots, L_n^J$ ($\sum_j L_n^j = L_n$) visits can be the probability mass function where, for example: $P(X_n^1 = L_n^1, \ldots, X_n^J = L_n^J) = \frac{L_n!}{\prod_{j=1}^{J} L_n^j!} \prod_{j=1}^{J} (\gamma_n^j)^{L_n^j}$
  • the expected number of visits from user n to category (e.g., site) j can be, for example: $E[X_n^j] = L_n \gamma_n^j$
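  • The multinomial model of visit counts and its expectation can be checked numerically with the following Python sketch. It assumes the standard multinomial probability mass function; the function names and the toy distribution are illustrative assumptions.

```python
from math import factorial


def multinomial_pmf(counts, gamma_n):
    """Probability of observing the given per-category visit counts."""
    total_visits = sum(counts)
    # Multinomial coefficient: total! / (c_1! * c_2! * ... * c_J!).
    coeff = factorial(total_visits)
    for c in counts:
        coeff //= factorial(c)
    p = float(coeff)
    for c, g in zip(counts, gamma_n):
        p *= g ** c
    return p


def expected_visits(total_visits, gamma_n):
    """Expected visits per category: total visits times category probability."""
    return [total_visits * g for g in gamma_n]


# Hypothetical user with 4 visits and interests over 3 categories.
gamma_n = [0.5, 0.25, 0.25]
p_example = multinomial_pmf([2, 1, 1], gamma_n)
expectation = expected_visits(4, gamma_n)
```

  With these toy numbers the coefficient is 12 and the product of powers is 1/64, so the probability of the count vector (2, 1, 1) is 0.1875, and that same vector is also the expected count vector.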
  • site j receives $L_1 \gamma_1^j$ visits from the user with interest distribution $\gamma_1$ and $L_2 \gamma_2^j$ visits from the user with interest distribution $\gamma_2$
  • Equation 7 can show that the AID can amend the original audience interest distribution matrix $\Gamma$ by aggregating each user's visitation factor $L_n$ and category (j) specific information $\gamma_n^j$
  • the AID $\delta_j$ can represent the expected interest distribution across all users that visit the site.
  • each element of the AID vector $\delta_j = \langle \delta_j^1, \ldots, \delta_j^k, \ldots, \delta_j^K \rangle$ can represent the aggregated audience interest probability for each specific category.
  • All the interest distributions across all sites can comprise a J by K matrix $\Delta$, where, for example: $\Delta = \begin{bmatrix} \delta_1^1 & \cdots & \delta_1^K \\ \vdots & \ddots & \vdots \\ \delta_J^1 & \cdots & \delta_J^K \end{bmatrix}$
  • EXEMPLARY PROPOSITION 1. $\delta_j$ can be the expected interest distribution of a randomly drawn visitor to website (in this case also category) j.
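  • A minimal Python sketch of the visit-weighted mean described above, for the simplified case where category j is also site j. This is one reading of the weighted-mean description (each user's distribution weighted by that user's expected visits to the site); the function name and toy data are assumptions, and the equation referred to in the text is not reproduced here.

```python
def aid_one_to_one(Gamma, visits, j):
    """Audience interest distribution of site/category j.

    Gamma  -- list of per-user interest distributions (rows sum to one)
    visits -- total number of visits per user
    """
    # Expected visits of each user to site j act as the averaging weights.
    weights = [l_n * row[j] for l_n, row in zip(visits, Gamma)]
    total = sum(weights)
    if total == 0:
        raise ValueError("no expected visits to this site")
    num_categories = len(Gamma[0])
    return [
        sum(w * row[k] for w, row in zip(weights, Gamma)) / total
        for k in range(num_categories)
    ]


# Two hypothetical users; categories: 0 = hockey, 1 = beauty, 2 = style.
Gamma = [[0.5, 0.5, 0.0], [1.0, 0.0, 0.0]]
visits = [4, 2]
hockey_aid = aid_one_to_one(Gamma, visits, 0)
```

  In this toy case both users are expected to visit the hockey site twice, so the hockey site's audience carries a 0.25 interest in beauty even though the site itself has no beauty content.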
  • each category belongs to only one site.
  • a slightly less simplified scenario is where each category can belong to multiple sites, but each site contains only one category. Modeling this exemplary scenario is a straightforward extension to the previous exemplary model. If the only extension to the exemplary model is that a category can belong to multiple sites, then in terms of the audience interest
  • Equation 7 can be extended following this observation that each category carries its own AID regardless of the sites it belongs to. Additional exemplary extensions to this exemplary model are possible.
  • the exemplary aggregation model according to an exemplary embodiment of the present disclosure can be extended to the situation where there is a many-to-many mapping between sites and categories.
  • the exemplary generative process is that, to take an action, each user (e.g., with $\gamma_n$) can draw a category j that can determine his or her visitation, based on the multinomial distribution in equation 1.
  • a site is drawn at the same time, because of the assumed 1-to-1 mapping between sites and categories.
  • the categorical distribution $\psi_j$ is known to the model; it could be obtained based on different assumptions and different interpretations of the data.
  • One exemplary solution is to assume a uniform distribution that assigns equal probabilities to all sites within category j.
  • An exemplary alternative is to assign higher probabilities to more popular sites.
  • a third exemplary way is to assign smaller probabilities to sites that have a larger number of contextual categories. How the distribution is arrived at may not be important for the present theoretical model development. As indicated herein below, a uniform distribution is assumed, as the results are less influenced by the
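  • The three exemplary ways of choosing the within-category site distribution can be sketched as follows in Python. The text gives no formulas for them, so the specific weightings below (proportional to visit counts, and inversely proportional to the number of contextual categories) are assumptions chosen only to illustrate the idea; the function names are hypothetical.

```python
def uniform_sites(sites):
    """Equal probability for every site within one category."""
    p = 1.0 / len(sites)
    return {site: p for site in sites}


def popularity_sites(visit_counts):
    """Assumed form: probability proportional to a site's visit count."""
    total = sum(visit_counts.values())
    return {site: count / total for site, count in visit_counts.items()}


def fewer_categories_sites(num_categories):
    """Assumed form: probability inversely proportional to a site's category count."""
    weights = {site: 1.0 / c for site, c in num_categories.items()}
    total = sum(weights.values())
    return {site: w / total for site, w in weights.items()}


uniform = uniform_sites(["a", "b"])
popular = popularity_sites({"a": 3, "b": 1})
focused = fewer_categories_sites({"a": 1, "b": 2})
```

  Whichever rule is used, the probabilities of the sites within one category need to sum to one so that drawing a site after drawing a category stays a proper probabilistic step.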
  • a final step in building a practically useful aggregation model is to compose direct links from each user n to site w, since in the data, the intermediate category choice is unobserved.
  • Each user n is associated with a categorical distribution $\theta_n = \langle \theta_n^1, \ldots, \theta_n^W \rangle$, where each element $\theta_n^w$ can denote the probability of user n visiting site w. $X_n^w$ can be the random variable representing the number of visits from user n to site w.
  • the distribution of $X_n^w$ taking values $P_n^w$ can be a multinomial distribution where, for example: $P(X_n^1 = P_n^1, \ldots, X_n^W = P_n^W) = \frac{L_n!}{\prod_{w=1}^{W} P_n^w!} \prod_{w=1}^{W} (\theta_n^w)^{P_n^w}$
  • the exemplary generative model can draw a visit from user n to website w by first drawing a visit from user n to category j based on user interest distribution $\gamma_n$, then drawing website w based on $\psi_j$
  • the exemplary probability of user n visiting site w ($\theta_n^w$) is, for example: $\theta_n^w = \sum_{j=1}^{J} \gamma_n^j \psi_j^w$
  • EXEMPLARY PROPOSITION 2. $\theta_n$ (calculated through equation 13) can be a proper probability distribution.
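  • The two-step draw (category first, then site) implies that the probability of a user visiting a site is a mixture over categories. The Python sketch below computes that mixture and checks numerically that the result sums to one when each category's site distribution does. The names `gamma_n`, `psi` and `site_visit_probs` and the toy values are assumptions for illustration.

```python
def site_visit_probs(gamma_n, psi):
    """Per-site visit probabilities for one user.

    gamma_n   -- the user's probability of choosing each category
    psi[j][w] -- probability of landing on site w given category j
                 (each row is assumed to sum to one)
    """
    num_sites = len(psi[0])
    return [
        sum(g * psi_j[w] for g, psi_j in zip(gamma_n, psi))
        for w in range(num_sites)
    ]


# Hypothetical many-to-many mapping: site 1 carries both categories.
gamma_n = [0.5, 0.5]
psi = [
    [0.5, 0.5, 0.0],  # category 0 is spread over sites 0 and 1
    [0.0, 0.5, 0.5],  # category 1 is spread over sites 1 and 2
]
theta_n = site_visit_probs(gamma_n, psi)
```

  Site 1 receives probability mass from both categories, which is how a site with several contextual categories can attract visits driven by different interests.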
  • the expected number of visits from user n to site w can be, for example: $E[X_n^w] = L_n \theta_n^w$
  • site w receives $L_n \theta_n^w$ visits from user n. In order to obtain the expected audience interest distribution $\beta_w$ for site w, the expectation of interest distributions can be taken over all users who visit site w, treating $L_n \theta_n^w$ as the averaging weight where, for example: $\beta_w = \frac{\sum_{n=1}^{N} L_n \theta_n^w \gamma_n}{\sum_{n=1}^{N} L_n \theta_n^w}$
  • the AID across all sites can include a W-by-K matrix $B$ where each element of the matrix can be, for example: $\beta_w^k = \frac{\sum_{n=1}^{N} L_n \theta_n^w \gamma_n^k}{\sum_{n=1}^{N} L_n \theta_n^w}$. $\beta_w$ can be the aggregated AID for site w.
  • $\beta_w$ can have the following meaning: if, across all visits to site w, one visit is randomly drawn, the expected interest distribution of the unknown visitor who pays this visit can be $\beta_w$.
  • EXEMPLARY PROPOSITION 3. $\beta_w$ can be the expected interest distribution of a randomly drawn visitor to website w.
  • in expectation, user n with interest distribution $\gamma_n$ visits site w $E[X_n^w] = L_n \theta_n^w$ times.
  • $\sum_{n} L_n \theta_n^w$ can be the normalising factor.
  • the expected interest distribution of a drawn visitor to site w can be, for example: $\beta_w = \sum_{n=1}^{N} \frac{L_n \theta_n^w}{\sum_{m=1}^{N} L_m \theta_m^w} \gamma_n$
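  • Putting the aggregation stage together, the following Python sketch computes an audience interest distribution for every site from user interest distributions, per-user visit totals, and a category-to-site distribution. It is a sketch under the assumptions stated in the comments, not the disclosed implementation; all names and toy values are illustrative.

```python
def aggregate_aid(Gamma, visits, psi):
    """Return one audience interest distribution per site (None if no expected visits).

    Gamma[n][k] -- user n's interest in category k (rows sum to one)
    visits[n]   -- total number of visits by user n
    psi[j][w]   -- probability of landing on site w given category j
    """
    num_users, num_categories, num_sites = len(Gamma), len(Gamma[0]), len(psi[0])
    # Probability that a single visit by user n lands on site w.
    theta = [
        [sum(Gamma[n][j] * psi[j][w] for j in range(num_categories)) for w in range(num_sites)]
        for n in range(num_users)
    ]
    result = []
    for w in range(num_sites):
        # Expected visits of each user to site w are the averaging weights.
        weights = [visits[n] * theta[n][w] for n in range(num_users)]
        total = sum(weights)
        if total == 0:
            result.append(None)
            continue
        result.append([
            sum(weights[n] * Gamma[n][k] for n in range(num_users)) / total
            for k in range(num_categories)
        ])
    return result


# Toy data: categories 0 = hockey, 1 = beauty; one site per category.
Gamma = [[1.0, 0.0], [0.5, 0.5]]
visits = [2, 4]
psi = [[1.0, 0.0], [0.0, 1.0]]
aid_per_site = aggregate_aid(Gamma, visits, psi)
```

  Here the hockey site's audience interest comes out as 0.75 hockey and 0.25 beauty, because the second user, who splits interest evenly, contributes as many expected visits to that site as the first.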
  • the exemplary aggregation model can assume that the user interest distributions $\Gamma$ are known. In general, this may not be known. This can lead to a further stage of the exemplary model: inferring $\Gamma$ from the data.
  • the inference is based on a generative model similar to the exemplary generative aggregation model presented above. In the exemplary aggregation model, it is assumed that all users' interest distributions $\Gamma$ can be known, and the aggregation model can be utilized to obtain the expected audience interest distribution for each website.
  • the visits from users to websites are modeled probabilistically, and all known information is used to infer the best $\Gamma$ (e.g., the best individual user interest distribution $\gamma_n$ for user n).
  • An exemplary goal is to infer parameters of the exemplary model, $\Gamma$, from site visitation data where the contextual categories of the sites are observed.
  • the contextual category distribution ("CCD") of a site may not be the same as the audience interest
  • the exemplary generative model for visits from user n to website w can be as follows. As in the exemplary aggregation stage, a visit is drawn from user n to category j from a multinomial distribution, $\mathrm{mult}(\gamma_n, L_n)$, where $\gamma_n^j$ can be the probability of visiting category j from user n, which is a UID. Visits are drawn in this way until the total number of visits $L_n$ is reached. For each of the visits to category j, a draw of visits to site w is conducted based on a multinomial distribution $\mathrm{mult}(\psi_j, L_n^j)$, where $\psi_j^w$ can be the
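  • The generative process described above can be simulated directly. The Python sketch below draws a category for each visit and then a site within that category; it uses only the standard library, and the function name, argument layout and toy values are assumptions for illustration.

```python
import random


def simulate_visits(gamma_n, psi, total_visits, rng):
    """Simulate per-site visit counts for one user.

    gamma_n   -- the user's probability of choosing each category
    psi[j][w] -- probability of landing on site w given category j
    rng       -- a random.Random instance, passed in for reproducibility
    """
    categories = list(range(len(gamma_n)))
    sites = list(range(len(psi[0])))
    counts = [0] * len(sites)
    for _ in range(total_visits):
        j = rng.choices(categories, weights=gamma_n)[0]  # latent category choice
        w = rng.choices(sites, weights=psi[j])[0]        # observed site visit
        counts[w] += 1
    return counts


psi = [[1.0, 0.0], [0.0, 1.0]]
counts = simulate_visits([0.5, 0.5], psi, 10, random.Random(0))
```

  Only the per-site counts would be observed in real data; the category drawn at each step stays latent, which is why the inference stage is needed.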
  • each site can only have one category, and each category can only have one site, it is assumed that websiie visitation behavior is observed by, for example, a bipartite graph formed by users' visitations to websites.
  • the users' visitations are modeled to categories, which is equivalent to websites, if sites can each have onl one category, multiple sites per category can treat category j as a "supemode" where all sites with category j cluster together.
  • From the bipartite graph described above, the total number of visits $L_n$, as well as the number of visits to each individual category j, is known.
  • Maximum likelihood estimator ("MLE")
  • The MLE is used, e.g., as a starting point to build the exemplary aggregation model.
  • $L_n^j$ can be the number of visits from user n to category j, which is assumed to be observed for the moment because of the simplifying assumption.
  • $j = k$ because the category-to-website mapping is a one-to-one mapping. Therefore, $\gamma_n^j$ can be written as $\gamma_n^k$.
  • A smoothing factor can be used, such as, for example: […]
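  • The frequency-based estimate, with an optional smoothing factor, can be sketched as follows. The smoothing expression is not recoverable from the text here, so the additive form below (the parameter `alpha`) is an assumed stand-in; the function name `estimate_uid` and the example counts are likewise hypothetical.

```python
from collections import Counter

def estimate_uid(category_visits, categories, alpha=0.0):
    """Relative-frequency estimate of a user's interest distribution.

    alpha > 0 applies additive smoothing (an assumed form, chosen for illustration).
    """
    total = sum(category_visits.get(c, 0) for c in categories)
    denominator = total + alpha * len(categories)
    if denominator == 0:
        raise ValueError("user has no visits and no smoothing was requested")
    return {c: (category_visits.get(c, 0) + alpha) / denominator for c in categories}

# Hypothetical user with 3 visits to category A and 1 visit to category B.
visits = Counter({"A": 3, "B": 1})
mle = estimate_uid(visits, ["A", "B", "C"])
smoothed = estimate_uid(visits, ["A", "B", "C"], alpha=1.0)
print(mle)
print(smoothed)
```

  • Without smoothing, a category the user never visited receives probability zero; with smoothing, it keeps a small positive probability.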
  • $L_n$ can be the total number of visits (e.g., draws) from user n to all categories, and therefore the total number of visits to all sites; $L_n$ can be calculated from the known dataset (e.g., users' visitations to websites). The number of visits from user n to website w, $P_{n,w}$, is known.
  • $L_n^j$, the number of draws of category j, may not be known, and can be estimated based on $L_n$ and $P_{n,w}$.
  • $\psi_w^j$ is known, which can represent the probability for website w to receive users' visits redirected from category j. In other words, $\psi_w^j$ can show the importance of category j for website w. Note that $\sum_j \psi_w^j = 1$.
  • $\psi_w$ may be modeled as a uniform distribution, which can assign equal weight to all categories. $\psi_w$ may instead be estimated through contextual analysis methods (e.g., text mining, natural language processing). For example, finance-related websites can have a higher $\psi_w^j$ in category […]
  • A proxy measurement is employed for $\psi_w$: the website-specific contextual scores across all categories, obtained from an industry-leading contextual classification company specialized in applying semantics technology and Natural Language Processing procedures to website content. This is a convenient and reasonable proxy, as the contextual website classifications (e.g., the CCD) are needed to bootstrap the model, and thus, this proxy may not introduce an extra estimation.
  • An important property of the exemplary model is that $\psi_w$ is specific to website w, and is independent of the events of any user visiting the website.
  • $\psi_w$ can represent the probability for website w to receive visits from categories.
  • The probability that a visit can come from category j can be $\psi_w^j$.
  • A model of the visits that website w can receive from user n can be a multinomial, $\mathrm{mult}(\psi_w, P_{n,w})$, similar to the exemplary models described above.
  • The expected number of visits from category j to website w can be $P_{n,w}\,\psi_w^j$.
  • $L_n^j$ can be calculated by summing, over all websites w, the visits that i) originate from user n and ii) come from category j to each site w, which can be, for example: $L_n^j = \sum_w P_{n,w}\,\psi_w^j$.
  • The exemplary result can show that user n's total number of visits to all categories ($L_n = \sum_j L_n^j$) can be the same as user n's total number of visits to all websites ($\sum_w P_{n,w}$). Thus the exemplary estimation can pass the correctness check.
  • The user interest distribution can be estimated from the data by, for example: $\hat{\gamma}_n^j = L_n^j / L_n$.
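  • The inference step described above can be sketched as follows: each site visit is spread over the site's categories in proportion to the site's category weights, the per-category totals are summed, and the totals are normalized by the user's total visits. The names `infer_uid`, `site_visits` (standing for the visit counts $P_{n,w}$) and `psi` (the per-site category weights), as well as the example numbers, are assumptions made for illustration, and no smoothing is applied.

```python
def infer_uid(site_visits, psi):
    """Estimate one user's interest distribution from site visits.

    site_visits: {site: number of visits by this user}
    psi: {site: {category: weight}}, each site's weights summing to 1
    Returns (gamma, category_visits).
    """
    category_visits = {}
    for site, count in site_visits.items():
        for category, weight in psi[site].items():
            # Expected visits attributed to this category through this site.
            category_visits[category] = category_visits.get(category, 0.0) + count * weight
    total = sum(site_visits.values())
    if total == 0:
        raise ValueError("user has no visits")
    gamma = {category: visits / total for category, visits in category_visits.items()}
    return gamma, category_visits

# Hypothetical user: 2 visits to a pure-A site, 2 visits to a site split between A and B.
site_visits = {"s1": 2, "s7": 2}
psi = {"s1": {"A": 1.0}, "s7": {"A": 0.5, "B": 0.5}}
gamma, category_visits = infer_uid(site_visits, psi)
print(gamma)
```

  • Because each site's weights sum to one, the per-category totals add up to the user's total number of site visits, which mirrors the correctness check described above.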
  • The CCD from the seed set is used in the inference phase to estimate $\gamma$ (e.g., users' interest distributions), and the CCD from the holdout set is hidden from the inference model.
  • The data for the experiment can include (a) a set of webpages spanning a wide variety of contextual categories, labeled with high-quality granular contextual categories, (b) a set of users who visit these webpages, and (c) a set of visits from the users to the webpages.
  • These webpages are commonly visited pages scattered about the web.
  • The data include visits to a large portion of ad-supported webpages; however, the visits may only be a sample of all visits to any given webpage and only a part of all visits from any given user.
  • Contextually classified categories can be obtained from one of the leading […]
  • The pages can be sampled for crawling and classification from real ad-delivery traffic, and weighted by frequency of occurrence, so that more frequently visited pages are more likely to be labeled.
  • For example, all users who visited at least two of these labeled pages are extracted, as users who visit only one page can make no difference in the holdout-based application of the model, as will become clear below. Collecting such data can utilize tremendous data processing infrastructure, because a very large number of visits needs to be filtered to select the visits to these specific pages.
  • Users can be defined by a combination of IP address and HTTP User Agent, including browser type and browser configuration, based on industry best practices, which have been shown to be reasonably accurate at singling out individual users. (See, e.g., […])
  • IP address and User Agent can be converted using one-way hash functions both for convenience of use and to completely anonymize users.
  • A procedure can be applied to identify and filter out activities which most likely cannot be attributed to an individual user. These can include requests from IP addresses identified as hotspots or sources of server, as opposed to browser, requests, as well as from user agents identified as robots, automated tools or those conducting malicious activities.
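  • The user-definition and filtering steps described above can be sketched as follows. The text does not name a hash function or a robot-detection rule, so SHA-256 and the substring list `AUTOMATED_MARKERS` are assumptions chosen for illustration, as are the function names and the sample requests.

```python
import hashlib

# Hypothetical substrings used to flag automated user agents (an assumption).
AUTOMATED_MARKERS = ("bot", "crawler", "spider")

def anonymous_user_id(ip_address: str, user_agent: str) -> str:
    """One-way hash of the (IP address, User Agent) pair, used as the user key."""
    digest = hashlib.sha256()
    digest.update(ip_address.encode("utf-8"))
    digest.update(b"\x00")  # separator so ("ab", "c") and ("a", "bc") hash differently
    digest.update(user_agent.encode("utf-8"))
    return digest.hexdigest()

def is_probably_automated(user_agent: str) -> bool:
    """Flag user agents that contain one of the assumed automation markers."""
    lowered = user_agent.lower()
    return any(marker in lowered for marker in AUTOMATED_MARKERS)

requests = [
    ("203.0.113.7", "Mozilla/5.0 (X11; Linux x86_64)"),
    ("203.0.113.7", "ExampleBot/1.0"),
]
user_ids = [
    anonymous_user_id(ip, ua) for ip, ua in requests if not is_probably_automated(ua)
]
print(user_ids)
```

  • Hashing the pair rather than each field separately keeps one key per (IP address, User Agent) combination, which matches the user definition given above.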
  • The exemplary result of these processes is a bipartite graph between users and webpages, with the size, the richness of connectivity, and the effort needed to construct it depending on the time frame and the number of labeled webpages.
  • The disjoint datasets are referred to by their time frames (e.g., 1-hour, 10-hour).
  • the data to be processed can be massive.
  • the original log had 174 million webpages, 50 million users and 483 million visits between users and webpages.
  • The pre-filtered and extracted 10-hour dataset had 18 million users, 28 million webpages and 78 million visits.
  • Exemplary Holdout Design. In order to assess whether the exemplary model can estimate the interests of visitors well enough, in the spirit of a "holdout" evaluation, an experimental design can be set up where no category information from a webpage w can propagate back to itself.
  • Webpage w's AID should not be estimated using any information in $\gamma$ that originated from website w.
  • After randomly splitting the bipartite graph between users and webpages in half, the graphs used in the aggregation and inference phases are denoted as the aggregation graph and the inference graph, respectively.
  • The exemplary AID is calculated or otherwise determined for every webpage w based on the users whose connections to w are in the aggregation graph; these users' $\gamma$'s are estimated using the inference model based on the contextual categories of webpages in the inference graph and users' visits to the webpages in the inference graph. For example, there can be three hundred and one contextual categories, and each page can have on average only 2 categories, so this task is far from trivial.
  • AID may never include this page's category. More generally, if users only visit one particular webpage in each category, the webpages' AIDs may never include their own categories. It can, therefore, be desirable that a significant portion of users have diverse and sufficiently dense navigation patterns; thus, in the exemplary procedure for extracting data, all users visit at least two pages and all pages are visited by at least two users.
  • The resultant bipartite graph with labeled webpages can include about 1,017,547 visits, 186,691 users, and 36,876 webpages, which can then be split into an inference graph (e.g., 508,718 visits, 181,832 users and 33,347 webpages) and an aggregation graph (e.g., 508,829 visits, 181,744 users and 33,301 webpages).
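  • The random split into an inference graph and an aggregation graph can be sketched as follows. The sketch assumes that each user-webpage link, together with its visit count, is assigned to exactly one of the two graphs by an independent fair coin flip; the text only states that the graph is split randomly in half, so this assignment rule, the name `split_links` and the sample links are assumptions.

```python
import random

def split_links(links, rng):
    """Assign every (user, page) link, with its visit count, to one of two graphs."""
    inference, aggregation = {}, {}
    for link, visit_count in links.items():
        # Fair coin flip per link (assumed rule): the link never appears in both graphs.
        target = inference if rng.random() < 0.5 else aggregation
        target[link] = visit_count
    return inference, aggregation

# Hypothetical links: (user, page) -> number of visits.
links = {("u1", "p11"): 2, ("u1", "p12"): 1, ("u1", "p17"): 3, ("u2", "p14"): 4}
inference_graph, aggregation_graph = split_links(links, random.Random(0))
print(inference_graph)
print(aggregation_graph)
```

  • Keeping the two link sets disjoint is what prevents a page's own categories from reaching its AID through the same user-page link.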
  • The full original connectivity between users and pages may not be known after the data has been split.
  • The graph structure of the aggregation data is taken into account. Only if there exists a link between a user and a webpage can the expectation of the number of visits from the user to the webpage be calculated. If webpage w does not exist in the specific aggregation graph being focused on, then its AID should not exist even though it can be calculated. If there is no link between user n and webpage w in the aggregation graph, the aggregation model based on $\gamma_n$ may not exist even though it can be calculated through […]
  • A link indicator taking values in $\{1, 0\}$ is introduced, which can represent the existence of a link between user n and webpage w. Then the AID of webpage w can be modified to be, for example: […]
  • The modified AID can still be the expected interest distribution of a randomly drawn visitor to webpage w based on the user-webpage visitation graph structure.
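  • One way to read the link-restricted aggregation is sketched below: the AID of a page is taken as the visit-weighted average of the interest distributions of the users linked to that page in the aggregation graph, and users without a link contribute nothing. The exact expression is not recoverable from the text here, so the visit weighting, the function name `audience_interest_distribution` and the example data are assumptions.

```python
def audience_interest_distribution(page, aggregation_links, uid):
    """Visit-weighted average of linked users' interest distributions (assumed form).

    aggregation_links: {(user, page): visits}; uid: {user: {category: probability}}
    Returns None when the page has no usable link in the aggregation graph.
    """
    totals, weight_sum = {}, 0.0
    for (user, linked_page), count in aggregation_links.items():
        # Link indicator: skip users with no link to this page or no estimated UID.
        if linked_page != page or user not in uid:
            continue
        weight_sum += count
        for category, probability in uid[user].items():
            totals[category] = totals.get(category, 0.0) + count * probability
    if weight_sum == 0:
        return None
    return {category: value / weight_sum for category, value in totals.items()}

# Hypothetical aggregation graph and user interest distributions.
aggregation_links = {("u1", "p"): 1, ("u2", "p"): 3, ("u3", "q"): 5}
uid = {
    "u1": {"A": 1.0, "B": 0.0},
    "u2": {"A": 0.0, "B": 1.0},
    "u3": {"A": 0.5, "B": 0.5},
}
aid_p = audience_interest_distribution("p", aggregation_links, uid)
aid_missing = audience_interest_distribution("z", aggregation_links, uid)
print(aid_p, aid_missing)
```

  • Returning `None` for a page with no links reflects the point that an AID should not exist for a page that is absent from the aggregation graph.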
  • The framework can exclude a webpage's own contextual categories from the model, and can then use the model to predict the webpage's audience interests.
  • Known category data is used to evaluate how well the exemplary model can predict audience interests. For example, it is assumed that a user visits a webpage because of her interest in at least one of the topics (e.g., categories) of the content on the webpage. Therefore, there should be an overlap of at least one category between the CCD of a website and the AID. Given the set of contextual category labels for a webpage w, the overlap between the interests estimated by the AID and the category labels is examined.
  • The notion of a good prediction having at least one category overlap does not penalize the AID for including interests that visitors often have, but that may not be represented in the context (e.g., hockey fans interested in "style"), and does not penalize the CCD for including contextually extracted categories that may not be the subject of visitors' particular interests (e.g., a hockey story commenting on the food variety in a particular arena).
  • the evaluation does penalize the model if the AID categories do not contain any of the CCD categories.
  • The AID interest vector $\beta_w$ can give the estimated interest distribution over the categories.
  • The exemplary threshold $\tau$ can represent the level of interest that is utilized in order to predict the interest; technically, based on the model, a category k can be predicted when the expectation is that a visitor to w can choose to visit a site with category k with probability greater than $\tau$. All contextual categories are used as the CCD set: call it {CCD}.
  • The set of exemplary AID categories {AID} is referred to simply as "AID", and {CCD} as "CCD".
  • Kronecker recall can be defined as follows, for example: Kronecker recall $= 1$ if $|\mathrm{AID} \cap \mathrm{CCD}| \geq 1$, and $0$ otherwise.
  • Kronecker recall can be 1 if there is at least one common category between AID and CCD; it can be 0 if no categories in AID are in the CCD. This is contrasted with "regular" recall, which in this case is the fraction of CCD categories that is successfully predicted, or, for example, can be in the AID: $|\mathrm{AID} \cap \mathrm{CCD}| \,/\, |\mathrm{CCD}|$.
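  • The thresholding and the two recall measures described above can be sketched as follows. Precision is not defined in this part of the text, so the usual definition (the fraction of predicted AID categories that are in the CCD) is assumed; the function names and the example distribution are hypothetical.

```python
def predicted_categories(aid, tau):
    """Categories whose estimated interest probability exceeds the threshold tau."""
    return {category for category, probability in aid.items() if probability > tau}

def kronecker_recall(aid_set, ccd_set):
    """1 if the AID and CCD category sets share at least one category, else 0."""
    return 1 if aid_set & ccd_set else 0

def regular_recall(aid_set, ccd_set):
    """Fraction of CCD categories that also appear in the AID set."""
    return len(aid_set & ccd_set) / len(ccd_set)

def precision(aid_set, ccd_set):
    """Fraction of AID categories that are in the CCD (assumed standard definition)."""
    return len(aid_set & ccd_set) / len(aid_set) if aid_set else None

# Hypothetical AID and CCD for one webpage.
aid = {"hockey": 0.5, "style": 0.3, "food": 0.2}
ccd = {"hockey", "food"}
loose = predicted_categories(aid, 0.25)   # keeps hockey and style
strict = predicted_categories(aid, 0.4)   # keeps hockey only
print(kronecker_recall(loose, ccd), regular_recall(loose, ccd), precision(loose, ccd))
print(kronecker_recall(strict, ccd), regular_recall(strict, ccd), precision(strict, ccd))
```

  • Raising the threshold culls AID categories, so precision can rise while Kronecker recall stays at 1 as long as one CCD category survives the cut.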
  • Figures 10A and 10B illustrate exemplary results assessing whether the AID can predict interests well, as represented by the CCD categories.
  • Three different data samples are used, representing time frames of one hour 1005, six hours 1010, and ten hours 1015.
  • The exemplary curves can represent the average precision/average Kronecker recall tradeoffs achieved by varying the threshold $\tau$.
  • Figure 10A shows that for all threshold values, the average precision and Kronecker recall can both be quite high.
  • Figure 10B shows an illustration which magnifies an important part of the exemplary graph.
  • The results illustrated in Figures 10A and 10B do not vary much based on the different time frames (e.g., between one and ten hours).
  • Average Kronecker recall is better than 90%, with precisions varying from 60% to better than 90%, as $\tau$ can be varied.
  • The AID categories can include the page's contextually determined interest categories in the estimated audience interest distributions, even though the AID categories are estimated based on traffic to other webpages.
  • The exemplary slope of the curves is steep. This can indicate that the contextual categories can actually be the categories with generally the strongest representation in the AID. As $\tau$ can be increased, the precision can go up towards one, so even as the AID categories are culled, in most cases at least one of the "true" contextual categories remains.
  • Exemplary differences between AID and CCD can exist for several reasons.
  • the examples used are summarized in Table 3 below.
  • Such exemplary Table 3 shows the URL of the webpage, below which are shown: the original category or categories (e.g., column 3), the new categories (e.g., those that the AID adds, column 1), the missing categories (e.g., those that are in the CCD but not in the AID, column 2), and the type of case which each […]
  • the AID may not add categories, or no categories are missing, in which case the corresponding column is blank.
  • The AID of the website can be "disease, medicine", with "women's health" as the missing category.
  • A quick look at the website shows that its content can be about a stomach virus; it can be health-related, but does not contain any specific content about women's health.
  • A third case is when the AID can represent unexpected/surprising new categories, which appear odd at first sight, but make sense after some deeper analysis. For example, a website can have a contextual category of "relationships". The AID can add categories […]
  • The CCD category "software" simply seems to be a gross misclassification.
  • The CCD category "fine arts" can be wrong as well for most of the pictures sampled. In a semi-quantitative analysis, two dozen pictures labeled as "fine arts" are randomly sampled. None could be fairly judged as fine art.
  • The closest can be a nice Bob Marley portrait made with the tape pulled out of a cassette tape, and a nice nature photo; almost all were humorous photos. The nature photo seemed to be misclassified as "humor" by the CCD; the AID did not include "humor".
  • the AID can add "humor” to various photos; this is because many of them are humorous photos.
  • the AID can add "celebrity fan/gossip".
  • the AID can add the category "transports," To the school comic (e.g., Figures 13B and 13C),
  • The AID categories can have an overlap with the CCD categories.
  • The AID categories can represent the interests of the visitors, which may not be directly relevant to the actual content.
  • The users who visited the photo of the music-writer tended also to visit "radio" and "cinema" pages.
  • The exemplary systems, methods and computer-accessible mediums can rely on the website visitation behaviors of a massive number of users to build an AID for each website.
  • the interest distribution is dynamic, and behavior-generated, and thus is different than studies based on categorizing the content of websites.
  • The exemplary model can estimate individual users' interest distributions based on their website visitation patterns and the contextual categories of the websites that the users visit. Using these exemplary estimated user interest distributions, the exemplary model then calculates the expected AID of each website.
  • The exemplary model can provide the following meaning for an AID: across most or all visits to a website, if one were to randomly draw one visit, the interest distribution of this unknown visitor is the AID.
  • Exemplary audience interest estimation is of interest to managers for many different reasons. Understanding audience interests can help managers of companies with significant web presence to optimize their content and navigation, create better content for their audience, improve site merchandizing such as the placement of product links and internal offers, solicit sponsorship, and perform other audience analytics. In addition, understanding audience interests is an important goal of many companies in the online advertising industry, where advertisers want to target advertisements based on the interests of website visitors.
  • FIG. 17 shows a block diagram of an exemplary embodiment of a system according to the present disclosure.
  • Exemplary procedures in accordance with the present disclosure described herein can be performed by a processing arrangement and/or a computing arrangement 1702.
  • Such processing/computing arrangement 1702 can be, for example, entirely or a part of, or include, but not limited to, a computer/processor 1704.
  • The computer/processor 1704 can include, for example, one or more microprocessors, and use instructions stored on a computer-accessible medium (e.g., RAM, ROM, hard drive, or other storage device).
  • A computer-accessible medium 1706 (e.g., as described herein above, a storage device such as a hard disk, floppy disk, memory stick, CD-ROM, RAM, ROM, etc., or a collection thereof) can be provided.
  • The computer-accessible medium 1706 can contain executable instructions 1708 thereon.
  • A storage arrangement 1710 can be provided separately from the computer-accessible medium 1706, which can provide the instructions to the processing arrangement 1702 so as to configure the processing arrangement to execute certain exemplary procedures, processes and methods, as described herein above, for example.
  • The exemplary processing arrangement 1702 can be provided with or include an input/output arrangement 1714, which can include, for example, a wired network, a wireless network, the internet, an intranet, a data collection probe, a sensor, etc.
  • The exemplary processing arrangement 1702 can be in communication with an exemplary display arrangement 1712, which, according to certain exemplary embodiments of the present disclosure, can be a touch-screen configured for inputting information to the processing arrangement in addition to outputting information from the processing arrangement, for example.
  • The exemplary display 1712 and/or a storage arrangement 1710 can be used to display and/or store data in a user-accessible format and/or user-readable format.
  • An exemplary purpose of the exemplary simulation is to intuitively illustrate how the exemplary model works and also provide for the correctness of the exemplary model.
  • The setup of the simulation may not be representative of a real dataset, so statistics of the results here may not align with those from the real dataset.
  • The exemplary simulation can assume, e.g., a "perfect", or otherwise simplified, world where all users behave under the assumptions above, that users look for websites whose contextual categories overlap with users' interests. It is "perfect" because a user only visits websites that contain her/his own interests. For example, user u1 interested in category A only visits websites that contain category A, which can be 11, 12, and 17.
  • A small bipartite graph is randomly generated on the assumption. Figure 15A shows this visitation graph. The number of visits from users to websites is shown as a label of the link between them. As is seen below, websites 11 to 13 only contain content of contextual category A, 14 to 16 only contain content of B, and 17 and 18 contain content of both A and B with equal weight.
  • The full graph is processed by excluding singleton users, and splitting the full graph into an aggregation graph and an inference graph, which are shown in Figure 16 and Figure 15B.
  • All links from user u1 can be randomly split into either the inference graph (link u1 → 12) or the aggregation graph (links u1 → 11 and u1 → 17).
  • The checkmarks in the inference/aggregation graph can indicate each website's contextual categories.
  • The exemplary probabilities on the left part of both graphs can indicate the user interest distribution (UID) $\gamma$ estimated by the inference stage of the exemplary model.
  • The exemplary probabilities on the right part of the aggregation model can indicate the audience interest distribution (AID) $\beta_w$ for each website, calculated by the aggregation stage of the exemplary model.
  • All websites' AIDs "recover" the original CCD correctly by plurality vote (on the probabilities).
  • S can have a larger probability in category B (0.875) than in category A (0.125), because it can contain only contextual category B.
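  • The plurality vote on the AID probabilities can be sketched as follows, using the probabilities quoted above (0.875 for category B, 0.125 for category A). The function name `plurality_category` and the alphabetical tie-break are assumptions introduced for illustration; the text does not say how ties are resolved.

```python
def plurality_category(aid):
    """Return the category with the highest AID probability.

    Ties are broken alphabetically (an assumed rule).
    """
    # max() keeps the first maximal element, so sorting first fixes the tie-break.
    return max(sorted(aid), key=lambda category: aid[category])

aid = {"A": 0.125, "B": 0.875}
print(plurality_category(aid))
```

  • For a website whose content is split between two categories with equal weight, a single winner may not be meaningful, so the tie-break rule matters only for such cases.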
  • The user's number of visits to category k (denoted by random variable $X_k$) can be distributed according to a multinomial distribution.
  • The likelihood function can be, for example: $\mathcal{L}(\gamma_n) = \frac{L_n!}{\prod_k x_k!} \prod_k (\gamma_n^k)^{x_k}$ (31), where $x_k$ is the observed value of $X_k$.
  • $L_n$ can be the total number of visits from user n to all websites.
  • $\gamma_n$ can be the user interest distribution for user n.
  • The log-likelihood function then can be, for example: $\log \mathcal{L}(\gamma_n) = \log L_n! - \sum_k \log x_k! + \sum_k x_k \log \gamma_n^k$.
  • The exemplary maximum likelihood estimator of $\gamma_n^k$ for user n based on the exemplary model can be $\hat{\gamma}_n^k = x_k / L_n$, which is the frequency of user n's visits to category (e.g., site) k across user n's visits to all categories (e.g., sites).
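  • The step from the multinomial log-likelihood to the frequency estimator can be filled in with a standard Lagrange-multiplier argument, sketched below. Here $x_k$ denotes the observed number of visits from user n to category k, $L_n = \sum_k x_k$ is the user's total number of visits, and $\gamma_n^k$ is the probability of category k; the multiplier $\lambda$ and the symbol $\Lambda$ are notation introduced only for this sketch. Terms of the log-likelihood that do not depend on $\gamma_n$ are dropped.

```latex
\begin{align*}
\Lambda(\gamma_n, \lambda) &= \sum_k x_k \log \gamma_n^k + \lambda\Bigl(1 - \sum_k \gamma_n^k\Bigr) \\
\frac{\partial \Lambda}{\partial \gamma_n^k} &= \frac{x_k}{\gamma_n^k} - \lambda = 0
  \;\Longrightarrow\; \gamma_n^k = \frac{x_k}{\lambda} \\
\sum_k \gamma_n^k = 1 &\;\Longrightarrow\; \lambda = \sum_k x_k = L_n \\
\hat{\gamma}_n^k &= \frac{x_k}{L_n}
\end{align*}
```

  • The constraint that the probabilities sum to one is what fixes $\lambda$ at the total visit count, which gives the relative-frequency form of the estimator.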

Abstract

The present invention relates to systems, methods and computer-accessible mediums for determining at least one audience interest distribution of at least one content, for example, by receiving first information related to at least one web behavior of at least one user, determining second information related to one or more user interest distributions of the one or more users based on the first information, and determining the one or more audience interest distributions of the one or more contents based on the second information.
PCT/US2013/060156 2012-09-17 2013-09-17 Système et procédé d'estimation d'intérêt d'audience WO2014043699A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US14/428,265 US10599981B2 (en) 2012-09-17 2013-09-17 System and method for estimating audience interest

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201261702096P 2012-09-17 2012-09-17
US61/702,096 2012-09-17

Publications (1)

Publication Number Publication Date
WO2014043699A1 true WO2014043699A1 (fr) 2014-03-20

Family

ID=50278765

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2013/060156 WO2014043699A1 (fr) 2012-09-17 2013-09-17 Système et procédé d'estimation d'intérêt d'audience

Country Status (2)

Country Link
US (1) US10599981B2 (fr)
WO (1) WO2014043699A1 (fr)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016038471A1 (fr) * 2014-09-12 2016-03-17 Yandex Europe Ag Procédé pour estimer des intérêts d'utilisateur
CN111028006A (zh) * 2019-12-02 2020-04-17 支付宝(杭州)信息技术有限公司 一种业务投放辅助方法、业务投放方法及相关装置

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9665584B2 (en) * 2013-03-28 2017-05-30 Linkedin Corporation System and method for recommending actions on a social network
US10090002B2 (en) 2014-12-11 2018-10-02 International Business Machines Corporation Performing cognitive operations based on an aggregate user model of personality traits of users
US10282409B2 (en) * 2014-12-11 2019-05-07 International Business Machines Corporation Performance modification based on aggregation of audience traits and natural language feedback
CN105117772B (zh) * 2015-09-02 2017-10-27 电子科技大学 一种多状态系统可靠性模型的参数估计方法
CN111027737B (zh) * 2019-10-16 2024-02-09 平安科技(深圳)有限公司 基于大数据的职业兴趣预测方法、装置、设备及存储介质
CN111104599B (zh) * 2019-12-23 2023-08-18 北京百度网讯科技有限公司 用于输出信息的方法和装置
EP3955114B1 (fr) * 2020-08-10 2023-11-08 Capital One Services, LLC Procédé et système d'essai de page web numérique
US12008587B2 (en) * 2020-08-21 2024-06-11 The Nielsen Company (Us), Llc Methods and apparatus to generate audience metrics using matrix analysis

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080097834A1 (en) * 1999-04-02 2008-04-24 Overture Sevices, Inc. Method For Optimum Placement Of Advertisements On A Webpage
US20100122178A1 (en) * 1999-12-28 2010-05-13 Personalized User Model Automatic, personalized online information and product services
US20100241625A1 (en) * 2006-03-06 2010-09-23 Veveo, Inc. Methods and Systems for Selecting and Presenting Content Based on User Preference Information Extracted from an Aggregate Preference Signature
US20110246285A1 (en) * 2010-03-31 2011-10-06 Adwait Ratnaparkhi Clickable Terms for Contextual Advertising

Family Cites Families (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060294124A1 (en) * 2004-01-12 2006-12-28 Junghoo Cho Unbiased page ranking
US8930400B2 (en) * 2004-11-22 2015-01-06 Hewlett-Packard Development Company, L. P. System and method for discovering knowledge communities
US7941535B2 (en) * 2008-05-07 2011-05-10 Doug Sherrets System for targeting third party content to users based on social networks
US8108406B2 (en) * 2008-12-30 2012-01-31 Expanse Networks, Inc. Pangenetic web user behavior prediction system
US9600581B2 (en) * 2009-02-19 2017-03-21 Yahoo! Inc. Personalized recommendations on dynamic content
US8489625B2 (en) * 2010-11-29 2013-07-16 Microsoft Corporation Mobile query suggestions with time-location awareness
US8712719B2 (en) * 2011-03-29 2014-04-29 Brian P. Klawinski Method and system for detecting center pivot collision
US8838688B2 (en) * 2011-05-31 2014-09-16 International Business Machines Corporation Inferring user interests using social network correlation and attribute correlation
US20130103609A1 (en) * 2011-10-20 2013-04-25 Evan R. Kirshenbaum Estimating a user's interest in an item
US9607077B2 (en) * 2011-11-01 2017-03-28 Yahoo! Inc. Method or system for recommending personalized content
US10685065B2 (en) * 2012-03-17 2020-06-16 Haizhi Wangju Network Technology (Beijing) Co., Ltd. Method and system for recommending content to a user
US9569432B1 (en) * 2012-08-10 2017-02-14 Google Inc. Evaluating content in a computer networked environment

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016038471A1 (fr) * 2014-09-12 2016-03-17 Yandex Europe Ag Procédé pour estimer des intérêts d'utilisateur
US9740782B2 (en) 2014-09-12 2017-08-22 Yandex Europe Ag Method for estimating user interests
RU2643434C2 (ru) * 2014-09-12 2018-02-01 Общество С Ограниченной Ответственностью "Яндекс" Способ предоставления пользователю сообщения посредством вычислительного устройства и машиночитаемый носитель информации
CN111028006A (zh) * 2019-12-02 2020-04-17 支付宝(杭州)信息技术有限公司 一种业务投放辅助方法、业务投放方法及相关装置
CN111028006B (zh) * 2019-12-02 2023-07-14 支付宝(杭州)信息技术有限公司 一种业务投放辅助方法、业务投放方法及相关装置

Also Published As

Publication number Publication date
US10599981B2 (en) 2020-03-24
US20150242751A1 (en) 2015-08-27

Similar Documents

Publication Publication Date Title
US10599981B2 (en) System and method for estimating audience interest
US10325289B2 (en) User similarity groups for on-line marketing
US10108979B2 (en) Advertisement effectiveness measurements
US10134058B2 (en) Methods and apparatus for identifying unique users for on-line advertising
US8412648B2 (en) Systems and methods of making content-based demographics predictions for website cross-reference to related applications
US8676875B1 (en) Social media measurement
JP6248106B2 (ja) 広告ターゲティングのための否定的なシグナル
Segev et al. Measuring influence on Instagram: a network-oblivious approach
US9208441B2 (en) Information processing apparatus, information processing method, and program
US10163130B2 (en) Methods and apparatus for identifying a cookie-less user
Wang et al. Learning relevance from heterogeneous social network and its application in online targeting
JP7130560B2 (ja) コンテンツを効果的に配信するための動的クリエイティブの最適化
US20120173338A1 (en) Method and apparatus for data traffic analysis and clustering
Xu et al. Integrated collaborative filtering recommendation in social cyber-physical systems
US20140172545A1 (en) Learned negative targeting features for ads based on negative feedback from users
US9256692B2 (en) Clickstreams and website classification
KR20150023432A (ko) 사용자 데모그래픽을 추정하는 방법 및 장치
US20170140397A1 (en) Measuring influence propagation within networks
US20190095530A1 (en) Tag relationship modeling and prediction
US20150206222A1 (en) Method to construct conditioning variables based on personal photos
US10922722B2 (en) System and method for contextual video advertisement serving in guaranteed display advertising
US20150310487A1 (en) Systems and methods for commercial query suggestion
Niu et al. Predicting image popularity in an incomplete social media community by a weighted bi-partite graph
Kim et al. Topic-Driven SocialRank: Personalized search result ranking by identifying similar, credible users in a social network
Al-Qurishi et al. A new model for classifying social media users according to their behaviors

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 13836964; Country of ref document: EP; Kind code of ref document: A1)
WWE Wipo information: entry into national phase (Ref document number: 14428265; Country of ref document: US)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 13836964; Country of ref document: EP; Kind code of ref document: A1)