WO2014088564A1  Bayesian content discovery  Google Patents
 Publication number: WO2014088564A1 (application PCT/US2012/067859)
 Authority: WO
 Grant status: Application
 Prior art keywords: questions, item, user, system, search
Classifications

 G—PHYSICS
 G06—COMPUTING; CALCULATING; COUNTING
 G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
 G06N7/00—Computer systems based on specific mathematical models
 G06N7/005—Probabilistic networks
Description
BAYESIAN CONTENT DISCOVERY

BACKGROUND
[0001] Extensive online collections of content exist, such as restaurants, movies, music, and others. Items in these collections are typically described by many attributes, some of which are predefined and common, and others that are freely generated by users, such as reviews. The problem of content discovery in this environment has traditionally been approached in two ways. In one approach, the user types attributes of interest into a search box and a system returns a ranked list of items that are most relevant to the user's query. In the other approach, the user navigates through a series of menus with item categories and progressively refines the selection to a list of desired items. Most recommendation websites combine both approaches.
SUMMARY
[0002] Content discovery is modeled as an interactive game. In this game, the user has a specific target item on their mind, which is drawn i.i.d. from some probability distribution. In each turn of the game, the system asks a (binary) question, the user answers, and this process is repeated until the system identifies a single item that satisfies all of the user's answers so far. The questions are generated with respect to the expected probability of drawing target items. This distribution is updated after each game. So when the user searches for content the next time, the questions are better tuned to what the user may be looking for, and their number is smaller.
[0003] The above presents a simplified summary of the subject matter in order to provide a basic understanding of some aspects of subject matter embodiments. This summary is not an extensive overview of the subject matter. It is not intended to identify key/critical elements of the embodiments or to delineate the scope of the subject matter. Its sole purpose is to present some concepts of the subject matter in a simplified form as a prelude to the more detailed description that is presented later.
[0004] To the accomplishment of the foregoing and related ends, certain illustrative aspects of embodiments are described herein in connection with the following description and the annexed drawings. These aspects are indicative, however, of but a few of the various ways in which the principles of the subject matter can be employed, and the subject matter is intended to include all such aspects and their equivalents. Other advantages and novel features of the subject matter can become apparent from the following detailed description when considered in conjunction with the drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
[0005] FIG. 1 is an example of a restaurant discovery guide on a smartphone.
[0006] FIG. 2 is an example of a movie discovery guide on a smartphone.
[0007] FIG. 3 is an example of a system that utilizes Bayesian learning to provide a user's target item.
[0008] FIG. 4 is another example of a system that utilizes Bayesian learning to provide a user's target item.
[0009] FIG. 5 is a method of applying Bayesian learning to content discovery.
DETAILED DESCRIPTION
[0010] The subject matter is now described with reference to the drawings, wherein like reference numerals are used to refer to like elements throughout. In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the subject matter. It can be evident, however, that subject matter embodiments can be practiced without these specific details. In other instances, well-known structures and devices are shown in block diagram form in order to facilitate describing the embodiments.
[0011] In generalized binary search, the probability of items being of a user's interest is known. However, this is not true in many real-world scenarios. For instance, in the problem of content discovery, this would correspond to knowing the complete profile of the user, which is more often not the case. This limitation is addressed while utilizing generalized binary search as a framework for modeling interactive search. It is assumed that the probability with which searched items are drawn is unknown. An efficient technique is employed that learns this probability over time as the user interacts with the search engine. The techniques are Bayes optimal, and their regret increases only sublinearly with time.
[0012] The number of searchable items is typically very large. Therefore, the items are organized in a tree-like taxonomy, where each node corresponds to a selection in a menu done by the user (FIGS. 1 and 2). In an example cell phone discovery guide interface 100 of FIG. 1, restaurant review website users can choose from over 100 cuisines, but only 9 can be shown simultaneously 102 on the screen of a smart phone. Additionally, in many interfaces, like a movie discovery guide interface of a cell phone 200, the selections 202 are limited to the screen size, and the user is shown movies that satisfy the criteria but the list also includes many unwanted items.
[0013] In one instance, the content discovery problem is viewed as an interactive game. In this game, the user has a specific target item on their mind, which is drawn from some probability distribution π*. In each turn of the game, the system asks a (binary) question, the user answers, and this process is repeated until the system identifies a single item that satisfies all of the user's answers so far. The goal is to minimize the expected number of asked questions. This problem is known as generalized binary search (see, Sanjoy Dasgupta; Analysis of a greedy active learning strategy; In Advances in Neural Information Processing Systems 17, pages 337-344, 2005 and Robert Nowak; The geometry of generalized binary search; IEEE Transactions on Information Theory, 57(12):7893-7906, 2011), and it can be solved nearly optimally by a greedy algorithm. In each turn of the game, the algorithm selects the question that divides the version space up to that time closest to two halves. The probability π* can be viewed as a user's preferences, and it is typically unknown. The efficient techniques disclosed herein learn this probability over time as the user interacts with a system. The techniques are Bayes optimal and their regret increases only sublinearly with time.
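As an illustration, the greedy rule described above can be sketched in a few lines of Python. This is a hedged sketch, not the patent's implementation: the item collection, the question pool, and the simulated user answer are all invented for the example, and the prior `pi` plays the role of π*.

```python
# Illustrative sketch of greedy generalized binary search: in each turn, ask
# the question whose "yes" probability mass splits the remaining version
# space closest to two halves. All data below are invented for the example.

def greedy_search(items, questions, pi, target):
    """items: list of feature dicts; questions: list of (name, predicate)
    pairs; pi: dict item index -> probability; target: index of the user's
    target item (used only to simulate the user's answers)."""
    version_space = set(range(len(items)))   # items consistent with answers
    asked = []
    while len(version_space) > 1:
        total = sum(pi[i] for i in version_space)

        def balance(q):
            # distance of the question's "yes" mass from an even split
            yes_mass = sum(pi[i] for i in version_space if q[1](items[i]))
            return abs(yes_mass - total / 2.0)

        name, pred = min(questions, key=balance)
        questions = [q for q in questions if q[0] != name]
        answer = pred(items[target])         # simulated user answer
        version_space = {i for i in version_space if pred(items[i]) == answer}
        asked.append(name)
    return version_space.pop(), asked

items = [{'a': 0, 'b': 0}, {'a': 0, 'b': 1}, {'a': 1, 'b': 0}, {'a': 1, 'b': 1}]
questions = [('a', lambda f: f['a'] == 1), ('b', lambda f: f['b'] == 1)]
pi = {i: 0.25 for i in range(len(items))}
found, asked = greedy_search(items, questions, pi, target=2)
# with a uniform prior over four items, two balanced questions suffice
```

Under a skewed prior the greedy rule would ask different questions first, which is exactly the behavior that a learned belief exploits in the sections that follow.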
[0014] In one instance, a mathematical model is used for the interactive game.
It is assumed that the system has a collection of M items, and each item m = 1, 2, ···, M is identified by a feature vector φ(m) ∈ {0,1}^K. The item feature space Φ is defined as:

Φ = {φ : ∃m = 1, 2, ···, M s.t. φ = φ(m)}.

|Φ| is used to denote the cardinality of Φ. The user's preference is modeled as a probability distribution π* over Φ. That is, each time when the user interacts with the system (henceforth referred to as a user-system interaction), with probability π*(φ), they are looking for the item with feature φ (henceforth it is said that the user's target item feature at that interaction is φ). It is further assumed that the user's preference π* is time-invariant, and their target item features in different interactions are statistically independent. That is, the target item features are drawn i.i.d. from π*.
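For concreteness, the definitions above can be illustrated with a toy collection; the four item vectors below are invented for the example:

```python
# Toy item feature space: M = 4 items, each identified by a binary feature
# vector phi(m) in {0,1}^K with K = 3. Phi is the set of distinct vectors,
# so |Phi| can be smaller than M when items share a feature vector.

items = [            # phi(1), ..., phi(4); invented for illustration
    (1, 0, 1),
    (0, 1, 0),
    (1, 0, 1),       # same feature vector as the first item
    (1, 1, 0),
]
Phi = set(items)     # Phi = {phi : there exists m such that phi = phi(m)}
print(len(Phi))      # |Phi| = 3 here, although M = 4
```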
[0015] It is also assumed that during each user-system interaction, the system is allowed to ask a sequence of yes/no (binary) questions from a question pool Q. Each question in Q is represented as a function q: {0,1}^K → {Y, N}, where "Y" denotes the answer "yes" while "N" denotes the answer "no." Specifically, if the user's target item feature is φ ∈ Φ and the system chooses to ask them question q, then their answer will be q(φ) ∈ {Y, N}. In this instance, the question pool Q can be an arbitrary set of such q's, as long as it satisfies the following assumption: whatever the user's target item feature φ ∈ Φ is, if the system has asked them all the questions in Q, then it is able to discover φ. |Q| is used to denote the number of questions in the question pool Q.
[0016] During each interaction, the system asks questions adaptively and continues asking questions until it can determine the user's target item. Specifically, assume the user's target item feature is φ and the system has asked questions q_1, ···, q_n in Q; then the system will observe answers q_1(φ), ···, q_n(φ) ∈ {Y, N}. The current observation state is defined as x = ((q_i, q_i(φ)))_{i=1}^n. Furthermore, x is a terminal state if the system is able to determine φ based on it; otherwise, it is a non-terminal state and the system needs to choose a new question to ask based on x. A (one-interaction) policy T is a decision rule to choose a new question given a non-terminal state x (notation T is used since this policy can be represented as a decision tree). Obviously, if both the policy T and the user's target item feature φ are given, the sequence of questions the system will ask is completely determined, and N(T, φ) is used to denote the length of this question sequence.
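The observation-state bookkeeping in this paragraph can be sketched as follows; the feature space, questions, and answers are assumptions made for the example:

```python
# Sketch of observation states: x is the list of (question, answer) pairs
# asked so far; x is terminal once exactly one feature vector in Phi is
# consistent with every recorded answer. Data are invented for illustration.

def consistent(Phi, x):
    """Feature vectors in Phi that agree with every (q, answer) pair in x."""
    return {phi for phi in Phi if all(q(phi) == a for q, a in x)}

def is_terminal(Phi, x):
    return len(consistent(Phi, x)) == 1

Phi = {(0, 0), (0, 1), (1, 0)}
q0 = lambda phi: 'Y' if phi[0] == 1 else 'N'   # "is the first feature 1?"
q1 = lambda phi: 'Y' if phi[1] == 1 else 'N'   # "is the second feature 1?"

x = [(q0, 'N')]                   # the user answered "no" to q0
assert not is_terminal(Phi, x)    # two vectors remain consistent
x.append((q1, 'Y'))               # then "yes" to q1
assert is_terminal(Phi, x)        # only (0, 1) remains: phi is determined
```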
[0017] Generally speaking, the system's objective during each interaction is to minimize the expected number of questions it needs to ask to discover the user's target item. For the oracle case, when the system is assumed to know the user preference π*, it aims to solve:

min_T E_{φ~π*}[N(T, φ)]   (Eq. 1)

in each interaction. Of course, in practice, the system does not know π* beforehand and needs to learn it while interacting with the user (discussed infra).
[0018] Before discussing the more realistic case in which the system needs to learn π*, it is worth pointing out that the oracle case of this problem (i.e., when the system is assumed to know π*) is well-studied, and various methods have been proposed. Specifically, from the perspective of stochastic control, the oracle problem can be formulated as a Markov decision process (MDP) and hence can be solved by dynamic programming (DP). While in theory DP can obtain the optimal policy T*, in practice it tends to be computationally intractable due to the curse of dimensionality, as has occurred in many other problems. It has also long been known to information theorists that a special case of this oracle problem can be efficiently solved by Huffman coding (see, Thomas Cover and Joy Thomas; Elements of Information Theory; John Wiley & Sons, Hoboken, New Jersey, 2006); however, this approach cannot be generalized to the general case. In addition to these two exact solution methods, some approximation algorithms aiming for near-optimal solutions have also been proposed. For example, it is well known that the oracle case of this problem can be solved nearly optimally and computationally efficiently by a greedy algorithm (see, Dasgupta; and Daniel Golovin and Andreas Krause; Adaptive Submodularity: Theory and Applications in Active Learning and Stochastic Optimization; Journal of Artificial Intelligence Research, 42:427-486, 2011). Recently, based on the notion of adaptive submodularity, the performance bound on this greedy algorithm has been further tightened (see, Golovin et al.).
[0019] As discussed supra, in practice, the system needs to learn the user preference π* while interacting with the user. Since it is assumed that the system cannot influence the user's behavior (the user's target item features are drawn from π* no matter what the system has done), there is no need and no way for the system to explore. This distinguishes this problem from classical multi-armed bandit problems (see, Peter Auer, Nicolò Cesa-Bianchi, and Paul Fischer; Finite-time analysis of the multiarmed bandit problem; Machine Learning, 47:235-256, 2002). Thus, a content discovery algorithm is provided herein based on Bayesian learning, which is referred to as the Bayesian content discovery algorithm.
[0020] Specifically, it is assumed that the system has a prior belief P_0 over the user preference π*, which is a probability distribution over all the possible realizations of π*. The system updates its belief at the end of each interaction, after observing the user's target item feature in that interaction (which is a sample from π*). During interaction t, the system exploits its current belief P_t to derive an adaptive (one-interaction) policy T_t*, which (exactly or approximately) solves:

min_T E_{π~P_t}[E_{φ~π}[N(T, φ)]]   (Eq. 2)

and applies it in this interaction. In other words, T_t* is optimal (or near-optimal) with respect to the system's current belief P_t.
[0021] It is now shown that the optimization problem (Eq. 2) can be further simplified. Specifically, if the certainty-equivalence (CE) user preference in interaction t is defined as π̄_t(φ) = E_{π~P_t}[π(φ)], ∀φ ∈ Φ, then for any policy T, we have:

E_{π~P_t}[E_{φ~π}[N(T, φ)]] = E_{π~P_t}[Σ_{φ∈Φ} π(φ)N(T, φ)] = Σ_{φ∈Φ} N(T, φ)E_{π~P_t}[π(φ)] = Σ_{φ∈Φ} π̄_t(φ)N(T, φ) = E_{φ~π̄_t}[N(T, φ)].

Thus, min_T E_{π~P_t}[E_{φ~π}[N(T, φ)]] is equivalent to min_T E_{φ~π̄_t}[N(T, φ)]. Furthermore, notice that min_T E_{φ~π̄_t}[N(T, φ)] is mathematically equivalent to the oracle problem (with the true user preference π* replaced by the CE user preference π̄_t); thus, any method (either exact or approximate) discussed supra can be used to solve min_T E_{φ~π̄_t}[N(T, φ)]. The Bayesian content discovery algorithm is detailed in Algorithm 1 (TABLE 1).
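The certainty-equivalence identity above can be verified numerically on a tiny example; the belief over two candidate preferences and the question-count table below are invented for the example:

```python
# Numerical check of the CE identity: averaging the expected question count
# over the belief P_t equals evaluating it once under the CE preference
# pi_bar_t(phi) = E_{pi~P_t}[pi(phi)]. All numbers are illustrative.

def ce_preference(belief):
    """belief: list of (weight, pi) pairs, pi a dict phi -> probability."""
    ce = {}
    for w, pi in belief:
        for phi, p in pi.items():
            ce[phi] = ce.get(phi, 0.0) + w * p
    return ce

def expected_questions(pi, lengths):
    """E_{phi~pi}[N(T, phi)] for a fixed policy T with counts `lengths`."""
    return sum(pi[phi] * lengths[phi] for phi in pi)

belief = [(0.6, {'x': 0.5, 'y': 0.5}),    # P_t puts weight 0.6 here
          (0.4, {'x': 0.25, 'y': 0.75})]  # and weight 0.4 here
lengths = {'x': 1, 'y': 2}                # N(T, phi) for some fixed T

lhs = sum(w * expected_questions(pi, lengths) for w, pi in belief)
rhs = expected_questions(ce_preference(belief), lengths)
assert abs(lhs - rhs) < 1e-12             # both equal 1.6 here
```

Because the identity holds for every fixed policy T, minimizing over T on either side gives the same optimizer, which is what lets the belief-averaged problem be handed to any oracle-case solver.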
TABLE 1 - Bayesian Content Discovery Algorithm

Algorithm 1: Bayesian Content Discovery
Input: item feature space Φ, question pool Q, and prior belief P_0
for t = 0, 1, 2, ··· do
    Compute the CE user preference π̄_t(φ) = E_{π~P_t}[π(φ)], ∀φ ∈ Φ
    Solve min_T E_{φ~π̄_t}[N(T, φ)] (exactly or approximately), and denote the solution as T_t*
    Apply T_t* in interaction t, and observe φ_t, the user's target item feature in interaction t
    Update the system's posterior belief P_{t+1} via Bayes' rule
end for
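As a concrete sketch of Algorithm 1, suppose the prior P_0 is a symmetric Dirichlet over Φ (an assumption made here for tractability, not specified by the algorithm). The CE preference is then the posterior mean, which reduces to smoothed empirical counts of previously observed target features; all names and the simulated user below are invented for the example.

```python
# Sketch of the Bayesian content discovery loop under an assumed Dirichlet
# prior: the CE preference is the posterior mean (smoothed counts), and the
# Bayes-rule update is a count increment. The simulated user is invented.
import random

def bayesian_content_discovery(features, alpha, interactions, sample_target):
    """features: the item feature space Phi; alpha: Dirichlet concentration;
    sample_target: draws the user's target feature phi_t from pi*."""
    counts = {phi: 0 for phi in features}   # sufficient statistics of P_t
    history = []
    for t in range(interactions):
        total = sum(counts.values()) + alpha * len(features)
        ce = {phi: (counts[phi] + alpha) / total for phi in features}
        phi_t = sample_target()             # observed target feature
        history.append(ce[phi_t])           # CE mass on the true target
        counts[phi_t] += 1                  # posterior update (Bayes' rule)
    return history, counts

random.seed(0)
features = ['horror', 'comedy', 'drama']
# simulated user preference pi*: mostly horror
hist, counts = bayesian_content_discovery(
    features, alpha=1.0, interactions=200,
    sample_target=lambda: random.choices(features, [0.8, 0.1, 0.1])[0])
# the belief concentrates, so later questions can be tuned toward horror
```

The growing CE mass on the user's typical target is what lets the solved policies T_t* ask fewer questions in later interactions.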
[0022] In FIG. 3, an example system 300 is shown which incorporates a
Bayesian content discoverer 302 that employs a Bayesian content discovery algorithm to determine a user's target item 304 (or content) based on a series of adaptive user questions 306. In one instance, the example system 300 can be utilized in a processor that is specifically programmed to provide the Bayesian content discoverer 302. Thus, the Bayesian content discoverer 302 can also interface with a user interface such as a display device and the like to provide the user questions 306. The user interface device can be hard wired and/or wireless in nature and can be a remote device, such as a cell phone and the like, that allows interactive input. In this manner, the user questions can be relayed to a user, and the user can then reply to the questions, which are relayed back to the Bayesian content discoverer 302. This allows the Bayesian content discoverer 302 to process the answers to the questions and determine which questions will be asked of a user during a subsequent user search. The questions are optimized so that the next time the user searches for an item, the questions have been adapted based on the algorithm and the user reaches the searched item more quickly. Thus, if a user is searching for sushi restaurants in a first search, the system 300 might rank Asian cuisine higher in the taxonomy for subsequent restaurant searches and make it more easily selectable and the like for the user, reducing user frustration in subsequent searches.
[0023] Another example system 400 is illustrated in FIG. 4. In this instance, a
Bayesian content discoverer 402 is comprised of an adaptive searcher 404 and a belief updater 406. The belief updater 406 uses a prior belief or probability based on target items found by a user 414 up to that point. This belief is then used to adapt binary questions in the adaptive searcher 404 to build a set of user questions 408 to be asked of the user 414 to help find a user's desired target item 412 (what the user is searching for). Each time the user 414 answers a set of questions and finds a target item 410, the belief updater 406 updates its belief/probability of finding items by using
Bayesian learning (e.g., Algorithm 1, supra), and the updated belief/probability is used by the adaptive searcher 404 to adapt a future set of questions for a user's subsequent target item searches.
[0024] The Bayesian learning is used to adapt the set of questions such that fewer questions need be asked of the user before a target item is found. In one instance, the derived probability can be used to increase the ranking of a question for subsequent searches. This increase in ranking can be implemented such that the question is asked of the user substantially sooner and/or presented (e.g., displayed, etc.) to the user at a higher level. Either means will allow the user to more quickly find their target item. For example, users who frequently search for videos about pets might be presented with a pet video category at a top level and/or asked if they are looking for pet videos sooner in the question hierarchy.
[0025] In view of the exemplary systems shown and described above, methodologies that can be implemented in accordance with the embodiments will be better appreciated with reference to the flow charts of FIG. 5. While, for purposes of simplicity of explanation, the methodologies are shown and described as a series of blocks, it is to be understood and appreciated that the embodiments are not limited by the order of the blocks, as some blocks can, in accordance with an embodiment, occur in different orders and/or concurrently with other blocks from that shown and described herein. Moreover, not all illustrated blocks may be required to implement the methodologies in accordance with the embodiments.
[0026] FIG. 5 is a flow diagram of a method 500 of discovering content. The method starts by obtaining a prior belief on the probability of a user choosing their content of interest (i.e., their target item) 502. The prior belief is based on target items that the user has selected up to that point in time. A user searches for a target item 504. The user is asked a question with respect to the belief for choosing the target item 506. This can be accomplished by presenting a menu with choices that a user can select and the like. The user responds to the question 508. The response can be a selection of one of the choices in a menu and the like. A determination is then made as to whether the remaining number of items in the selection space is more than one item 510. If more than one item still exists in the space, the user is asked another question with respect to the belief for choosing the target item 506. This question and answer loop can continue until one item remains in the selection space. Thus, at this point, the target item is found 512. The probability of drawing target items based on the found target item is then updated 514, and the user can begin another search 504 for a new target item, where the process is based on updates gained from prior found target items. The method 500 optimizes each new set of adapted questions so that the user arrives at the target item as efficiently as possible.
[0027] The method 500 employs Bayesian learning to adapt the set of questions for the desired user content. For example, if a user constantly watches horror films, a content discovery guide will raise this genre higher in the question hierarchy. Thus, a user may be asked whether they desire to find a horror movie sooner in a set of questions than would be asked of a brand new user without a history of responses to adapted questions. This allows the user to more quickly and efficiently find their target item. In other words, the ranking of the question is moved higher so that if it is relevant to the searched item (target item) in a subsequent user search, it will be more likely to be asked of the user sooner than later. Likewise, the question (or selection) can be presented to a user first in a display of listed genre types and the like. This allows the user to select the correct genre at a higher level and move more quickly to finding their desired item (e.g., a horror movie).
[0028] What has been described above includes examples of the
embodiments. It is, of course, not possible to describe every conceivable combination of components or methodologies for purposes of describing the embodiments, but one of ordinary skill in the art can recognize that many further combinations and permutations of the embodiments are possible. Accordingly, the subject matter is intended to embrace all such alterations, modifications and variations that fall within the scope of the appended claims. Furthermore, to the extent that the term "includes" is used in either the detailed description or the claims, such term is intended to be inclusive in a manner similar to the term "comprising" as "comprising" is interpreted when employed as a transitional word in a claim.
Claims
Priority Applications (1)
Application Number  Priority Date  Filing Date  Title 

PCT/US2012/067859 WO2014088564A1 (en)  2012-12-05  2012-12-05  Bayesian content discovery 
Publications (1)
Publication Number  Publication Date 

WO2014088564A1 (en)  2014-06-12 
Family
ID=47352063
Family Applications (1)
Application Number  Title  Priority Date  Filing Date 

PCT/US2012/067859 WO2014088564A1 (en)  2012-12-05  2012-12-05  Bayesian content discovery 
Country Status (1)
Country  Link 

WO (1)  WO2014088564A1 (en) 
NonPatent Citations (11)
Title 

DANIEL GOLOVIN; ANDREAS KRAUSE: "Adaptive Submodularity: Theory and Applications in Active Learning and Stochastic Optimization", JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, vol. 42, 2011, pages 427-486, XP080396750 
G. BELLALA: "Information and decision theoretic approaches to problems in active diagnosis", PH.D. THESIS, UNIVERSITY OF MICHIGAN, 15 June 2012 (2012-06-15), XP055075869, Retrieved from the Internet <URL:http://hdl.handle.net/2027.42/91606> [retrieved on 2013-08-16] * 
L. YANG; S. HANNEKE; J. CARBONELL: "A theory of transfer learning with applications to active learning", MACHINE LEARNING, vol. 90, no. 2, 7 July 2012 (2012-07-07), pages 161-189, XP035163789, DOI: 10.1007/S10994-012-5310-Y * 
M. NAGHSHVAR; T. JAVIDI: "Sequentiality and Adaptivity Gains in Active Hypothesis Testing", ARXIV:1211.2291V1, 10 November 2012 (2012-11-10), XP055075851, Retrieved from the Internet <URL:http://arxiv.org/abs/1211.2291v1> [retrieved on 2013-08-16] * 
PETER AUER; NICOLÒ CESA-BIANCHI; PAUL FISCHER: "Finite-time analysis of the multiarmed bandit problem", MACHINE LEARNING, vol. 47, 2002, pages 235-256 
R. WAEBER; P. I. FRAZIER; S. G. HENDERSON: "A Bayesian approach to stochastic root finding", PROCEEDINGS OF THE 2011 IEEE WINTER SIMULATION CONFERENCE (WSC'11), 11 December 2011 (2011-12-11), pages 4033-4045, XP032112293, DOI: 10.1109/WSC.2011.6148093 * 
ROBERT NOWAK: "The geometry of generalized binary search", IEEE TRANSACTIONS ON INFORMATION THEORY, vol. 57, no. 12, 2011, pages 7893-7906, XP080373469 
SANJOY DASGUPTA: "Analysis of a greedy active learning strategy", ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS, vol. 17, 2005, pages 337-344, XP032187496 
THOMAS COVER; JOY THOMAS: "Elements of Information Theory", 2006, JOHN WILEY & SONS 
Z. WEN; B. KVETON; B. ERIKSSON; S. BHAMIDIPATI: "Sequential Bayesian search", PAPERS ACCEPTED FOR THE 30TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING (ICML'13), TO BE HELD 16-21 JUNE 2013, 19 March 2013 (2013-03-19), XP055075569, Retrieved from the Internet <URL:http://www.stanford.edu/~zhengwen/pdffiles/ICML2013_Paper.pdf> [retrieved on 2013-08-16] * 
Z. WEN; B. KVETON; S. BHAMIDIPATI: "Learning to discover: a Bayesian approach", PAPERS ACCEPTED FOR THE NIPS 2012 WORKSHOP ON BAYESIAN OPTIMIZATION & DECISION MAKING, TO BE HELD ON 7 DECEMBER 2012, 1 December 2012 (2012-12-01), XP055075564, Retrieved from the Internet <URL:http://javadazimi.com/nips2012ws/index_files/LearningtoDiscover.pdf> [retrieved on 2013-08-16] * 
Similar Documents
Publication  Publication Date  Title 

Lika et al.  Facing the cold start problem in recommender systems  
Bostandjiev et al.  TasteWeights: a visual interactive hybrid recommender system  
US20080215583A1 (en)  Ranking and Suggesting Candidate Objects  
US20090006360A1 (en)  System and method for applying ranking svm in query relaxation  
US20110060717A1 (en)  Systems and methods for improving web site user experience  
US20090144780A1 (en)  Socially collaborative filtering  
Fernández-Tobías et al.  Cross-domain recommender systems: A survey of the state of the art  
US20110131077A1 (en)  Context-Aware Recommendation Module Using Multiple Models  
US20090164442A1 (en)  Interactive hybrid recommender system  
US20110202557A1 (en)  System and method for crowdsourced template based search  
Ricci et al.  Acquiring and revising preferences in a critique-based mobile recommender system  
Chen et al.  A hybrid recommendation algorithm adapted in e-learning environments  
US20120143911A1 (en)  Recommendations based on topic clusters  
CN102346899A (en)  Method and device for predicting advertisement click rate based on user behaviors  
CN101321190A (en)  Recommend method and recommend system of heterogeneous network  
US8364664B2 (en)  Web browser accessible search engine that identifies search result maxima through user search flow and result content comparison  
Costa-Montenegro et al.  Which App? A recommender system of applications in markets: Implementation of the service for monitoring users’ interaction  
CN103198418A (en)  Application recommendation method and application recommendation system  
Bouneffouf et al.  A contextual-bandit algorithm for mobile context-aware recommender system  
Shi et al.  Exploiting user similarity based on rated-item pools for improved user-based collaborative filtering  
Singer et al.  Detecting memory and structure in human navigation patterns using markov chain models of varying order  
Fernández-Tobías et al.  Alleviating the new user problem in collaborative filtering by exploiting personality information  
Zhao et al.  Interactive collaborative filtering  
Garcia et al.  Preference elicitation techniques for group recommender systems  
US20140195506A1 (en)  System and method for generating suggestions by a search engine in response to search queries 
Legal Events
Date  Code  Title  Description 

121  Ep: the epo has been informed by wipo that ep was designated in this application 
Ref document number: 12799496 Country of ref document: EP Kind code of ref document: A1 

NENP  Non-entry into the national phase in: 
Ref country code: DE 

122  Ep: pct application nonentry in european phase 
Ref document number: 12799496 Country of ref document: EP Kind code of ref document: A1 