WO2010009314A2 - System and method of using automated collaborative filtering for decision-making in the presence of data imperfections - Google Patents

System and method of using automated collaborative filtering for decision-making in the presence of data imperfections

Publication number
WO2010009314A2
Authority
WO
WIPO (PCT)
Prior art keywords
data
prediction
collaborative filtering
module
user
Prior art date
Application number
PCT/US2009/050848
Other languages
French (fr)
Other versions
WO2010009314A3 (en
Inventor
Thanuka L. Wickramarathne
Kamal Premaratne
Miroslav Kubat
Dushyantha T. Jayaweera
Original Assignee
University Of Miami
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University Of Miami filed Critical University Of Miami
Publication of WO2010009314A2 publication Critical patent/WO2010009314A2/en
Publication of WO2010009314A3 publication Critical patent/WO2010009314A3/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/10 Office automation; Time management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation

Definitions

  • the invention relates generally to data pattern analysis and in particular to a method and system for automatically accommodating data imperfections and making decisions using imperfect data, without implementing simplifying assumptions.
  • ACF Automated Collaborative Filtering
  • An example of a typical application is an e-commerce system where customers rate items and receive automated recommendations based on detected similarity patterns.
  • One of the problems encountered by conventional ACF algorithms is data imperfection, e.g., limited statistics, subjective judgment, etc.
  • Existing techniques are rarely capable of dealing with imperfections in user-supplied ratings.
  • the invention provides systems and methods of modeling various imperfections, propagating partial knowledge throughout the decision-making process, providing a framework for incorporating background knowledge into automated collaborative filtering and providing predictions having associated reliability information.
  • an automated collaborative filtering device communicates with a client terminal device and receives data from a plurality of sources.
  • the automated collaborative filtering device includes a storage module that stores data gathered from the plurality of sources, wherein the data includes contextual information and wherein the storage module has a database that includes filled data slots and empty data slots.
  • a prediction module is provided that communicates with the storage module and the client terminal device, the prediction module being programmed to generate prediction data based on the contextual information, wherein the prediction data is provided to populate the empty data slots.
  • the invention provides a method of performing automated collaborative filtering that includes providing a database that includes filled data slots and empty data slots and storing data gathered from a plurality of sources into the database. Contextual information is obtained from the stored data and prediction data is generated based on the contextual information so that the empty data slots may be populated with the prediction data.
  • the invention provides an automated collaborative filtering device that communicates with a client terminal device and receives data from a plurality of sources.
  • the automated collaborative filtering device includes a storage module that stores data gathered from the plurality of sources, wherein the data includes contextual information and wherein the storage module has a database that includes filled data slots and empty data slots.
  • a probability rating module is provided that communicates with the storage module and the client terminal device, wherein the probability rating module is programmed to extract predefined values from the data and transform the predefined values into a probability of obtaining the predefined values.
  • a prediction module is also provided that communicates with the probability rating module and is programmed to generate prediction data based on the contextual information and the probability of obtaining the predefined values, wherein the prediction data is provided to populate the empty data slots.
  • FIG. 1 illustrates a system diagram according to one embodiment of the invention
  • FIG. 2 illustrates a table containing user ratings of a provider's items
  • FIG. 3 illustrates a table containing Dempster-Shafer ("DS") theoretic notations
  • FIG. 4 illustrates a set of partial probability models of user character profiles
  • BPA basic probability assignment
  • FIG. 6 illustrates a graph of the variation of the mean absolute error over time for different values of a dispersion factor when applying the principles of the invention to an exemplary dataset without imperfections
  • FIG. 7 illustrates a graph of the variation of the mean absolute error in relation to the neighborhood size, when applying the principles of the invention to a dataset without imperfections
  • FIG. 8 illustrates a graph of the variation of the mean absolute error in relation to the similarity threshold when applying the principles of the invention to a dataset without imperfections
  • FIG. 9 illustrates a graph of the variation of the mean absolute error in relation to the neighborhood size when applying the principles of the invention to a dataset that includes imperfections
  • FIG. 10 illustrates a graph of the variation of the mean absolute error in relation to the similarity threshold when applying the principles of the invention to a dataset that includes imperfections
  • FIG. 11 illustrates a table containing exemplary predictions in accordance with the principles of the invention
  • FIG. 12 illustrates a table containing performance comparison data between the invention and prior methods for making hard decisions on a dataset without imperfections
  • FIG. 13 illustrates a table containing performance comparison data between the invention and prior methods for making soft predictions on a dataset without imperfections
  • FIG. 14 illustrates a table containing performance comparison data between the invention and prior methods for a dataset that includes imperfections
  • FIG. 15 illustrates a graph of the variation of the new DS-theoretic measure
  • FIG. 1 illustrates an example of the system architecture 100 according to one embodiment of the invention.
  • Client terminal devices 105a-105n may be coupled to one or more automated collaborative filtering devices 110 via a wired network, a wireless network, a combination of the foregoing and/or other networks, such as a network 107.
  • the client terminal devices 105 may include any number of different types of client terminal devices, such as personal computers, laptops, smart terminals, personal digital assistants (PDAs), cell phones, Web TV systems, video game consoles, and devices that combine the functionality of one or more of the foregoing or other client terminal devices.
  • the client terminal devices 105 may include processors, RAM, USB interfaces, telephone interfaces, satellite interface, microphones, speakers, a stylus, a computer mouse, a wide area network interface, a local area network interface, hard disks, wireless communication interfaces, DVD/CD reader/burners, a keyboard, a flat touch-screen display, and a display, among other components.
  • the client terminal devices 105 may communicate with the automated collaborative filtering device 110, other client terminal devices 105 and/or other systems.
  • the automated collaborative filtering device 110 may include any number of different types of automated collaborative filtering devices, such as servers, personal computers, laptops, smart terminals, video game consoles, and devices that combine the functionality of one or more of the foregoing or other automated collaborative filtering devices.
  • the automated collaborative filtering device 110 may be of modular construction to facilitate adding, deleting, updating and/or amending modules therein and/or features within modules.
  • Modules may include a prediction module 112, a storage module 114, a probability rating module 116 or other modules. It should be readily understood that a greater or lesser number of modules might be used.
  • One skilled in the art will readily appreciate that the invention may be implemented using individual modules, a single module that incorporates the features of two or more separately described modules, individual software programs, and/or a single software program.
  • Automated Collaborative Filtering (ACF) has spawned a whole family of techniques and algorithms employed in a myriad of e-commerce applications.
  • the invention provides a prediction module 112 having a unified framework with ACF that is based on Dempster-Shafer ("DS") belief theoretic notions.
  • the prediction module 112 provides Collaborative Filtering based on the Dempster-Shafer belief theoretic framework ("CoFiDS") and is capable of conveniently modeling a wider class of data imperfections, propagating partial knowledge throughout the entire decision-making process, providing a framework for incorporating background knowledge into the ACF task, and providing a 'soft' decision that possesses information regarding the reliability of the prediction being made.
  • the automated collaborative filtering device 110 may include a storage module 114 for storing data.
  • FIG. 2 illustrates an exemplary database structure that includes customer rating data directed to a subset of a company's products.
  • the database structure may be stored in the storage module 114.
  • Each row represents a customer ($U_1$ to $U_5$), and each column represents a product ($I_1$ to $I_7$); the field $(i, j)$ contains user $U_i$'s rating of item $I_j$.
  • An empty field indicates that the user has not rated the corresponding item.
  • the invention is directed to improving prediction accuracy on the behaviors of alternative approaches by combining ACF with other recommendation systems and on explaining the predictions of the ACF algorithms.
  • a function of recommender applications is to predict how a user might rate other items, such as the item indicated by the question mark in the last column of user U5 in FIG. 2, based on a given ratings matrix and based on known ratings provided by a particular user.
  • Known recommender applications of ACF include, for example, AMAZON.COM ® book recommender and BLOCKBUSTER ® and NETFLIX ® video recommenders, among other recommenders.
  • the prediction module 112 is directed to providing a mechanism for accommodating imperfections in user ratings, e.g., ambiguities, uncertainties, etc.
  • An exemplary database that has served as a benchmark domain in many studies is MovieLens, where about 1,000 users have rated more than 1,600 movies. Since ratings perforce are subjective, the invention considers that different users have varying criteria for providing a film with an "excellent" rating. Additionally, the prediction module 112 takes into account that a user may choose different values depending on other intangible factors, such as a momentary mood or perhaps in comparison to other films that he or she has seen, and rated, at about the same time, among other intangible factors. The prediction module 112 provides a method for handling ambiguous and uncertain ratings in order to create accurate modeling.
  • Another exemplary embodiment of the invention is directed to HIV treatment by highly active antiretroviral therapy ("HAART").
  • Patients are administered drug combinations, referred to as drug cocktails.
  • the concrete choice is based on the recommendations of the Department of Health and Human Services and on the results of large studies that may not reflect the idiosyncrasies of the given case.
  • Physicians may adjust treatments based on their experience with the successes/failures under given circumstances. From the perspective of the ACF scenario, the physician may generate a "ratings matrix" whose rows and columns correspond to patients and drug cocktails, respectively.
  • the entries may quantify an effectiveness of the drug cocktail when administered to the patient.
  • the ratings may not be "hard” (or “crisp”, or “perfect"), such as when the ratings have been obtained by a team's collective decision.
  • the invention is directed to improving ACF methodology by addressing questions such as: How should user preferences be modeled? How can this model be used for extracting useful knowledge and making reliable predictions that are robust against data imperfections? How can the prediction accuracy be improved? How can data sparsity and cold-start, two common problems in traditional ACF algorithms, be addressed in this setup? Data sparsity refers to the difficulties generated by the sparse nature of the ratings matrix. Cold-start refers to the difficulties associated with making predictions for newly introduced users and/or items.
  • the invention exploits background knowledge that may be available in real-world applications and provides techniques for overcoming data imperfections.
  • the prediction module 112 uses the DS theoretic framework and offers a mechanism to represent a variety of data imperfections, e.g., probabilistic uncertainties, qualitative evidence, evidence ambiguities, missing information, among other data imperfections.
  • the prediction module 112 also may account for and represent ignorance and incomplete knowledge.
  • the prediction module 112 may use DS-based techniques in applications where the integrity of the decision making process and its robustness against modeling errors caused by lack of precise information are critical, such as in battlefield target tracking and situation awareness, among other critical situations.
  • ACF systems operate using a subset of data.
  • the data subset represents items having similar qualities to an item whose ratings are to be predicted.
  • non-ACF systems are deficient at least because they operate using entire item populations, wherein the entire item populations may be represented by a sparse rating matrix.
  • the sparse rating matrix may render difficult the task of identifying similar items.
  • the number of drug cocktails prescribed to each patient may be small compared to the number of available drug cocktails.
  • the several drug cocktails may not have been rated at all.
  • the number of items that are co-rated by more than one user may be small or the lack of statistical representativeness may render predictions unreliable.
  • the prediction module 112 may apply background knowledge about the users and/or items to mitigate data imperfections and may fuse the background knowledge with that yielded by ACF.
  • the prediction module 112 may apply the DS theoretic basis of CoFiDS to address data sparsity. While it is known to replace each unrated entry of the ratings matrix by a vacuous mass structure, the prediction module 112 applies CoFiDS to narrow the uncertainty inherent in the vacuous mass structure by taking advantage of the background knowledge. For example, the prediction module 112 may fill in the values of unrated items prior to ACF, which increases the computed user-to-user similarity.
  • the invention further may resolve the cold-start problem of using the system when few ratings are available.
  • the prediction module 112 may apply the DS theoretic model and may use background knowledge to populate drug response entries corresponding to a new patient. The same concept may apply to a newly introduced drug cocktail.
  • a user whose ratings are currently being predicted may be referred to as an active user.
  • An item that is rated by multiple users may be referred to as co-rated by those users.
  • co-rated items are used to determine whether two users are 'similar' to each other.
  • a similarity designation between a pair of users can be identified as a mapping $s : U \times U \mapsto [0,1]$, where $s_{i,j}$ denotes the similarity between users $U_i$ and $U_j$. The higher the value of $s_{i,j}$, the closer the similarity between $U_i$ and $U_j$.
  • An $M \times M$ symmetric matrix created as $S = [s_{i,j}]$ may be referred to as the (user-user) similarity matrix.
  • the prediction module 112 may apply DS theory to define $\Theta = \{\theta_1, \ldots, \theta_n\}$ as a finite set of mutually exclusive and exhaustive propositions about some problem domain.
  • the propositions signify the corresponding 'scope of expertise' and are referred to as its frame of discernment ("FoD").
  • a proposition $\theta_i$, referred to as a singleton, represents the lowest level of discernible information in this FoD.
  • the elements of the power set of $\Theta$, denoted $2^{\Theta}$, form all propositions of interest.
  • a proposition that is not a singleton is referred to as a composite.
  • the term "proposition" denotes both singletons and composites.
  • Cardinality of set $A$ is denoted by $|A|$.
  • the mapping $m : 2^{\Theta} \mapsto [0,1]$ is a basic probability assignment ("BPA") or mass structure for the FoD $\Theta$ if $m(\emptyset) = 0$ and $\sum_{A \subseteq \Theta} m(A) = 1$.
  • a proposition that possesses a nonzero mass is referred to as a focal element.
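  • the set of focal elements is the core and is denoted by $\mathcal{F}$.
  • the triple $\{\Theta, \mathcal{F}, m\}$ is referred to as the body of evidence ("BoE").
  • the number of focal elements in this BoE is $|\mathcal{F}|$. For a BoE $\{\Theta, \mathcal{F}, m\}$ and $A \subseteq \Theta$, the belief and plausibility of $A$ are $Bl(A) = \sum_{B \subseteq A} m(B)$ and $Pl(A) = 1 - Bl(\bar{A})$, where $\bar{A}$ denotes the complement of $A$ in $\Theta$.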
  • m(A) measures the support assigned to proposition A only and the belief assigned to A takes into account the supports for all proper subsets of A.
  • Bl(A) represents the total support that can move into A without any ambiguity.
  • Pl(A) represents the extent to which one finds A plausible. When the core contains only singletons, the BPA, belief and plausibility all reduce to probability.
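As a minimal sketch of the quantities described above, the Python fragment below stores a BPA as a dictionary from focal elements (frozensets of labels) to masses and computes belief and plausibility from it. The four-label frame, the function names, and the example masses are illustrative assumptions and are not taken from the patent figures.

```python
from typing import Dict, FrozenSet

# A BPA maps each focal element (a subset of the frame) to its mass.
BPA = Dict[FrozenSet[str], float]

# Assumed frame of discernment for a ratings scenario.
THETA = frozenset({"Poor", "Fair", "Good", "Excellent"})


def belief(m: BPA, a: FrozenSet[str]) -> float:
    """Bl(A): total mass of the focal elements contained in A."""
    return sum(mass for b, mass in m.items() if b <= a)


def plausibility(m: BPA, a: FrozenSet[str]) -> float:
    """Pl(A): total mass of the focal elements that intersect A."""
    return sum(mass for b, mass in m.items() if b & a)


# Assumed example: mass 0.7 on "Good", the remainder left on the whole frame.
m_good: BPA = {frozenset({"Good"}): 0.7, THETA: 0.3}
```

With these assumed masses, the belief in Good is about 0.7 while its plausibility is about 1.0; the gap between the two is the mass that was left uncommitted on the whole frame.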
  • the DS theoretic framework allows the prediction module 112 to represent a wide variety of data imperfections with ease, as shown in FIG. 2.
  • the BPAs $\{m(\text{Good}) = 0.7,\ m(\Theta) = 0.3\}$ and $m(\text{Excellent, Good, Fair}) = 1.0$ elegantly capture the ratings "Good with a 70% level of confidence" and "definitely not Poor but more evidence is needed to discern further," respectively.
  • An unrated item may be captured via the vacuous BPA $m(\Theta) = 1.0$.
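  • the prediction module 112 renders a probability distribution $\Pr(\cdot)$ such that $Bl(A) \le \Pr(A) \le Pl(A)$ for all $A \subseteq \Theta$, which is said to be compatible with the underlying BPA $m(\cdot)$.
  • An example of such a probability distribution is the pignistic probability distribution $Bp(\cdot)$, given by $Bp(\theta_i) = \sum_{A \ni \theta_i} m(A) / |A|$ (Equation (1)).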
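The pignistic transformation can be sketched in a few lines of Python: each focal element shares its mass equally among its singletons. The dictionary representation of a BPA, the label names, and the example masses are assumptions made for illustration.

```python
from typing import Dict, FrozenSet


def pignistic(m: Dict[FrozenSet[str], float]) -> Dict[str, float]:
    """Spread the mass of every focal element evenly over its members."""
    bp: Dict[str, float] = {}
    for focal, mass in m.items():
        share = mass / len(focal)
        for label in focal:
            bp[label] = bp.get(label, 0.0) + share
    return bp


# Assumed example: 0.7 on "Good" and 0.3 on the whole four-label frame.
theta = frozenset({"Poor", "Fair", "Good", "Excellent"})
bp_good = pignistic({frozenset({"Good"}): 0.7, theta: 0.3})
```

In this assumed example, Good receives its own 0.7 plus a quarter of the uncommitted 0.3, and each remaining label receives a quarter of 0.3, so the result sums to one.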
  • the prediction module 112 may "pool" the evidence of two 'independent' BoEs to form a single BoE via the Dempster's Rule of Combination ("DRC").
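A minimal sketch of Dempster's Rule of Combination in its standard textbook form follows: masses of intersecting focal elements are multiplied and accumulated on the intersection, and the result is renormalized by the non-conflicting mass. The function name, the dictionary representation, and the example BPAs are assumptions for illustration.

```python
from typing import Dict, FrozenSet

BPA = Dict[FrozenSet[str], float]


def dempster_combine(m1: BPA, m2: BPA) -> BPA:
    """Combine two BPAs defined over the same frame with Dempster's rule."""
    combined: BPA = {}
    conflict = 0.0
    for b, mass_b in m1.items():
        for c, mass_c in m2.items():
            inter = b & c
            if inter:
                combined[inter] = combined.get(inter, 0.0) + mass_b * mass_c
            else:
                conflict += mass_b * mass_c  # mass falling on the empty set
    norm = 1.0 - conflict
    if norm <= 1e-12:
        raise ValueError("totally conflicting evidence cannot be combined")
    return {a: mass / norm for a, mass in combined.items()}


# Assumed example over a four-label ratings frame.
theta = frozenset({"Poor", "Fair", "Good", "Excellent"})
m_a = {frozenset({"Good"}): 0.7, theta: 0.3}
m_b = {frozenset({"Good", "Excellent"}): 0.6, theta: 0.4}
fused = dempster_combine(m_a, m_b)
```

Combining any BPA with the vacuous BPA (all mass on the frame) returns the original BPA unchanged, which is why an unrated entry modeled as vacuous contributes no evidence.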
  • $d_i \in [0,1]$ is referred to as a discounting factor.
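The sketch below shows one common way a discounting factor is applied, namely classical Shafer discounting, in which the mass of every proper subset is scaled by the factor and the removed mass is moved to the whole frame. Whether the patent uses exactly this convention (a factor of 1 meaning fully trusted, 0 meaning ignored) is an assumption, as are the function and variable names.

```python
from typing import Dict, FrozenSet

BPA = Dict[FrozenSet[str], float]


def discount(m: BPA, d: float, theta: FrozenSet[str]) -> BPA:
    """Scale committed mass by d and move the remainder to the frame theta."""
    if not 0.0 <= d <= 1.0:
        raise ValueError("discounting factor must lie in [0, 1]")
    out: BPA = {a: d * mass for a, mass in m.items() if a != theta and d * mass > 0.0}
    out[theta] = 1.0 - d + d * m.get(theta, 0.0)
    return out


# Assumed example over a four-label ratings frame.
theta = frozenset({"Poor", "Fair", "Good", "Excellent"})
m_good = {frozenset({"Good"}): 0.7, theta: 0.3}
half_trusted = discount(m_good, 0.5, theta)
ignored = discount(m_good, 0.0, theta)
```

Under this assumed convention, a factor of 0 turns any BPA into the vacuous BPA, and a factor of 1 leaves it unchanged.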
  • the mapping $f_R$ that generates the BoE corresponding to the rating $r_{ik}$ is referred to as the DS modeling function.
  • the invention is directed to selecting an appropriate DS modeling function f R that captures the explicit and implicit user preference information, while accommodating the associated imperfections (e.g., ambiguities, uncertainties, missing values, etc.).
  • Simple DS theoretic models may be used for this purpose and contextual information may be incorporated to aid the prediction task.
  • the ratings assignment process possesses a level of uncertainty. For example, a user may have difficulty selecting or may be unwilling to select a single label as the proper preference rating, e.g., in a movie recommendation scenario where users must rate the movies via a '5-star' system.
  • prediction module 112 may apply a DS modeling function to capture the user uncertainty in a wide variety of scenarios using
  • the trust factor $\alpha_{ik}$ quantifies how likely the user assigned rating reflects the user's true perception. The value $\alpha_{ik} = 0$ represents the case when the user's rating is completely untrustworthy and may be modeled via the vacuous BoE.
  • the dispersion factor quantifies how likely the user assigned rating would span a larger set.
  • the value represents the case when the user assigned rating is allocated a DS theoretic mass (provided that
  • the selection of trust and dispersion factors is domain and dataset dependent.
  • the prediction module 112 may utilize user-wide, item-wide, or system-wide constants for these parameters. These constants also may be used to capture the 'significance' of a particular rating towards the overall ACF prediction process. For example, consider a scenario where most users allocate a similar rating for a particular item (e.g., most users in the MovieLens dataset give a higher rating for the movie Titanic). That rating would play a less significant role in the CF prediction process.
  • the prediction module 112 may use a smaller item-wide constant value for $\alpha_{ik}$.
  • the prediction module 112 combines the trust and dispersion factors to control the DS theoretic mass assigned to the user assigned rating.
  • the DS modeling function in Equation (3) captures a wide variety of user uncertainty. For example, consider ACF algorithms where weighted majority voting strategies produce significant prediction performance improvements compared to correlation based methods. By allowing a ±1 tolerance on user ratings when calculating similarities, these algorithms accommodate a certain level of uncertainty in user rating assignment.
  • the prediction module 112 may apply the vacuous BoE to model lack of evidence that manifests itself as an unrated entry. Although the system may simply proceed with this vacuous BoE model for an unrated entry, the prediction module 112 may incorporate contextual information to effectively reduce the uncertainty that would otherwise be introduced by a vacuous BoE representation. Embodiments of the invention exploit the power of DS theory to represent imperfections, while reducing the uncertainty of missing entries by using contextual information.
  • the prediction module 112 may completely populate the ratings matrix prior to the application of ACF and may combine information from multiple sources, taking into account their reliability and significance. Furthermore, the prediction module 112 provides solutions to difficulties associated with data sparsity and cold-start. According to one embodiment, empty slots may be filled in using implicit, explicit, and other contextual information before application of the ACF.
  • the patient criteria may include Drug_Compliance, Initial_Viral_Load, and Age, among other criteria.
  • the patient criteria may impact the drug response of a drug cocktail and provide contextual domain expertise.
  • the contextual domain expertise defines a "concept" for grouping patients.
  • each concept may provide criterion for grouping the patients.
  • Drug_Compliance may have the following groups: Drug_Compliance.High, Drug_Compliance.Medium, and Drug_Compliance.Low. Users associated with a group may be defined to possess similar drug responses to selected drugs.
  • the groups corresponding to a selected concept may not partition the user space. For example, a user may belong to one or more groups from the same concept. This grouping is directed to the HIV drug treatment context.
  • alternate groupings may be provided for different contexts.
  • the prediction module 112 may apply contextual information to populate an unrated entry $r_{ik} \in R$.
  • the system may combine or fuse an effectiveness rating that each group in which $U_i$ is a member allocates to $I_k$ as a "whole." This fusion operation may be carried out in two stages. At the group level, the system fuses the group preference of each group to which $U_i$ belongs and generates a concept preference. At the concept level, the system fuses the concept preference of all grouping concepts and generates an overall contextual preference.
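The two-stage fusion described above can be sketched as follows, assuming that Dempster's rule is the fusion operator at both levels and that the group preference BPAs are already available. The function names, the dictionary layout keyed by concept name, and the example masses are hypothetical; any discounting of the constituent BoEs before fusion is left out of this sketch.

```python
from functools import reduce
from typing import Dict, FrozenSet, List

BPA = Dict[FrozenSet[str], float]


def dempster_combine(m1: BPA, m2: BPA) -> BPA:
    """Standard Dempster's rule for two BPAs over the same frame."""
    combined: BPA = {}
    conflict = 0.0
    for b, mass_b in m1.items():
        for c, mass_c in m2.items():
            inter = b & c
            if inter:
                combined[inter] = combined.get(inter, 0.0) + mass_b * mass_c
            else:
                conflict += mass_b * mass_c
    norm = 1.0 - conflict
    if norm <= 1e-12:
        raise ValueError("totally conflicting evidence cannot be combined")
    return {a: mass / norm for a, mass in combined.items()}


def contextual_preference(groups_by_concept: Dict[str, List[BPA]],
                          theta: FrozenSet[str]) -> BPA:
    """Fuse group BPAs within each concept, then fuse across concepts."""
    vacuous: BPA = {theta: 1.0}
    # Stage 1: one concept preference per concept (group level).
    concept_prefs = [reduce(dempster_combine, bpas, vacuous)
                     for bpas in groups_by_concept.values()]
    # Stage 2: overall contextual preference (concept level).
    return reduce(dempster_combine, concept_prefs, vacuous)


theta = frozenset({"Poor", "Fair", "Good", "Excellent"})
# Hypothetical group preferences for one user and one item.
prefs = {
    "Drug_Compliance": [{frozenset({"Good"}): 0.6, theta: 0.4}],
    "Age": [{frozenset({"Good", "Excellent"}): 0.5, theta: 0.5}],
}
context_bpa = contextual_preference(prefs, theta)
```

Starting each reduction from the vacuous BPA means that a user with no group memberships simply receives the vacuous BPA, which matches the default treatment of an unrated entry.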
  • the prediction module 112 may apply item-based concepts in addition to the user-based concepts, among other concepts. For example, a physician may group the drug cocktails based on an item-based concept, such as Class_of_Drugs.
  • the prediction module 112 may assign "Q" as the number of groups belonging to the 'generic' concept "Concept," which may be defined as {Concept.Group_1, ..., Concept.Group_Q}.
  • One concept is considered here for notational simplicity.
  • a subscript or superscript may be added to differentiate among concepts.
  • the groups to which a user belongs may be identified via the mapping $f_c$ from the user set $U$ to subsets of {Concept.Group_1, ..., Concept.Group_Q}. This mapping is defined as the grouping function.
  • the prediction module 112 may apply a DS theoretic BPA to define how the group members belonging to the group Concept.Group_j would, as a whole, rate the item $I_k$. If information regarding the group preferences of each item is available, the prediction module 112 may use this information directly in a DS theoretic setting. Otherwise, the prediction module 112 may consider users within a given group that have already rated item $I_k$.
  • the group preference BPA may be defined as where
  • the corresponding BoE is the group preference BoE.
  • the concept preference BoE corresponding to user $U_i$ and item $I_k$ may be obtained by combining or fusing these group preference BoEs.
  • the concept preference BPA may be defined as where
  • the corresponding BoE is the concept preference BoE.
  • the overall contextual preference BoE corresponding to user $U_i$ and item $I_k$ may be obtained by fusing all the concept preference BoEs.
  • the contextual preference BPA may be defined as where
  • the corresponding BoE is the contextual preference BoE.
  • the prediction module 112 may modify the DS ratings matrix R such that each unrated entry is replaced by its corresponding contextual preference BoE.
  • the prediction module 112 may employ this ratings matrix for future calculations.
  • the prediction module 112 may employ a discounting factor to discount each constituent BoE prior to application of the DRC. This may be particularly relevant in an application such as the HAART therapy scenario, e.g., if one concept such as Age is known to have less of an impact on the drug response.
  • the BoE for the rating $r_{ik}$ may be considered an 'intra-item' BoE that captures the user preference toward a single item.
  • the prediction module 112 may use an appropriately constructed 'inter- item' BoE defined over the cross-product space of
  • a focal element may be extracted from the BoE r ik . Its cylindrical extension to the cross-product FoD ⁇ is
  • the corresponding BoE, with core $\mathcal{F}_i$ and BPA $M_i$ defined over the cross-product FoD, is the user-BoE.
  • $Bp_i(\cdot)$ and $Bp_{ik}(\cdot)$ refer to user $U_i$'s pignistic probability distributions corresponding to its user-BoE and ratings BoEs, respectively.
  • a distance metric may be defined on the cross-product FoD ⁇ to calculate the 'distance' between two users.
  • the 'distance' may be used to identify the similarity among users. If a distance measure between two probability mass functions (p.m.f.s) is available, via the application of the pignistic transformation in Equation (1), the prediction module 112 may use this distance measure as a distance measure between two BoEs.
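  • $CD(\cdot,\cdot)$ refers to the Chan-Darwiche ("CD") distance measure between two probability mass functions $P$ and $Q$: $CD(P, Q) = \ln \max_{\theta} \frac{Q(\theta)}{P(\theta)} - \ln \min_{\theta} \frac{Q(\theta)}{P(\theta)}$.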
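A minimal sketch of the Chan-Darwiche distance between two probability mass functions follows, using the usual convention that a ratio of 0/0 counts as 1 and that distributions with different supports are infinitely far apart. The function name, the dictionary representation, and the example distributions are assumptions for illustration.

```python
import math
from typing import Dict


def cd_distance(p: Dict[str, float], q: Dict[str, float]) -> float:
    """Chan-Darwiche distance: ln(max ratio) - ln(min ratio) of q over p."""
    ratios = []
    for label in set(p) | set(q):
        pw, qw = p.get(label, 0.0), q.get(label, 0.0)
        if pw == 0.0 and qw == 0.0:
            ratios.append(1.0)      # convention: 0/0 is treated as 1
        elif pw == 0.0 or qw == 0.0:
            return math.inf         # supports differ
        else:
            ratios.append(qw / pw)
    if not ratios:
        return 0.0
    return math.log(max(ratios)) - math.log(min(ratios))


# Assumed example distributions over two labels.
p = {"Good": 0.5, "Excellent": 0.5}
q = {"Good": 0.25, "Excellent": 0.75}
d_pq = cd_distance(p, q)
```

In this setting the inputs would be pignistic distributions obtained from two users' ratings BoEs for the same item; how the per-item distances are aggregated into a user-to-user distance is not shown here.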
  • the prediction module 112 may use the distances between ratings BPAs (which are defined over $\Theta$) instead of directly computing the distance between the two user-BPAs, which are defined over the cross-product FoD.
  • the associated reduction in computational overhead is from O r by a fraction of
  • a monotonically decreasing function $\psi$ satisfying $\psi(0) = 1$ and $\psi(\infty) = 0$ is provided to convert a distance into a similarity.
  • the prediction module 112 may apply , where is a domain specific constant.
  • the M xM user-user similarity matrix then may be generated as
  • the prediction module 112 may apply the K-nearest neighbor (“KNN”) strategy, the minimum similarity thresholding ("MST”) (where all users having a similarity higher than or equal to a specified threshold are selected), or a combination of both to perform neighborhood selection in ACF.
  • the prediction module 112 may use the K-nearest neighbors with minimum similarity thresholding technique due to its ability to mitigate prediction errors that are generated from dissimilar users.
  • With KNN alone, the prediction module 112 may select K neighbors even though all of them may not be sufficiently similar to the active user. For a given similarity threshold and a given K, the neighborhood set $Nbhd_{ik}$ of user $U_i$ for item $I_k$ is the largest set of users that meets the threshold and contains at most K users. The prediction module 112 may select $Nbhd_{ik}$ by applying MST to U and selecting those users who have rated item $I_k$ and meet the minimum similarity threshold with $U_i$.
  • the prediction module 112 may then apply KNN to select at most K users from this user set having the highest similarity with $U_i$. To determine the neighborhood corresponding to a new user, the condition may apply for Nbhd, ⁇ .
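The neighborhood selection just described (minimum similarity thresholding followed by K-nearest neighbors) can be sketched as below. The nested-dictionary similarity table, the `rated` lookup, the parameter names, and the example values are hypothetical.

```python
from typing import Dict, List, Set


def select_neighborhood(active: str, item: str,
                        similarity: Dict[str, Dict[str, float]],
                        rated: Dict[str, Set[str]],
                        k: int, threshold: float) -> List[str]:
    """Return at most k users who rated `item` and meet the threshold."""
    # MST step: keep users who rated the item and are similar enough.
    candidates = [user for user, sim in similarity[active].items()
                  if user != active
                  and item in rated.get(user, set())
                  and sim >= threshold]
    # KNN step: keep the k most similar of the remaining candidates.
    candidates.sort(key=lambda user: similarity[active][user], reverse=True)
    return candidates[:k]


# Hypothetical similarities of other users to the active user U1.
sims = {"U1": {"U2": 0.9, "U3": 0.4, "U4": 0.8, "U5": 0.95}}
rated_items = {"U2": {"I1"}, "U3": {"I1"}, "U4": {"I1"}, "U5": {"I2"}}
nbhd = select_neighborhood("U1", "I1", sims, rated_items, k=2, threshold=0.5)
```

In this example U5 is the most similar user but is excluded because it has not rated I1, and U3 is excluded by the threshold, so the neighborhood may end up smaller than K.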
  • the ACF predictions are usually generated by evidence gathered from neighboring data entries that rate an item of interest.
  • the prediction module 112 enhances the ACF predictions by applying contextual information to populate the ratings of all the neighboring data entries for the users. Therefore, the invention is able to exploit the evidence from all the neighboring data entries rather than only the neighboring data entries that include rating data for selected items.
  • the prediction module 112 may represent the prediction of the unrated item $I_k$ of the active user $U_i$ as a BoE whose BPA corresponds to the neighborhood prediction.
  • the prediction module 112 captures the similarity between users via the user-user similarity
  • the above equation utilizes the user-user similarity as a discounting factor to 'discount' the ratings BoEs of the neighbor data entries prior to fusion.
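A minimal sketch of this discount-then-fuse step is given below. It assumes classical Shafer discounting with the similarity as the factor and Dempster's rule as the fusion operator; the function names, the input layout, and the example values are hypothetical, and any fusion with the active user's own contextual BoE is omitted.

```python
from typing import Dict, FrozenSet, List, Tuple

BPA = Dict[FrozenSet[str], float]


def discount(m: BPA, d: float, theta: FrozenSet[str]) -> BPA:
    """Scale committed mass by d and move the remainder to the frame."""
    out: BPA = {a: d * v for a, v in m.items() if a != theta and d * v > 0.0}
    out[theta] = 1.0 - d + d * m.get(theta, 0.0)
    return out


def dempster_combine(m1: BPA, m2: BPA) -> BPA:
    """Standard Dempster's rule for two BPAs over the same frame."""
    combined: BPA = {}
    conflict = 0.0
    for b, vb in m1.items():
        for c, vc in m2.items():
            inter = b & c
            if inter:
                combined[inter] = combined.get(inter, 0.0) + vb * vc
            else:
                conflict += vb * vc
    norm = 1.0 - conflict
    if norm <= 1e-12:
        raise ValueError("totally conflicting evidence cannot be combined")
    return {a: v / norm for a, v in combined.items()}


def neighborhood_prediction(neighbors: List[Tuple[BPA, float]],
                            theta: FrozenSet[str]) -> BPA:
    """Discount each neighbor's rating BPA by its similarity, then fuse."""
    prediction: BPA = {theta: 1.0}  # start from the vacuous BPA
    for rating_bpa, sim in neighbors:
        prediction = dempster_combine(prediction, discount(rating_bpa, sim, theta))
    return prediction


theta = frozenset({"Poor", "Fair", "Good", "Excellent"})
# Two hypothetical neighbors with similarity 0.5 each and opposing hard ratings.
pred = neighborhood_prediction(
    [({frozenset({"Good"}): 1.0}, 0.5), ({frozenset({"Excellent"}): 1.0}, 0.5)],
    theta,
)
```

Because less similar neighbors are discounted more heavily, their ratings move toward the vacuous BPA and influence the fused prediction less.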
  • the predictions of the invention offer more flexibility to the decision-maker than what other ACF schemes may provide.
  • the predictions provide information regarding the confidence associated with the ratings prediction and allow the system to make decisions that correspond more closely to the application domain requirements.
  • the prediction module 112 may use the pignistic probability in Equation (1) and may select a singleton as the preference label. If one preference label such as a singleton or a composite is desired, the prediction module 112 may apply the maximum belief with non-overlapping interval strategy (maxBL). The prediction module 112 may select the singleton preference label whose belief is greater than the plausibility of any other singleton. If this preference label does not exist, the prediction module 112 may select the composite preference label that includes a singleton label that has a maximum belief and those singletons that have a higher plausibility. According to an exemplary embodiment, the above concepts are implemented using the MovieLens Dataset.
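The maxBL strategy can be sketched as follows. The reading of "higher plausibility" as "plausibility at least as large as the maximum singleton belief" is an interpretation made for this sketch, and the function name, tie-breaking by sorted label order, and example BPAs are assumptions.

```python
from typing import Dict, FrozenSet

BPA = Dict[FrozenSet[str], float]


def max_bl(m: BPA, theta: FrozenSet[str]) -> FrozenSet[str]:
    """Maximum belief with non-overlapping interval strategy (sketch)."""
    labels = sorted(theta)
    # Belief and plausibility of every singleton.
    bel = {t: sum(v for a, v in m.items() if a <= frozenset({t})) for t in labels}
    pl = {t: sum(v for a, v in m.items() if t in a) for t in labels}
    best = max(labels, key=lambda t: bel[t])
    # Hard decision: the best singleton's belief dominates all other plausibilities.
    if all(bel[best] > pl[t] for t in labels if t != best):
        return frozenset({best})
    # Soft decision: add every singleton whose plausibility reaches that belief.
    return frozenset({best} | {t for t in labels if pl[t] >= bel[best]})


theta = frozenset({"Poor", "Fair", "Good", "Excellent"})
hard = max_bl({frozenset({"Good"}): 0.7, theta: 0.3}, theta)
soft = max_bl({frozenset({"Good"}): 0.4, frozenset({"Excellent"}): 0.3, theta: 0.3}, theta)
```

In the first example the belief interval of Good does not overlap any other singleton's interval, so a hard label is returned; in the second the intervals of Good and Excellent overlap, so the composite of the two is returned.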
  • a domain with soft user ratings is provided to demonstrate the functionality of the invention.
  • a probability rating module 116 is provided to create a DS_Movielens dataset by modifying Movielens through artificially introducing imperfections into the data. The modification process introduces imperfections while preserving existing user-user, item-item and user-item relationships that are needed by ACF algorithms.
  • the probability rating module 116 applies the following viewpoint regarding the Movielens ratings to generate the DS_Movielens dataset.
  • the users considered "soft" ratings. Since the Movielens domain only shows hard ratings, the probability rating module 116 employs a mechanism to transform a hard rating in Movielens to a soft rating in DS_Movielens.
  • the probability rating module 116 applies partial probability models to create different user profiles. Such partial probability models, together with the power set method, are used to convert data rife with diverse types of imperfections into DS theoretic evidence.
  • FIG. 4 illustrates graphical summaries of partial probability models for four user profiles based on zero tolerance, ±1 tolerance, end-weighted ±1 tolerance, and ±2 tolerance.
  • the horizontal axis represents the user rating as it appears in the Movielens dataset.
  • the vertical axis represents the 'true' rating that a movie received.
  • these four user profiles represent a relatively broad spectrum of users.
  • the probability rating module 116 employs the power set approach while generating DS_Movielens from Movielens.
  • the power set approach accounts for user rating imperfections, without resorting to various "assumptions" and "interpolations."
  • the power set approach applied by the probability rating module 116 may identify the gray and black distributions as 0 and 1, respectively.
  • the probability rating module 116 may complete the "Feasible True_Rating" column in FIG. 5.
  • the set of feasible true ratings of any other Movielens rating corresponding to an arbitrary user 'character' profile can be obtained similarly.
  • the probability rating module 116 may create each DS_Movielens dataset in the following manner. A user-item pair may be selected that has been rated as r_ik.
  • one user profile may be selected from FIG. 4(a), 4(b), 4(c), and 4(d), respectively.
  • the corresponding feasible true ratings and DS theoretic BPA may be obtained via the procedure described above, and the hard rating r_ik may be replaced with this BPA.
  • the probability rating module 116 may repeat this process for all rated entries in the Movielens dataset.
  • the probability rating module 116 may transform the DS theoretic user ratings in the DS_Movielens dataset generated in the previous step into probabilities via the pignistic transformation in Equation (1). This is how the PR_Movielens dataset is generated. The dataset that is generated by selecting the most likely ratings in FIG. 5, which provides the rating {2}, produces the Movielens dataset that was initially selected.
  • a broadly used ACF system based on correlation analysis is identified by the acronym CORR.
  • the authors of NA provided three variants: one based on user-to-user similarity (u-NA), one based on item-to-item similarity (i-NA), and one combining these two (c-NA).
  • NA compares favorably with correlation-based methods. While neither CORR nor NA is directly applicable to the DS_Movielens datasets, the probability rating module 116 may apply CORR to the PR_Movielens dataset with non-integer ratings generated by weighting each rating by its corresponding probability. By contrast, integer-valued ratings are used in NA. If the probability rating module 116 generates such a dataset from the DS_Movielens dataset, the information provided by the user ratings may be significantly distorted. As a result, NA is applied to the Movielens dataset.
  • simulations were run with the DS modeling function in Equation (3).
  • the two model parameters, trust factors and dispersion factors, were replaced with system-wide constants: $\{\alpha_{ik}, \sigma_{ik}\} \equiv \{\alpha, \sigma\}, \forall i, k$.
  • the probability rating module 116 seeks to capture how movies from a given genre would, as a whole, be rated by user U_i. Since Movielens does not provide users the opportunity to express their genre preferences explicitly, the probability rating module 116 estimates the users' preferences using the group preference BPA applied to movies that have already been rated by user U_i. According to one embodiment, no discounting is applied.
  • the probability rating module 116 does not apply discounting to the definitions for a concept preference BPA and a contextual preference BPA described above. If additional concepts are utilized, not all concepts contribute equally to user preferences. For example, the contribution of concept Director may be different than the contribution of concept Cast. These differences may be accommodated through discounting.
  • the ratings in the DS_Movielens dataset were used as the user preference BoE without additional modeling and the genre information was used as in the above case. The following methodology was used for consistency with prior methods in conducting performance experiments which apply the principles of the invention. According to one embodiment, the system randomly selects 10% of users and withholds five randomly selected movie ratings for each user.
  • the probability rating module 116 denotes the 'true' rating that user U_i gives to item I_k by a hard rating in the case of the Movielens dataset and by a BPA in the case of the DS_Movielens dataset.
  • the ratings predicted by the CORR and NA techniques and the ratings predicted by the invention are denoted by separate symbols. Performance criteria used for evaluations of ACF algorithms in environments with hard ratings include the mean absolute error (MAE). Other metrics include Precision and Recall.
  • the MAE corresponding to the rating of the testing set may be calculated as follows:
  • the probability rating module 116 may obtain the overall MAE measure for the ACF algorithm.
  • the MAE expects hard ratings to generate the 'predicted' rating.
  • the DS theoretic predictions are converted to hard predictions through, for example, the pignistic transformation.
  • the probability rating module 116 may compare soft predictions, such as those provided by CoFiDS, with hard ratings using the following DS theoretic measures:
  • the degree to which one BPA approximates another BPA is determined using the following definition:
  • both DS-PE1 and DS-PE2 may take values from [0,1].
  • the probability rating module 116 may have used the KL-divergence instead of the Euclidean norm. In this case, the error would not be bounded by the closed interval [0,1].
  • KL-divergence may require the pignistic distributions corresponding to the true and predicted BPAs to have identical supports.
  • FIG. 6 illustrates how CoFiDS' MAE varies with different values of the dispersion factor σ (for several choices of the neighborhood size K and the similarity threshold). The results indicate that the performance is minimally sensitive as long as σ is somewhere in the interval [0.4, 0.7], with the best overall MAE being obtained when σ is approximately 2/3.
  • FIG. 7 illustrates how CoFiDS' MAE changes with the neighborhood size, K.
  • FIG. 8 illustrates how MAE varies with the similarity threshold. These graphs show the impact of some other parameters, with other variables remaining fixed. As shown, the MAE first drops with increasing K, then stabilizes and remains generally constant for K > 70.
  • the value of these parameters may need to be established using a cross-validation technique.
  • FIGS. 9 and 10 illustrate how the value of DS-MAE for CoFiDS varies with changing neighborhood size, K, and similarity threshold, respectively. In FIGS. 9 and 10, all other parameters are kept constant. The nature of the DS theoretic predictions renders subjective the direct comparison of CoFiDS' performance with that of CORR and NA.
  • FIG. 11 illustrates a few exemplary CoFiDS predictions performed by the probability rating module 116 using Movielens and single-label predictions obtained by the pignistic transformation and the maxBL strategy.
  • the decision that corresponds to the user-item pair (72, 550) is not controversial.
  • the decision that corresponds to the user-item pair (2, 251) shows that there may be a challenge capturing the richer information content of the DS theoretic BoE with a single-label decision.
  • FIG. 11 illustrates that although the pignistic transformation and the maxBL strategy both favor a "4" rating, the CoFiDS prediction does not appear clearly to discriminate between the "4" and "5" (true) ratings.
  • the maxBL strategy captures the indecision that is apparent in the CoFiDS prediction, whereas the pignistic transformation does not.
  • a first strategy includes converting CoFiDS' predictions to hard ones.
  • a second strategy includes interpreting CORR's and NA's predictions as soft predictions. Each of these strategies is addressed separately.
  • the pignistic transformation may be used to generate hard decisions from the soft CoFiDS predictions. This approach reduces the effectiveness of CoFiDS, whose strength is the ability to generate soft decisions. This strategy is available for cases where hard decisions are satisfactory.
  • MAE and other criteria from the field of information retrieval, such as Precision, Recall, and F1, are used.
  • a high value is desired for Precision in domains where the system's prediction of a given value True_Rating must be reliable, even if the system misses many cases where the user's true rating was True_Rating. For example, if the system predicts "2," this value may be relied upon, although the system may have missed many cases where the true value was "2".
  • Recall is desired in domains where the system needs to correctly recall as many occurrences of ratings True_Rating as possible.
  • F1 combines the two criteria and is preferred in domains where Precision and Recall are deemed equally important.
  • FIG. 12 summarizes the results of these experiments.
  • Bold values indicate the best performance in each category. As the differences are substantial, the statistical significance is not evaluated.
  • each of the five possible ratings (“1" through “5") is provided a column.
  • the experiments show that NA-based predictions are seldom the best, which appears to indicate that the technique is not well suited for soft ratings of this particular kind.
  • the situation is less straightforward when CORR and CoFiDS are compared.
  • a superficial observation demonstrates that, on average, CoFiDS' mean error is lower. This apparent performance edge may be attributed to this system's higher ability to predict the "middle" ratings of "3" and "4".
  • CORR more accurately predicted “1.”
  • the margin between the two systems is low.
  • the F1 criterion provides similar impressions, with F1 components potentially offering deep insights.
  • CoFiDS may be preferable in domains where the user emphasizes Precision, whereas CORR may be a better choice when Recall is of importance.
  • the prediction module 112 may provide the CoFiDS results even though a conversion to a hard decision may not exploit the full strength and functionality of the underlying DS-theoretic basis.
  • coverage performance, which calculates the percentage of items for which the ACF algorithm can make correct predictions, is lower for CORR if the ACF algorithm parameters have been tuned for lower MAE.
  • NA and CoFiDS provide nearly complete coverage. So, for an improved comparison with CORR, a configuration is used that minimizes MAE, while keeping the 90% level coverage.
  • when CORR and NA decisions are interpreted as soft predictions, integer-valued predictions are not needed for CORR. This simplifies the comparison of CORR with CoFiDS along the soft predictions.
  • the NA decisions cannot be readily “softened” because they are integer-valued decisions. For this reason, a comparison of CORR with CoFiDS is provided below.
  • the prediction module 112 may apply the following DS-theoretic BPA to interpret a CORR prediction as soft:
  • the CORR prediction 3.3 with is interpreted as the Bayesian statement, "The rating is 3 with
  • Equation (7) corresponds well with this typical interpretation of a CORR prediction.
  • FIG. 13 summarizes the results for the configuration that yields the best overall DS-MAE being used for each ACF algorithm.
  • the same CoFiDS parameters are used as before. Again, bold values in FIG. 13 indicate the best performance in each category.
  • while the average mean error is lower in the case of CoFiDS, the correlation-based approach provides improved results for predicting the minimum and maximum values ("1" and "5," respectively). By contrast, CoFiDS provides enhanced results for predicting the "middle" values of "3" and "4." This same conclusion is reached in the case of the F1 criterion, whose components display different behavior for each of the two systems. CoFiDS is preferred in domains where high precision is desired, while CORR is preferred in applications where high recall is desired.
  • the true ratings may be soft to permit performance comparisons using the criteria DS-PE1 and DS-PE2 from above. While the CoFiDS predictions are provided in the soft form, CORR predictions are converted to soft predictions using Equation (7).
  • FIG. 14 illustrates the comparison for several different values of p, the probability with which the zero tolerance user was selected. The other three user profiles were selected with equal probability. Since CoFiDS consistently outperforms the CORR system by a large margin, the evaluation of statistical significance is not performed.
  • the performance is poor as long as the neighborhood is small. The performance peaks and then starts slowly degrading. In a realistic domain, the graceful performance degradation after reaching the optimum value supports the notion that the optimum value of K can be obtained by cross-validation techniques.
  • the invention provides methods of accommodating data imperfections for domains where the user ratings are subjective or are otherwise unreliable.
  • the system applies coarse settings to system-wide parameters.
  • the CoFiDS performance compares favorably with performance derived using conventional ACF techniques.
  • the invention uses CoFiDS to generate soft decisions where domain experts offer subjective opinions.
  • the invention propagates the uncertainties from the user-preference ratings to the output predictions.
  • the invention may be realized in hardware, software, or a combination of hardware and software. Any kind of computing system or other apparatus adapted for carrying out the methods described herein is suited to perform the functions described herein.
  • a typical combination of hardware and software could be a specialized or general purpose computer system having one or more processing elements.
  • a computer program may be provided and stored on a storage medium that controls the computer system when loaded and executed, such that it carries out the methods described herein.
  • the invention can also be embedded in a computer program product, which comprises all the features enabling the implementation of the methods described herein, and which, when loaded in a computing system is able to carry out these methods.
  • Storage medium refers to any volatile or non-volatile storage device.
  • Computer program or application in the present context means any expression, in any language, code or notation, of a set of instructions intended to cause a system having an information processing capability to perform a particular function either directly or after either or both of the following: a) conversion to another language, code or notation; b) reproduction in a different material form.
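
The single-label decision strategies described in the list above, the pignistic transformation and the maxBL strategy, can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the five-label rating frame, the example mass values, the treatment of a plausibility that exactly equals the maximum belief as overlapping, and the assumption that no mass is assigned to the empty set are all hypothetical choices made for the sketch.

```python
from typing import Dict, FrozenSet

BPA = Dict[FrozenSet[int], float]

# Assumed frame of discernment: the five possible ratings "1" through "5".
FRAME: FrozenSet[int] = frozenset({1, 2, 3, 4, 5})


def belief(m: BPA, a: FrozenSet[int]) -> float:
    """Bel(A): total mass of focal sets fully contained in A."""
    return sum(mass for focal, mass in m.items() if focal <= a)


def plausibility(m: BPA, a: FrozenSet[int]) -> float:
    """Pl(A): total mass of focal sets that intersect A."""
    return sum(mass for focal, mass in m.items() if focal & a)


def pignistic(m: BPA) -> Dict[int, float]:
    """Spread each focal set's mass equally over its members (assumes m(empty) = 0)."""
    betp: Dict[int, float] = {}
    for focal, mass in m.items():
        for label in focal:
            betp[label] = betp.get(label, 0.0) + mass / len(focal)
    return betp


def max_bl(m: BPA, frame: FrozenSet[int]) -> FrozenSet[int]:
    """Return a singleton if its belief exceeds every other singleton's
    plausibility; otherwise return the composite of the maximum-belief
    singleton and the singletons whose plausibility reaches that belief
    (ties counted as overlapping, an assumption of this sketch)."""
    labels = sorted(frame)
    bel = {x: belief(m, frozenset({x})) for x in labels}
    pl = {x: plausibility(m, frozenset({x})) for x in labels}
    best = max(labels, key=lambda x: bel[x])
    rivals = {x for x in labels if x != best and pl[x] >= bel[best]}
    return frozenset({best}) | rivals


# Hypothetical soft prediction that hesitates between "4" and "5".
undecided: BPA = {
    frozenset({4}): 0.4,
    frozenset({5}): 0.2,
    frozenset({4, 5}): 0.3,
    FRAME: 0.1,
}
# Hypothetical soft prediction that clearly favors "3".
decided: BPA = {frozenset({3}): 0.8, FRAME: 0.2}

betp = pignistic(undecided)
hard_label = max(betp, key=betp.get)      # single hard rating
soft_label = max_bl(undecided, FRAME)     # may be a composite label
```

In this sketch the pignistic transformation always commits to one rating, while max_bl can return a composite label when the evidence does not separate two ratings.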

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • General Physics & Mathematics (AREA)
  • Operations Research (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Quality & Reliability (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • General Engineering & Computer Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A system and method are provided for performing automated collaborative filtering for data that is gathered from a plurality of sources. The data includes contextual information and is stored in a database that includes filled data slots and empty data slots. A prediction module communicates with a client terminal device, receives the data, and generates prediction data based on the contextual information, wherein the prediction data is provided to populate the empty data slots. The invention models a wide class of data imperfections, propagates partial knowledge throughout a decision-making process, incorporates background knowledge into the automated collaborative filtering and provides reliability information for system predictions.

Description

SYSTEM AND METHOD OF USING AUTOMATED COLLABORATIVE FILTERING FOR DECISION-MAKING IN THE PRESENCE OF DATA IMPERFECTIONS
FIELD OF THE INVENTION
The invention relates generally to data pattern analysis and in particular to a method and system for automatically accommodating data imperfections and making decisions using imperfect data, without implementing simplifying assumptions.
BACKGROUND OF THE INVENTION
Automated Collaborative Filtering ("ACF") refers to a group of algorithms used in recommender systems implemented for various applications. An example of a typical application is an e-commerce system where customers rate items and receive automated recommendations based on detected similarity patterns. One of the problems encountered by conventional ACF algorithms is data imperfection, e.g., limited statistics, subjective judgment, etc. Existing techniques are rarely capable of dealing with imperfections in user-supplied ratings.
When such imperfections, e.g., ambiguities, cannot be avoided, designers typically resort to simplifying assumptions that impair the system's performance and utility. Conventional algorithms either completely ignore imperfect user ratings or utilize some imputation mechanism to remove the imperfections, e.g., fill in the missing entries. Neither strategy produces acceptable results, especially when a large percentage of the data is imperfect and/or little information is available about the reason for and the mechanism driving the imperfections. This is one reason that existing ACF algorithms have not been widely utilized in applications where data imperfections are commonplace and decisions being made are of critical importance, such as medical/healthcare data, homeland security and defense applications, etc. Simplifying assumptions made in such applications may harm the reliability of the decisions being made. Existing technologies cannot handle data imperfections without making assumptions that are not realistic and/or cannot be justified. Hence, the decisions and predictions being made by these methods cannot be relied upon, especially in critical and sensitive applications. Therefore, what is needed is a method and system for automatically making decisions in the presence of imperfect data without the need to make simplifying assumptions.
SUMMARY OF THE INVENTION
Various aspects of the invention overcome at least some of these and other drawbacks of existing systems. The invention provides systems and methods of modeling various imperfections, propagating partial knowledge throughout the decision-making process, providing a framework for incorporating background knowledge into automated collaborative filtering and providing predictions having associated reliability information.
According to one embodiment of the invention, an automated collaborative filtering device is provided that communicates with a client terminal device and receives data from a plurality of sources. The automated collaborative filtering device includes a storage module that stores data gathered from the plurality of sources, wherein the data includes contextual information and wherein the storage module has a database that includes filled data slots and empty data slots. A prediction module is provided that communicates with the storage module and the client terminal device, the prediction module being programmed to generate prediction data based on the contextual information, wherein the prediction data is provided to populate the empty data slots.
According to another embodiment, the invention provides a method of performing automated collaborative filtering that includes providing a database that includes filled data slots and empty data slots and storing data gathered from a plurality of sources into the database. Contextual information is obtained from the stored data and prediction data is generated based on the contextual information so that the empty data slots may be populated with the prediction data.
According to yet another embodiment, the invention provides an automated collaborative filtering device that communicates with a client terminal device and receives data from a plurality of sources. The automated collaborative filtering device includes a storage module that stores data gathered from the plurality of sources, wherein the data includes contextual information and wherein the storage module has a database that includes filled data slots and empty data slots. A probability rating module is provided that communicates with the storage module and the client terminal device, wherein the probability rating module is programmed to extract predefined values from the data and transform the predefined values into a probability of obtaining the predefined values. A prediction module is also provided that communicates with the probability rating module and is programmed to generate prediction data based on the contextual information and the probability of obtaining the predefined values, wherein the prediction data is provided to populate the empty data slots.
BRIEF DESCRIPTION OF THE DRAWINGS
A more complete understanding of the present invention, and the attendant advantages and features thereof, will be more readily understood by reference to the following detailed description when considered in conjunction with the accompanying drawings wherein:
FIG. 1 illustrates a system diagram according to one embodiment of the invention;
FIG. 2 illustrates a table containing user ratings of a provider's items;
FIG. 3 illustrates a table containing Dempster-Shafer ("DS") theoretic notations;
FIG. 4 illustrates a set of partial probability models of user character profiles;
FIG. 5 illustrates a table containing a basic probability assignment ("BPA") corresponding to a Movielens_Rating = 2 of a ±1 tolerance user;
FIG. 6 illustrates a graph of the variation of the mean absolute error over time for different values of a dispersion factor when applying the principles of the invention to an exemplary dataset without imperfections;
FIG. 7 illustrates a graph of the variation of the mean absolute error in relation to the neighborhood size, when applying the principles of the invention to a dataset without imperfections;
FIG. 8 illustrates a graph of the variation of the mean absolute error in relation to the similarity threshold when applying the principles of the invention to a dataset without imperfections;
FIG. 9 illustrates a graph of the variation of the mean absolute error in relation to the neighborhood size when applying the principles of the invention to a dataset that includes imperfections;
FIG. 10 illustrates a graph of the variation of the mean absolute error in relation to the similarity threshold when applying the principles of the invention to a dataset that includes imperfections;
FIG. 11 illustrates a table containing exemplary predictions in accordance with the principles of the invention;
FIG. 12 illustrates a table containing performance comparison data between the invention and prior methods for making hard decisions on a dataset without imperfections;
FIG. 13 illustrates a table containing performance comparison data between the invention and prior methods for making soft predictions on a dataset without imperfections;
FIG. 14 illustrates a table containing performance comparison data between the invention and prior methods for a dataset that includes imperfections; and
FIG. 15 illustrates a graph of the variation of the new DS-theoretic measure DS-PE1 in relation to the neighborhood size when applying the principle of the invention to a dataset that includes imperfections.
DETAILED DESCRIPTION OF THE INVENTION
Automated collaborative filtering ("ACF") is a technique for making recommendations when presented with imperfect data, such as data having ambiguities and uncertainties in ratings and missing or incomplete data, among other imperfect data. Current ACF systems are not capable of adequately handling data imperfections. However, it is becoming increasingly important to have effective strategies in place to model and propagate these data imperfections throughout the decision-making process so that prediction tasks can be completed with high reliability.
While specific embodiments of the invention are discussed herein and are illustrated in the drawings appended hereto, the invention encompasses a broader spectrum than the specific subject matter described and illustrated. As would be appreciated by those skilled in the art, the embodiments described herein provide but a few examples of the broad scope of the invention. There is no intention to limit the scope of the invention only to the embodiments described.
Computer networks are used to implement the ACF. FIG. 1 illustrates an example of the system architecture 100 according to one embodiment of the invention. Client terminal devices 105a-105n (hereinafter identified collectively as 105) may be coupled to one or more automated collaborative filtering devices 110 via a wired network, a wireless network, a combination of the foregoing and/or other networks, such as a network 107. The client terminal devices 105 may include any number of different types of client terminal devices, such as personal computers, laptops, smart terminals, personal digital assistants (PDAs), cell phones, Web TV systems, video game consoles, and devices that combine the functionality of one or more of the foregoing or other client terminal devices. The client terminal devices 105 may include processors, RAM, USB interfaces, telephone interfaces, satellite interface, microphones, speakers, a stylus, a computer mouse, a wide area network interface, a local area network interface, hard disks, wireless communication interfaces, DVD/CD reader/burners, a keyboard, a flat touch-screen display, and a display, among other components. The client terminal devices 105 may communicate with the automated collaborative filtering device 110, other client terminal devices 105 and/or other systems.
Users may access the client terminal devices 105 to communicate with selected sources, including other client terminal devices 105 and the automated collaborative filtering device 110. Data requests that originate from the client terminal devices 105 may be broadcast to selected sources substantially in real-time if the client terminal devices 105 are coupled to the network 107. The automated collaborative filtering device 110 may include any number of different types of automated collaborative filtering devices, such as servers, personal computers, laptops, smart terminals, video game consoles, and devices that combine the functionality of one or more of the foregoing or other automated collaborative filtering devices.
The automated collaborative filtering device 110 may be of modular construction to facilitate adding, deleting, updating and/or amending modules therein and/or features within modules. Modules may include a prediction module 112, a storage module 114, a probability rating module 116 or other modules. It should be readily understood that a greater or lesser number of modules might be used. One skilled in the art will readily appreciate that the invention may be implemented using individual modules, a single module that incorporates the features of two or more separately described modules, individual software programs, and/or a single software program.
Automated Collaborative Filtering (ACF) has spawned a whole family of techniques and algorithms employed in a myriad of e-commerce applications. The invention provides a prediction module 112 having a unified framework with ACF that is based on Dempster-Shafer ("DS") belief theoretic notions. The prediction module 112 provides Collaborative Filtering based on Dempster-Shafer belief theoretic framework ("CoFiDS") and is capable of conveniently modeling a wider class of data imperfections, propagating partial knowledge throughout the entire decision-making process, providing a framework for incorporating background knowledge into the ACF task, and providing a 'soft' decision that possesses information regarding the reliability of the prediction being made. These features are not provided in existing technologies. Indeed, the absence of effective methods for handling data imperfections has been a hurdle that prevents ACF methods from being utilized in more sensitive and critical problem domains, such as medical/healthcare data and battlefield situation awareness, among other domains.
The automated collaborative filtering device 110 may include a storage module 114 for storing data. FIG. 2 illustrates an exemplary database structure that includes customer rating data directed to a subset of a company's products. The database structure may be stored in the storage module 114. Each row represents a customer (U1-U5), and each column represents a product (I1-I7): the field {i, j} contains user Ui's rating of item Ij. An empty field indicates that the user has not rated the corresponding item. The invention is directed to improving prediction accuracy on the behaviors of alternative approaches by combining ACF with other recommendation systems and on explaining the predictions of the ACF algorithms. A function of recommender applications is to predict how a user might rate other items, such as the item indicated by the question mark in the last column of user U5 in FIG. 2, based on a given ratings matrix and based on known ratings provided by a particular user. Known recommender applications of ACF include, for example, AMAZON.COM® book recommender and BLOCKBUSTER® and NETFLIX® video recommenders, among other recommenders.
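
As a minimal sketch of the ratings matrix described for FIG. 2, the following Python snippet stores ratings in a nested list, uses None for an empty data slot, and lists the slots that a prediction module would need to populate. The rating values, the 1-to-5 scale, and the helper name empty_slots are illustrative assumptions and are not taken from FIG. 2.

```python
from typing import List, Optional, Tuple

# Hypothetical ratings matrix: rows are users U1-U5, columns are items I1-I7.
# None marks an empty data slot (the user has not rated that item).
ratings: List[List[Optional[int]]] = [
    [5, None, 3, None, 1, None, 4],        # U1
    [None, 2, None, 4, None, 5, None],     # U2
    [4, None, None, 2, 3, None, 1],        # U3
    [None, 5, 2, None, None, 3, None],     # U4
    [3, None, 4, 1, None, 2, None],        # U5 (last slot is the one to predict)
]


def empty_slots(matrix: List[List[Optional[int]]]) -> List[Tuple[int, int]]:
    """Return 1-based (user, item) index pairs for every unrated entry."""
    return [
        (i + 1, j + 1)
        for i, row in enumerate(matrix)
        for j, value in enumerate(row)
        if value is None
    ]


to_predict = empty_slots(ratings)
```

Here the pair (5, 7) corresponds to the question-mark entry in the last column of user U5.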
The prediction module 112 is directed to providing a mechanism for accommodating imperfections in user ratings, e.g., ambiguities, uncertainties, etc. An exemplary database that has served as a benchmark domain in many studies is Movielens where about 1,000 users have rated more than 1,600 movies. Since ratings perforce are subjective, the invention considers that different users have varying criteria for providing a film with an "excellent" rating. Additionally, the prediction module 112 takes into account that a user may choose different values depending on other intangible factors, such as a momentary mood or perhaps in comparison to other films that he or she has seen, and rated, at about the same time, among other intangible factors. The prediction module 112 provides a method for handling ambiguous and uncertain ratings in order to create accurate modeling. Another exemplary embodiment of the invention is directed to HIV treatment by highly active antiretroviral therapy ("HAART"). Patients are administered drug combinations, referred to as drug cocktails. The concrete choice is based on the recommendations of the Department of Health and Human Services and on the results of large studies that may not reflect the idiosyncrasies of the given case. Physicians may adjust treatments based on their experience with the successes/failures under given circumstances. From the perspective of the ACF scenario, the physician may generate a "ratings matrix" whose rows and columns correspond to patients and drug cocktails, respectively. The entries may quantify an effectiveness of the drug cocktail when administered to the patient.
The physician may rate the drug response by the values from Θpref = {Excellent, Good, Fair, Poor}. However, the ratings may not be "hard" (or "crisp", or "perfect"), such as when the ratings have been obtained by a team's collective decision. Even if the ACF algorithm can accommodate probabilistic user rating imperfections, the probabilistic model {Excellent = 0.1, Good = 0.7, Fair = 0.1, Poor = 0.1} may have difficulty capturing the rating "Good with a 70% level of confidence" that is allocated to a drug cocktail. This rating implies that a 30% level of confidence on the complement of Good is also unsatisfactory. Classical models are not good at reflecting conclusions such as, "the effectiveness of the drug cocktail is definitely not Poor, but more evidence is needed to discern further". Lack of mechanisms to accommodate such subjective issues often requires one to make various unwarranted "assumptions" and "interpolations".
The invention is directed to improving ACF methodology by addressing questions such as: How should user preferences be modeled? How can this model be used for extracting useful knowledge and making reliable predictions that are robust against data imperfections? How can the prediction accuracy be improved? How can data sparsity and cold-start, two common problems in traditional ACF algorithms, be addressed in this setup? Data sparsity refers to the difficulties generated by the sparse nature of the ratings matrix. Cold-start refers to the difficulties associated with making predictions for newly introduced users and/or items.
The invention exploits background knowledge that may be available in real-world applications and provides techniques for overcoming data imperfections. The prediction module 112 uses the DS theoretic framework and offers a mechanism to represent a variety of data imperfections, e.g., probabilistic uncertainties, qualitative evidence, evidence ambiguities, missing information, among other data imperfections. The prediction module 112 also may account for and represent ignorance and incomplete knowledge. The prediction module 112 may use DS-based techniques in applications where the integrity of the decision making process and its robustness against modeling errors caused by lack of precise information are critical, such as in battlefield target tracking and situation awareness, among other critical situations.
ACF systems operate using a subset of data. The data subset represents items having similar qualities to an item whose ratings are to be predicted. By contrast, non-ACF systems are deficient at least because they operate using entire item populations, wherein the entire item populations may be represented by a sparse rating matrix. The sparse rating matrix may render difficult the task of identifying similar items.
Referring back to the above described HAART scenario, the number of drug cocktails prescribed to each patient may be small compared to the number of available drug cocktails. Alternatively, several drug cocktails may not have been rated at all. Furthermore, the number of items that are co-rated by more than one user may be small or the lack of statistical representativeness may render predictions unreliable. According to one embodiment, the prediction module 112 may apply background knowledge about the users and/or items to mitigate data imperfections and may fuse the background knowledge with that yielded by ACF.
The prediction module 112 may apply the DS theoretic basis of CoFiDS to address data sparsity. While it is known to replace each unrated entry of the ratings matrix by a vacuous mass structure, the prediction module 112 applies CoFiDS to narrow the uncertainty inherent in the vacuous mass structure by taking advantage of the background knowledge. For example, the prediction module 112 may fill in the values of unrated items prior to ACF, which increases the computed user-to-user similarity.
The invention further may resolve the cold-start problem of using the system when few ratings are available. In the HAART scenario, the prediction module 112 may apply the DS theoretic model and may use background knowledge to populate drug response entries corresponding to a new patient. The same concept may apply to a newly introduced drug cocktail.
The invention may apply the following underlying mathematical principles. First, in relation to ACF, U = {U1, U2, ..., UM} and I = {I1, I2, ..., IN} denote exhaustive sets of M users and N items, respectively. Assume a user allocates a 'preference' or 'rating' to an item via a finite, rank-ordered set of L user preference labels Θpref = {θ1, θ2, ..., θL}, where θj < θi whenever j < i. If a user allocates a rating to an item, then the item is identified as rated. Otherwise, the item is identified as not rated.
A user's rating can be identified as a mapping from U × I to Θpref that takes the pair (Ui, Ik) to rik, where rik ∈ Θpref denotes the rating that user Ui allocates to item Ik. If Ik has not been rated by Ui, the system uses rik = 0. A ratings matrix may be created as an M × N matrix R = {rik}, where rik ∈ Θpref for a rated item and rik = 0 for an unrated item. For i = 1, ..., M, the notations
Figure imgf000012_0007
Figure imgf000012_0001
may also be used to denote the rated and unrated items of user Ui, respectively. A user whose ratings are currently being predicted may be referred to as an active user.
An item that is rated by multiple users may be referred to as co-rated by those users. In ACF, co-rated items are used to determine whether two users are 'similar' to each other. A similarity designation between a pair of users can be identified as a mapping , where denotes the similarity
Figure imgf000012_0002
Figure imgf000012_0006
between users Ui and Uj. The higher the value of
Figure imgf000012_0008
the closer the similarity between Ui and Uj. An M × M symmetric matrix created as may be referred to as the
Figure imgf000012_0003
(user-user) similarity matrix.
The prediction module 112 may apply a DS theory to define as
Figure imgf000012_0004
a finite set of mutually exclusive and exhaustive propositions about some problem domain. The propositions signify the corresponding 'scope of expertise' and are referred to as its frame of discernment ("FoD"). A proposition θi , referred to as a singleton, represents the lowest level of discernible information in this FoD.
Elements of 2^Θ, the power set of Θ, form all propositions of interest. A proposition that is not a singleton is referred to as a composite, . The term
Figure imgf000012_0005
"proposition" denotes both singletons and composites. The cardinality of set A is denoted by |A|. The set Ā denotes all singletons in Θ that are not included in A ⊆ Θ, i.e., Ā = {θi ∈ Θ : θi ∉ A} = Θ \ A. The mapping m : 2^Θ ↦ [0, 1] is a basic probability assignment ("BPA") or mass structure for the FoD
Figure imgf000013_0005
The prediction module 112 uses DS theory to model the notion of ignorance by allowing the mass of a proposition to move freely into its individual singletons. For example, complete lack of evidence can be conveniently captured via the vacuous BPA: m(A) = 0, ∀A ⊂ Θ and m(Θ) = 1.0. A proposition that possesses a nonzero mass is referred to as a focal element. The set of focal elements is the core and is denoted by F. The triple {Θ, F, m} is referred to as the body of evidence ("BoE"). The number of focal elements in this BoE is |F|. Given a BoE {Θ, F, m} and A ⊆ Θ, the belief of A, Bl : 2^Θ ↦ [0, 1], is
Bl(A) = Σ_{B ⊆ A} m(B).
The plausibility of A, Pl : 2^Θ ↦ [0, 1], is
Pl(A) = 1 − Bl(Ā).
According to one embodiment, m(A) measures the support assigned to proposition A only and the belief assigned to A takes into account the supports for all proper subsets of A. Bl(A) represents the total support that can move into A without any ambiguity. Pl(A) represents the extent to which one finds A plausible. When the core contains only singletons, the BPA, belief and plausibility all reduce to probability.
The above-referenced DS theoretic notions allow the prediction module 112 to represent a wide variety of data imperfections with ease, as shown in FIG. 2. For example, in the HAART therapy scenario, the BPAs and m(Excellent, Good, Fair) = 1.0 elegantly
Figure imgf000013_0004
capture the ratings "Good with a 70% level of confidence" and "definitely not Poor but more evidence is needed to discern further," respectively. An unrated item may be captured via the vacuous BPA
Figure imgf000013_0003
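The belief and plausibility notions above can be illustrated with a minimal Python sketch. It assumes a BPA is stored as a dictionary from frozensets of labels to masses; that layout, the function names, and the encoding of "Good with a 70% level of confidence" as mass 0.7 on {Good} with the remaining 0.3 left on the whole frame are illustrative assumptions, not taken from the specification.

```python
FRAME = frozenset({"Excellent", "Good", "Fair", "Poor"})


def belief(bpa, proposition):
    """Bl(A): sum of the masses of all focal elements contained in A."""
    return sum(mass for focal, mass in bpa.items() if focal <= proposition)


def plausibility(bpa, proposition):
    """Pl(A): sum of the masses of all focal elements that intersect A.

    This equals 1 - Bl(complement of A) for a valid BPA.
    """
    return sum(mass for focal, mass in bpa.items() if focal & proposition)


# Assumed encoding of "Good with a 70% level of confidence":
# the uncommitted 0.3 stays on the whole frame, not on "not Good".
good_70 = {frozenset({"Good"}): 0.7, FRAME: 0.3}

# "Definitely not Poor, but more evidence is needed to discern further".
not_poor = {frozenset({"Excellent", "Good", "Fair"}): 1.0}

# Unrated item: the vacuous BPA places all mass on the frame.
vacuous = {FRAME: 1.0}

good = frozenset({"Good"})
poor = frozenset({"Poor"})
print(belief(good_70, good), plausibility(good_70, good))
print(belief(not_poor, poor), plausibility(not_poor, poor))
print(belief(vacuous, good), plausibility(vacuous, good))
```

Under these assumptions the interval [Bl(A), Pl(A)] is wide for the vacuous BPA and collapses to a point only when all mass sits on singletons, which is the sense in which the representation separates ignorance from probabilistic uncertainty.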
The prediction module 112 renders a probability distribution Pr(•) that is compatible with the underlying BPA m(•), i.e., such that Bl(A) ≤ Pr(A) ≤ Pl(A), ∀A ⊆ Θ. An example of such a probability distribution is the pignistic probability distribution Bp(•):
Bp(θi) = Σ_{A ⊆ Θ : θi ∈ A} m(A) / |A|.    (1)
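As a hedged illustration of the pignistic transformation, the following Python sketch spreads the mass of every focal element evenly over its singletons. The dictionary-of-frozensets layout and the example BPA are assumptions made for illustration only.

```python
def pignistic(bpa):
    """Return the pignistic probability of every singleton that appears
    in some focal element: each focal element A contributes m(A) / |A|
    to each of its members."""
    bp = {}
    for focal, mass in bpa.items():
        share = mass / len(focal)
        for singleton in focal:
            bp[singleton] = bp.get(singleton, 0.0) + share
    return bp


FRAME = frozenset({"Excellent", "Good", "Fair", "Poor"})

# Assumed example: mass 0.7 on {Good}, remaining 0.3 on the whole frame.
good_70 = {frozenset({"Good"}): 0.7, FRAME: 0.3}

bp = pignistic(good_70)
for label in sorted(bp):
    print(label, round(bp[label], 3))
```

In this example the 0.3 left on the frame is shared equally among the four labels, so the resulting distribution sums to one and lies between belief and plausibility for every label.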
The prediction module 112 may "pool" the evidence of two 'independent' BoEs to form a single BoE via the Dempster's Rule of Combination ("DRC").
Suppose the two BoEs {Θ, F1, m1} and {Θ, F2, m2} span the same FoD Θ. Then, if K ≠ 1, the DRC generates the BPA m(•) : 2^Θ ↦ [0, 1], with m(∅) = 0 and
m(A) = Σ_{C ∩ D = A} m1(C) m2(D) / (1 − K), ∀A ≠ ∅,
where K = Σ_{C ∩ D = ∅} m1(C) m2(D) is the mass committed to conflicting evidence. This combination operation is denoted as m = m1 ⊕ m2. The operation ⊕ is both associative and commutative, thus
enabling the combination of multiple BoEs with ease. A variation of the DRC that accounts for evidence reliability is
Figure imgf000014_0005
) , where
Figure imgf000014_0002
Here, di ∈ [0, 1] is referred to as a discounting factor.
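The pooling of two BoEs can be sketched in Python as follows. The combination function follows the standard Dempster's Rule of Combination. The discounting function is an assumption: it uses one common convention in which a factor d scales every mass and moves the remainder to the whole frame, and the exact form used in the specification's discounted variant is not reproduced here. All names are hypothetical.

```python
def discount(bpa, d, frame):
    """Assumed discounting convention: scale each mass by d and move the
    remaining 1 - d to the frame (d = 1 keeps the BPA, d = 0 makes it vacuous)."""
    out = {focal: d * mass for focal, mass in bpa.items()
           if focal != frame and d * mass > 0.0}
    out[frame] = (1.0 - d) + d * bpa.get(frame, 0.0)
    return out


def dempster_combine(m1, m2):
    """Dempster's Rule of Combination for two BPAs over the same frame."""
    combined = {}
    conflict = 0.0
    for a, mass_a in m1.items():
        for b, mass_b in m2.items():
            intersection = a & b
            if intersection:
                combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
            else:
                conflict += mass_a * mass_b  # mass falling on the empty set
    if 1.0 - conflict <= 1e-12:
        raise ValueError("totally conflicting evidence cannot be combined")
    return {focal: mass / (1.0 - conflict) for focal, mass in combined.items()}


FRAME = frozenset({"Excellent", "Good", "Fair", "Poor"})
good_70 = {frozenset({"Good"}): 0.7, FRAME: 0.3}
not_poor = {frozenset({"Excellent", "Good", "Fair"}): 1.0}

fused = dempster_combine(good_70, not_poor)
half_trusted = discount(good_70, 0.5, FRAME)
print(fused)
print(half_trusted)
```

Because the operation is associative and commutative, more than two BoEs can be fused by folding `dempster_combine` over a list in any order.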
In accordance with the principles of the invention, each user preference rating is viewed as a BoE spanning over the FoD Θpref = {θ1, ..., θL}. The mapping that generates the BoE corresponding to the rating rik
Figure imgf000014_0006
is referred to as the DS modeling function. The M xN matrix created as is
Figure imgf000014_0007
referred to as the DS ratings matrix.
The invention is directed to selecting an appropriate DS modeling function fR that captures the explicit and implicit user preference information, while accommodating the associated imperfections (e.g., ambiguities, uncertainties, missing values, etc.). Simple DS theoretic models may be used for this purpose and contextual information may be incorporated to aid the prediction task.
Even in domains where the user preferences are hard, the ratings assignment process possesses a level of uncertainty. For example, a user may have difficulty selecting or may be unwilling to select a single label as the proper preference rating, e.g., in a movie recommendation scenario where users must rate the movies via a '5-star' system.
Given the prediction module 112 may apply a DS modeling
Figure imgf000015_0002
function to capture the user uncertainty in a wide variety of scenarios using
Figure imgf000015_0001
The trust factor a
Figure imgf000015_0003
quantifies how likely the user assigned rating reflects the user's true perception. The value represents the case when the
Figure imgf000015_0006
user's rating is completely untrustworthy and may be modeled via the vacuous BoE.
The dispersion factor
Figure imgf000015_0004
quantifies how likely the user assigned rating would span a larger set. The value
Figure imgf000015_0007
represents the case when the user assigned rating is allocated a DS theoretic mass (provided that
Figure imgf000015_0005
According to one embodiment, the selection of trust and dispersion factors is domain and dataset dependent. Depending on the available evidence and the complexity of the process, the prediction module 112 may utilize user-wide, item-wide, or system-wide constants for these parameters. These constants also may be used to capture the 'significance' of a particular rating towards the overall ACF prediction process. For example, consider a scenario where most users allocate a similar rating for a particular item (e.g., most users in the Movielens dataset give a high rating for the movie Titanic). That rating would play a less significant role in the CF prediction process. For such an item, the prediction module 112 may use a smaller item-wide constant value for αik.
The prediction module 112 combines the trust and dispersion factors to control the DS theoretic mass assigned to the user assigned rating. The DS modeling function in Equation (3) captures a wide variety of user uncertainty. For example, consider ACF algorithms where weighted majority voting strategies produce significant prediction performance improvements compared to correlation based methods. By allowing a ±1 tolerance on user ratings when calculating similarities, these algorithms accommodate a certain level of uncertainty in user rating assignment. The prediction module 112 may apply the DS modeling function in Equation (3) with the parameters {αik, σik} = {1, 1} to capture this scenario. Equation (3) is one simple model that may be used in accordance with the principles of the invention. However, one of ordinary skill in the art will readily appreciate that other modeling functions may be used.
The prediction module 112 may apply the vacuous BoE to model lack of evidence that manifests itself as an unrated entry. Although the system may simply proceed with this vacuous BoE model for an unrated entry, the prediction module 112 may incorporate contextual information to effectively reduce the uncertainty that would otherwise be introduced by a vacuous BoE representation. Embodiments of the invention exploit the power of DS theory to represent imperfections, while reducing the uncertainty of missing entries by using contextual information.
The prediction module 112 may completely populate the ratings matrix prior to the application of ACF and may combine information from multiple sources, taking into account their reliability and significance. Furthermore, the prediction module 112 provides solutions to difficulties associated with data sparsity and cold-start. According to one embodiment, empty slots may be filled in using implicit, explicit, and other contextual information before application of the ACF.
Returning to the HAART therapy scenario presented above, the patient criteria may include Drug_Compliance, Initial_Viral_Load, and Age, among other criteria. The patient criteria may impact the drug response of a drug cocktail and provide contextual domain expertise. The contextual domain expertise defines a "concept" for grouping patients. According to one embodiment, each concept may provide a criterion for grouping the patients. For example, the concept Drug_Compliance may have the following groups: Drug_Compliance.High, Drug_Compliance.Medium, and Drug_Compliance.Low. Users associated with a group may be defined to possess similar drug responses to selected drugs. The groups corresponding to a selected concept may not partition the user space. For example, a user may belong to one or more groups from the same concept. This grouping is directed to the HIV drug treatment context. One of ordinary skill in the art will readily appreciate that alternate groupings may be provided for different contexts.
According to one embodiment, the prediction module 112 may apply contextual information to populate an unrated entry rik ∈ R. The system may combine or fuse an effectiveness rating that each group in which Ui is a member allocates to Ik as a "whole." This fusion operation may be carried out in two stages. At the group level, the system fuses the group preference of each group to which Ui belongs and generates a concept preference. At the concept level, the system fuses the concept preference of all grouping concepts and generates an overall contextual preference.
Regarding the HAART therapy discussed above, the prediction module 112 may apply item-based concepts in addition to the user-based concepts, among other concepts. For example, a physician may group the drug cocktails based on an item-based concept, such as Class_of_Drugs. One of ordinary skill in the art will readily recognize that the principles of the invention may extend equally to other concepts.
The prediction module 112 may assign "Q" as the number of groups belonging to the 'generic' concept "Concept," which may be defined as {Concept.Group1, ..., Concept.GroupQ}. One concept is considered here for notational simplicity. For multiple concepts, a subscript or superscript index may be added to differentiate among concepts. According to one embodiment, the groups to which a user belongs may be identified via the mapping
Figure imgf000018_0005
fC : U ↦ {Concept.Group1, ..., Concept.GroupQ}. This mapping is defined as the grouping function.
The prediction module 112 may apply a DS theoretic BPA to define how the group members belonging to the group Concept.Groupj would, as a whole, rate the item Ik . If information regarding the group preferences of each item is available, the prediction module 112 may use this information directly in a DS theoretic setting. Otherwise, the prediction module 112 may consider users within a given group that have already rated item Ik .
The group preference BPA may be defined as where
Figure imgf000018_0006
Figure imgf000018_0003
The corresponding BoE
Figure imgf000018_0008
is the group preference BoE. The concept preference BoE corresponding to user Ui and item Ik may be obtained by combining or fusing these group preference BoEs.
The concept preference BPA may be defined as
Figure imgf000018_0007
where
Figure imgf000018_0004
The corresponding BoE
Figure imgf000018_0001
is the concept preference BoE. The overall contextual preference BoE corresponding to user Ui and item Ik may be obtained by fusing all the concept preference BoEs.
The contextual preference BPA may be defined as
Figure imgf000018_0002
where
Figure imgf000019_0002
The corresponding BoE
Figure imgf000019_0008
is the contextual preference BoE.
The prediction module 112 may modify the DS ratings matrix R such that each unrated entry is replaced by its corresponding contextual preference BoE, i.e.,
Figure imgf000019_0009
when matrix element
Figure imgf000019_0010
. The prediction module 112 may employ this ratings matrix for future calculations.
In the fusion operations being carried out by the concept preference BoE and the contextual preference BoE, the prediction module 112 may employ a discounting factor to discount each constituent BoE prior to application of the DRC. This may be particularly relevant in an application such as the HAART therapy scenario, e.g., if one concept such as Age is known to have less of an impact on the drug response.
The BoE rik may be considered an 'intra-item' BoE that captures the user preference toward a single item. In order to capture a user preference toward all items as a whole, the prediction module 112 may use an appropriately constructed 'inter-item' BoE defined over the cross-product space of
Figure imgf000019_0003
A focal element may be extracted from the BoE rik . Its cylindrical
Figure imgf000019_0007
extension to the cross-product FoD Θ is
Figure imgf000019_0004
where The mapping where
Figure imgf000019_0005
Figure imgf000019_0006
Figure imgf000019_0001
generates a valid BPA defined on the FoD Θ . The corresponding BoE is the user-BoE generated by extending the
Figure imgf000019_0011
For user Ui, consider the BoEs Mik(•), k = 1, ..., N. Then the BPA Mi : 2^Θ ↦ [0, 1], where
Figure imgf000020_0002
is referred to as the user-BPA of user Ui. The corresponding BoE {Θ, Fi, Mi} is the user-BoE.
The following result is provided. User Ui 's user-BPA M1 (defined over the
FoD Θ ) and the ratings BPAs m
Figure imgf000020_0004
are each defined over the FoD Θpref . Then, the pignistic probability of the singleton is
Figure imgf000020_0003
Figure imgf000020_0001
Here, Bpi(•) and Bpi/k(•) refer to user Ui's
Figure imgf000020_0005
pignistic probability distributions corresponding to its user-BoE and ratings BoEs, respectively.
Since the user-BoE defines a user's 'joint' preference over all the items, a distance metric may be defined on the cross-product FoD Θ to calculate the 'distance' between two users. The 'distance' may be used to identify the similarity among users. If a distance measure between two probability mass functions (p.m.f.s) is available, the prediction module 112 may, via the application of the pignistic transformation in Equation (1), use this distance measure as a distance measure between two BoEs. The distance measure between the two user-BPAs Mi and Mj defined over the same cross-product FoD Θ is D(Mi, Mj) = CD(Bpi, Bpj), where Bpi and Bpj denote the pignistic probability transformations corresponding to Mi and Mj, respectively. CD(•,•) refers to the Chan-Darwiche ("CD") distance measure:
CD(P, Q) = ln max_θ [Q(θ) / P(θ)] − ln min_θ [Q(θ) / P(θ)],
where P and Q are p.m.f.s over the same FoD and 0/0 is taken to be 1.
Thus, the distance measure between the two user-BPAs Mi and Mj is
D(Mi, Mj) = Σ_{k=1}^{N} CD(Bpi/k, Bpj/k),
where Bpi/k and Bpj/k refer to the pignistic probability distributions corresponding to the ratings BPAs of users Ui and Uj for item Ik, respectively.
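The per-item computation of the user-to-user distance can be sketched in Python as below. The sketch uses the standard Chan-Darwiche definition (log of the largest probability ratio minus log of the smallest) and assumes every pignistic probability is strictly positive, which holds, for example, whenever some mass remains on the whole frame. The data layout and function names are hypothetical.

```python
import math


def cd_distance(p, q):
    """Chan-Darwiche distance between two p.m.f.s with identical support.

    Assumes all probabilities are strictly positive, so the special
    cases for zero probabilities are not handled in this sketch.
    """
    if p.keys() != q.keys():
        raise ValueError("distributions must share the same outcomes")
    ratios = [q[x] / p[x] for x in p]
    return math.log(max(ratios)) - math.log(min(ratios))


def user_distance(bp_i, bp_j):
    """Sum of per-item CD distances between two users.

    bp_i and bp_j map each item to that user's pignistic distribution
    over the preference labels for the item.
    """
    return sum(cd_distance(bp_i[item], bp_j[item]) for item in bp_i)


# Hypothetical two-label example over two items.
user_a = {"I1": {1: 0.5, 2: 0.5}, "I2": {1: 0.5, 2: 0.5}}
user_b = {"I1": {1: 0.25, 2: 0.75}, "I2": {1: 0.5, 2: 0.5}}

print(user_distance(user_a, user_b))
```

Working item by item keeps each distribution on the small preference frame rather than on the cross-product frame, whose size grows exponentially with the number of items.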
When determining the distance between two user-BoEs, the prediction module 112 may use the distances between ratings BPAs (which are defined over
Figure imgf000021_0012
instead of directly computing the distance between the two user-BPAs, which are defined over the cross-product FoD Θ . The associated reduction in computational overhead is from Or by a fraction of
Figure imgf000021_0003
Figure imgf000021_0004
A monotonically decreasing function is provided that takes the value one at a distance of zero and tends to zero as the distance grows without bound. With respect to
Figure imgf000021_0005
Figure imgf000021_0011
is referred to as the user-user similarity between users Ui
Figure imgf000021_0006
and
Figure imgf000021_0007
, the prediction module 112 may apply , where
Figure imgf000021_0010
is a domain specific constant. The M × M user-user similarity matrix then
Figure imgf000021_0008
may be generated as
Figure imgf000021_0009
The prediction module 112 may apply the K-nearest neighbor ("KNN") strategy, the minimum similarity thresholding ("MST") (where all users having a similarity higher than or equal to a specified threshold are selected), or a combination of both to perform neighborhood selection in ACF. According to one embodiment of the invention, the prediction module 112 may use the K-nearest neighbors with minimum similarity thresholding technique due to its ability to mitigate prediction errors that are generated from dissimilar users. With KNN alone, the prediction module 112 may select K neighbors even though all of them may not be sufficiently similar to the active user. For given parameters τ and K, the largest set that satisfies and
Figure imgf000022_0008
is the
Figure imgf000022_0002
neighborhood set Nbhdik of user
Figure imgf000022_0007
The prediction module 112 may select Nbhdik by applying MST to U and selecting those users who have rated item Ik and meet the minimum similarity threshold τ with Ui. The prediction module 112 may then apply KNN to select at most K users from this user set having the highest similarity with Ui. To determine the neighborhood corresponding to a new user, the
Figure imgf000022_0006
condition may apply for Nbhdik.
Figure imgf000022_0005
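The neighborhood selection described above (minimum similarity thresholding followed by K-nearest neighbors) can be sketched in Python as follows. The data structures, a nested similarity dictionary and a dictionary of rated-item sets, and the function name are hypothetical choices made for illustration.

```python
def select_neighborhood(active, item, similarity, rated, k, tau):
    """Return at most k users who rated `item` and whose similarity with
    the active user is at least tau, ordered by decreasing similarity."""
    candidates = [
        (score, user)
        for user, score in similarity[active].items()
        if user != active and item in rated.get(user, set()) and score >= tau
    ]
    # Highest similarity first; ties are broken by user name for determinism.
    candidates.sort(key=lambda pair: (-pair[0], pair[1]))
    return [user for _, user in candidates[:k]]


# Hypothetical similarities of the active user U1 to four other users.
similarity = {"U1": {"U2": 0.9, "U3": 0.8, "U4": 0.2, "U5": 0.95}}
rated = {"U2": {"I1"}, "U3": {"I1", "I2"}, "U4": {"I1"}, "U5": {"I2"}}

print(select_neighborhood("U1", "I1", similarity, rated, k=2, tau=0.5))
```

In this example U5 is the most similar user but has not rated I1, and U4 falls below the threshold, so neither is selected even though K would allow it.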
The ACF predictions are usually generated by evidence gathered from neighboring data entries that rate an item of interest. The prediction module 112 enhances the ACF predictions by applying contextual information to populate the ratings of all the neighboring data entries for the users. Therefore, the invention is able to exploit the evidence from all the neighboring data entries rather than only the neighboring data entries that include rating data for selected items. The prediction module 112 may represent the prediction of the unrated item Ik of the active user Ui as the BoE , where
Figure imgf000022_0003
Figure imgf000022_0004
Here, mιk > is the BPA corresponding to the neighborhood prediction
Figure imgf000022_0001
Since the prediction module 112 captures the similarity between users via the user-user similarity, the above equation utilizes the user-user similarity as a discounting factor to 'discount' the ratings BoEs of the neighbor data entries prior to fusion. The predictions of the invention offer more flexibility to the decision-maker than what other ACF schemes may provide. The predictions provide information regarding the confidence associated with the ratings prediction and allow the system to make decisions that correspond more closely to the application domain requirements.
For a hard decision on a singleton classification, the prediction module 112 may use the pignistic probability in Equation (1) and may select a singleton as the preference label. If one preference label such as a singleton or a composite is desired, the prediction module 112 may apply the maximum belief with non-overlapping interval strategy (maxBL). The prediction module 112 may select the singleton preference label whose belief is greater than the plausibility of any other singleton. If this preference label does not exist, the prediction module 112 may select the composite preference label that includes a singleton label that has a maximum belief and those singletons that have a higher plausibility.
According to an exemplary embodiment, the above concepts are implemented using the Movielens dataset. Movielens is a movie recommendation dataset widely used by researchers for benchmarking purposes. At the time of use, Movielens included 100,000 ratings from 943 users for 1682 movies. In Movielens, the ratings are assigned integer values between 1 and 5, with 5 representing the highest possible rating (Θpref = {1, 2, 3, 4, 5}). In addition to ratings, Movielens includes the Genre of each movie, user-related information and item-related information. The user-related information includes age, gender, and occupation of individual users, among other user-related information. The item-related information includes title and IMDb_URL, among other item-related information. While Movielens, with its integer ratings, is not an ideal dataset to demonstrate the full functionality and effectiveness of the invention, it is considered appropriate for traditional ACF algorithms. Thus, Movielens provides a domain for performance evaluation and comparisons with the principles of the invention.
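The maxBL decision strategy can be sketched in Python on the Movielens preference frame {1, 2, 3, 4, 5}. The sketch reflects one reading of the rule: "a higher plausibility" is taken to mean a plausibility at least as large as the belief of the maximum-belief singleton. That reading, the tie-breaking toward the lower rating, and all names are assumptions.

```python
FRAME = frozenset({1, 2, 3, 4, 5})


def belief(bpa, proposition):
    """Sum of the masses of focal elements contained in the proposition."""
    return sum(mass for focal, mass in bpa.items() if focal <= proposition)


def plausibility(bpa, proposition):
    """Sum of the masses of focal elements intersecting the proposition."""
    return sum(mass for focal, mass in bpa.items() if focal & proposition)


def max_bl(bpa, frame=FRAME):
    """Return a singleton label if its belief exceeds the plausibility of
    every other singleton; otherwise return the composite made of the
    maximum-belief singleton and the singletons whose plausibility is
    not below that belief (assumed interpretation)."""
    bl = {s: belief(bpa, frozenset({s})) for s in frame}
    pl = {s: plausibility(bpa, frozenset({s})) for s in frame}
    best = max(sorted(frame), key=lambda s: bl[s])  # ties go to the lower label
    rivals = {s for s in frame if s != best and pl[s] >= bl[best]}
    return frozenset({best} | rivals)


confident = {frozenset({4}): 0.7, FRAME: 0.3}
ambiguous = {frozenset({4}): 0.4, frozenset({4, 5}): 0.3, FRAME: 0.3}

print(sorted(max_bl(confident)))
print(sorted(max_bl(ambiguous)))
```

In the second example the belief in rating 4 does not exceed the plausibility of rating 5, so the decision is the composite {4, 5} rather than a forced single label.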
A domain with soft user ratings is provided to demonstrate the functionality of the invention. A probability rating module 116 is provided to create a DS_Movielens dataset by modifying Movielens through artificially introducing imperfections into the data. The modification process introduces imperfections while preserving existing user-user, item-item and user-item relationships that are needed by ACF algorithms.
The probability rating module 116 applies the following viewpoint regarding the Movielens ratings to generate the DS_Movielens dataset. Suppose the users considered "soft" ratings. Since the Movielens domain only shows hard ratings, the probability rating module 116 employs a mechanism to transform a hard rating in Movielens to a soft rating in DS_Movielens. The probability rating module 116 applies partial probability models to create different user profiles. Such partial probability models, together with the power set method, are used to convert data rife with diverse types of imperfections into DS theoretic evidence.
FIG. 4 illustrates graphical summaries of partial probability models for four user profiles based on zero tolerance, ±1 tolerance, end-weighted ±1 tolerance, and ±2 tolerance. In each of the four graphs of FIG. 4, the horizontal axis represents the user rating as it appears in the Movielens dataset. The vertical axis represents the 'true' rating that a movie received.
The probability rating module 116 employs user profiles to generate the DS_Movielens dataset. For a movie having a True_Rating = 2, a ±1 tolerance user may, with equal probability, allocate either True_Rating = 2 or a rating from the set (1, 2, 3) (see FIG. 4(b)). In other words, if the hard rating Movielens dataset allowed soft ratings, a ±1 tolerance user may sometimes rather use the interval-valued rating (1, 2, 3) for this same movie instead of the hard rating Movielens_Rating = 2. The system identifies these as the black or gray distribution, with the black distribution representing the hard rating Movielens_Rating = 2 and the gray distribution representing the interval-valued rating (1, 2, 3). A zero tolerance user, for the same movie, would allocate a Movielens_Rating = 2 (see FIG. 4(a)). An end-weighted ±1 tolerance user behaves similarly to a ±1 tolerance user (see FIG. 4(c)) and a ±2 tolerance user may, with equal probability, allocate either True_Rating = 2 or a rating from the set (1, 2, 3, 4) (see FIG. 4(d)). Clearly, these four user profiles represent a relatively broad spectrum of users.
To determine what user's opinion could have generated Movielens_Rating = 2, the probability rating module 116 employs the power set approach while generating DS_Movielens from Movielens. The power set approach accounts for user rating imperfections, without resorting to various "assumptions" and "interpolations." The power set approach applied by the probability rating module 116 may identify the gray and black distributions as 0 and 1, respectively. The "state of nature" may be considered to be in one of 2^5 = 32 states. Suppose a ±1 tolerance user allocates a Movielens_Rating = 2. If the state of nature is defined as {1, x, 1, x, x}, then the generating distributions are black for True_Rating = {1, 3} and either gray or black for the other ratings. In view of this, the only "feasible" true rating that could have generated Movielens_Rating = 2 is in fact True_Rating = 2.
Alternatively, if the state of nature is defined as {0, x, 0, x, x}, then the generating distributions are gray for True_Rating = {1, 3} and either gray or black for the other ratings. In view of this, the feasible true ratings are True_Rating = {1, 2, 3}. In this manner, the probability rating module 116 may complete the "Feasible True_Rating" column in FIG. 5. The set of feasible true ratings of any other Movielens rating corresponding to an arbitrary user 'character' profile can be obtained similarly. FIG. 5 shows the BPA corresponding to Movielens_Rating = 2 of a ±1 tolerance user.
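The enumeration of states of nature for a ±1 tolerance user can be sketched in Python as follows. The sketch encodes one reading of the text: under a black (1) distribution a true rating can only produce itself, while under a gray (0) distribution it can produce any rating within the tolerance. Treating all 32 states as equally likely when turning the feasible sets into masses is an additional assumption, so the resulting masses are illustrative and are not claimed to reproduce FIG. 5. All names are hypothetical.

```python
from collections import Counter
from itertools import product

RATINGS = (1, 2, 3, 4, 5)


def feasible_true_ratings(observed, state, tolerance=1):
    """Feasible true ratings for an observed rating under one state of nature.

    state[i] is 1 (black, hard rating) or 0 (gray, interval rating) for
    the true rating RATINGS[i].
    """
    feasible = set()
    for true_rating, flag in zip(RATINGS, state):
        if flag == 1:
            if true_rating == observed:
                feasible.add(true_rating)
        elif abs(true_rating - observed) <= tolerance:
            feasible.add(true_rating)
    return frozenset(feasible)


def power_set_bpa(observed, tolerance=1):
    """Mass of each feasible set, assuming the 2**5 states are equally likely."""
    states = list(product((0, 1), repeat=len(RATINGS)))
    counts = Counter(feasible_true_ratings(observed, s, tolerance) for s in states)
    return {focal: count / len(states) for focal, count in counts.items()}


print(sorted(feasible_true_ratings(2, (1, 0, 1, 0, 0))))
print(sorted(feasible_true_ratings(2, (0, 1, 0, 1, 1))))
for focal, mass in sorted(power_set_bpa(2).items(), key=lambda kv: sorted(kv[0])):
    print(sorted(focal), mass)
```

The two printed feasible sets correspond to the states {1, x, 1, x, x} and {0, x, 0, x, x} discussed above, for one particular choice of the unconstrained entries.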
According to one embodiment, the probability rating module 116 generates five DS_Movielens datasets with different values for p, viz., p = {0.1, 0.3, 0.5, 0.7, 0.9}. The probability rating module 116 may create each DS_Movielens dataset in the following manner. A user-item pair may be selected that has been rated as rik.
Randomly, with the probabilities {p, (1 − p)/3, (1 − p)/3, (1 − p)/3}, one user profile may be selected from FIG. 4(a), 4(b), 4(c), and 4(d), respectively. The corresponding feasible true ratings and DS theoretic BPA may be obtained via the procedure described above, and the hard rating rik may be replaced with this BPA. The probability rating module 116 may repeat this process for all rated entries in the Movielens dataset.
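The random selection of a user profile for each rated entry can be sketched in Python with the standard library. The profile names are hypothetical labels for the four panels of FIG. 4, and the seeded generator is only there to make the sketch repeatable.

```python
import random

# Hypothetical labels for the profiles of FIG. 4(a) through 4(d).
PROFILES = ("zero_tolerance", "pm1_tolerance", "end_weighted_pm1", "pm2_tolerance")


def profile_weights(p):
    """Selection probabilities {p, (1 - p)/3, (1 - p)/3, (1 - p)/3}."""
    rest = (1.0 - p) / 3.0
    return [p, rest, rest, rest]


def sample_profile(p, rng):
    """Draw one user profile for a single rated user-item entry."""
    return rng.choices(PROFILES, weights=profile_weights(p), k=1)[0]


rng = random.Random(0)
for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    draws = [sample_profile(p, rng) for _ in range(1000)]
    share = draws.count("zero_tolerance") / len(draws)
    print(p, round(share, 2))
```

A larger p therefore yields a dataset that stays closer to the original hard ratings, since the zero tolerance profile leaves a rating unchanged.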
The probability rating module 116 may transform the DS theoretic user ratings in the DS_Movielens dataset generated in the previous step into probabilities via the pignistic transformation in Equation (1). This is how the PR_Movielens dataset is generated. The dataset that is generated by selecting the most likely ratings in FIG. 5, which provides the rating {2}, produces the Movielens dataset that was initially selected.
For performance comparisons, the system generated re-implementations of the following ACF algorithms. A broadly used ACF system based on correlation analysis is identified by the acronym CORR. An algorithm proposed by Nakamura and Abe (A. Nakamura and N. Abe, "Collaborative filtering using weighted majority prediction algorithms," Proc. International Conference on Machine learning (ICML '98), San Francisco, CA: Morgan Kaufmann, 1998, pp. 395-403) is identified by the acronym NA. The authors of NA provided three variants: one based on user-to-user similarity (u-NA), one based on item-to-item similarity (i-NA), and one combining these two (c-NA). These algorithms enable the user to accommodate the ignorance inherent in user ratings. In experiments reported by Nakamura and Abe, NA compares favorably with correlation-based methods. While neither CORR nor NA is directly applicable to the DS_Movielens datasets, the probability rating module 116 may apply CORR to the PR_Movielens dataset with non-integer ratings generated by weighing each rating by its corresponding probability. By contrast, integer-valued ratings are used in NA. If the probability rating module 116 generates such a dataset from the DS_Movielens dataset, the information provided by the user ratings may be significantly distorted. As a result, NA is applied to the Movielens dataset.
According to one embodiment, simulations were run with the DS modeling function in Equation (3). In view of the absence of adequate information, the two model parameters, trust factors and dispersion factors, were replaced with system-wide constants: {α_ik, σ_ik} ≡ {α, σ}, ∀ i, k.
For example, the concept of Genre information was used to generate item-based contextual information. Adopting the nomenclature detailed above, the invention defines Concept := Genre and Groups := {Genre.Group_1, ..., Genre.Group_Q}, where the concept groups may be Drama, Thrillers, and Romance, among other concept groups. When generating the group-preference BoEs, the probability rating module 116 seeks to capture how movies from a given genre would, as a whole, be rated by user U_i. Since Movielens does not provide users the opportunity to express their genre preferences explicitly, the probability rating module 116 estimates the users' preferences using the group preference BPA applied to movies that have already been rated by user U_i. According to one embodiment, no discounting is applied.
According to one embodiment, the probability rating module 116 does not apply discounting to the definitions for a concept preference BPA and a contextual preference BPA described above. If additional concepts are utilized, not all concepts may contribute equally to user preferences. For example, the contribution of the concept Director may differ from the contribution of the concept Cast. These differences may be accommodated through discounting. The ratings in the DS_Movielens dataset were used as the user preference BoE without additional modeling, and the genre information was used as in the above case.
For consistency with prior methods, the following methodology was used in the performance experiments that apply the principles of the invention. According to one embodiment, the system randomly selects 10% of the users and withholds five randomly selected movie ratings for each of these users. In other words, the probability rating module 116 hides five non-empty fields in the ratings matrix for each selected user and prevents these fields from being used during training. The system subsequently uses these withheld ratings as an independent testing set. The remaining ratings represent the training set. This process may be repeated for ten different random splits into training and testing sets. The resulting sets are denoted by
Figure imgf000027_0003
and
Figure imgf000027_0004
where ℓ = 1, ..., 10. The results shown below are averages obtained from the 10 splits. For user-to-user similarity, γ is set to 10⁻⁴.
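The splitting procedure can be sketched as follows. This is an illustration under stated assumptions: the nested-dictionary layout, the function name make_split, the rule that only users with more than five ratings are eligible, and the toy data are all hypothetical choices for the example.

```python
import random


def make_split(ratings, rng, user_fraction=0.10, withheld_per_user=5):
    """Split ratings (user -> {item: rating}) into training and testing sets.

    A fraction of the users is chosen at random, and for each chosen user
    a fixed number of rated entries is hidden from training.
    """
    # Assumption: only users with more ratings than will be withheld are eligible.
    eligible = [u for u, items in ratings.items() if len(items) > withheld_per_user]
    n_test_users = max(1, round(user_fraction * len(ratings)))
    test_users = rng.sample(eligible, min(n_test_users, len(eligible)))

    training = {u: dict(items) for u, items in ratings.items()}
    testing = {}
    for u in test_users:
        hidden = rng.sample(sorted(training[u]), withheld_per_user)
        testing[u] = {item: training[u].pop(item) for item in hidden}
    return training, testing


# Toy ratings matrix: 50 users, 20 rated items each, ratings in 1..5.
toy = {u: {i: 1 + (u + i) % 5 for i in range(20)} for u in range(50)}

# Ten different random splits, one per seed.
splits = [make_split(toy, random.Random(seed)) for seed in range(10)]
```

Metrics computed on each of the ten testing sets can then be averaged, as the text describes.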
The probability rating module 116 denotes the 'true' rating that user U_i gives to an item by
Figure imgf000027_0005
in the case of the Movielens dataset and by
Figure imgf000027_0002
in the case of the DS_Movielens dataset. The ratings predicted by the CORR and NA techniques are denoted by r,k and the ratings predicted by the invention are denoted by
Figure imgf000027_0001
Performance criteria used for evaluations of ACF algorithms in environments with hard ratings include the mean absolute error (MAE). Other metrics include Precision or Recall. The MAE corresponding to the rating of the testing set
Figure imgf000028_0008
may be calculated as follows:
Figure imgf000028_0001
where
Figure imgf000028_0002
identifies the user-item pairs whose true rating is
Figure imgf000028_0005
. With obvious modifications, the probability rating module 116 may obtain the overall MAE measure for the ACF algorithm.
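Because the MAE equation survives here only as an image placeholder, the sketch below uses the usual definition of mean absolute error, computed separately for each true rating value and then over all pairs. The function names and the toy data are hypothetical, and the exact normalization in the original equation is an assumption.

```python
def mae_by_true_rating(true_ratings, predicted_ratings):
    """Return {true rating: MAE over the user-item pairs with that true rating}.

    Both arguments map (user, item) -> rating.
    """
    errors = {}
    for pair, truth in true_ratings.items():
        errors.setdefault(truth, []).append(abs(predicted_ratings[pair] - truth))
    return {truth: sum(errs) / len(errs) for truth, errs in errors.items()}


def overall_mae(true_ratings, predicted_ratings):
    """Return the MAE over all user-item pairs in the testing set."""
    total = sum(abs(predicted_ratings[pair] - truth)
                for pair, truth in true_ratings.items())
    return total / len(true_ratings)


# Toy testing set: two pairs with true rating 2 and one pair with true rating 5.
truth = {(1, 10): 2, (1, 11): 2, (2, 10): 5}
preds = {(1, 10): 3, (1, 11): 2, (2, 10): 4}
per_rating = mae_by_true_rating(truth, preds)
overall = overall_mae(truth, preds)
```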
The MAE requires hard 'predicted' ratings. As a result, to compare CoFiDS with CORR and NA using the MAE, the DS theoretic predictions are converted to hard predictions through, for example, the pignistic transformation. Alternatively, the probability rating module 116 may compare soft predictions, such as those provided by CoFiDS, with hard ratings using the following DS theoretic measures:
Figure imgf000028_0003
where
Figure imgf000028_0004
Here,
Figure imgf000028_0006
refers to the pignistic probability that corresponds to the DS theoretic
Figure imgf000028_0007
. The following performance criteria are provided:
Figure imgf000029_0001
where β
Figure imgf000029_0003
. For example, measure.
Figure imgf000029_0013
In environments where the user preference ratings are soft, such as the DS_Movielens dataset, the degree to which one BPA (viz.,r» ) approximates another BPA
Figure imgf000029_0014
is determined using the following definition:
Figure imgf000029_0002
Figure imgf000029_0004
where and
Figure imgf000029_0005
and denotes the Euclidean norm. Here,
Figure imgf000029_0006
Figure imgf000029_0008
are each a size
Figure imgf000029_0007
column vector containing the masses allocated to each subset of by and respectively. matrix with
Figure imgf000029_0010
Figure imgf000029_0011
Figure imgf000029_0012
Figure imgf000029_0009
According to one embodiment, both DS-PE1 and DS-PE2 may take values from [0,1]. For DS-PE2, the probability rating module 116 could have used the KL-divergence instead of the Euclidean norm. In this case, however, the error would not be bounded by the closed interval [0,1]. Moreover, the KL-divergence would require the pignistic distributions corresponding to the true and predicted BPAs to have identical supports.
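The two remarks about the KL-divergence can be seen in a small example. The sketch below is not the DS-PE2 definition, which is not reproduced in this text: it simply compares the KL-divergence with a plain, unnormalized Euclidean distance between two hypothetical pignistic distributions whose supports differ.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for discrete distributions given as {outcome: probability}."""
    total = 0.0
    for outcome, p_x in p.items():
        if p_x == 0.0:
            continue  # terms with p(x) = 0 contribute nothing
        q_x = q.get(outcome, 0.0)
        if q_x == 0.0:
            return math.inf  # p puts mass where q has none
        total += p_x * math.log(p_x / q_x)
    return total


def euclidean_distance(p, q):
    """Plain Euclidean distance between two discrete distributions."""
    outcomes = set(p) | set(q)
    return math.sqrt(sum((p.get(x, 0.0) - q.get(x, 0.0)) ** 2 for x in outcomes))


# Hypothetical pignistic distributions: the true one is split between ratings
# 3 and 4, while the predicted one commits fully to rating 4.
true_betp = {3: 0.5, 4: 0.5}
pred_betp = {4: 1.0}

kl = kl_divergence(true_betp, pred_betp)         # infinite: the supports differ
dist = euclidean_distance(true_betp, pred_betp)  # finite, at most sqrt(2)
```

The Euclidean distance stays finite for any pair of distributions, which is what makes a bounded error measure possible after normalization.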
The behavior of CoFiDS depends on a few parameters, which leads to the examination of how the concrete settings of these parameters affect CoFiDS' performance. An elementary performance measure is the MAE. FIG. 6 uses α = 0.9, which indicates 90% confidence in each user rating, and illustrates how CoFiDS' MAE varies with different values of the dispersion factor σ (for several choices of {K, τ}). The results indicate that the performance is minimally sensitive as long as σ is somewhere in the interval [0.4, 0.7], with the best overall MAE being obtained when σ ≈ 2/3. FIG. 12 illustrates experiments for σ = 2/3, as described below.
FIG. 7 illustrates how CoFiDS' MAE changes with the neighborhood size, K. FIG. 8 illustrates how MAE varies with the similarity threshold τ. These graphs show the impact of some other parameters, with other variables remaining fixed. As shown, the MAE first drops with increasing K and then remains generally constant beyond approximately K = 70.
As for the similarity threshold, τ, FIG. 8 illustrates a minimum for MAE around τ = 0.79. The results of these experiments are used as guidance in the next set of experiments, where {K, τ} = {80, 0.79}. For a concrete domain, the value of these parameters may need to be established using a cross-validation technique. Proceeding to the more appropriate DS-based performance criteria, FIGS. 9 and 10 illustrate how the value of DS-MAE for CoFiDS varies with changing neighborhood size, K, and similarity threshold, τ, respectively. In FIGS. 9 and 10, all other parameters are kept constant. The nature of the DS theoretic predictions renders subjective the direct comparison of CoFiDS' performance with that of CORR and NA.
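To make the roles of K and τ concrete, the sketch below selects neighbors by first discarding candidates whose similarity falls below τ and then keeping at most the K most similar ones. Combining the two criteria in this order is an assumption made for the example; the function name and the similarity values are hypothetical.

```python
def select_neighbors(similarities, k, tau):
    """Return up to k neighbors with similarity >= tau, most similar first.

    similarities maps neighbor id -> similarity to the target user.
    """
    eligible = [(sim, neighbor) for neighbor, sim in similarities.items() if sim >= tau]
    eligible.sort(key=lambda pair: pair[0], reverse=True)
    return [neighbor for _, neighbor in eligible[:k]]


# Hypothetical similarities between a target user and four candidates.
sims = {"u1": 0.95, "u2": 0.80, "u3": 0.78, "u4": 0.85}

small_neighborhood = select_neighbors(sims, k=2, tau=0.79)
large_neighborhood = select_neighbors(sims, k=80, tau=0.79)
```

With τ = 0.79, the candidate with similarity 0.78 is excluded regardless of how large K is.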
FIG. 11 illustrates a few exemplary CoFiDS predictions performed by the probability rating module 116 using Movielens and single-label predictions obtained by the pignistic transformation and the maxBL strategy. The decision that corresponds to the user-item pair (72, 550) is not controversial. By contrast, the decision that corresponds to the user-item pair (2, 251) shows that there may be a challenge capturing the richer information content of the DS theoretic BoE with a single-label decision. FIG. 11 illustrates that although the pignistic transformation and the maxBL strategy both favor a "4" rating, the CoFiDS prediction does not clearly discriminate between the "4" and "5" (true) ratings. For the user-item pair (116, 758) illustrated in FIG. 11, while the maxBL strategy captures the indecision that is apparent in the CoFiDS prediction, the pignistic transformation does not.
In view of the different nature of the systems being compared, two strategies emerge for comparing the predictions. A first strategy includes converting CoFiDS' predictions to hard ones. A second strategy includes interpreting CORR's and NA's predictions as soft predictions. Each of these strategies is addressed separately.
In the first strategy, the probability rating module 116 converts the CoFiDS predictions to hard decisions for direct comparison with CORR and NA. The pignistic transformation may be used to generate the hard decisions from the soft CoFiDS predictions. This approach reduces the effectiveness of CoFiDS, whose strength is the ability to generate soft decisions. This strategy is available for cases where hard decisions are satisfactory.
According to one embodiment, the basic parameters for CoFiDS are set to {α, σ} = {0.9, 2/3}. To quantify the prediction performance, MAE and criteria from the field of information retrieval, such as Precision, Recall, and F1, are used. A high value is desired for Precision in certain domains to ensure that the system's prediction of value True_Rating is accurate. This desire is valid even if the system may have missed many cases where the true user's rating was True_Rating. For example, if the system predicts "2," this value may be relied upon even though the system may have missed many cases where the true value was "2". By contrast, Recall is desired in domains where the system needs to correctly recall as many occurrences of ratings True_Rating as possible. F1 combines the two criteria and is preferred in domains where Precision and Recall are deemed equally important.
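The three criteria can be computed per rating value as sketched below, using the usual information retrieval definitions. The function name and the toy data are hypothetical; returning 0.0 when a denominator is zero is a convention chosen for the example.

```python
def precision_recall_f1(true_ratings, predicted_ratings, rating):
    """Precision, Recall, and F1 for one rating value, treated as the positive class."""
    pairs = list(zip(true_ratings, predicted_ratings))
    hits = sum(1 for t, p in pairs if t == rating and p == rating)
    predicted = sum(1 for _, p in pairs if p == rating)
    actual = sum(1 for t, _ in pairs if t == rating)

    precision = hits / predicted if predicted else 0.0
    recall = hits / actual if actual else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy example: the single prediction of "2" is right (high Precision),
# but three of the four true "2" ratings are missed (low Recall).
truth = [2, 2, 2, 2, 3, 4]
preds = [2, 3, 3, 4, 3, 4]
precision, recall, f1 = precision_recall_f1(truth, preds, rating=2)
```

Here Precision is 1.0 and Recall is 0.25, which is the situation the paragraph above describes for a predicted "2".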
FIG. 12 summarizes the results of these experiments. Bold values indicate the best performance in each category. As the differences are substantial, the statistical significance is not evaluated. In FIG. 12, each of the five possible ratings ("1" through "5") is provided a column. The experiments show that NA-based predictions are seldom the best, which appears to indicate that the technique is not well suited for soft ratings of this particular kind. The situation is less straightforward when CORR and CoFiDS are compared. A superficial observation demonstrates that, on average, CoFiDS' mean error is lower. This apparent performance edge may be attributed to this system's higher ability to predict the "middle" ratings of "3" and "4". By contrast, CORR more accurately predicted "1." In this example, the margin between the two systems is low. The F1 criterion provides similar impressions, with the F1 components potentially offering deeper insights. According to one embodiment, CoFiDS may be preferable in domains where the user emphasizes Precision, whereas CORR may be a better choice when Recall is of importance.
The prediction module 112 may provide the CoFiDS results even though a conversion to a hard decision may not exploit the full strength and functionality of the underlying DS-theoretic basis. According to one embodiment, coverage performance, which measures the percentage of items for which the ACF algorithm can make correct predictions, is lower for CORR if the ACF algorithm parameters have been tuned for lower MAE. Both NA and CoFiDS provide nearly complete coverage. Therefore, for an improved comparison with CORR, a configuration is used that minimizes MAE while keeping coverage at the 90% level.
Turning now to the second analysis strategy, where CORR and NA decisions are interpreted as soft predictions, integer-valued predictions are not needed for CORR. This simplifies the comparison of CORR with CoFiDS in terms of soft predictions. By contrast, the NA decisions cannot be readily "softened" because they are integer-valued decisions. For this reason, only a comparison of CORR with CoFiDS is provided below.
The prediction module 112 may apply the following DS-theoretic BPA to interpret a CORR prediction,
Figure imgf000032_0003
, as soft:
Figure imgf000032_0001
where
Figure imgf000032_0002
and
Figure imgf000032_0004
denote the highest integer rating that does not exceed the CORR prediction
Figure imgf000032_0005
and the lowest integer rating that does not fall below the CORR prediction
Figure imgf000033_0001
, respectively. According to one embodiment, the CORR prediction 3.3 with
Figure imgf000033_0002
is interpreted as the Bayesian statement, "The rating is 3 with 70% confidence, and it is 4 with 30% confidence." Equation (7) corresponds well with this typical interpretation of a CORR prediction.
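The interpretation of a non-integer CORR prediction as a soft rating can be sketched as follows. Equation (7) is not reproduced in this text, so the sketch is reconstructed only from the worked example of 3.3; the function name soften and the handling of an integer prediction (all mass on that rating) are assumptions.

```python
import math


def soften(prediction):
    """Split a real-valued prediction between its two neighboring integer ratings.

    The mass on each neighbor grows as the prediction moves closer to it.
    """
    low, high = math.floor(prediction), math.ceil(prediction)
    if low == high:
        # Assumption: an integer prediction places all mass on that rating.
        return {low: 1.0}
    return {low: high - prediction, high: prediction - low}


soft = soften(3.3)  # roughly {3: 0.7, 4: 0.3}
```

The masses are only approximately 0.7 and 0.3 because of floating-point rounding.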
FIG. 13 summarizes the results, with the configuration that yields the best overall DS-MAE being used for each ACF algorithm. The same CoFiDS parameters are used as before. Again, bold values in FIG. 13 indicate the best performance in each category.
While the average mean error is lower in the case of CoFiDS, the correlation-based approach provides improved results for predicting the minimum and maximum values ("1" and "5," respectively). By contrast, CoFiDS provides enhanced results for predicting the "middle" values of "3" and "4." This same conclusion is reached in the case of the F1 criterion, whose components display different behavior for each of the two systems. CoFiDS is preferred in domains where high precision is desired, while CORR is preferred in applications where high recall is desired.
According to one embodiment, the true ratings may be soft to permit performance comparisons using the criteria DS-PE1 and DS-PE2 from above. While the CoFiDS predictions are provided in the soft form, CORR predictions are converted to soft predictions using Equation (7). FIG. 14 illustrates the comparison for several different values of p, the probability with which the zero tolerance user was selected. The other three user profiles were selected with equal probability. Since CoFiDS consistently outperforms the CORR system by a large margin, the evaluation of statistical significance is not performed.
FIG. 15 illustrates how DS-PE1 varies with the changing neighborhood size K when p = 0.1. The performance is poor as long as the neighborhood is small. The performance peaks and then starts slowly degrading. In a realistic domain, the graceful performance degradation after reaching the optimum value supports the notion that the optimum value of K can be obtained by cross-validation techniques.
In the general field of recommender systems, the invention provides methods of accommodating data imperfections for domains where the user ratings are subjective or are otherwise unreliable. The system applies coarse settings to system-wide parameters. According to one embodiment, the CoFiDS performance compares favorably with performance derived using conventional ACF techniques. The invention uses CoFiDS to generate soft decisions where domain experts offer subjective opinions. The invention propagates the uncertainties from the user-preference ratings to the output predictions. By contrast, conventional methods of "forcing" decisions into crisp integer values are deficient.
The invention may be realized in hardware, software, or a combination of hardware and software. Any kind of computing system or other apparatus adapted for carrying out the methods described herein is suited to perform the functions described herein.
A typical combination of hardware and software could be a specialized or general purpose computer system having one or more processing elements. A computer program may be provided and stored on a storage medium that controls the computer system when loaded and executed, such that it carries out the methods described herein. The invention can also be embedded in a computer program product, which comprises all the features enabling the implementation of the methods described herein, and which, when loaded in a computing system is able to carry out these methods. Storage medium refers to any volatile or non-volatile storage device.
Computer program or application in the present context means any expression, in any language, code or notation, of a set of instructions intended to cause a system having an information processing capability to perform a particular function either directly or after either or both of the following: a) conversion to another language, code or notation; b) reproduction in a different material form.
In addition, unless mention is made above to the contrary, it should be noted that all of the accompanying drawings are not to scale. Significantly, this invention can be embodied in other specific forms without departing from the spirit or essential attributes thereof. Accordingly, reference should be had to the following claims, rather than to the foregoing specification, as indicating the scope of the invention.

Claims

What is claimed is:
1. An automated collaborative filtering device in communication with a client terminal device and receiving data from a plurality of sources, the automated collaborative filtering device comprising: a storage module that stores data gathered from the plurality of sources, wherein the data includes contextual information and wherein the storage module has a database that includes filled data slots and empty data slots; and a prediction module that communicates with the storage module and the client terminal device, the prediction module is programmed to generate prediction data based on the contextual information, wherein the prediction data is provided to populate the empty data slots.
2. The automated collaborative filtering device according to claim 1, wherein the prediction module processor applies Dempster-Shafer belief theoretic collaborative filtering to populate the empty data slots.
3. The automated collaborative filtering device according to claim 1, wherein the prediction module generates reliability information for the prediction data.
4. The automated collaborative filtering device according to claim 3, wherein the prediction module generates predictions with reliability information for each of the plurality of sources.
5. The automated collaborative filtering device according to claim 4, wherein the prediction module combines the predictions with the reliability information from the plurality of sources to provide an aggregate prediction with aggregate reliability information.
6. The automated collaborative filtering device according to claim 1, wherein the prediction module populates the empty data slots prior to performing automated collaborative filtering.
7. The automated collaborative filtering device according to claim 1, wherein the prediction module organizes the gathered data into at least one category based on criteria including at least one of user data and item data.
8. The automated collaborative filtering device according to claim 7, wherein the prediction module generates prediction data based on one category using at least one of a K-nearest neighbor selection and a minimum similarity threshold selection.
9. The automated collaborative filtering device according to claim 7, wherein the prediction module generates prediction data based on at least two categories using at least one of a K-nearest neighbor selection and a minimum similarity threshold selection.
10. A method of performing automated collaborative filtering, the method comprising: providing a database that includes filled data slots and empty data slots; storing data gathered from a plurality of sources into the database; obtaining contextual information from the stored data; generating prediction data based on the contextual information; and populating the empty data slots with the prediction data.
11. The method according to claim 10, further comprising applying Dempster-Shafer belief theoretic collaborative filtering to populate the empty data slots.
12. The method according to claim 10, further comprising generating predictions with reliability information for the prediction data.
13. The method according to claim 12, wherein the predictions with reliability information are generated for each of the plurality of sources.
14. The method according to claim 13, further comprising: combining the predictions with reliability information from the plurality of sources; and providing an aggregate prediction with aggregate reliability information.
15. The method according to claim 10, further comprising populating the empty data slots prior to performing automated collaborative filtering.
16. The method according to claim 10, further comprising organizing the gathered data into at least one category based on criteria including at least one of user data and item data.
17. The method according to claim 16, further comprising generating prediction data based on one category using at least one of a K-nearest neighbor selection and a minimum similarity threshold selection.
18. The method according to claim 16, further comprising generating prediction data based on at least two categories using at least one of a K-nearest neighbor selection and a minimum similarity threshold selection.
19. An automated collaborative filtering device in communication with a client terminal device and receiving data from a plurality of sources, the automated collaborative filtering device comprising: a storage module that stores data gathered from the plurality of sources, wherein the data includes contextual information and wherein the storage module has a database that includes filled data slots and empty data slots; a probability rating module that communicates with the storage module and the client terminal device, the probability rating module being programmed to extract predefined values from the data and transform the predefined values into a probability of obtaining the predefined values; and a prediction module that communicates with the probability rating module, the prediction module being programmed to generate prediction data based on the contextual information and the probability of obtaining the predefined values, wherein the prediction data is provided to populate the empty data slots.
20. The automated collaborative filtering device according to claim 19, wherein the probability rating module is programmed to apply a weighting factor based on the source of the data.
PCT/US2009/050848 2008-07-16 2009-07-16 System and method of using automated collaborative filtering for decision-making in the presence of data imperfections WO2010009314A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US8113408P 2008-07-16 2008-07-16
US61/081,134 2008-07-16

Publications (2)

Publication Number Publication Date
WO2010009314A2 true WO2010009314A2 (en) 2010-01-21
WO2010009314A3 WO2010009314A3 (en) 2010-04-15

Family

ID=41551015

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2009/050848 WO2010009314A2 (en) 2008-07-16 2009-07-16 System and method of using automated collaborative filtering for decision-making in the presence of data imperfections

Country Status (1)

Country Link
WO (1) WO2010009314A2 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103888852A (en) * 2014-03-24 2014-06-25 清华大学 Video recommendation method and device for social television
WO2018160492A1 (en) * 2017-03-03 2018-09-07 Asapp, Inc. Automated upsells in customer conversations
US10579752B2 (en) 2014-05-12 2020-03-03 Micro Focus Llc Generating a model based on input
US11869015B1 (en) 2022-12-09 2024-01-09 Northern Trust Corporation Computing technologies for benchmarking

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020083067A1 (en) * 2000-09-28 2002-06-27 Pablo Tamayo Enterprise web mining system and method
US20020198991A1 (en) * 2001-06-21 2002-12-26 International Business Machines Corporation Intelligent caching and network management based on location and resource anticipation

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
T. L. WICKRAMARATHNE: 'A BELIEF THEORETIC APPROACH FOR AUTOMATED COLLABORATIVE FILTERING' A THESIS FOR THE DEGREE OF MASTER OF SCIENCE, [Online] May 2008, Retrieved from the Internet: <URL:http://etd.library.miami.edu/theses/av ailable/etd-02222008-101303/unrestrict ed/twickramarathneSp08.pdf> *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103888852A (en) * 2014-03-24 2014-06-25 清华大学 Video recommendation method and device for social television
CN103888852B (en) * 2014-03-24 2017-05-31 清华大学 For the video recommendation method and device of social television
US10579752B2 (en) 2014-05-12 2020-03-03 Micro Focus Llc Generating a model based on input
WO2018160492A1 (en) * 2017-03-03 2018-09-07 Asapp, Inc. Automated upsells in customer conversations
US10885529B2 (en) 2017-03-03 2021-01-05 Asapp, Inc. Automated upsells in customer conversations
US11869015B1 (en) 2022-12-09 2024-01-09 Northern Trust Corporation Computing technologies for benchmarking

Also Published As

Publication number Publication date
WO2010009314A3 (en) 2010-04-15

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 09798750

Country of ref document: EP

Kind code of ref document: A2

NENP Non-entry into the national phase in:

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 09798750

Country of ref document: EP

Kind code of ref document: A2