US20170103343A1 - Methods, systems, and media for recommending content items based on topics - Google Patents

Publication number
US20170103343A1
Authority
US
United States
Prior art keywords
topics
user
content items
user interest
determining
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US15/384,692
Inventor
Yangli Hector Yee
James Vincent McFadden
John Kraemer
Dasarathi Sampath
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Google LLC
Original Assignee
Google LLC
Application filed by Google LLC
Priority to US15/384,692
Publication of US20170103343A1
Assigned to GOOGLE LLC. Change of name (see document for details). Assignors: GOOGLE INC.

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N99/005
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2457 Query processing with adaptation to user needs
    • G06F16/24578 Query processing with adaptation to user needs using ranking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/248 Presentation of query results
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/335 Filtering based on additional data, e.g. user or group profiles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F17/30781
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/047 Probabilistic or stochastic networks
    • G06N7/005
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00 Computing arrangements based on specific mathematical models
    • G06N7/01 Probabilistic graphical models, e.g. probabilistic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques

Definitions

  • Methods, systems, and media for recommending content items based on topics are provided. More particularly, the disclosed subject matter relates to recommending content items by modeling user interest profiles that are based on content consumption.
  • content hosting services generally attempt to present content that is interesting to their users.
  • Some content hosting services allow users to create user profiles that indicate demographic information, such as gender or age, as well as areas of interest. These content hosting services then attempt to use such user profiles to select content to provide to each of their users.
  • users may not be able to articulate their interests while populating a user profile.
  • interests may change over time and these users may not update their user profiles to reflect such changes.
  • a method for recommending content items comprising: determining, using a hardware processor, a plurality of accessed content items associated with a user, wherein each of the plurality of content items is associated with a plurality of topics; determining, using the hardware processor, the plurality of topics associated with each of the plurality of accessed content items; generating a model of user interests based on the plurality of topics, wherein the model implements a machine learning technique to determine a plurality of weights for assigning to each of the plurality of topics; applying, using the hardware processor, the model to determine, for a plurality of content items, a probability that the user would watch a content item of the plurality of content items; ranking, using the hardware processor, the plurality of content items based on the determined probabilities; and selecting a subset of the plurality of content items to recommend to the user based on the ranked plurality of content items.
  • a system for recommending content items comprising: a hardware processor that is configured to: determine a plurality of accessed content items associated with a user, wherein each of the plurality of content items is associated with a plurality of topics; determine the plurality of topics associated with each of the plurality of accessed content items; generate a model of user interests based on the plurality of topics, wherein the model implements a machine learning technique to determine a plurality of weights for assigning to each of the plurality of topics; apply the model to determine, for a plurality of content items, a probability that the user would watch a content item of the plurality of content items; rank the plurality of content items based on the determined probabilities; and select a subset of the plurality of content items to recommend to the user based on the ranked plurality of content items.
  • a computer-readable medium containing computer-executable instructions that, when executed by a processor, cause the processor to perform a method for recommending content items.
  • the method comprises: determining a plurality of accessed content items associated with a user, wherein each of the plurality of content items is associated with a plurality of topics; determining the plurality of topics associated with each of the plurality of accessed content items; generating a model of user interests based on the plurality of topics, wherein the model implements a machine learning technique to determine a plurality of weights for assigning to each of the plurality of topics; applying the model to determine, for a plurality of content items, a probability that the user would watch a content item of the plurality of content items; ranking the plurality of content items based on the determined probabilities; and selecting a subset of the plurality of content items to recommend to the user based on the ranked plurality of content items.
  • FIG. 1 is a flowchart of an illustrative process for recommending content items, such as video content items and channels that present video content items, based on topics in accordance with some embodiments of the disclosed subject matter.
  • FIG. 2 is a flowchart of an illustrative process for modeling user interests using a subset of topics (e.g., top K topics) that are selected based on an associated interest weight in accordance with some embodiments of the disclosed subject matter.
  • FIG. 3 is a flowchart of an illustrative process for modeling user interests using topic clusters including related topics, where topics associated with the content items viewed by the user are mapped to one or more topic clusters, in accordance with some embodiments of the disclosed subject matter.
  • FIG. 4 is a flowchart of an illustrative process for modeling user interests using a decision tree that groups or clusters user interest profiles associated with the same or similar topics in accordance with some embodiments of the disclosed subject matter.
  • FIG. 5 is a flowchart of an illustrative process for modeling user interests using embedding trees, where multiple topics and multiple content items are mapped into a multi-dimensional space and the embedding tree is trained such that a loss or error function is minimized, in accordance with some embodiments of the disclosed subject matter.
  • FIG. 6 is an illustrative screen of an interface that provides recommended content items to a user in accordance with some embodiments of the disclosed subject matter.
  • FIG. 7 is a diagram of an illustrative system suitable for implementation of the content recommendation system in accordance with some embodiments of the disclosed subject matter.
  • FIG. 8 is a diagram of an illustrative computing device and server as provided, for example, in FIG. 7 in accordance with some embodiments of the disclosed subject matter.
  • mechanisms for recommending content items, such as video content items and channels that provide video content items, using topics are provided.
  • These content items and channels can be presented by a content hosting service (e.g., a video hosting service) or any other suitable content database.
  • one or more topics associated with each of the content items can be determined.
  • a channel, in some embodiments, is a group of videos or video content items from a particular source. In another embodiment, a channel is a group of videos or video content items with a common attribute, such as a source, a topic, a date, etc.
  • a user interest profile can be accessed to determine which content items have been accessed by the user. For example, in response to logging into a user account, a content hosting service can associate and store information relating to content items that have been accessed by the user in a user interest profile.
  • the user interest profile can also include user interactions with the content items, such as pausing, fast forwarding, rewinding a particular content item, rating a particular content item, sharing a particular content item, the amount of time a particular content item is watched, etc.
  • a user access log can be accessed to determine which content items have been accessed by the user.
  • the topics associated with each of the content items can be determined from the user interest profile.
  • topics can be derived from metadata associated with a content item (e.g., genre, category, title, actor, director, description, etc.).
  • a knowledgebase can be queried to determine, for each accessed content item, the topics associated with each content item and/or other suitable topic information (e.g., an interest weight associated with each topic).
  • a model of user interests can be generated using one or more machine learning techniques.
  • the model of user interests derived using machine learning techniques can be used, for example, in conjunction with linear regression to provide predictions or probabilities of the likelihood that the user accesses or watches a particular content item or a particular channel that presents particular content items.
  • By ranking content items based on the determined probabilities, a subset of content items or channels presenting content items can be selected for recommending to the user.
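The ranking and selection step can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the function name `recommend`, the example probabilities, and the tie-breaking rule (by identifier) are assumptions introduced here.

```python
from typing import Dict, List


def recommend(probabilities: Dict[str, float], n: int) -> List[str]:
    """Rank content items by predicted watch probability and keep the top n.

    Ties are broken by identifier so the result is deterministic
    (an assumption; the source does not discuss ties).
    """
    ranked = sorted(probabilities.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item_id for item_id, _ in ranked[:n]]


# Hypothetical probabilities produced by some model of user interests.
scores = {"video_id:1234": 0.12, "video_id:567": 0.81, "video_id:89": 0.40}
top_two = recommend(scores, 2)
```

The same function applies unchanged when the keys are channel identifiers rather than video identifiers.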
  • a model of user interests can be generated using a machine learning approach, where a subset of topics, such as the top K topics, from a user interest profile can be determined and a conjunction of the subset of topics and the content items associated with a content hosting service can be determined.
  • a model of user interests can be generated by using a hierarchical agglomerative clustering approach to create topic clusters of related topics, where a user interest profile of topics can be mapped to one or more topic clusters and a conjunction of those topic clusters and the content items associated with a content hosting service can be determined.
  • a model of user interests can be generated using vector clustering approaches.
  • these mechanisms can be used in a variety of applications. For example, these mechanisms can allow a content hosting service, using these models of user interests, to propagate interests in a content item or a channel that presents content items from a single user or a group of users to other users, thereby generalizing the content recommendation system beyond the preferences of a single user. In another example, these mechanisms can allow a content hosting service to recommend content items or a channel that presents content items to a user that is unaware of the availability of such a content item or channel.
  • these content recommendation mechanisms can be implemented in a product search domain or commerce domain, where one or more topics associated with a user are determined from product metadata relating to products viewed by a user and a model of user interests is generated based on those topics.
  • these content recommendation mechanisms can be implemented in a music search domain, where one or more topics associated with a user are determined from structured music metadata relating to media accessed by the user and a model of user interests can be generated based on those topics.
  • the content recommendation mechanisms can be used in any suitable recommendation system for providing recommended content items.
  • FIG. 1 is a flow chart of an illustrative process 100 for providing a content recommendation system that recommends content items based on topics in accordance with some embodiments of the disclosed subject matter.
  • Process 100 can begin by determining the content items that have been accessed by a user. For example, in response to detecting that a user has logged into a content hosting service (e.g., by entering a username and password), information relating to the content items that the user has accessed can be stored in a user interest profile.
  • the content recommendation system can access the user interest profile to determine the content items that have been accessed by the user.
  • the content recommendation system can transmit a query to a user access log associated with the user to determine the content items that have been accessed by the user.
  • the content recommendation system can retrieve multiple content identifiers associated with the content items accessed by the user (e.g., video identifiers, such as “video_id:1234,” that are associated with each video watched by the user).
  • the users can be provided with an opportunity to control whether programs or features collect user information (e.g., information about content items accessed by a user, a user's interactions with content items, a user's social network, social actions or activities, profession, a user's preferences, or a user's current location), or to control whether and/or how to receive content from the content server that may be more relevant to the user.
  • certain data may be treated in one or more ways before it is stored or used, so that personally identifiable information is removed.
  • a user's identity may be treated so that no personally identifiable information can be determined for the user, or a user's geographic location may be generalized where location information is obtained (such as to a city, zip code, or state level), so that a particular location of a user cannot be determined.
  • the user may have control over how information is collected about the user and used by a content server.
  • the content recommendation system can determine one or more topics associated with each of the content items accessed by the user. For example, a video content item relating to the supernova SN 1987A can have the topics “SN_1987A” and “astronomy.”
  • the user interest profile can include the content items accessed by the user and a list of the one or more topics associated with each of the content items accessed by the user.
  • each content item on a content hosting service can be annotated with entities or topics created by a content source.
  • the topics associated with the content items can be created by a community of users in a collaborative knowledgebase, such as Freebase.
  • the topics associated with the content items can be generated based on content metadata, such as unique terms derived from video titles, actor names, character names, director names, genres or category information, etc.
  • the content recommendation system can transmit a query to a content database (e.g., a video database, a knowledgebase having entity information or topic information, etc.) to determine, for each of the accessed content items, the topics associated with each content item.
  • the topics associated with a content item can be stored along with other metadata associated with the content item.
  • the topics and/or other information relating to the topics can be stored in a separate database.
  • an interest weight can be associated with each topic.
  • a topic corresponding to an accessed content item can be associated with an interest weight that represents the strength or degree of association of the topic for a given content item.
  • the interest profile can, in some embodiments, be represented as a weighted sum of all of the topics associated with the content items accessed by the user.
  • the interest weight for a particular topic in a user interest profile can be a product, sum, average, or other arithmetic function of the interest weight normalized by the frequency of the topic in the user interest profile.
  • the interest weight for the topic “astronomy” can be high in comparison to the interest weights associated with other topics when the topic “astronomy” is a frequent topic of content items accessed by the user.
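One way to build such a weighted profile is sketched below. It assumes each accessed content item arrives as a mapping from topic to interest weight, sums the weights per topic, and normalizes by the total; the function name, the input shape, and the choice of sum-then-normalize are illustrative assumptions, since the text allows a product, sum, average, or other arithmetic function.

```python
from collections import defaultdict
from typing import Dict, List


def build_interest_profile(accessed: List[Dict[str, float]]) -> Dict[str, float]:
    """Aggregate per-item topic weights into one normalized user profile."""
    totals: Dict[str, float] = defaultdict(float)
    for item_topics in accessed:
        for topic, weight in item_topics.items():
            totals[topic] += weight  # frequent topics accumulate more weight
    norm = sum(totals.values())
    # Normalize so the weights sum to 1 (assumed normalization).
    return {t: w / norm for t, w in totals.items()} if norm else {}


# Hypothetical viewing history: each dict holds the topics of one watched video.
watched = [
    {"SN_1987A": 0.9, "astronomy": 0.6},
    {"astronomy": 0.8},
    {"cooking": 0.5},
]
profile = build_interest_profile(watched)
```

Because "astronomy" appears in two of the three items, it ends up with the largest weight in `profile`, matching the behavior described above.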
  • the content recommendation system can generate a model of user interests using one or more machine learning techniques at 130.
  • a feature can be represented by feature-value pairs—i.e., a feature template and a feature value.
  • the feature for a particular video content item can have the value “1234” or video_id:1234.
  • the feature video_id:1234 can model the global popularity of a video content item having the video identifier “1234.” When used by itself, such a model determines the probability that a user watches the video content item having the video identifier “1234.”
  • a feature can be used to model a channel (channel_id).
  • the feature channel_id:ABC123 can model the global popularity of video content items presented by the channel having the channel identifier “ABC123.” Again, when used by itself, such a model determines the probability that a user watches video content items presented by the channel having the channel identifier “ABC123.”
  • user modeling can include one or more input features relating to the user.
  • a user feature such as user_id
  • the feature user_id:XYZ can determine the probability that a user having a user identifier “XYZ” watches any video content item provided by a video hosting service.
  • a model for a more discerning user can, for example, determine that the probability that the user accesses any content item provided by a content hosting service is low.
  • Another input feature relating to the user can include a geographical location.
  • a user feature such as user_geo
  • the feature user_geo:GBR can determine the probability that users located in Great Britain (GBR) watch any video content item provided by a video hosting service.
  • features can be used in conjunction with other features, where the conjunction is sometimes referred to herein as the symbol.
  • a covariance matrix can be generated by taking the outer product between the features—e.g., the sum of the products of the values associated with the features over different instances or sub-portions.
  • the user feature user_geo for geographical location can be used in conjunction with the feature channel_id for a particular channel to model regional preferences (e.g., USA, GBR, CAN, etc.) for the particular channel or channels.
  • the feature template can be represented by “user_geo channel_id.”
  • One illustrative instance of the conjunction can be “user_geo:GBR channel_id:ABC123,” which models the likelihood that a user from Great Britain (GBR) watches a video content item presented by a channel having a channel identifier “ABC123.”
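Feature templates and their conjunctions can be sketched as below. This is an assumption-laden illustration: a template is modeled as a tuple of feature names, and a conjunction instance is rendered by joining the `name:value` pairs with a space, mirroring the strings shown in the text. The helper name `make_features` and the event dictionary are hypothetical.

```python
from typing import Dict, List, Tuple


def make_features(event: Dict[str, str],
                  templates: List[Tuple[str, ...]]) -> List[str]:
    """Instantiate each feature template against one viewing event."""
    features = []
    for template in templates:
        # Skip templates that reference a feature missing from the event.
        if all(name in event for name in template):
            features.append(" ".join(f"{name}:{event[name]}" for name in template))
    return features


event = {"user_geo": "GBR", "channel_id": "ABC123", "video_id": "1234"}
templates = [("video_id",), ("channel_id",), ("user_geo", "channel_id")]
features = make_features(event, templates)
```

Single-name templates yield the global-popularity features, while the two-name template yields the regional-preference conjunction.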
  • these input features and conjunctions can be used to model users. More particularly, upon determining the list of entities and topics in which a user has shown interest based on viewing history (e.g., the content items accessed by the user, the content items watched for a substantial period of time by the user, the content items for which the user has provided a favorable indication, the content items that the user has shared with other users, etc.), the content recommendation system can use various input features and conjunctions to generate an interest model for a user using one or more machine learning techniques.
  • Illustrative examples of the one or more machine learning techniques that can be used to generate a model of user interests are described herein in connection with FIGS. 2-5.
  • the content recommendation system can generate a model of user interests using a direct interest modeling approach. For example, the content recommendation system can determine a subset of topics in a user interest profile and determine the conjunction of the subset of topics and the content items associated with a content hosting service.
  • FIG. 2 is a flowchart of an illustrative process for modeling user interests using a subset of topics (e.g., top K topics) based on an associated interest weight in accordance with some embodiments of the disclosed subject matter.
  • the content recommendation system can access a user interest profile of topics associated with the user.
  • the topics can be based on the content items accessed by the user and, in some implementations, an interest weight can be associated with each of the topics in the user interest profile. For example, an interest weight for the topic “science” can be high compared to other interest weights in response to determining that the user associated with the user interest profile watches a substantial number of video content items that are annotated with the topic “science.”
  • the content recommendation system can determine a subset of topics from the multiple topics in the user interest profile. In some embodiments, the content recommendation system can determine the subset of topics based on the associated interest weight. For example, the content recommendation system can determine the top fifty topics in the user interest profile and determine the conjunction of the top fifty topics and the content items provided by the content hosting service. In another example, the content recommendation system can select a particular subset of topics from the user interest profile based on whether the associated interest weight is greater than a particular threshold value. In such an example, the threshold value can be determined such that at least a given number of topics have been selected. Additionally or alternatively, the threshold value can be adjusted by an administrative user using the content recommendation system (e.g., after inspecting the data, after inspecting the associated interest weights, etc.).
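Both selection rules, top K by interest weight and thresholding with a minimum count, can be sketched as follows. The function names and the fallback strategy (reverting to the top `min_count` topics when too few pass the threshold) are assumptions made for illustration.

```python
from typing import Dict, List


def top_k_topics(profile: Dict[str, float], k: int) -> List[str]:
    """Return the k topics with the highest interest weight."""
    ranked = sorted(profile.items(), key=lambda kv: (-kv[1], kv[0]))
    return [topic for topic, _ in ranked[:k]]


def topics_above(profile: Dict[str, float], threshold: float,
                 min_count: int = 0) -> List[str]:
    """Return topics whose weight exceeds the threshold.

    If fewer than min_count topics qualify, fall back to the top
    min_count topics, which is equivalent to lowering the threshold.
    """
    selected = [t for t, w in profile.items() if w > threshold]
    if len(selected) < min_count:
        return top_k_topics(profile, min_count)
    return sorted(selected, key=lambda t: (-profile[t], t))


# Hypothetical user interest profile.
profile = {"science": 0.5, "astronomy": 0.3, "cooking": 0.15, "golf": 0.05}
```

For instance, `top_k_topics(profile, 2)` and `topics_above(profile, 0.2)` select the same two topics here, while `topics_above(profile, 0.4, min_count=3)` widens the selection to three.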
  • the content recommendation system can determine the conjunction of the subset of topics and the content items provided by the content hosting service to model the interaction of user interests and content items. For example, given a set of feature vectors representing a subset of topics, a model can be generated that indicates the probability that the user associated with the user interest profile watches a content item.
  • if a video content item having a video identifier “567” (video_id:567) relates to astronomy and users with the topic “science” (user_topic:science) in their user interest profile watch or are highly likely to watch the video content item, the instance for the conjunction “user_topic:science video_id:567” can have a high weight or probability value.
  • although the embodiments described herein generally relate to determining the conjunction between content items and a topic, a topic cluster, or any other suitable representation derived from topic information, this is merely illustrative.
  • the content recommendation system can determine the conjunction between topics (e.g., user_topics) and a particular channel (e.g., channel_id) to model the interaction of user interests and a particular channel.
  • This conjunction can be represented as, for example, “user_topic channel_id.”
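A sketch of how learned conjunction weights could be turned into a watch probability is given below. The weights are hand-set for illustration rather than learned, and squashing the summed score with a logistic function is an assumption; the text only states that the model yields a probability.

```python
import math
from typing import Dict, Iterable


def watch_probability(user_topics: Iterable[str], video_id: str,
                      weights: Dict[str, float], bias: float = 0.0) -> float:
    """Score one (user, video) pair from topic-by-video conjunction weights."""
    score = bias + sum(
        weights.get(f"user_topic:{topic} video_id:{video_id}", 0.0)
        for topic in user_topics
    )
    # Logistic squashing (assumed) maps the score into (0, 1).
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical weights keyed by conjunction instance.
weights = {
    "user_topic:science video_id:567": 2.0,
    "user_topic:cooking video_id:567": -1.0,
}
p_science = watch_probability(["science"], "567", weights)
p_cooking = watch_probability(["cooking"], "567", weights)
```

Replacing `video_id` with `channel_id` in the key gives the corresponding topic-by-channel conjunction.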
  • although the illustrative example shown in FIG. 2 accesses a user interest profile associated with a user and generates a model of user interests based on at least a portion of the topics from the user interest profile, any suitable number of user interest profiles can be used.
  • the content recommendation system can determine that a user has similar interests to a group of users.
  • the content recommendation system can access the user interest profiles associated with the user and each user in the group of users and determine a subset of topics that are common to the multiple users and that have interest weights greater than a particular threshold value. This subset of topics can then be used in conjunction with content items that are provided by the content hosting service.
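Selecting the topics shared by a group of users can be sketched as follows, assuming each profile is a mapping from topic to interest weight and that a topic must exceed the threshold in every profile. The function name and the "every profile" interpretation are assumptions.

```python
from typing import Dict, List, Set


def common_topics(profiles: List[Dict[str, float]], threshold: float) -> Set[str]:
    """Topics present in all profiles with weight above the threshold in each."""
    if not profiles:
        return set()
    shared = set(profiles[0])
    for profile in profiles[1:]:
        shared &= set(profile)  # keep only topics every user has
    return {t for t in shared if all(p[t] > threshold for p in profiles)}


# Hypothetical profiles for a user and two similar users.
group = [
    {"science": 0.6, "golf": 0.4},
    {"science": 0.5, "astronomy": 0.5},
    {"science": 0.7, "golf": 0.3},
]
shared = common_topics(group, 0.4)
```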
  • the content recommendation system can generate a model of user interests using a topic clustering approach. For example, the content recommendation system can identify one or more topic clusters of related topics that are similar to the topics included in a user interest profile and determine the conjunction of the one or more topic clusters and the content items associated with a content hosting service. As opposed to directly modeling the topics associated with a user in a user interest profile, the content recommendation system can map the topics in the user interest profile to one or more topic clusters and generate an interest model based on the topic clusters.
  • FIG. 3 is a flowchart of an illustrative process for modeling user interests using topic clusters in accordance with some embodiments of the disclosed subject matter.
  • the content recommendation system can access a user interest profile of topics associated with the user.
  • the topics can be based on the content items accessed by the user and, in some implementations, an interest weight can be associated with each of the topics in the user interest profile.
  • the content recommendation system can determine, for each topic, one or more related topics. Any suitable approach can be used for determining related topics. For example, the content recommendation system can use the co-occurrence of topics in user interest profiles to determine which topics are related to each other. In this example, the content recommendation system can determine across multiple user interest profiles (e.g., all of the user interest profiles in the content hosting service), pairs of topics that co-occur in at least a portion of the user interest profiles, and determine a measure of co-occurrence or any other suitable distance measure. The distance measure can, for example, indicate how often video content items on two co-occurring topics have been accessed by a user.
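The co-occurrence approach can be sketched as below. The specific distance, one minus the fraction of profiles in which both topics appear, is an assumption chosen for simplicity; the text leaves the measure open.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, Iterable, List, Tuple


def cooccurrence_distances(
    profiles: List[Iterable[str]],
) -> Dict[Tuple[str, str], float]:
    """Distance between topic pairs based on how often they co-occur."""
    pair_counts: Counter = Counter()
    for topics in profiles:
        # Sorted pairs give each unordered pair a single canonical key.
        for a, b in combinations(sorted(set(topics)), 2):
            pair_counts[(a, b)] += 1
    n = len(profiles)
    return {pair: 1.0 - count / n for pair, count in pair_counts.items()}


# Hypothetical topic sets drawn from four user interest profiles.
profiles = [
    {"astronomy", "physics"},
    {"astronomy", "physics", "cooking"},
    {"cooking", "baking"},
    {"astronomy", "physics"},
]
distances = cooccurrence_distances(profiles)
```

Pairs that never co-occur are simply absent from the result and can be treated as maximally distant by the caller.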
  • the content recommendation system can determine related topics by querying an alternate source that specifies relationships between entities or topics, such as a knowledgebase (e.g., Freebase). For example, in response to transmitting a query that includes a topic (e.g., topic:SN_1987A) to a knowledgebase, the content recommendation system can receive a related topic graph that includes a set of related topic nodes and distance measures between the various nodes. The content recommendation system can use the related topic graph for determining topic clusters, where neighboring nodes to a topic node are semantically more related than nodes with a larger distance measure from the topic node.
  • the content recommendation system can cluster each topic with one or more related topics based on the distance measure.
  • the content recommendation system can create topic clusters using any suitable clustering approaches, such as a hierarchical agglomerative clustering approach.
  • the content recommendation system can identify the topic clusters having a distance measure less than a given threshold value (e.g., grouping topics together that are closer in distance) and combine the two or more topics into a topic cluster.
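A threshold-based hierarchical agglomerative clustering can be sketched in plain Python as follows. Single linkage (the smallest pairwise distance between two clusters) is an assumption, as is treating topic pairs with no recorded distance as infinitely far apart; the distance table is hypothetical.

```python
from typing import Dict, List, Set, Tuple


def agglomerative_clusters(topics: List[str],
                           distance: Dict[Tuple[str, str], float],
                           threshold: float) -> List[Set[str]]:
    """Repeatedly merge the two closest clusters while they are closer than threshold."""
    clusters: List[Set[str]] = [{t} for t in topics]

    def linkage(c1: Set[str], c2: Set[str]) -> float:
        # Single linkage; pairs missing from the table count as infinitely far.
        return min(distance.get(tuple(sorted((a, b))), float("inf"))
                   for a in c1 for b in c2)

    while len(clusters) > 1:
        d, i, j = min((linkage(clusters[i], clusters[j]), i, j)
                      for i in range(len(clusters))
                      for j in range(i + 1, len(clusters)))
        if d >= threshold:
            break  # nothing left that is close enough to merge
        clusters[i] |= clusters[j]
        del clusters[j]
    return clusters


# Hypothetical pairwise distances, keyed by alphabetically sorted topic pairs.
distance = {
    ("astronomy", "physics"): 0.25,
    ("astronomy", "cooking"): 0.75,
    ("baking", "cooking"): 0.3,
    ("cooking", "physics"): 0.75,
}
clusters = agglomerative_clusters(
    ["astronomy", "physics", "cooking", "baking"], distance, threshold=0.5)
```

With a threshold of 0.5 the four topics collapse into two clusters; raising the threshold above 0.75 would merge them into one.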
  • the content recommendation system can generate an interest weight for the topic cluster.
  • the interest weight for the topic cluster can combine the interest weights associated with each of the topics by, for example, adding, multiplying, averaging, or applying another arithmetic or statistical function on the interest weights.
  • the interest weight for the topic cluster can be normalized by, for example, the frequency of the co-occurring topics in the user interest profiles.
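Combining member-topic weights into a cluster weight can be sketched as below. The three combiners correspond to the adding, averaging, and multiplying options named above; the function name, the `how` parameter, and the handling of topics without a weight (they are ignored) are assumptions. Normalization by co-occurrence frequency is omitted.

```python
import math
from statistics import mean
from typing import Dict, Iterable

# Supported ways of combining the member-topic weights.
COMBINERS = {"sum": sum, "mean": mean, "product": math.prod}


def cluster_weight(cluster: Iterable[str], topic_weights: Dict[str, float],
                   how: str = "sum") -> float:
    """Combine the interest weights of a cluster's topics into one weight."""
    values = [topic_weights[t] for t in cluster if t in topic_weights]
    return COMBINERS[how](values) if values else 0.0


# Hypothetical per-topic interest weights.
topic_weights = {"astronomy": 0.5, "physics": 0.25, "cooking": 0.25}
space_cluster = {"astronomy", "physics"}
```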
  • the content recommendation system can map the user interest profile (that includes topics associated with content items accessed by the user) with the topic clusters to determine user cluster features. For example, the content recommendation system can determine a topic cluster corresponding to each topic in the user interest profile. As a result, the content recommendation system can determine multiple topic clusters of related topics for association with a user interest profile.
  • topic clusters can be stored in the user interest profile as opposed to the topics associated with content items accessed by the user.
  • topics associated with content items accessed by the user and topic clusters can be stored in the user interest profile.
  • the content recommendation system can determine the conjunction of the user cluster features (e.g., the topic clusters associated with the user interest profile) with content items provided by a content hosting service. That is, the content recommendation system can generate a matrix or any other suitable data representation that provides probabilities that a user of a user interest profile watches content items based on the user cluster features. Alternatively, in some embodiments, the content recommendation system can select a subset of user cluster features for modeling user interests (e.g., the topic clusters that have an interest weight greater than a particular threshold value, the topic cluster that has the highest interest weight, etc.).
  • the content recommendation system can also determine the conjunction between user cluster features and a particular channel (e.g., channel_id) to model the interaction of user interests and a particular channel.
  • This conjunction can be represented as, for example, “user_clustered topic channel_id.”
  • the content recommendation system can generate a model of user interests using unsupervised decision trees. For example, the content recommendation system can generate a decision tree that identifies user interest profiles having similar topics and determine the conjunction of the topics shared by those user interest profiles and the content items associated with a content hosting service. This can, for example, model the collective interests or topics of users whose user interest profiles are grouped in the same leaf node.
  • FIG. 4 is a flowchart of an illustrative process for modeling user interests using a decision tree that groups or clusters user interest profiles having similar topics in accordance with some embodiments of the disclosed subject matter.
  • the content recommendation system can access a user interest profile of topics associated with the user. As described above, the topics are based on the content items accessed by the user.
  • the content recommendation system can access other user interest profiles (e.g., all of the user interest profiles in the content hosting service), where each user interest profile includes topics associated with the content items accessed by that user.
  • a vector clustering approach such as a k-means approach, can be used to form a community of user interest profiles. It should be noted that, in some implementations, the content recommendation system can replace the Euclidean distance calculation in the vector clustering approach with a dot product calculation.
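  • For illustration, a k-means style loop in which the Euclidean distance is replaced with a dot product could be sketched as follows. The vector layout (one weight per topic), the initial centroids, and the function names are hypothetical, and handling an empty cluster by keeping its previous centroid is an assumption.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def assign(profiles, centroids):
    """Assign each profile vector to the centroid with the largest dot product."""
    return [
        max(range(len(centroids)), key=lambda k: dot(p, centroids[k]))
        for p in profiles
    ]


def kmeans_dot(profiles, centroids, iterations=10):
    """Alternate assignment and centroid update until the centroids stop moving."""
    labels = assign(profiles, centroids)
    for _ in range(iterations):
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, label in zip(profiles, labels) if label == k]
            if members:
                new_centroids.append(
                    [sum(col) / len(members) for col in zip(*members)]
                )
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
        labels = assign(profiles, centroids)
    return labels, centroids


# Hypothetical profile vectors over the topics [science, astronomy, wrestling].
profiles = [[1, 1, 0], [1, 0, 0], [0, 0, 1], [0, 1, 1]]
labels, centroids = kmeans_dot(profiles, centroids=[[1, 0, 0], [0, 0, 1]])
```

  • Because an unnormalized dot product favors vectors with large magnitudes, an implementation along these lines might normalize the profile vectors first; that choice is left open here.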
  • a decision tree can be constructed that determines the similarities between a user interest profile and other user interest profiles. The content recommendation system can use a decision tree algorithm in which user interest profiles are stored based on topic feature values so that nodes of the decision tree represent a feature that is being classified and branches of the tree represent a value that the node may assume. Results may be classified by traversing the decision tree from the root node through the tree and sorting the nodes using their respective values.
  • a binary decision tree can be constructed such that, at any node, if the dot product between the node and the user interest profile is non-zero, the node is placed under a right sub-tree. Otherwise, the left child is chosen and the node is placed under a left sub-tree.
  • a top level or root node can contain the topics “Backyard Wrestling,” “Wrestling,” and “Beer.” If any user interest profile contains one or more of the topics in the top level node, the user interest profile is placed under the right sub-tree. This can, for example, partition user interest profiles into clusters that tend to contain users with similar interests and similar topics. In some embodiments, the nodes placed under the left sub-tree can be removed or otherwise excluded from generating the model of user interests.
  • the content recommendation system can select a splitting node such that the population between right nodes and left nodes is generally balanced.
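  • The split described above can be sketched as follows, treating each user interest profile as a set of topics so that a non-zero dot product of binary topic vectors is equivalent to a non-empty set intersection. The user identifiers, the candidate nodes other than the root, and the balance criterion (smallest difference in population) are assumptions made for illustration.

```python
def split_profiles(node_topics, profiles):
    """Place a profile on the right if it shares any topic with the node
    (a non-zero dot product of binary topic vectors), otherwise on the left."""
    left, right = {}, {}
    for user, topics in profiles.items():
        (right if node_topics & topics else left)[user] = topics
    return left, right


def most_balanced_split(candidates, profiles):
    """Pick the candidate node whose left and right populations differ the least."""
    def imbalance(node_topics):
        left, right = split_profiles(node_topics, profiles)
        return abs(len(left) - len(right))

    return min(candidates, key=imbalance)


# Hypothetical user interest profiles, each reduced to a set of topics.
profiles = {
    "u1": {"Wrestling", "Beer"},
    "u2": {"Backyard Wrestling"},
    "u3": {"Astronomy", "Science"},
    "u4": {"Science", "Beer"},
}
root = frozenset({"Backyard Wrestling", "Wrestling", "Beer"})
left, right = split_profiles(root, profiles)

# Compare the root against another hypothetical candidate node.
best = most_balanced_split([root, frozenset({"Science"})], profiles)
```

  • In this example the root node sends three of the four profiles to the right sub-tree, whereas the "Science" candidate divides them evenly and would therefore be preferred under the assumed balance criterion.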
  • the content recommendation system can obtain the portion of the decision tree that includes similar user interest profiles and determine the conjunction of this portion of the decision tree and the content items provided by a content hosting service. For example, the content recommendation system can use the right sub-tree of the decision tree to create a feature that models the collective interests of users having user interest profiles that are clustered in the same leaf node. The content recommendation system can determine the conjunction of topics in the user interest profiles of a group of users in the same leaf node and the content items provided by a content hosting service. In a more particular example, this conjunction can be represented as “unsupervised_decision_tree_leaf video_id” or “unsupervised_decision_tree_leaf channel_id.”
  • one or more aspects of the decision tree described above in connection with FIG. 4 can be constructed using any suitable iterative process.
  • a structure of the decision tree can be selected in an initial iteration and tested against training data from user interest profiles. The results of such testing with training data can be used to generate another structure of the decision tree in another iteration.
  • the content recommendation system can model user interests using a boosted decision tree that jointly clusters content features and user features. For example, the content recommendation system can determine that users interested in classical music (e.g., from the user interest profile, from preferences inputted by the user, etc.) are likely to watch video content items associated with the topic “Mozart” (topic:Mozart) but only if the sound track provided in the video content items is high quality or high fidelity.
  • a boosting approach can be used to build a strong classifier by combining multiple weak classifiers and a boosted decision tree approach can be a combination of the decision tree approach and the boosting approach.
  • the content recommendation system can use a boosted decision tree approach, where the above-mentioned k-means clustering approach is used to split user interest profiles into a right sub-tree of similar user interest profiles and a left sub-tree of other user interest profiles.
  • the content recommendation system can use the leaf node of similar user interest profiles as weak learners.
  • the content recommendation system can then apply user features for training.
  • the quality of an audio track in a video content item can be used to train a joint model of user and content features.
  • user features can be used to filter out candidate content items to recommend to the user (e.g., a particular content item includes a low-quality audio track that would not be of interest to the user).
  • the content recommendation system can generate a model of user interests by mapping content items and topics into a multi-dimensional space. For example, in some implementations, the content recommendation system can generate a decision tree that identifies user interest profiles having similar topics and determine the conjunction of the topics shared by those user interest profiles and the content items associated with a content hosting service.
  • FIG. 5 is a flowchart of an illustrative process for modeling user interests using latent embedding, where multiple topics and multiple content items are mapped into a multi-dimensional space and a user interest model is trained such that a loss or error function is minimized, in accordance with some embodiments of the disclosed subject matter.
  • the content recommendation system can map each topic of a plurality of topics into a multi-dimensional space.
  • each topic can be initially assigned a random location in a 64-dimensional space.
  • the number of dimensions of the multi-dimensional embedding space can be determined based upon any suitable criterion, such as available computing resources, size of the training database, etc.
  • the content recommendation system can also map each content feature of a plurality of content features into the multi-dimensional space.
  • mapping functions that map each type of data item to the joint embedding space can be learned at 530 .
  • the content recommendation system can iteratively select sets of embedded training items and determine if the distances between the selected items based on their current locations in the joint embedding space correspond to the known relationships between those items.
  • an error function such as a hinge ranking loss function, can be calculated such that if A and B are topics associated with a user interest profile, the dot product of A and B is closer than another topic, C. This can be represented, for example, as:
  • mappings and locations of the topics in the multi-dimensional space are adjusted such that their locations relative to each other are improved.
  • an error function can be calculated such that if A is a topic associated with a video content item V and C is another topic that is not associated with the video content item V, the dot product of topic A and video content item V is closer than topic C and video content item V.
  • This can be represented, for example, as:
  • examples of functions that may be used as the error function in the training phase include the standard margin ranking loss (AUC) function and the weighted approximate-ranked pairwise (WARP) loss function.
  • the content recommendation system can use a WARP function to select negative examples, such as topic C. That is, samples are drawn at random, and a gradient step can be made for each sample to minimize loss. Due to the cost of computing the exact rank, the content recommendation system can approximate the exact rank by sampling. For a given positive label, the content recommendation system draws negative labels until a violating label is found, where the rank is then approximated.
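  • By way of a hedged sketch, a hinge ranking loss over dot products and a WARP-style search for a violating negative example could be written as shown below. The margin of 1.0, the harmonic rank weighting, sampling without replacement, and all function and variable names are assumptions based on the general WARP technique and are not details stated in this disclosure.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def hinge_ranking_loss(anchor, positive, negative, margin=1.0):
    """Zero once the positive pair out-scores the negative pair by `margin`."""
    return max(0.0, margin - dot(anchor, positive) + dot(anchor, negative))


def warp_sample(anchor, positive, negatives, rng, margin=1.0):
    """Draw negatives in random order until one violates the margin.

    Returns (violating_negative, rank_weight), or (None, 0.0) when no
    violator exists. The rank is approximated from the number of draws.
    This sketch samples without replacement to keep the loop finite.
    """
    candidates = list(negatives)
    rng.shuffle(candidates)
    for trials, negative in enumerate(candidates, start=1):
        if hinge_ranking_loss(anchor, positive, negative, margin) > 0.0:
            approx_rank = len(candidates) // trials
            # Weight grows with the estimated rank (a harmonic sum).
            weight = sum(1.0 / i for i in range(1, approx_rank + 1))
            return negative, weight
    return None, 0.0


# Hypothetical two-dimensional embeddings for topics A, B, and two candidates for C.
topic_a, topic_b = [1.0, 0.0], [1.0, 0.0]
far_c, near_c = [0.0, 1.0], [0.5, 0.0]

satisfied = hinge_ranking_loss(topic_a, topic_b, far_c)
violated = hinge_ranking_loss(topic_a, topic_b, near_c)
violator, weight = warp_sample(topic_a, topic_b, [far_c, near_c], random.Random(0))
```

  • In a training loop, the returned weight would typically scale the gradient step taken on the violating triple, so that positives estimated to rank poorly receive larger updates.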
  • the content recommendation system can obtain a trained joint embedding space.
  • the content recommendation system can access a user interest profile of topics associated with the user.
  • the topics can be based on the content items accessed by the user.
  • the content recommendation system can project the topics from the user interest profile into the trained joint embedding space.
  • the new items including the topics from the user interest profile can be embedded in the trained joint embedding space at a location determined based upon the learned mapping function for topics.
  • One or more associations between the newly embedded item and the previously embedded items can be determined based on a distance measure.
  • the content recommendation system can apply one or more techniques, such as Principal component analysis (PCA), Singular Value Decomposition (SVD), and Gaussian mixture modeling (GMM), to determine content items that are close in proximity.
  • the content recommendation system can use the learned joint embedding space to determine user preference for a topic, a video, or a channel. For example, the content recommendation system can project these items into the joint embedding space and calculate the dot product. In this example, a higher dot product indicates a higher likelihood that the user associated with the user interest profile watches a particular content item or a channel that provides content items.
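  • The dot product scoring described above might be sketched as follows. Representing a user by the average of the embeddings of the topics in the user interest profile is an assumed pooling choice, and the two-dimensional embeddings, identifiers, and function names are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def user_vector(topic_embeddings, profile_topics):
    """Average the embeddings of the profile's topics (an assumed pooling choice)."""
    vectors = [topic_embeddings[t] for t in profile_topics]
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def rank_items(user_vec, item_embeddings):
    """Sort items by dot product with the user vector, highest first."""
    scores = {item: dot(user_vec, vec) for item, vec in item_embeddings.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical trained embeddings in a two-dimensional joint space.
topic_embeddings = {
    "science": [1.0, 0.0],
    "astronomy": [0.8, 0.2],
    "wrestling": [0.0, 1.0],
}
item_embeddings = {
    "video_supernova": [0.9, 0.1],
    "video_wrestling": [0.1, 0.9],
}

user_vec = user_vector(topic_embeddings, ["science", "astronomy"])
ranked = rank_items(user_vec, item_embeddings)
```

  • Here the astronomy-related video receives the higher score for a profile containing "science" and "astronomy". Channel embeddings could be scored in the same way by passing them in place of item_embeddings.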
  • the content recommendation system can provide a predictive model of user interests that can be used to determine a recommendation of one or more content items to a user. Based on features relating to topics and features relating to content items, the predictive model can determine a set of probabilities that represents the interaction between user interests and content items. For example, if a video content item relates to astronomy and users with the topic “science” in their user interest profile watch or are highly likely to watch the video content item, the instance for the conjunction of the topic and the video content item can have a high weight or probability value.
  • the content recommendation system can rank multiple content items based on their determined probabilities at 150 .
  • the content recommendation system can use the model of user interests to generate a ranked list of candidate video items.
  • the content recommendation system can use the model of user interests to generate a list of candidate video items sorted by watch probability and then re-rank the list using additional criteria, such as view count, ratings, recency (e.g., upload date), sharing activity, etc.
  • the content recommendation system can select a subset of the content items from the ranked list to recommend to the user. For example, upon determining content items accessed by the user (e.g., using a user interest profile), determining topics associated with the accessed content items, generating a model of user interests based on the determined topics, and generating a list of content items that are ranked by watch probabilities derived using the model, the content recommendation system can select a particular number of content items from the ranked list to recommend to the user.
  • a content hosting service can indicate a number of recommendations that can be provided on an interface. As shown in FIG.
  • the content hosting service can request that a particular number of recommended content items (e.g., eight) are provided in region 610 .
  • any suitable information can be displayed to provide the user with the recommended content items—e.g., a thumbnail or image, a title or a portion of a title, and/or a playback time associated with the recommended content item.
  • the content recommendation system can determine whether one or more of the content items in the ranked list have been previously accessed by the user. For example, the content recommendation system can inhibit the recommendation of a content item that has been previously watched by the user (e.g., based on information from a user access log). In another example, the content recommendation system can adjust the ranking associated with the previously watched content item (e.g., lower the ranking of the previously watched content).
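  • As one possible illustration of the ranking and selection steps, the following sketch ranks content items by watch probability, either inhibits or demotes previously watched items, and keeps a fixed number of recommendations. The probability values, the item identifiers, and the multiplicative penalty used for demotion are assumptions.

```python
def recommend(watch_probability, watched, limit, demote=False, penalty=0.5):
    """Rank by watch probability, drop or demote watched items, keep `limit`."""
    scored = {}
    for item, probability in watch_probability.items():
        if item in watched:
            if not demote:
                continue  # inhibit previously watched items entirely
            probability *= penalty  # or lower their ranking instead
        scored[item] = probability
    ranked = sorted(scored, key=scored.get, reverse=True)
    return ranked[:limit]


# Hypothetical watch probabilities produced by a model of user interests.
watch_probability = {"v1": 0.9, "v2": 0.7, "v3": 0.4, "v4": 0.2}

top = recommend(watch_probability, watched={"v1"}, limit=2)
demoted = recommend(watch_probability, watched={"v1"}, limit=2, demote=True)
```

  • A re-ranking pass using additional criteria such as view count or recency could be applied to the scored items before the final slice is taken.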
  • FIG. 7 is a generalized schematic diagram of a content recommendation system in accordance with some implementations of the disclosed subject matter.
  • system 700 can include one or more computing devices 702 , such as a user computing device for providing search queries for content items and/or obtaining and playing back content items from a content hosting service, a tablet computing device for transmitting user instructions to a television device, etc.
  • computing device 702 can be implemented as a personal computer, a tablet computing device, a personal digital assistant (PDA), a portable email device, a multimedia terminal, a mobile telephone, a gaming device, a set-top box, a television, a smart television, etc.
  • computing device 702 can include a storage device, such as a hard drive, a digital video recorder, a solid state storage device, a gaming console, a removable storage device, or any other suitable device for storing media content, entity tables, entity information, metadata relating to a particular search domain, etc.
  • computing device 702 can include a second screen device.
  • a second screen device can present the user with recommended content items based on topics and, in response to receiving a user selection of one of the recommended content items, can transmit playback instructions for a user-selected content item to a television device.
  • Computing devices 702 can be local to each other or remote from each other. For example, when one computing device 702 is a television and another computing device 702 is a second screen device (e.g., a tablet computing device, a mobile telephone, etc.), the computing devices 702 may be located in the same room. Computing devices 702 are connected by one or more communications links 704 to a communications network 706 that is linked via a communications link 708 to a server 710 .
  • System 700 can include one or more servers 710 .
  • Server 710 can be any suitable server for providing access to the content recommendation system, such as a processor, a computer, a data processing device, or a combination of such devices.
  • the content recommendation system can be distributed into multiple backend components and multiple frontend components or interfaces.
  • a content hosting service can include the content recommendation system, which provides recommended content items based on topic information associated with users.
  • backend components such as data distribution can be performed on one or more servers 710 .
  • the graphical user interfaces displayed by the content hosting service such as an interface for retrieving content items, can be distributed by one or more servers 710 to computing device 702 .
  • server 710 can include any suitable server for accessing metadata relating to content items, user access logs, user interest profiles, topic and related topic information, etc.
  • each of the computing devices 702 and server 710 can be any of a general purpose device such as a computer or a special purpose device such as a client, a server, etc. Any of these general or special purpose devices can include any suitable components such as a processor (which can be a microprocessor, digital signal processor, a controller, etc.), memory, communication interfaces, display controllers, input devices, etc.
  • computing device 702 can be implemented as a personal computer, a tablet computing device, a personal digital assistant (PDA), a portable email device, a multimedia terminal, a mobile telephone, a gaming device, a set-top box, a television, etc.
  • any suitable computer readable media can be used for storing instructions for performing the processes described herein.
  • computer readable media can be transitory or non-transitory.
  • non-transitory computer readable media can include media such as magnetic media (such as hard disks, floppy disks, etc.), optical media (such as compact discs, digital video discs, Blu-ray discs, etc.), semiconductor media (such as flash memory, electrically programmable read only memory (EPROM), electrically erasable programmable read only memory (EEPROM), etc.), any suitable media that is not fleeting or devoid of any semblance of permanence during transmission, and/or any suitable tangible media.
  • transitory computer readable media can include signals on networks, in wires, conductors, optical fibers, circuits, any suitable media that is fleeting and devoid of any semblance of permanence during transmission, and/or any suitable intangible media.
  • communications network 706 may be any suitable computer network including the Internet, an intranet, a wide-area network (“WAN”), a local-area network (“LAN”), a wireless network, a digital subscriber line (“DSL”) network, a frame relay network, an asynchronous transfer mode (“ATM”) network, a virtual private network (“VPN”), or any combination of any of such networks.
  • Communications links 704 and 708 may be any communications links suitable for communicating data between computing devices 702 and server 710 , such as network links, dial-up links, wireless links, hard-wired links, any other suitable communications links, or a combination of such links.
  • Computing devices 702 enable a user to access features of the application.
  • Computing devices 702 and server 710 may be located at any suitable location. In one implementation, computing devices 702 and server 710 may be located within an organization. Alternatively, computing devices 702 and server 710 may be distributed between multiple organizations.
  • computing device 702 may include processor 802 , display 804 , input device 806 , and memory 808 , which may be interconnected.
  • memory 808 contains a storage device for storing a computer program for controlling processor 802 .
  • Processor 802 uses the computer program to present on display 804 the interfaces of the content hosting service and the data received through communications link 704 and commands and values transmitted by a user of computing device 702 . It should also be noted that data received through communications link 704 or any other communications links may be received from any suitable source.
  • Input device 806 may be a computer keyboard, a mouse, a keypad, a cursor-controller, dial, switchbank, lever, a remote control, or any other suitable input device as would be used by a designer of input systems or process control systems. Alternatively, input device 806 may be a finger or stylus used on a touch screen display 804 .
  • Server 710 may include processor 820 , display 822 , input device 824 , and memory 826 , which may be interconnected.
  • memory 826 contains a storage device for storing data received through communications link 708 or through other links, and also receives commands and values transmitted by one or more users.
  • the storage device further contains a server program for controlling processor 820 .
  • the application may include an application program interface (not shown), or alternatively, the application may be resident in the memory of computing device 702 or server 710 .
  • the only distribution to computing device 702 may be a graphical user interface (“GUI”) which allows a user to interact with the application resident at, for example, server 710 .
  • the application may include client-side software, hardware, or both.
  • the application may encompass one or more Web-pages or Web-page portions (e.g., via any suitable encoding, such as HyperText Markup Language (“HTML”), Dynamic HyperText Markup Language (“DHTML”), Extensible Markup Language (“XML”), JavaServer Pages (“JSP”), Active Server Pages (“ASP”), Cold Fusion, or any other suitable approaches).
  • Although the application is described herein as being implemented on a user computer and/or server, this is only illustrative.
  • the application may be implemented on any suitable platform (e.g., a personal computer (“PC”), a mainframe computer, a dumb terminal, a data display, a two-way pager, a wireless terminal, a portable telephone, a portable computer, a palmtop computer, an H/PC, an automobile PC, a laptop computer, a cellular phone, a personal digital assistant (“PDA”), a combined cellular phone and PDA, etc.) to provide such features.

Abstract

Mechanisms for recommending content items based on topics are provided. In some implementations, a method for recommending content items is provided that includes: determining a plurality of accessed content items associated with a user, wherein each of the plurality of content items is associated with a plurality of topics; determining the plurality of topics associated with each of the plurality of accessed content items; generating a model of user interests based on the plurality of topics, wherein the model implements a machine learning technique to determine a plurality of weights for assigning to each of the plurality of topics; applying the model to determine, for a plurality of content items, a probability that the user would watch a content item of the plurality of content items; ranking the plurality of content items based on the determined probabilities; and selecting a subset of the plurality of content items to recommend to the user based on the ranked content items.

Description

    CROSS-REFERENCE TO RELATED APPLICATIONS
  • This application is a continuation of U.S. patent application Ser. No. 14/816,866, filed on Aug. 3, 2015, which is a continuation of U.S. patent application Ser. No. 13/731,266, filed on Dec. 31, 2012, each of which is hereby incorporated by reference herein in its entirety.
  • TECHNICAL FIELD
  • Methods, systems, and media for recommending content items based on topics are provided. More particularly, the disclosed subject matter relates to recommending content items by modeling user interest profiles that are based on content consumption.
  • BACKGROUND
  • Due to an overwhelming volume of content that is available to the average consumer, content hosting services generally attempt to present content that is interesting to their users. Some content hosting services allow users to create user profiles that indicate demographic information, such as gender or age, as well as areas of interest. These content hosting services then attempt to use such user profiles to select content to provide to each of their users. However, users may not be able to articulate their interests while populating a user profile. In addition, interests may change over time and these users may not update their user profiles to reflect such changes.
  • SUMMARY
  • In accordance with various implementations of the disclosed subject matter, mechanisms for recommending content items based on topics are provided.
  • In accordance with some implementations of the disclosed subject matter, a method for recommending content items is provided, the method comprising: determining, using a hardware processor, a plurality of accessed content items associated with a user, wherein each of the plurality of content items is associated with a plurality of topics; determining, using the hardware processor, the plurality of topics associated with each of the plurality of accessed content items; generating a model of user interests based on the plurality of topics, wherein the model implements a machine learning technique to determine a plurality of weights for assigning to each of the plurality of topics; applying, using the hardware processor, the model to determine, for a plurality of content items, a probability that the user would watch a content item of the plurality of content items; ranking, using the hardware processor, the plurality of content items based on the determined probabilities; and selecting a subset of the plurality of content items to recommend to the user based on the ranked plurality of content items.
  • In accordance with some implementations of the disclosed subject matter, a system for recommending content items is provided. The system comprises: a hardware processor that is configured to: determine a plurality of accessed content items associated with a user, wherein each of the plurality of content items is associated with a plurality of topics; determine the plurality of topics associated with each of the plurality of accessed content items; generate a model of user interests based on the plurality of topics, wherein the model implements a machine learning technique to determine a plurality of weights for assigning to each of the plurality of topics; apply the model to determine, for a plurality of content items, a probability that the user would watch a content item of the plurality of content items; rank the plurality of content items based on the determined probabilities; and select a subset of the plurality of content items to recommend to the user based on the ranked plurality of content items.
  • In accordance with some implementations of the disclosed subject matter, a computer-readable medium containing computer-executable instructions that, when executed by a processor, cause the processor to perform a method for recommending content items, is provided. The method comprises: determining a plurality of accessed content items associated with a user, wherein each of the plurality of content items is associated with a plurality of topics; determining the plurality of topics associated with each of the plurality of accessed content items; generating a model of user interests based on the plurality of topics, wherein the model implements a machine learning technique to determine a plurality of weights for assigning to each of the plurality of topics; applying the model to determine, for a plurality of content items, a probability that the user would watch a content item of the plurality of content items; ranking the plurality of content items based on the determined probabilities; and selecting a subset of the plurality of content items to recommend to the user based on the ranked plurality of content items.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • Various objects, features, and advantages of the disclosed subject matter can be more fully appreciated with reference to the following detailed description of the disclosed subject matter when considered in connection with the following drawings, in which like reference numerals identify like elements.
  • FIG. 1 is a flowchart of an illustrative process for recommending content items, such as video content items and channels that present video content items, based on topics in accordance with some embodiments of the disclosed subject matter.
  • FIG. 2 is a flowchart of an illustrative process for modeling user interests using a subset of topics (e.g., top K topics) that are selected based on an associated interest weight in accordance with some embodiments of the disclosed subject matter.
  • FIG. 3 is a flowchart of an illustrative process for modeling user interests using topic clusters including related topics, where topics associated with the content items viewed by the user are mapped to one or more topic clusters, in accordance with some embodiments of the disclosed subject matter.
  • FIG. 4 is a flowchart of an illustrative process for modeling user interests using a decision tree that groups or clusters user interest profiles associated with the same or similar topics in accordance with some embodiments of the disclosed subject matter.
  • FIG. 5 is a flowchart of an illustrative process for modeling user interests using embedding trees, where multiple topics and multiple content items are mapped into a multi-dimensional space and the embedding tree is trained such that a loss or error function is minimized, in accordance with some embodiments of the disclosed subject matter.
  • FIG. 6 is an illustrative screen of an interface that provides recommended content items to a user in accordance with some embodiments of the disclosed subject matter.
  • FIG. 7 is a diagram of an illustrative system suitable for implementation of the content recommendation system in accordance with some embodiments of the disclosed subject matter.
  • FIG. 8 is a diagram of an illustrative computing device and server as provided, for example, in FIG. 7 in accordance with some embodiments of the disclosed subject matter.
  • DETAILED DESCRIPTION
  • Methods, systems, and media for recommending content items using topics are provided.
  • In accordance with some embodiments of the disclosed subject matter, mechanisms for recommending content items, such as video content items and channels that provide video content items, using topics are provided. These content items and channels can be presented by a content hosting service (e.g., a video hosting service) or any other suitable content database. In response to determining the content items that have been accessed by a user on a content hosting service, one or more topics associated with each of the content items can be determined.
  • It should be noted that a channel, in some embodiments, is a group of videos or video content items from a particular source. In another embodiment, a channel is a group of videos or video content items with a common attribute, such as a source, a topic, a date, etc.
  • It should also be noted that, in some embodiments, a user interest profile can be accessed to determine which content items have been accessed by the user. For example, in response to logging into a user account, a content hosting service can associate and store information relating to content items that have been accessed by the user in a user interest profile. In some embodiments, the user interest profile can also include user interactions with the content items, such as pausing, fast forwarding, rewinding a particular content item, rating a particular content item, sharing a particular content item, the amount of time a particular content item is watched, etc. Additionally or alternatively, a user access log can be accessed to determine which content items have been accessed by the user.
  • In some embodiments, the topics associated with each of the content items can be determined from the user interest profile. For example, topics can be derived from metadata associated with a content item (e.g., genre, category, title, actor, director, description, etc.). In another example, a content database (e.g., a video database) or a knowledgebase can be queried to determine, for each accessed content item, the topics associated with each content item and/or other suitable topic information (e.g., an interest weight associated with each topic).
  • Based on one or more of the associated topics, a model of user interests can be generated using one or more machine learning techniques. The model of user interests derived using machine learning techniques can be used, for example, in conjunction with linear regression to provide predictions or probabilities of the likelihood that the user accesses or watches a particular content item or a particular channel that presents particular content items. By ranking content items based on the determined probabilities, a subset of content items or channels presenting content items can be selected for recommending to the user.
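  • As an illustrative, non-limiting sketch of the ranking and selection described above, the following Python fragment ranks content items by a predicted watch probability and keeps the highest-ranked subset. The function name select_recommendations, the probability values, and the tie-breaking rule are assumptions made for illustration; how the probabilities are produced is left to the model of user interests.

```python
from typing import Dict, List


def select_recommendations(watch_probability: Dict[str, float], subset_size: int) -> List[str]:
    """Rank content items by predicted watch probability and keep the top ones."""
    # Sort from highest to lowest probability; ties are broken by identifier
    # (an assumed rule) so that the ordering is deterministic.
    ranked = sorted(watch_probability, key=lambda item: (-watch_probability[item], item))
    return ranked[:subset_size]


# Hypothetical probabilities produced by some model of user interests.
probabilities = {"video_id:1234": 0.12, "video_id:567": 0.81, "video_id:89": 0.47}
recommended = select_recommendations(probabilities, subset_size=2)
```

    In this sketch, the same routine could rank channels instead of content items by passing channel identifiers as keys.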
  • It should be noted that any suitable machine learning technique can be used to generate a model of user interests. For example, a model of user interests can be generated using a machine learning approach, where a subset of topics, such as the top K topics, from a user interest profile can be determined and a conjunction of the subset of topics and the content items associated with a content hosting service can be determined. In another example, a model of user interests can be generated by using a hierarchical agglomerative clustering approach to create topic clusters of related topics, where a user interest profile of topics can be mapped to one or more topic clusters and a conjunction of those topic clusters and the content items associated with a content hosting service can be determined. In yet another example, a model of user interests can be generated using vector clustering approaches.
  • These mechanisms can be used in a variety of applications. For example, these mechanisms can allow a content hosting service, using these models of user interests, to propagate interests in a content item or a channel that presents content items from a single user or a group of users to other users, thereby generalizing the content recommendation system beyond the preferences of a single user. In another example, these mechanisms can allow a content hosting service to recommend content items or a channel that presents content items to a user that is unaware of the availability of such a content item or channel.
  • Although the embodiments described herein generally relate to recommending video content items, such as television programs, movies, and video clips, this is merely illustrative. For example, these content recommendation mechanisms can be implemented in a product search domain or commerce domain, where one or more topics associated with a user are determined from product metadata relating to products viewed by a user and a model of user interests is generated based on those topics. In another example, these content recommendation mechanisms can be implemented in a music search domain, where one or more topics associated with a user are determined from structured music metadata relating to media accessed by the user and a model of user interests can be generated based on those topics. Accordingly, the content recommendation mechanisms can be used in any suitable recommendation system for providing recommended content items.
  • Turning to FIG. 1, FIG. 1 is a flow chart of an illustrative process 100 for providing a content recommendation system that recommends content items based on topics in accordance with some embodiments of the disclosed subject matter.
  • Process 100 can begin by determining the content items that have been accessed by a user. For example, in response to detecting that a user has logged into a content hosting service (e.g., by entering a username and password), information relating to the content items that the user has accessed can be stored in a user interest profile. The content recommendation system can access the user interest profile to determine the content items that have been accessed by the user. In another example, the content recommendation system can transmit a query to a user access log associated with the user to determine the content items that have been accessed by the user. In a more particular example, the content recommendation system can retrieve multiple content identifiers associated with the content items accessed by the user (e.g., video identifiers, such as “video_id:1234,” that are associated with each video watched by the user).
  • In situations in which the content recommendation system discussed herein collects personal information about users, or may make use of personal information, the users can be provided with an opportunity to control whether programs or features collect user information (e.g., information about content items accessed by a user, a user's interactions with content items, a user's social network, social actions or activities, profession, a user's preferences, or a user's current location), or to control whether and/or how to receive content from the content server that may be more relevant to the user. In addition, certain data may be treated in one or more ways before it is stored or used, so that personally identifiable information is removed. For example, a user's identity may be treated so that no personally identifiable information can be determined for the user, or a user's geographic location may be generalized where location information is obtained (such as to a city, zip code, or state level), so that a particular location of a user cannot be determined. Thus, the user may have control over how information is collected about the user and used by a content server.
  • At 120, the content recommendation system can determine one or more topics associated with each of the content items accessed by the user. For example, a video content item relating to the supernova SN 1987A can have the topics “SN_1987A” and “astronomy.” In another example, the user interest profile can include the content items accessed by the user and a list of the one or more topics associated with each of the content items accessed by the user.
  • It should be noted that each content item on a content hosting service can be annotated with entities or topics created by a content source. For example, the topics associated with the content items can be created by a community of users in a collaborative knowledgebase, such as Freebase. In another example, the topics associated with the content items can be generated based on content metadata, such as unique terms derived from video titles, actor names, character names, director names, genres or category information, etc. In these examples, the content recommendation system can transmit a query to a content database (e.g., a video database, a knowledgebase having entity information or topic information, etc.) to determine, for each of the accessed content items, the topics associated with each content item.
  • It should also be noted that the topics associated with a content item can be stored along with other metadata associated with the content item. Alternatively, instead of being stored with the metadata of each content item, the topics and/or other information relating to the topics can be stored in a separate database.
  • In some embodiments, an interest weight can be associated with each topic. For example, a topic corresponding to an accessed content item can be associated with an interest weight that represents the strength or degree of association of the topic for a given content item. The interest profile can, in some embodiments, be represented as a weighted sum of all of the topics associated with the content items accessed by the user. For example, the interest weight for a particular topic in a user interest profile can be a product, sum, average, or other arithmetic function of the interest weight normalized by the frequency of the topic in the user interest profile. In a more particular example, the interest weight for the topic “astronomy” can be high in comparison to the interest weights associated with other topics when the topic “astronomy” is a frequent topic of content items accessed by the user.
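  • As an illustrative, non-limiting sketch, a user interest profile could be built as a weighted sum of per-item topic weights and then normalized. The function name build_interest_profile, the example weights, and the choice to normalize by the total weight are assumptions; other arithmetic functions could be substituted.

```python
from collections import defaultdict
from typing import Dict, List


def build_interest_profile(accessed_items: List[Dict[str, float]]) -> Dict[str, float]:
    """Combine per-item topic weights into one profile whose weights sum to 1."""
    totals: Dict[str, float] = defaultdict(float)
    for topic_weights in accessed_items:
        for topic, weight in topic_weights.items():
            totals[topic] += weight  # weighted sum over accessed content items
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    # Assumed normalization: divide by the total weight in the profile.
    return {topic: weight / grand_total for topic, weight in totals.items()}


# Hypothetical topic weights for three accessed content items.
watched = [
    {"SN_1987A": 1.0, "astronomy": 0.5},
    {"astronomy": 1.0},
    {"cooking": 0.5},
]
profile = build_interest_profile(watched)
```

    Here the topic "astronomy" receives the largest weight because it appears in more of the accessed content items than any other topic.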
  • Referring back to FIG. 1, upon determining the topics associated with each of the content items accessed by the user, the content recommendation system can generate a model of user interests using one or more machine learning techniques at 130.
  • Generally speaking, user modeling can include one or more input features. As used herein, a feature can be represented by a feature-value pair, i.e., a feature template and a feature value. For example, the feature template for a particular video content item (video_id) can have the value “1234,” yielding the feature video_id:1234. In this example, the feature video_id:1234 can model the global popularity of a video content item having the video identifier “1234.” When used by itself, such a model determines the probability that a user watches the video content item having the video identifier “1234.” In another example, a feature can be used to model a channel (channel_id). In this example, the feature channel_id:ABC123 can model the global popularity of video content items presented by the channel having the channel identifier “ABC123.” Again, when used by itself, such a model determines the probability that a user watches video content items presented by the channel having the channel identifier “ABC123.”
  • In some embodiments, user modeling can include one or more input features relating to the user. For example, a user feature, such as user_id, can model the probability that a given user accesses any content item provided by a content hosting service. In a more particular example, the feature user_id:XYZ can determine the probability that a user having a user identifier “XYZ” watches any video content item provided by a video hosting service. It should be noted that a model for a more discerning user can, for example, determine that the probability that the user accesses any content item provided by a content hosting service is low.
  • Another input feature relating to the user can include a geographical location. For example, a user feature, such as user_geo, can model the probability that users from a particular geographical location watch any content item provided by a content hosting service. In a more particular example, the feature user_geo:GBR can determine the probability that users located in Great Britain (GBR) watch any video content item provided by a video hosting service.
  • In some embodiments, features can be used in conjunction with other features, where the conjunction is sometimes represented herein by a conjunction symbol (shown here as “[conjunction symbol]”). Based on the features, a covariance matrix can be generated by taking the outer product between the features (e.g., the sum of the products of the values associated with the features over different instances or sub-portions). For example, the user feature user_geo for geographical location can be used in conjunction with the feature channel_id for a particular channel to model regional preferences (e.g., USA, GBR, CAN, etc.) for the particular channel or channels. Using the example above, the feature template can be represented by “user_geo [conjunction symbol] channel_id.” One illustrative instance of the conjunction can be “user_geo:GBR [conjunction symbol] channel_id:ABC123,” which models the likelihood that a user from Great Britain (GBR) watches a video content item presented by a channel having a channel identifier “ABC123.” These input features and conjunctions can be used to, for example, personalize the content recommendations provided to a user.
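  • As an illustrative, non-limiting sketch, feature-value pairs and their conjunctions could be represented as strings, with a weight looked up for each active feature. The names conjoin and watch_probability, the example weights, the " x " text used in place of the conjunction symbol, and the logistic function used to turn a score into a probability are all assumptions made for illustration.

```python
import math
from typing import Dict, List


def conjoin(left: str, right: str) -> str:
    """Form a conjunction feature from two feature-value pairs."""
    # " x " is a plain-text stand-in for the conjunction symbol.
    return f"{left} x {right}"


def watch_probability(active_features: List[str], weights: Dict[str, float]) -> float:
    """Sum the weights of the active features and map the score into (0, 1)."""
    score = sum(weights.get(feature, 0.0) for feature in active_features)
    return 1.0 / (1.0 + math.exp(-score))  # assumed logistic link


# Hypothetical learned weights; features without a weight contribute nothing.
weights = {
    "channel_id:ABC123": 0.2,
    "user_geo:GBR": -0.1,
    "user_geo:GBR x channel_id:ABC123": 1.5,
}
features = ["user_geo:GBR", "channel_id:ABC123", conjoin("user_geo:GBR", "channel_id:ABC123")]
p_gbr = watch_probability(features, weights)
p_other = watch_probability(["user_geo:CAN", "channel_id:ABC123"], weights)
```

    In this sketch, the conjunction weight captures a regional preference that neither the user_geo feature nor the channel_id feature expresses on its own, so p_gbr exceeds p_other.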
  • In some embodiments, these input features and conjunctions can be used to model users. More particularly, upon determining the list of entities and topics in which a user has shown interest based on viewing history (e.g., the content items accessed by the user, the content items watched for a substantial period of time by the user, the content items for which the user has provided a favorable indication, the content items that the user has shared with other users, etc.), the content recommendation system can use various input features and conjunctions to generate an interest model for a user using one or more machine learning techniques.
  • Illustrative examples of the one or more machine learning techniques that can be used to generate a model of user interests are described herein in connection with FIGS. 2-5.
  • In some embodiments, the content recommendation system can generate a model of user interests using a direct interest modeling approach. For example, the content recommendation system can determine a subset of topics in a user interest profile and determine the conjunction of the subset of topics and the content items associated with a content hosting service.
  • FIG. 2 is a flowchart of an illustrative process for modeling user interests using a subset of topics (e.g., top K topics) based on an associated interest weight in accordance with some embodiments of the disclosed subject matter. At 210, the content recommendation system can access a user interest profile of topics associated with the user. As described above, the topics can be based on the content items accessed by the user and, in some implementations, an interest weight can be associated with each of the topics in the user interest profile. For example, an interest weight for the topic “science” can be high compared to other interest weights in response to determining that the user associated with the user interest profile watches a substantial number of video content items that are annotated with the topic “science.”
  • At 220, the content recommendation system can determine a subset of topics from the multiple topics in the user interest profile. In some embodiments, the content recommendation system can determine the subset of topics based on the associated interest weight. For example, the content recommendation system can determine the top fifty topics in the user interest profile and determine the conjunction of the top fifty topics and the content items provided by the content hosting service. In another example, the content recommendation system can select a particular subset of topics from the user interest profile based on whether the associated interest weight is greater than a particular threshold value. In such an example, the threshold value can be determined such that at least a given number of topics have been selected. Additionally or alternatively, the threshold value can be adjusted by an administrative user using the content recommendation system (e.g., after inspecting the data, after inspecting the associated interest weights, etc.).
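  • As an illustrative, non-limiting sketch, the subset of topics could be selected either by taking the top K topics or by applying a threshold to the interest weights. The function names, the example profile, and the fallback that keeps at least min_topics topics (one way of relaxing the threshold until a given number of topics is selected) are assumptions.

```python
from typing import Dict, List


def rank_topics(profile: Dict[str, float]) -> List[str]:
    """Order topics from highest to lowest interest weight (ties by name)."""
    return sorted(profile, key=lambda topic: (-profile[topic], topic))


def top_k_topics(profile: Dict[str, float], k: int) -> List[str]:
    """Keep the k topics with the highest interest weights."""
    return rank_topics(profile)[:k]


def topics_above_threshold(profile: Dict[str, float], threshold: float,
                           min_topics: int = 0) -> List[str]:
    """Keep topics whose weight exceeds the threshold, but at least min_topics."""
    ranked = rank_topics(profile)
    selected = [topic for topic in ranked if profile[topic] > threshold]
    if len(selected) < min_topics:
        # Assumed fallback: relax the threshold until enough topics are kept.
        selected = ranked[:min_topics]
    return selected


# Hypothetical user interest profile.
profile = {"science": 0.5, "astronomy": 0.3, "cooking": 0.15, "wrestling": 0.05}
top_two = top_k_topics(profile, 2)
strong = topics_above_threshold(profile, threshold=0.4, min_topics=3)
```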
  • At 230, the content recommendation system can determine the conjunction of the subset of topics and the content items provided by the content hosting service to model the interaction of user interests and content items. For example, given a set of feature vectors representing a subset of topics, a model can be generated that indicates the probability that the user associated with the user interest profile watches a content item. In this example, if a video content item having a video identifier “567” (video_id:567) relates to astronomy and users with the topic “science” (user_topic:science) in their user interest profile watch or are highly likely to watch the video content item, the instance for the conjunction “user_topic:science [conjunction symbol] video_id:567” can have a high weight or probability value.
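  • As an illustrative, non-limiting sketch, a value for each conjunction of a user topic and a content item could be estimated from a log of impressions as the fraction of impressions that resulted in a watch. This simple frequency estimate is an assumption that stands in for whatever machine learning technique is used; the log format and the name conjunction_watch_rates are likewise hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# (topics in the user interest profile, video identifier, whether it was watched)
Impression = Tuple[List[str], str, bool]


def conjunction_watch_rates(impressions: Iterable[Impression]) -> Dict[Tuple[str, str], float]:
    """Estimate, per (user_topic, video_id) conjunction, the fraction of watches."""
    shown: Dict[Tuple[str, str], int] = defaultdict(int)
    watched: Dict[Tuple[str, str], int] = defaultdict(int)
    for topics, video, was_watched in impressions:
        for topic in topics:
            key = (f"user_topic:{topic}", f"video_id:{video}")
            shown[key] += 1
            if was_watched:
                watched[key] += 1
    return {key: watched[key] / shown[key] for key in shown}


# Hypothetical impression log.
log = [
    (["science"], "567", True),
    (["science", "cooking"], "567", True),
    (["cooking"], "567", False),
]
rates = conjunction_watch_rates(log)
```

    In this sketch, the conjunction of user_topic:science and video_id:567 receives a higher value than the conjunction of user_topic:cooking and video_id:567.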
  • Although the embodiments described herein generally relate to determining the conjunction between content items and a topic, a topic cluster, or any other suitable representation derived from topic information, this is merely illustrative. For example, the content recommendation system can determine the conjunction between topics (e.g., user_topic) and a particular channel (e.g., channel_id) to model the interaction of user interests and a particular channel. This conjunction can be represented as, for example, “user_topic [conjunction symbol] channel_id.”
  • It should also be noted that, although the illustrative example shown in FIG. 2 accesses a user interest profile associated with a user and generates a model of user interests based on at least a portion of the topics from the user interest profile, any suitable number of user interest profiles can be used. For example, the content recommendation system can determine that a user has similar interests to a group of users. In this example, the content recommendation system can access the user interest profiles associated with the user and each user in the group of users and determine a subset of topics that are common to the multiple users and that have interest weights greater than a particular threshold value. This subset of topics can then be used in conjunction with content items that are provided by the content hosting service.
  • In some embodiments, the content recommendation system can generate a model of user interests using a topic clustering approach. For example, the content recommendation system can identify one or more topic clusters of related topics that are similar to the topics included in a user interest profile and determine the conjunction of the one or more topic clusters and the content items associated with a content hosting service. As opposed to directly modeling the topics associated with a user in a user interest profile, the content recommendation system can map the topics in the user interest profile to one or more topic clusters and generate an interest model based on the topic clusters.
  • FIG. 3 is a flowchart of an illustrative process for modeling user interests using topic clusters in accordance with some embodiments of the disclosed subject matter. At 310, the content recommendation system can access a user interest profile of topics associated with the user. As described above, the topics can be based on the content items accessed by the user and, in some implementations, an interest weight can be associated with each of the topics in the user interest profile.
  • At 320, the content recommendation system can determine, for each topic, one or more related topics. Any suitable approach can be used for determining related topics. For example, the content recommendation system can use the co-occurrence of topics in user interest profiles to determine which topics are related to each other. In this example, the content recommendation system can determine, across multiple user interest profiles (e.g., all of the user interest profiles in the content hosting service), pairs of topics that co-occur in at least a portion of the user interest profiles and, for each pair, determine a measure of co-occurrence or any other suitable distance measure. The distance measure can, for example, indicate how often video content items on two co-occurring topics have been accessed by a user.
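  • As an illustrative, non-limiting sketch, a co-occurrence-based distance between topics could be computed over a collection of user interest profiles. The Jaccard distance used here is one assumed choice of distance measure, and the function name cooccurrence_distances and the example profiles are hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set


def cooccurrence_distances(profiles: List[Set[str]]) -> Dict[FrozenSet[str], float]:
    """Jaccard distance between topics, based on the profiles that contain them."""
    containing: Dict[str, Set[int]] = {}
    for index, topics in enumerate(profiles):
        for topic in topics:
            containing.setdefault(topic, set()).add(index)
    distances: Dict[FrozenSet[str], float] = {}
    for a, b in combinations(sorted(containing), 2):
        both = len(containing[a] & containing[b])
        either = len(containing[a] | containing[b])
        if both:  # keep only pairs that co-occur in at least one profile
            distances[frozenset((a, b))] = 1.0 - both / either
    return distances


# Hypothetical user interest profiles, each reduced to its set of topics.
profiles = [
    {"astronomy", "SN_1987A"},
    {"astronomy", "SN_1987A", "cooking"},
    {"cooking"},
]
distances = cooccurrence_distances(profiles)
```

    A smaller distance indicates topics that tend to appear in the same user interest profiles; here "astronomy" and "SN_1987A" always appear together and therefore have a distance of zero.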
  • Additionally or alternatively, the content recommendation system can determine related topics by querying an alternate source that specifies relationships between entities or topics, such as a knowledgebase (e.g., Freebase). For example, in response to transmitting a query that includes a topic (e.g., topic:SN_1987A) to a knowledgebase, the content recommendation system can receive a related topic graph that includes a set of related topic nodes and distance measures between the various nodes. The content recommendation system can use the related topic graph for determining topic clusters, where neighboring nodes to a topic node are semantically more related than nodes with a larger distance measure from the topic node.
  • At 330, the content recommendation system can cluster each topic with one or more related topics based on the distance measure. For example, the content recommendation system can create topic clusters using any suitable clustering approaches, such as a hierarchical agglomerative clustering approach. The content recommendation system can identify the topic clusters having a distance measure less than a given threshold value (e.g., grouping topics together that are closer in distance) and combine the two or more topics into a topic cluster.
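  • As an illustrative, non-limiting sketch, hierarchical agglomerative clustering with a distance threshold could proceed by repeatedly merging the two closest clusters until no pair is closer than the threshold. Single linkage (the distance between the closest members) is an assumed choice, and the function name agglomerative_clusters and the example distances are hypothetical.

```python
from typing import Callable, List, Set


def agglomerative_clusters(topics: List[str],
                           distance: Callable[[str, str], float],
                           threshold: float) -> List[Set[str]]:
    """Merge the closest clusters until no pair is closer than the threshold."""
    clusters: List[Set[str]] = [{topic} for topic in topics]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest members.
                d = min(distance(a, b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d >= threshold:
            break  # remaining clusters are too far apart to combine
        clusters[i] |= clusters[j]
        del clusters[j]
    return clusters


# Hypothetical pairwise distances; unlisted pairs are treated as unrelated.
pairs = {
    frozenset(("astronomy", "SN_1987A")): 0.1,
    frozenset(("astronomy", "physics")): 0.3,
    frozenset(("SN_1987A", "physics")): 0.4,
}
topic_distance = lambda a, b: pairs.get(frozenset((a, b)), 1.0)
clusters = agglomerative_clusters(
    ["astronomy", "SN_1987A", "physics", "wrestling"], topic_distance, threshold=0.5)
```

    This sketch favors readability over speed; each pass recomputes all pairwise cluster distances, which would be too slow for a large topic vocabulary.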
  • In some embodiments, in addition to combining topics into a topic cluster, the content recommendation system can generate an interest weight for the topic cluster. The interest weight for the topic cluster can combine the interest weights associated with each of the topics by, for example, adding, multiplying, averaging, or applying another arithmetic or statistical function on the interest weights. In some embodiments, the interest weight for the topic cluster can be normalized by, for example, the frequency of the co-occurring topics in the user interest profiles.
  • At 340, upon determining the multiple topic clusters of related topics, the content recommendation system can map the user interest profile (that includes topics associated with content items accessed by the user) with the topic clusters to determine user cluster features. For example, the content recommendation system can determine a topic cluster corresponding to each topic in the user interest profile. As a result, the content recommendation system can determine multiple topic clusters of related topics for association with a user interest profile.
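  • As an illustrative, non-limiting sketch, a user interest profile could be mapped to topic clusters by looking up the cluster that contains each topic and combining the interest weights of topics that fall in the same cluster. Summing the weights is an assumed combination function, and the name user_cluster_features, the cluster indices, and the example data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set


def user_cluster_features(profile: Dict[str, float],
                          clusters: List[Set[str]]) -> Dict[int, float]:
    """Map each profile topic to its cluster and add up the interest weights."""
    topic_to_cluster = {topic: index
                        for index, cluster in enumerate(clusters)
                        for topic in cluster}
    features: Dict[int, float] = defaultdict(float)
    for topic, weight in profile.items():
        if topic in topic_to_cluster:  # topics outside every cluster are skipped
            features[topic_to_cluster[topic]] += weight  # assumed combination: sum
    return dict(features)


# Hypothetical topic clusters and user interest profile.
clusters = [{"astronomy", "SN_1987A", "physics"}, {"wrestling", "beer"}]
profile = {"astronomy": 0.5, "SN_1987A": 0.25, "cooking": 0.25}
cluster_features = user_cluster_features(profile, clusters)
```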
  • It should be noted that, in some embodiments, topic clusters can be stored in the user interest profile as opposed to the topics associated with content items accessed by the user. Alternatively, topics associated with content items accessed by the user and topic clusters can be stored in the user interest profile.
  • At 350, the content recommendation system can determine the conjunction of the user cluster features (e.g., the topic clusters associated with the user interest profile) with content items provided by a content hosting service. That is, the content recommendation system can generate a matrix or any other suitable data representation that provides probabilities that a user of a user interest profile watches content items based on the user cluster features. Alternatively, in some embodiments, the content recommendation system can select a subset of user cluster features for modeling user interests (e.g., the topic clusters that have an interest weight greater than a particular threshold value, the topic cluster that has the highest interest weight, etc.).
  • As described above, the content recommendation system can also determine the conjunction between user cluster features and a particular channel (e.g., channel_id) to model the interaction of user interests and a particular channel. This conjunction can be represented as, for example, “user_clustered topic [conjunction symbol] channel_id.”
  • In some embodiments, the content recommendation system can generate a model of user interests using unsupervised decision trees. For example, the content recommendation system can generate a decision tree that identifies user interest profiles having similar topics and determine the conjunction of the topics shared by those user interest profiles and the content items associated with a content hosting service. This can, for example, model the collective interests or topics of users whose user interest profiles are grouped in the same leaf node.
  • FIG. 4 is a flowchart of an illustrative process for modeling user interests using a decision tree that groups or clusters user interest profiles having similar topics in accordance with some embodiments of the disclosed subject matter. At 410, the content recommendation system can access a user interest profile of topics associated with the user. As described above, the topics are based on the content items accessed by the user.
  • At 420, the content recommendation system can access other user interest profiles (e.g., all of the user interest profiles in the content hosting service), where each user interest profile includes topics associated with the content items accessed by that user.
  • In some embodiments, a vector clustering approach, such as a k-means approach, can be used to form a community of user interest profiles. It should be noted that, in some implementations, the content recommendation system can replace the Euclidean distance calculation in the vector clustering approach with a dot product calculation. At 430, a decision tree can be constructed that determines the similarities between a user interest profile and other user interest profiles. The content recommendation system can use a decision tree algorithm in which user interest profiles are stored based on topic feature values so that nodes of the decision tree represent a feature that is being classified and branches of the tree represent a value that the node may assume. Results may be classified by traversing the decision tree from the root node through the tree and sorting the nodes using their respective values. For example, a binary decision tree can be constructed such that, at any node, if the dot product between the node and the user interest profile is non-zero, the node is placed under a right sub-tree. Otherwise, the left child is chosen and the node is placed under a left sub-tree.
  • In a more particular example, a top level or root node can contain the topics “Backyard Wrestling,” “Wrestling,” and “Beer.” If any user interest profile contains one or more of the topics in the top level node, the user interest profile is placed under the right sub-tree. This can, for example, partition user interest profiles into clusters that tend to contain users with similar interests and similar topics. In some embodiments, the nodes placed under the left sub-tree can be removed or otherwise excluded from generating the model of user interests.
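  • As an illustrative, non-limiting sketch, the split at a node could be computed by taking the dot product between the node's topics and each user interest profile. Representing both as sets of topics (i.e., binary indicator vectors, so that the dot product equals the number of shared topics) is an assumption, as are the name split_profiles and the example profiles.

```python
from typing import Dict, List, Set, Tuple


def split_profiles(node_topics: Set[str],
                   profiles: Dict[str, Set[str]]) -> Tuple[List[str], List[str]]:
    """Place profiles sharing any topic with the node on the right, others on the left."""
    left: List[str] = []
    right: List[str] = []
    for user, topics in profiles.items():
        # With binary topic indicators, the dot product is the count of shared topics.
        dot_product = len(node_topics & topics)
        if dot_product != 0:
            right.append(user)
        else:
            left.append(user)
    return left, right


# Root node from the example above and hypothetical user interest profiles.
root = {"Backyard Wrestling", "Wrestling", "Beer"}
profiles = {
    "user_a": {"Wrestling", "Cooking"},
    "user_b": {"Astronomy"},
    "user_c": {"Beer"},
}
left, right = split_profiles(root, profiles)
```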
  • In some embodiments, at 440, the content recommendation system can select a splitting node such that the population between right nodes and left nodes is generally balanced.
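  • As an illustrative, non-limiting sketch, a splitting node could be chosen from a set of candidate nodes by measuring how evenly each candidate divides the user interest profiles. How candidate nodes are generated is not specified here; the candidate list, the name most_balanced_split, and the use of the absolute difference in population as the balance criterion are assumptions.

```python
from typing import List, Set


def most_balanced_split(candidates: List[Set[str]], profiles: List[Set[str]]) -> Set[str]:
    """Return the candidate node whose left and right populations are closest in size."""
    def imbalance(node_topics: Set[str]) -> int:
        right = sum(1 for topics in profiles if node_topics & topics)
        left = len(profiles) - right
        return abs(right - left)

    return min(candidates, key=imbalance)


# Hypothetical user interest profiles and candidate splitting nodes.
profiles = [{"Wrestling"}, {"Wrestling", "Beer"}, {"Astronomy"}, {"Cooking"}]
candidates = [{"Wrestling"}, {"Astronomy"}, {"Wrestling", "Astronomy", "Cooking"}]
chosen = most_balanced_split(candidates, profiles)
```

    In this sketch, the candidate containing only "Wrestling" is chosen because it places two profiles on each side of the split.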
  • At 450, the content recommendation system can obtain the portion of the decision tree that includes similar user interest profiles and determine the conjunction of this portion of the decision tree and the content items provided by a content hosting service. For example, the content recommendation system can use the right sub-tree of the decision tree to create a feature that models the collective interests of users having user interest profiles that are clustered in the same leaf node. The content recommendation system can determine the conjunction of topics in the user interest profiles of a group of users in the same leaf node and the content items provided by a content hosting service. In a more particular example, this conjunction can be represented as “unsupervised_decision_tree_leaf [conjunction symbol] video_id” or “unsupervised_decision_tree_leaf [conjunction symbol] channel_id.”
  • It should be noted that one or more aspects of the decision tree described above in connection with FIG. 4 can be constructed using any suitable iterative process. For example, a structure of the decision tree can be selected in an initial iteration and tested against training data from user interest profiles. The results of such testing with training data can be used to generate another structure of the decision tree in another iteration.
  • In some embodiments, the content recommendation system can model user interests using a boosted decision tree that jointly clusters content features and user features. For example, the content recommendation system can determine that users interested in classical music (e.g., from the user interest profile, from preferences inputted by the user, etc.) are likely to watch video content items associated with the topic “Mozart” (topic:Mozart) but only if the sound track provided in the video content items is high quality or high fidelity.
  • Generally speaking, a boosting approach can be used to build a strong classifier by combining multiple weak classifiers, and a boosted decision tree approach can be a combination of the decision tree approach and the boosting approach. For example, the content recommendation system can use a boosted decision tree approach, where the above-mentioned k-means clustering approach is used to split user interest profiles into a right sub-tree of similar user interest profiles and a left sub-tree of other user interest profiles. The content recommendation system can use the leaf node of similar user interest profiles as weak learners. The content recommendation system can then apply user features for training. For example, as described above, the quality of an audio track in a video content item, the quality of a video content item (e.g., resolution), and other user features can be used to train a joint model of user and content features. In another example, user features can be used to filter out candidate content items to recommend to the user (e.g., a particular content item includes a low-quality audio track that would not be of interest to the user).
  • In some embodiments, the content recommendation system can generate a model of user interests by mapping content items and topics into a multi-dimensional space. For example, in some implementations, the content recommendation system can generate a decision tree that identifies user interest profiles having similar topics and determine the conjunction of the topics shared by those user interest profiles and the content items associated with a content hosting service.
  • FIG. 5 is a flowchart of an illustrative process for modeling user interests using latent embedding, where multiple topics and multiple content items are mapped into a multi-dimensional space and a user interest model is trained such that a loss or error function is minimized, in accordance with some embodiments of the disclosed subject matter.
  • At 510, the content recommendation system can map each topic of a plurality of topics into a multi-dimensional space. For example, each topic can be initially assigned a random location in a 64-dimensional space. It should be noted that the number of dimensions (e.g., 64) can be predetermined. In addition, the number of dimensions of the multi-dimensional embedding space can be determined based upon any suitable criterion, such as available computing resources, size of the training database, etc.
  • At 520, the content recommendation system can also map each content feature of a plurality of content features into the multi-dimensional space.
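  • A minimal sketch of the initial mapping at 510 and 520, assuming each topic and content feature is simply given a small random vector. The value range, the seed, and the names below are illustrative assumptions and are not taken from the disclosure.

```python
import random

def init_embeddings(names, dims=64, seed=0):
    """Assign each name a random starting location in a dims-dimensional space."""
    rng = random.Random(seed)
    return {name: [rng.uniform(-0.1, 0.1) for _ in range(dims)] for name in names}

# Hypothetical topic and content-feature identifiers.
topic_vecs = init_embeddings(["topic:Mozart", "topic:science"])
feature_vecs = init_embeddings(["audio:high_fidelity"], seed=1)
```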
  • Upon configuring a joint embedding space that embeds data items including topics and content items, mapping functions that map each type of data item to the joint embedding space can be learned at 530. For example, the content recommendation system can iteratively select sets of embedded training items and determine whether the distances between the selected items, based on their current locations in the joint embedding space, correspond to the known relationships between those items. In a more particular example, an error function, such as a hinge ranking loss function, can be calculated such that if A and B are topics associated with a user interest profile, the dot product of A and B is greater than the dot product of A and another topic, C, by at least a margin. This can be represented, for example, as:

  • (A′*w) · (B′*w) > (A′*w) · (C′*w) + margin
  • Otherwise, the mappings and locations of the topics in the multi-dimensional space are adjusted such that their locations relative to each other are improved.
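  • The hinge ranking constraint described above can be sketched as follows. This sketch assumes the embedded vectors are adjusted directly by a plain gradient step, rather than through a learned mapping w, and the margin, learning rate, and toy vectors are illustrative assumptions.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))

def hinge_ranking_loss(a, b, c, margin=1.0):
    """Zero when dot(a, b) exceeds dot(a, c) by at least the margin."""
    return max(0.0, margin - dot(a, b) + dot(a, c))

def sgd_step(a, b, c, margin=1.0, lr=0.1):
    """If the constraint is violated, adjust a, b, and c in place.

    Returns the loss measured before the update.
    """
    loss = hinge_ranking_loss(a, b, c, margin)
    if loss > 0.0:
        for i in range(len(a)):
            ai, bi, ci = a[i], b[i], c[i]
            a[i] = ai + lr * (bi - ci)  # move A toward B and away from C
            b[i] = bi + lr * ai         # move B toward A
            c[i] = ci - lr * ai         # move C away from A
    return loss

# Toy 2-D example: A and B are related topics, C is unrelated.
A, B, C = [1.0, 0.0], [0.0, 1.0], [1.0, 0.0]
losses = [sgd_step(A, B, C) for _ in range(50)]
```

Repeating the step until the loss reaches zero corresponds to the adjustment of locations described above for a single (A, B, C) triple.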
  • Similar to the topics embedded in the multi-dimensional space, an error function can be calculated such that if A is a topic associated with a video content item V and C is another topic that is not associated with the video content item V, the dot product of topic A and video content item V is greater than the dot product of topic C and video content item V. This can be represented, for example, as:

  • (A′*w) · (V′*w) > (C′*w) · (V′*w)
  • It should be noted that examples of functions that may be used as the error function in the training phase include the standard margin ranking loss (AUC) function and the weighted approximate-rank pairwise (WARP) loss function. For example, to focus more on the top of a ranked list where the top k positions are of interest, the content recommendation system can use a WARP function to select negative examples, such as topic C. That is, samples are drawn at random, and a gradient step can be made for each sample to minimize loss. Due to the cost of computing the exact rank, the content recommendation system can approximate the exact rank by sampling. For a given positive label, the content recommendation system draws negative labels until a violating label is found, where the rank is then approximated.
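  • The sampling step described above can be sketched as follows, following the general form of WARP in which the rank is approximated from the number of draws needed to find a violating negative label. Drawing without replacement (via a shuffle), the harmonic weighting, and all names and scores are simplifying assumptions for illustration.

```python
import random

def warp_sample(pos_score, neg_scores, margin=1.0, rng=None):
    """Draw negative labels until one violates the margin.

    Returns (violating_label, approximate_rank, weight), or (None, 0, 0.0)
    when no negative label violates the margin.
    """
    rng = rng or random.Random(0)
    candidates = list(neg_scores.items())
    rng.shuffle(candidates)  # random draw order, without replacement
    for trials, (label, score) in enumerate(candidates, start=1):
        if score + margin > pos_score:
            # Few draws needed means many violators, i.e. a poor (large) rank.
            approx_rank = len(candidates) // trials
            weight = sum(1.0 / i for i in range(1, approx_rank + 1))
            return label, approx_rank, weight
    return None, 0, 0.0

# Hypothetical scores of negative topics against the same user or item.
negatives = {"topic:x": 5.0, "topic:y": 5.0, "topic:z": 5.0}
label, rank, weight = warp_sample(pos_score=1.0, neg_scores=negatives)
```

The returned weight would typically scale the gradient step, so that positives ranked far from the top of the list receive larger updates.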
  • In response to learning such that the error function is minimized, the content recommendation system can obtain a trained joint embedding space.
  • At 540, the content recommendation system can access a user interest profile of topics associated with the user. As described above, the topics can be based on the content items accessed by the user.
  • At 550, the content recommendation system can project the topics from the user interest profile into the trained joint embedding space. For example, the new items including the topics from the user interest profile can be embedded in the trained joint embedding space at a location determined based upon the learned mapping function for topics. One or more associations between the newly embedded item and the previously embedded items can be determined based on a distance measure. For example, the content recommendation system can apply one or more techniques, such as principal component analysis (PCA), singular value decomposition (SVD), and Gaussian mixture modeling (GMM), to determine content items that are close in proximity.
  • It should be noted that the content recommendation system can use the learned joint embedding space to determine user preference for a topic, a video, or a channel. For example, the content recommendation system can project these items into the joint embedding space and calculate the dot product. In this example, a higher dot product indicates a higher likelihood that the user associated with the user interest profile watches a particular content item or a channel that provides content items.
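  • The dot-product scoring described above can be sketched as follows. Representing the user as the average of the embeddings of the topics in the user interest profile is an assumption made for this sketch, and the 2-D vectors and identifiers are toy values.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))

def profile_vector(profile_topics, topic_vecs):
    """Average the embeddings of the topics in a user interest profile."""
    vecs = [topic_vecs[t] for t in profile_topics]
    return [sum(column) / len(vecs) for column in zip(*vecs)]

def score_items(profile_topics, topic_vecs, item_vecs):
    """Return (item, score) pairs, highest dot product first."""
    user = profile_vector(profile_topics, topic_vecs)
    scores = [(item, dot(user, vec)) for item, vec in item_vecs.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

# Toy trained embeddings (hypothetical values).
topic_vecs = {"topic:Mozart": [1.0, 0.0], "topic:science": [0.0, 1.0]}
item_vecs = {"video:sonata": [0.9, 0.1], "video:astronomy": [0.1, 0.9]}
ranked = score_items(["topic:Mozart"], topic_vecs, item_vecs)
```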
  • These and other machine learning techniques that train models in an embedding space are further described in, for example, Weston et al., “Web Scale Image Annotation: Learning to Rank with Joint Word-Image Embeddings” and Weston et al., “Large-Scale Music Annotation and Retrieval: Learning to Rank in Joint Semantic Spaces,” which are hereby incorporated by reference herein in their entireties.
  • The content recommendation system can provide a predictive model of user interests that can be used to determine a recommendation of one or more content items to a user. Based on features relating to topics and features relating to content items, the predictive model can determine a set of probabilities that represents the interaction between user interests and content items. For example, if a video content item relates to astronomy and users with the topic “science” in their user interest profile watch or are highly likely to watch the video content item, the instance for the conjunction of the topic and the video content item can have a high weight or probability value.
  • Referring back to FIG. 1, using the model of user interests and the determined watch probabilities, the content recommendation system can rank multiple content items based on their determined probabilities at 150. For example, the content recommendation system can use the model of user interests to generate a ranked list of candidate video items. In another example, the content recommendation system can use the model of user interests to generate a list of candidate video items sorted by watch probability and then re-rank the list using additional criteria, such as view count, ratings, recency (e.g., upload date), sharing activity, etc.
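  • The ranking and re-ranking at 150 can be sketched as follows. The blended score, the weight given to view count, and the field names are hypothetical choices for illustration; any of the additional criteria mentioned above could be substituted.

```python
def rank_by_probability(candidates):
    """Sort candidate items by watch probability, highest first."""
    return sorted(candidates, key=lambda c: c["watch_probability"], reverse=True)

def rerank(ranked, view_weight=0.5):
    """Re-rank a non-empty list by blending watch probability with view count."""
    max_views = max(c["view_count"] for c in ranked) or 1

    def blended(c):
        return c["watch_probability"] + view_weight * c["view_count"] / max_views

    return sorted(ranked, key=blended, reverse=True)

candidates = [
    {"id": "v1", "watch_probability": 0.8, "view_count": 100},
    {"id": "v2", "watch_probability": 0.9, "view_count": 10},
    {"id": "v3", "watch_probability": 0.5, "view_count": 1000},
]
by_probability = [c["id"] for c in rank_by_probability(candidates)]
after_rerank = [c["id"] for c in rerank(rank_by_probability(candidates))]
```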
  • At 160, the content recommendation system can select a subset of the content items from the ranked list to recommend to the user. For example, upon determining content items accessed by the user (e.g., using a user interest profile), determining topics associated with the accessed content items, generating a model of user interests based on the determined topics, and generating a list of content items that are ranked by watch probabilities derived using the model, the content recommendation system can select a particular number of content items from the ranked list to recommend to the user. In a more particular example, a content hosting service can indicate a number of recommendations that can be provided on an interface. As shown in FIG. 6, upon the completion of watching a video content item in video window 600, the content hosting service can request that a particular number of recommended content items (e.g., eight) be provided in region 610. As also shown, any suitable information can be displayed to provide the user with the recommended content items, e.g., a thumbnail or image, a title or a portion of a title, and/or a playback time associated with the recommended content item.
  • In some embodiments, the content recommendation system can determine whether one or more of the content items in the ranked list have been previously accessed by the user. For example, the content recommendation system can inhibit the recommendation of a content item that has been previously watched by the user (e.g., based on information from a user access log). In another example, the content recommendation system can adjust the ranking associated with the previously watched content item (e.g., lower the ranking of the previously watched content).
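  • The selection at 160, together with the handling of previously accessed content items, can be sketched as follows. The function name, the mode parameter, and the item identifiers are hypothetical; the default of eight follows the example count mentioned above.

```python
def select_recommendations(ranked_ids, watched_ids, count=8, mode="inhibit"):
    """Pick the top `count` items from a ranked list.

    mode="inhibit" drops previously watched items entirely;
    mode="demote" keeps them but moves them below unwatched items.
    """
    unwatched = [item for item in ranked_ids if item not in watched_ids]
    if mode == "inhibit":
        return unwatched[:count]
    if mode == "demote":
        watched = [item for item in ranked_ids if item in watched_ids]
        return (unwatched + watched)[:count]
    raise ValueError("mode must be 'inhibit' or 'demote'")

ranked_ids = ["a", "b", "c", "d"]
watched_ids = {"b"}  # e.g., taken from a user access log
inhibited = select_recommendations(ranked_ids, watched_ids, count=3)
demoted = select_recommendations(ranked_ids, watched_ids, count=4, mode="demote")
```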
  • FIG. 7 is a generalized schematic diagram of a content recommendation system in accordance with some implementations of the disclosed subject matter. As illustrated, system 700 can include one or more computing devices 702, such as a user computing device for providing search queries for content items and/or obtaining and playing back content items from a content hosting service, a tablet computing device for transmitting user instructions to a television device, etc. For example, computing device 702 can be implemented as a personal computer, a tablet computing device, a personal digital assistant (PDA), a portable email device, a multimedia terminal, a mobile telephone, a gaming device, a set-top box, a television, a smart television, etc.
  • In some implementations, computing device 702 can include a storage device, such as a hard drive, a digital video recorder, a solid state storage device, a gaming console, a removable storage device, or any other suitable device for storing media content, entity tables, entity information, metadata relating to a particular search domain, etc.
  • In some implementations, computing device 702 can include a second screen device. For example, a second screen device can present the user with recommended content items based on topics and, in response to receiving a user selection of one of the recommended content items, can transmit playback instructions for a user-selected content item to a television device.
  • Computing devices 702 can be local to each other or remote from each other. For example, when one computing device 702 is a television and another computing device 702 is a second screen device (e.g., a tablet computing device, a mobile telephone, etc.), the computing devices 702 may be located in the same room. Computing devices 702 are connected by one or more communications links 704 to a communications network 706 that is linked via a communications link 708 to a server 710.
  • System 700 can include one or more servers 710. Server 710 can be any suitable server for providing access to the content recommendation system, such as a processor, a computer, a data processing device, or a combination of such devices. For example, the content recommendation system can be distributed into multiple backend components and multiple frontend components or interfaces. In a more particular example, a content hosting service can include the content recommendation system, which provides recommended content items based on topic information associated with users. In another more particular example, backend components, such as data distribution, can be performed on one or more servers 710. Similarly, the graphical user interfaces displayed by the content hosting service, such as an interface for retrieving content items, can be distributed by one or more servers 710 to computing device 702.
  • In some implementations, server 710 can include any suitable server for accessing metadata relating to content items, user access logs, user interest profiles, topic and related topic information, etc.
  • More particularly, for example, each of the computing devices 702 and server 710 can be any of a general purpose device such as a computer or a special purpose device such as a client, a server, etc. Any of these general or special purpose devices can include any suitable components such as a processor (which can be a microprocessor, digital signal processor, a controller, etc.), memory, communication interfaces, display controllers, input devices, etc. For example, computing device 702 can be implemented as a personal computer, a tablet computing device, a personal digital assistant (PDA), a portable email device, a multimedia terminal, a mobile telephone, a gaming device, a set-top box, a television, etc.
  • In some implementations, any suitable computer readable media can be used for storing instructions for performing the processes described herein. For example, in some implementations, computer readable media can be transitory or non-transitory. For example, non-transitory computer readable media can include media such as magnetic media (such as hard disks, floppy disks, etc.), optical media (such as compact discs, digital video discs, Blu-ray discs, etc.), semiconductor media (such as flash memory, electrically programmable read only memory (EPROM), electrically erasable programmable read only memory (EEPROM), etc.), any suitable media that is not fleeting or devoid of any semblance of permanence during transmission, and/or any suitable tangible media. As another example, transitory computer readable media can include signals on networks, in wires, conductors, optical fibers, circuits, any suitable media that is fleeting and devoid of any semblance of permanence during transmission, and/or any suitable intangible media.
  • Referring back to FIG. 7, communications network 706 may be any suitable computer network including the Internet, an intranet, a wide-area network (“WAN”), a local-area network (“LAN”), a wireless network, a digital subscriber line (“DSL”) network, a frame relay network, an asynchronous transfer mode (“ATM”) network, a virtual private network (“VPN”), or any combination of any of such networks. Communications links 704 and 708 may be any communications links suitable for communicating data between computing devices 702 and server 710, such as network links, dial-up links, wireless links, hard-wired links, any other suitable communications links, or a combination of such links. Computing devices 702 enable a user to access features of the application. Computing devices 702 and server 710 may be located at any suitable location. In one implementation, computing devices 702 and server 710 may be located within an organization. Alternatively, computing devices 702 and server 710 may be distributed between multiple organizations.
  • Referring back to FIG. 7, the server and one of the computing devices depicted in FIG. 7 are illustrated in more detail in FIG. 8. Referring to FIG. 8, computing device 702 may include processor 802, display 804, input device 806, and memory 808, which may be interconnected. In a preferred implementation, memory 808 contains a storage device for storing a computer program for controlling processor 802.
  • Processor 802 uses the computer program to present on display 804 the interfaces of the content hosting service and the data received through communications link 704 and commands and values transmitted by a user of computing device 702. It should also be noted that data received through communications link 704 or any other communications links may be received from any suitable source. Input device 806 may be a computer keyboard, a mouse, a keypad, a cursor-controller, dial, switchbank, lever, a remote control, or any other suitable input device as would be used by a designer of input systems or process control systems. Alternatively, input device 806 may be a finger or stylus used on a touch screen display 804.
  • Server 710 may include processor 820, display 822, input device 824, and memory 826, which may be interconnected. In a preferred implementation, memory 826 contains a storage device for storing data received through communications link 708 or through other links, and also receives commands and values transmitted by one or more users. The storage device further contains a server program for controlling processor 820.
  • In some implementations, the application may include an application program interface (not shown), or alternatively, the application may be resident in the memory of computing device 702 or server 710. In another suitable implementation, the only distribution to computing device 702 may be a graphical user interface (“GUI”) which allows a user to interact with the application resident at, for example, server 710.
  • In one particular implementation, the application may include client-side software, hardware, or both. For example, the application may encompass one or more Web-pages or Web-page portions (e.g., via any suitable encoding, such as HyperText Markup Language (“HTML”), Dynamic HyperText Markup Language (“DHTML”), Extensible Markup Language (“XML”), JavaServer Pages (“JSP”), Active Server Pages (“ASP”), Cold Fusion, or any other suitable approaches).
  • Although the application is described herein as being implemented on a user computer and/or server, this is only illustrative. The application may be implemented on any suitable platform (e.g., a personal computer (“PC”), a mainframe computer, a dumb terminal, a data display, a two-way pager, a wireless terminal, a portable telephone, a portable computer, a palmtop computer, an H/PC, an automobile PC, a laptop computer, a cellular phone, a personal digital assistant (“PDA”), a combined cellular phone and PDA, etc.) to provide such features.
  • Accordingly, methods, systems, and media for recommending content items based on topics are provided.
  • Although the disclosed subject matter has been described and illustrated in the foregoing illustrative implementations, it is understood that the present disclosure has been made only by way of example, and that numerous changes in the details of implementation of the disclosed subject matter can be made without departing from the spirit and scope of the disclosed subject matter. Features of the disclosed implementations can be combined and rearranged in various ways.

Claims (18)

What is claimed is:
1. A method for recommending content items, the method comprising:
retrieving, using a hardware processor, a first plurality of content items associated with a user account, wherein each of the first plurality of content items is associated with a plurality of topics;
applying, using the hardware processor, to a second plurality of content items, a user interest model of interactions between the plurality of topics and the first plurality of content items to a second plurality of content items, wherein the application of the user interest model provides a probability that a user of the user account selects a content item from the second plurality of content items for presentation based on a plurality of related topics associated with the plurality of topics and user interest information associated with the user account using at least a portion of the plurality of related topics; and
selecting, using the hardware processor, at least one of the plurality of content items to recommend to the user of the user account based on the determined probabilities.
2. The method of claim 1, wherein the user interest model is generated by (i) determining the plurality of related topics associated with the plurality of topics from the plurality of accessed content items, (ii) generating the user interest information associated with the user account using at least a portion of the plurality of related topics, (iii) determining similarities between the user interest information associated with the user account and user interest information of other users accounts including the at least a portion of the plurality of related topics associated with the user account, and (iv) determining a conjunction of between the similarities and the plurality of accessed content items.
3. The method of claim 1, wherein the user interest model is generated by (i) selecting a subset of the plurality of topics associated with each of the plurality of content items, wherein the subset of the plurality of topics is selected based on a weight assigned to each of the plurality of topics, and (ii) determining a conjunction that models interaction between the subset of the plurality of topics and the plurality of content items.
4. The method of claim 1, wherein user interest model is generated by (i) determining related topics for each of the plurality of topics, wherein a distance value between a topic and a related topic is calculated, (ii) determining a plurality of topic clusters, wherein one or more of the plurality of topics and one or more of the related topics are placed in a topic cluster based on the distance value, (iii) mapping the user interest information to at least one of the plurality of topic clusters to obtain user cluster features, and (iv) determining a conjunction that models interaction between the user cluster features and the plurality of content items.
5. The method of claim 1, wherein the user interest model is generated by (i) generating a decision tree, wherein a portion of the decision tree identifies which of the user interest information of other user accounts is similar to the user interest information of the user account, (ii) determining a subset of the plurality of topics based on the decision tree, and (iii) determining a conjunction that models interaction between the subset of the plurality of topics and the plurality of content items.
6. The method of claim 1, further comprising ranking the second plurality of content items based on the determined probabilities, wherein the at least one of the second plurality of content items is selected based on the ranked plurality of content items.
7. A system for recommending content items, the system comprising:
a hardware processor that:
retrieves a first plurality of content items associated with a user account, wherein each of the first plurality of content items is associated with a plurality of topics;
applies, to a second plurality of content items, a user interest model of interactions between the plurality of topics and the first plurality of content items to a second plurality of content items, wherein the application of the user interest model provides a probability that a user of the user account selects a content item from the second plurality of content items for presentation based on a plurality of related topics associated with the plurality of topics and user interest information associated with the user account using at least a portion of the plurality of related topics; and
selects at least one of the plurality of content items to recommend to the user of the user account based on the determined probabilities.
8. The system of claim 7, wherein the user interest model is generated by (i) determining the plurality of related topics associated with the plurality of topics from the plurality of accessed content items, (ii) generating the user interest information associated with the user account using at least a portion of the plurality of related topics, (iii) determining similarities between the user interest information associated with the user account and user interest information of other users accounts including the at least a portion of the plurality of related topics associated with the user account, and (iv) determining a conjunction of between the similarities and the plurality of accessed content items.
9. The system of claim 7, wherein the user interest model is generated by (i) selecting a subset of the plurality of topics associated with each of the plurality of content items, wherein the subset of the plurality of topics is selected based on a weight assigned to each of the plurality of topics, and (ii) determining a conjunction that models interaction between the subset of the plurality of topics and the plurality of content items.
10. The system of claim 7, wherein user interest model is generated by (i) determining related topics for each of the plurality of topics, wherein a distance value between a topic and a related topic is calculated, (ii) determining a plurality of topic clusters, wherein one or more of the plurality of topics and one or more of the related topics are placed in a topic cluster based on the distance value, (iii) mapping the user interest information to at least one of the plurality of topic clusters to obtain user cluster features, and (iv) determining a conjunction that models interaction between the user cluster features and the plurality of content items.
11. The system of claim 7, wherein the user interest model is generated by (i) generating a decision tree, wherein a portion of the decision tree identifies which of the user interest information of other user accounts is similar to the user interest information of the user account, (ii) determining a subset of the plurality of topics based on the decision tree, and (iii) determining a conjunction that models interaction between the subset of the plurality of topics and the plurality of content items.
12. The system of claim 7, wherein the hardware processor is further configured to rank the second plurality of content items based on the determined probabilities, wherein the at least one of the second plurality of content items is selected based on the ranked plurality of content items.
13. A non-transitory computer-readable medium containing computer-executable instructions that, when executed by a processor, cause the processor to perform a method for recommending content items, the method comprising:
retrieving a first plurality of content items associated with a user account, wherein each of the first plurality of content items is associated with a plurality of topics;
applying, to a second plurality of content items, a user interest model of interactions between the plurality of topics and the first plurality of content items to a second plurality of content items, wherein the application of the user interest model provides a probability that a user of the user account selects a content item from the second plurality of content items for presentation based on a plurality of related topics associated with the plurality of topics and user interest information associated with the user account using at least a portion of the plurality of related topics; and
selecting at least one of the plurality of content items to recommend to the user of the user account based on the determined probabilities.
14. The non-transitory computer-readable medium of claim 13, wherein the user interest model is generated by (i) determining the plurality of related topics associated with the plurality of topics from the plurality of accessed content items, (ii) generating the user interest information associated with the user account using at least a portion of the plurality of related topics, (iii) determining similarities between the user interest information associated with the user account and user interest information of other users accounts including the at least a portion of the plurality of related topics associated with the user account, and (iv) determining a conjunction of between the similarities and the plurality of accessed content items.
15. The non-transitory computer-readable medium of claim 13, wherein the user interest model is generated by (i) selecting a subset of the plurality of topics associated with each of the plurality of content items, wherein the subset of the plurality of topics is selected based on a weight assigned to each of the plurality of topics, and (ii) determining a conjunction that models interaction between the subset of the plurality of topics and the plurality of content items.
16. The non-transitory computer-readable medium of claim 13, wherein user interest model is generated by (i) determining related topics for each of the plurality of topics, wherein a distance value between a topic and a related topic is calculated, (ii) determining a plurality of topic clusters, wherein one or more of the plurality of topics and one or more of the related topics are placed in a topic cluster based on the distance value, (iii) mapping the user interest information to at least one of the plurality of topic clusters to obtain user cluster features, and (iv) determining a conjunction that models interaction between the user cluster features and the plurality of content items.
17. The non-transitory computer-readable medium of claim 13, wherein the user interest model is generated by (i) generating a decision tree, wherein a portion of the decision tree identifies which of the user interest information of other user accounts is similar to the user interest information of the user account, (ii) determining a subset of the plurality of topics based on the decision tree, and (iii) determining a conjunction that models interaction between the subset of the plurality of topics and the plurality of content items.
18. The non-transitory computer-readable medium of claim 13, wherein the method further comprises ranking the second plurality of content items based on the determined probabilities, wherein the at least one of the second plurality of content items is selected based on the ranked plurality of content items.
US15/384,692 2012-12-31 2016-12-20 Methods, systems, and media for recommending content items based on topics Abandoned US20170103343A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US15/384,692 US20170103343A1 (en) 2012-12-31 2016-12-20 Methods, systems, and media for recommending content items based on topics

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US13/731,266 US9129227B1 (en) 2012-12-31 2012-12-31 Methods, systems, and media for recommending content items based on topics
US14/816,866 US9552555B1 (en) 2012-12-31 2015-08-03 Methods, systems, and media for recommending content items based on topics
US15/384,692 US20170103343A1 (en) 2012-12-31 2016-12-20 Methods, systems, and media for recommending content items based on topics

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US14/816,866 Continuation US9552555B1 (en) 2012-12-31 2015-08-03 Methods, systems, and media for recommending content items based on topics

Publications (1)

Publication Number Publication Date
US20170103343A1 (en) 2017-04-13

Family

ID=54012609

Family Applications (3)

Application Number Title Priority Date Filing Date
US13/731,266 Active 2033-09-04 US9129227B1 (en) 2012-12-31 2012-12-31 Methods, systems, and media for recommending content items based on topics
US14/816,866 Active US9552555B1 (en) 2012-12-31 2015-08-03 Methods, systems, and media for recommending content items based on topics
US15/384,692 Abandoned US20170103343A1 (en) 2012-12-31 2016-12-20 Methods, systems, and media for recommending content items based on topics

Country Status (1)

Country Link
US (3) US9129227B1 (en)

Cited By (34)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160156579A1 (en) * 2014-12-01 2016-06-02 Google Inc. Systems and methods for estimating user judgment based on partial feedback and applying it to message categorization
US20160198005A1 (en) * 2015-01-05 2016-07-07 Facebook, Inc. Recommending objects to a social networking system user based in part on topics associated with the objects
US20170011112A1 (en) * 2014-01-30 2017-01-12 Microsoft Technology Licensing, Llc Entity page generation and entity related searching
US20170024391A1 (en) * 2015-07-23 2017-01-26 Netflix, Inc. Gaussian ranking using matrix factorization
US20170242849A1 (en) * 2016-02-24 2017-08-24 Yen4Ken, Inc. Methods and systems for extracting content items from content
CN107391760A (en) * 2017-08-25 2017-11-24 平安科技(深圳)有限公司 User interest recognition methods, device and computer-readable recording medium
WO2019010424A1 (en) * 2017-07-07 2019-01-10 Alibaba Group Holding Limited Data processing system, method, and device
US10387115B2 (en) 2015-09-28 2019-08-20 Yandex Europe Ag Method and apparatus for generating a recommended set of items
US10387513B2 (en) 2015-08-28 2019-08-20 Yandex Europe Ag Method and apparatus for generating a recommended content list
US10394420B2 (en) 2016-05-12 2019-08-27 Yandex Europe Ag Computer-implemented method of generating a content recommendation interface
US10430481B2 (en) 2016-07-07 2019-10-01 Yandex Europe Ag Method and apparatus for generating a content recommendation in a recommendation system
US10452731B2 (en) 2015-09-28 2019-10-22 Yandex Europe Ag Method and apparatus for generating a recommended set of items for a user
RU2714594C1 (en) * 2018-09-14 2020-02-18 Общество С Ограниченной Ответственностью "Яндекс" Method and system for determining parameter relevance for content items
US10623709B2 (en) * 2018-08-31 2020-04-14 Disney Enterprises, Inc. Video color propagation
USD882600S1 (en) 2017-01-13 2020-04-28 Yandex Europe Ag Display screen with graphical user interface
US10706325B2 (en) 2016-07-07 2020-07-07 Yandex Europe Ag Method and apparatus for selecting a network resource as a source of content for a recommendation system
US10783206B2 (en) * 2016-07-07 2020-09-22 Tencent Technology (Shenzhen) Company Limited Method and system for recommending text content, and storage medium
US20210203623A1 (en) * 2018-06-25 2021-07-01 Microsoft Technology Licensing, Llc Topic guiding in a conversation
US11089034B2 (en) * 2018-12-10 2021-08-10 Bitdefender IPR Management Ltd. Systems and methods for behavioral threat detection
US11086888B2 (en) 2018-10-09 2021-08-10 Yandex Europe Ag Method and system for generating digital content recommendation
US11153332B2 (en) 2018-12-10 2021-10-19 Bitdefender IPR Management Ltd. Systems and methods for behavioral threat detection
US11263217B2 (en) 2018-09-14 2022-03-01 Yandex Europe Ag Method of and system for determining user-specific proportions of content for recommendation
US11276076B2 (en) 2018-09-14 2022-03-15 Yandex Europe Ag Method and system for generating a digital content recommendation
US11276079B2 (en) 2019-09-09 2022-03-15 Yandex Europe Ag Method and system for meeting service level of content item promotion
US11288333B2 (en) 2018-10-08 2022-03-29 Yandex Europe Ag Method and system for estimating user-item interaction data based on stored interaction data by using multiple models
US11323459B2 (en) 2018-12-10 2022-05-03 Bitdefender IPR Management Ltd. Systems and methods for behavioral threat detection
US11379901B2 (en) * 2018-01-10 2022-07-05 Beijing Sensetime Technology Development Co., Ltd Methods and apparatuses for deep learning-based recommendation, electronic devices, and media
US11418824B1 (en) 2021-01-26 2022-08-16 Synamedia Limited Approximated personalization for weakly connected devices
US11544340B2 (en) * 2020-12-15 2023-01-03 Docusign, Inc. Content item selection in a digital transaction management platform
US11562004B2 (en) * 2019-07-02 2023-01-24 Jpmorgan Chase Bank, N.A. Classifying and filtering platform data via k-means clustering
US11636335B2 (en) 2017-08-29 2023-04-25 Sky Cp Limited System and method for content discovery
US20230156289A1 (en) * 2021-11-12 2023-05-18 Disney Enterprises, Inc. Techniques for curating content items
US20240020345A1 (en) * 2018-06-25 2024-01-18 Meta Platforms, Inc. Semantic embeddings for content retrieval
US20240037154A1 (en) * 2022-07-27 2024-02-01 Dropbox, Inc. Seeding and generating suggested content collections

Families Citing this family (54)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8463053B1 (en) 2008-08-08 2013-06-11 The Research Foundation Of State University Of New York Enhanced max margin learning on multimodal data mining in a multimedia database
US10491694B2 (en) 2013-03-15 2019-11-26 Oath Inc. Method and system for measuring user engagement using click/skip in content stream using a probability model
US11244022B2 (en) * 2013-08-28 2022-02-08 Verizon Media Inc. System and methods for user curated media
US20150095303A1 (en) * 2013-09-27 2015-04-02 Futurewei Technologies, Inc. Knowledge Graph Generator Enabled by Diagonal Search
KR102158389B1 (en) * 2013-11-06 2020-09-21 삼성전자주식회사 Operating method of node considering packet characteristics in content centric network and the node
US20150170067A1 (en) * 2013-12-17 2015-06-18 International Business Machines Corporation Determining analysis recommendations based on data analysis context
US9471671B1 (en) * 2013-12-18 2016-10-18 Google Inc. Identifying and/or recommending relevant media content
US20150243279A1 (en) * 2014-02-26 2015-08-27 Toytalk, Inc. Systems and methods for recommending responses
US20150256566A1 (en) * 2014-03-06 2015-09-10 Call-It-Out, Inc. Project Collaboration
US10601749B1 (en) 2014-07-11 2020-03-24 Twitter, Inc. Trends in a messaging platform
US10592539B1 (en) * 2014-07-11 2020-03-17 Twitter, Inc. Trends in a messaging platform
US9575952B2 (en) 2014-10-21 2017-02-21 At&T Intellectual Property I, L.P. Unsupervised topic modeling for short texts
KR101605654B1 (en) * 2014-12-01 2016-04-04 서울대학교산학협력단 Method and apparatus for estimating multiple ranking using pairwise comparisons
US20160189036A1 (en) * 2014-12-30 2016-06-30 Cirrus Shakeri Computer automated learning management systems and methods
US10097648B2 (en) * 2015-02-27 2018-10-09 Rovi Guides, Inc. Methods and systems for recommending media content
US10341701B2 (en) * 2015-04-21 2019-07-02 Edge2020 LLC Clustering and adjudication to determine a recommendation of multimedia content
US10803391B2 (en) * 2015-07-29 2020-10-13 Google Llc Modeling personal entities on a mobile device using embeddings
CN105243143B (en) * 2015-10-14 2018-07-24 湖南大学 Recommendation method and system based on real-time phonetic content detection
US10331679B2 (en) * 2015-10-30 2019-06-25 At&T Intellectual Property I, L.P. Method and apparatus for providing a recommendation for learning about an interest of a user
EP3394769A4 (en) * 2015-12-21 2019-07-10 Particle Media, Inc. Method and system for exploring a personal interest space
US10762436B2 (en) * 2015-12-21 2020-09-01 Facebook, Inc. Systems and methods for recommending pages
US11675833B2 (en) * 2015-12-30 2023-06-13 Yahoo Assets Llc Method and system for recommending content
CN105678587B (en) * 2016-01-12 2020-11-24 腾讯科技(深圳)有限公司 Recommendation feature determination method, information recommendation method and device
US10432689B2 (en) * 2016-02-15 2019-10-01 Netflix, Inc. Feature generation for online/offline machine learning
CN107544981B (en) * 2016-06-25 2021-06-01 华为技术有限公司 Content recommendation method and device
CN107688956B (en) * 2016-08-05 2021-09-07 腾讯科技(深圳)有限公司 Information processing method and server
US11361242B2 (en) 2016-10-28 2022-06-14 Meta Platforms, Inc. Generating recommendations using a deep-learning model
US11205103B2 (en) 2016-12-09 2021-12-21 The Research Foundation for the State University Semisupervised autoencoder for sentiment analysis
US10558714B2 (en) * 2016-12-28 2020-02-11 Facebook, Inc. Topic ranking of content items for topic-based content feeds
US10462498B2 (en) * 2017-02-07 2019-10-29 The Directv Group, Inc. Providing options to live stream multimedia content
US20180232442A1 (en) * 2017-02-16 2018-08-16 International Business Machines Corporation Web api recommendations
CN107169012B (en) * 2017-03-31 2021-03-19 百度在线网络技术(北京)有限公司 POI recommendation method, device, equipment and computer readable storage medium
CN107454442B (en) * 2017-09-07 2021-02-05 阿里巴巴(中国)有限公司 Method and device for recommending video
CN107943895A (en) * 2017-11-16 2018-04-20 百度在线网络技术(北京)有限公司 Information-pushing method and device
US10803111B2 (en) * 2017-11-27 2020-10-13 Facebook, Inc. Live video recommendation by an online system
US20190187955A1 (en) * 2017-12-15 2019-06-20 Facebook, Inc. Systems and methods for comment ranking using neural embeddings
US11170006B2 (en) * 2018-01-03 2021-11-09 Facebook, Inc. Machine-learning model for ranking diverse content
CN108875071B (en) * 2018-07-05 2021-03-19 中北大学 Learning resource recommendation method based on multi-view interest
US11120067B2 (en) * 2018-07-17 2021-09-14 International Business Machines Corporation Present controlled heterogeneous digital content to users
CN109299321B (en) * 2018-08-31 2021-07-09 出门问问信息科技有限公司 Method and device for recommending songs
CN109543066B (en) * 2018-10-31 2021-04-23 北京达佳互联信息技术有限公司 Video recommendation method and device and computer-readable storage medium
CN109670077B (en) * 2018-11-01 2021-07-13 北京达佳互联信息技术有限公司 Video recommendation method and device and computer-readable storage medium
US10977297B1 (en) * 2018-12-12 2021-04-13 Facebook, Inc. Ephemeral item ranking in a graphical user interface
AU2019433967A1 (en) * 2019-03-12 2021-10-28 Citrix Systems, Inc. Intelligent file recommendation engine
US11675924B2 (en) * 2019-05-28 2023-06-13 SquelchAI, LLC Content aggregation system for intelligent searching of indexed content based on extracted security identifiers
USD912083S1 (en) 2019-08-01 2021-03-02 Facebook, Inc. Display screen or portion thereof with graphical user interface
US11797880B1 (en) 2019-08-27 2023-10-24 Meta Platforms, Inc. Systems and methods for digital content provision
US20210082471A1 (en) * 2019-09-17 2021-03-18 Facebook, Inc. Systems and methods for generating music recommendations
US11170270B2 (en) * 2019-10-17 2021-11-09 International Business Machines Corporation Automatic generation of content using multimedia
CA3097731A1 (en) * 2019-10-30 2021-04-30 Royal Bank Of Canada System and method for deep learning recommender
US11689507B2 (en) * 2019-11-26 2023-06-27 Adobe Inc. Privacy preserving document analysis
CN111259259B (en) * 2020-03-11 2021-03-30 郑州工程技术学院 University student news recommendation method, device, equipment and storage medium
US11921728B2 (en) * 2021-01-29 2024-03-05 Microsoft Technology Licensing, Llc Performing targeted searching based on a user profile
US20230214434A1 (en) * 2021-12-30 2023-07-06 Netflix, Inc. Dynamically generating a structured page based on user input

Family Cites Families (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6727914B1 (en) * 1999-12-17 2004-04-27 Koninklijke Philips Electronics N.V. Method and apparatus for recommending television programming using decision trees
US6981040B1 (en) * 1999-12-28 2005-12-27 Utopy, Inc. Automatic, personalized online information and product services
US20030066068A1 (en) * 2001-09-28 2003-04-03 Koninklijke Philips Electronics N.V. Individual recommender database using profiles of others
EP1860579A1 (en) * 2002-08-30 2007-11-28 Sony Deutschland Gmbh Method to split a multiuser profile
AU2003280158A1 (en) * 2002-12-04 2004-06-23 Koninklijke Philips Electronics N.V. Recommendation of video content based on the user profile of users with similar viewing habits
US20090006190A1 (en) * 2007-06-28 2009-01-01 Google Inc. Determining location-based commercial information
US8275764B2 (en) * 2007-08-24 2012-09-25 Google Inc. Recommending media programs based on media program popularity
JPWO2009122745A1 (en) * 2008-04-02 2011-07-28 パナソニック株式会社 Communication support device, communication support method, and communication support program
US20100114929A1 (en) * 2008-11-06 2010-05-06 Yahoo! Inc. Diverse query recommendations using clustering-based methodology
US9195739B2 (en) * 2009-02-20 2015-11-24 Microsoft Technology Licensing, Llc Identifying a discussion topic based on user interest information
JP5749279B2 (en) * 2010-02-01 2015-07-15 グーグル インコーポレイテッド Join embedding for item association
US8688781B2 (en) * 2010-08-26 2014-04-01 Tarik TALEB System and method for creating multimedia content channel customized for social network
US20120102121A1 (en) * 2010-10-25 2012-04-26 Yahoo! Inc. System and method for providing topic cluster based updates
US9275001B1 (en) * 2010-12-01 2016-03-01 Google Inc. Updating personal content streams based on feedback
US8751427B1 (en) * 2011-01-05 2014-06-10 Google Inc. Location-centric recommendation service for users
US8260117B1 (en) * 2011-07-26 2012-09-04 Ooyala, Inc. Automatically recommending content
WO2013138969A1 (en) * 2012-03-17 2013-09-26 Beijing Haipu Wangju Technology Limited Method and system for recommending content to a user

Cited By (45)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170011112A1 (en) * 2014-01-30 2017-01-12 Microsoft Technology Licensing, Llc Entity page generation and entity related searching
US10437859B2 (en) * 2014-01-30 2019-10-08 Microsoft Technology Licensing, Llc Entity page generation and entity related searching
US20160156579A1 (en) * 2014-12-01 2016-06-02 Google Inc. Systems and methods for estimating user judgment based on partial feedback and applying it to message categorization
US10027765B2 (en) * 2015-01-05 2018-07-17 Facebook, Inc. Recommending objects to a social networking system user based in part on topics associated with the objects
US20160198005A1 (en) * 2015-01-05 2016-07-07 Facebook, Inc. Recommending objects to a social networking system user based in part on topics associated with the objects
US20170024391A1 (en) * 2015-07-23 2017-01-26 Netflix, Inc. Gaussian ranking using matrix factorization
US10180968B2 (en) * 2015-07-23 2019-01-15 Netflix, Inc. Gaussian ranking using matrix factorization
US10387513B2 (en) 2015-08-28 2019-08-20 Yandex Europe Ag Method and apparatus for generating a recommended content list
US10387115B2 (en) 2015-09-28 2019-08-20 Yandex Europe Ag Method and apparatus for generating a recommended set of items
US10452731B2 (en) 2015-09-28 2019-10-22 Yandex Europe Ag Method and apparatus for generating a recommended set of items for a user
US20170242849A1 (en) * 2016-02-24 2017-08-24 Yen4Ken, Inc. Methods and systems for extracting content items from content
US10394420B2 (en) 2016-05-12 2019-08-27 Yandex Europe Ag Computer-implemented method of generating a content recommendation interface
US10783206B2 (en) * 2016-07-07 2020-09-22 Tencent Technology (Shenzhen) Company Limited Method and system for recommending text content, and storage medium
US10706325B2 (en) 2016-07-07 2020-07-07 Yandex Europe Ag Method and apparatus for selecting a network resource as a source of content for a recommendation system
US10430481B2 (en) 2016-07-07 2019-10-01 Yandex Europe Ag Method and apparatus for generating a content recommendation in a recommendation system
USD892846S1 (en) 2017-01-13 2020-08-11 Yandex Europe Ag Display screen with graphical user interface
USD890802S1 (en) 2017-01-13 2020-07-21 Yandex Europe Ag Display screen with graphical user interface
USD980246S1 (en) 2017-01-13 2023-03-07 Yandex Europe Ag Display screen with graphical user interface
USD882600S1 (en) 2017-01-13 2020-04-28 Yandex Europe Ag Display screen with graphical user interface
USD892847S1 (en) 2017-01-13 2020-08-11 Yandex Europe Ag Display screen with graphical user interface
WO2019010424A1 (en) * 2017-07-07 2019-01-10 Alibaba Group Holding Limited Data processing system, method, and device
US10977447B2 (en) * 2017-08-25 2021-04-13 Ping An Technology (Shenzhen) Co., Ltd. Method and device for identifying a user interest, and computer-readable storage medium
WO2019037195A1 (en) * 2017-08-25 2019-02-28 平安科技(深圳)有限公司 Method and device for identifying interest of user, and computer-readable storage medium
CN107391760A (en) * 2017-08-25 2017-11-24 平安科技(深圳)有限公司 User interest recognition methods, device and computer-readable recording medium
US11636335B2 (en) 2017-08-29 2023-04-25 Sky Cp Limited System and method for content discovery
US11379901B2 (en) * 2018-01-10 2022-07-05 Beijing Sensetime Technology Development Co., Ltd Methods and apparatuses for deep learning-based recommendation, electronic devices, and media
US20240020345A1 (en) * 2018-06-25 2024-01-18 Meta Platforms, Inc. Semantic embeddings for content retrieval
US20210203623A1 (en) * 2018-06-25 2021-07-01 Microsoft Technology Licensing, Llc Topic guiding in a conversation
US10623709B2 (en) * 2018-08-31 2020-04-14 Disney Enterprises, Inc. Video color propagation
US11263217B2 (en) 2018-09-14 2022-03-01 Yandex Europe Ag Method of and system for determining user-specific proportions of content for recommendation
RU2714594C1 (en) * 2018-09-14 2020-02-18 Общество С Ограниченной Ответственностью "Яндекс" Method and system for determining parameter relevance for content items
US11276076B2 (en) 2018-09-14 2022-03-15 Yandex Europe Ag Method and system for generating a digital content recommendation
US10674215B2 (en) 2018-09-14 2020-06-02 Yandex Europe Ag Method and system for determining a relevancy parameter for content item
US11288333B2 (en) 2018-10-08 2022-03-29 Yandex Europe Ag Method and system for estimating user-item interaction data based on stored interaction data by using multiple models
US11086888B2 (en) 2018-10-09 2021-08-10 Yandex Europe Ag Method and system for generating digital content recommendation
US11153332B2 (en) 2018-12-10 2021-10-19 Bitdefender IPR Management Ltd. Systems and methods for behavioral threat detection
US11323459B2 (en) 2018-12-10 2022-05-03 Bitdefender IPR Management Ltd. Systems and methods for behavioral threat detection
US11089034B2 (en) * 2018-12-10 2021-08-10 Bitdefender IPR Management Ltd. Systems and methods for behavioral threat detection
US11562004B2 (en) * 2019-07-02 2023-01-24 Jpmorgan Chase Bank, N.A. Classifying and filtering platform data via k-means clustering
US11276079B2 (en) 2019-09-09 2022-03-15 Yandex Europe Ag Method and system for meeting service level of content item promotion
US11544340B2 (en) * 2020-12-15 2023-01-03 Docusign, Inc. Content item selection in a digital transaction management platform
US11418824B1 (en) 2021-01-26 2022-08-16 Synamedia Limited Approximated personalization for weakly connected devices
US11785270B2 (en) 2021-01-26 2023-10-10 Synamedia Limited Approximated personalization for weakly connected devices
US20230156289A1 (en) * 2021-11-12 2023-05-18 Disney Enterprises, Inc. Techniques for curating content items
US20240037154A1 (en) * 2022-07-27 2024-02-01 Dropbox, Inc. Seeding and generating suggested content collections

Also Published As

Publication number Publication date
US9129227B1 (en) 2015-09-08
US9552555B1 (en) 2017-01-24

Similar Documents

Publication Title
US9552555B1 (en) Methods, systems, and media for recommending content items based on topics
US11645301B2 (en) Cross media recommendation
TWI636416B (en) Method and system for multi-phase ranking for content personalization
US9235853B2 (en) Method for recommending musical entities to a user
US8589434B2 (en) Recommendations based on topic clusters
US20200226133A1 (en) Knowledge map building system and method
JP2021103543A (en) Use of machine learning for recommending live-stream content
CN111241311A (en) Media information recommendation method and device, electronic equipment and storage medium
WO2017035519A1 (en) Supervised learning based recommendation system
GB2545311A (en) Attribute weighting for media content-based recommendation
Yu et al. TIIREC: A tensor approach for tag-driven item recommendation with sparse user generated content
US20140074828A1 (en) Systems and methods for cataloging consumer preferences in creative content
Chen RETRACTED ARTICLE: Research on personalized recommendation algorithm based on user preference in mobile e-commerce
US20220107978A1 (en) Method for recommending video content
WO2023231542A1 (en) Representation information determination method and apparatus, and device and storage medium
US9213745B1 (en) Methods, systems, and media for ranking content items using topics
US11727221B2 (en) Dynamic correlated topic model
Loizou How to recommend music to film buffs: enabling the provision of recommendations from multiple domains
US11841914B2 (en) System and method for topological representation of commentary
Alluhaidan Recommender System Using Collaborative Filtering Algorithm
JP6882534B2 (en) Identifying videos with inappropriate content by processing search logs
Harrando et al. Improving media content recommendation with automatic annotations
Clement et al. Impact of recommendation engine on video-sharing platform-YouTube
US20230057792A1 (en) Training data fidelity for machine learning applications through intelligent merger of curated auxiliary data
Ge et al. Research challenges in multimedia recommender systems

Legal Events

Code Title Description
STPP Information on status: patent application and granting procedure in general Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION
AS Assignment Owner name: GOOGLE LLC, CALIFORNIA; Free format text: CHANGE OF NAME;ASSIGNOR:GOOGLE INC.;REEL/FRAME:044695/0115; Effective date: 20170929
STPP Information on status: patent application and granting procedure in general Free format text: NON FINAL ACTION MAILED
STPP Information on status: patent application and granting procedure in general Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER
STPP Information on status: patent application and granting procedure in general Free format text: FINAL REJECTION MAILED
STCV Information on status: appeal procedure Free format text: NOTICE OF APPEAL FILED
STPP Information on status: patent application and granting procedure in general Free format text: NON FINAL ACTION MAILED
STPP Information on status: patent application and granting procedure in general Free format text: FINAL REJECTION MAILED
STPP Information on status: patent application and granting procedure in general Free format text: RESPONSE AFTER FINAL ACTION FORWARDED TO EXAMINER
STPP Information on status: patent application and granting procedure in general Free format text: ADVISORY ACTION MAILED
STCB Information on status: application discontinuation Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION