WO2008045979A2 - Automated service for collecting and communicating data associated with user activity, and for selecting and propagating content and/or metadata - Google Patents

Automated service for collecting and communicating data associated with user activity, and for selecting and propagating content and/or metadata

Publication number
WO2008045979A2
Authority
WO
WIPO (PCT)
Prior art keywords
content
client device
data
user
metadata
Prior art date
Application number
PCT/US2007/081012
Other languages
English (en)
Other versions
WO2008045979A3 (fr
Inventor
Bill Messing
Michael Hyman
Jan S. Drake
Nils B. Lahr
Original Assignee
Ripl Corp.
Priority date
Filing date
Publication date
Application filed by Ripl Corp.
Publication of WO2008045979A2
Publication of WO2008045979A3

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3466 Performance evaluation by tracing or monitoring
    • G06F11/3476 Data logging
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3438 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment monitoring of user actions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3466 Performance evaluation by tracing or monitoring
    • G06F11/3495 Performance evaluation by tracing or monitoring for systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2201/00 Indexing scheme relating to error detection, to error correction, and to monitoring
    • G06F2201/875 Monitoring of systems including the internet

Definitions

  • provisional application number 60/850,841 entitled Automatic Activity Based Construction of a Persona Representation, filed on October 10, 2006
  • provisional application number 60/854,802 entitled Display of Contextual Advertising as a Form of User Generated Content
  • provisional application number 60/850,838 entitled Relevant Content Recommendation System, filed on October 10, 2006
  • the present invention relates generally to the fields of data processing and information technology. More specifically, embodiments of the present invention relate to a service for selecting and propagating content and/or metadata to client devices, applications of which include selecting and propagating user created content via the World Wide Web (WWW).
  • social networks on the Internet have become very popular in recent years.
  • Social networks typically consist of two main elements: 1) users; and 2) the content within the network, such as home pages and images, that the users come to the network to view.
  • content is typically produced (i.e. published) by users using a traditional publishing approach. That is, when a user has something he or she decides to share, the user uses the social network system to create (publish) the content—for example by writing a blog entry, by uploading an image, or by rearranging his or her home page.
  • This set of explicit actions lets a user construct a representation, available for others to view, of his or her personality and interests, or persona.
  • This approach allows for the display of a breadth of content, but it requires users to actively update their content in order to maintain the interest of viewers. Because updating content is labor-intensive for the publisher, sites typically have a very large difference between the number of people viewing and the number of people creating content, sometimes as much as 100:1. This means that the social network system must attract a very large number of people in order to have enough actively changing content to generate repeat traffic. Typically such social network systems have a large number of publishers who create an initial page and then rarely or never update it. Likewise, the abandonment rate of viewers is often high, because viewers must be dedicated in order to find new and interesting content. Thus, increased automation in publishing content and propagating it in a relevant manner would be desirable.
  • FIG. 1 illustrates an overview of various embodiments of the present invention
  • Figure 2 illustrates selected components of a content/metadata selection and propagation service, including selected operations, in accordance with various embodiments of the present invention
  • Figure 3 illustrates an example computer system suitable for use as a client device to practice various embodiments of the present invention
  • FIG. 4 illustrates selected operations for selecting relevant content employing multiple relevance analysis algorithms, in accordance with various embodiments
  • Figure 5 illustrates selected operations for selecting relevant content based on user activities on friend's client devices, in accordance with various embodiments
  • Figure 6 illustrates selected operations for selecting relevant content through a cosine similarity approach, in accordance with various embodiments
  • Figure 7 illustrates selected operations for selecting relevant content through a cosine similarity analysis of metadata, in accordance with various embodiments
  • Figure 8 illustrates selected operations for associating algorithm analysis results with content, in accordance with various embodiments
  • FIG. 9 illustrates selected operations for selecting relevant content through use of Bayesian network, in accordance with various embodiments.
  • Figure 10 illustrates selected operations for selecting relevant content by experimenting with "new" content, in accordance with various embodiments.
  • FIG. 11 illustrates selected components of a client device and user activity associated data collection operations performed thereon in further detail, in accordance with various embodiments of the present invention
  • Figure 12 illustrates selected components of a client device and relevant content publication and propagation related operations, in accordance with various embodiments of the present invention
  • Figure 13 illustrates an example computer system suitable for use as a client device to practice various embodiments of the present invention
  • Figures 14-15 illustrate application to the publication of persona representation in a social network, in accordance with various embodiments of the present invention.
  • Illustrative embodiments of the present invention include, but are not limited to, methods and apparatuses for receiving from client devices automatically collected user activities associated data, and for selecting and propagating content and/or metadata back to the client devices in a more efficient, flexible and effective (with high relevancy) manner.
  • the methods and apparatuses having particular application to selection and propagation of relevant user created content in a social network.
  • the phrase "in one embodiment" is used repeatedly. The phrase generally does not refer to the same embodiment; however, it may.
  • the terms "comprising," "having," and "including" are synonymous, unless the context dictates otherwise.
  • the phrase "A/B" means "A or B".
  • the phrase "A and/or B" means "(A), (B), or (A and B)".
  • the phrase "at least one of A, B and C" means "(A), (B), (C), (A and B), (A and C), (B and C) or (A, B and C)".
  • the phrase "(A) B" means "(B) or (A B)", that is, A is optional.
  • Figure 1 illustrates an overview of the present invention, in accordance with various embodiments. Illustrated therein are a number of client devices 102, a content/metadata selection and propagation service 104, and a number of content/metadata providers 108 coupled to each other via network 106.
  • Service 104 is endowed with the teachings of the present invention to receive from client devices 102 automatically collected user activities related data, and in response, to select and propagate relevant content/metadata back to client devices 102. More specifically, for the embodiments, content/metadata selection and propagation service 104 is endowed with a core data collection and management service 122 and a core content/metadata selection service 124.
  • Core data collection and management service 122 is configured to receive automatically collected user activities associated data from client devices 102.
  • the data may comprise both actively associated as well as passively associated data.
  • the data may be filtered/unfiltered, modified/unmodified, and/or analyzed/unanalyzed.
  • Core content/metadata selection service 124 is configured, in response, to select and propagate relevant content/metadata. Various embodiments of service 124 will be described in more detail below.
  • Content/metadata selection and propagation service 104 may be implemented on a single central computer or a collection of servers, e.g. a cluster of locally networked servers, or a system of distributed servers coupled via one or more local/wide area networks.
  • the various networks may comprise wired or wireless segments/domains.
  • content/metadata means content and/or metadata.
  • Content may be commercial or non-commercial in nature, may be public or private, and may be text, graphics, video, audio or multi-media in form.
  • Metadata may be a wide range of data describing technical and/or substantive attributes of the content. Accordingly, each of content/metadata providers may be any one of a wide range of such providers, including but not limited to a commercial or non-commercial website, a video and/or audio service, and so forth.
  • each client device 102 may be endowed with at least a client data collection and management service 112, a client content/metadata selection and propagation service 114 and a client content presentation service 116.
  • Services 112 and 114 may be configured complementarily to services 122 and 124. Various implementations of services 112, 114, 116, 122 and 124 will be described in turn below.
  • Each of client devices 102 may be any one of a broad range of computing or processor based devices known in the art or to be developed, including but not limited to, desktop computers, notebook computers, palm-sized hand-held computing devices, personal digital assistants, smart phones, game consoles, set top boxes, and so forth.
  • Network 106 may comprise one or more wired and/or wireless, local and/or wide area networks.
  • service 112 may comprise a number of data collection rules 1202, a number of event handlers 1210, a number of data filter and/or data modification rules 1214, data analysis modules 1216, a local client data store 1218, and a data reporter 1220, operatively coupled to each other as shown.
  • Data collection rules 1202 may comprise a number of rules to be applied to user activities 1204 on the client device to generate a number of user activities associated data 1206 and/or a number of trigger events 1208.
  • data collection rules 1202 may comprise internal as well as external data collection rules 1202.
  • Internal data collection rules 1202 are those locally installed on the client device to provide local data collection rules typically applicable only to the client device itself, whereas external data collection rules 1202 are those provided from an external source (e.g. content/metadata selection and propagation service 104) that typically apply to a number, a group or a family of client devices.
  • Internal data collection rules 1202 may be provided e.g. through a number of portable data media, such as diskettes, CDROM or flash drives, whereas external data collection rules 1202 may be provided e.g. through a network connection coupling the external source to the client device. Accordingly, data collection may be more flexible and may change over time.
  • user activities associated data 1206 are preferably comprised of actively associated as well as passively associated data.
  • actively associated data may include e.g. a user clicking or otherwise interacting with a presented content
  • passively associated data may include e.g. a user "mousing over" (but not interacting with) a presented content.
  • Event handlers 1210 are employed to create additional data that may be of interest for various trigger events 1208. Each of event handlers 1210 may be configured to handle one or more types of trigger events 1208. Event handler 1210 may e.g. be registered with an operating system service of the operating system environment of a client device to be notified of occurrences of one or more trigger events 1208.
  • Data filter and/or modification rules 1214 are configured to filter and/or modify the nominally collected data 1206 or other data of interest 1212 created by event handlers 1210, to streamline the amount of data eventually reported by data reporter 1220, enabling more efficient and effective data reporting.
  • Data analysis modules 1216 may perform a number of analyses, e.g. statistical analysis or modeling, to analyze, summarize or otherwise model the collected data, enabling data reporter 1220 to report the analysis results in lieu of the nominally collected or rolled up data, and selectively including the analyzed data only when necessary, to further streamline data reporting.
  • data reporter 1220 is configured to report the collected or created data, in a filtered or unfiltered, modified or unmodified, analyzed or unanalyzed manner, to content/metadata selection and propagation service 104.
  • data reporter 1220 may also be configured to report the collected or created data, in a filtered or unfiltered, modified or unmodified, analyzed or unanalyzed manner, to a peer client device 102.
  • the peer client device 102 may be a trusted peer client device.
  • data collection rules (internal and/or external) 1202 are applied to the observed user activities 1204 to generate user activities associated data (active or passive) 1206 and trigger events 1208.
  • appropriate ones of the event handlers 1210 are invoked to process applicable ones of the trigger events 1208 to create additional data of interest 1212.
  • Data filter and/or modification rules 1214 are then applied to data 1206 and 1212 to filter and/or modify the nominally collected/created user activity associated data.
  • the data, filtered/unfiltered and modified/unmodified, may be subjected to various client data analyses.
  • the data collected/created, filtered/unfiltered or modified/unmodified, as well as the analysis results, may be stored in client data store 1218, for reporting by data reporter 1220 in batch or in real time.
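  • The collection flow just described (rules applied to activities, handlers invoked for trigger events, filters applied, results stored for batch reporting) can be sketched roughly as follows. This is a minimal illustration only: the class name `ClientCollector`, the rule and handler signatures, and the event name `content_clicked` are assumptions made for this sketch and do not appear in the described system.

```python
from dataclasses import dataclass, field


@dataclass
class ClientCollector:
    """Hypothetical stand-in for client service 112 (names are illustrative)."""
    rules: list      # each rule: activity -> (data record or None, trigger event or None)
    handlers: dict   # trigger event name -> handler creating additional data
    filters: list    # predicates; a record is kept only if every predicate is True
    store: list = field(default_factory=list)  # stands in for the local data store

    def observe(self, activity: dict) -> None:
        records = []
        for rule in self.rules:
            data, event = rule(activity)
            if data is not None:
                records.append(data)
            # Invoke a handler only if one is registered for the trigger event.
            if event is not None and event in self.handlers:
                records.append(self.handlers[event](activity))
        for record in records:
            if all(keep(record) for keep in self.filters):
                self.store.append(record)

    def report(self) -> list:
        """Return everything stored so far as one batch and empty the store."""
        batch, self.store = self.store, []
        return batch


def example_rule(activity: dict):
    # A click is treated as actively associated data and raises a trigger event;
    # a mouse-over is treated as passively associated data with no event.
    if activity["kind"] == "click":
        return {"type": "active", "content": activity["content"]}, "content_clicked"
    if activity["kind"] == "mouse_over":
        return {"type": "passive", "content": activity["content"]}, None
    return None, None


def example_handler(activity: dict) -> dict:
    return {"type": "derived", "content": activity["content"]}
```

  • In this sketch a single rule may emit both a data record and a trigger event; whether the described rules behave that way is not stated, so that pairing is a design assumption.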
  • content/metadata selection and propagation service 114 may comprise a client message generation service 1302, a client pattern matching service 1304, various pattern analysis algorithms 1312, a client algorithm manager 1306, a client message queue 1308 and a client message service 1310, operatively coupled to each other as shown.
  • Client pattern matching service 1304 is configured to perform local client pattern detection, discerning patterns in user activities on client device, and/or relevancy between content consumed on client device and the user activities.
  • client pattern matching service 1304 performs the client pattern detection/determination, employing a number of locally maintained pattern analysis algorithms 1312.
  • Pattern analysis algorithms 1312 may be any one of such analysis algorithms known in the art or to be devised.
  • algorithms 1312 are maintained and managed by client algorithm manager 1306, which may manage the algorithms to be employed in coordination e.g. with content/metadata selection and propagation service 104, thereby enabling service 104 to influence pattern discernment, and in turn, content presentation on client device 102.
  • Client message generation service 1302 is configured to locally generate messages comprising content and/or metadata 1314, and to store them in client message queue 1308.
  • Content message merging service 1310 is configured to merge external messages 1318, e.g. those received from content/metadata selection and propagation service 104, with local messages 1314 to form merged messages 1318 for presentation service 116 to selectively present on client device 102.
  • external messages 1318 provided by content/metadata selection and propagation service 104 may be selected advertisement messages of particular relevance to client device 102.
  • content message merging service 1310 may also be configured to receive and merge external messages 1318, e.g. those received from a peer client device 102 with local message 1314.
  • content message merging service 1310 may also be configured to send the locally generated messages 1314 to other peer client devices 102.
  • core content/metadata selection and propagation service 124 may comprise a core message generation service 202, a core pattern matching service 204, various pattern analysis algorithms 212, and a core algorithm manager 206, operatively coupled to each other as shown.
  • Content message generation service 202 is configured to generate messages comprising content and/or metadata 208 for selection and propagation to the various client devices.
  • Core pattern matching service 204 is configured to perform pattern detection for client devices 102, discerning patterns from reported user activities 210 on client devices, and/or relevancy between content and the client devices.
  • core pattern matching service 204 performs the pattern detection and relevance determination for client devices, employing a number of pattern/relevance analysis algorithms 212.
  • Pattern analysis algorithms 212 may be any one of such analysis algorithms known in the art or to be devised. Examples of these pattern/relevance analysis algorithms 212 include, but are not limited to, a cosine similarity algorithm, a Bayesian network, and so forth. Preferably, however, the pattern/relevance analysis algorithms 212 complement each other, in that one algorithm's strength compensates, at least in part, for the weakness of another.
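  • As a rough illustration of the cosine similarity approach named above, the following sketch scores the similarity between two sparse weight vectors. How the described service builds its vectors is not stated here; representing a user's activity and a content item's metadata as dictionaries of term weights is an assumption of this sketch, as is the function name.

```python
import math


def cosine_similarity(a: dict, b: dict) -> float:
    """Cosine of the angle between two sparse vectors given as {term: weight}."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        # An empty or all-zero vector has no direction; treat it as unrelated.
        return 0.0
    return dot / (norm_a * norm_b)
```

  • With non-negative weights the result lies between 0 (no shared terms) and 1 (identical direction), so it can serve directly as a relevance score in that range.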
  • algorithms 212 are maintained and managed by core algorithm manager 206. In various embodiments, algorithm manager 206 also manages the algorithms to be employed for local pattern/relevance analysis on client devices 102.
  • the messages 208 are propagated to the client devices based on their relevance to the various client devices. In various embodiments, the messages 208 propagated to each client device are locally merged with messages locally generated on the particular client device 102 and presented on the client devices 102 respectively.
  • FIG. 3 illustrates an example computer system suitable for use as a client device or a server to practice various embodiments of the present invention.
  • computing system 300 includes a number of processors or processor cores 302, and system memory 304.
  • processors or processor cores may be considered synonymous, unless the context clearly requires otherwise.
  • computing system 300 includes mass storage devices 306 (such as diskette, hard drive, compact disc read only memory (CDROM) and so forth), input/output devices 308 (such as display, keyboard, cursor control and so forth) and communication interfaces 310 (such as network interface cards, modems and so forth).
  • the elements are coupled to each other via system bus 312, which represents one or more buses. In the case of multiple buses, they are bridged by one or more bus bridges (not shown).
  • system memory 304 and mass storage 306 may be employed to store a working copy and a permanent copy of the programming instructions implementing, in whole or in part, services 122 and 124 (core services), including the various components illustrated in Fig 2, or services 112-116 (client services), including the various components illustrated in Figs. 11-12, collectively denoted as 322.
  • the various components may be implemented by assembler instructions supported by processor(s) 302 or high-level languages, such as C, that can be compiled into such instructions.
  • the permanent copy of the programming instructions may be placed into permanent storage 306 in the factory, or in the field, through, for example, a distribution medium (not shown), such as a compact disc (CD), or through communication interface 310 (from a distribution server (not shown)). That is, one or more distribution media having an implementation of the agent program may be employed to distribute the agent and program various computing devices.
  • embodiments of the present invention may be practiced to automatically create a persona representation in a social network based on user activities, thus enabling the social network to propagate and present to each user of the system a set of constantly changing content that the user will likely find interesting (relevant).
  • the content may originate within the system or from external sources available to the system.
  • the content is published substantially automatically, based upon a broad set of discovery methods. These methods, in various embodiments, look at factors such as the person's social network, what music they are listening to, how they behave at one or more web sites, and so forth (user activities associated data).
  • These discovery methods implemented using e.g.
  • This social network could be embodied via a web site or via some other electronic mechanism. We will refer to the electronic mechanism by which the users interact as the "social network." The members will ideally listen to music, take photographs or browse through the social network. All of these are considered natural actions for users of the system. From the simple act of having friends and occasionally (or better yet frequently) interacting with the social network, the system is able to provide a constantly changing set of content. This content, in various embodiments, is delivered directly to the user's desktop in addition to their home page on the social network.
  • the social networking system implemented this way combines this constantly changing content with another innovation: the system exposes what the system is delivering to a person's desktop to anyone who visits the person's home page. For example, suppose that the system is showing user A content items 1, 2 and 3 on A's desktop. These items appear on user A's desktop as well as on user A's home page on the social network. If visitor B goes to user A's home page, visitor B will also see content items 1, 2 and 3. Suppose then that as user A interacts with incoming content, the system changes the content user A sees to content items 1, 5, and 10. When user B goes to user A's home page, user B will also see items 1, 5 and 10.
  • user A's persona page is constantly changing simply by the act of user A having had minimal interactions with content on the social network. What this means is a complete shift of the typical viewer-participant ratio.
  • everyone using the social network is a participant and is acting as a discovery engine that others can see.
  • the content that is shown to user B is processed through a set of permissions filters before being displayed. For example, suppose that content item 1 is marked as only visible for user A. The system will show items 1, 5 and 10 to user A. When user B visits user A's page on the social network, however, the system will only display items 5 and 10.
  • a content selection system for selecting material to display to the user based on social network activity which in this document we call the Relevant Content Service (implemented using e.g. services 114 and 124 of Fig. 1)
  • a Content Selection Service for selecting material that is published by a specific user (implemented using e.g. services 114 and 124 of Fig. 1)
  • a Rights Filtering Service (implemented using e.g. services 114 and 124 of Fig. 1)
  • a Content Metadata Store (implemented using e.g. services 114 and 124 of Fig. 1)
  • a Content Store (implemented using e.g. services 114 and 124 of Fig. 1)
  • a Data Collection Service (implemented using e.g. services 112 and 122 of Fig. 1)
  • a Content Merging Service (implemented using e.g. services 114 and 124 of Fig. 1)
  • the Relevant Content Service may be designed to accept a user ID as an input, and provide access to a content metadata store that provides information about all content in the system and all user interactions with that content. From that information, the Relevant Content Service returns a set of content IDs that would potentially be of interest to the user, each of which has a relevance score associated with it.
  • the content is selected at random from the entire set of content in the content metadata store, with each content having a relevancy score that ranges from 0 to 1, where the relevancy score may be e.g. the number of seconds from the current date back to the publication date of the content divided by the number of seconds from the current date back to the earliest publishing date of any content in the system.
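  • The date-based relevancy score described above can be written out as a short function. This is a sketch of the ratio exactly as worded; the function and parameter names are assumptions. Note that, as worded, the ratio is 0 for content published at the current instant and 1 for the oldest content in the system.

```python
from datetime import datetime


def date_relevancy(now: datetime, published: datetime, earliest: datetime) -> float:
    """Seconds since publication divided by seconds since the earliest publication."""
    span = (now - earliest).total_seconds()
    if span <= 0:
        # No elapsed time to compare against; return 0 rather than divide by zero.
        return 0.0
    return (now - published).total_seconds() / span
```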
  • content may be provided based on people that the user knows. That is, for a given user ID, say 7, the system would look for other users in the social network that user 7 knows. This set of users could be determined in a number of ways, such as looking at what users user 7 has invited to the social network, or looking at what users user 7 has interacted with on the social network. Call this set of users set 1.
  • the Relevant Content Service would then examine the content that has been uploaded by set 1.
  • the relevancy score could be based upon date ranges, as previously discussed, or based upon how often user 7 has interacted with a given user in set 1, or some combination thereof.
  • the Relevant Content Service divides the content uploaded by user 7 into two sections. One section would be content that was less than N days old (set A), where N is a value that can be altered within the system; the other section would be content that is greater than N days old (set B).
  • Given M items that the Relevant Content Service would like to return, it would attempt to select M/2 items at random from set A. If there are fewer than M/2 items in set A, then a smaller number of items will be selected from set A. We will designate the number of items selected from set A as P.
  • the Content Service would then attempt to select (M-P) items from set B.
  • the relevancy score could be based on date, as previously described.
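  • The two-set selection described above (up to M/2 random items from the recent set A, then up to M-P items from the older set B) might look like the sketch below. The function name, the use of integer day numbers, and the choice to place content exactly N days old in set B are assumptions; the text only specifies "less than" and "greater than" N days.

```python
import random


def select_recent_and_old(items, now_day, n_days, m, rng=None):
    """items: list of (content_id, published_day); returns up to m content IDs."""
    if rng is None:
        rng = random.Random()
    # Set A: content less than N days old. Set B: everything else (assumption:
    # content exactly N days old falls into set B).
    set_a = [cid for cid, day in items if now_day - day < n_days]
    set_b = [cid for cid, day in items if now_day - day >= n_days]
    from_a = rng.sample(set_a, min(m // 2, len(set_a)))
    p = len(from_a)  # P: the number actually selected from set A
    from_b = rng.sample(set_b, min(m - p, len(set_b)))
    return from_a + from_b
```

  • Passing a seeded `random.Random` instance as `rng` makes the selection repeatable, which is convenient for testing.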
  • the Rights Filter Service is also designed to take as input a user ID and a set of content IDs, and return the subset of content IDs that the user with the particular ID is allowed to see.
  • a relational database is created for storing rights information. Each record in the relational database would store a user ID, a content ID, and whether the user was explicitly denied access to the content item. For example, if User A is not allowed to see Content B, then there could be a record that contains the ID for User A and the ID for Content B.
  • Given a set of content IDs and a user ID, the Rights Filter Service can perform a query against the database, returning all content IDs from the set that do not have a corresponding record with that ID and the user ID.
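  • A minimal sketch of such a denial-record lookup, using Python's built-in sqlite3 module with an in-memory database. The table name `denied`, its column names, and the helper function names are assumptions for illustration. The sketch queries the matching denial records and subtracts them from the input set, which yields the same result as querying for IDs without a record.

```python
import sqlite3


def make_rights_db(denials):
    """Create an in-memory table with one row per (user, content) denial."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE denied (user_id TEXT, content_id TEXT)")
    conn.executemany("INSERT INTO denied VALUES (?, ?)", denials)
    return conn


def filter_allowed(conn, user_id, content_ids):
    """Return the content IDs, in input order, that user_id is not denied."""
    if not content_ids:
        return []
    placeholders = ",".join("?" for _ in content_ids)
    rows = conn.execute(
        "SELECT content_id FROM denied "
        f"WHERE user_id = ? AND content_id IN ({placeholders})",
        [user_id, *content_ids],
    ).fetchall()
    denied = {row[0] for row in rows}
    return [cid for cid in content_ids if cid not in denied]
```

  • For example, with a single denial record for user B and content item 1, filtering items 1, 5 and 10 for user B leaves items 5 and 10, while user A still sees all three.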
  • the Content Merging Service is designed to merge together content from many different sources, such as the Relevant Content Service content and the user uploaded content.
  • percentage targets are established for each source. For example, suppose that the Content Merging Service needs to return M items, and has sources 1, 2, 3. Suppose it is given targets of returning x% from source 1, y% from source 2, and the remaining from source 3. With such a system, the Content Merging Service would sort content from each source based on relevancy, and then attempt to select the top M*x% items from source 1. Since source 1 could have fewer than this many items, call the number of items that were selected P. The service would then attempt to select (M + (M*x% - P))*y% items from source 2. Call the number of items selected Q. The service would then attempt to select M-P-Q items from source 3.
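  • The three-source merge with percentage targets can be sketched as follows, using the quantities M, x, y, P and Q from the description. The function name is hypothetical, fractions are passed as decimals (0.5 for 50%), and rounding fractional targets down with `int()` is an assumption, since the description does not say how non-integer targets are handled.

```python
def merge_by_targets(sources, fractions, m):
    """sources: three lists of (content_id, relevancy); fractions: (x, y) as decimals."""
    # Sort each source by relevancy, highest first.
    ranked = [sorted(s, key=lambda item: item[1], reverse=True) for s in sources]
    x, y = fractions

    picked1 = ranked[0][: int(m * x)]
    p = len(picked1)  # P: may fall short of the target M*x

    # Target for source 2 as stated in the description: (M + (M*x - P)) * y.
    picked2 = ranked[1][: int((m + (m * x - p)) * y)]
    q = len(picked2)  # Q

    # Source 3 supplies the remainder, M - P - Q (never a negative count).
    picked3 = ranked[2][: max(m - p - q, 0)]
    return [cid for cid, _ in picked1 + picked2 + picked3]
```

  • When source 1 falls short, the shortfall raises the target for source 2, and source 3 then fills whatever remains of the M requested items.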
  • the Content Metadata Store may be designed to store information about all content in the system.
  • a relational database is employed.
  • the relational database may contain a table describing users, a table describing content, and a table describing interactions.
  • the table describing users would provide a unique ID for each user and any other information the system needed to store, such as email address.
  • the table describing content would store the type of the content, the ID of the user that published it (a foreign key to the user table), when it was published, a reference to where the content was actually stored (a foreign key to the content store) and other descriptive information about the content, such as the title or size.
  • the table describing interactions would store the ID of the user performing the interaction (a foreign key to the user table), the ID of the content with which the user interacted (a foreign key to the content table), the time of the interaction, and the type of interaction (such as viewed, rated, etc.). These tables can then be queried to satisfy requests such as:
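  • A minimal sketch of the three-table layout described above, using Python's built-in sqlite3 module. The table names, column names and the sample query are illustrative assumptions, not names taken from the described system.

```python
import sqlite3

SCHEMA = """
CREATE TABLE users (
    user_id INTEGER PRIMARY KEY,
    email   TEXT
);
CREATE TABLE content (
    content_id   INTEGER PRIMARY KEY,
    content_type TEXT,
    publisher_id INTEGER REFERENCES users(user_id),
    published_at TEXT,
    store_ref    TEXT,     -- where the content itself is kept in the content store
    title        TEXT,
    size_bytes   INTEGER
);
CREATE TABLE interactions (
    user_id     INTEGER REFERENCES users(user_id),
    content_id  INTEGER REFERENCES content(content_id),
    occurred_at TEXT,
    kind        TEXT       -- e.g. 'viewed' or 'rated'
);
"""


def open_metadata_store():
    """Create an empty in-memory metadata store with the three tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def content_viewed_by(conn, user_id):
    """Hypothetical sample query: IDs of content the given user has viewed."""
    rows = conn.execute(
        "SELECT DISTINCT content_id FROM interactions "
        "WHERE user_id = ? AND kind = 'viewed' ORDER BY content_id",
        (user_id,),
    )
    return [row[0] for row in rows]
```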
  • the Content Store may be designed to store the actual content.
  • a file system is used. Given a content ID, the file system can have a set of directories whose names correspond to each character in the content ID. The first N characters could be used for directories, and the remaining characters ignored. This enables the system to control how many items are stored in any particular directory. For example, if the system creates directories 4 levels deep, then an item with content ID 0192323 would be given the file name 0192323 and be stored in directory 0/1/9/2. Thus, the full path to the piece of content would be 0/1/9/2/0192323. The content store would return the path to the content item given a particular ID.
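  • The path scheme in this example can be reproduced with a few lines of Python. The function name and the default depth of 4 are assumptions taken from the worked example rather than from a stated interface.

```python
from pathlib import PurePosixPath


def content_path(content_id: str, depth: int = 4) -> str:
    """Build the storage path: one directory per leading character, then the ID."""
    if not content_id:
        raise ValueError("content_id must not be empty")
    directories = list(content_id[:depth])
    return str(PurePosixPath(*directories, content_id))


print(content_path("0192323"))  # prints 0/1/9/2/0192323
```

  • A smaller depth spreads items across fewer directories, so each directory holds more files; a larger depth does the opposite.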
  • the invention determines what to show User A. First, it calls the Relevant Content Service to get content for User B. This is passed to the Rights Filter service so that only content User A is allowed to see is returned. If User A is not the same as User B, then the system selects a set of content that has been uploaded by User B. This is passed to the Rights Filter so that only content that User A is allowed to see is returned. These two sets of content are merged together by the Content Merging Service and returned.
  • the process may begin at 1602, with User A coming to the social network and viewing the home page of User B.
  • the system determines whether User A and User B are the same user (1604). If User A and User B are the same user, then this means that User A is visiting his own page.
  • the system calls the Relevant Content Service to determine what to show the user (1616).
  • the Relevant Content Service, in response, examines content that has been uploaded by users of the social network and, by analyzing user activity, determines what content will be interesting for User A.
  • the Relevant Content Service retrieves its information from a metadata store (1632) which stores information about what content has been uploaded by users of the social network and what content and what home pages within the social network site have been viewed by users of the social network.
  • the metadata store can be implemented in various ways, such as with a relational database in which each content item, user and home page has a unique identifier, and in which a field code indicates an action. For example, if user A uploads content B, then a record can be entered in the database indicating that user A performed action upload on content B. Likewise, if user C views content B, a record can be entered indicating that user C performed action view on content B.
  • the Relevant Content Service also retrieves information from a Content Store (1634). This stores the actual content that the metadata service refers to.
  • the Content Store can be embodied in a variety of ways, such as a set of files in a file system or a set of binary data stored within a relational database.
  • a Rights Filter service (1618). The purpose of this service is to make sure that the content that is returned (1620) is content that User A is allowed to see.
  • the rights service can be created in any number of ways.
  • the Rights Filter could be embodied in a relational database, in which each record contains a user ID, a content ID, and a right.
  • the Rights Filter service can check the database to determine whether or not the user is allowed to see the content. After the Rights Filter service has removed items that User A is not allowed to see, the resulting set of content items is returned (1620).
  • the decision process moves to a different process.
  • we perform two operations. Similar to the step previously outlined, we call the Relevant Content Service to determine what to show User B (1610). Note that User A is looking at the page for User B. By calling the Relevant Content Service for User B (instead of User A), we are displaying to User A the content that we would normally show to User B.
  • the system then removes items from the result set that User A is not allowed to see (1612). This is similar to what was earlier described, only in this case we are determining what we would normally show User B, but then removing content that User A is not allowed to see.
  • the system shows User A items that User B has uploaded to the system (1606).
  • the system examines the Metadata Store (1632) to find content that User B has created.
  • the system divides the content that User B has created into two categories: recent and not-recent content.
  • the service for selecting a subset of User B's content selects a set of content from the recent category and a set from the not-recent category.
  • the recency is determined by looking at the metadata associated with the content.
  • the percentage of content that should be selected from the recent and not-recent set can be established in a variable so that the system or administrators of the system can alter the values.
  • the techniques used for selecting content from the recent and not-recent set could include stochastic sampling or relevancy algorithms as are used by the Relevant Content Service.
  • the Content Merging Service 1614 merges together the content that was selected by 1606 and 1610.
  • the merging process can be embodied in a variety of forms. For example, all content could be returned by returning the complete set of content returned by the selection processes 1606 and 1610. Or, the two sets could be stochastically sampled to return a smaller set. Or, the two sets could be merged and relevance sorted to return a smaller set. Or, the two sets could be relevance sorted individually and then sampled equally. There are many other embodiments as well. After the content is merged, the merged content is returned.
  • the above-described embodiments of the present invention may be practiced to provide relevant content to client devices in a social network, including content created by users of the client devices, thus enabling the social network to propagate and present to each user of the system a set of constantly changing content that the user will likely find interesting (relevant).
  • Figure 4 illustrates selected operations for selecting relevant content employing multiple analysis algorithms, in accordance with various embodiments.
  • a result queue for a client device may first be initialized, 402, and if not all analysis algorithms have been invoked, 406, the next relevant algorithm analysis is invoked, 410.
  • the analysis algorithms may be invoked in any arbitrary order.
  • the relevant algorithm analysis 410 returns a relevance score at completion of the analysis.
  • the relevance score is normalized by the importance/weight of the algorithm, and the result is stored into the content result queue, 414.
  • the content queue may be sorted by the content's relevance, 408.
  • the relevant content service may be designed such that additional relevance algorithms may be added at any time. Each relevance algorithm is given a unique identifier.
  • the relevant content service stores the relevance weight that each relevance algorithm provides for the content that the relevant content service surfaces, and records the resulting clickthrough rates on that content.
  • the relevant content service then back-propagates a score to the relevance algorithms that suggested the content, weighted by their relevance score. Thus, a relevance algorithm that gave high relevance to a piece of content that was clicked on will get a large bonus.
  • the relevant content service uses these weights as the weighting score discussed previously. As a result, relevance algorithms that are most effective for a particular user will gain increasing influence in selecting content for that user.
  • the relevant content service gives a score to the overall performance of each relevance algorithm across the entire set of users, and combines that score with the per-user score to determine actual weighting in the use of that algorithm for that particular user. This has the value of damping out spikes that might occur due to a very short term behavior pattern of a user. (E.g., the user might heavily click on one content base and overly highly weight a particular relevance algorithm.)
  • Figure 5 illustrates selected operations for selecting relevant content based on user activities on friends' client devices, in accordance with various embodiments.
  • the relevant content service may make the relevant predictions by looking at a user's social network, looping through all "friends" of the user, 506-538. From that, the relevant content service looks for content that the relevant content service can recommend, based upon both what people in the social network have recently uploaded, 520, as well as what people in the network have recently clicked on, 528.
  • the relevant content service weighs the values of the content based upon the strength of the connections between the user requesting content and the person who created or uploaded it, 534-538. Eventually, after sufficient relevant content has been accumulated, the relevant content service propagates the content to the client device 540.
  • the strength is a function of explicit statements such as
  • the function f could be any one of a number of functions with an "inversely proportional" behavior.
  • An example of such a function is 1/n².
  • the various embodiments assume that people in a social network have enough of a relationship that they will have some common interests or behaviors, but that this commonality drops off with distance (or degree of removal) in a non-linear fashion.
  • the relevant content service enhances the accuracy of the prediction with a clickstream-based cosine similarities model, Figure 6.
  • the relevant content service looks at content that the user has already responded to (with a clickthrough or positive vote or other such action) and performs a cosine similarities expansion on that content (known as a seed set) to create a new base of content (604-614).
  • This model looks at user behavior in aggregate to find content that other people who have responded to a particular seed set have responded to. This will, for example, identify correlations such as the fact that users who like Houses of the Holy often like Crossroads.
  • the relevant content identified through this approach is added to the selected relevant content, 616. At that point, the relevant content is again re-sorted by score, 618, and the selected relevant content may be propagated to the client device, 620.
  • the relevant content service additionally looks at metadata associated with content the user has responded to, in order to select relevant content, Figure 7.
  • the relevant content service looks at the tags on the content and performs a cosine similarities expansion on that tag set (704-720). This is good for suggesting that people who like things tagged "cat” often like things tagged "Siamese," and thus we can use content tagged "Siamese" as a source for people who have responded to things tagged cat.
  • the relevant content identified through this approach is added to the selected relevant content, 722.
  • the relevant content is re-sorted by score, 726, and the selected relevant content may be propagated to the client device, 728.
  • the process of Figure 8 may be employed to associate algorithm and relevant value pair to content.
  • a description vector may be initialized for each content, 802.
  • the analysis algorithms employed are looped through, 804-810: each algorithm is invoked at 804, its result vector metadata is obtained at 806, its analysis is performed at 808, and the corresponding algorithm metadata/result pair is placed into the content description vector at 810.
  • the process is repeated for all analysis algorithms 812.
  • the content description vectors are stored and indexed 814.
  • the relevant content service further employs a Bayesian system that analyzes a particular user's patterns to attempt to learn what might be useful to send them, Figure 9.
  • the relevant content service might determine that a particular user most often likes images that have a high red component.
  • the relevant content service extracts a number of properties (called dimensions) of objects, 902, 908 and 914, feeds the properties to a Bayesian network 904, 910 and 914, and determines their relevance, 906, 912 and 916.
  • These can be things such as parameters of a Daubechies wavelet compression for images, wordnet analysis for text, and what artists or genres a person listens to.
  • the relevant content service may use the weighting factors of the person's social network when the user hasn't performed enough interaction with the site. In the case of the person's social network not having enough activity, the relevant content service uses overall site activity to populate the weighting factors. If no relevant content is found, the relevant content service may return an empty set, 922. If relevant content is found, the relevant content may be propagated.
  • the relevant content service may additionally inject (e.g. randomly or pseudo-randomly) a set of content that hasn't yet been clicked on, and for which there is therefore no response data about it, into the queue into a mix of locations
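
The three metadata tables described above (users, content, interactions) can be sketched as a small relational schema. This is only an illustration, assuming SQLite as the database; the table names, column names, and the helper `views_by_user` are hypothetical and are not taken from the patent text, and the reference into the content store is modelled here as a plain text path.

```python
import sqlite3

# Hypothetical schema: column names are illustrative, not from the source.
SCHEMA = """
CREATE TABLE users (
    user_id INTEGER PRIMARY KEY,
    email   TEXT
);
CREATE TABLE content (
    content_id   INTEGER PRIMARY KEY,
    type         TEXT NOT NULL,
    publisher_id INTEGER NOT NULL REFERENCES users(user_id),
    published_at TEXT NOT NULL,   -- ISO 8601 timestamp
    store_ref    TEXT NOT NULL,   -- assumed: path returned by the content store
    title        TEXT,
    size_bytes   INTEGER
);
CREATE TABLE interactions (
    user_id     INTEGER NOT NULL REFERENCES users(user_id),
    content_id  INTEGER NOT NULL REFERENCES content(content_id),
    occurred_at TEXT NOT NULL,
    kind        TEXT NOT NULL     -- e.g. 'viewed', 'rated'
);
"""


def open_metadata_store() -> sqlite3.Connection:
    """Create an in-memory metadata store with the three tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def views_by_user(conn: sqlite3.Connection, user_id: int) -> list:
    """Return the IDs of content the user has viewed, oldest first."""
    rows = conn.execute(
        "SELECT content_id FROM interactions "
        "WHERE user_id = ? AND kind = 'viewed' ORDER BY occurred_at",
        (user_id,),
    )
    return [row[0] for row in rows]
```

A query such as `views_by_user(conn, 1)` is one example of the kind of request the tables could satisfy; any other query shape is equally possible.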
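
The file-system layout described above for the Content Store, where the first N characters of a content ID name nested directories, can be written as a short function. This is a minimal sketch; the function name `content_path` and the `depth` parameter are assumptions introduced for illustration.

```python
from pathlib import PurePosixPath


def content_path(content_id: str, depth: int = 4) -> str:
    """Map a content ID to its storage path.

    The first `depth` characters each become one directory level; the
    remaining characters are ignored for directory purposes, and the
    file itself keeps the full content ID as its name.
    """
    if len(content_id) < depth:
        raise ValueError("content ID is shorter than the directory depth")
    directories = list(content_id[:depth])
    return str(PurePosixPath(*directories, content_id))


# The example from the text: 4 levels deep, content ID 0192323.
print(content_path("0192323"))  # 0/1/9/2/0192323
```

Choosing a smaller or larger `depth` trades directory fan-out against the number of items that end up in any one directory.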
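
The split of User B's own content into recent and not-recent categories, with a configurable share drawn from each, might look like the following. This is a sketch under stated assumptions: the 7-day window, the 70% default share, the `(content_id, published_at)` tuple shape, and the use of uniform random sampling are all illustrative choices, since the text only says that the percentage is adjustable and that stochastic sampling is one possible technique.

```python
import random
from datetime import datetime, timedelta
from typing import List, Optional, Tuple

Item = Tuple[str, datetime]  # (content_id, published_at), assumed shape


def select_user_content(
    items: List[Item],
    now: datetime,
    recent_window: timedelta = timedelta(days=7),  # assumed threshold
    recent_fraction: float = 0.7,                  # administrator-tunable share
    total: int = 10,
    rng: Optional[random.Random] = None,
) -> List[Item]:
    """Sample a mix of recent and not-recent items uploaded by one user."""
    rng = rng or random.Random()
    recent = [item for item in items if now - item[1] <= recent_window]
    older = [item for item in items if now - item[1] > recent_window]
    # Never request more items than a category actually holds.
    n_recent = min(len(recent), round(total * recent_fraction))
    n_older = min(len(older), total - n_recent)
    return rng.sample(recent, n_recent) + rng.sample(older, n_older)
```

Keeping `recent_fraction` as a parameter mirrors the text's point that the system or its administrators can alter the values.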
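
The Figure 4 flow described above (initialize a result queue, invoke each analysis algorithm, scale its score by the algorithm's weight, then sort) can be sketched as follows. The names `rank_content`, `algorithms`, and `weights` are hypothetical. Two points are assumptions: "normalized by the weight" is read here as multiplication, and the per-algorithm results are combined by summation, which the text does not specify.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# A relevance algorithm maps a content ID to a raw relevance score.
Algorithm = Callable[[str], float]


def rank_content(
    candidates: Iterable[str],
    algorithms: Dict[str, Algorithm],
    weights: Dict[str, float],
) -> List[Tuple[str, float]]:
    """Score each candidate with every algorithm and sort by total score."""
    queue: List[Tuple[str, float]] = []        # the result queue (402)
    for content_id in candidates:
        total = 0.0
        for name, algorithm in algorithms.items():   # invocation order is arbitrary
            raw = algorithm(content_id)               # relevance score (410)
            total += weights.get(name, 1.0) * raw     # weighted result (414)
        queue.append((content_id, total))
    queue.sort(key=lambda entry: entry[1], reverse=True)  # sort by relevance (408)
    return queue
```

Because algorithms are keyed by a unique identifier, a new one can be added to the dictionary at any time without changing the loop.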
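
The connection-strength weighting described above, with 1/n² as the example function, can be illustrated by computing each person's degree of removal from the requesting user and applying the function to it. This sketch assumes that n is the number of hops in the friend graph, which is one reading of "distance (or degree of removal)"; the adjacency-dictionary representation and the function names are hypothetical.

```python
from collections import deque
from typing import Dict, Iterable


def degrees_of_separation(
    graph: Dict[str, Iterable[str]], start: str
) -> Dict[str, int]:
    """Breadth-first search: hop count from `start` to every reachable user."""
    distance = {start: 0}
    pending = deque([start])
    while pending:
        user = pending.popleft()
        for friend in graph.get(user, ()):
            if friend not in distance:
                distance[friend] = distance[user] + 1
                pending.append(friend)
    return distance


def connection_strength(n: int) -> float:
    """Example inverse function from the text: f(n) = 1 / n^2."""
    if n < 1:
        raise ValueError("degree of separation must be at least 1")
    return 1.0 / (n * n)
```

Under this reading a direct friend has strength 1.0 and a friend of a friend has strength 0.25, so the contribution of more distant users falls off non-linearly.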
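
The clickstream-based cosine similarity expansion of a seed set (Figure 6) can be sketched as below. This is one possible formulation, not necessarily the one the patent intends: each item is represented by the set of users who responded to it, similarity between two items is the cosine of their binary user vectors, and a candidate's score is the sum of its similarities to the seed items. All function and parameter names are assumptions.

```python
import math
from collections import defaultdict
from typing import Dict, List, Set, Tuple


def cosine(a: Set[str], b: Set[str]) -> float:
    """Cosine similarity of two binary vectors given as sets."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


def expand_seed_set(
    responses: Dict[str, Set[str]],  # user -> items the user responded to
    seed: Set[str],
    top_k: int = 5,
) -> List[Tuple[str, float]]:
    """Return non-seed items most similar to the seed set, best first."""
    users_by_item: Dict[str, Set[str]] = defaultdict(set)
    for user, items in responses.items():
        for item in items:
            users_by_item[item].add(user)

    scores: Dict[str, float] = {}
    for item, users in users_by_item.items():
        if item in seed:
            continue
        score = sum(
            cosine(users, users_by_item[s]) for s in seed if s in users_by_item
        )
        if score > 0:
            scores[item] = score
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
```

The same function could be applied to tags instead of items for the Figure 7 variant, by passing a mapping from content to its tag set; that reuse is a suggestion rather than something the text states.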

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Information Transfer Between Computers (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

In various embodiments, one or more servers are (collectively) provided with a central data collection and management service and a central content and/or metadata selection and propagation service. These services receive user activity associated data that is collected automatically from client devices and, in response, select and propagate content and/or metadata to the client device in a more cost-effective, more flexible and more efficient manner (with high relevance). In various embodiments, a client device is provided with a client data collection and management service, a client content and/or metadata selection and propagation service, and a client content presentation service, in order to automatically collect user activity associated data, to assist a content and/or metadata selection and propagation service, and to select and propagate content and/or metadata in a more cost-effective, more flexible and more efficient manner (with high relevance).
PCT/US2007/081012 2006-10-10 2007-10-10 Automated service for collecting and reporting data associated with user activity, and for selecting and propagating content and/or metadata WO2008045979A2 (fr)

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
US85083806P 2006-10-10 2006-10-10
US85084106P 2006-10-10 2006-10-10
US60/850,838 2006-10-10
US60/850,841 2006-10-10
US85480206P 2006-10-27 2006-10-27
US60/854,802 2006-10-27

Publications (2)

Publication Number Publication Date
WO2008045979A2 true WO2008045979A2 (fr) 2008-04-17
WO2008045979A3 WO2008045979A3 (fr) 2008-09-12

Family

ID=39283612

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2007/081012 WO2008045979A2 (fr) 2006-10-10 2007-10-10 Automated service for collecting and reporting data associated with user activity, and for selecting and propagating content and/or metadata

Country Status (1)

Country Link
WO (1) WO2008045979A2 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2545523A4 (fr) * 2010-03-11 2015-09-23 Microsoft Technology Licensing Llc Adaptable relevance methods for social activity streams

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6029195A (en) * 1994-11-29 2000-02-22 Herz; Frederick S. M. System for customized electronic identification of desirable objects
US20020087573A1 (en) * 1997-12-03 2002-07-04 Reuning Stephan Michael Automated prospector and targeted advertisement assembly and delivery system

Also Published As

Publication number Publication date
WO2008045979A3 (fr) 2008-09-12

Similar Documents

Publication Publication Date Title
US20080215581A1 (en) Content/metadata selection and propagation service to propagate content/metadata to client devices
Amato et al. Recognizing human behaviours in online social networks
AU2010204767B2 (en) Conditional incentive presentation, tracking and redemption
Dumais et al. Understanding user behavior through log data and analysis
US8682723B2 (en) Social analytics system and method for analyzing conversations in social media
US8856167B2 (en) System and method for context based query augmentation
US9405827B2 (en) Playlist generation of content gathered from multiple sources
US8060504B2 (en) Method and system for selecting content items to be presented to a viewer
CN100530177C (zh) 用于接收并响应知识互换查询的方法、系统和装置
US9070110B2 (en) Identification of unknown social media assets
US20080160490A1 (en) Seeking Answers to Questions
US20080140642A1 (en) Automated user activity associated data collection and reporting for content/metadata selection and propagation service
KR20130135977A (ko) 소셜 네트워크 내의 피드의 추적
JP2001142907A (ja) インターネット・プロファイリングシステム
WO2009085583A2 (fr) Systems and methods for attention ranking
WO2017158452A1 (fr) Abstract graphs derived from a social relationship graph
US9069880B2 (en) Prediction and isolation of patterns across datasets
US20160239533A1 (en) Identity workflow that utilizes multiple storage engines to support various lifecycles
JP2011514570A (ja) 集中型ソーシャル・ネットワーク応答追跡
Aizen et al. Traffic-based feedback on the web
Raban et al. Acting or reacting? Preferential attachment in a people‐tagging system
WO2008045979A2 (fr) Automated service for collecting and reporting data associated with user activity, and for selecting and propagating content and/or metadata
Markovets et al. Information Consolidation on Users of Social Networks to Determine Their Credibility.
Poniszewska-Marańda et al. Analyzing user profiles with the use of social API
Khan et al. Cloud service for assessment of news' Popularity in internet based on Google and Wikipedia indicators

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 07853925

Country of ref document: EP

Kind code of ref document: A2

NENP Non-entry into the national phase in:

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 07853925

Country of ref document: EP

Kind code of ref document: A2