WO2014201570A1 - System and method for analysing social network data - Google Patents



Publication number
WO2014201570A1
WO2014201570A1 PCT/CA2014/050586 CA2014050586W WO2014201570A1 WO 2014201570 A1 WO2014201570 A1 WO 2014201570A1 CA 2014050586 W CA2014050586 W CA 2014050586W WO 2014201570 A1 WO2014201570 A1 WO 2014201570A1
Authority
WO
WIPO (PCT)
Prior art keywords
topic
user
determining
users
list
Prior art date
Application number
PCT/CA2014/050586
Other languages
French (fr)
Inventor
Nicholas KOUDAS
Nilesh Bansal
Hao-Yu Cheng
Original Assignee
Marketwire L.P.
Sysomos Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Marketwire L.P., Sysomos Inc.
Publication of WO2014201570A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/20 - Natural language analysis
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 - Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 - Querying
    • G06F16/335 - Filtering based on additional data, e.g. user or group profiles

Definitions

  • Some bloggers on particular topics with a wide following are identified and are used to endorse or sponsor specific products. For example, advertisement space on a popular blogger's website is used to advertise related products and services.
  • Social network platforms are also used to influence groups of people. Examples of social network platforms include those known by the trade names Facebook, Twitter, LinkedIn and Pinterest. Popular or expert individuals within a social network platform can be used to market to other people. Quickly identifying popular or expert individuals becomes more difficult as the number of users within a social network grows. Furthermore, accurately identifying experts within a particular topic is difficult.
  • FIG. 1 is a schematic diagram of a server in communication with a computing device.
  • FIG. 2 is a flow diagram of an example embodiment of computer executable instructions for obtaining and storing social networking data.
  • FIG. 3 is a block diagram of example data components in an index store.
  • FIG. 4 is a block diagram of example data components in a profile store.
  • FIG. 5 is a schematic diagram of example user lists and a tally of the number of times a user is listed within different user lists.
  • FIG. 6 is a flow diagram of an example embodiment of computer executable instructions for determining topics in which a given user is considered an expert.
  • FIG. 7 is a flow diagram of an example embodiment of computer executable instructions for determining topics in which a given user is interested.
  • FIG. 8 is a flow diagram of an example embodiment of computer executable instructions for topic analysis.
  • FIG. 9 is a flow diagram of an example embodiment of computer executable instructions for searching for users in the index store that are considered experts in a topic.
  • FIG. 10 is a flow diagram of an example embodiment of computer executable instructions for processing links.
  • FIG. 11 is a flow diagram of an example embodiment of computer executable instructions for identifying experts in a first topic and that have interest in a second topic.
  • FIG. 12 is a flow diagram of an example embodiment of computer executable instructions for identifying users that have interest in a topic.
  • FIG. 13 is a flow diagram of an example embodiment of computer executable instructions for suggesting followers for a specific user account that have interest in a topic.
  • FIG. 14 is an example embodiment of a graphical user interface (GUI) for searching for users which are experts in a topic.
  • FIG. 15 is an example embodiment of a GUI for performing an advanced query using social network data.
  • FIG. 16 is an example embodiment of a GUI for displaying the results of a query.

DETAILED DESCRIPTION OF THE DRAWINGS

  • Social networking platforms include users who generate and post content for others to see, hear, etc.
  • Non-limiting examples of social networking platforms are Facebook, Twitter, LinkedIn and Pinterest. Currently known and future known social networking platforms may be used with principles described herein.
  • Social networking platforms can be used to market to, and advertise to, users of the platforms. It is recognized that it is difficult to identify users relevant to a given topic and, conversely, topics that are relevant to a given user. This includes identifying experts on a given topic as well as users who are interested in a given topic.
  • The proposed system and methods described herein are able to identify users who are experts on a topic, and are able to identify users with an interest in a topic.
  • The term "expert" refers to a user account that primarily produces and shares content related to a topic and has a wide following of users.
  • The term "follower", as used herein, refers to a first user account that follows a second user account, such that content posted by the second user account is published for the first user account to read, consume, etc. For example, when a first user follows a second user, the first user (i.e. the follower) will receive content posted by the second user.
  • A user with an "interest" in a particular topic herein refers to a user account that follows a number of experts in the particular topic. In some cases, a follower engages with the content posted by the other user (e.g. by sharing or reposting the content).
  • The proposed system and methods can be used to determine that experts in Topic A are also experts in one or more other topics (e.g. Topic B, Topic C, etc.).
  • Turning to FIG. 1, a schematic diagram of a proposed system is shown.
  • A server 100 is in communication with a computing device 101 over a network 102.
  • The server 100 obtains and analyzes social network data and provides results to the computing device 101 over the network.
  • The computing device 101 can receive user inputs through a GUI to control parameters for the analysis.
  • Social network data includes data about the users of the social network platform, as well as the content generated or organized, or both, by the users.
  • Non-limiting examples of social network data include the user account ID or user name, a description of the user or user account, the messages or other data posted by the user, connections between the user and other users, location information, etc.
  • An example of connections is a "user list", also herein called "list", which includes a name of the list, a description of the list, and one or more other users which the given user follows. The user list is created by the given user.
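The user list described above can be pictured as a small record. The following is a minimal sketch only; the class name `UserList` and its field names are illustrative assumptions and do not come from the described system.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class UserList:
    """Hypothetical record for a user list: who created it, its name,
    its description, and the users it names."""
    owner: str
    name: str
    description: str
    members: frozenset = field(default_factory=frozenset)


# Example values borrowed from the labels used in the description.
list_a = UserList(
    owner="User A",
    name="List A",
    description="Description A",
    members=frozenset({"User B", "User C", "User D"}),
)
```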
  • The server 100 includes a processor 103 and a memory device 104.
  • The server includes multiple quad-core processors, 96 gigabytes of main memory, and 12 terabytes of raw disk storage.
  • The memory device 104 or memory devices are solid state drives for increased read/write performance.
  • Multiple servers are used to implement the methods described herein.
  • Other currently known computing hardware or future known computing hardware is used, or both.
  • The server 100 also includes a communication device 105 to communicate via the network 102.
  • The network 102 may be a wired or wireless network, or both.
  • The server 100 also includes a GUI module 106 for displaying and receiving data via the computing device 101.
  • The server also includes: a social networking data module 107; an indexer module 108; a user account relationship module 109; an expert identification module 110; an interest identification module 111; a topic analytics module 112; a query module 113 to identify experts in Topic A that also have interests in Topic B; a query module 114 to identify users that have interests in Topic A; and a query module 115 to suggest followers that have interests in Topic A.
  • The server 100 also includes a number of databases, including a data store 116; an index store 117; a database for a social graph 118; a profile store 119; a database for expertise vectors 120; and a database for interest vectors 121.
  • The social networking data module 107 is used to receive a stream of social networking data. In an example embodiment, millions of new messages are delivered to the social networking data module 107 each day, and in real-time.
  • The social networking data received by the social networking data module 107 is stored in the data store 116.
  • The indexer module 108 performs an indexer process on the data in the data store 116 and stores the indexed data in the index store 117.
  • The indexed data in the index store 117 can be more easily searched, and the identifiers in the index store can be used to retrieve the actual data (e.g. full messages).
  • A social graph is also obtained from the social networking platform server, not shown, and is stored in the social graph database 118.
  • The social graph, when given a user as an input to a query, can be used to return all users following the queried user.
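The follower query on the social graph can be illustrated with an in-memory structure. This is a sketch under the assumption that follow relationships arrive as (follower, followed) pairs; the class `SocialGraph` and its method names are hypothetical and say nothing about how the social graph database 118 is actually stored.

```python
from collections import defaultdict


class SocialGraph:
    """Hypothetical in-memory social graph keyed by the followed user."""

    def __init__(self):
        # Maps a followed user to the set of users following that user.
        self._followers = defaultdict(set)

    def add_follow(self, follower, followed):
        self._followers[followed].add(follower)

    def followers_of(self, user):
        # Return a copy so callers cannot mutate the stored set.
        return set(self._followers.get(user, set()))


graph = SocialGraph()
graph.add_follow("alice", "bob")
graph.add_follow("celine", "bob")
```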
  • The profile store 119 stores meta data related to user profiles. Examples of profile related meta data include the aggregate number of followers of a given user, self-disclosed personal information of the given user, location information of the given user, etc. The data in the profile store 119 can be queried.
  • In an example embodiment, the user account relationship module 109 can use the social graph 118 and the profile store 119 to determine which users are following a particular user.
  • The expert identification module 110 is configured to identify the set of all user lists in which a user account is listed, called the expertise vector.
  • The expertise vector for a user is stored in the expertise vector database 120.
  • The interest identification module 111 is configured to identify topics of interest to a given user, called the interest vector.
  • The interest vector for a user is stored in the interest vector database 121.
  • The computing device 101 includes a communication device 122 to communicate with the server 100 via the network 102, a processor 123, a memory device 124, a display screen 125, and an Internet browser 126.
  • The GUI provided by the server 100 is displayed by the computing device 101 through the Internet browser 126.
  • An analytics application 127 is available on the computing device 101.
  • The GUI is displayed by the computing device through the analytics application 127.
  • The display device 125 may be part of the computing device (e.g. a mobile device, a tablet, a laptop, etc.) or may be separate from the computing device (e.g. a desktop computer).
  • Various user input devices (e.g. touch screen, roller ball, optical mouse, buttons, keyboard, microphone, etc.) can be used with the computing device 101.
  • any module or component exemplified herein that executes instructions may include or otherwise have access to computer readable media such as storage media, computer storage media, or data storage devices (removable and/or non-removable) such as, for example, magnetic disks, optical disks, or tape.
  • Computer storage media may include volatile and non-volatile, removable and non-removable media implemented in any method or technology for storage of information, such as computer readable instructions, data structures, program modules, or other data.
  • Examples of computer storage media include RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by an application, module, or both. Any such computer storage media may be part of the server 100 or computing device 101 or accessible or connectable thereto. Any application or module herein described may be implemented using computer readable/executable instructions that may be stored or otherwise held by such computer readable media.
  • Turning to FIG. 2, an example embodiment of computer executable instructions is shown for obtaining and storing social network data.
  • The server 100 obtains social networking data.
  • The data may be received as a stream of data, including messages and meta data, in real time.
  • This data is stored in the data store 116, for example, using a compressed row format (block 201).
  • Blocks 200 and 201 are used.
  • The social network data received by the social networking data module 107 is copied, and the copies of the social network data are stored across multiple servers. This facilitates parallel processing when analysing the social network data. In other words, it is possible for one server to analyse one aspect of the social network data, while another server analyses another aspect of the social network data.
  • The server 100 indexes the messages using an indexer process (block 202).
  • The indexer process is a separate process from the storage process that includes scanning the messages as they materialize in the data store 116.
  • The indexer process runs on a separate server by itself. This facilitates parallel processing.
  • The indexer process is, for example, a multi-threaded process that materializes a table of indexed data for each day, or for some other given time period.
  • The indexed data is outputted and stored in the index store 117 (block 204).
  • Each row in the table is a unique user account identifier and a corresponding list of all message identifiers that are produced that day, or that given time period.
  • Millions of rows of data can be read and written in the index store 117 each day, and this process can occur as new data is materialized or added to the data store 116.
  • A compressed row format is used in the index store 117.
  • Deadlocks are avoided by running relaxed transactional semantics, since this increases throughput across multiple threads when reading and writing the table. By way of background, a deadlock occurs when two or more tasks permanently block each other by each task having a lock on a resource which the other tasks are trying to lock.
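The per-day index table described above, with one row per user account identifier holding that day's message identifiers, can be sketched as a nested mapping. This is a single-threaded illustration only: the function name `build_daily_index` and the (message id, user id, day) tuple layout are assumptions, and the sketch does not model the compressed row format, the multi-threading or the transactional semantics.

```python
from collections import defaultdict
from datetime import date


def build_daily_index(messages):
    """Group message identifiers by day and then by user account identifier.

    `messages` is assumed to be an iterable of (message_id, user_id, day).
    """
    index = defaultdict(lambda: defaultdict(list))
    for message_id, user_id, day in messages:
        index[day][user_id].append(message_id)
    return index


sample_messages = [
    ("m1", "u1", date(2014, 6, 17)),
    ("m2", "u1", date(2014, 6, 17)),
    ("m3", "u2", date(2014, 6, 18)),
]
daily_index = build_daily_index(sample_messages)
```

The message identifiers in a row can then be used to fetch the full messages from the data store.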
  • The server 100 further obtains information about which user accounts follow other user accounts (block 203).
  • This process includes identifying profile related meta data and storing the same in the profile store (block 205).
  • An example of the profile store 119 shows that for each user account, there is associated profile related meta data.
  • The profile related meta data includes, for example, the aggregate number of followers of the user, self-disclosed personal information, location information, and user lists.
  • After the data is obtained and stored, it can be analyzed, for example, to identify experts and interests.
  • A user may have a list of other users which he or she may follow.
  • For example, User A has a list of User B, User C and User D, which User A follows.
  • The list has an associated list description (e.g. Description A).
  • In other words, User A believes that User B, User C and User D are experts or knowledgeable in Topic A.
  • Another user may have the same or similar list name and description (e.g. same or similar to List A, Description A), but may have different users listed than those by User A. For example, User E follows User B, User C and User G. In other words, User E believes that User B, User C and User G are experts or knowledgeable in Topic A.
  • Another user, User F may have the same or similar list name and description (e.g. same or similar to List A, Description A), but may have different users listed than those by User A. For example, User F follows User B, User H and User I, since User F believes these users are experts or knowledgeable in Topic A.
  • The server 100 can determine whether a user is considered an expert by other users. For example, User B is listed on three different lists related to Topic A; User C is listed on two different lists; and each of User D, User G, User H and User I is only listed on one list. Therefore, in this example, User B is considered the foremost expert in Topic A, followed by User C.
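The tally in this example can be reproduced with a counter over list memberships. The sketch below uses the lists of User A, User E and User F from the description; the variable names are illustrative.

```python
from collections import Counter

# Lists related to Topic A, keyed by the user who created each list.
lists_about_topic_a = {
    "User A": ["User B", "User C", "User D"],
    "User E": ["User B", "User C", "User G"],
    "User F": ["User B", "User H", "User I"],
}

# Count the number of distinct lists on which each user appears.
tally = Counter(
    member
    for members in lists_about_topic_a.values()
    for member in set(members)
)

# Users ordered from most listed to least listed.
ranked = tally.most_common()
```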
  • Turning to FIG. 6, an example embodiment of computer executable instructions is provided for determining topics for which a given user is considered an expert.
  • The server 100 obtains a set of lists in which the given user is listed.
  • The server 100 uses the set of lists to determine topics associated with the given user.
  • The server outputs the topics in which the given user is considered an expert.
  • These topics form the expertise vector of the given user. For example, if the user Alice is listed in Bob's fishing list, Celine's art list, and David's photography list, then Alice's expertise vector includes: fishing, art and photography.
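The Alice example can be sketched as follows. It assumes, purely for illustration, that each list is already labelled with a single topic; the function name `expertise_vector` and the (owner, topic, members) tuple layout are hypothetical.

```python
def expertise_vector(user, user_lists):
    """Return the topics of every list on which `user` appears.

    `user_lists` is assumed to be an iterable of (owner, topic, members).
    """
    return {topic for _owner, topic, members in user_lists if user in members}


example_lists = [
    ("Bob", "fishing", {"Alice", "Eve"}),
    ("Celine", "art", {"Alice"}),
    ("David", "photography", {"Alice", "Frank"}),
    ("Eve", "music", {"Frank"}),
]
alice_vector = expertise_vector("Alice", example_lists)
```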
  • The user lists are obtained by constantly crawling them, since the user lists are dynamically updated by users, and new lists are created often.
  • The user lists are processed using an Apache Lucene index.
  • The expertise vector of a given user is processed using the Lucene algorithm to populate the index of topics associated with the given user.
  • This index supports, for example, full Lucene query syntax, including phrase queries and Boolean logic.
  • Apache Lucene is an information retrieval software library that is suitable for full text indexing and searching. Lucene is also widely known for its use in the implementation of Internet search engines and local single-site searching. It can be appreciated that other currently known or future known searching and indexing algorithms can be used.
  • The computer executable instructions of FIG. 6 are implemented by module 110.
  • Turning to FIG. 7, an example embodiment of computer executable instructions is provided for determining topics in which a given user is interested.
  • The server 100 obtains ancillary users that the given user follows.
  • A number of instructions are performed, but specific to each ancillary user.
  • The server obtains a set of lists in which the ancillary user is listed (e.g. the expertise vector of the ancillary user).
  • The server uses the set of lists to determine topics associated with the ancillary user.
  • The outputs of block 704 are topics associated with the ancillary user (block 705).
  • Block 702 can simply call on the algorithm presented in FIG. 6, but being applied to each ancillary user.
  • The server combines the topics from all the ancillary users.
  • The combined topics form the output 707 of the topics of interest for the given user (e.g. the interest vector of the given user).
  • An alternative to the blocks 706 and 707 is to determine which topics are common, or most common amongst the ancillary users (block 708).
  • For example, a given user Alice follows ancillary users Bob, Celine and David.
  • Bob is considered an expert in fishing and photography (e.g. the expertise vector of Bob).
  • Celine is considered an expert in fishing, photography and art (e.g. the expertise vector of Celine).
  • David is considered an expert in fishing and music (e.g. the expertise vector of David). Therefore, since the topic of fishing is common amongst all the ancillary users, it is identified that Alice has an interest in the topic of fishing. Or, since photography is more common amongst the ancillary users (e.g. the second most common topic after fishing), then the topic of photography is also identified as a topic of interest for Alice. Since art and music are not common amongst the ancillary users, these topics are not considered to be topics of interest to Alice.
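Both variants, combining all topics and keeping only the common ones, can be sketched with one counter. The `min_count` threshold is an assumption introduced here to express "common amongst the ancillary users"; the description does not specify a cut-off, and the function name `interest_vector` is hypothetical.

```python
from collections import Counter


def interest_vector(followed_users, expertise, min_count=1):
    """Derive topics of interest from the expertise of followed users.

    With min_count=1 every topic is kept (the combined topics); a higher
    value keeps only topics shared by at least that many followed users.
    """
    counts = Counter(
        topic
        for user in followed_users
        for topic in expertise.get(user, set())
    )
    return {topic for topic, count in counts.items() if count >= min_count}


expertise_vectors = {
    "Bob": {"fishing", "photography"},
    "Celine": {"fishing", "photography", "art"},
    "David": {"fishing", "music"},
}
alice_follows = ["Bob", "Celine", "David"]
combined = interest_vector(alice_follows, expertise_vectors)
common = interest_vector(alice_follows, expertise_vectors, min_count=2)
```

With a threshold of two, the sketch keeps fishing and photography and drops art and music, matching the example above.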
  • Module 111 implements the computer executable instructions presented in FIG. 7.
  • The data from the expertise vector and the data from the interest vector are supplied to the Lucene algorithm for indexing.
  • Turning to FIG. 8, an example embodiment of computer executable instructions is provided for topic analysis. These instructions can, for example, be implemented by module 112.
  • The server 100 obtains a topic for querying.
  • The topic can be provided by a user through the GUI displayed by the computing device 101.
  • The topic may also come from another source.
  • The server searches for users in the index store 117 that are considered experts in the topic.
  • The experts determined in block 802 may be limited to the top n users (block 803).
  • A set of instructions 804 is executed for each expert identified in block 802.
  • The instructions include obtaining profile information of the expert using the profile store 119 (block 805) and obtaining messages sent from the expert using the index store 117 and the data store 116 (block 806).
  • Using the messages obtained from all the experts, the server 100 identifies: frequently used keywords, frequently used keyword pairs, frequently used hashtags, frequently used links (e.g. URLs), etc. (block 807). The server then causes this information, including the profile information of the experts, to be displayed using the GUI. It will be appreciated that the keywords, keyword pairs, hashtags and links can be ordered from most frequently used to least frequently used. The top n most frequent results will be displayed on the GUI. The identification of the keywords, keyword pairs, etc. can be done using currently known or future known semantic processing, including removing stop words.
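The frequency analysis of block 807 can be sketched with regular expressions and counters. Several choices below are assumptions made for illustration: the tiny stop word set, the treatment of a keyword pair as two adjacent keywords in one message, and the function name `summarize_messages`. The description leaves the semantic processing open.

```python
import re
from collections import Counter

STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "for", "on"}
URL_PATTERN = r"https?://\S+"
HASHTAG_PATTERN = r"#\w+"


def summarize_messages(messages, n=3):
    """Return the top-n keywords, keyword pairs, hashtags and links."""
    keywords, pairs, hashtags, links = Counter(), Counter(), Counter(), Counter()
    for text in messages:
        links.update(re.findall(URL_PATTERN, text))
        # Remove links first so URL fragments are not read as hashtags.
        no_links = re.sub(URL_PATTERN, " ", text)
        hashtags.update(tag.lower() for tag in re.findall(HASHTAG_PATTERN, no_links))
        plain = re.sub(HASHTAG_PATTERN, " ", no_links).lower()
        words = [w for w in re.findall(r"[a-z']+", plain) if w not in STOP_WORDS]
        keywords.update(words)
        # Assumption: a keyword pair is two adjacent keywords.
        pairs.update(zip(words, words[1:]))
    return {
        "keywords": keywords.most_common(n),
        "keyword_pairs": pairs.most_common(n),
        "hashtags": hashtags.most_common(n),
        "links": links.most_common(n),
    }


expert_messages = [
    "Social marketing tips #seo http://t.co/abc",
    "More social marketing for the win #SEO",
    "social media #seo http://t.co/abc",
]
summary = summarize_messages(expert_messages, n=1)
```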
  • The extraction or search for experts in block 802 can be performed using the Lucene index.
  • Turning to FIG. 9, example computer executable instructions are provided for implementing block 802.
  • The server 100 identifies users having Topic A (e.g. the topic being queried in FIG. 8) listed in their expertise vector.
  • The server determines which users appear on the highest number of lists associated with Topic A.
  • The top n users who appear on the highest number of lists are the experts of Topic A.
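The ranking of experts for a queried topic can be sketched as below. The case-insensitive substring match against the list name and description is an assumption standing in for the Lucene index query, and the function name `top_experts` and the dictionary keys are hypothetical.

```python
from collections import Counter


def top_experts(topic, user_lists, n):
    """Rank users by the number of topic-related lists naming them.

    `user_lists` is assumed to be an iterable of dicts with the keys
    'name', 'description' and 'members'.
    """
    needle = topic.lower()
    counts = Counter()
    for user_list in user_lists:
        text = (user_list["name"] + " " + user_list["description"]).lower()
        if needle in text:
            counts.update(set(user_list["members"]))
    return counts.most_common(n)


crawled_lists = [
    {"name": "Fishing pros", "description": "", "members": ["bob", "celine"]},
    {"name": "Outdoors", "description": "fishing and camping", "members": ["bob", "david", "celine"]},
    {"name": "Anglers", "description": "Best fishing accounts", "members": ["bob"]},
    {"name": "Art", "description": "painters", "members": ["david"]},
]
fishing_experts = top_experts("fishing", crawled_lists, n=2)
```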
  • Turning to FIG. 10, example computer executable instructions are provided for processing links (e.g. URLs).
  • The server 100 obtains a list of shortened URLs.
  • The server calls on a URL dereferencing algorithm that utilizes asynchronous input/output (IO). The server then outputs the list of unshortened URLs.
  • URLs in messages are often shortened (e.g. using bit.ly or t.co), and conducting analysis on these domains can be challenging since each shortened URL has to be unshortened, possibly multiple times.
  • The process described in FIG. 10 efficiently unshortens multiple URLs (e.g. thousands of URLs) in parallel on a single thread and in a short time frame (e.g. a second).
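Resolving many shortened URLs concurrently on one thread can be sketched with Python's `asyncio`. To keep the example runnable without a network, the lookup is simulated: `FAKE_REDIRECTS` and `resolve` are stand-ins introduced here, and a real resolver would follow HTTP redirects instead. The sketch also resolves each distinct URL only once, which is one possible way to avoid repeated unshortening.

```python
import asyncio

# Stand-in for the network; these mappings are invented for illustration.
FAKE_REDIRECTS = {
    "http://bit.ly/a": "http://example.com/article",
    "http://t.co/b": "http://example.org/post",
}


async def resolve(short_url):
    await asyncio.sleep(0.01)  # simulates waiting on network I/O
    return FAKE_REDIRECTS.get(short_url, short_url)


async def unshorten_all(short_urls):
    # Resolve each distinct URL once, all lookups in flight concurrently.
    unique = list(dict.fromkeys(short_urls))
    resolved = await asyncio.gather(*(resolve(url) for url in unique))
    lookup = dict(zip(unique, resolved))
    return [lookup[url] for url in short_urls]


unshortened = asyncio.run(
    unshorten_all(["http://bit.ly/a", "http://t.co/b", "http://bit.ly/a"])
)
```

Because the waits overlap, the total time is close to one lookup rather than the sum of all lookups.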
  • Turning to FIGs. 11, 12 and 13, example embodiments of computer executable instructions for different queries are provided. It can be appreciated that the operations described below with respect to FIGs. 11, 12 and 13 can be implemented by modules 113, 114 and 115, respectively.
  • The operations of FIG. 11 are used to identify experts in a given topic (e.g. Topic A) that have an interest in another topic (e.g. Topic B).
  • The operations of FIG. 11 can be implemented by module 113.
  • The server 100 obtains Topic A and Topic B, for example, via the GUI.
  • The server searches for users in the index store that are considered experts in Topic A.
  • The operations presented with respect to FIG. 9 can be used, for example, to implement block 1102.
  • The server determines which of the experts have an interest in Topic B (e.g. by analysing the interest vector of each identified expert) (block 1103).
  • The server outputs the users that are considered an expert in Topic A and that have an interest in Topic B, as determined by block 1103.
  • The server identifies users that are experts in Topic A, have an interest in Topic B, and also maximize the number of unique followers of a predetermined number n of experts.
  • The max reach operation 1105 includes, of the users that are considered an expert in Topic A and have an interest in Topic B, determining which combination of n users provides the highest number of unique followers of the users.
  • The determined n users are outputted (block 1106). For example: Alice, Bob and Celine are identified from block 1103; the parameter n is 2; Alice has the followers David, Eve and Frank; Bob has the followers David and Eve; and Celine has the followers Gregory and Hanna. Based on this example, the combination of the experts Alice and Celine would provide the highest number of unique followers (e.g. five unique followers). By contrast, the combination of experts Alice and Bob would provide three unique followers.
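The max reach selection in this example can be checked with an exhaustive search over combinations. This is a sketch only: the description does not state how the best combination is found, and the function name `max_reach` is hypothetical.

```python
from itertools import combinations


def max_reach(followers, n):
    """Return the n users whose followers form the largest union.

    `followers` maps each candidate user to the set of that user's followers.
    """
    best_combo, best_reach = (), set()
    for combo in combinations(sorted(followers), n):
        reach = set().union(*(followers[user] for user in combo))
        if len(reach) > len(best_reach):
            best_combo, best_reach = combo, reach
    return best_combo, len(best_reach)


candidate_followers = {
    "Alice": {"David", "Eve", "Frank"},
    "Bob": {"David", "Eve"},
    "Celine": {"Gregory", "Hanna"},
}
best_pair, unique_followers = max_reach(candidate_followers, n=2)
```

Trying every combination grows quickly with the number of candidates, so a larger candidate pool may call for an approximate approach such as picking users greedily.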
  • Turning to FIG. 12, the example computer executable instructions are for identifying users that have an interest in Topic A.
  • The server 100 obtains Topic A, for example, through a user input in the GUI.
  • The server searches for users that have an interest in Topic A (e.g. by analysing the interest vector of each user).
  • The identified users from block 1202 are outputted.
  • The server determines which combination of n users provides the highest number of unique followers of the users (block 1204).
  • The determined n users are outputted (block 1205).
  • Turning to FIG. 13, the example computer executable instructions are for suggesting followers for a specific user account that have an interest in Topic A.
  • The server 100 obtains Topic A, for example, via the GUI.
  • The server searches for users in the index store that are considered experts in Topic A.
  • The server determines which of the experts have the largest number of followers and do not currently follow the specific user account. In an example embodiment, the server identifies the top n experts with the largest number of followers.
  • The server outputs the determined experts, or the followers of the determined experts, or both.
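The follower suggestion step can be sketched as a filter followed by a sort. The function name `suggest_followers`, the `followers` mapping and the alphabetical tie-break are assumptions for illustration.

```python
def suggest_followers(account, experts, followers, n):
    """Pick the top-n experts, by follower count, not yet following `account`.

    `followers` maps each user to the set of users following that user.
    """
    already_following = followers.get(account, set())
    candidates = [
        expert
        for expert in experts
        if expert != account and expert not in already_following
    ]
    # Largest follower count first; ties broken alphabetically.
    candidates.sort(key=lambda expert: (-len(followers.get(expert, set())), expert))
    return candidates[:n]


follower_map = {
    "brand": {"bob"},
    "bob": {"u1", "u2", "u3", "u4", "u5"},
    "celine": {"u1", "u2", "u3"},
    "david": {"u1"},
}
suggested = suggest_followers("brand", ["bob", "celine", "david"], follower_map, n=2)
```

In this example, bob already follows the account and is therefore skipped even though bob has the most followers.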
  • FIGs. 14, 15 and 16 are example embodiments of different GUIs which can be generated by the GUI module 106 and displayed on an Internet browser 126, or displayed on the analytics application 127, or both.
  • Turning to FIG. 14, an example GUI 1400 is shown for searching for users who are experts in a given topic.
  • The GUI 1400 includes an input box 1401 to receive text identifying the given topic.
  • When the user selects the search button 1402, the operations described in FIG. 8 are implemented.
  • The results of the query are displayed, for example, in a GUI like that in FIG. 16.
  • The GUI 1500 may or may not include components 1401 and 1402.
  • The advanced query GUI 1500 includes section 1501 for finding experts in Topic A with interests in Topic B (e.g. which initiates the operations of FIG. 11); section 1502 for finding users interested in Topic A (e.g. which initiates the operations of FIG. 12); and section 1503 for suggesting followers for a specific account and that are interested in Topic A (e.g. which initiates the operations of FIG. 13).
  • Section 1501 includes an input box 1504 to receive Topic A, for which users are an expert, and an input box 1505 to receive Topic B, for which the identified experts of Topic A have an interest. There is also a selection box 1506 to select the 'max reach' parameter. When this box 1506 is selected, operations in blocks 1105 and 1106 are executed. When the search button 1511 is selected, the query is performed.
  • Section 1502 includes an input box 1507 to receive Topic A, for which users have an interest, and a selection box 1508 to select the 'max reach' parameter. When this box 1508 is selected, operations in blocks 1204 and 1205 are executed. When the search button 1512 is selected, the query is performed.
  • Section 1503 includes an input box 1509 to receive a specified user for which to suggest followers, and an input box 1510 to receive Topic A for which the suggested followers have an interest.
  • When the search button 1513 is selected, the query is performed.
  • Turning to FIG. 16, an example results GUI 1600 is shown.
  • The results, for example, are shown when the topic "social" has been inputted into input box 1401 shown in GUI 1400.
  • GUI 1600 includes components 1401 and 1402 to allow the user to easily conduct a new query.
  • GUI 1600 also includes display area 1601, which shows the topic that is being searched. In this example, it reads: Analysing query for "social".
  • GUI 1600 includes a list of top experts 1602 that are considered experts for the topic "social", a list of other topics 1603, a list of frequently used keywords 1604 based on the messages of the identified experts, a list of frequently used keyword pairs 1605 based on the messages of the identified experts, a list of frequently used hashtags 1606 based on the messages of the identified experts, and a list of links 1615 from which the content or messages are mostly shared.
  • The list of other topics 1603 shows other topics for which the identified experts are also considered experts. For example, a query done for Topic A shows that an expert in Topic A is also an expert in Topic B. In another example embodiment, the list of other topics 1603 shows the other listed topics of interest of the identified experts. For example, a query on Topic A shows that an expert in Topic A has an interest in Topic B.
  • Displayed respectively in association with the topics listing 1603, the keywords listing 1604, the keyword pairs listing 1605 and the hashtags listing 1606, are word cloud buttons 1611, 1612, 1613 and 1614.
  • When a word cloud button is selected or hovered over with a pointer or mouse, the results for that listing are displayed in a word cloud. For example, if word cloud button 1612 is selected or hovered over, then the keywords are shown in a word cloud.
  • The hourly message activity from the identified experts is also displayed in section 1616.
  • The hourly message activity is shown in a bar graph, with each bar representing the number of messages sent during a different hour of the day.
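The data behind such a bar graph can be sketched as 24 counts, one per hour of the day. The function name `hourly_activity` is hypothetical, and the sketch assumes the message timestamps are already expressed in the time zone chosen for display.

```python
from collections import Counter
from datetime import datetime


def hourly_activity(timestamps):
    """Return 24 counts, one per hour of the day, for the bar graph."""
    counts = Counter(ts.hour for ts in timestamps)
    return [counts.get(hour, 0) for hour in range(24)]


sent_at = [
    datetime(2014, 6, 17, 9, 5),
    datetime(2014, 6, 17, 9, 40),
    datetime(2014, 6, 18, 9, 15),
    datetime(2014, 6, 18, 23, 59),
]
bars = hourly_activity(sent_at)
```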
  • GUI 1600 also includes a button 1617 that, when selected, invokes the server 100 to analyze the interests of the followers of the identified experts.
  • In an example scenario, an advertiser bids on a list of topics.
  • The system and methods described herein can assist in such a scenario by analysing and identifying related topics for a topic of interest. These related topics may be cheaper to bid on than the topic of interest.
  • For example, the topic of interest is "social marketing" and this topic has a high bidding price for the advertiser.
  • The GUI may display a related topic "seo", which has a lower bidding price than "social marketing".
  • The server 100 determines in this example that the followers of the experts in "social marketing" and the followers of the experts in "seo" are highly related.
  • "seo" stands for search engine optimization.
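The description does not say how the relatedness of two follower groups is measured. As one possible illustration only, the sketch below scores the overlap with the Jaccard index, which is an assumption introduced here and not the method of the described system; the function name `follower_overlap` and the sample sets are likewise invented.

```python
def follower_overlap(followers_a, followers_b):
    """Jaccard index of two follower sets (assumed measure, 0.0 to 1.0)."""
    union = followers_a | followers_b
    if not union:
        return 0.0
    return len(followers_a & followers_b) / len(union)


# Invented follower sets for the experts of two topics.
social_marketing_followers = {"u1", "u2", "u3"}
seo_followers = {"u2", "u3", "u4"}
overlap = follower_overlap(social_marketing_followers, seo_followers)
```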
  • Keywords used in searches also evolve over time.
  • The social networking platform will display promoted messages as advertising along with the search results.
  • The system and methods described herein can be used to assist with this keyword bidding process.
  • The operations described herein can be used to identify keywords that are in messages and discussions related to a given topic. An advertiser can bid on these identified keywords, which are prevalent in messages related to the given topic and are also prevalent in discussions (e.g. hashtags) related to the given topic.
  • A method performed by a server for analysing data from users includes: obtaining a topic; identifying a user as an expert of the topic; obtaining a message sent from the expert; and determining a frequently used keyword in the message.
  • Multiple users are identified as experts of the topic; multiple messages are sent from the experts; and the frequently used keyword in the multiple messages is determined.
  • the method further comprises:
  • the method further comprises determining a related topic to the topic, wherein the user is also identified as an expert of the related topic. In another aspect, the method further comprises determining a related topic to the topic, wherein the related topic is of interest to the user.
  • identifying the user as the expert of the topic includes: obtaining a list of another user in which the user is listed; and after determining that a name of the list or a description of the list is related to the topic, identifying the user as the expert of the topic.
  • identifying the user as the expert of the topic includes: obtaining multiple lists of other users in which the user is listed; determining that names or descriptions of the lists are related to the topic; and after determining that the user appears on a highest number of lists that are related to the topic, identifying the user as the expert of the topic.
  • the method further includes: obtaining profile data about the expert; and displaying in a graphical user interface (GUI) the profile data of the expert, the topic, and the frequently used keyword.
  • a method performed by a server to identify a user that has interest in a topic includes: obtaining the topic; determining an ancillary user that the user follows; obtaining a list in which the ancillary user is listed;
  • multiple ancillary users that the user follows are determined; and the method further includes: for each of the multiple ancillary users, obtaining a given list in which a given ancillary user is listed and determining a given topic from the given list; combining the given topics corresponding to the multiple ancillary users; and determining that the user has interest in the given topics, the given topics including the topic.
  • multiple ancillary users that the user follows are determined; and the method further includes: for each of the multiple ancillary users, obtaining a given list in which a given ancillary user is listed and determining a given topic from the given list;
  • the method further includes displaying the user in a graphical user interface.
  • the method further includes: determining that a first number of users have interest in the topic; and determining a combination of a second number of users, where the second number is smaller than the first number.
  • each of the first number of users has followers, and the server determines which combination of the second number of users has a highest number of unique followers.
  • a server configured to analyse data from users, includes: a processor; a communication device; and a memory device.
  • the memory device includes computer executable instructions for at least: obtaining a topic; identifying a user as an expert of the topic; obtaining a message sent from the expert; and determining a frequently used keyword in the message.
  • a server configured to identify a user that has interest in a topic, includes: a processor; a communication device; and a memory device.
  • the memory device comprises computer executable instructions for at least:
  • obtaining the topic; determining an ancillary user that the user follows; obtaining a list in which the ancillary user is listed; determining that a name of the list or a description of the list is related to the topic; and determining that the user has interest in the topic.
  • GUIs and screen shots described herein are just for example. There may be variations to the graphical and interactive elements without departing from the spirit of the invention or inventions. For example, such elements can be positioned in different places, or added, deleted, or modified.

Abstract

A system and method are provided for analysing social networking data. A method performed by a server for analysing data from users includes obtaining a topic, identifying a user as an expert of the topic, obtaining a message sent from the expert, and determining a frequently used keyword in the message. Other data, such as related topics, keyword pairs and hashtags can also be determined based on the expert and the message sent from the expert. In another aspect, a method performed by a server to identify a user that has interest in a topic includes obtaining the topic, determining an ancillary user that the user follows, obtaining a list in which the ancillary user is listed, determining that a name of the list or a description of the list is related to the topic, and determining that the user has interest in the topic.

Description

SYSTEM AND METHOD FOR ANALYSING SOCIAL NETWORK DATA
CROSS-REFERENCE TO RELATED APPLICATIONS:
[0001] This application claims priority to United States Provisional Patent Application No. 61/837,933 filed on June 21, 2013, the entire contents of which are incorporated by reference.
TECHNICAL FIELD
[0002] The following generally relates to analysing social network data.
BACKGROUND
[0003] In recent years social media has become a popular way for individuals and consumers to interact online (e.g. on the Internet). Social media also affects the way businesses aim to interact with their customers, fans, and potential customers online.
[0004] Some bloggers on particular topics with a wide following are identified and are used to endorse or sponsor specific products. For example, advertisement space on a popular blogger's website is used to advertise related products and services.
[0005] Social network platforms are also used to influence groups of people. Examples of social network platforms include those known by the trade names Facebook, Twitter, LinkedIn and Pinterest. Popular or expert individuals within a social network platform can be used to market to other people. Quickly identifying popular or expert individuals becomes more difficult when the number of users within a social network grows. Furthermore, accurately identifying experts within a particular topic is difficult.
BRIEF DESCRIPTION OF THE DRAWINGS
[0006] Embodiments will now be described by way of example only with reference to the appended drawings wherein:
[0007] FIG. 1 is a schematic diagram of a server in communication with a computing device.
[0008] FIG. 2 is a flow diagram of an example embodiment of computer executable instructions for obtaining and storing social networking data.
[0009] FIG. 3 is a block diagram of example data components in an index store.
[0010] FIG. 4 is a block diagram of example data components in a profile store.
[0011] FIG. 5 is a schematic diagram of example user lists and a tally of the number of times a user is listed within different user lists.
[0012] FIG. 6 is a flow diagram of an example embodiment of computer executable instructions for determining topics in which a given user is considered an expert.
[0013] FIG. 7 is a flow diagram of an example embodiment of computer executable instructions for determining topics in which a given user is interested.
[0014] FIG. 8 is a flow diagram of an example embodiment of computer executable instructions for topic analysis.
[0015] FIG. 9 is a flow diagram of an example embodiment of computer executable instructions for searching for users in the index store that are considered experts in a topic.
[0016] FIG. 10 is a flow diagram of an example embodiment of computer executable instructions for processing links.
[0017] FIG. 11 is a flow diagram of an example embodiment of computer executable instructions for identifying experts in a first topic and that have interest in a second topic.
[0018] FIG. 12 is a flow diagram of an example embodiment of computer executable instructions for identifying users that have interest in a topic.
[0019] FIG. 13 is a flow diagram of an example embodiment of computer executable instructions for suggesting followers for a specific user account that have interest in a topic.
[0020] FIG. 14 is an example embodiment of a graphical user interface (GUI) for searching for users which are experts in a topic.
[0021] FIG. 15 is an example embodiment of a GUI for performing an advanced query using social network data.
[0022] FIG. 16 is an example embodiment of a GUI for displaying the results of a query.
DETAILED DESCRIPTION OF THE DRAWINGS
[0023] It will be appreciated that for simplicity and clarity of illustration, where considered appropriate, reference numerals may be repeated among the figures to indicate corresponding or analogous elements. In addition, numerous specific details are set forth in order to provide a thorough understanding of the example embodiments described herein. However, it will be understood by those of ordinary skill in the art that the example embodiments described herein may be practiced without these specific details. In other instances, well-known methods, procedures and components have not been described in detail so as not to obscure the example embodiments described herein. Also, the description is not to be considered as limiting the scope of the example embodiments described herein.
[0024] Social networking platforms include users who generate and post content for others to see, hear, etc. Non-limiting examples of social networking platforms are Facebook, Twitter, LinkedIn and Pinterest. Currently known and future known social networking platforms may be used with principles described herein. Social networking platforms can be used to market to, and advertise to, users of the platforms. It is recognized that it is difficult to identify users relevant to a given topic and, conversely, topics that are relevant to a given user. This includes identifying experts on a given topic as well as users who are interested in a given topic.
[0025] The proposed system and methods described herein are able to identify users who are experts on a topic, and are able to identify users with an interest in a topic. As used herein, the term "expert" refers to a user account that primarily produces and shares content related to a topic and has a wide following of users. The term "follower", as used herein, refers to a first user account that follows a second user account, such that content posted by the second user account is published for the first user account to read, consume, etc. For example, when a first user follows a second user, the first user (i.e. the follower) will receive content posted by the second user. A user with an "interest" in a particular topic herein refers to a user account that follows a number of experts in the particular topic. In some cases, a follower engages with the content posted by the other user (e.g. by sharing or reposting the content).
[0026] In an example embodiment, the proposed system and methods can be used to determine that experts in Topic A are also experts in one or more other topics (e.g. Topic B, Topic C, etc.).
[0027] Turning to FIG. 1, a schematic diagram of a proposed system is shown. A server 100 is in communication with a computing device 101 over a network 102. The server 100 obtains and analyzes social network data and provides results to the computing device 101 over the network. The computing device 101 can receive user inputs through a GUI to control parameters for the analysis.
[0028] It can be appreciated that social network data includes data about the users of the social network platform, as well as the content generated or organized, or both, by the users. Non-limiting examples of social network data include the user account ID or user name, a description of the user or user account, the messages or other data posted by the user, connections between the user and other users, location information, etc. An example of connections is a "user list", also herein called "list", which includes a name of the list, a description of the list, and one or more other users which the given user follows. The user list is created by the given user.
[0029] Continuing with FIG. 1, the server 100 includes a processor 103 and a memory device 104. In an example embodiment, the server includes multiple quad-core processors, 96 gigabytes of main memory, and 12 terabytes of raw disk storage. In another example embodiment, the memory device 104 or memory devices are solid state drives for increased read/write performance. In another example embodiment, multiple servers are used to implement the methods described herein. In another example embodiment, other currently known computing hardware or future known computing hardware is used, or both.
[0030] The server 100 also includes a communication device 105 to communicate via the network 102. The network 102 may be a wired or wireless network, or both. The server 100 also includes a GUI module 106 for displaying and receiving data via the computing device 101. The server also includes: a social networking data module 107; an indexer module 108; a user account relationship module 109; an expert identification module 110; an interest identification module 111; a topic analytics module 112; a query module 113 to identify experts in Topic A that also have interests in Topic B; a query module 114 to identify users that have interests in Topic A; and a query module 115 to suggest followers that have interests in Topic A.
[0031] The server 100 also includes a number of databases, including a data store 116; an index store 117; a database for a social graph 118; a profile store 119; a database for expertise vectors 120; and a database for interest vectors 121.
[0032] The social networking data module 107 is used to receive a stream of social networking data. In an example embodiment, millions of new messages are delivered to social networking data module 107 each day, and in real-time. The social networking data received by the social networking data module 107 is stored in the data store 116.
[0033] The indexer module 108 performs an indexer process on the data in the data store 116 and stores the indexed data in the index store 117. In an example embodiment, the indexed data in the index store 117 can be more easily searched, and the identifiers in the index store can be used to retrieve the actual data (e.g. full messages).
[0034] A social graph is also obtained from the social networking platform server, not shown, and is stored in the social graph database 118. The social graph, when given a user as an input to a query, can be used to return all users following the queried user.
[0035] The profile store 119 stores meta data related to user profiles. Examples of profile related meta data include the aggregate number of followers of a given user, self-disclosed personal information of the given user, location information of the given user, etc. The data in the profile store 119 can be queried.
[0036] In an example embodiment, the user account relationship module can use the social graph 118 and the profile store 119 to determine which users are following a particular user.
[0037] The expert identification module 110 is configured to identify the set of all user lists in which a user account is listed, called the expertise vector. The expertise vector for a user is stored in the expertise vector database 120. The interest identification module 111 is configured to identify topics of interest to a given user, called the interest vector. The interest vector for a user is stored in the interest vector database 121.
[0038] Continuing with FIG. 1, the computing device 101 includes a communication device 122 to communicate with the server 100 via the network 102, a processor 123, a memory device 124, a display screen 125, and an Internet browser 126. In an example embodiment, the GUI provided by the server 100 is displayed by the computing device 101 through the Internet browser. In another example embodiment, where an analytics application 127 is available on the computing device 101, the GUI is displayed by the computing device through the analytics application 127. It can be appreciated that the display device 125 may be part of the computing device (e.g. a mobile device, a tablet, a laptop, etc.) or may be separate from the computing device (e.g. a desktop computer).
[0039] Although not shown, various user input devices (e.g. touch screen, roller ball, optical mouse, buttons, keyboard, microphone, etc.) can be used to facilitate interaction between the user and the computing device 101.
[0040] It will be appreciated that any module or component exemplified herein that executes instructions may include or otherwise have access to computer readable media such as storage media, computer storage media, or data storage devices (removable and/or non-removable) such as, for example, magnetic disks, optical disks, or tape. Computer storage media may include volatile and non-volatile, removable and non-removable media implemented in any method or technology for storage of information, such as computer readable instructions, data structures, program modules, or other data. Examples of computer storage media include RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by an application, module, or both. Any such computer storage media may be part of the server 100 or computing device 101 or accessible or connectable thereto. Any application or module herein described may be implemented using computer readable/executable instructions that may be stored or otherwise held by such computer readable media.
[0041] Turning to FIG. 2, an example embodiment of computer executable instructions is shown for obtaining and storing social network data. At block 200, the server 100 obtains social networking data. The data may be received as a stream of data, including messages and meta data, in real time. This data is stored in the data store 116, for example, using a compressed row format (block 201). In a non-limiting example embodiment, a MySQL database is used. Blocks 200 and 201, for example, are implemented by the social networking data module 107.
[0042] In an example embodiment, the social network data received by the social networking data module 107 is copied, and the copies of the social network data are stored across multiple servers. This facilitates parallel processing when analysing the social network data. In other words, it is possible for one server to analyse one aspect of the social network data, while another server analyses another aspect of the social network data.
[0043] The server 100 indexes the messages using an indexer process (block 202). For example, the indexer process is a separate process from the storage process that includes scanning the messages as they materialize in the data store 116. In an example embodiment, the indexer process runs on a separate server by itself. This facilitates parallel processing. The indexer process is, for example, a multi-threaded process that materializes a table of indexed data for each day, or for some other given time period. The indexed data is outputted and stored in the index store 117 (block 204).
[0044] Turning briefly to FIG. 3, which shows an example index store 117, each row in the table is a unique user account identifier and a corresponding list of all message identifiers that are produced that day, or that given time period. In an example embodiment, millions of rows of data can be read and written in the index store 117 each day, and this process can occur as new data is materialized or added to the data store 116. In an example embodiment, a compressed row format is used in the index store 117. In another example embodiment, deadlocks are avoided by running relaxed transactional semantics, since this increases throughput across multiple threads when reading and writing the table. By way of background, a deadlock occurs when two or more tasks permanently block each other by each task having a lock on a resource which the other tasks are trying to lock.
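The per-day index table described above can be illustrated with a minimal Python sketch. The record layout, the identifiers and the function name `build_daily_index` are assumptions made for illustration only; the description does not specify a schema beyond a user account identifier mapped to message identifiers for a time period.

```python
from collections import defaultdict

# Hypothetical message records: (message ID, user account ID, ISO date).
messages = [
    ("m1", "u1", "2013-06-20"),
    ("m2", "u2", "2013-06-20"),
    ("m3", "u1", "2013-06-20"),
    ("m4", "u1", "2013-06-21"),
]

def build_daily_index(records):
    """Return {day: {user ID: [message IDs produced that day]}}."""
    index = defaultdict(lambda: defaultdict(list))
    for message_id, user_id, day in records:
        # One table per day; one row per user account within that table.
        index[day][user_id].append(message_id)
    return index

daily_index = build_daily_index(messages)
```

Here each day's table holds one row per user, so the message identifiers of a user for a given period can be looked up without scanning the full messages.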
[0045] Turning back to FIG. 2, the server 100 further obtains information about which user accounts follow other user accounts (block 203). This process includes identifying profile related meta data and storing the same in the profile store (block 205).
[0046] In FIG. 4, an example of the profile store 119 shows that for each user account, there is associated profile related meta data. The profile related meta data includes, for example, the aggregate number of followers of the user, self-disclosed personal information, location information, and user lists.
[0047] After the data is obtained and stored, it can be analyzed, for example, to identify experts and interests.
[0048] By way of example, and turning to FIG. 5, a user may have a list of other users which he or she may follow. For example, User A has a list of User B, User C and User D, which User A follows. The users (e.g. User B, User C and User D) are grouped under a list named List A, and the list has an associated list description (e.g. Description A). In other words, User A believes that User B, User C and User D are experts or knowledgeable in Topic A.
[0049] Another user, User E, may have the same or similar list name and description (e.g. same or similar to List A, Description A), but may have different users listed than those by User A. For example, User E follows User B, User C and User G. In other words, User E believes that User B, User C and User G are experts or knowledgeable in Topic A.
[0050] Another user, User F, may have the same or similar list name and description (e.g. same or similar to List A, Description A), but may have different users listed than those by User A. For example, User F follows User B, User H and User I, since User F believes these users are experts or knowledgeable in Topic A.
[0051] Based on the above example scenario, it can be appreciated that different users may have the same or similarly named or similarly described lists, but the users in each list can be different. In other words, different users may think that other different users are experts in a given topic.
[0052] Continuing with the example in FIG. 5, based on the number of times that a user is listed on another user's list for a given topic, the server 100 can determine whether the user is considered an expert by other users. For example, User B is listed on three different lists related to Topic A; User C is listed on two different lists; and each of User D, User G, User H and User I are only listed on one list. Therefore, in this example, User B is considered the foremost expert in Topic A, followed by User C.
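The tally in this example can be reproduced with a short Python sketch. The dictionary below simply restates the three lists of Users A, E and F; the function name `tally_list_appearances` is hypothetical.

```python
from collections import Counter

# The Topic A lists from the example: list owner -> users placed on the list.
topic_a_lists = {
    "User A": ["User B", "User C", "User D"],
    "User E": ["User B", "User C", "User G"],
    "User F": ["User B", "User H", "User I"],
}

def tally_list_appearances(lists_by_owner):
    """Count the number of lists on which each user appears."""
    tally = Counter()
    for members in lists_by_owner.values():
        tally.update(set(members))  # a user counts once per list
    return tally

tally = tally_list_appearances(topic_a_lists)
ranked = tally.most_common()  # highest tally first
```

With this data, User B has a tally of three and User C a tally of two, matching the ranking described in the example.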
[0053] Turning to FIG. 6, an example embodiment of computer executable instructions is provided for determining topics for which a given user is considered an expert. At block 601, the server 100 obtains a set of lists in which the given user is listed. At block 602, the server 100 uses the set of lists to determine topics associated with the given user. At block 603, the server outputs the topics in which the given user is considered an expert. These topics form the expertise vector of the given user. For example, if the user Alice is listed in Bob's fishing list, Celine's art list, and David's photography list, then Alice's expertise vector includes: fishing, art and photography.
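As a rough illustration of blocks 601 to 603, the following Python sketch builds expertise vectors from user lists. It assumes, for simplicity, that the topic of each list has already been determined from its name or description; the tuple layout and the function name `build_expertise_vectors` are hypothetical.

```python
from collections import defaultdict

# Hypothetical lists: (list owner, topic derived from list name/description, members).
user_lists = [
    ("Bob", "fishing", ["Alice"]),
    ("Celine", "art", ["Alice", "David"]),
    ("David", "photography", ["Alice"]),
]

def build_expertise_vectors(lists):
    """Map each listed user to the set of topics of the lists they appear on."""
    vectors = defaultdict(set)
    for _owner, topic, members in lists:
        for member in members:
            vectors[member].add(topic)
    return vectors

expertise = build_expertise_vectors(user_lists)
```

In this sketch, Alice's expertise vector contains fishing, art and photography, as in the example above.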
[0054] In an example embodiment, the user lists are obtained by constantly crawling them, since the user lists are dynamically updated by users, and new lists are created often. In an example embodiment, the user lists are processed using an Apache Lucene index. The expertise vector of a given user is processed using the Lucene algorithm to populate the index of topics associated with the given user. This index supports, for example, full Lucene query syntax, including phrase queries and Boolean logic. By way of background, Apache Lucene is an information retrieval software library that is suitable for full text indexing and searching. Lucene is also widely known for its use in the implementation of Internet search engines and local single-site searching. It can be appreciated, that other currently known or future known searching and indexing algorithms can be used.
[0055] In an example embodiment, the computer executable instructions of FIG. 6 are implemented by module 110.
[0056] Turning to FIG. 7, an example embodiment of computer executable instructions is provided for determining topics in which a given user is interested. At block 701, the server 100 obtains ancillary users that the given user follows.
[0057] At block 702, a number of instructions are performed, but specific to each ancillary user. In particular, at block 703, the server obtains a set of lists in which the ancillary user is listed (e.g. the expertise vector of the ancillary user). At block 704, the server uses the set of lists to determine topics associated with the ancillary user. The outputs of block 704 are topics associated with the ancillary user (block 705). In an example embodiment, block 702 can simply call on the algorithm presented in FIG. 6, but being applied to each ancillary user.
[0058] In an example embodiment, at block 706, the server combines the topics from all the ancillary users. The combined topics form the output 707 of the topics of interest for the given user (e.g. the interest vector of the given user).
[0059] In another example embodiment, an alternative to the blocks 706 and 707 is to determine which topics are common, or most common, amongst the ancillary users (block 708). For example, a given user Alice follows ancillary users Bob, Celine and David. Bob is considered an expert in fishing and photography (e.g. the expertise vector of Bob). Celine is considered an expert in fishing, photography and art (e.g. the expertise vector of Celine). David is considered an expert in fishing and music (e.g. the expertise vector of David). Therefore, since the topic of fishing is common amongst all the ancillary users, it is identified that Alice has an interest in the topic of fishing. Or, since photography is more common amongst the ancillary users (e.g. the second most common topic after fishing), then the topic of photography is also identified as a topic of interest for Alice. Since art and music are not common amongst the ancillary users, these topics are not considered to be topics of interest to Alice.
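Both ways of forming the interest vector (combining all topics, or keeping only the common ones) can be sketched in Python as follows. The `min_count` threshold used to decide what counts as "common" is an assumption for illustration; the description does not fix a particular cut-off.

```python
from collections import Counter

# Expertise vectors of the ancillary users from the example.
expertise_vectors = {
    "Bob": {"fishing", "photography"},
    "Celine": {"fishing", "photography", "art"},
    "David": {"fishing", "music"},
}
follows = {"Alice": ["Bob", "Celine", "David"]}

def interest_union(user):
    """Blocks 706/707: combine the topics of all ancillary users."""
    topics = set()
    for ancillary in follows.get(user, []):
        topics |= expertise_vectors.get(ancillary, set())
    return topics

def interest_common(user, min_count=2):
    """Block 708: keep topics shared by at least min_count ancillary users."""
    counts = Counter()
    for ancillary in follows.get(user, []):
        counts.update(expertise_vectors.get(ancillary, set()))
    return {topic for topic, count in counts.items() if count >= min_count}
```

For Alice, `interest_common("Alice", 3)` keeps only fishing, while a threshold of 2 also keeps photography; art and music are dropped in both cases.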
[0060] In an example embodiment, module 111 implements the computer executable instructions presented in FIG. 7.
[0061] In an example embodiment, the data from the expertise vector and the data from interest vector are supplied to the Lucene algorithm for indexing.
[0062] Turning to FIG. 8, an example embodiment of computer executable instructions is provided for topic analysis. These instructions can, for example, be implemented by module 112. At block 801, the server 100 obtains a topic for querying. In an example embodiment, the topic can be provided by a user through the GUI displayed by the computing device 101. The topic may also come from another source. At block 802, the server searches for users in the index store 117 that are considered experts in the topic. The experts determined in block 802 may be limited to the top n users (block 803).
[0063] A set of instructions 804 is executed for each expert identified in block 802. In particular, the instructions include obtaining profile information of the expert using the profile store 119 (block 805) and obtaining messages sent from the expert using the index store 117 and the data store 116 (block 806).
[0064] Using the messages obtained from all the experts, the server 100 identifies: frequently used keywords, frequently used keyword pairs, frequently used hashtags, frequently used links (e.g. URLs), etc. (block 807). The server then causes this information, including the profile information of the experts, to be displayed using the GUI. It will be appreciated that the keywords, keyword pairs, hashtags and links can be ordered from most frequently used to least frequently used. The top n most frequent results will be displayed on the GUI. The identification of the keywords, keyword pairs, etc. can be done using currently known or future known semantic processing, including removing stop words.
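A minimal Python sketch of the frequency analysis in block 807 is given below. The stop-word list, the sample messages, and the definition of a keyword pair as two adjacent keywords after stop-word removal are all assumptions for illustration; the description leaves the semantic processing open. Links are omitted for brevity.

```python
import re
from collections import Counter

# Small illustrative stop-word list (an assumption, not from the description).
STOP_WORDS = {"a", "an", "and", "for", "is", "the", "your"}

def analyse_messages(messages, top_n=3):
    """Return the top_n keywords, adjacent keyword pairs and hashtags."""
    keywords, pairs, hashtags = Counter(), Counter(), Counter()
    for text in messages:
        tokens = re.findall(r"#?\w+", text.lower())
        hashtags.update(t for t in tokens if t.startswith("#"))
        words = [t for t in tokens
                 if not t.startswith("#") and t not in STOP_WORDS]
        keywords.update(words)
        pairs.update(zip(words, words[1:]))  # adjacent keywords form a pair
    return {
        "keywords": keywords.most_common(top_n),
        "pairs": pairs.most_common(top_n),
        "hashtags": hashtags.most_common(top_n),
    }

# Hypothetical messages sent by the identified experts.
expert_messages = [
    "SEO tips for social marketing #seo",
    "Social marketing is the future #marketing",
    "Improve your SEO and social marketing #seo",
]
result = analyse_messages(expert_messages, top_n=2)
```

Each counter is ordered from most to least frequently used, so truncating to the top n entries gives the results that would be shown in the GUI.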
[0065] In an example embodiment, the extraction or search for experts in block 802 can be identified using the Lucene index.
[0066] Turning to FIG. 9, example computer executable instructions are provided for implementing block 802. At block 901, the server 100 identifies users having Topic A (e.g. the topic being queried in FIG. 8) listed in their expertise vector. At block 902, of the identified users, the server determines which users appear on the highest number of lists associated with Topic A. At block 903, the top n users who appear on the highest number of lists are the experts of Topic A.
[0067] Turning to FIG. 10, example computer executable instructions are provided for processing links (e.g. URLs). At block 1001, the server 100 obtains a list of shortened URLs. At block 1002, the server calls on a URL dereferencing algorithm that utilizes asynchronous input/output (IO). The server then outputs the list of unshortened URLs.
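Blocks 901 to 903 can be sketched in Python as a filter followed by a ranking. The per-topic list counts and the alphabetical tie-break are assumptions for illustration; the function name `top_experts` is hypothetical.

```python
# Hypothetical data: user -> {topic: number of lists on that topic listing the user}.
list_counts = {
    "User B": {"Topic A": 3},
    "User C": {"Topic A": 2, "Topic B": 1},
    "User D": {"Topic A": 1},
    "User J": {"Topic B": 4},
}

def top_experts(counts_by_user, topic, n):
    """Return up to n users appearing on the most lists for the topic."""
    # Block 901: keep users whose expertise vector includes the topic.
    candidates = {
        user: counts[topic]
        for user, counts in counts_by_user.items()
        if counts.get(topic, 0) > 0
    }
    # Blocks 902/903: rank by list count (ties broken by name, an assumption).
    ranked = sorted(candidates, key=lambda user: (-candidates[user], user))
    return ranked[:n]

experts_a = top_experts(list_counts, "Topic A", 2)
```

With this data, the top two experts for Topic A are User B and User C.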
[0068] It can be appreciated that URLs in messages are often shortened (e.g. using bit.ly or t.co) and that conducting analysis on these domains can be challenging since each shortened URL must be unshortened, possibly multiple times. The process described in FIG. 10 efficiently unshortens multiple URLs (e.g. thousands of URLs) in parallel on a single thread and in a short time frame (e.g. a second).
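The idea of unshortening many URLs concurrently on a single thread can be sketched with Python's `asyncio`. This is only an illustration of asynchronous IO, not the dereferencing algorithm of FIG. 10: the `resolve` coroutine and the `FAKE_REDIRECTS` table are hypothetical stand-ins for real HTTP redirect lookups, and the URLs are invented.

```python
import asyncio

# Hypothetical redirect table standing in for real HTTP redirect responses.
FAKE_REDIRECTS = {
    "http://bit.ly/abc": "http://example.com/article",
    "http://t.co/xyz": "http://example.org/post",
}

async def resolve(short_url):
    """Simulate one redirect lookup without blocking the thread."""
    await asyncio.sleep(0.01)  # stands in for network latency
    return FAKE_REDIRECTS.get(short_url, short_url)

async def unshorten_all(short_urls):
    """Resolve each distinct URL once, concurrently, on a single thread."""
    unique = list(dict.fromkeys(short_urls))  # de-duplicate, keep order
    resolved = await asyncio.gather(*(resolve(u) for u in unique))
    mapping = dict(zip(unique, resolved))
    return [mapping[u] for u in short_urls]

expanded = asyncio.run(unshorten_all([
    "http://bit.ly/abc",
    "http://t.co/xyz",
    "http://bit.ly/abc",
]))
```

Because the lookups overlap while waiting on IO, the total time is close to that of the slowest single lookup rather than the sum of all lookups.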
[0069] Turning to FIGS. 11, 12 and 13, example embodiments of computer executable instructions for different queries are provided. It can be appreciated that the operations described below with respect to FIGS. 11, 12 and 13 can be implemented by modules 113, 114 and 115, respectively.
[0070] The operations of FIG. 11 are used to identify experts in a given topic (e.g. Topic A) that have an interest in another topic (e.g. Topic B). The operations of FIG. 11 can be implemented by module 113. At block 1101, the server 100 obtains Topic A and Topic B, for example, via the GUI. At block 1102, the server searches for users in the index store that are considered experts in Topic A. The operations presented with respect to FIG. 9 can be used, for example, to implement block 1102. Of the identified experts in Topic A, the server determines which of the experts have an interest in Topic B (e.g. by analysing the interest vector of each identified expert) (block 1103). In particular, if the interest vector of an identified expert does include Topic B, then the identified expert is determined to have an interest in Topic B. If the interest vector of the identified expert does not include Topic B, then the identified expert does not have an interest in Topic B. In an example embodiment, the server outputs the users that are considered an expert in Topic A and that have an interest in Topic B, as determined by block 1103.
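The filtering step that checks each Topic A expert's interest vector for Topic B can be sketched in Python as follows. The user names, the interest vectors and the function name `experts_with_interest` are hypothetical.

```python
# Hypothetical inputs: experts already identified for Topic A, and interest vectors.
experts_in_topic_a = ["Alice", "Bob", "Celine"]
interest_vectors = {
    "Alice": {"Topic B", "Topic C"},
    "Bob": {"Topic C"},
    "Celine": {"Topic B"},
}

def experts_with_interest(experts, interests, topic_b):
    """Keep the experts whose interest vector includes topic_b."""
    return [user for user in experts if topic_b in interests.get(user, set())]

matches = experts_with_interest(experts_in_topic_a, interest_vectors, "Topic B")
```

Here Alice and Celine are output, since Bob's interest vector does not include Topic B.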
[0071] In an alternative example embodiment, after block 1103 is executed, if the 'max reach' parameter has been selected (e.g. by the user), then the server identifies users that are experts in Topic A, have an interest in Topic B, and also maximize the number of unique followers of a predetermined number n of experts. The max reach operation 1105 includes, of the users that are considered an expert in Topic A and have an interest in Topic B, determining which combination of n users provides the highest number of unique followers of the users. The determined n users are outputted (block 1106). For example: Alice, Bob and Celine are identified from block 1103; the parameter n is 2; Alice has the followers David, Eve and Frank; Bob has the followers David and Eve; and Celine has the followers Gregory and Hanna. Based on this example, the combination of the experts Alice and Celine would provide the highest number of unique followers (e.g. five unique followers). By contrast, the combination of experts Alice and Bob would provide three unique followers.
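The max reach selection in this example can be checked with a brute-force Python sketch that evaluates every combination of n users. Exhaustive search is an assumption chosen for clarity; the description does not state how the best combination is found, and for large candidate sets a heuristic (e.g. greedy selection) would likely be needed.

```python
from itertools import combinations

# Followers of each candidate expert, taken from the example.
followers = {
    "Alice": {"David", "Eve", "Frank"},
    "Bob": {"David", "Eve"},
    "Celine": {"Gregory", "Hanna"},
}

def max_reach(followers_by_user, n):
    """Return the n-user combination with the most unique followers."""
    best_combo, best_reach = (), -1
    for combo in combinations(sorted(followers_by_user), n):
        # Union of follower sets gives the unique followers of the combination.
        reach = len(set().union(*(followers_by_user[u] for u in combo)))
        if reach > best_reach:
            best_combo, best_reach = combo, reach
    return best_combo, best_reach

best = max_reach(followers, 2)
```

For n equal to 2, the sketch selects Alice and Celine with five unique followers, whereas Alice and Bob together reach only three.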
[0072] Turning to FIG. 12, the example computer executable instructions are for identifying users that have an interest in Topic A. At block 1201, the server 100 obtains Topic A, for example, through a user input in the GUI. At block 1202, the server searches for users that have an interest in Topic A (e.g. by analysing the index vector of each user). At block 1203, the identified users from block 1202 are outputted.
[0073] If the 'max reach' parameter has been selected, then in another example embodiment, of the users that have an interest in Topic A, the server determines which combination of n users provides the highest number of unique followers of the users (block 1204). The determined n users are outputted (block 1205).
[0074] Turning to FIG. 13, the example computer executable instructions are for suggesting followers for a specific user account that have an interest in Topic A. At block 1301, the server 100 obtains the Topic A, for example, via the GUI. At block 1302, the server searches for users in the index store that are considered experts in Topic A. At block 1303, of the identified experts for Topic A, the server determines which of the experts have the largest number of followers and that do not currently follow the specific user account. In an example embodiment, the server identifies the top n experts with the largest number of followers. At block 1304, the server outputs the determined experts, or the followers of the determined experts, or both.
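A minimal sketch of block 1303 is given below, assuming that the follower relationships are held in a dictionary mapping each account to the set of accounts that follow it. The function name `suggest_experts`, the account name "brand" and the tie-breaking by name are assumptions made for illustration.

```python
from typing import Dict, List, Set


def suggest_experts(
    experts: List[str],
    followers: Dict[str, Set[str]],
    account: str,
    n: int,
) -> List[str]:
    """Return the top n experts, by follower count, not yet following account."""
    already_following = followers.get(account, set())
    # Keep experts that do not currently follow the specific user account.
    candidates = [
        e for e in experts if e != account and e not in already_following
    ]
    # Largest follower count first; ties are broken alphabetically (assumption).
    candidates.sort(key=lambda e: (-len(followers.get(e, set())), e))
    return candidates[:n]


# Hypothetical data: "brand" is the specific user account.
experts_in_topic_a = ["Alice", "Bob", "Celine"]
followers = {
    "Alice": {"David", "Eve", "Frank", "Gregory"},
    "Bob": {"David", "Eve"},
    "Celine": {"Gregory", "Hanna", "Ivan"},
    "brand": {"Alice"},
}

suggested = suggest_experts(experts_in_topic_a, followers, "brand", 2)
print(suggested)
```

Alice has the most followers in this data but already follows the account, so she is not suggested. The followers of the returned experts can then be looked up in the same dictionary if block 1304 is to output them as well.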
[0075] It will be appreciated that based on the users or experts, or both, identified in any of the queries described in FIGs. 11, 12 and 13, other data can be derived. For example, based on the users or experts, frequently used keywords, frequently used keyword pairs, frequently used hashtags, frequently used links, and profile information about the users and experts can be determined or obtained, and then displayed as results in the GUI.
[0076] FIGs. 14, 15 and 16 are example embodiments of different GUIs which can be generated by the GUI module 106 and displayed on an Internet browser 126, or displayed on the Analytics application 127, or both.
[0077] Turning to FIG. 14, an example GUI 1400 is shown for searching for users who are experts in a given topic. In particular, the GUI 1400 includes an input box 1401 to receive text identifying the given topic. When the user selects the search button 1402, the operations described in FIG. 8 are implemented. The results of the query are displayed, for example, in a GUI like that in FIG. 16.
[0078] If the user selects the advanced query button 1403, then an advanced query GUI is displayed.
[0079] An example of the advanced query GUI 1500 is shown in FIG. 15. The GUI 1500 may or may not include components 1401 and 1402. The advanced query GUI 1500 includes section 1501 for finding experts in Topic A with interests in Topic B (e.g. which initiates the operations of FIG. 11); section 1502 for finding users interested in Topic A (e.g. which initiates the operations of FIG. 12); and section 1503 for suggesting followers for a specific account and that are interested in Topic A (e.g. which initiates the operations of FIG. 13).
[0080] Section 1501 includes an input box 1504 to receive Topic A, for which users are experts, and an input box 1505 to receive Topic B, for which the identified experts of Topic A have an interest. There is also a selection box 1506 to select the 'max reach' parameter. When this box 1506 is selected, operations in boxes 1105 and 1106 are executed. When the search button 1511 is selected, the query is performed.
[0081] Section 1502 includes an input box 1507 to receive Topic A, for which users have an interest, and a selection box 1508 to select the 'max reach' parameter. When this box 1508 is selected, operations in boxes 1204 and 1205 are executed. When the search button 1512 is selected, the query is performed.
[0082] Section 1503 includes an input box 1509 to receive a specified user for which to suggest followers, and an input box 1510 to receive Topic A for which the suggested followers have an interest. When the search button 1513 is selected, the query is performed.
[0083] Turning to FIG. 16, an example results GUI 1600 is shown. The results, for example, are shown when the topic "social" has been inputted into input box 1401 shown in GUI 1400.
[0084] GUI 1600 includes components 1401 and 1402 to allow the user to easily conduct a new query. GUI 1600 also includes display area 1601, which shows the topic that is being searched. In this example, it reads: Analysing query for "social".
[0085] GUI 1600 includes a list of top experts 1602 that are considered experts for the topic "social", a list of other topics 1603, a list of frequently used keywords 1604 based on the messages of the identified experts, a list of frequently used keyword pairs 1605 based on the messages of the identified experts, a list of frequently used hashtags 1606 based on the messages of the identified experts, and a list of links 1615 from which the content or messages are mostly shared.
[0086] In an example embodiment, the list of other topics 1603 shows the topics for which the identified experts are also considered experts. For example, a query done for Topic A shows that an expert in Topic A is also an expert in Topic B. In another example embodiment, the list of other topics 1603 shows the other topics of interest of the identified experts. For example, a query on Topic A shows that an expert in Topic A has an interest in Topic B.
[0087] For each top expert result 1602, the name of the user (e.g. the expert) 1607, the profile data of the user 1608, and the number of followers of the user 1609 are shown.
[0088] Displayed respectively in association with the topics listing 1603, the keywords listing 1604, the keyword pairs listing 1605 and the hashtags listing 1606, are word cloud buttons 1611, 1612, 1613 and 1614. When a word cloud button is selected or hovered over with a pointer or mouse, the results for that listing are displayed in a word cloud. For example, if word cloud button 1612 was selected or hovered over, then the keywords would be shown in a word cloud.
[0089] The hourly message activity from the identified experts is also displayed in section 1616. In the example shown, the hourly message activity is shown in a bar graph, with each bar representing the number of messages sent during a different hour of the day.
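The data behind such a bar graph can be computed, for example, by counting messages per hour of the day. The sketch below assumes that each message carries a timestamp available as a Python `datetime`; the function name `hourly_activity` and the sample timestamps are hypothetical.

```python
from collections import Counter
from datetime import datetime
from typing import Iterable, List


def hourly_activity(timestamps: Iterable[datetime]) -> List[int]:
    """Return 24 counts, one per hour of the day (index 0 is midnight)."""
    counts = Counter(ts.hour for ts in timestamps)
    # Hours with no messages are reported as zero so every bar is present.
    return [counts.get(hour, 0) for hour in range(24)]


# Hypothetical message timestamps from the identified experts.
message_times = [
    datetime(2014, 6, 20, 9, 15),
    datetime(2014, 6, 20, 9, 45),
    datetime(2014, 6, 21, 17, 0),
]

bars = hourly_activity(message_times)
print(bars[9], bars[17])  # 2 1
```

Messages sent on different days but during the same hour fall into the same bar, which matches a graph organised by hour of the day rather than by date.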
[0090] GUI 1600 also includes a button 1617 that, when selected, invokes the server 100 to analyze the interests of the followers of the identified experts.
[0091] The system and methods described herein can be used to identify sets of experts of any topic. This information readily aids advertisers and marketers to identify the most relevant user accounts in the social network platform when instigating an advertising campaign.
[0092] In some social networking platforms, for example Twitter, an advertiser bids on a list of topics. The system and methods described herein can assist in such a scenario by analysing and identifying related topics for a topic of interest. These related topics may be cheaper to bid on than the topic of interest. For example, the topic of interest is "social marketing" and this topic has a high bidding price for the advertiser. Using the system and methods described herein, the GUI may display a related topic "seo", which has a lower bidding price than "social marketing". Using the operations described herein, the net effect of the advertising campaign will be similar, because largely the same audience will be targeted. In other words, using the operations described herein, the server 100 determines in this example that the followers of the experts in "social marketing" and the followers of the experts in "seo" are highly related. By way of background, "seo" stands for search engine optimization.
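The description above does not state how the relatedness of two follower audiences is measured. Purely as an assumption for illustration, the sketch below uses the Jaccard similarity (the size of the intersection divided by the size of the union) of the two sets of followers; the function names and the data are hypothetical.

```python
from typing import Dict, Iterable, Set


def audience(experts: Iterable[str], followers: Dict[str, Set[str]]) -> Set[str]:
    """Return the combined set of followers of the given experts."""
    combined: Set[str] = set()
    for expert in experts:
        combined |= followers.get(expert, set())
    return combined


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two sets; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical follower data and expert lists for the two topics.
followers = {
    "Alice": {"David", "Eve", "Frank"},
    "Bob": {"David", "Eve"},
    "Dan": {"Eve", "Frank", "Gregory"},
}
social_marketing_experts = ["Alice", "Bob"]
seo_experts = ["Bob", "Dan"]

overlap = jaccard(
    audience(social_marketing_experts, followers),
    audience(seo_experts, followers),
)
print(overlap)  # 0.75
```

A value close to 1 would suggest that bidding on the related topic targets largely the same audience; the threshold at which two topics count as "highly related" is not specified here and would need to be chosen.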
[0093] In some social networking platforms, advertisers bid on search keywords. It can be appreciated that information on a social networking platform is typically temporal by nature and events evolve with time. Therefore, keywords used in searches also evolve over time. In an example embodiment, when a keyword is used during a search query on a social networking platform for which an advertisement exists, the social networking platform will display promoted messages as advertising along with the search results. The system and methods described herein can be used to assist with this keyword bidding process. In particular, the operations described herein can be used to identify keywords that are in messages and discussions related to a given topic. An advertiser can bid on these identified keywords, which are prevalent in messages related to the given topic and are also prevalent in discussions (e.g. hashtags) related to the given topic.
[0094] In a general example embodiment, a method performed by a server for analysing data from users, includes: obtaining a topic; identifying a user as an expert of the topic; obtaining a message sent from the expert; and determining a frequently used keyword in the message.
[0095] In an aspect of the method, multiple users are identified as experts of the topic; multiple messages are sent from the experts; and the frequently used keyword in the multiple messages is determined. In another aspect, the method further comprises: determining a frequently used keyword pair in the message; determining a frequently used hashtag in the message; and determining a frequently used hyperlink in the message. In another aspect, multiple users are identified as experts of the topic; multiple messages are sent from the experts; the frequently used keyword pair in the multiple messages is determined; the frequently used hashtag is determined in the multiple messages; and the frequently used hyperlink is determined in the multiple messages. In another aspect, the method further comprises determining a related topic to the topic, wherein the user is also identified as an expert of the related topic. In another aspect, the method further comprises determining a related topic to the topic, wherein the related topic is of interest to the user. In another aspect, identifying the user as the expert of the topic includes: obtaining a list of another user in which the user is listed; and after determining that a name of the list or a description of the list is related to the topic, identifying the user as the expert of the topic. In another aspect, identifying the user as the expert of the topic includes: obtaining multiple lists of other users in which the user is listed; determining that names of the lists or descriptions of the lists are related to the topic; and after determining that the user appears on a highest number of lists that are related to the topic, identifying the user as the expert of the topic. In another aspect, the method further includes: obtaining profile data about the expert; and displaying in a graphical user interface (GUI) the profile data of the expert, the topic, and the frequently used keyword.
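One possible way to determine frequently used keywords, keyword pairs, hashtags and hyperlinks from the experts' messages is sketched below. Several details are assumptions made only for this illustration: the tokenisation by regular expression, the small stop-word list, counting each keyword once per message, and treating a "keyword pair" as two keywords that occur in the same message.

```python
import re
from collections import Counter
from itertools import combinations
from typing import Dict, Iterable, List, Tuple

# Hypothetical stop-word list; a real system would likely use a larger one.
STOPWORDS = {"a", "an", "and", "the", "to", "of", "in", "is", "for", "on"}
LINK_RE = re.compile(r"https?://\S+")
HASHTAG_RE = re.compile(r"#\w+")


def frequent_terms(messages: Iterable[str], top_k: int = 3) -> Dict[str, List[Tuple]]:
    """Count keywords, keyword pairs, hashtags and links across messages."""
    keywords, pairs, hashtags, links = Counter(), Counter(), Counter(), Counter()
    for message in messages:
        links.update(LINK_RE.findall(message))
        hashtags.update(HASHTAG_RE.findall(message.lower()))
        # Remove links and hashtags before extracting plain keywords.
        plain = HASHTAG_RE.sub(" ", LINK_RE.sub(" ", message.lower()))
        words = sorted({w for w in re.findall(r"[a-z]+", plain) if w not in STOPWORDS})
        keywords.update(words)
        pairs.update(combinations(words, 2))  # co-occurrence within a message
    return {
        "keywords": keywords.most_common(top_k),
        "keyword_pairs": pairs.most_common(top_k),
        "hashtags": hashtags.most_common(top_k),
        "links": links.most_common(top_k),
    }


# Hypothetical messages sent from the identified experts.
expert_messages = [
    "Social marketing tips for brands #seo http://example.com/a",
    "More social marketing ideas #seo #social",
    "Measuring social reach http://example.com/a",
]

terms = frequent_terms(expert_messages, top_k=1)
print(terms)
```

With this data the most frequent keyword is "social", the most frequent pair is ("marketing", "social"), the most frequent hashtag is "#seo", and the most frequently shared link is the example URL.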
[0096] In another general example embodiment, a method performed by a server to identify a user that has interest in a topic, includes: obtaining the topic; determining an ancillary user that the user follows; obtaining a list in which the ancillary user is listed; determining that a name of the list or a description of the list is related to the topic; and determining that the user has interest in the topic.
[0097] In an aspect of the method, multiple ancillary users that the user follows are determined; and the method further includes: for each of the multiple ancillary users, obtaining a given list in which a given ancillary user is listed and determining a given topic from the given list; combining the given topics corresponding to the multiple ancillary users; and determining that the user has interest in the given topics, the given topics including the topic. In another aspect, multiple ancillary users that the user follows are determined; and the method further includes: for each of the multiple ancillary users, obtaining a given list in which a given ancillary user is listed and determining a given topic from the given list; determining which of the given topics is most common amongst the multiple ancillary users; and determining that the most common given topic is the topic in which the user has interest. In another aspect, the method further includes displaying the user in a graphical user interface. In another aspect, the method further includes: determining that a first number of users have interest in the topic; and determining a combination of a second number of users, where the second number is smaller than the first number. In another aspect, each of the first number of users has followers, and the server determines which combination of the second number of users has a highest number of unique followers.
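The two aspects above (combining the topics of all followed users, and selecting the most common topic) can be sketched as follows. The sketch assumes that a topic is "related" to a list when the topic text appears, case-insensitively, in the list's name or description; that matching rule, the function names and the data are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List, Optional, Set


def interest_topic_counts(
    user: str,
    following: Dict[str, Set[str]],
    lists_containing: Dict[str, List[str]],
    topics: Iterable[str],
) -> Counter:
    """Count, per topic, how many followed (ancillary) users appear on a related list."""
    topic_list = list(topics)
    counts: Counter = Counter()
    for ancillary in following.get(user, set()):
        list_texts = lists_containing.get(ancillary, [])
        # Assumed matching rule: substring match on list name or description.
        matched = {
            t for t in topic_list
            if any(t.lower() in text.lower() for text in list_texts)
        }
        counts.update(matched)
    return counts


def most_common_topic(counts: Counter) -> Optional[str]:
    """Return the topic shared by the most ancillary users (ties: alphabetical)."""
    if not counts:
        return None
    return max(sorted(counts), key=lambda topic: counts[topic])


# Hypothetical data: whom the user follows, and names of lists they appear on.
following = {"Dana": {"Alice", "Bob", "Celine"}}
lists_containing = {
    "Alice": ["Social media gurus", "SEO people"],
    "Bob": ["seo experts"],
    "Celine": ["Cooking blogs"],
}

counts = interest_topic_counts("Dana", following, lists_containing, ["social", "seo", "cooking"])
print(sorted(counts), most_common_topic(counts))
```

Under the first aspect, the user is taken to have an interest in every topic with a non-zero count; under the second aspect, only the most common topic is selected.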
[0098] In another general example embodiment, a server configured to analyse data from users, includes: a processor; a communication device; and a memory device. The memory device includes computer executable instructions for at least: obtaining a topic; identifying a user as an expert of the topic; obtaining a message sent from the expert; and determining a frequently used keyword in the message.
[0099] In another general example embodiment, a server configured to identify a user that has interest in a topic, includes: a processor; a communication device; and a memory device. The memory device comprises computer executable instructions for at least: obtaining the topic; determining an ancillary user that the user follows; obtaining a list in which the ancillary user is listed; determining that a name of the list or a description of the list is related to the topic; and determining that the user has interest in the topic.
[00100] It will be appreciated that different features of the example embodiments of the system and methods, as described herein, may be combined with each other in different ways. In other words, different modules, operations and components may be used together according to other example embodiments, although not specifically stated.
[00101] The steps or operations in the flow diagrams described herein are just for example. There may be many variations to these steps or operations without departing from the spirit of the invention or inventions. For instance, the steps may be performed in a differing order, or steps may be added, deleted, or modified.
[00102] The GUIs and screen shots described herein are just for example. There may be variations to the graphical and interactive elements without departing from the spirit of the invention or inventions. For example, such elements can be positioned in different places, or added, deleted, or modified.
[00103] Although the above has been described with reference to certain specific embodiments, various modifications thereof will be apparent to those skilled in the art without departing from the scope of the claims appended hereto.

Claims

1. A method performed by a server for analysing data from users, comprising:
obtaining a topic;
identifying a user as an expert of the topic;
obtaining a message sent from the expert; and
determining a frequently used keyword in the message.
2. The method of claim 1 wherein multiple users are identified as experts of the topic;
multiple messages are sent from the experts; and the frequently used keyword in the multiple messages is determined.
3. The method of claim 1 further comprising: determining a frequently used keyword pair in the message; determining a frequently used hashtag in the message; and determining a frequently used hyperlink in the message.
4. The method of claim 3 wherein multiple users are identified as experts of the topic;
multiple messages are sent from the experts; the frequently used keyword pair in the multiple messages is determined; the frequently used hashtag is determined in the multiple messages; and the frequently used hyperlink is determined in the multiple messages.
5. The method of claim 1 further comprising determining a related topic to the topic, wherein the user is also identified as an expert of the related topic.
6. The method of claim 1 wherein identifying the user as the expert of the topic comprises: obtaining a list of another user in which the user is listed; and
after determining that a name of the list or a description of the list is related to the topic, identifying the user as the expert of the topic.
7. The method of claim 1 wherein identifying the user as the expert of the topic comprises: obtaining multiple lists of other users in which the user is listed;
determining that a name of the list or a description of the lists are related to the topic; after determining that the user appears on a highest number of lists that are related to the topic, identifying the user as the expert of the topic.
8. The method of claim 1 further comprising: obtaining profile data about the expert; and displaying in a graphical user interface (GUI) the profile data of the expert, the topic, and the frequently used keyword.
9. A method performed by a server to identify a user that has interest in a topic, comprising: obtaining the topic;
determining an ancillary user that the user follows;
obtaining a list in which the ancillary user is listed;
determining that a name of the list or a description of the list is related to the topic; and
determining that the user has interest in the topic.
10. The method of claim 9 wherein multiple ancillary users that the user follows are determined; and the method further comprises:
for each of the multiple ancillary users, obtaining a given list in which a given ancillary user is listed and determining a given topic from the given list;
combining the given topics corresponding to the multiple ancillary users; and determining that the user has interest in the given topics, the given topics including the topic.
11. The method of claim 9 wherein multiple ancillary users that the user follows are determined; and the method further comprises:
for each of the multiple ancillary users, obtaining a given list in which a given ancillary user is listed and determining a given topic from the given list;
determining which of the given topics is most common amongst the multiple ancillary users; and
determining that the most common given topic is the topic in which the user has interest.
12. The method of claim 9 further comprising displaying the user in a graphical user interface.
13. The method of claim 9 further comprising: determining a first number of users have interest in the topic; and determining a combination of a second number of users, where the second number is smaller than the first number.
14. The method of claim 13 wherein each of the first number of users has followers, and the server determines which combination of the second number of users has a highest number of unique followers.
15. A server configured to analyse data from users, comprising:
a processor;
a communication device;
a memory device; and
wherein the memory device comprises computer executable instructions for at least: obtaining a topic;
identifying a user as an expert of the topic;
obtaining a message sent from the expert; and
determining a frequently used keyword in the message.
16. A server configured to identify a user that has interest in a topic, comprising:
a processor;
a communication device;
a memory device; and
wherein the memory device comprises computer executable instructions for at least: obtaining the topic;
determining an ancillary user that the user follows;
obtaining a list in which the ancillary user is listed;
determining that a name of the list or a description of the list is related to the topic; and
determining that the user has interest in the topic.
PCT/CA2014/050586 2013-06-21 2014-06-20 System and method for analysing social network data WO2014201570A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201361837933P 2013-06-21 2013-06-21
US61/837,933 2013-06-21

Publications (1)

Publication Number Publication Date
WO2014201570A1 true WO2014201570A1 (en) 2014-12-24

Family

ID=52103754

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CA2014/050586 WO2014201570A1 (en) 2013-06-21 2014-06-20 System and method for analysing social network data

Country Status (2)

Country Link
CA (1) CA2821164A1 (en)
WO (1) WO2014201570A1 (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2005071665A1 (en) * 2004-01-20 2005-08-04 Koninklijke Philips Electronics, N.V. Method and system for determining the topic of a conversation and obtaining and presenting related content
US20070214097A1 (en) * 2006-02-28 2007-09-13 Todd Parsons Social analytics system and method for analyzing conversations in social media
US20090319518A1 (en) * 2007-01-10 2009-12-24 Nick Koudas Method and system for information discovery and text analysis
US20110145398A1 (en) * 2009-12-10 2011-06-16 Sysomos Inc. System and Method for Monitoring Visits to a Target Site
US8447852B1 (en) * 2011-07-20 2013-05-21 Social Yantra, Inc. System and method for brand management using social networks
US20130036107A1 (en) * 2011-08-07 2013-02-07 Citizennet Inc. Systems and methods for trend detection using frequency analysis

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
"Using Twitter lists", WEB LINK CAPTURE, 16 May 2013 (2013-05-16), pages 1 - 3, Retrieved from the Internet <URL:http://web.archive.org/web/20130516090233/https://support.twitter.com/groups/51-me/topics/208-lists/articles/76460-using-twitter-lists> *
"Using Twitter search", WEB LINK CAPTURE, 16 May 2013 (2013-05-16), pages 1 AND 2, Retrieved from the Internet <URL:http://web.archive.org/web/20130516211228/https://support.twitter.com/articles/132700-using-twitter-search> *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160055249A1 (en) * 2014-08-21 2016-02-25 Fujitsu Limited Information processing method, information processing apparatus and storage medium
US10331674B2 (en) * 2014-08-21 2019-06-25 Fujitsu Limited Information processing method, information processing apparatus and storage medium to determine ranking of registrants
US9710563B2 (en) 2015-08-28 2017-07-18 International Business Machines Corporation Search engine analytics and optimization for media content in social networks
US11423439B2 (en) * 2017-04-18 2022-08-23 Jeffrey D. Brandstetter Expert search thread invitation engine
CN109815319A (en) * 2018-12-24 2019-05-28 联想(北京)有限公司 Information processing method and information processing unit

Also Published As

Publication number Publication date
CA2821164A1 (en) 2014-12-21

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14813612

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 14813612

Country of ref document: EP

Kind code of ref document: A1