METHOD AND SYSTEM FOR DETERMINING THE POPULARITY OF A SUBJECT
This invention relates to the determination of the popularity of a subject or system user. More specifically, this invention relates to such determination based upon the quantity and quality of links to other subjects, or the like, established by a subject.
The concept of popularity is easily understood and readily expressed in qualitative terms. For example, individuals know (according to their own terms of reference) who within their social circle is and is not popular. However, with the exception of celebrities or other famous people, it is currently difficult, if not impossible, for an individual to make an assessment of the popularity of someone when that individual is outside their immediate social circle.
A further, more specific, problem which currently exists is faced by those interested in marketing their products or services and those who are seeking to identify suitable individuals to whom to direct their marketing efforts. Word of mouth is viewed as one of the strongest, if not the strongest, forms of marketing. However, word of mouth only works when the person making the recommendation has an influence over the person to whom they are making the recommendation. Those who are marketing actively seek out influential individuals to assist them with their promotions. For example, celebrities are used to endorse products and services in the hope that the buying public will be positively influenced. Marketing companies also recognise, however, that certain "non-famous" individuals within a community can perform a very influential role as regards their peers - the individual could, for example, be a "trend setter" in the local club scene, an avid computer games player or a talented club-level footballer. These individuals tend to be well-connected and therefore are in a position to influence potentially a large group. The problem remains, however, of how to locate such individuals.
The inventors of the present invention have found that the more popular an individual is, the more that individual communicates using telecommunications or other similar communications networks. Such communication may take place by aural (traditional voice telephony) or written means (e.g. electronic mail or Short Message Service (SMS)). A popular individual enjoys an above average level of connectivity to others within telecommunications networks. Despite this correlation between popularity in the "real world" and connectivity within a telecommunications network, there is currently no method or system for translating that connectivity into a measurement of popularity.
The present invention addresses the above problems by providing a method and system for modelling social interaction, as measured through the usage of devices such as mobile telephones, personal digital assistants ("PDAs") and personal computers ("PCs"), to quantify popularity.
In view of the foregoing there is provided a method of generating a popularity indication for a user of a system, comprising the steps of: determining direct and indirect connections between the user and other users of the system; generating a popularity indication; and transmitting the indication, via a communications channel, to a terminal, for display or presentation.
Preferably, the communications channel is the internet, an intranet, a mobile telecommunications network, a terrestrial or cable telecommunications network or a radio communications system or network. However, the channel may merely consist of the standard connections within a desktop, or similar, computer which allow the interrogation of a store of information or database resident within the memory of the computer, by a user thereof, utilising a terminal associated with the computer.
More preferably, the terminal is a mobile telephone, a radio transmitter/receiver, a personal computer, a laptop or palmtop computer, a pager, a WAP or other internet enabled telephone, or an analogous device. Such analogous devices may include an SMS enabled telephone or other wireless networked communications device, or an interactive television console.
According to a preferred embodiment of the present invention, the determination of direct and indirect connections comprises the application, to a record of the system's users and their connections, of an algorithm configured to determine the shortest paths between the users in the record. Preferably, the connections between the user and other users of the system are represented in the form of a graph. Still more preferably, the algorithm used is Dijkstra's algorithm.
According to a preferred embodiment of the present invention, connections between users are based upon one or more of telephone numbers of contacts, electronic mail addresses of contacts, other such contact details, similar interests of users, similar provenance/background of users, or other such criteria.
According to a still further preferred embodiment of the present invention, the generation of a popularity indication comprises the steps of: determining the sum of the shortest paths from the user (I) to other users (IN) of the system; determining the sum of the shortest paths from users (IN) of the system to the user (I); and applying a damping factor (D) to the sum of the sums determined.
Preferably, a weight (W1) is applied to a link from the user (I) to other users (IN). More preferably, a weight (W2) is applied to a link from the system users (IN) to the user (I). Preferably, the weight (W1) applied to a link from the user (I) to the other users (IN) is less than the weight (W2) applied to a link from other users (IN) to the user (I).
According to a further preferred embodiment of the present invention, the magnitude of a weight (W1, W2) is scaled by a strength quotient assigned to a link by a user.
According to a still further preferred embodiment of the present invention, the magnitude of a weight is scaled by the frequency and duration of contact or communication between two users, utilising the system.
According to a still further preferred embodiment of the present invention, the determination of sums is carried out incrementally, for an increasing path (N) from the user (I) to other users (IN).
According to a still further preferred embodiment of the present invention, the damping factor (D) is reduced as the path increases incrementally.
According to a preferred embodiment of the present invention, the damping factor (D) applied to a first link of the path (N) is greater in magnitude than that applied to a second link of the path (N), when the second link is further removed from the user (I) than the first link.
Preferably, the indication transmitted to a user device is displayed or presented aurally, visually, or by the production of motion or the like.
Preferably, the output transmitted includes a popularity quotient and/or a measure of change in popularity since last notification. Still more preferably, the popularity of the user is determined upon request from the user, upon the user accessing the system, or at predetermined intervals.
Also in accordance with the present invention there is provided a system configured to generate a popularity indication for a user, comprising: means for receiving information about a contact from a user, via a communications channel;
a connectivity and popularity module for generating the popularity indication; and means for transmitting to a user, via the communications channel, the popularity indication.
In a preferred embodiment of the present invention, the system further comprises a store for storing details of all system users. Preferably, the stored details include telephone numbers of contacts, electronic mail addresses of contacts, other such contact details, interests of the user and/or the provenance and background of the user.
In a preferred embodiment of the present invention, the communications channel is the internet, an intranet, a mobile/terrestrial/cable telecommunications network or a radio communications system or network.
Preferably, the terminal is a mobile telephone, a radio transmitter/receiver, a personal/laptop/palmtop computer, a pager, a WAP or other internet enabled telephone, or an analogous device.
The above and further features of the present invention are set forth with particularity in the appended claims. The organisation and manner of operation of the invention, together with further objects and advantages thereof, may best be understood by reference to the following description taken in connection with the accompanying drawings.
A specific embodiment of the present invention is now described, by way of example only, with reference to the accompanying drawings, in which:-
Figure 1 is a schematic illustrating a preferred embodiment of the present invention;
Figure 2 is a schematic of a small social network, illustrating one-way or mutual connections between individuals, according to the present invention; and
Figure 3 is a schematic of a simplified form of subject record in accordance with the present invention.
Referring firstly to Figure 1 of the drawings, it may be seen that the system 301 of the present invention is implemented using a computer system server 302, which runs an appropriate operating system 309 and networking software 310 and which connects to the internet or another telecommunications network (for example, a localised internet known as an intranet) 308 using networking hardware 311.
All computer and networking hardware as described above is conventional and as would typically be employed as part of a telecommunications network, for example.
Subject data may be entered into any device 303 suitable for interaction with the system using standard entry means appropriate to that device. "Subject data" is described in more detail below.
Subject data is transmitted (either by wireless link or by fixed link) from a device 303 to the server 302 via the network 308 using standard procedures available for downloading and uploading such data from such a device 303 to such a server 302. This transmission takes place using standard communications protocols such as Wireless Application Protocol (WAP) or Transmission Control Protocol/Internet Protocol (TCP/IP).
As may be seen, there is provided a store or database 304. The database 304 is a database of records corresponding to individuals who make use of the network 308. In practice, the database 304 would contain records corresponding either to every subscriber to the relevant network provider's telecommunications network, or only to those subscribers who have opted to have their details included in the database 304.
The structure of each record in the database 304 is described in more detail below.
In response to the receipt of subject data from a device 303, a connectivity and popularity module 307 (Figure 1) resident on the central processing unit (CPU) of the server 302 operates on subject data to:
(a) add to or otherwise update the database 304;
(b) calculate all direct and indirect links between a subject (I) 306 and other individuals (IN) whose records are stored in the database 304; and
(c) produce an output which includes a measurement of the subject's (I) 306 popularity.
The connectivity and popularity module may be written in any suitable programming language, such as C++, or alternatively, it may be implemented using bespoke special purpose hardware or programmable hardware. The operation of the connectivity and popularity module 307 is described in more detail below.
The output of the module 307, which includes a measurement of the subject's (I) 306 popularity, is then sent back to the device 303 via the network 308 and communicated to the subject 306 in any one or more of a number of forms (a number, a graphic character, an audible signal, etc.) displayed or presented by the device 303. The act of sending the output to the device 303 may either be initiated by the connectivity and popularity module 307, or may be a response to a request signal sent from the device 303 under the subject's (I) 306 control.
In its simplest form, subject data may be a copy of a contact list of the subject (I) 306 as created using and/or stored on the device 303. For example, it could be the names and telephone numbers of contacts as created by the subject 306 and stored within the memory of a mobile telephone 303. Alternatively, it could be contact details as stored within a proprietary database such as Microsoft (RTM) Outlook within a personal computer 303. Subject data may, at the subject's option, consist of such further information as the software running on the device 303 allows, such as electronic mail addresses, web site addresses, etc.
The device 303 may be configured so as to allow the subject 306 to input a wide range of data to form subject data, for example, his or her age, address, interests and background such as school attended, and how highly he or she rates a connection to another individual.
It is also possible for subject data to be provided in the form of the data in a network provider's existing customer database as it exists from time to time (to the extent that the database 304 differs from the network provider's customer database). For example, if a network provider purchases another network provider, thereby acquiring an existing subscriber database 313 on a computer 305, this database could be input to the server 302 as subject data either via the network 308 or by direct means, such as inter computer/server connection.
The database 304 is organised so as to provide a model of the social network formed by either all the individuals using the network 308 or those individuals who have opted to form part of the network, as described above. Its organisation is now described, referring to Figures 2 and 3.
Figure 2 is a greatly simplified schematic of a very small social network. An alternative term for a network is a graph, and that term will be used hereinafter so as to prevent confusion with a reference to the network 308 referred to above.
Only the names of individuals within the graph are shown in Figure 2. The arrows on the connecting lines indicate whether the individual in question knows someone, is known by someone, or whether the relationship is mutual.
Such graphs may be modelled by computing means in a variety of different ways, all of which are standard (see for example the discussion on graphs in Aho, Hopcroft and Ullman, Data Structures and Algorithms; 1987, incorporated herein by reference).
Referring now to Figure 3, illustrated is a subject record 100 comprising the following fields: the subject's name 101, the subject's mobile telephone number 102, links to the subject's contacts 103 and other information relating to the subject 104 (e.g. age, address, interests). The links simply point to records corresponding to the other individuals whom the subject 306 knows.
A subject record 100 may be implemented as a linked list structure, to allow a variable number of contacts and other details for each subject.
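By way of illustration only, a record of this kind might be sketched in Python as follows. The class name, field names and helper functions are assumptions made for this example and do not appear in the drawings; a Python list stands in for the linked list structure mentioned above, and the telephone number is assumed to serve as the unique key of each record.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SubjectRecord:
    """Hypothetical stand-in for subject record 100."""
    name: str                                                  # field 101
    mobile_number: str                                         # field 102
    contacts: List[str] = field(default_factory=list)          # field 103: numbers of known subjects
    other_info: Dict[str, str] = field(default_factory=dict)   # field 104: age, interests, etc.


# Hypothetical stand-in for database 304, keyed by telephone number.
database: Dict[str, SubjectRecord] = {}


def upsert(record: SubjectRecord) -> None:
    """Add a record, or replace the existing record with the same number."""
    database[record.mobile_number] = record


def add_contact(owner_number: str, contact_number: str) -> None:
    """Record that the owner knows the contact, ignoring duplicates."""
    contacts = database[owner_number].contacts
    if contact_number not in contacts:
        contacts.append(contact_number)


# Example usage with made-up telephone numbers.
upsert(SubjectRecord("Fred Bloggs", "111"))
upsert(SubjectRecord("Joe Smith", "222", other_info={"interests": "football"}))
add_contact("111", "222")
add_contact("111", "222")  # a repeated entry leaves the record unchanged
```

A link stored in this way is one-directional: Fred's record pointing to Joe's does not imply that Joe's record points back to Fred's.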
The determination of connectivity, as carried out by the connectivity and popularity module 307 is now described. The method of determining connectivity includes:
(a) processing subject data so as to maintain the database 304; and
(b) depending on the nature of the subject data, determining all the direct and indirect links from each individual to all other individuals whose records are stored in the database 304.
Step (a) above consists of using subject data to:
(i) update the relevant records in the database 304, converting the subject data into the form of record 100 as required; and
(ii) maintain the links 103 between records, as described above.
Step (b) may be carried out using Dijkstra's algorithm (see Aho, Hopcroft and Ullman, Data Structures and Algorithms; 1987, page 203 onwards for a detailed description of Dijkstra's algorithm, incorporated herein by reference).
Dijkstra's algorithm solves the problem of finding the shortest path from a vertex in a graph to any other vertex in the graph. The same algorithm finds the shortest paths from a given vertex to all other vertices in the graph in the same time. For example, referring to Figure 2, it can be seen that the vertex Fred Bloggs is one step away from Joe Smith (his immediate neighbour) but 2 steps away from Alice Carter (his neighbour once removed) and 3 steps away from Bill Anderson. These are all shortest path measurements. However, if Fred and Bill meet, say, and become friends, such that they enter each other's names in their contact field(s) (as described above), a new reciprocal link in the graph would be generated directly between Fred and Bill and this link would be the new shortest path between those two individuals.
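The Fred and Bill example may be illustrated with a minimal Python sketch of Dijkstra's algorithm in which every link has a length of one step. The edge list used here is an assumption: only the step counts from Fred Bloggs are given above, so a simple chain of mutual links (Fred, Joe, Alice, Bill) is assumed for the purposes of the example. The function name shortest_paths is likewise hypothetical.

```python
import heapq


def shortest_paths(graph, source):
    """Dijkstra's algorithm with unit-length links.

    graph maps each vertex to the list of vertices it links to.
    Returns a dict of shortest step counts from source to every reachable vertex.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry, a shorter path was already found
        for neighbour in graph.get(node, ()):
            candidate = d + 1
            if candidate < dist.get(neighbour, float("inf")):
                dist[neighbour] = candidate
                heapq.heappush(heap, (candidate, neighbour))
    return dist


# Assumed chain of mutual links consistent with the step counts in the text.
graph = {
    "Fred Bloggs": ["Joe Smith"],
    "Joe Smith": ["Fred Bloggs", "Alice Carter"],
    "Alice Carter": ["Joe Smith", "Bill Anderson"],
    "Bill Anderson": ["Alice Carter"],
}
before = shortest_paths(graph, "Fred Bloggs")

# Fred and Bill enter each other as contacts: a new reciprocal link.
graph["Fred Bloggs"].append("Bill Anderson")
graph["Bill Anderson"].append("Fred Bloggs")
after = shortest_paths(graph, "Fred Bloggs")
```

Under the assumed edges, Bill Anderson moves from 3 steps away to 1 step away once the reciprocal link is added, while the distances to Joe Smith and Alice Carter are unchanged.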
Such shortest direct and indirect links are determined by checking the field of each record which stores telephone numbers, since telephone numbers are unique whereas a person's name may not be ("John Smith" for example). Of course, other unique fields, such as electronic mail addresses, may be utilised for this purpose.
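As a hedged illustration of matching on a unique field, the following Python sketch resolves each stored contact number against the set of known records. The dictionary layout, the function name build_links and the telephone numbers are all assumptions made for this example.

```python
def build_links(records):
    """Map each subject's number to the numbers of contacts who have records.

    records maps a telephone number to a dict with "name" and "contacts" keys.
    Contacts with no record of their own, and self-references, are dropped.
    """
    return {
        number: [c for c in rec["contacts"] if c in records and c != number]
        for number, rec in records.items()
    }


# Two different subjects share a name but are distinguished by number.
records = {
    "07700900001": {"name": "John Smith", "contacts": ["07700900002", "07700900999"]},
    "07700900002": {"name": "John Smith", "contacts": ["07700900001"]},
    "07700900003": {"name": "Alice Carter", "contacts": ["07700900002"]},
}
links = build_links(records)
```

Matching on the name field would merge the two John Smith records in this example, whereas matching on the telephone number keeps them apart.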
Having calculated each individual's neighbours as described above, the popularity of a given individual (I) is determined by the module 307. A pseudocode representation of the method is set out below:
1. Set N (the path length or number of steps to a neighbour) to be 1, so that in the first pass, a user's (I) immediate links are examined.
2. Set a "damping factor" D to be 1.
3. Set a popularity factor P to be 0.
4. Add up the number of links:
(a) from the user (I) to his or her neighbours (IN) N away, applying any weighting W1 associated with those links;
(b) to the user (I) from his or her neighbours (IN) N away, applying any weighting W2 associated with those links.
5. Multiply the total by the damping factor (D) and add this to the popularity factor (P).
6. Increment N by a single step.
7. Reduce D by a proportion. This proportional reduction will be described further below.
8. Repeat until all unique shortest direct and indirect paths between the user (I) and the neighbours (IN) thereof have been traversed, i.e. incrementally increase N until all vertices within a graph directly and indirectly connected to (I) have been included in the analysis.
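The steps above may be sketched in Python as follows. This is one possible reading of the pseudocode rather than a definitive implementation: a single weighting W1 is assumed for all outgoing links and a single weighting W2 for all incoming links, the damping factor is assumed to be reduced by a fixed proportion (the hypothetical parameter decay) at each step, and a breadth-first search is used in place of Dijkstra's algorithm because every link is treated as one step long. All function and parameter names are assumptions.

```python
from collections import deque


def hop_counts(graph, source):
    """Return {N: number of vertices whose shortest path from source is N steps}."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    counts = {}
    for steps in dist.values():
        if steps > 0:
            counts[steps] = counts.get(steps, 0) + 1
    return counts


def reverse(graph):
    """Return the graph with the direction of every link reversed."""
    rev = {}
    for src, targets in graph.items():
        rev.setdefault(src, [])
        for target in targets:
            rev.setdefault(target, []).append(src)
    return rev


def popularity(graph, user, w1=0.1, w2=1.0, decay=0.5):
    """Accumulate damped, weighted link counts for increasing path length N."""
    out_counts = hop_counts(graph, user)           # step 4(a): user to neighbours
    in_counts = hop_counts(reverse(graph), user)   # step 4(b): neighbours to user
    max_n = max(list(out_counts) + list(in_counts), default=0)
    n, d, p = 1, 1.0, 0.0                          # steps 1 to 3
    while n <= max_n:                              # step 8
        total = w1 * out_counts.get(n, 0) + w2 * in_counts.get(n, 0)
        p += d * total                             # step 5
        n += 1                                     # step 6
        d *= decay                                 # step 7
    return p


# Small made-up graph: A and B know each other, C knows A, B knows D.
example = {"A": ["B"], "B": ["A", "D"], "C": ["A"], "D": []}
score_a = popularity(example, "A")
score_d = popularity(example, "D")
```

In this made-up graph, A is known directly by two subjects (B and C), whereas D is known directly only by B, so A receives the higher score under the default weightings.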
The damping factor (D) may be any function which reflects the fact that as one moves out further and further from the individual's (I) 306 immediate social circle (i.e. N is increased), then the effect of other individuals on that individual's (I) 306 level of popularity (P) becomes less and less. This may be achieved using a negative exponential function, for example, to compute D. Of course, it is possible that a simpler or more complex reduction function may be utilised.
After a certain number of iterations, assuming something like a negative-exponential damping factor, the percentage increase in popularity (P) per iteration will drop to a minimal level. By monitoring the percentage increase, the algorithm can be caused to quit when the percentage drops down to, or below, a pre-set level. This prevents effectively redundant processing. An alternative method of cutting down on the amount of unnecessary processing is to run iterations to a maximum number of steps (N) (of 3 only, for example), therefore making the assumption that subjects over 3 connections or steps away from the individual (I) 306 should not have any contributory effect on popularity (P).
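Both ways of limiting the processing may be illustrated with a short Python sketch. The particular damping function exp(-(N - 1)), the threshold of 1% and the function names are assumptions chosen for this example; the weighted link totals for each path length are supplied as a plain list so that the sketch is self-contained.

```python
import math


def damping(n, rate=1.0):
    """Hypothetical negative-exponential damping factor, equal to 1 when N = 1."""
    return math.exp(-rate * (n - 1))


def accumulate(totals, min_gain=0.01, max_steps=None):
    """Sum damped totals, stopping early when further steps add little.

    totals[k] is the weighted link total at path length N = k + 1.
    Returns the popularity P and the number of steps actually processed.
    """
    p = 0.0
    steps = 0
    for n, total in enumerate(totals, start=1):
        if max_steps is not None and n > max_steps:
            break                       # fixed cap on the path length N
        previous = p
        p += damping(n) * total
        steps = n
        if previous > 0 and (p - previous) / previous <= min_gain:
            break                       # proportional increase at or below the pre-set level
    return p, steps


# Made-up totals: ten weighted links at every path length from 1 to 8.
totals = [10.0] * 8
p_threshold, steps_threshold = accumulate(totals, min_gain=0.01)
p_capped, steps_capped = accumulate(totals, max_steps=3)
```

With these made-up totals the threshold rule stops before all eight path lengths have been processed, and the cap stops after exactly three.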
Step 4 in the pseudo-code above makes reference to two weightings (W1, W2). These modify the numeric value assigned to the connection between the user (I) and the other subject vertices (IN) in the graph. The weighting may be a product of any one or more of a number of factors, and set out below are some example factors which may be used to determine the weightings:
(a) W2 should be set to a greater value than W1 to reflect the fact that being known by someone is a more important contribution to popularity than knowing someone. For example, W2 may be set to 1 whilst W1 is set to 0.1. Thus, being known by someone is quantified as being 10 times more valuable a contribution to popularity (P) than knowing someone.
An alternative method of discounting the value of a purported link by the user (I) to another user (IN) is to ignore the purported link completely for the purposes of calculating popularity (P) and to take the link into account only if the individual (IN) to whom the user (I) says he or she is connected confirms the link. This confirmation may take place only if the individual's subject data contains the user (I) as a contact. Alternatively, when the subject data of the user (I), containing the individual (IN) as a contact, is first processed as described above, the connectivity and popularity module 307 may send, as an additional output, a message to the individual (IN) asking the individual (IN) to confirm the connection cited by the user (I).
(b) Weightings may be affected by:
(i) A classification within the subject data of the nature of the friendship (or other classification of connection) between the two vertices. For example, the user (I) could classify his or her friend as a 7 on a scale of 1-10, indicating a good friend. Alternatively, the user (I) could classify his or her friend as a 1 on a scale of 1-10, indicating a passing acquaintance.
(ii) The frequency and duration of communications between the two vertices as measured through monitoring means within the system 301 of telecommunications traffic data (not illustrated). Such monitoring is a standard function of existing traffic and billing systems within existing telecommunications systems and the output of such monitoring provides input for subscriber records. This is also an illustration of how the system could dynamically and automatically adjust and update an individual's (I) popularity measurement based on that individual's usage of telecommunications media.
(iii) A calculation performed by the connectivity and popularity module 307, based on the geographic proximity of the subject vertices in question.
It should be stressed that these weighting factors, and the example values given, are illustrative only and not the only factors or values which may be used.
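A minimal Python sketch of how such factors might be combined into a single weighting as a product is given below. The base values of 1 and 0.1 are the example values given above; the division of the 1-10 classification by 10, the traffic_factor parameter and the function name link_weight are assumptions introduced for this example only.

```python
def link_weight(incoming, rating=None, traffic_factor=1.0):
    """Combine example factors into one weighting for a link.

    incoming is True for a link to the user (W2) and False for a link
    from the user (W1). rating is an optional 1-10 classification of the
    connection. traffic_factor is a hypothetical multiplier derived from
    the frequency and duration of communications.
    """
    weight = 1.0 if incoming else 0.1   # example values for W2 and W1
    if rating is not None:
        weight *= rating / 10           # assumed scaling of the 1-10 classification
    return weight * traffic_factor


w2_plain = link_weight(incoming=True)
w1_plain = link_weight(incoming=False)
w2_good_friend = link_weight(incoming=True, rating=7)
w2_acquaintance = link_weight(incoming=True, rating=1)
```

Under these assumptions an incoming link from a contact rated 7 carries seven times the weighting of an incoming link from a contact rated 1.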
Having determined the popularity (P) of an individual (I) 306, this measurement is then output back to the individual 306. The output is transmitted to the individual's device 303, via the internet or other telecommunications network. Of course, the indication of popularity may be delivered to a third party as well as/instead of the individual to whom it relates.
The output transmitted may take the form of an indication of the popularity (P), or of the difference in popularity quantification since the last time popularity (P) was calculated. The output may take many forms, for example numeric, text, graphical, colour, shade, audible signal or translation into mechanical motion of an output device (e.g. oscillation). The output may be converted into one or more of these forms either at the server end, before transmission, or by the device.
It will be appreciated by the reader that the connectivity and popularity module described need not be run each time the database is updated, as described above, (for example, when a new subject record is added to the database). The module may run as a batch process at predetermined times during the day or at other intervals, depending on how often the popularity measurements are required to be updated.
Additionally, the algorithm used to calculate the connections between subjects need not be Dijkstra's algorithm - the choice of algorithm is not important, provided, of course, it produces the required information as to connectivity.
It may be noted that a preferred embodiment of the invention as illustrated uses a record structure 100 which stores information other than just names and telephone numbers. This information (age, interests, background etc.) may be used to calculate popularity levels within a specific category. For example, by calculating links based on criteria other than or in addition to telephone number, the invention may be used to derive popularity quantifications within a specific sector such as "subscribers aged 24" or "subscribers with an interest in football".
Further, using a telecommunications network (including the internet) in association with the method and system described herein is not a necessary feature of the present invention. The present invention is applicable to other communication networks/channels than the internet. Such networks/channels include intranets (networks that are internal to particular organisations) and also stand alone computers. The internet as described herein, is merely one example of a remote network that can be utilised in accordance with a preferred embodiment of the present invention. Further, whilst the embodiment described is restricted to one telecommunications network 308 controlled by one network provider, it is evident that the method and system of the present invention may easily be extended across multiple networks.
Finally, whilst the embodiment above has been illustrated by reference to the subject 306 being an individual, it may readily be seen that the invention is equally applicable to the calculation of a measurement of popularity of a business, or of any other entity.
It will of course be understood that the present invention has been described above by way of example only, and that modifications of detail can be made within the scope of the invention.