WO2013104922A1 - System and method for generating a list of recommended items - Google Patents

Info

Publication number
WO2013104922A1
Authority
WO
WIPO (PCT)
Prior art keywords
books
proximity
computer
items
user
Prior art date
Application number
PCT/GB2013/050058
Other languages
French (fr)
Inventor
Francois PLANCKE
Matthew INNES
Keith CROOK
Esteban SIMON
Original Assignee
Mobcast Services Limited
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Mobcast Services Limited
Priority to EP13702250.5A (EP2803028A1)
Publication of WO2013104922A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising

Definitions

  • The present invention relates to systems and methods for generating a list of recommended items. More particularly, the present invention relates to systems and methods for generating a list of recommended items using one or more proximity networks. Each proximity network defines relationships between similar items, wherein the similarity of two items is based on predefined criteria specific to the proximity network.
  • Reading books is a common pastime for many people. Typically after a person finishes a book they rely on recommendations to pick their next book or books. Traditionally recommendations were provided by a librarian, a bookstore salesperson, a best sellers list, book critics and/or friends.
  • The search engine technique typically allows a user to locate books using keywords.
  • The keywords may relate to the title of the book, the author, or genre, for example.
  • One of the problems with the search engine method is that it allows only very limited searching.
  • The user has to have some idea of what they are looking for to generate the keywords. Accordingly, the search engine method requires a significant effort from the user and does not typically provide an efficient means to locate books that match a user's tastes.
  • the collaborative filtering technique recommends books to users based on the interest of a community of users, without any analysis of book content.
  • Some collaborative filtering methods are based on the purchase history of the users. For example, if a user purchases or is considering purchasing a book then the method may recommend other books purchased by users who also purchased that book.
  • Other collaborative filtering methods are based on how well a user's profile matches other users that expressed similar opinions about a given book. For example, such collaborative filtering methods operate by enabling users to rate individual books. Through this process, each user builds a personal profile of ratings data. To generate recommendations for a particular user, the user's profile is compared to the profiles of other users to identify one or more similar users. Books that were rated highly by these similar users are then recommended to the user.
  • Collaborative filtering techniques are very popular because they are simple. However, collaborative filtering techniques suffer from a number of problems. One problem is that collaborative filtering techniques base their recommendation on how well the user matches the preference of other users. As a consequence, these techniques are less accurate if the user base is small. Another problem is that a book cannot be recommended until it is purchased or rated by other users. Accordingly, collaborative filtering techniques tend to recommend mostly best sellers. Another problem with collaborative filtering techniques is that they don't typically take any context into consideration. For example, a surgeon with small children may purchase books on surgery and children's books.
  • a computer-implemented method to generate a list of recommended items from a plurality of items comprising: generating a proximity network, the proximity network defining relationships between similar items of the plurality of items, wherein the similarity of two items is based on predefined criteria; receiving data identifying an item of the plurality of items; and generating the list of recommended items by selecting a plurality of items similar to the identified item using the proximity network.
  • a computer-implemented system to generate a list of recommended items from a plurality of items, the system comprising: a storage module configured to generate and store a proximity network, the proximity network defining relationships between similar items of the plurality of items, wherein the similarity of two items is based on predefined criteria; and a proximity module configured to: receive data identifying an item of the plurality of items; generate the list of recommended items by selecting a plurality of items similar to the identified item using the proximity network; and output the list of recommended items.
  • Figure 1 is a block diagram of a system for generating a list of recommended items
  • Figure 2 is a depiction of an exemplary proximity network
  • Figure 3 is a schematic of an interface for displaying the list of recommended items
  • Figure 4 is a flow chart of a method for generating a content proximity network
  • Figure 5 is a flow chart of a method for generating a rating proximity network
  • Figure 6 is a flow chart of a method for generating an author proximity network
  • Figure 7 is a flow chart of a method for generating a list of recommended items.
  • Each proximity network defines relationships between similar items, wherein the similarity of two items is based on predefined criteria specific to the proximity network.
  • Figure 1 illustrates a system 100 for generating a list of recommended items in accordance with an embodiment.
  • System 100 is described as generating book recommendations; however, it will be evident to a person of skill in the art that the general principles described herein may be equally applied to provide recommendations of other items such as compact discs, video discs, and video or music downloads.
  • A service module 102 provides a book service to one or more users.
  • The term "book service" is intended to cover any hardware, software, or combination of hardware and software that allows a user to browse, obtain information about, or get a copy of books, including an electronic book ("eBook") website, an eBook reader book store (e.g. Uncuva, the Samsung E60/E65 eReader book store and Amazon's Kindle Book Store) and a website that discusses books.
  • the system 100 receives data from the service module 102 identifying a particular book and a particular user.
  • the system 100 uses the data received from the service module 102 and one or more proximity networks to generate a list of recommended books.
  • the system 100 comprises a proximity module 104, a navigation module 106, a learning module 108 and a storage module 110 which work together to generate the list of recommended books.
  • the proximity module 104 generates a list of recommended books that are similar to the identified book from one or more proximity networks.
  • the navigation module 106 orders or sorts the list of recommended books based on how closely the books match the user's characteristics and preferences.
  • the learning module 108 adjusts the parameters used by the proximity module 104 and the navigation module 106 to generate the best list of recommended books for the particular service module 102 and for the particular user.
  • the storage module 110 stores and generates the data used by the other three modules (i.e. the proximity module 104, the navigation module 106, and the learning module 108), such as the list of books and the proximity network(s).
  • Each module of the system 100 will be described in detail below.
  • the proximity module 104 receives data identifying a particular book and generates a list of recommended books that are similar to the identified book using one or more proximity networks.
  • the data identifying a particular book may be provided directly to the proximity module 104 by the service module 102.
  • the service module 102 may provide the data identifying the book to the storage module 110 which in turn provides the data to the proximity module 104.
  • a proximity network defines relationships between similar books, wherein the similarity of two books is based on predefined criteria. Typically each proximity network uses different criteria to assess the similarity of books.
  • An exemplary proximity network 200 is shown in Figure 2.
  • Each proximity network may comprise nodes and links connecting pairs of nodes.
  • the exemplary proximity network 200 of Figure 2 comprises six nodes 202, 204, 206, 208, 210 and 212 connected by a plurality of links 214, 216, 218, 220, 222, 224, 226, 228 and 230.
  • Each node 202, 204, 206, 208, 210 and 212 represents one or more books stored in the storage module 110.
  • each node is a book and thus each node represents a single book.
  • each node is an author and thus each node represents one or more books (e.g. all the books by that particular author).
  • Each link 214, 216, 218, 220, 222, 224, 226, 228 and 230 is assigned a distance value d1, d2, d3, d4, d5, d6, d7, d8 or d9 which represents the similarity of the two nodes (e.g. books or authors) connected by the link based on the predefined criteria.
  • the predefined criteria may be used to generate a weight that represents the similarity of two nodes.
  • the weight may be an integer that cannot be zero.
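  • The node-and-link structure described above can be sketched as a small adjacency map. This is a minimal illustration, not the patented implementation: the class name `ProximityNetwork`, its methods, the node labels and the distance values are all hypothetical, and storing each link in both directions assumes that similarity is symmetric.

```python
from dataclasses import dataclass, field


@dataclass
class ProximityNetwork:
    """Graph whose links carry a distance value (hypothetical layout)."""

    # adjacency[node][neighbour] = distance value of the link between them
    adjacency: dict = field(default_factory=dict)

    def add_link(self, a: str, b: str, distance: float) -> None:
        # Assumption: similarity is symmetric, so store the link both ways.
        self.adjacency.setdefault(a, {})[b] = distance
        self.adjacency.setdefault(b, {})[a] = distance

    def neighbours(self, node: str) -> dict:
        # Nodes joined to `node` by a link, mapped to their distance values.
        return dict(self.adjacency.get(node, {}))


# Illustrative nodes and distances only; they are not taken from Figure 2.
network = ProximityNetwork()
network.add_link("book_202", "book_204", 3)
network.add_link("book_202", "book_206", 1)
network.add_link("book_204", "book_208", 2)
```

  • In an author proximity network the same structure would hold author identifiers as nodes, with a separate lookup from author to books.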
  • Example proximity networks include a content proximity network, a rating proximity network and an author proximity network. However, it will be understood by those of skill in the art that other proximity networks may be used.
  • a content proximity network uses the content of the books to determine their similarity.
  • a content proximity network may be generated by counting the occurrence of each word in the books and comparing the high occurrence words.
  • High occurrence words are those words that occur most frequently in a particular book.
  • the high occurrence words may be the X most frequent words where X is an integer greater than 1.
  • the high occurrence words may be the 100 most frequently used words.
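  • As a rough sketch of the word-counting step, the X most frequent words of each book can be extracted and compared. The text above does not specify how the high occurrence words are compared, so counting the shared words is an assumption made here for illustration; the function names and the simple tokenizer are also hypothetical.

```python
import re
from collections import Counter


def high_occurrence_words(text: str, x: int = 100) -> set:
    # Lower-case the text and split it into simple alphabetic tokens.
    words = re.findall(r"[a-z']+", text.lower())
    # Keep the x most frequent words.
    return {word for word, _ in Counter(words).most_common(x)}


def shared_word_count(text_a: str, text_b: str, x: int = 100) -> int:
    # Assumed comparison: how many high occurrence words the two books share.
    return len(high_occurrence_words(text_a, x) & high_occurrence_words(text_b, x))
```

  • A practical system would probably discard very common function words before counting, since otherwise the most frequent words of most books would look alike.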
  • a rating proximity network uses the ratings of the books to determine their similarity. For example, a rating proximity network may be based on the premise that the more often two books have been rated positively by the same users, the more similar they are. Since a rating proximity network links books that have been similarly rated by multiple users, it is particularly effective at recommending well known books having a large readership (e.g. novels and children's books).
  • the rating proximity network can be seen as operating similarly to a librarian that knows a set of people that like the same kind of books and then recommends to members of the group the books that one of the members has enjoyed.
  • An exemplary method for generating a rating proximity network will be described in reference to Figure 5.
  • An author proximity network uses ratings of authors and/or books to determine the similarity of authors. For example, when author ratings are available, a link may be established between two authors if the authors have been positively rated by at least two users. Where author ratings are not available for a pair of authors, a link may be established between the authors if multiple books by the authors have been rated positively by at least two users.
  • the author proximity network is based on the premise that the more two authors (or books by those authors) are rated positively by the same users, the more similar the authors are.
  • An author proximity network is designed to recommend books that relate to different subject matter than previously explored by the user and not necessarily well known (e.g. not rated by many users). An exemplary method for generating an author proximity network will be described in reference to Figure 6.
  • the proximity module 104 uses the data identifying a particular book and one or more proximity networks to compile a list of books that are similar to the identified book.
  • the list of books similar to the identified book forms the list of recommended books.
  • the proximity module 104 may be configured to generate the list of similar books by selecting those nodes from the proximity network(s) that are connected by a link to the node representing the identified book. This is based on the premise that if two nodes are connected by a link then they by definition have some level of similarity.
  • Where the proximity network is a content proximity network or a rating proximity network, the "node representing the identified book" is the node for the identified book.
  • Where the proximity network is an author proximity network, the "node representing the identified book" is the node for the author of the identified book.
  • The list of similar books is then the list of books represented by the selected nodes.
  • Where the proximity network is a content proximity network or a rating proximity network, the nodes are books; thus the list of similar books comprises each node/book selected.
  • Where the proximity network is an author proximity network, the nodes are authors; thus the list of similar books comprises the books written by each node/author selected.
  • the proximity module 104 may be further configured to use the distance values of the links to only select a limited number of similar books. Specifically, the proximity module 104 may be configured to select only the X most similar nodes from the proximity network(s), where X is a predefined integer greater than zero. For example, the proximity module 104 may be configured to select the 10 most similar nodes to the node representing the identified book using the distance values.
  • the proximity module 104 may be configured to select only the nodes that have a distance value below a predetermined threshold. Potentially, where nodes are authors a node may correspond to a large number of books. Accordingly, where some or all of the selected nodes correspond to a plurality of books, the proximity module 104 may be configured to select only the Y most similar books, where Y is a predefined integer greater than zero.
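  • The two selection rules above (keep the X most similar nodes, or keep nodes under a distance threshold) might be combined as follows. This sketch assumes that a smaller distance value means greater similarity; the function name and parameter names are hypothetical.

```python
def select_similar(links: dict, max_nodes: int = 10, max_distance=None) -> list:
    """Pick neighbours of the identified node.

    `links` maps each linked node to the distance value of its link.
    Assumption: the smaller the distance, the more similar the node.
    """
    candidates = list(links.items())
    if max_distance is not None:
        # Keep only nodes whose distance is below the threshold.
        candidates = [(node, d) for node, d in candidates if d < max_distance]
    # Order from most to least similar and keep at most max_nodes nodes.
    ranked = sorted(candidates, key=lambda item: item[1])
    return [node for node, _ in ranked[:max_nodes]]
```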
  • the proximity module 104 may be configured to select similar books from only one proximity network. In other cases, the proximity module 104 may be configured to select similar books from a plurality of proximity networks so that the list of recommended books includes books identified as similar by different criteria. For example, if the proximity module 104 selects similar books from both the content proximity network and the rating proximity network, the list of recommended books will have books identified as similar by the content of the books and by the ratings of the books.
  • the proximity module 104 may receive information from the learning module 108 that defines the combination of proximity networks to be used in generating the list of recommended books.
  • the learning module 108 may be configured to select a service profile from a plurality of predefined service profiles that will provide the best list of recommended books for the particular service module 102.
  • Each service profile assigns a proportion value to each of the different proximity networks. Typically the proportion values add up to 1 or 100%.
  • A particular service profile may assign the content proximity network a proportion value of 0.5, the rating proximity network a proportion value of 0.3, and the author proximity network a proportion value of 0.2. This would mean that when this service profile is used by the proximity module 104, 50% of the recommended books would be selected from the content proximity network, 30% of the recommended books would be selected from the rating proximity network, and 20% of the recommended books would be selected from the author proximity network. Since the proximity networks are not based on information about the particular user, the proximity module 104 will typically produce the same list of recommended books for the same identified book regardless of the identity of the user.
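  • One way to turn a service profile into a number of books per proximity network is sketched below. The text does not say how fractional shares are rounded, so the largest-remainder rounding used here is an assumption, as are the function name and the dictionary keys. The sketch also assumes the proportion values sum to 1.

```python
def allocate_slots(profile: dict, list_size: int) -> dict:
    """Split list_size recommendations across proximity networks.

    `profile` maps a network name to its proportion value; the values
    are assumed to add up to 1.
    """
    exact = {name: share * list_size for name, share in profile.items()}
    # Start from the whole-number part of each share.
    slots = {name: int(value) for name, value in exact.items()}
    leftover = list_size - sum(slots.values())
    # Give any remaining slots to the largest fractional parts first.
    by_remainder = sorted(exact, key=lambda n: exact[n] - slots[n], reverse=True)
    for name in by_remainder[:leftover]:
        slots[name] += 1
    return slots


profile = {"content": 0.5, "rating": 0.3, "author": 0.2}
```

  • With this profile and a list of ten books, the split is five books from the content network, three from the rating network and two from the author network.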
  • the navigation module 106 receives the list of recommended books generated by the proximity module 104 and the data identifying the user and orders the list of recommended books so that the books that are most likely to appeal to the identified user are presented first (or at the top of the list). In some cases, once the books have been ordered, the navigation module 106 may remove one or more books from the bottom of the list to eliminate books that are less likely to appeal to the user.
  • the navigation module 106 may order the recommended books by comparing information about the user and their book preferences to characteristics of the recommended books. This comparison may comprise comparing the general profile of the identified user against the general profiles and rating profiles of the recommended books.
  • The general profile of a book typically comprises one or more criteria which characterize the book.
  • The criteria are typically selected to provide the most relevant information in assessing the similarity of the book to the user's preferences.
  • The criteria may include, but are not limited to, one or more of the following: genre, size, style signature, part of a series, time period, gender, age, price, language, and user home location. It will be evident to a person of skill in the art that other suitable criteria may be used alternatively to the listed criteria or in addition to the listed criteria.
  • the "genre” criterion may be used to indicate the subject category to which the book belongs. Typically the genre information is provided by the publisher of the book. In some cases there may be a predefined list of genres.
  • the "size” criterion may be used to indicate the length of the book. In some cases the number of words may be used to assess the size of the book. For example, each book may be given a size rating from 1 to 10 based on the number of words in the book. Typically a size rating of "1" would indicate that the book is small (e.g. has a small number of words) and a size rating of "10" would indicate that the book is large (e.g. has a large number of words).
  • the number of pages may be used to assess the size of the book.
  • the number of pages is typically a less accurate means for assessing the size of the book since it does not take into account the density of the content on each page. For example, it does not take into account varying font and page sizes between books.
  • the "style signature" criterion may be used to indicate the complexity of the book.
  • each book may be given a style signature rating from 1 to 10 that represents the complexity of the book.
  • the style signature rating may be based on the number of words in each sentence and/or the length of the words in the book.
  • the style signature rating may be based on the average length of the longest sentences in the book.
  • the longest sentences in the book may be the top percentage (e.g. 5%) of the longest sentences.
  • the style signature rating may be calculated in accordance with Table 1.
  • the style signature rating may be based on the Flesch Reading Ease Score (FRES) of the book.
  • The FRES is calculated from the following formula: FRES = 206.835 - (1.015 × ASL) - (84.6 × ASW), where ASL is the average sentence length and ASW is the average number of syllables per word. This rates the text on a 100-point scale; the higher the score, the easier the document is to understand, with 60 to 70 being an acceptable score for literate adults.
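  • The formula can be checked with a short worked example. The function name is hypothetical, and the inputs (an average sentence length of 15 words and 1.5 syllables per word) are invented for illustration; counting syllables in real text is outside the scope of this sketch.

```python
def flesch_reading_ease(asl: float, asw: float) -> float:
    # asl: average sentence length in words
    # asw: average number of syllables per word
    return 206.835 - (1.015 * asl) - (84.6 * asw)


# 206.835 - 15.225 - 126.9 = 64.71, inside the 60 to 70 band
score = flesch_reading_ease(asl=15, asw=1.5)
```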
  • the FRES may, for example, be converted into a style signature rating in accordance with Table 2.
  • The style signature rating may be based on the Gunning Fog Index Readability Formula, simply called the Fog Index, of the book.
  • The Fog Index is calculated from the following formula: 0.4 × (ASL + PHW), where ASL is the average sentence length and PHW is the percentage of hard words.
  • A hard word is defined as one that has three or more syllables and that is not (i) a proper noun, (ii) a combination of easy words or hyphenated words, or (iii) a two-syllable verb made into three syllables with common suffixes such as -es, -ed and -ing.
  • The Fog Index is a rough measure of how many years of school it would take someone to understand the content. The lower the Fog Index, the more understandable the content is. The average person reads at level 9. The easy reading range is 6-10 and anything above 15 is getting difficult.
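  • A worked example of the Fog Index formula follows. The function name and the sample counts are hypothetical, and the hard-word count is taken as an input because applying the hard-word definition above requires syllable counting and part-of-speech information.

```python
def gunning_fog_index(total_words: int, total_sentences: int, hard_words: int) -> float:
    asl = total_words / total_sentences      # average sentence length
    phw = 100.0 * hard_words / total_words   # percentage of hard words
    return 0.4 * (asl + phw)


# 1000 words in 50 sentences with 100 hard words:
# ASL = 20, PHW = 10, so the index is 0.4 * 30 = 12
fog = gunning_fog_index(total_words=1000, total_sentences=50, hard_words=100)
```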
  • The "part of a series" criterion may be used to indicate whether the book is part of a series of books. It may be assigned a Boolean value indicating whether the book is part of a series. For example, the "part of a series" criterion may be set to "true", "yes" or "1" when the book is part of a series and "false", "no" or "0" when the book is not part of a series.
  • the "time period" criterion may be used to indicate the time period in which the book was written.
  • A book may be classified into one of the following time period categories based on when it was written: less than 1 year ago; between 1 and 5 years ago; between 5 and 10 years ago; between 10 and 20 years ago; between 20 and 50 years ago; more than 50 years ago and in the 20th century; 19th century; 18th century; etc.
  • the "gender” criterion may be used to indicate the target reader's gender.
  • The positive ratings (e.g. ratings above a predetermined threshold, such as a rating above three stars in a five star rating system) may be used to assess the gender target of the book. For example, if there are fewer than a threshold number of positive ratings, the book's gender target may be classified as unknown. If, however, there are at least a threshold number of positive ratings, then if more than a predetermined percentage (e.g. 70%) of the positive ratings are from a certain gender (e.g. males or females), the book's gender target may be classified as that gender (e.g. male or female); otherwise the book's gender target may be classified as unknown.
  • The threshold number of positive ratings will typically be between 3 and 5; however, it will be evident to a person of skill in the art that the threshold number may be fine-tuned and that any suitable threshold number may be used.
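  • The gender target rule can be sketched as follows. The function and parameter names are hypothetical; the defaults of 3 positive ratings and 70% are the example values given above.

```python
from collections import Counter


def gender_target(positive_rating_genders: list, min_ratings: int = 3,
                  min_share: float = 0.7) -> str:
    """Classify a book's gender target from the genders of positive raters."""
    total = len(positive_rating_genders)
    if total < min_ratings:
        return "unknown"  # too few positive ratings to decide
    gender, count = Counter(positive_rating_genders).most_common(1)[0]
    # The dominant gender must account for more than min_share of the ratings.
    return gender if count / total > min_share else "unknown"
```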
  • the "age” criterion may be used to indicate the target reader's age. Similar to the “gender” criterion, the positive ratings (e.g. ratings above a predetermined threshold, such as ratings above three stars in a five star rating system) may be used to assess the target reader's age.
  • the "age target” criterion may be an array, each element of the array representing a range of ages. The percentage of positive ratings by each age category would be used to fill the array.
  • The array may have 12 elements which represent the following age categories: less than 1 year, 1-5 years, 6-10 years, 11-15 years, 16-20 years, 21-25 years, 26-30 years, 31-35 years, 36-40 years, 41-45 years, 46-50 years, and over 50 years.
  • the user may be given the option to specify the age of the child on which the rating is based.
  • the "price" criterion may be used to indicate the price of the book.
  • A book may be classified into one of the following price categories based on the price of the book: free; less than $1.99; $2-$4.99; $5-$9.99; $10-$19.99; $20-$29.99; $30-$39.99; $40-$49.99; greater than $49.99.
  • this criterion is used to match the price of the book to the user's preference in relation to book price.
  • the "language” criterion may be used to indicate the language of the book.
  • the storage module 110 may only contain books in a single language. In these cases it may not be necessary to include the language criterion. In other cases, the storage module 110 may contain books in multiple languages. In these cases, the language criterion may be used to match the language of the book to the language preferences of the user.
  • The "user home location" criterion may be used to indicate the target reader's home location. Similar to the "gender target" and "age target" criteria, the positive ratings (e.g. ratings above a predetermined threshold, such as ratings above three stars in a five star rating system) may be used to assess the target reader's home location. For example, if there are fewer than a threshold number of positive ratings of the book, the book's "user home location" may be classified as unknown; if, however, there are at least a threshold number of positive ratings of the book, then if more than a predetermined percentage of the positive ratings are from users in a particular city or town, the book's "user home location" may be classified as that city or town; otherwise the "user home location" may be classified as unknown. While the user's city or town has been used as the user's home location, it will be evident to a person of skill in the art that other suitable location metrics, such as country, county or region, could also be used as the user's home location. Since the user home location criterion is one of the weaker criteria in terms of matching user preferences to books, the threshold number of positive ratings may be higher than for other, stronger criteria. For example, the threshold number of positive ratings may be around 10; however, it will be evident to a person of skill in the art that the threshold value may be fine-tuned over time and that other suitable threshold values may be used.
  • the rating profile of a book typically comprises information about how the book has generally been rated.
  • a book may be assigned one of the following rating profiles based on the ratings of the book: positive, negative, average, or polar.
  • Positive, negative and average ratings may be defined as ratings that fall within certain predefined ranges. For example, in a five star rating system, positive ratings may be ratings of 4 or 5 stars, negative ratings may be ratings of 0 or 1 star, and average ratings may be ratings of 2 or 3 stars.
  • a predetermined percentage e.g. 60%
  • a book may be assigned a polar rating profile if at least a predefined percentage (e.g. 40%) of the ratings are positive (e.g. ratings of 4 or 5 stars) and at least a predefined percentage (e.g. 40%) of the ratings are negative (e.g. ratings of 0 or 1 stars).
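  • The polar rating profile test can be sketched as below, using the example values from the text (4 or 5 stars positive, 0 or 1 star negative, 40% thresholds). The function name is hypothetical, and returning False for a book with no ratings is an assumption.

```python
def is_polar(star_ratings: list, min_share: float = 0.4) -> bool:
    """Return True when ratings split strongly between positive and negative."""
    if not star_ratings:
        return False  # assumption: an unrated book has no polar profile
    total = len(star_ratings)
    positive = sum(1 for r in star_ratings if r >= 4)  # 4 or 5 stars
    negative = sum(1 for r in star_ratings if r <= 1)  # 0 or 1 star
    return positive / total >= min_share and negative / total >= min_share
```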
  • the rating profile of a book will be separate from the general profile of the book. In other cases the rating profile of a book will form part of the general profile of the book.
  • the rating profile may be a criterion of the general profile of the book.
  • The general profile of a user typically comprises one or more criteria that characterize the user and the user's book preferences.
  • The criteria used in the general profile of the user correspond to the criteria used in the general profile of a book.
  • The criteria used in the general profile of the user may include, but are not limited to, one or more of the following: genre, size, style signature, part of a series, time period, gender, age, price, language, user home location and rating profile. It will be evident to a person of skill in the art that other suitable criteria may be used alternatively to the listed criteria or in addition to the listed criteria.
  • the "genre" criterion may be used to indicate the genre(s) of books preferred by the user.
  • the genre preferences of a user may be based on the genres of the books purchased and/or positively rated (e.g. given a rating of 4 or 5 stars in a five star rating system) by the user. For example, there may be a fixed number of genres and each genre may be assigned a value in a predetermined range based on the number of books purchased and/or positively rated by the user of that genre. For example, each genre may be assigned a value from 1 to 10 representing the number of books purchased and/or positively rated by the user of that genre, where a value of 10 may mean that at least 10 books of this genre have been purchased and/or rated by the user.
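  • A minimal sketch of the capped genre count described above. The function name and genre labels are hypothetical, and genres with no purchased or positively rated books are simply omitted here, which is an assumption.

```python
from collections import Counter


def genre_preferences(genres_of_liked_books: list, max_value: int = 10) -> dict:
    """Map each genre to the number of liked books, capped at max_value.

    `genres_of_liked_books` holds one genre label per book purchased
    and/or positively rated by the user.
    """
    counts = Counter(genres_of_liked_books)
    return {genre: min(count, max_value) for genre, count in counts.items()}
```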
  • the term "books purchased and/or positively rated by the user” is used herein to mean the books purchased by the user, the books positively rated by the user, or both the books purchased by the user and the books positively rated by the user.
  • the "size” criterion may be used to indicate the book length(s) preferred by the user.
  • the book length(s) preferred by the user may be assessed based on the size rating of the books purchased and/or positively rated by the user. For example, as described above, each book may be given a size rating from 1 to 10 based on the number of words in the book. Typically a size rating of "1" would indicate that the book is small (e.g. has a small number of words) and a size rating of "10" would indicate that the book is large (e.g. has a large number of words).
  • The user size criterion may be an array wherein each size rating is assigned a user preference value between 0 and 10 based on the size rating of the books purchased and/or positively rated by the user. In some cases, a particular size rating will only be assigned a user preference value after the user has positively and/or negatively rated a predetermined number of books with that size rating.
  • Table 3 illustrates an exemplary method for calculating the user "size" criterion.
  • the size ratings 1, 2 and 3 are not assigned a user preference value since the total number of books positively or negatively rated by the user is below the predetermined threshold (e.g. 5).
  • each size rating is assigned a value representative of the number of books positively rated up to a maximum value of 10.
  • the "style signature" criterion may be used to indicate the book complexity preferred by the user.
  • the book complexity preferred by the user may be based on the style signature rating of the books purchased and/or positively rated (e.g. given a rating of 4 or 5 stars in a five star rating system) by the user.
  • each book may be given a style signature rating from 1 to 10 based on the percentage of long sentences, where a long sentence is one with more than a predefined number of words.
  • The style signature criterion may be an array wherein each style signature rating is assigned a user preference value between 0 and 10 based on the style signature rating of the books purchased and/or positively rated by the user. In some cases, a particular style signature rating may only be assigned a user preference value after a predetermined number of books with that particular style signature have been positively or negatively rated by the user.
  • Table 4 illustrates an exemplary method for calculating the user's style signature preferences.
  • the style signature ratings 1, 2, 3 and 10 are not assigned a user preference value since the total number of books positively or negatively rated by the user is below the predetermined threshold (e.g. 5).
  • for style signature ratings 8 and 9, more than 75% of the rated books with these style signature ratings have been negatively rated, thus style signature is likely to be negatively impacting the user's appreciation of the books with those style signature ratings. Accordingly, these style signature ratings have been assigned a user preference value of 0.
  • for style signature ratings 4, 5, 6 and 7, more than the threshold number of books with these style signature ratings have been positively or negatively rated, but less than 75% of the rated books in each category have been negatively rated, thus the style signature of the books does not appear to impact the user's appreciation of the book. Accordingly, each of these style signature ratings is assigned a value representative of the number of books positively rated by the user up to a maximum value of 10.
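The rules described for the style signature array (no value below the threshold, a value of 0 when more than 75% of the rated books are negative, otherwise the positive count capped at 10) can be sketched as follows. This is a minimal illustration only: the function name `preference_array`, its parameters and the input layout are assumptions, not part of the described system.

```python
from collections import Counter


def preference_array(rated_books, min_rated=5, negative_share=0.75, max_value=10):
    """Build a user preference value for each rating bucket 1..10.

    rated_books is an iterable of (bucket, is_positive) pairs, one per book
    the user has rated. The input layout is an assumption for illustration.
    """
    positive = Counter()
    negative = Counter()
    for bucket, is_positive in rated_books:
        if is_positive:
            positive[bucket] += 1
        else:
            negative[bucket] += 1

    prefs = {}
    for bucket in range(1, 11):
        total = positive[bucket] + negative[bucket]
        if total < min_rated:
            # Too few rated books: no preference value is assigned.
            prefs[bucket] = None
        elif negative[bucket] / total > negative_share:
            # Mostly negative ratings: preference value of 0.
            prefs[bucket] = 0
        else:
            # Number of positively rated books, capped at max_value.
            prefs[bucket] = min(positive[bucket], max_value)
    return prefs
```

The same function could plausibly serve the size criterion as well, since the text describes that array in the same terms.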
  • the "part of a series" criterion may be used to indicate whether the user prefers or enjoys books that are part of a series. Whether or not the user prefers or enjoys books that are part of series may be based on whether the books purchased and/or positively rated by a user are part of a series. For example, the user may be assigned a "part of a series" value from 0 to 10 that is representative of the percentage of books purchased and/or positively rated by a user that are part of a series. For example, a user may be assigned a "part of a series" value of 7 if 70% of the books purchased and/or positively rated by the user are part of a series. In some cases, the user may only be assigned a part of a series value after a predetermined number of books have been rated.
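The "part of a series" value can be sketched as a scaled percentage. The function name `series_value` and the `min_books` default are hypothetical; the text only says that a predetermined number of books may be required.

```python
def series_value(is_series_flags, min_books=5):
    """Return a 0..10 value for the share of liked books that belong to a series.

    is_series_flags holds one boolean per purchased and/or positively rated book.
    min_books is an assumed stand-in for the "predetermined number" in the text.
    """
    flags = list(is_series_flags)
    if len(flags) < min_books:
        return None  # not enough books yet to assign a value
    return round(10 * sum(flags) / len(flags))
```

With seven series books out of ten, the function returns 7, matching the 70% example above.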
  • the "time period" criterion may be used to indicate whether the user prefers or enjoys books from particular time periods.
  • the time period(s) preferred by the user may be based on the time period of the books purchased and/or rated positively by the user.
  • a book may be classified into one of the following time periods based on when it was written: less than 1 year ago; between 1 and 5 years ago; between 5 and 10 years ago; between 10 and 20 years ago; between 20 and 50 years ago; more than 50 years ago and in the 20 th century; 19 th century; 18 th century; etc.
  • each time period may be assigned a user preference value from 0 to 10 representing the number of books of that time period that were purchased and/or rated positively by the user.
  • a user preference value of 10 may represent, for example, that the user has purchased and/or rated positively at least 10 books of that time period. In some cases, a particular time period may only be assigned a user preference value after a predetermined number of books in that time period have been purchased and/or positively rated by the user.
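A capped count per category, as described for the time period criterion, might look like the sketch below. The name `capped_counts`, the `min_books` parameter and the period labels are illustrative assumptions.

```python
from collections import Counter


def capped_counts(labels, min_books=1, max_value=10):
    """Count liked books per category, capping each count at max_value.

    Categories with fewer than min_books liked books get no value at all.
    """
    counts = Counter(labels)
    return {
        label: min(count, max_value)
        for label, count in counts.items()
        if count >= min_books
    }
```

Because the input is just a list of category labels, the same helper could be applied to any criterion that is described as a capped count of liked books per category.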
  • the "gender" criterion may be used to indicate the user's gender.
  • the gender criterion may be assigned a value of male, female or unknown.
  • the gender information may be provided to the system 100 by the service module 102.
  • the "age" criterion may be used to indicate the user's age.
  • the age criterion may specify the user's age in years, for example, or alternatively it may specify the age range that the user falls into.
  • the age criterion may specify which of the following age ranges the user falls into: less than 1 year, 2-5 years, 6-10 years, 11-15 years, 16-20 years, 21-25 years, 26-30 years, 31-35 years, 36-40 years, 41-45 years, 46-50 years, and over 50 years.
  • the age information may be provided to the system 100 by the service module 102.
  • the "price" criterion may be used to indicate the user's price preferences. As noted above, many users will only be interested in books in certain price ranges. As described above, a book may be classified into one of the following price ranges based on the price of the book: free; less than $1.99; $2-$4.99; $5-$9.99; $10-$19.99; $20-$29.99; $30-39.99; $40-49.99; greater than $49.99.
  • each price range may be assigned a user preference value from 0 to 10 representing the number of books in that price range that were purchased and/or rated positively by the user. A user preference value of 10 may represent, for example, that the user has purchased and/or rated positively at least 10 books in that price range.
  • a particular price range may only be given a user preference value after a user has purchased and/or positively rated a predetermined number of books in the particular price range.
  • the "language" criterion may be used to indicate the user's language preferences.
  • the storage module 110 may only contain books in a single language. In these cases it may not be helpful to include the language criterion. In other cases, the storage module 110 may contain books in multiple languages. In these cases, the language criterion may be used to match the language preferences of the user to the language of the recommended books. The user's language preferences may be based on user information received from the service module 102.
  • the user's language preferences may be based on the language of the books positively rated and/or purchased by the user. For example, a particular language may be automatically added to the user's language preferences once the user has positively rated and/or purchased a threshold number (e.g. 5) books in the particular language. Where a user's language preferences include more than one language the list of recommended books may include books in these more than one languages.
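The automatic addition of a language once a threshold number of books has been liked can be sketched as below. The function name and the input format (one language code per liked book) are assumptions.

```python
from collections import Counter


def language_preferences(book_languages, threshold=5):
    """Return the set of languages with at least `threshold` liked books.

    book_languages holds one language label per positively rated
    and/or purchased book.
    """
    counts = Counter(book_languages)
    return {language for language, count in counts.items() if count >= threshold}
```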
  • the "user home location" criterion may be used to indicate the home location of the user.
  • the "user home location" may specify one or more of the following: the town or city, county or region, or country of residence of the user.
  • the user home location information may be provided to the system 100 by the service module 102.
  • the "rating profile" criterion may be used to indicate the book rating profile preferred by the user.
  • the rating profile criterion may be assessed from the rating profiles of the books that the user has purchased and/or rated positively (e.g. rated 4 or 5 stars in a five star rating system).
  • the rating profile criterion may help identify whether the user can appreciate books that are controversial (e.g. book that have a polar rating profile), and whether the user can appreciate books that are not appreciated by a majority of other users (e.g. books that have a negative or average rating profile).
  • a book may be assigned one of the following rating profiles based on the ratings of the book: positive, negative, average and neutral.
  • each rating profile may be assigned a user preference value from 0 to 10 representing the number of books with that rating profile that were purchased and/or rated positively by the user.
  • a user preference value of 10 may represent, for example, that the user has purchased and/or rated positively at least 10 books with that rating profile.
  • a particular rating profile may only be given a user preference value after a user has purchased and/or positively rated a predetermined number of books with that rating profile.
  • many of the user criteria may be based on a user's positive ratings of books. For example, it has been described above that many criteria may be assigned a user preference value from, for example, 0 to 10 based on the number of books positively rated by the user. It will be evident to a person of skill in the art that the negative and/or average ratings by the user may also be used to assess the user's preferences. Specifically, the negative ratings by the user may be used to identify an aversion or dislike of a particular characteristic. For example, if a predetermined percentage of the user's ratings are negative for a particular criterion then a negative user preference rating may be assigned to indicate that the user has an aversion to or dislike of books with a particular characteristic.
  • the navigation module 106 compares the general profile for the user against the general profile for each recommended book to determine the similarity between the user's characteristics and preferences and each book's characteristics.
  • the similarity between a user's characteristics and preferences to a book's characteristics may be represented by a distance value. Typically, the smaller the distance value, the more closely the book matches the user's characteristics and preferences.
  • the distance between the user's characteristics and preferences and a book's characteristics may, for example, be calculated by determining a proximity value for each valid criterion in the user's profile and then summing the individual proximity values.
  • a valid criterion in the user's profile is a criterion that has a non-null value. For example, since some of the criteria (e.g. genre, size, style signature, time period, rating profile) may only be assigned values after the user has positively rated a predetermined number of books, and some of the criteria (e.g. gender, age, home location) may only be assigned values if the relevant information is available, there may be one or more criterion in the user's profile that is not assigned a value (e.g.
  • Each proximity value is representative of how well a particular user criterion matches the corresponding book criterion.
  • the age proximity value indicates how well the user's age matches the target age for the book.
  • the proximity value is calculated by comparing the user characteristic against the corresponding book criterion and assessing how well they match.
  • the age proximity value may be based on whether the user's age falls into the target age range of the book and, if not, how far away it is.
  • the proximity value is calculated from the user preference value associated with the corresponding book criterion.
  • a user's style signature criterion comprises an array of user preference values and each user preference value represents the user's preference for a particular style signature.
  • the style signature proximity value for a particular book is based on the user preference value corresponding to the style signature of the book (e.g. 10 for style signature 4 (Book 1), and 0 for style signature 9 (Book 2)).
  • the higher the user preference value the more the user prefers books with that trait.
  • a user preference value of 10 for style signature 4 indicates the user prefers or likes books with a style signature of 4. It will be evident to a person of skill in the art that this is an exemplary method of calculating the proximity values and other suitable methods may be used to calculate the proximity values.
  • each proximity value may be calculated by determining the difference between a book criterion (e.g. style signature) and the user's preference regarding that criterion (e.g. the user's preferred style signature).
  • one or more criteria may be given different weights in determining the distance.
  • the style signature criterion may be given more weight than the user home location criterion.
  • each similarity value may be multiplied by a weighting factor before being summed.
  • the weighting factors may be predetermined or they may be dynamically calculated or updated by the learning module 108.
  • the closer a book's characteristics match a user's preferences the lower the distance value. This is illustrated in Table 5 where the user's preferences more closely match Book 1's characteristics than Book 2's characteristics, thus Book 1 is assigned a lower distance value (0.03) and Book 2 is assigned a higher distance value (0.10).
  • the user preference values corresponding to the characteristics of book 1 are higher than the user preference values corresponding to the characteristics of book 2, thus book 1 more closely matches the user's preferences.
  • the books are sorted or ordered according to the distance value. For example, where a lower distance value indicates a better match to the user's characteristics and preferences, the recommended books may be placed in ascending order based on the distance values. The objective is to present the books that most closely match the user's characteristics and preferences at the beginning or top of the list.
  • the navigation module 106 may also remove a number of books from the bottom of the list to eliminate books that are less likely to appeal to the user. In some cases, the navigation module 106 may eliminate a predefined number of books from the bottom of the sorted or ordered list. For example, the navigation module 106 may eliminate the last five books on the sorted or ordered list.
  • the navigation module 106 may eliminate all books with a distance value above a certain threshold. For example, where the distance values range from 0 to 1 the navigation module 106 may eliminate all books with a distance value above 0.8.
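The sorting and pruning steps can be sketched as follows. The function name `rank_and_trim` and its parameters are hypothetical; both pruning options from the text (dropping a fixed number of books, or dropping books above a distance threshold) are shown as optional arguments.

```python
def rank_and_trim(distances, drop_last=0, max_distance=None):
    """Order books by ascending distance and prune the weakest matches.

    distances maps a book identifier to its distance value, where a
    lower distance means a closer match to the user.
    """
    ranked = sorted(distances.items(), key=lambda pair: pair[1])
    if max_distance is not None:
        # Eliminate every book whose distance is above the threshold.
        ranked = [(book, d) for book, d in ranked if d <= max_distance]
    if drop_last:
        # Eliminate a fixed number of books from the bottom of the list.
        ranked = ranked[:-drop_last]
    return [book for book, _ in ranked]
```

For example, with distances of 0.03, 0.10 and 0.90 and `max_distance=0.8`, only the first two books remain, in that order.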
  • the service module 102 provides the sorted list of recommended books to the user. The service module 102 may then allow the user to purchase and/or rate the recommended books. The same book recommendations cannot typically be made for all user groups and all individual users.
  • the service module 102 may only provide book services to users in a specific country. For example, one service module 102 may provide an eBook service to Canadians where another service module 102 may provide an eBook service to Australians. It may be that Canadians have statistically different book preferences than Australians, thus it may be beneficial to tailor the system to the specific service module 102 and thus the group of users of the service module 102. Similarly each user is unique and thus it is beneficial to tailor the list of recommended books to the specific user as much as possible. Accordingly, the learning module 108 is designed to tweak the system 100 for the specific service module 102 and specific user.
  • the learning module 108 may be configured to select what combination of proximity networks should be used by the proximity module 104 to provide the best book recommendations for the specific service module 102 (and thus for the group of users of the service module 102).
  • the learning module 108 may also be configured to select the criteria weighting factors used by the navigation module 106 to provide the best book recommendations for the specific user.
  • the learning module 108 may select which combination of proximity networks should be used by the proximity module 104 by testing a plurality of pre-defined service profiles. Each service profile assigns a proportion value to each proximity network. Typically the proportion values add up to 1 or 100%.
  • a particular service profile may assign the content proximity network a proportion value of 0.5, the rating proximity network a proportion value of 0.3, and the author proximity network a proportion value of 0.2. This would mean that when this service profile is used, 50% of the recommended books would be selected from the content proximity network, 30% of the recommended books would be selected from the rating proximity network, and 20% of the recommended books would be selected from the author proximity network.
  • a proximity network may be assigned a proportion value of zero, which means that the particular proximity network is not to be used in generating the list of recommended books.
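One way to turn the proportion values of a service profile into a number of books per proximity network is sketched below. The largest-remainder rounding is an assumption added so that the slots always sum to the requested total; the text does not specify how fractional shares are handled.

```python
def allocate_slots(profile, total):
    """Split `total` recommendation slots across proximity networks.

    profile maps a network name to its proportion value; the values are
    assumed to add up to 1.
    """
    raw = {name: share * total for name, share in profile.items()}
    slots = {name: int(value) for name, value in raw.items()}
    leftover = total - sum(slots.values())
    # Hand any remaining slots to the networks with the largest fractions.
    by_remainder = sorted(raw, key=lambda name: raw[name] - slots[name], reverse=True)
    for name in by_remainder[:leftover]:
        slots[name] += 1
    return slots
```

With the example profile (0.5, 0.3, 0.2) and ten recommendations, this gives five, three and two books respectively, and a network with a proportion of zero receives no slots.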
  • the learning module 108 tests or evaluates each of the pre-defined service profiles and provides the service profile with the best performance to the proximity module 104 to be used in generating the list of recommended books.
  • Testing or evaluating the pre-defined service profiles may be done in an intrusive manner or non-intrusive manner.
  • Intrusive methods of testing a service profile are those where the users are aware that the testing is being performed.
  • an intrusive method of testing may comprise asking the users of the service module 102 to rate the relevance of the books recommended using the particular service profile.
  • non-intrusive testing methods are those where the users are unaware that the testing is being performed.
  • a non-intrusive method of testing a service profile may comprise monitoring the number of clicks on the recommended books (e.g. indicating interest in the books) and/or the number of recommended books that are ultimately purchased. Since the performance of the service profiles may change as more data (e.g. ratings and
  • the learning module 108 may be configured to automatically perform the service profile selection on a periodic basis. For example, the learning module 108 may be configured to test the pre-defined service profiles once a week. However, it will be evident to a person of skill in the art that other suitable time frames for testing the pre-defined service profiles may be used.
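A non-intrusive profile selection could be sketched as picking the profile with the best observed engagement. The scoring rule used here, clicks plus purchases per recommendation shown, is purely an assumption; the text only says that clicks and/or purchases may be monitored.

```python
def best_profile(stats):
    """Return the name of the service profile with the highest engagement.

    stats maps a profile name to (recommendations_shown, clicks, purchases).
    The score is a hypothetical engagement rate, not a method from the text.
    """
    def score(item):
        shown, clicks, purchases = item[1]
        return (clicks + purchases) / shown if shown else 0.0

    return max(stats.items(), key=score)[0]
```

Running such a selection on a schedule (for example weekly) would correspond to the periodic testing described above.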
  • the learning module 108 may also select the criteria weighting factors used by the navigation module 106 to provide the best book recommendations for the specific user.
  • the weighting factors used by the navigation module 106 may be selected in a similar manner to how the combination of proximity networks is selected. For example, the learning module may test a plurality of pre-defined weighting factor profiles and select the one with the best performance.
  • the testing of the pre-defined weighting factor profiles may be performed in an intrusive or non-intrusive manner.
  • Each pre-defined weighting factor profile assigns a weighting factor to each criterion in the user profile.
  • the weighting factors add up to 1 or 100%, but it will be evident to a person of skill in the art that the method would still operate if the weighting factors do not add up to 1 or 100%.
  • a particular weighting factor profile may assign the weighting factors shown in Table 6.
  • the navigation module 106 may use the weighting factors to determine the distance between a recommended book and the user's characteristics and preferences. For example, the navigation module 106 may generate the distance by determining a proximity value for each criterion, multiplying each proximity value by the associated weighting factor, and summing the weighted proximity values.
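The weighted distance can be sketched as a weighted sum over the valid criteria. This sketch assumes each proximity value has already been expressed so that a lower number means a closer match; how a user preference value is converted into such a proximity value is not fixed by the text. The names used are illustrative.

```python
def weighted_distance(proximity, weights):
    """Sum the weighted proximity values of all valid criteria.

    proximity maps a criterion name to its proximity value, or to None
    when the criterion has no value in the user's profile.
    weights maps a criterion name to its weighting factor.
    """
    return sum(
        weights.get(name, 0.0) * value
        for name, value in proximity.items()
        if value is not None  # skip criteria that are not valid
    )
```

Criteria without a value simply contribute nothing to the sum, which mirrors the idea of only using valid criteria.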
  • the learning module 108 may be configured to automatically perform the weighting factor profile selection on a periodic basis. For example, the learning module 108 may be configured to test the pre-defined weighting factor profiles once a week. However, it will be evident to a person of skill in the art that other suitable time frames for testing the pre-defined weighting factor profiles may be used.
  • the storage module 110 is configured to generate and store the information used to generate the list of recommended books and may comprise or be in the form of one or more databases.
  • the storage module 110 may be configured to store the following: a plurality of books in electronic form and the ratings of the books.
  • the storage module 110 may also be configured to generate and store the general profile for each book, the general profile for each user, and the one or more proximity networks as described above.
  • the storage module 110 may be configured to generate a content proximity network in accordance with the method described in reference to Figure 4, a rating proximity network in accordance with the method described in reference to Figure 5, and/or an author proximity network in accordance with the method described in reference to Figure 6.
  • the storage module 110 may be configured to regenerate or update the general profiles of the books, the general profiles of the users, and the one or more proximity networks on a periodic basis. For example, the storage module 110 may be configured to re-generate or update the profiles and proximity networks once a week.
  • the system 100 may also include an interface module (not shown) that is configured to generate a user interface for displaying the ordered list of recommended books. The user interface generated by the interface module is provided to the service module 102 where it is displayed or presented to the user. An exemplary user interface 300 is shown in Figure 3.
  • the interface module may be configured to use images of the covers of the recommended books to present the recommended books to the user. Different image sizes may be used to represent the similarity of the book to the user's characteristics and preferences. For example, as shown in Figure 3, the books that are most similar to the user's characteristics and preferences (e.g. are at the top of the sorted list) may be represented by larger book cover images 302, 304, 306 and 308, whereas, the books that are less similar to the user's characteristics and preferences (e.g. are lower down the sorted list) may be represented by smaller book cover images 310, 312, 314 and 316.
  • the user interface may also be divided into sections where each section groups together the books located from a particular proximity network or from a particular combination of proximity networks.
  • the user interface 300 may be divided into a first row 318 and a second row 320.
  • the first row 318 may be used to display cover images of the books that were located from a rating proximity network
  • the second row 320 may be used to display cover images of the books that were located from a content proximity network.
  • the user interface may be configured to allow the user to access the additional recommended items by, for example, scrolling in one or more directions.
  • the user may be able to access additional recommended items by scrolling to the left or right, or up and down.
  • scrolling along different axes may provide additional books from different proximity networks or different combinations of proximity networks. For example, scrolling along one axis (e.g. left or right) may provide the user with additional books located by one proximity network (e.g. a content proximity network) whereas scrolling along a different axis (e.g. up or down) may provide the user with additional books located by another proximity network (e.g. a rating proximity network).
  • scrolling along one axis may provide the user with additional books located by a first combination of proximity networks (e.g. 100% content proximity network) whereas scrolling along a different axis (e.g. up or down) may provide the user with additional books located by another combination of proximity networks (e.g. 80% rating proximity network and 20% author proximity network).
  • Each of the proximity module 104, the navigation module 106, the learning module 108, the storage module 110, and the interface module may be implemented in hardware or software.
  • each of the proximity module 104, the navigation module 106, the learning module 108, the storage module 110 and the interface module may be implemented by one or more computers, each computer comprising one or more processors.
  • each of the proximity module 104, the navigation module 106, the learning module 108, the storage module 110 and the interface module may be implemented by instructions that are stored on a computer readable medium and that, when executed by a computer, perform the functions described above.
  • each book in the database is analysed to generate a list of words used in the book and a count of how many times each word is used.
  • the lists generated at step 402 are filtered to eliminate words that do not provide any useful information as to the subject or content of the book.
  • the lists may be filtered to eliminate all semantically empty words, such as articles and adverbs, from the lists. This typically leaves only the words that semantically have meaning.
  • step 404 may be performed as part of step 402. Specifically, the semantically empty words may not be included (or ignored) in generating the lists in step 402.
  • the filtered lists generated at step 404 are compared and when two books have a predetermined number of words in common a link is created between the books (nodes).
  • the predetermined number is typically selected to produce a proximity network with a useful and manageable number of links. Specifically, if the predefined number is too small, there will be too many links between books that are only loosely similar. However, if the predefined number is too large, a number of useful links will not be created. Preferably the predefined number is selected so as to limit the number of links to any book to 500.
  • the weight of each link created at step 406 is determined from the number of words in common. For example, the weight may be the sum of the counts of the words in common.
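The content proximity network steps (word counting, filtering, linking, weighting) can be sketched as below. The stop-word list, the tokenisation and the `min_common` value are illustrative assumptions, and the weight is computed as the sum of both books' counts for the shared words, which is one possible reading of "the sum of the counts of the words in common".

```python
import re
from collections import Counter
from itertools import combinations

# Tiny illustrative stop-word list; a real filter would be far larger.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "very"}


def word_counts(text):
    """Count the words of a book, skipping semantically empty words."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(word for word in words if word not in STOP_WORDS)


def content_network(books, min_common=2):
    """Link every pair of books sharing at least min_common words.

    books maps a title to its text. Returns {(title_a, title_b): weight}.
    """
    counts = {title: word_counts(text) for title, text in books.items()}
    links = {}
    for a, b in combinations(sorted(counts), 2):
        common = counts[a].keys() & counts[b].keys()
        if len(common) >= min_common:
            links[(a, b)] = sum(counts[a][w] + counts[b][w] for w in common)
    return links
```

Raising `min_common` reduces the number of links, which is the lever the text describes for keeping the network manageable.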
  • FIG. 5 illustrates an exemplary method 500 for generating a rating proximity network.
  • the rating is stored in the storage module 110.
  • the rating may be based on a standard five star rating system where the users rate the books on a scale of 0 to 5 stars, where 0 stars typically means that the user did not like the book at all and 5 stars typically means that the user really liked or even loved the book.
  • the ratings are analysed to identify pairs of books that have been positively rated by the same user.
  • a positive rating may be considered to be a rating above a certain threshold.
  • a positive rating may be a rating above three stars (e.g. a rating of 4 or 5 stars).
  • a link is created between the pairs of books identified in step 504.
  • the weight of each link created at step 506 is determined from the positive user ratings. In some cases the similarity of two ratings may be more important than the actual rating. Specifically, two identical ratings may be given a higher weight, than two higher, but different ratings. For example, the weight of a link may be determined in accordance with Table 7. Specifically, a weight of 5 may be assigned when a user gives both books a rating of 5; a weight of 3 may be assigned when the user gives one book a rating of 4 and the other book a rating of 5; and a weight of 4 may be assigned when the user gives both books a rating of 4.
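The rating proximity network can be sketched using the three weights quoted from Table 7. Summing the weights when several users like the same pair of books is an assumption borrowed from the author proximity network description; whole-star ratings are also assumed.

```python
from collections import defaultdict
from itertools import combinations

# Weights quoted in the text for pairs of positive ratings.
PAIR_WEIGHT = {(5, 5): 5, (4, 4): 4, (4, 5): 3}


def rating_network(ratings, positive_above=3):
    """Link pairs of books that the same user rated positively.

    ratings maps a user to {book: stars}, with whole-star ratings.
    Returns {(book_a, book_b): weight}.
    """
    links = defaultdict(int)
    for user_ratings in ratings.values():
        liked = sorted(
            (book, stars)
            for book, stars in user_ratings.items()
            if stars > positive_above
        )
        for (book_a, stars_a), (book_b, stars_b) in combinations(liked, 2):
            key = tuple(sorted((stars_a, stars_b)))
            links[(book_a, book_b)] += PAIR_WEIGHT[key]
    return dict(links)
```

Note how two identical ratings of 4 produce a higher weight than a 4 paired with a 5, reflecting the point that similarity of ratings may matter more than the ratings themselves.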
  • FIG. 6 illustrates an exemplary method 600 for generating an author proximity network in accordance with an embodiment.
  • an author proximity network links authors (e.g. each node is an author).
  • the rating is stored in the storage module 110.
  • the rating may be based on a standard five star rating system where the users rate the books and/or the authors on a scale of 0 to 5 stars, where 0 stars typically means that the user did not like the book/author at all and 5 stars typically means that the user really liked or even loved the book/author.
  • the ratings are analysed to identify pairs of authors that have been positively rated by at least two users or have multiple books positively rated by at least two users.
  • a positive rating may be considered to be a rating above a certain threshold. For example, a positive rating may be a rating above three stars (e.g. a rating of 4 or 5 stars).
  • a link is created between the pairs of authors identified in step 604.
  • the weight of each link created at step 606 is determined from the positive user ratings.
  • the weight of a link may be determined in accordance with Table 8. Specifically, a weight of 5 may be assigned when a user gives both authors a rating of 5; a weight of 3 may be assigned when the user gives one author a rating of 4 and the other author a rating of 5; and a weight of 4 may be assigned when the user gives both authors a rating of 4. Where more than one user has positively rated the same two authors then a weight is calculated for each user and the final weight is the sum of the individual weights.
  • virtual links are created between books written by the pairs of authors identified in step 606.
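The creation of virtual links between books of linked authors can be sketched as below. The text does not say what weight a virtual link carries, so this sketch assumes each virtual link simply inherits the weight of the author link; the function and parameter names are hypothetical.

```python
from itertools import product


def virtual_book_links(author_links, books_by_author):
    """Create a link between every book of one author and every book of the other.

    author_links maps (author_a, author_b) to the author link weight.
    books_by_author maps an author to a list of book identifiers.
    """
    links = {}
    for (author_a, author_b), weight in author_links.items():
        books_a = books_by_author.get(author_a, [])
        books_b = books_by_author.get(author_b, [])
        for book_a, book_b in product(books_a, books_b):
            # Assumption: the virtual link inherits the author link weight.
            links[(book_a, book_b)] = weight
    return links
```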
  • each proximity network defines relationships between similar items, wherein the similarity of two items is based on predefined criteria.
  • Each proximity network may comprise nodes and links connecting the nodes, where each node represents one or more items of the plurality of items.
  • Each link typically has an associated distance which represents the similarity between the nodes connected by the link. In some cases, a link is only created between a pair of nodes when the nodes have a minimum level of similarity according to the predefined criteria.
  • each proximity network uses different predefined criteria to determine the similarity between nodes.
  • the proximity networks generated at step 702 may include one or more of a content proximity network, a rating proximity network and an author proximity network.
  • a content proximity network determines the similarity of two books based on the content of the two books. For example, a content proximity network may determine the similarity of two books based on the number of words the two books have in common.
  • a rating proximity network determines the similarity of two books based on ratings of the books. For example, a rating proximity network may determine that two books are more similar if two users have rated the two books the same and less similar if two users have rated the two books differently.
  • An author proximity network determines the similarity of two authors based on the rating of the authors and/or books by the authors.
  • data identifying an item is received.
  • the data may also include information identifying a particular user.
  • this data is typically provided by a service module, such as service module 102, which is configured to provide item services, such as book services, to a plurality of users.
  • a list of recommended items is generated by selecting a plurality of items similar to the identified item using the one or more proximity networks. This may comprise selecting the nodes from the generated proximity network(s) that are connected to the node representing the identified item by a link and have a distance below a predetermined threshold. Where similar items are selected from a plurality of proximity networks the number of items selected from a particular proximity network may be based on a proportion value assigned to that particular proximity network.
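Selecting the linked nodes whose distance is below a threshold can be sketched as follows. The adjacency-dictionary layout and the names used are assumptions; the optional `limit` stands in for the number of items allotted to a network by its proportion value.

```python
def similar_items(network, item, max_distance, limit=None):
    """Return the items linked to `item` with a distance below max_distance.

    network maps an item to {neighbour: distance}. Results are ordered
    from closest to farthest and optionally cut to `limit` entries.
    """
    neighbours = [
        (other, distance)
        for other, distance in network.get(item, {}).items()
        if distance < max_distance
    ]
    neighbours.sort(key=lambda pair: pair[1])
    return [other for other, _ in neighbours[:limit]]
```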
  • the characteristics of each recommended item are compared to preferences of the identified user. The comparison may comprise comparing criteria characterizing each recommended item to corresponding criteria characterizing the preferences of the identified user. The comparison may involve determining a proximity value for each valid criterion characterizing the preferences of the user.
  • Each proximity value represents the similarity between the valid criterion and the corresponding criterion characterizing the recommended item.
  • Each proximity value may then be summed to generate a distance value for the book.
  • the distance value represents the similarity between the recommended items and the preferences of the user. In some cases each proximity value is multiplied by a weighting factor prior to summing.
  • the criteria may include, but are not limited to, genre, size, style signature, part of a series, time period, gender, age, price, language and user home location.
  • the recommended items are ordered or ranked based on the comparison to place the items that are most likely to appeal to the identified user at the top of the list. For example, where a distance value is calculated for each recommended book and a smaller distance indicates a closer match to the user's preferences, the books may be placed in ascending order of distance value.
  • a service profile is selected from a plurality of predefined service profiles to provide the best list of recommended items for the specific service that provided the data in step 706.
  • Each service profile assigns a proportion value to each proximity network.
  • selecting the appropriate service profile may comprise testing each of the predefined service profiles with a plurality of users and selecting the service profile with the best test performance. Testing may be performed in an intrusive or non-intrusive manner.
  • a weighting factor profile is selected from a plurality of predefined weighting factor profiles to provide the best list of recommended items for the identified user.
  • Each weighting factor profile assigns a weighting factor to each criterion.
  • selecting the appropriate weighting factor profile may comprise testing each weighting factor profile and selecting the weighting factor profile with the best test performance. Testing may be performed in an intrusive or non-intrusive manner. It will be evident to a person of skill in the art that the steps of method 700 may be performed in another order and not all the steps must be performed.

Abstract

A computer-implemented system to generate a list of recommended items from a plurality of items. The system comprises a storage module configured to generate and store a proximity network, the proximity network defining relationships between similar items of the plurality of items, wherein the similarity of two items is based on predefined criteria; and a proximity module configured to: receive data identifying an item of the plurality of items; generate the list of recommended items by selecting a plurality of items similar to the identified item using the proximity network; and output the list of recommended items.

Description

SYSTEM AND METHOD FOR GENERATING A LIST OF RECOMMENDED ITEMS
Field of the Invention
The present invention relates to systems and methods for generating a list of recommended items. More particularly, the present invention relates to systems and methods for generating a list of recommended items using one or more proximity networks. Each proximity network defines relationships between similar items, wherein the similarity of two items is based on predefined criteria specific to the proximity network.
Background
Reading books is a common pastime for many people. Typically after a person finishes a book they rely on recommendations to pick their next book or books. Traditionally recommendations were provided by a librarian, a bookstore salesperson, a best sellers list, book critics and/or friends.
However, technology is changing the way people select and interact with books. Specifically, people are buying more and more books online, without any human contact. In addition, more and more books are becoming available electronically. Once a book has been digitized it will be available virtually forever, which provides the general public access to a greater number of books. In particular, books that are out of print are typically difficult to locate; however, once such a book is digitized a copy can easily and cost-effectively be obtained.
These changes in technology have led the traditional recommendation methods to be less effective and in some cases totally ineffective. Accordingly, additional recommendation techniques such as search engines, collaborative filtering, and social navigation have been developed.
The search engine technique typically allows a user to locate books using keywords. The keywords may relate to the title of the book, the author, or genre, for example. One of the problems with the search engine method is that it allows only very limited searching. In addition, the user has to have some idea of what they are looking for to generate the keywords. Accordingly, the search engine method requires a significant effort from the user and does not typically provide an efficient means to locate books that match a user's tastes.
The collaborative filtering technique recommends books to users based on the interest of a community of users, without any analysis of book content. Some collaborative filtering methods are based on the purchase history of the users. For example, if a user purchases or is considering purchasing a book then the method may recommend other books purchased by users who also purchased that book. Other collaborative filtering methods are based on how well a user's profile matches other users that expressed similar opinions about a given book. For example, such collaborative filtering methods operate by enabling users to rate individual books. Through this process, each user builds a personal profile of ratings data. To generate recommendations for a particular user, the user's profile is compared to the profiles of other users to identify one or more similar users. Books that were rated highly by these similar users are then recommended to the user.
Collaborative filtering techniques are very popular because they are simple. However, collaborative filtering techniques suffer from a number of problems. One problem is that collaborative filtering techniques base their recommendation on how well the user matches the preference of other users. As a consequence, these techniques are less accurate if the user base is small. Another problem is that a book cannot be recommended until it is purchased or rated by other users. Accordingly, collaborative filtering techniques tend to recommend mostly best sellers. Another problem with collaborative filtering techniques is that they don't typically take any context into consideration. For example, a surgeon with small children may purchase books on surgery and children's books. Therefore other users that purchase or highly rate a particular surgical book may be recommended children's books. It is unlikely that the majority of people who buy surgical books will also like children's books and vice versa.
In social navigation techniques books are recommended to users based on books purchased or recommended by their "friends" in a social network such as Facebook. However, social navigation techniques suffer from a number of problems. First, "friends" don't necessarily share all interests and tastes. In other words, it does not follow that because two people play tennis at the same club they will like the same book. Second, recommendations made using social navigation techniques most often don't come at the moment when a user is seeking a recommendation. Third, recommendations provided through a social network often get lost amongst all the other information provided by the social network. Accordingly, social navigation techniques do not typically provide a very efficient means to provide book recommendations to users.
Accordingly, there is a need for an improved recommendation method and system that addresses one or more of the problems with the current recommendation systems.
Summary
In a first aspect there is provided a computer-implemented method to generate a list of recommended items from a plurality of items, the method comprising: generating a proximity network, the proximity network defining relationships between similar items of the plurality of items, wherein the similarity of two items is based on predefined criteria; receiving data identifying an item of the plurality of items; and generating the list of recommended items by selecting a plurality of items similar to the identified item using the proximity network.
In a second aspect there is provided a computer-implemented system to generate a list of recommended items from a plurality of items, the system comprising: a storage module configured to generate and store a proximity network, the proximity network defining relationships between similar items of the plurality of items, wherein the similarity of two items is based on predefined criteria; and a proximity module configured to: receive data identifying an item of the plurality of items; generate the list of recommended items by selecting a plurality of items similar to the identified item using the proximity network; and output the list of recommended items.
Brief Description of the Drawings
Embodiments of the present disclosure will now be described, by way of example only, with reference to the attached figures, wherein:
Figure 1 is a block diagram of a system for generating a list of recommended items;
Figure 2 is a depiction of an exemplary proximity network;
Figure 3 is a schematic of an interface for displaying the list of recommended items;
Figure 4 is a flow chart of a method for generating a content proximity network;
Figure 5 is a flow chart of a method for generating a rating proximity network;
Figure 6 is a flow chart of a method for generating an author proximity network; and
Figure 7 is a flow chart of a method for generating a list of recommended items.
Detailed Description
It will be appreciated that for simplicity and clarity of illustration, where considered appropriate, reference numerals may be repeated among the figures to indicate corresponding or analogous elements. In addition, numerous specific details are set forth in order to provide a thorough understanding of the examples described herein. However, it will be understood by those of ordinary skill in the art that the examples described herein may be practiced without these specific details. In other instances, well-known methods, procedures and components have not been described in detail so as not to obscure the examples described herein. Also, the description is not to be considered as limited to the scope of the examples described herein.
The methods and systems described herein are designed to generate a list of recommended items from a plurality of items using one or more proximity networks. Each proximity network defines relationships between similar items, wherein the similarity of two items is based on predefined criteria specific to the proximity network.
Reference is made to Figure 1 which illustrates a system 100 for generating a list of recommended items in accordance with an embodiment. For ease of explanation, system 100 is described as generating book recommendations; however, it will be evident to a person of skill in the art that the general principles described herein may be equally applied to provide recommendations of other items such as compact discs, video discs, video or music downloads etc.
A service module 102 provides a book service to one or more users. The term "book service" is intended to cover any hardware, software, or combination of hardware and software that allows a user to browse, obtain information about, or get a copy of books including an electronic ("eBook") website, an eBook reader book store (e.g. Uncuva, the Samsung E60/E65 eReader book store and Amazon's Kindle Book Store) and a website that discusses books.
The system 100 receives data from the service module 102 identifying a particular book and a particular user. The system 100 then uses the data received from the service module 102 and one or more proximity networks to generate a list of recommended books.
The system 100 comprises a proximity module 104, a navigation module 106, a learning module 108 and a storage module 110 which work together to generate the list of recommended books.
Specifically, the proximity module 104 generates a list of recommended books that are similar to the identified book from one or more proximity networks. The navigation module 106 orders or sorts the list of recommended books based on how closely the books match the user's characteristics and preferences. The learning module 108 adjusts the parameters used by the proximity module 104 and the navigation module 106 to generate the best list of recommended books for the particular service module 102 and for the particular user. The storage module 110 stores and generates the data used by the other three modules (i.e. the proximity module 104, the navigation module 106, and the learning module 108), such as the list of books and the proximity network(s). Each module of the system 100 will be described in detail below.
The proximity module 104 receives data identifying a particular book and generates a list of recommended books that are similar to the identified book using one or more proximity networks. The data identifying a particular book may be provided directly to the proximity module 104 by the service module 102. Alternatively, the service module 102 may provide the data identifying the book to the storage module 110 which in turn provides the data to the proximity module 104.
A proximity network defines relationships between similar books, wherein the similarity of two books is based on predefined criteria. Typically each proximity network uses different criteria to assess the similarity of books. An exemplary proximity network 200 is shown in Figure 2.
Each proximity network may comprise nodes and links connecting pairs of nodes. For example, the exemplary proximity network 200 of Figure 2 comprises six nodes 202, 204, 206, 208, 210 and 212 connected by a plurality of links 214, 216, 218, 220, 222, 224, 226, 228 and 230. Each node 202, 204, 206, 208, 210 and 212 represents one or more books stored in the storage module 110. In some proximity networks, each node is a book and thus each node represents a single book. In other proximity networks, each node is an author and thus each node represents one or more books (e.g. all the books by that particular author).
When two nodes are connected by a link their similarity is defined based on the predefined criteria. Typically two nodes are only connected by a link after it has been determined that they have a minimum level of similarity according to the predefined criteria. This limits the number of links in a proximity network to a manageable number. If such a limit is not imposed the proximity network can become significantly more complicated without providing any additional valuable information.
Preferably each link 214, 216, 218, 220, 222, 224, 226, 228 and 230 is assigned a distance value d1, d2, d3, d4, d5, d6, d7, d8 or d9 which represents the similarity of the two nodes (e.g. books or authors) connected by the link based on the predefined criteria. For example, the predefined criteria may be used to generate a weight that represents the similarity of two nodes. The weight may be an integer that cannot be zero. The distance may be calculated from the weight according to the equation: distance = round(MAXINT/weight), where MAXINT is the biggest integer value that can be used by the specific coding scheme and weight is the weight determined from the predefined criteria.
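The weight-to-distance conversion described above can be sketched as follows. This is a minimal illustration only: the function name `link_distance`, the choice of a signed 32-bit `MAXINT`, the restriction to positive weights and the round-half-up integer arithmetic are all assumptions, since the text specifies only distance = round(MAXINT/weight) with a non-zero integer weight.

```python
# Assumption: a signed 32-bit coding scheme; the source only says "biggest integer value".
MAXINT = 2**31 - 1


def link_distance(weight: int, maxint: int = MAXINT) -> int:
    """Convert a similarity weight into a link distance (hypothetical helper).

    A larger weight (more similar nodes) gives a smaller distance.
    Integer arithmetic is used so that the result never exceeds maxint.
    """
    if weight <= 0:
        # Assumption: weights are positive; the source only excludes zero.
        raise ValueError("weight must be a positive integer")
    # Round-half-up version of round(maxint / weight) without floating point.
    return (2 * maxint + weight) // (2 * weight)


if __name__ == "__main__":
    for w in (1, 2, 50, 1000):
        print(w, link_distance(w))
```

With this convention a weight of 1 maps to the largest possible distance, and the distance shrinks as the weight grows.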
Example proximity networks include a content proximity network, a rating proximity network and an author proximity network. However, it will be understood by those of skill in the art that other proximity networks may be used.
A content proximity network uses the content of the books to determine their similarity. For example, a content proximity network may be generated by counting the occurrence of each word in the books and comparing the high occurrence words. High occurrence words are those words that occur most frequently in a particular book. The high occurrence words may be the X most frequent words where X is an integer greater than 1. For example, the high occurrence words may be the 100 most frequently used words. The more high occurrence words two books have in common and the higher the count number is, the more similar the books are. Since books with common vocabulary are typically about the same subject, a content proximity network is particularly useful at identifying books on similar subjects (e.g. books on war, books on history, or recipe books). Typically, generic, or semantically empty, words which are not specific to the actual content of a book may be discounted or ignored by a content proximity network. Examples of such words may include "and", "if" and the like. An exemplary method for generating a content proximity network will be described in reference to Figure 4.
A rating proximity network uses the ratings of the books to determine their similarity. For example, a rating proximity network may be based on the premise that the more often two books have been rated positively by the same users, the more similar they are. Since a rating proximity network links books that have been similarly rated by multiple users, it is particularly effective at recommending well known books having a large readership (e.g. novels and children's books). The rating proximity network can be seen as operating similarly to a librarian that knows a set of people that like the same kind of books and then recommends to members of the group the books that one of the members has enjoyed. An exemplary method for generating a rating proximity network will be described in reference to Figure 5.
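The word-counting idea behind the content proximity network can be illustrated with a short sketch. The names `high_occurrence_words` and `content_weight`, the small stop-word set, and in particular the scoring rule (summing the smaller of the two counts for every shared high occurrence word) are assumptions made for illustration; the text above says only that more shared words and higher counts indicate greater similarity.

```python
import re
from collections import Counter

# Assumption: a tiny illustrative set of generic ("semantically empty") words.
STOP_WORDS = {"and", "if", "the", "a", "of", "to", "in"}


def high_occurrence_words(text: str, x: int = 100) -> dict:
    """Return the x most frequent non-generic words of a book with their counts."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return dict(counts.most_common(x))


def content_weight(text_a: str, text_b: str, x: int = 100) -> int:
    """Hypothetical similarity weight: shared high occurrence words, weighted by count."""
    a = high_occurrence_words(text_a, x)
    b = high_occurrence_words(text_b, x)
    return sum(min(a[w], b[w]) for w in a.keys() & b.keys())


if __name__ == "__main__":
    book_a = "war war war peace and the army"
    book_b = "war war army army navy if"
    print(content_weight(book_a, book_b))
```

Two books with no high occurrence words in common receive a weight of zero under this rule and would therefore not be linked.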
An author proximity network uses ratings of authors and/or books to determine the similarity of authors. For example, when author ratings are available, a link may be established between two authors if the authors have been positively rated by at least two users. Where author ratings are not available for a pair of authors, a link may be established between the authors if multiple books by the authors have been rated positively by at least two users. The author proximity network is based on the premise that the more two authors (or books by those authors) are rated positively by the same users, the more similar the authors are. An author proximity network is designed to recommend books that relate to different subject matter than previously explored by the user and not necessarily well known (e.g. not rated by many users). An exemplary method for generating an author proximity network will be described in reference to Figure 6.
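The premise shared by the rating and author proximity networks, namely that items rated positively by the same users are similar, can be sketched as a co-rating count. The function names, the input layout (a mapping from user to the set of authors or books that user rated positively) and the use of the count directly as a weight are assumptions for illustration; the minimum of two users follows the author network example above.

```python
from collections import Counter
from itertools import combinations


def co_rating_weights(positive_ratings: dict) -> Counter:
    """Count, for every pair of items, how many users rated both positively.

    positive_ratings maps a user id to the set of items (books or authors)
    that the user rated positively. Hypothetical data layout.
    """
    weights = Counter()
    for items in positive_ratings.values():
        for a, b in combinations(sorted(items), 2):
            weights[(a, b)] += 1
    return weights


def links(weights: Counter, min_users: int = 2) -> dict:
    """Keep only pairs positively rated by at least min_users common users."""
    return {pair: w for pair, w in weights.items() if w >= min_users}


if __name__ == "__main__":
    ratings = {
        "u1": {"Austen", "Bronte"},
        "u2": {"Austen", "Bronte", "Dickens"},
        "u3": {"Dickens"},
    }
    print(links(co_rating_weights(ratings)))
```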
The proximity module 104 uses the data identifying a particular book and one or more proximity networks to compile a list of books that are similar to the identified book. The list of books similar to the identified book forms the list of recommended books. The proximity module 104 may be configured to generate the list of similar books by selecting those nodes from the proximity network(s) that are connected by a link to the node representing the identified book. This is based on the premise that if two nodes are connected by a link then they by definition have some level of similarity. Where the proximity network is a content proximity network or a rating proximity network the "node representing the identified book" is the node for the identified book. Where, however, the proximity network is an author proximity network the "node representing the identified book" is the node for the author of the identified book.
The list of similar books is then the list of books represented by the selected nodes. For example, where the proximity network is a content proximity network or a rating proximity network the nodes are books, thus the list of similar books comprises each node/book selected. Where, however, the proximity network is an author proximity network the nodes are authors, thus the list of similar books comprises the books written by each node/author selected.
The proximity module 104 may be further configured to use the distance values of the links to only select a limited number of similar books. Specifically, the proximity module 104 may be configured to select only the X most similar nodes from the proximity network(s), where X is a predefined integer greater than zero. For example, the proximity module 104 may be configured to select the 10 most similar nodes to the node representing the identified book using the distance values.
Alternatively, the proximity module 104 may be configured to select only the nodes that have a distance value below a predetermined threshold. Potentially, where nodes are authors a node may correspond to a large number of books. Accordingly, where some or all of the selected nodes correspond to a plurality of books, the proximity module 104 may be configured to select only the Y most similar books, where Y is a predefined integer greater than zero.
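Both selection strategies described above (the X most similar nodes, or all nodes below a distance threshold) can be sketched over a simple adjacency mapping. The dictionary layout, the node labels and the function name `most_similar` are assumptions for illustration and are not taken from the source.

```python
def most_similar(network: dict, node: str, x=None, max_distance=None) -> list:
    """Return the neighbours of node, nearest first (hypothetical helper).

    network maps each node to {neighbour: distance}. If max_distance is
    given, only neighbours with a distance strictly below it are kept; if x
    is given, at most the x nearest neighbours are returned.
    """
    neighbours = network.get(node, {})
    ranked = sorted(neighbours.items(), key=lambda item: item[1])
    if max_distance is not None:
        ranked = [(n, d) for n, d in ranked if d < max_distance]
    if x is not None:
        ranked = ranked[:x]
    return [n for n, _ in ranked]


# Illustrative network: book "A" is linked to three other books.
EXAMPLE_NETWORK = {"A": {"B": 5, "C": 2, "D": 9}}

if __name__ == "__main__":
    print(most_similar(EXAMPLE_NETWORK, "A", x=2))
    print(most_similar(EXAMPLE_NETWORK, "A", max_distance=6))
```

For an author proximity network the returned nodes would be authors, and a further step would expand each author into books before applying the Y limit.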
In some cases, the proximity module 104 may be configured to select similar books from only one proximity network. In other cases, the proximity module 104 may be configured to select similar books from a plurality of proximity networks so that the list of recommended books includes books identified as similar by different criteria. For example, if the proximity module 104 selects similar books from both the content proximity network and the rating proximity network, the list of recommended books will have books identified as similar by the content of the books and by the ratings of the books.
The proximity module 104 may receive information from the learning module 108 that defines the combination of proximity networks to be used in generating the list of recommended books. For example, as described below, the learning module 108 may be configured to select a service profile from a plurality of predefined service profiles that will provide the best list of recommended books for the particular service module 102. Each service profile assigns a proportion value to each of the different proximity networks. Typically the proportion values add up to 1 or 100%. For example, a particular service profile may assign the content proximity network a proportion value of 0.5, the rating proximity network a proportion value of 0.3, and the author proximity network a proportion value of 0.2. This would mean that when this service profile is used by the proximity module 104, 50% of the recommended books would be selected from the content proximity network, 30% of the recommended books would be selected from the rating proximity network, and 20% of the recommended books would be selected from the author proximity network.
Since the proximity networks are not based on information about the particular user, the proximity module 104 will typically produce the same list of recommended books for the same identified book regardless of the identity of the user.
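How a service profile's proportion values might translate into a number of books per proximity network can be sketched as below. The function name `allocate_slots`, the use of whole-number percentages instead of fractions, and the largest-remainder rule for distributing leftover places are assumptions; the source does not say how fractional shares are resolved.

```python
def allocate_slots(profile: dict, total: int) -> dict:
    """Split total recommendation places between proximity networks.

    profile maps a network name to its proportion as a whole-number
    percentage (assumed to sum to 100). Places left over after rounding
    down go to the networks with the largest remainders.
    """
    if sum(profile.values()) != 100:
        raise ValueError("proportion values must add up to 100%")
    slots = {name: total * share // 100 for name, share in profile.items()}
    remainders = {name: total * share % 100 for name, share in profile.items()}
    leftover = total - sum(slots.values())
    for name in sorted(remainders, key=remainders.get, reverse=True)[:leftover]:
        slots[name] += 1
    return slots


# Service profile from the example above: 50% content, 30% rating, 20% author.
EXAMPLE_PROFILE = {"content": 50, "rating": 30, "author": 20}

if __name__ == "__main__":
    print(allocate_slots(EXAMPLE_PROFILE, 10))
    print(allocate_slots(EXAMPLE_PROFILE, 7))
```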
The navigation module 106 receives the list of recommended books generated by the proximity module 104 and the data identifying the user and orders the list of recommended books so that the books that are most likely to appeal to the identified user are presented first (or at the top of the list). In some cases, once the books have been ordered, the navigation module 106 may remove one or more books from the bottom of the list to eliminate books that are less likely to appeal to the user.
The navigation module 106 may order the recommended books by comparing information about the user and their book preferences to characteristics of the recommended books. This comparison may comprise comparing the general profile of the identified user against the general profiles and rating profiles of the recommended books.
The general profile of a book typically comprises one or more criteria which characterize the book. The criteria are typically selected to provide the most relevant information in assessing the similarity of the book to the user's preferences. The criteria may include, but are not limited to, one or more of the following: genre, size, style signature, part of a series, time period, gender, age, price, language, and user home location. It will be evident to a person of skill in the art that other suitable criteria may be used alternatively to the listed criteria or in addition to the listed criteria.
The "genre" criterion may be used to indicate the subject category to which the book belongs. Typically the genre information is provided by the publisher of the book. In some cases there may be a predefined list of genres. The "size" criterion may be used to indicate the length of the book. In some cases the number of words may be used to assess the size of the book. For example, each book may be given a size rating from 1 to 10 based on the number of words in the book. Typically a size rating of "1" would indicate that the book is small (e.g. has a small number of words) and a size rating of "10" would indicate that the book is large (e.g. has a large number of words). However, it will be evident to a person of skill in the art that any other suitable rating system may be used. In other cases the number of pages may be used to assess the size of the book. However, the number of pages is typically a less accurate means for assessing the size of the book since it does not take into account the density of the content on each page. For example, it does not take into account varying font and page sizes between books.
The "style signature" criterion may be used to indicate the complexity of the book. In some cases each book may be given a style signature rating from 1 to 10 that represents the complexity of the book. The style signature rating may be based on the number of words in each sentence and/or the length of the words in the book. In some cases, the style signature rating may be based on the average length of the longest sentences in the book. The longest sentences in the book may be the top percentage (e.g. 5%) of the longest sentences. For example, the style signature rating may be calculated in accordance with Table 1.
Table 1 (provided as image imgf000010_0001 in the original publication)
In other cases, the style signature rating may be based on the Flesch Reading Ease Score (FRES) of the book. The FRES is calculated from the following formula: 206.835 - (1.015 x ASL) - (84.6 x ASW) where ASL is the average sentence length and ASW is the average syllables per word. This rates the text on a 100-point scale; the higher the score, the easier it is to understand the document with 60 to 70 being an acceptable score for literate adults. The FRES may, for example, be converted into a style signature rating in accordance with Table 2.
Table 2 (provided as image imgf000011_0001 in the original publication)
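The Flesch Reading Ease Score formula quoted above can be sketched as a small function. The function name and the choice to take pre-computed word, sentence and syllable counts as inputs are assumptions; counting syllables reliably is a separate problem that this sketch does not attempt, and the conversion of the score into a style signature rating is omitted because it depends on Table 2.

```python
def flesch_reading_ease(total_words: int, total_sentences: int, total_syllables: int) -> float:
    """Compute 206.835 - (1.015 x ASL) - (84.6 x ASW) from raw counts."""
    if total_words <= 0 or total_sentences <= 0:
        raise ValueError("word and sentence counts must be positive")
    asl = total_words / total_sentences   # average sentence length
    asw = total_syllables / total_words   # average syllables per word
    return 206.835 - (1.015 * asl) - (84.6 * asw)


if __name__ == "__main__":
    # 100 words in 5 sentences with 150 syllables: ASL = 20, ASW = 1.5.
    print(round(flesch_reading_ease(100, 5, 150), 3))
```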
In still other cases, the style signature rating may be based on the Gunning Fog Index Readability Formula, simply called the Fog Index, of the book. The Fog Index is calculated from the following formula: 0.4 (ASL + PHW) where ASL is the average sentence length and PHW is the percentage of hard words. A hard word is defined as one that has three or more syllables and that is not (i) a proper noun, (ii) a combination of easy words or hyphenated words, or (iii) a two-syllable verb made into three syllables with common suffixes such as -es, -ed and -ing. The Fog Index is a rough measure of how many years of school it would take someone to understand the content. The lower the Fog Index, the more understandable the content is. The average person reads at level 9. The easy reading range is 6-10 and anything above 15 is getting difficult.
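The Fog Index formula above can be sketched in the same way. The function name and the use of pre-computed counts are assumptions; deciding which words count as "hard" under the three exclusions listed above is left to the caller.

```python
def gunning_fog(total_words: int, total_sentences: int, hard_words: int) -> float:
    """Compute 0.4 x (ASL + PHW) from raw counts."""
    if total_words <= 0 or total_sentences <= 0:
        raise ValueError("word and sentence counts must be positive")
    asl = total_words / total_sentences        # average sentence length
    phw = 100.0 * hard_words / total_words     # percentage of hard words
    return 0.4 * (asl + phw)


if __name__ == "__main__":
    # 100 words in 5 sentences, 10 of them hard: ASL = 20, PHW = 10.
    print(gunning_fog(100, 5, 10))
```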
The "part of a series" criterion may be used to indicate whether the book is part of a series of books. It may be assigned a Boolean value indicating whether the book is part of a series. For example, the "part of a series" criterion may be set to "true", "yes" or "1" when the book is part of a series and "false", "no" or "0" when the book is not part of a series.
The "time period" criterion may be used to indicate the time period in which the book was written. For example, a book may be classified into one of the following time period categories based on when it was written: less than 1 year ago; between 1 and 5 years ago; between 5 and 10 years ago; between 10 and 20 years ago; between 20 and 50 years ago; more than 50 years ago and in the 20th century; 19th century; 18th century; etc.
The "gender" criterion may be used to indicate the target reader's gender. In some cases, the positive ratings (e.g. ratings above a predetermined threshold, such as a rating above three stars in a five star rating system) of the book may be used to assess the gender target of the book. For example, if there are fewer than a threshold number of positive ratings the book's gender target may be classified as unknown. If, however, there are at least a threshold number of positive ratings, then if more than a predetermined percentage (e.g. 70%) of the positive ratings are from a certain gender (e.g. males or females) the book's gender target may be classified as that gender (e.g. male or female); otherwise the book's gender target may be classified as unknown. The threshold number of positive ratings will typically be between 3 and 5; however, it will be evident to a person of skill in the art that the threshold number may be fine-tuned and that any suitable threshold number may be used.
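The gender target rule described above can be sketched as a small classifier. The function name, the string labels and the default values (a minimum of 3 positive ratings and a 70% share, taken from the examples in the text) are illustrative assumptions.

```python
from collections import Counter


def gender_target(positive_rating_genders: list, min_ratings: int = 3, min_percent: int = 70) -> str:
    """Classify a book's gender target from the genders of its positive raters.

    Returns "unknown" when there are fewer than min_ratings positive ratings
    or when no gender accounts for more than min_percent of them.
    """
    total = len(positive_rating_genders)
    if total < min_ratings:
        return "unknown"
    for gender, count in Counter(positive_rating_genders).items():
        # Integer comparison: strictly more than min_percent of the ratings.
        if count * 100 > min_percent * total:
            return gender
    return "unknown"


if __name__ == "__main__":
    print(gender_target(["female"] * 8 + ["male"] * 2))
    print(gender_target(["female"] * 7 + ["male"] * 3))
    print(gender_target(["male", "male"]))
```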
The "age" criterion may be used to indicate the target reader's age. Similar to the "gender" criterion, the positive ratings (e.g. ratings above a predetermined threshold, such as ratings above three stars in a five star rating system) may be used to assess the target reader's age. For example, the "age target" criterion may be an array, each element of the array representing a range of ages. The percentage of positive ratings by each age category would be used to fill the array. For example, the array may have 12 elements which represent the following age categories: less than 1 year, 1-5 years, 6-10 years, 11-15 years, 16-20 years, 21-25 years, 26-30 years, 31-35 years, 36-40 years, 41-45 years, 46-50 years, and over 50 years. When a user is rating a children's book, the user may be given the option to specify the age of the child on which the rating is based.
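The age target array can be sketched using the twelve age categories listed above. The names `AGE_CATEGORIES`, `age_category_index` and `age_target`, and the assumption that reader ages are supplied in whole years, are illustrative and not taken from the source.

```python
# The twelve age categories listed in the text, in order.
AGE_CATEGORIES = ["<1", "1-5", "6-10", "11-15", "16-20", "21-25",
                  "26-30", "31-35", "36-40", "41-45", "46-50", ">50"]


def age_category_index(age: int) -> int:
    """Map an age in whole years (assumption) to its category index."""
    if age < 1:
        return 0
    return min((age - 1) // 5 + 1, len(AGE_CATEGORIES) - 1)


def age_target(positive_rating_ages: list) -> list:
    """Return the percentage of positive ratings falling in each age category."""
    counts = [0] * len(AGE_CATEGORIES)
    for age in positive_rating_ages:
        counts[age_category_index(age)] += 1
    total = len(positive_rating_ages)
    if total == 0:
        return [0.0] * len(AGE_CATEGORIES)
    return [100.0 * c / total for c in counts]


if __name__ == "__main__":
    for label, share in zip(AGE_CATEGORIES, age_target([3, 4, 7, 60])):
        print(label, share)
```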
The "price" criterion may be used to indicate the price of the book. In some cases a book may be classified into one of the following price categories based on the price of the book: free; less than $1.99; $2-$4.99; $5-$9.99; $10-$19.99; $20-$29.99; $30-$39.99; $40-$49.99; greater than $49.99. Many users will only be interested in books in a certain price range. Accordingly, this criterion is used to match the price of the book to the user's preference in relation to book price.
The "language" criterion may be used to indicate the language of the book. In some cases the storage module 110 may only contain books in a single language. In these cases it may not be necessary to include the language criterion. In other cases, the storage module 110 may contain books in multiple languages. In these cases, the language criterion may be used to match the language of the book to the language preferences of the user.
The "user home location" criterion may be used to indicate the target reader's home location. Similar to the "gender target" and "age target" criteria, the positive ratings (e.g. ratings above a predetermined threshold, such as ratings above three stars in a five star rating system) may be used to assess the target reader's home location. For example, if there are fewer than a threshold number of positive ratings of the book, the book's "user home location" may be classified as unknown; if, however, there are at least a threshold number of positive ratings of the book, then if more than a predetermined percentage (e.g. 50%) of the positive ratings are from users located in a specific city or town, the book's "user home location" may be classified as that city or town, otherwise the "user home location" may be classified as unknown. While the user's city or town has been used as the user's home location, it will be evident to a person of skill in the art that other suitable location metrics, such as country, county or region could also be used as the user's home location. Since the user home location criterion is one of the weaker criteria in terms of matching user preferences to books, the threshold number of positive ratings may be higher than for other, stronger criteria. For example, the threshold number of positive ratings may be around 10; however, it will be evident to a person of skill in the art that the threshold value may be fine-tuned over time and that other suitable threshold values may be used.
The rating profile of a book typically comprises information about how the book has generally been rated. For example, a book may be assigned one of the following rating profiles based on the ratings of the book: positive, negative, average, or polar. Positive, negative and average ratings may be defined as ratings that fall within certain predefined ranges. For example, in a five star rating system, positive ratings may be ratings of 4 or 5 stars, negative ratings may be ratings of 0 or 1 star, and average ratings may be ratings of 2 or 3 stars. For a book to be assigned a positive, negative or average rating profile, at least a predetermined percentage (e.g. 60%) of the ratings must be positive, negative or average respectively. For a book to be assigned a polar rating profile a high percentage of the ratings must be positive and a high percentage of the ratings must be negative, leaving very few average ratings. For example, a book may be assigned a polar rating profile if at least a predefined percentage (e.g. 40%) of the ratings are positive (e.g. ratings of 4 or 5 stars) and at least a predefined percentage (e.g. 40%) of the ratings are negative (e.g. ratings of 0 or 1 stars).
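The rating profile rules above can be sketched as follows, using the example thresholds from the text (60% for positive, negative or average; 40% positive and 40% negative for polar). The function name, the assumption of whole-star ratings from 0 to 5, the order in which the rules are checked and the `None` result when no rule applies are all illustrative choices that the source does not specify.

```python
def rating_profile(stars: list, majority_percent: int = 60, polar_percent: int = 40):
    """Classify a book as "positive", "negative", "average", "polar" or None.

    stars is a list of whole-star ratings from 0 to 5 (assumption):
    4-5 is positive, 0-1 is negative and 2-3 is average.
    """
    total = len(stars)
    if total == 0:
        return None
    positive = sum(1 for s in stars if s >= 4)
    negative = sum(1 for s in stars if s <= 1)
    average = total - positive - negative
    # Assumption: the single-class rules are checked before the polar rule.
    for label, count in (("positive", positive), ("negative", negative), ("average", average)):
        if count * 100 >= majority_percent * total:
            return label
    if positive * 100 >= polar_percent * total and negative * 100 >= polar_percent * total:
        return "polar"
    return None


if __name__ == "__main__":
    print(rating_profile([5, 5, 4, 4, 3]))
    print(rating_profile([5, 5, 0, 1, 3]))
```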
In some cases, the rating profile of a book will be separate from the general profile of the book. In other cases the rating profile of a book will form part of the general profile of the book. For example, the rating profile may be a criterion of the general profile of the book.
The general profile of a user typically comprises one or more criteria that characterize the user and the user's book preferences. Preferably the criteria used in the general profile of the user correspond to the criteria used in the general profile of a book. For example, the criteria used in the general profile of the user may include, but are not limited to, one or more of the following: genre, size, style signature, part of a series, time period, gender, age, price, language, user home location and rating profile. It will be evident to a person of skill in the art that other suitable criteria may be used instead of, or in addition to, the listed criteria.

The "genre" criterion may be used to indicate the genre(s) of books preferred by the user. The genre preferences of a user may be based on the genres of the books purchased and/or positively rated (e.g. given a rating of 4 or 5 stars in a five star rating system) by the user. For example, there may be a fixed number of genres and each genre may be assigned a value in a predetermined range based on the number of books purchased and/or positively rated by the user of that genre. For example, each genre may be assigned a value from 1 to 10 representing the number of books purchased and/or positively rated by the user of that genre, where a value of 10 may mean that at least 10 books of this genre have been purchased and/or positively rated by the user. The term "books purchased and/or positively rated by the user" is used herein to mean the books purchased by the user, the books positively rated by the user, or both the books purchased by the user and the books positively rated by the user.
The "size" criterion may be used to indicate the book length(s) preferred by the user. The book length(s) preferred by the user may be assessed based on the size rating of the books purchased and/or positively rated by the user. For example, as described above, each book may be given a size rating from 1 to 10 based on the number of words in the book. Typically a size rating of "1" would indicate that the book is small (e.g. has a small number of words) and a size rating of "10" would indicate that the book is large (e.g. has a large number of words). The user size criterion may be an array wherein each size rating is assigned a user preference value from 0 to 10 based on the size rating of the books purchased and/or positively rated by the user. In some cases, a particular size rating will only be assigned a user preference value after the user has positively and/or negatively rated a predetermined number of books with that size rating.
Table 3 illustrates an exemplary method for calculating the user "size" criterion. In this example, the size ratings 1, 2 and 3 are not assigned a user preference value since the total number of books positively or negatively rated by the user is below the predetermined threshold (e.g. 5). For size ratings 8, 9 and 10, more than 75% of the books with this size rating have been negatively rated by the user, thus size is likely to be negatively impacting the user's appreciation of the books with those size ratings. Accordingly these size ratings have been assigned a user preference value of 0. For size ratings 4, 5, 6 and 7, more than the threshold number of books have been positively or negatively rated, but less than 75% of the rated books in each category have been negatively rated, thus the size of the books is not likely to be significantly impacting the user's appreciation of the books with these size ratings. Accordingly, each of these size ratings is assigned a value representative of the number of books positively rated, up to a maximum value of 10.
Table 3
Figure imgf000015_0001
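The three cases described for Table 3 (too few rated books, mostly negative ratings, otherwise a capped count of positive ratings) can be sketched as below. The function name and the `(trait_rating, liked)` input format are hypothetical; the defaults of 5 books, 75% and a cap of 10 come from the examples in the text. The same routine would apply unchanged to the style signature array.

```python
def preference_values(rated_books, min_rated=5, max_negative_share=0.75, cap=10):
    """Build a user preference array for a 1-to-10 trait such as size.

    rated_books: iterable of (trait_rating, liked) pairs, where liked is
    True for a positive rating and False for a negative one.
    Returns a dict mapping each trait rating to a value 0..cap, or None
    when the user has rated too few books with that trait rating.
    """
    positive = {r: 0 for r in range(1, 11)}
    negative = {r: 0 for r in range(1, 11)}
    for trait_rating, liked in rated_books:
        if liked:
            positive[trait_rating] += 1
        else:
            negative[trait_rating] += 1

    prefs = {}
    for r in range(1, 11):
        total = positive[r] + negative[r]
        if total < min_rated:
            prefs[r] = None                      # not enough evidence yet
        elif negative[r] / total > max_negative_share:
            prefs[r] = 0                         # trait appears to be disliked
        else:
            prefs[r] = min(positive[r], cap)     # capped count of liked books
    return prefs
```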
The "style signature" criterion may be used to indicate the book complexity preferred by the user. The book complexity preferred by the user may be based on the style signature rating of the books purchased and/or positively rated (e.g. given a rating of 4 or 5 stars in a five star rating system) by the user. For example, as described above, each book may be given a style signature rating from 1 to 10 based on the percentage of long sentences, where a long sentence is one with more than a predefined number of words. The style signature criterion may be an array wherein each style signature rating is assigned a user preference value from 0 to 10 based on the style signature rating of the books purchased and/or positively rated by the user. In some cases, a particular style signature rating may only be assigned a user preference value after a predetermined number of books with that particular style signature have been positively or negatively rated by the user.
Table 4 illustrates an exemplary method for calculating the user's style signature preferences. In this example, the style signature ratings 1, 2, 3 and 10 are not assigned a user preference value since the total number of books positively or negatively rated by the user is below the predetermined threshold (e.g. 5). For style signature ratings 8 and 9 more than 75% of the rated books with this style signature rating have been negatively rated thus style signature is likely to be negatively impacting the user's appreciation of the books with those style signature ratings. Accordingly these style signature ratings have been assigned a user preference value of 0. For style signature ratings 4, 5, 6 and 7, more than the threshold number of books with these style signature ratings have been positively or negatively rated, but less than 75% of the rated books in each category have been negatively rated, thus the style signature of the books does not appear to impact the user's appreciation of the book. Accordingly, each of these style signature ratings is assigned a value representative of the number of books positively rated by the user up to a maximum value of 10.
Table 4
Figure imgf000016_0001
The "part of a series" criterion may be used to indicate whether the user prefers or enjoys books that are part of a series. Whether or not the user prefers or enjoys books that are part of series may be based on whether the books purchased and/or positively rated by a user are part of a series. For example, the user may be assigned a "part of a series" value from 0 to 10 that is representative of the percentage of books purchased and/or positively rated by a user that are part of a series. For example, a user may be assigned a "part of a series" value of 7 if 70% of the books purchased and/or positively rated by the user are part of a series. In some cases, the user may only be assigned a part of a series value after a predetermined number of books have been rated.
The "time period" criterion may be used to indicate whether the user prefers or enjoys books from particular time periods. The time period(s) preferred by the user may be based on the time period of the books purchased and/or rated positively by the user. As described above, a book may be classified into one of the following time periods based on when it was written: less than 1 year ago; between 1 and 5 years ago; between 5 and 10 years ago; between 10 and 20 years ago; between 20 and 50 years ago; more than 50 years ago and in the 20th century; 19th century; 18th century; etc. For the user, each time period may be assigned a user preference value from 0 to 10 representing the number of books of that time period that were purchased and/or rated positively by the user. A user preference value of 10 may represent, for example, that the user has purchased and/or rated positively at least 10 books of that time period. In some cases, a particular time period may only be assigned a user preference value after a predetermined number of books in that time period have been purchased and/or positively rated by the user.
The "gender" criterion may be used to indicate the user's gender. The gender criterion may be assigned a value of male, female or unknown. The gender information may be provided to the system 100 by the service module 102.
The "age" criterion may be used to indicate the user's age. The age criterion may specify the user's age in years, for example, or alternatively it may specify the age range that the user falls into. For example, the age criterion may specify which of the following age ranges the user falls into: less than 1 year, 2-5 years, 6-10 years, 11-15 years, 16-20 years, 21-25 years, 26-30 years, 31-35 years, 36-40 years, 41-45 years, 46-50 years, and over 50 years. The age information may be provided to the system 100 by the service module 102.
The "price" criterion may be used to indicate the user's price preferences. As noted above, many users will only be interested in books in certain price ranges. As described above, a book may be classified into one of the following price ranges based on the price of the book: free; less than $1.99; $2-$4.99; $5-$9.99; $10-$19.99; $20-$29.99; $30-$39.99; $40-$49.99; greater than $49.99. For the user, each price range may be assigned a user preference value from 0 to 10 representing the number of books in that price range that were purchased and/or rated positively by the user. A user preference value of 10 may represent, for example, that the user has purchased and/or rated positively at least 10 books in that price range. In some cases, a particular price range may only be given a user preference value after a user has purchased and/or positively rated a predetermined number of books in the particular price range.
The "language" criterion may be used to indicate the user's language preferences. In some cases the storage module 110 may only contain books in a single language. In these cases it may not be helpful to include the language criterion. In other cases, the storage module 110 may contain books in multiple languages. In these cases, the language criterion may be used to match the language preferences of the user to the language of the recommended books. The user's language preferences may be based on user information received from the service module 102.
Alternatively, the user's language preferences may be based on the language of the books positively rated and/or purchased by the user. For example, a particular language may be automatically added to the user's language preferences once the user has positively rated and/or purchased a threshold number (e.g. 5) of books in the particular language. Where a user's language preferences include more than one language, the list of recommended books may include books in each of those languages.
The "user home location" criterion may be used to indicate the home location of the user. The "user home location" may specify one or more of the following: the town or city, county or region, or country of residence of the user. The user home location information may be provided to the system 100 by the service module 102.
The "rating profile" criterion may be used to indicate the book rating profile preferred by the user. The rating profile criterion may be assessed from the rating profiles of the books that the user has purchased and/or rated positively (e.g. rated 4 or 5 stars in a five star rating system). The rating profile criterion may help identify whether the user can appreciate books that are controversial (e.g. books that have a polar rating profile), and whether the user can appreciate books that are not appreciated by a majority of other users (e.g. books that have a negative or average rating profile). As described above, a book may be assigned one of the following rating profiles based on the ratings of the book: positive, negative, average or polar. For the user, each rating profile may be assigned a user preference value from 0 to 10 representing the number of books with that rating profile that were purchased and/or rated positively by the user. A user preference value of 10 may represent, for example, that the user has purchased and/or rated positively at least 10 books with that rating profile. In some cases, a particular rating profile may only be given a user preference value after a user has purchased and/or positively rated a predetermined number of books with that rating profile.
Using the rating profile criterion to match books to users provides a distinct advantage over most prior art book recommendation systems. Specifically, most prior art book recommendation systems assume that the majority opinion on a book is paramount so that any specific user will follow the majority opinion. Therefore, in most prior art book recommendation systems, once a book has been negatively rated by a certain number of users it is unlikely to subsequently be recommended to any users, on the assumption that if the majority of users do not like it, other users will not like it either. In the systems and methods described herein, however, the objective is to find a list of recommended books that fits a specific user's tastes and preferences, even if they deviate from the majority opinion. Accordingly, in the systems and methods described herein, it is determined for a particular user whether only books with a "positive rating" should be recommended to the user. It is not assumed that this would be appropriate for all users.
It has been described above that many of the user criteria may be based on a user's positive ratings of books. For example, it has been described above that many criteria may be assigned a user preference value from, for example, 0 to 10 based on the number of books positively rated by the user. It will be evident to a person of skill in the art that the negative and/or average ratings by the user may also be used to assess the user's preferences. Specifically, the negative ratings by the user may be used to identify an aversion or dislike of a particular characteristic. For example, if a predetermined percentage of the user's ratings are negative for a particular criterion then a negative user preference rating may be assigned to indicate that the user has an aversion to or dislike of books with a particular characteristic.
The navigation module 106 compares the general profile for the user against the general profile for each recommended book to determine the similarity between the user's characteristics and preferences and each book's characteristics. The similarity between a user's characteristics and preferences to a book's characteristics may be represented by a distance value. Typically, the smaller the distance value, the more closely the book matches the user's characteristics and preferences.
The distance between the user's characteristics and preferences and a book's characteristics may, for example, be calculated by determining a proximity value for each valid criterion in the user's profile and then summing the individual proximity values. A valid criterion in the user's profile is a criterion that has a non-null value. For example, since some of the criteria (e.g. genre, size, style signature, time period, rating profile) may only be assigned values after the user has positively rated a predetermined number of books, and some of the criteria (e.g. gender, age, home location) may only be assigned values if the relevant information is available, there may be one or more criteria in the user's profile that are not assigned a value (e.g. have a null value). Each proximity value is representative of how well a particular user criterion matches the corresponding book criterion. For example, the age proximity value indicates how well the user's age matches the target age for the book. For some user criteria (e.g. age, language), the proximity value is calculated by comparing the user characteristic against the corresponding book criterion and assessing how well they match. For example, the age proximity value may be based on whether the user's age falls into the target age range of the book and, if not, how far away it is. For other user criteria (e.g. style signature, size) the proximity value is calculated from the user preference value associated with the corresponding book criterion. For example, a user's style signature criterion comprises an array of user preference values and each user preference value represents the user's preference for a particular style signature. Accordingly, as shown in Table 5, the style signature proximity value for a particular book is based on the user preference value corresponding to the style signature of the book (e.g. 10 for style signature 4 (Book 1), and 0 for style signature 9 (Book 2)).
As described above, typically the higher the user preference value, the more the user prefers books with that trait. For example, a user preference value of 10 for style signature 4 indicates the user prefers or likes books with a style signature of 4. It will be evident to a person of skill in the art that this is an exemplary method of calculating the proximity values and other suitable methods may be used to calculate the proximity values. For example, each proximity value may be calculated by determining the difference between a book criterion (e.g. style signature) and the user's preference regarding that criterion (e.g. the user's preferred style signature). Typically, the more closely the user criterion matches the corresponding book criterion, the lower the proximity value.
In some cases, one or more criteria may be given different weight in determining the distance. For example, the style signature criterion may be given more weight than the user home location criterion. In these cases, each proximity value may be multiplied by a weighting factor before being summed. The weighting factors may be predetermined or they may be dynamically calculated or updated by the learning module 108. Typically, the closer a book's characteristics match a user's preferences, the lower the distance value. This is illustrated in Table 5, where the user's preferences more closely match Book 1's characteristics than Book 2's characteristics, thus Book 1 is assigned a lower distance value (0.03) and Book 2 is assigned a higher distance value (0.10). Specifically, the user preference values corresponding to the characteristics of Book 1 are higher than the user preference values corresponding to the characteristics of Book 2, thus Book 1 more closely matches the user's preferences.
Table 5
Figure imgf000021_0001
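The weighted distance described above may be written compactly as follows. The symbols are introduced here for illustration only and do not appear in the original text: $V_u$ denotes the set of valid (non-null) criteria in the profile of user $u$, $p_c(u,b)$ the proximity value of criterion $c$ for book $b$, and $w_c$ the weighting factor of criterion $c$ (with $w_c = 1$ when no weighting is applied).

```latex
d(u, b) = \sum_{c \in V_u} w_c \, p_c(u, b)
```

A smaller $d(u,b)$ indicates a closer match between the book and the user.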
Once each of the recommended books has been assigned a distance value, the books are sorted or ordered according to the distance value. For example, where a lower distance value indicates a better match to the user's characteristics and preferences, the recommended books may be placed in ascending order based on the distance values. The objective is to present the books that most closely match the user's characteristics and preferences at the beginning or top of the list. As described above, the navigation module 106 may also remove a number of books from the bottom of the list to eliminate books that are less likely to appeal to the user. In some cases, the navigation module 106 may eliminate a predefined number of books from the bottom of the sorted or ordered list. For example, the navigation module 106 may eliminate the last five books on the sorted or ordered list. Alternatively, the navigation module 106 may eliminate all books with a distance value above a certain threshold. For example, where the distance values range from 0 to 1 the navigation module 106 may eliminate all books with a distance value above 0.8. Once the list of recommended books has been sorted or ordered by the navigation module 106, it is provided to the service module 102. The service module 102 then provides the sorted list of recommended books to the user. The service module 102 may then allow the user to purchase and/or rate the recommended books. The same book recommendations cannot typically be made for all user groups and all individual users.
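The ordering and pruning steps can be sketched as follows. The function and parameter names are hypothetical. The text presents the two pruning strategies as alternatives; the sketch accepts either or both purely for convenience.

```python
def order_recommendations(distances, drop_last=0, max_distance=None):
    """Sort books by ascending distance and prune the weakest matches.

    distances: dict mapping a book identifier to its distance value.
    drop_last: number of books to remove from the bottom of the list.
    max_distance: if given, books with a larger distance are removed.
    """
    ordered = sorted(distances, key=distances.get)   # best match first
    if max_distance is not None:
        ordered = [b for b in ordered if distances[b] <= max_distance]
    if drop_last:
        ordered = ordered[:-drop_last]
    return ordered
```

For example, with distances `{"a": 0.10, "b": 0.03, "c": 0.85, "d": 0.5}` and `max_distance=0.8`, the result is `["b", "a", "d"]`.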
In some cases the service module 102 may only provide book services to users in a specific country. For example, one service module 102 may provide an eBook service to Canadians whereas another service module 102 may provide an eBook service to Australians. It may be that Canadians have statistically different book preferences from Australians, thus it may be beneficial to tailor the system to the specific service module 102 and thus to the group of users of the service module 102. Similarly each user is unique and thus it is beneficial to tailor the list of recommended books to the specific user as much as possible. Accordingly, the learning module 108 is designed to tune the system 100 for the specific service module 102 and the specific user. Specifically, the learning module 108 may be configured to select which combination of proximity networks should be used by the proximity module 104 to provide the best book recommendations for the specific service module 102 (and thus for the group of users of the service module 102). The learning module 108 may also be configured to select the criteria weighting factors used by the navigation module 106 to provide the best book recommendations for the specific user.

The learning module 108 may select which combination of proximity networks should be used by the proximity module 104 by testing a plurality of pre-defined service profiles. Each service profile assigns a proportion value to each proximity network. Typically the proportion values add up to 1 or 100%. For example, a particular service profile may assign the content proximity network a proportion value of 0.5, the rating proximity network a proportion value of 0.3, and the author proximity network a proportion value of 0.2. This would mean that when this service profile is used, 50% of the recommended books would be selected from the content proximity network, 30% of the recommended books would be selected from the rating proximity network, and 20% of the recommended books would be selected from the author proximity network. A proximity network may be assigned a proportion value of zero, which means that the particular proximity network is not to be used in generating the list of recommended books.
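As a sketch of how a service profile might translate into a concrete list, the following assigns a whole number of books to each proximity network. The function name is hypothetical, the proportion values are assumed to sum to 1, and the largest-remainder rounding is an assumption; the text does not say how fractional shares are handled.

```python
import math


def books_per_network(service_profile, list_size):
    """Split a recommendation list of list_size books across networks.

    service_profile: dict mapping a network name to its proportion value.
    """
    exact = {name: share * list_size for name, share in service_profile.items()}
    counts = {name: math.floor(value) for name, value in exact.items()}
    # Hand any books lost to rounding down to the largest fractional parts.
    leftover = max(list_size - sum(counts.values()), 0)
    by_remainder = sorted(exact, key=lambda name: exact[name] - counts[name], reverse=True)
    for name in by_remainder[:leftover]:
        counts[name] += 1
    return counts
```

With the example profile from the text and a list of ten books, this yields five books from the content network, three from the rating network and two from the author network.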
The learning module 108 tests or evaluates each of the pre-defined service profiles and provides the service profile with the best performance to the proximity module 104 to be used in generating the list of recommended books. Testing or evaluating the pre-defined service profiles may be done in an intrusive manner or non-intrusive manner. Intrusive methods of testing a service profile are those where the users are aware that the testing is being performed. For example, an intrusive method of testing may comprise asking the users of the service module 102 to rate the relevance of the books recommended using the particular service profile. Conversely, non-intrusive testing methods are those where the users are unaware that the testing is being performed. For example, a non-intrusive method of testing a service profile may comprise monitoring the number of clicks on the recommended books (e.g. indicating interest in the books) and/or the number of recommended books that are ultimately purchased. Since the performance of the service profiles may change as more data (e.g. ratings and recommendations) is added to the system 100, the learning module 108 may be configured to automatically perform the service profile selection on a periodic basis. For example, the learning module 108 may be configured to test the pre-defined service profiles once a week. However, it will be evident to a person of skill in the art that other suitable time frames for testing the pre-defined service profiles may be used.
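A non-intrusive selection of the best-performing service profile might look like the sketch below. The scoring rule (clicks plus purchases per recommendation shown) is an assumption made for illustration; the text names the signals to monitor but not how they are combined. The function name and the observation format are likewise hypothetical.

```python
def best_profile(observations):
    """Return the name of the profile with the highest engagement rate.

    observations: dict mapping a profile name to a dict with the keys
    "shown", "clicks" and "purchases" collected while that profile was
    in use.
    """
    def score(name):
        stats = observations[name]
        if stats["shown"] == 0:
            return 0.0   # a profile that was never shown cannot win
        # Assumed metric: clicks and purchases weighted equally.
        return (stats["clicks"] + stats["purchases"]) / stats["shown"]

    return max(observations, key=score)
```

The same routine could be rerun on a schedule (for example weekly) over fresh observations, and could equally rank weighting factor profiles.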
As described above, the learning module 108 may also select the criteria weighting factors used by the navigation module 106 to provide the best book recommendations for the specific user. The weighting factors used by the navigation module 106 may be selected in a similar manner to how the combination of proximity networks is selected. For example, the learning module may test a plurality of pre-defined weighting factor profiles and select the one with the best performance.
Similar to selecting the combination of proximity networks, the testing of the pre-defined weighting factor profiles may be performed in an intrusive or non-intrusive manner.
Each pre-defined weighting factor profile assigns a weighting factor to each criterion in the user profile. Typically the weighting factors add up to 1 or 100%, but it will be evident to a person of skill in the art that the method would still operate if the weighting factors do not add up to 1 or 100%. For example, a particular weighting factor profile may assign the weighting factors shown in Table 6.
Table 6
Criterion Weighting Factor
Genre 0.30
Size 0.20
Style Signature 0.15
Part of a Series 0.05
Time Period 0.10
Gender 0.05
Age 0.01
Price 0.01
Language 0.05
Rating Profile 0.03
As described above, the navigation module 106 may use the weighting factors to determine the distance between a recommended book and the user's characteristics and preferences. For example, the navigation module 106 may generate the distance by determining a proximity value for each criterion, multiplying each proximity value by the associated weighting factor, and summing the weighted proximity values.
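The weighted sum can be sketched with the Table 6 weighting factors as follows. The function names are hypothetical. The mapping from a 0-to-10 user preference value to a proximity value (`1 - preference / 10`, so that a strongly preferred trait yields a low proximity value) is an assumption, since the text leaves the exact conversion open. Criteria with a null proximity value, or with no entry in the weighting profile, are skipped.

```python
# Weighting factors as listed in Table 6.
TABLE_6_WEIGHTS = {
    "genre": 0.30, "size": 0.20, "style_signature": 0.15,
    "part_of_series": 0.05, "time_period": 0.10, "gender": 0.05,
    "age": 0.01, "price": 0.01, "language": 0.05, "rating_profile": 0.03,
}


def preference_to_proximity(preference, scale=10):
    """Assumed conversion: preference 10 -> proximity 0.0 (best match)."""
    return 1 - preference / scale


def distance(proximities, weights=TABLE_6_WEIGHTS):
    """Sum the weighted proximity values of all valid criteria.

    proximities: dict mapping a criterion name to its proximity value,
    or to None when the criterion is not valid for this user.
    """
    return sum(
        weights.get(criterion, 0.0) * value
        for criterion, value in proximities.items()
        if value is not None
    )
```

For instance, a book whose style signature has a user preference value of 10 contributes nothing to the distance for that criterion, whereas one with a preference value of 0 contributes the full style signature weight of 0.15.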
Since the performance of the weighting factor profiles may change as more data (e.g. ratings and recommendations) is added to the system 100, the learning module 108 may be configured to automatically perform the weighting factor profile selection on a periodic basis. For example, the learning module 108 may be configured to test the pre-defined weighting factor profiles once a week. However, it will be evident to a person of skill in the art that other suitable time frames for testing the pre-defined weighting factor profiles may be used.
The storage module 110 is configured to generate and store the information used to generate the list of recommended books and may comprise or be in the form of one or more databases.
Specifically, the storage module 110 may be configured to store the following: a plurality of books in electronic form and the ratings of the books. The storage module 110 may also be configured to generate and store the general profile for each book, the general profile for each user, and the one or more proximity networks as described above. For example, the storage module 110 may be configured to generate a content proximity network in accordance with the method described in reference to Figure 4, a rating proximity network in accordance with the method described in reference to Figure 5, and/or an author proximity network in accordance with the method described in reference to Figure 6.
Since the general profiles of the books, the general profiles of the users, and the proximity networks may change as more data (e.g. user ratings and books) is added to the system 100, the storage module 110 may be configured to regenerate or update the general profiles of the books, the general profiles of the users, and the one or more proximity networks on a periodic basis. For example, the storage module 110 may be configured to regenerate or update the profiles and proximity networks once a week.

In some embodiments, the system 100 may also include an interface module (not shown) that is configured to generate a user interface for displaying the ordered list of recommended books. The user interface generated by the interface module is provided to the service module 102 where it is displayed or presented to the user. An exemplary user interface 300 is shown in Figure 3.
In some cases, the interface module may be configured to use images of the covers of the recommended books to present the recommended books to the user. Different image sizes may be used to represent the similarity of the book to the user's characteristics and preferences. For example, as shown in Figure 3, the books that are most similar to the user's characteristics and preferences (e.g. are at the top of the sorted list) may be represented by larger book cover images 302, 304, 306 and 308, whereas, the books that are less similar to the user's characteristics and preferences (e.g. are lower down the sorted list) may be represented by smaller book cover images 310, 312, 314 and 316.
The user interface may also be divided into sections where each section groups together the books located from a particular proximity network or from a particular combination of proximity networks. For example, as shown in Figure 3, the user interface 300 may be divided into a first row 318 and a second row 320. The first row 318 may be used to display cover images of the books that were located from a rating proximity network, and the second row 320 may be used to display cover images of the books that were located from a content proximity network.
In some cases, there may be more books on the recommended list than can be displayed at once, thus the user interface may be configured to allow the user to access the additional recommended items by, for example, scrolling in one or more directions. For example, the user may be able to access additional recommended items by scrolling to the left or right, or up and down. In some embodiments, scrolling along different axes may provide additional books from different proximity networks or different combinations of proximity networks. For example, scrolling along one axis (e.g. left or right) may provide the user with additional books located by one proximity network (e.g. a content proximity network) whereas scrolling along a different axis (e.g. up or down) may provide the user with additional books located by another proximity network (e.g. a rating proximity network). In another example, scrolling along one axis (e.g. left or right) may provide the user with additional books located by a first combination of proximity networks (e.g. 100% content proximity network) whereas scrolling along a different axis (e.g. up or down) may provide the user with additional books located by another combination of proximity networks (e.g. 80% rating proximity network and 20% author proximity network).

Each of the proximity module 104, the navigation module 106, the learning module 108, the storage module 110, and the interface module may be implemented in hardware or software. For example, each of the proximity module 104, the navigation module 106, the learning module 108, the storage module 110 and the interface module may be implemented by one or more computers, each computer comprising one or more processors. Alternatively each of the proximity module 104, the navigation module 106, the learning module 108, the storage module 110 and the interface module may be implemented by instructions stored on a computer readable medium that, when executed by a computer, perform the functions described above.
Reference is now made to Figure 4 which illustrates an exemplary method 400 for generating a content proximity network. At step 402, each book in the database is analysed to generate a list of words used in the book and a count of how many times each word is used. At step 404, the lists generated at step 402 are filtered to eliminate words that do not provide any useful information as to the subject or content of the book. For example, the lists may be filtered to eliminate all semantically empty words, such as articles and adverbs, from the lists. This typically leaves only the words that carry semantic meaning. In some embodiments, step 404 may be performed as part of step 402. Specifically, the semantically empty words may be excluded (or ignored) when generating the lists in step 402.
At step 406, the filtered lists generated at step 404 are compared and when two books have a predetermined number of words in common a link is created between the books (nodes). The predetermined number is typically selected to produce a proximity network with a useful and manageable number of links. Specifically, if the predetermined number is too small, there will be too many links between books that are only loosely similar. However, if the predetermined number is too large, a number of useful links will not be created. Preferably the predetermined number is selected so as to limit the number of links to any book to 500. At step 408, the weight of each link created at step 406 is determined from the number of words in common. For example, the weight may be the sum of the counts of the words in common.
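Steps 402 to 408 can be sketched as follows. This is a toy illustration under stated assumptions: the stop-word set is a tiny hypothetical stand-in for a real list of semantically empty words, the tokenizer is a simple regular expression, and the link weight adds the counts from both books, which is only one possible reading of "the sum of the counts of the words in common".

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical, deliberately tiny list of semantically empty words.
STOP_WORDS = frozenset({"the", "a", "an", "and", "of", "to", "in", "very", "quite"})


def word_counts(text):
    """Steps 402 and 404: count words, ignoring semantically empty ones."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in STOP_WORDS)


def content_network(books, min_common_words):
    """Steps 406 and 408: link books sharing enough words and weight links.

    books: dict mapping a book identifier to its full text.
    Returns a dict mapping (book_a, book_b) to the link weight.
    """
    counts = {book_id: word_counts(text) for book_id, text in books.items()}
    links = {}
    for a, b in combinations(sorted(counts), 2):
        common = counts[a].keys() & counts[b].keys()
        if len(common) >= min_common_words:
            # Assumed weighting: counts of each shared word in both books.
            links[(a, b)] = sum(counts[a][w] + counts[b][w] for w in common)
    return links
```

The pairwise comparison is quadratic in the number of books, so a production system would presumably need an index-based approach; the cap of 500 links per book mentioned in the text is not enforced here.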
Reference is now made to Figure 5 which illustrates an exemplary method 500 for generating a rating proximity network. At step 502, each time a user rates a book, the rating is stored in the storage module 110. The rating may be based on a standard five star rating system where the users rate the books on a scale of 0 to 5 stars, where 0 stars typically means that the user did not like the book at all and 5 stars typically means that the user really liked or even loved the book. At step 504, the ratings are analysed to identify pairs of books that have been positively rated by the same user. A positive rating may be considered to be a rating above a certain threshold. For example, a positive rating may be a rating above three stars (e.g. a rating of 4 or 5 stars). At step 506, a link is created between the pairs of books identified in step 504. At step 508, the weight of each link created at step 506 is determined from the positive user ratings. In some cases the similarity of two ratings may be more important than the actual ratings. Specifically, two identical ratings may be given a higher weight than two ratings that are higher on average but different. For example, the weight of a link may be determined in accordance with Table 7. Specifically, a weight of 5 may be assigned when a user gives both books a rating of 5; a weight of 3 may be assigned when the user gives one book a rating of 4 and the other book a rating of 5; and a weight of 4 may be assigned when the user gives both books a rating of 4.
Table 7
Rating of one book    Rating of other book    Link weight
5    5    5
4    5    3
4    4    4
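The rating-based weighting of method 500 can be sketched in Python as follows. This is an illustrative sketch only: the function name `rating_network`, the tuple layout of the input and the sample ratings are assumptions, integer star ratings are assumed, and the pair weights are the three values stated in the text for Table 7. Weights from several users who positively rated the same pair are summed.

```python
from collections import defaultdict
from itertools import combinations

# Pair weights as stated in the text for Table 7 (ratings of 4 or 5 only).
PAIR_WEIGHT = {(5, 5): 5, (4, 5): 3, (4, 4): 4}
POSITIVE_THRESHOLD = 3  # a rating above three stars counts as positive

def rating_network(ratings):
    """ratings: iterable of (user, book, stars) with integer stars.

    Returns {(book_a, book_b): weight} for books liked by the same user.
    """
    liked = defaultdict(dict)
    for user, book, stars in ratings:
        if stars > POSITIVE_THRESHOLD:          # step 504: keep positive ratings
            liked[user][book] = stars
    links = defaultdict(int)
    for books in liked.values():
        for a, b in combinations(sorted(books), 2):   # step 506: one link per pair
            key = tuple(sorted((books[a], books[b])))
            links[(a, b)] += PAIR_WEIGHT[key]         # step 508: sum over users
    return dict(links)

# Made-up ratings, for illustration only.
ratings = [
    ("u1", "Dune", 5), ("u1", "Emma", 5), ("u1", "Ulysses", 2),
    ("u2", "Dune", 4), ("u2", "Emma", 5),
]
links = rating_network(ratings)
print(links)
```

Here user u1 contributes a weight of 5 and user u2 a weight of 3 to the same pair, so the link carries a total weight of 8; the book rated 2 stars creates no link.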
Where more than one user has positively rated the same two books then a weight is calculated for each user and the final weight is the sum of the individual weights.
Reference is now made to Figure 6 which illustrates an exemplary method 600 for generating an author proximity network in accordance with an embodiment. In contrast to the content proximity network and rating proximity network described above which link books (e.g. each node is a book), an author proximity network links authors (e.g. each node is an author). At step 602, each time a user rates a book or an author, the rating is stored in the storage module 110. The rating may be based on a standard five star rating system where the users rate the books and/or the authors on a scale of 0 to 5 stars, where 0 stars typically means that the user did not like the book/author at all and 5 stars typically means that the user really liked or even loved the book/author. At step 604, the ratings are analysed to identify pairs of authors that have been positively rated by at least two users or have multiple books positively rated by at least two users. A positive rating may be considered to be a rating above a certain threshold. For example, a positive rating may be a rating above three stars (e.g. a rating of 4 or 5 stars). At step 606, a link is created between the pairs of authors identified in step 604. At step 608, the weight of each link created at step 606 is determined from the positive user ratings. For example, the weight of a link may be determined in accordance with Table 8. Specifically, a weight of 5 may be assigned when a user gives both authors a rating of 5; a weight of 3 may be assigned when the user gives one author a rating of 4 and the other author a rating of 5; and a weight of 4 may be assigned when the user gives both authors a rating of 4.
Where more than one user has positively rated the same two authors then a weight is calculated for each user and the final weight is the sum of the individual weights. At step 610, virtual links are created between books written by the pairs of authors linked in step 606.
Table 8
Rating of one author    Rating of other author    Link weight
5    5    5
4    5    3
4    4    4
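Step 610 of method 600 can be sketched as below, assuming the author links and their weights have already been computed in steps 604 to 608. The function name `virtual_book_links`, the input layout and the sample data are hypothetical, and carrying the author link weight over to every virtual book link is an assumption; the text does not say how virtual links are weighted.

```python
from itertools import product

def virtual_book_links(author_links, books_by_author):
    """Step 610: connect every book of one linked author to every book of the other.

    author_links: {(author_a, author_b): weight}
    books_by_author: {author: [book titles]}
    Reusing the author link weight for each virtual link is an assumption.
    """
    virtual = {}
    for (a, b), weight in author_links.items():
        pairs = product(books_by_author.get(a, []), books_by_author.get(b, []))
        for book_a, book_b in pairs:
            virtual[(book_a, book_b)] = weight
    return virtual

# Made-up data, for illustration only.
author_links = {("Austen", "Eliot"): 8}
books_by_author = {"Austen": ["Emma", "Persuasion"], "Eliot": ["Middlemarch"]}
virtual = virtual_book_links(author_links, books_by_author)
print(virtual)
```

With virtual links in place, an author proximity network can be queried by book in the same way as the content and rating proximity networks.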
Reference is now made to Figure 7 which illustrates a method 700 for generating a list of recommended items from a plurality of items in accordance with an embodiment. At step 702 one or more proximity networks are generated for the plurality of items (e.g. books). As described above, each proximity network defines relationships between similar items, wherein the similarity of two items is based on predefined criteria. Each proximity network may comprise nodes and links connecting the nodes, where each node represents one or more items of the plurality of items. Each link typically has an associated distance which represents the similarity between the nodes connected by the link. In some cases, a link is only created between a pair of nodes when the nodes have a minimum level of similarity according to the predefined criteria.
Preferably each proximity network uses different predefined criteria to determine the similarity between nodes. Where the items are books, the proximity networks generated at step 702 may include one or more of a content proximity network, a rating proximity network and an author proximity network.
As described above, a content proximity network determines the similarity of two books based on the content of the two books. For example, a content proximity network may determine the similarity of two books based on the number of words the two books have in common. A rating proximity network determines the similarity of two books based on ratings of the books. For example, a rating proximity network may determine that two books are more similar if two users have rated the two books the same and less similar if two users have rated the two books differently. An author proximity network determines the similarity of two authors based on the rating of the authors and/or books by the authors.
At step 704, data identifying an item is received. In some cases, the data may also include information identifying a particular user. As described above, this data is typically provided by a service module, such as service module 102, which is configured to provide item services, such as book services, to a plurality of users.
At step 706, a list of recommended items is generated by selecting a plurality of items similar to the identified item using the one or more proximity networks. This may comprise selecting the nodes from the generated proximity network(s) that are connected to the node representing the identified item by a link having a distance below a predetermined threshold. Where similar items are selected from a plurality of proximity networks, the number of items selected from a particular proximity network may be based on a proportion value assigned to that particular proximity network.
At step 708, the characteristics of each recommended item are compared to preferences of the identified user. The comparison may comprise comparing criteria characterizing each recommended item to corresponding criteria characterizing the preferences of the identified user. The comparison may involve determining a proximity value for each valid criterion characterizing the preferences of the user. Each proximity value represents the similarity between the valid criterion and the corresponding criterion characterizing the recommended item. The proximity values may then be summed to generate a distance value for the recommended item. The distance value represents the similarity between the recommended item and the preferences of the user. In some cases each proximity value is multiplied by a weighting factor prior to summing.
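The selection at step 706 can be sketched in Python as follows. The sketch is illustrative only: the function name `select_similar`, the nested-dictionary representation of a proximity network, the `total` parameter (size of the list) and the sample distances are assumptions. Each network contributes its closest neighbours under the distance threshold, up to a quota derived from its proportion value.

```python
def select_similar(item, networks, proportions, threshold, total):
    """Pick neighbours of `item` from several proximity networks.

    networks: {network_name: {item: {neighbour: distance}}}
    proportions: {network_name: fraction of the list taken from that network}
    """
    selected = []
    for name, graph in networks.items():
        quota = round(total * proportions.get(name, 0.0))
        neighbours = graph.get(item, {})
        # Keep only links with a distance below the threshold, closest first.
        close = sorted((d, n) for n, d in neighbours.items() if d < threshold)
        taken = 0
        for _, neighbour in close:
            if taken == quota:
                break
            if neighbour not in selected:   # skip items another network supplied
                selected.append(neighbour)
                taken += 1
    return selected

# Made-up distances, for illustration only.
networks = {
    "rating": {"X": {"A": 0.1, "B": 0.3, "C": 0.9, "D": 0.2, "E": 0.25, "F": 0.4}},
    "author": {"X": {"G": 0.2, "A": 0.05}},
}
proportions = {"rating": 0.8, "author": 0.2}
recommended = select_similar("X", networks, proportions, threshold=0.5, total=5)
print(recommended)
```

With an 80%/20% split over five items, four items come from the rating network and one from the author network; item C is excluded because its distance is not below the threshold.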
As described above, where the items are books the criteria may include, but are not limited to, genre, size, style signature, part of a series, time period, gender, age, price, language and user home location.
At step 710, the recommended items are ordered or ranked based on the comparison to place the items that are most likely to appeal to the identified user at the top of the list. For example, where a distance value is calculated for each recommended book and a smaller distance indicates a closer match, the books may be ordered by ascending distance value so that the books with the smallest distance values appear first.
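Steps 708 and 710 can be sketched as a weighted sum followed by a sort. This is a minimal sketch under stated assumptions: the names `distance`, `mismatch` and `rank`, the 0/1 proximity value, the use of `None` to mark a criterion that is not valid for the user, and the sample books and weights are all hypothetical; the text does not specify how an individual proximity value is computed.

```python
def mismatch(preferred, actual):
    """Hypothetical proximity value: 0 for a match, 1 otherwise."""
    return 0.0 if preferred == actual else 1.0

def distance(item, preferences, weights, proximity=mismatch):
    """Step 708: sum weighted proximity values over the user's valid criteria."""
    total = 0.0
    for criterion, preferred in preferences.items():
        if preferred is None:       # assumed marker for "not a valid criterion"
            continue
        weight = weights.get(criterion, 1.0)
        total += weight * proximity(preferred, item.get(criterion))
    return total

def rank(items, preferences, weights):
    """Step 710: smallest distance (closest match) first."""
    return sorted(items, key=lambda it: distance(it, preferences, weights))

# Made-up items and preferences, for illustration only.
books = [
    {"title": "A", "genre": "crime", "language": "fr"},
    {"title": "B", "genre": "fantasy", "language": "en"},
    {"title": "C", "genre": "fantasy", "language": "fr"},
]
preferences = {"genre": "fantasy", "language": "en", "price": None}
weights = {"genre": 2.0, "language": 1.0}
ranked = [b["title"] for b in rank(books, preferences, weights)]
print(ranked)
```

Because genre carries twice the weight of language in this example, a book that matches the genre but not the language ranks above one that matches neither.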
At step 712, a service profile is selected from a plurality of predefined service profiles to provide the best list of recommended items for the specific service that provided the data in step 704. Each service profile assigns a proportion value to each proximity network. As described in reference to Figure 1, selecting the appropriate service profile may comprise testing each of the predefined service profiles with a plurality of users and selecting the service profile with the best test performance. Testing may be performed in an intrusive or non-intrusive manner.
At step 714, a weighting factor profile is selected from a plurality of predefined weighting factor profiles to provide the best list of recommended items for the identified user. Each weighting factor profile assigns a weighting factor to each criterion. As described in reference to Figure 1, selecting the appropriate weighting factor profile may comprise testing each weighting factor profile and selecting the weighting factor profile with the best test performance. Testing may be performed in an intrusive or non-intrusive manner. It will be evident to a person of skill in the art that the steps of method 700 may be performed in another order and not all the steps must be performed.
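The profile selection of steps 712 and 714 amounts to choosing the candidate with the best test score, which can be sketched as below. The names `select_profile` and `score`, the profile contents and the `observed` scores are hypothetical; the text does not define how test performance is measured, so the scores here merely stand in for whatever measure is used.

```python
def select_profile(profiles, score):
    """Return the name of the predefined profile with the best test score."""
    return max(profiles, key=lambda name: score(name, profiles[name]))

# Hypothetical service profiles: a proportion value per proximity network.
service_profiles = {
    "content_only": {"content": 1.0, "rating": 0.0, "author": 0.0},
    "rating_heavy": {"content": 0.0, "rating": 0.8, "author": 0.2},
}

# Hypothetical test results, e.g. the share of recommendations users followed.
observed = {"content_only": 0.12, "rating_heavy": 0.19}

def score(name, profile):
    """Stand-in for running a test with real users and measuring the outcome."""
    return observed[name]

best = select_profile(service_profiles, score)
print(best)
```

The same function could select a weighting factor profile at step 714 by passing profiles that map each criterion to a weighting factor instead of each proximity network to a proportion value.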

Claims

1. A computer-implemented method to generate a list of recommended items from a plurality of items, the method comprising: generating a proximity network, the proximity network defining relationships between similar items of the plurality of items, wherein the similarity of two items is based on predefined criteria; receiving data identifying an item of the plurality of items; and generating the list of recommended items by selecting a plurality of items similar to the identified item using the proximity network.
2. The computer-implemented method of claim 1, wherein the proximity network comprises nodes and links connecting pairs of nodes, each node representing one or more items of the plurality of items, each link having a distance wherein the distance represents the similarity between the nodes connected by the link.
3. The computer-implemented method of claim 2, wherein generating the list of recommended items comprises selecting the nodes from the proximity network that are connected to the node representing the identified item by a link with a distance value below a predetermined threshold.
4. The computer-implemented method of claim 2 or claim 3, wherein a pair of nodes is only connected by a link if the nodes have a minimum level of similarity according to the predefined criteria.
5. The computer-implemented method of any one of claims 1 to 4 comprising generating a plurality of proximity networks, wherein each proximity network is based on different predefined criteria.
6. The computer-implemented method of claim 5, wherein generating the list of recommended items comprises selecting a number of items similar to the identified item using the plurality of proximity networks.
7. The computer-implemented method of claim 6, wherein the number of nodes selected from a particular proximity network of the plurality of proximity networks is based on a proportion value assigned to the particular proximity network.
8. The computer-implemented method of claim 7, further comprising selecting a service profile from a plurality of predefined service profiles, each service profile assigning a proportion value to each proximity network of the plurality of proximity networks.
9. The computer-implemented method of claim 8, wherein selecting the service profile comprises testing each predefined service profile with a plurality of users and selecting the service profile with the best test performance.
10. The computer-implemented method of any one of claims 1 to 9, wherein the items are books and the proximity network is one of a content proximity network, a rating proximity network and an author proximity network.
11. The computer-implemented method of claim 10, wherein the proximity network is a content proximity network wherein each node is a book and the similarity of two books is based on the content of the two books.
12. The computer-implemented method of claim 11, wherein the similarity of two books is based on the number of words the two books have in common.
13. The computer-implemented method of claim 10, wherein the proximity network is a rating proximity network wherein each node is a book and the similarity of two books is based on ratings of the books.
14. The computer-implemented method of claim 13, wherein two books are more similar if two users have rated the two books the same and less similar if two users have rated the two books differently.
15. The computer-implemented method of claim 10, wherein the proximity rating network is an author proximity network wherein each node is an author and two authors are similar if multiple books by the authors have been similarly rated by two or more users.
16. The computer-implemented method of any one of claims 1 to 15, further comprising: receiving data identifying a user; comparing characteristics of each recommended item to preferences of the identified user; and ordering the recommended items based on the comparison to place the items that are most likely to appeal to the identified user at the top of the list.
17. The computer-implemented method of claim 16, wherein the comparison comprises comparing criteria characterizing each recommended item to corresponding criteria characterizing the preferences of the identified user.
18. The computer-implemented method of claim 17, wherein the comparison further comprises for each recommended item, determining a proximity value for each valid criterion characterizing the preferences of the user, the proximity value representing the similarity between the valid criterion and the corresponding criterion characterizing the recommended item.
19. The computer-implemented method of claim 18, wherein the comparison further comprises for each recommended item, summing the proximity values for each valid criterion to generate a distance value, the distance value representing the similarity between the recommended item and the preferences of the user.
20. The computer-implemented method of claim 19, wherein each proximity value is multiplied by a weighting factor prior to summing.
21. The computer-implemented method of claim 20, comprising selecting a weighting factor profile from a plurality of predefined weighting factor profiles, each predefined weighting factor profile assigning a weighting factor to each criterion.
22. The computer-implemented method of claim 8, wherein selecting the weighting factor profile comprises testing each predefined weighting factor profile and selecting the weighting factor profile with the best test performance.
23. The computer implemented method of any one of claims 17 to claim 22, wherein the items are books and the criteria comprise at least one of: genre, size, style signature, part of a series, time period, gender, age, price, language and user home location.
24. A computer-implemented system to generate a list of recommended items from a plurality of items, the system comprising: a storage module configured to generate and store a proximity network, the proximity network defining relationships between similar items of the plurality of items, wherein the similarity of two items is based on predefined criteria; and a proximity module configured to:
receive data identifying an item of the plurality of items;
generate the list of recommended items by selecting a plurality of items similar to the identified item using the proximity network; and
output the list of recommended items.
25. The computer-implemented system of claim 24, wherein the proximity network comprises nodes and links connecting pairs of nodes, each node representing one or more items of the plurality of items, each link having a distance value wherein the distance value represents the similarity between the nodes connected by the link.
26. The computer-implemented system of claim 25, wherein the proximity module is configured to generate the list of recommended items by selecting the nodes from the proximity network that are connected to the node representing the identified item by a link with a distance value below a predetermined threshold.
27. The computer-implemented system of claim 25 or claim 26, wherein a pair of nodes is only connected by a link if the nodes have a minimum level of similarity according to the predefined criteria.
28. The computer-implemented system of any one of claims 24 to 27, wherein the storage module is configured to generate a plurality of proximity networks, wherein each proximity network is based on different predefined criteria.
29. The computer-implemented system of claim 28, wherein the proximity module is configured to generate the list of recommended items by selecting a number of items similar to the identified item using the plurality of proximity networks.
30. The computer-implemented system of claim 29, wherein the number of nodes selected from a particular proximity network of the plurality of proximity networks is based on a proportion value assigned to the particular proximity network.
31. The computer-implemented system of claim 30, further comprising a learning module configured to select a service profile from a plurality of predefined service profiles, each service profile assigning a proportion value to each proximity network of the plurality of proximity networks.
32. The computer-implemented system of claim 31, wherein the learning module is configured to select the service profile by testing each predefined service profile with a plurality of users and selecting the service profile with the best test performance.
33. The computer-implemented system of any one of claims 24 to 32, wherein the items are books and the proximity network is one of a content proximity network, a rating proximity network and an author proximity network.
34. The computer-implemented system of claim 33, wherein the proximity network is a content proximity network wherein each node is a book and the similarity of two books is based on the content of the two books.
35. The computer-implemented system of claim 34, wherein the similarity of two books is based on the number of words the two books have in common.
36. The computer-implemented system of claim 33, wherein the proximity network is a rating proximity network wherein each node is a book and the similarity of two books is based on ratings of the books.
37. The computer-implemented system of claim 36, wherein two books are more similar if two users have rated the two books the same and less similar if two users have rated the two books differently.
38. The computer-implemented system of claim 33, wherein the proximity rating network is an author proximity network wherein each node is an author and two authors are similar if books by the authors have been similarly rated by two or more users.
39. The computer-implemented system of any one of claims 33 to 38, further comprising a navigation module configured to: receive data identifying a user; compare characteristics of each recommended item to preferences of the identified user; and order the recommended items based on the comparison to place the items that are most likely to appeal to the identified user at the top of the list.
40. The computer-implemented system of claim 39, wherein the comparison comprises comparing criteria characterizing each recommended item against corresponding criteria characterizing the preferences of the identified user.
41. The computer-implemented system of claim 40, wherein the comparison further comprises for each recommended item, determining a proximity value for each valid criterion characterizing the preferences of the user, the proximity value representing the similarity between the valid criterion and the corresponding criterion characterizing the recommended item.
42. The computer-implemented system of claim 41, wherein the comparison further comprises for each recommended item, summing the proximity values for each valid criterion to generate a distance value, the distance value representing the similarity between the recommended item and the preferences of the user.
43. The computer-implemented system of claim 42, wherein each proximity value is multiplied by a weighting factor prior to summing.
44. The computer-implemented system of claim 43, further comprising a learning module configured to select a weighting factor profile from a plurality of predefined weighting factor profiles, each predefined weighting factor profile assigning a weighting factor to each criterion.
45. The computer-implemented system of claim 44, wherein the learning module is configured to select the weighting factor profile by testing each predefined weighting factor profile and selecting the weighting factor profile with the best test performance.
46. The computer implemented system of any one of claims 40 to claim 45, wherein the items are books and the criteria comprise at least one of: genre, size, style signature, part of a series, time period, gender, age, price, language and user home location.
47. Apparatus for generating a list of recommended items from a plurality of items comprising one or more computers each comprising one or more processors, the apparatus being configured to implement the system of any one of claims 24 to 46.
48. Computer readable medium comprising instructions when implemented by a computer cause the computer to perform the method of any one of claims 1 to 23.
PCT/GB2013/050058 2012-01-13 2013-01-14 System and method for generating a list of recommended items WO2013104922A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP13702250.5A EP2803028A1 (en) 2012-01-13 2013-01-14 System and method for generating a list of recommended items

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GB1200576.5 2012-01-13
GB201200576A GB2498384A (en) 2012-01-13 2012-01-13 System and methods to generate a list of recommended items

Publications (1)

Publication Number Publication Date
WO2013104922A1 (en) 2013-07-18

Family

ID=45813986

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/GB2013/050058 WO2013104922A1 (en) 2012-01-13 2013-01-14 System and method for generating a list of recommended items

Country Status (3)

Country Link
EP (1) EP2803028A1 (en)
GB (1) GB2498384A (en)
WO (1) WO2013104922A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108352028A (en) * 2015-11-09 2018-07-31 株式会社电通 CRM Customer Relationship Management device and method
CN109992602A (en) * 2019-04-02 2019-07-09 海南颖川科技有限公司 Juvenile's digital reading guiding apparatus

Also Published As

Publication number Publication date
GB2498384A (en) 2013-07-17
GB201200576D0 (en) 2012-02-29
EP2803028A1 (en) 2014-11-19

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 13702250; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
WWE Wipo information: entry into national phase (Ref document number: 2013702250; Country of ref document: EP)