WO2002021335A1 - Automatic recommendation of products using latent semantic indexing of content - Google Patents

Automatic recommendation of products using latent semantic indexing of content

Info

Publication number
WO2002021335A1
WO2002021335A1 PCT/US2001/025899 US0125899W WO0221335A1 WO 2002021335 A1 WO2002021335 A1 WO 2002021335A1 US 0125899 W US0125899 W US 0125899W WO 0221335 A1 WO0221335 A1 WO 0221335A1
Authority
WO
WIPO (PCT)
Prior art keywords
items
user
textual
item
textual items
Prior art date
Application number
PCT/US2001/025899
Other languages
English (en)
Inventor
Clifford A. Behrens
Dennis E. Egan
Yu-Yun Ho
Carol Lochbaum
Mark Rosenstein
Original Assignee
Telcordia Technologies, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US09/653,917 (US6615208B1)
Application filed by Telcordia Technologies, Inc.
Publication of WO2002021335A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31 Indexing; Data structures therefor; Storage structures
    • G06F16/313 Selection or weighting of terms for indexing

Definitions

  • This invention relates generally to a procedure for selecting a product by a customer and, more particularly, to methodologies and concomitant circuitry for using latent semantic structure of content ascribed to the products to provide automatic recommendations to the customer.
  • the current state of the art with respect to item (1) above is composed of two techniques for providing recommendations.
  • the first is to use a domain expert to handcraft recommendations for a specific item.
  • an expert proceeds through a series of items, and notates for each item which additional items should be recommended when a customer chooses the original item.
  • This technique is laborious and is not automatic; for instance, when a new item is introduced, the expert must be consulted again to generate recommendations for the new item.
  • An expert can also provide recommendations to be given for a set of items. While this is possible to consider in the case of a small number of sets, an expert will be quickly overwhelmed in any attempt to provide a comprehensive set of recommendations given the large number of possible combinations of items.
  • the second prior art technique in recommendations manipulates customer preference data to provide a recommendation.
  • U.S. Patent No. 4,348,740 entitled "Method and portable apparatus for comparison of stored sets of data"
  • U.S. Patent No. 4,870,579 entitled “System and method of predicting subjective reactions”
  • Other techniques have built upon this latter reference to promote alternative techniques of using preference data to provide recommendations.
  • the second is when the recommendation is for a task or a situation where preferences are not the overriding concern. For instance, no matter how well-liked a "bicycle" is, if the task is moving furniture, a less preferred "truck” would be a more appropriate recommendation than any type of "bicycle”.
  • the second thread of pertinent background subject matter is the use of relevance feedback in information retrieval tasks.
  • Relevance feedback consists of the idea of modifying a subsequent information query by using feedback from the user as to the relevance of information retrieved in a previous query. For instance, a user enters a query, and an information retrieval system returns a set of responses. The user then indicates which of these responses is most relevant to the query, and the query is modified to use this relevance information in producing another query.
  • the first use of relevance feedback is attributed to Rocchio in the reference "Document retrieval systems - optimization and evaluation", a Doctoral Dissertation by Rocchio J.J. Jr.
  • the prior art is devoid of a method in which the two threads of pertinent prior art are coalesced, whereby relevance feedback is used to automatically provide recommendations.
  • a method for automatically recommending textual items stored in a database to a user of a computer-implemented service, the user having selected one of the items includes: (a) applying a latent semantic algorithm to the textual items to establish a conceptual similarity among the textual items and the selected item; and (b) outputting to the user a recommended set of nearest items to the selected item based upon the conceptual similarity.
  • a system for automatically recommending textual items stored in a database to a user of a computer-implemented service includes: (a) a processor for applying a latent semantic algorithm to the textual items to establish a conceptual similarity among the textual items and one of the items selected by the user; and (b) means for outputting to the user a recommended set of nearest items to the selected item with reference to the conceptual similarity among the textual items and the selected item.
  • FIG. 1 depicts a screen display, presented to the customer by the system in accordance with the present invention, the screen display facilitating input by a customer to initiate a search for an item using keywords;
  • FIG. 2 depicts a screen display presenting the response by the system to the search request of FIG. 1 as initiated by the customer;
  • FIG. 3 depicts a screen display presenting the response by the system to the customer's request for more detailed information about one item displayed in FIG. 2;
  • FIG. 4 depicts a screen display presenting the response by the system to the customer's request to add the item detailed in FIG. 3 to the customer's shopping cart, including recommendations presented by the system based upon the item so selected by the customer;
  • FIG. 5 depicts a flow diagram of the method in accordance with the present invention to thereby determine the recommendations presented to the customer in FIG. 4;
  • FIG. 6 is a plot of the "term" coordinates and the "document" coordinates based on a two-dimensional singular value decomposition of an original "term-by-document" matrix;
  • FIG. 7 depicts a flow diagram of the method in accordance with one illustrative embodiment of the present invention relating to document abstracts to generate recommendations of pertinent items to the customer based upon the customer's selection actions;
  • FIG. 8 is a flow diagram of the method of the present invention in its most generic form for generating and storing a "nearest" items file;
  • FIG. 9 is a flow diagram of the method of the present invention in its most generic form for generating a "nearest" items file useful for real-time and non real-time applications.
  • FIG. 10 is a high-level block diagram of hardware components for an illustrative embodiment of the present invention. To facilitate understanding, identical reference numerals have been used, where possible, to designate elements that are common to the figures.
  • the system should be visualized as a Web server accessible from a purchaser's personal computer (PC) over the Internet; the PC includes a monitor for displaying Web pages on the monitor's screen, a keyboard, and a "mouse".
  • the system is configured with a set of application programs for servicing the purchaser's on-line inputs to the system from the PC.
  • screen display 100 which appears on the purchaser's PC monitor in response to a request by the purchaser to access the Search aspect of the system (such as by clicking on a "SEARCH” request button on the on-line merchandiser's home Web page (not shown)).
  • the Web page shown on display 100 results from clicking on a "SEARCH WITH KEYWORDS” region of such merchandiser's home page, as repeated for reminder purposes on display 100.
  • the purchaser is prompted to enter keywords into "boxed" display area 102, which is empty when initially displayed by the system.
  • the words "network equipment building system” are keywords typed by the purchaser into display area 102.
  • FIG. 2 there is shown screen display 200 which results from submitting the Search request of FIG. 1 to the system.
  • box area 201 repeats the keywords input by the purchaser for ready reference.
  • Document category titles 210, 220, and 230 show, respectively, the documents located in the search, grouped by category title. For instance, under document category 220, "Family of Requirements", 2 documents were located as a result of the search, namely, FR-440 entitled "Transport Systems Generic Requirements - April 1999", and FR-64CD-1-1USER entitled "LATA Switching Systems Generic Requirements - January 1999". Similarly, under document category 230, reference numeral 231 identifies the single document located in the search, namely, GR-2930 entitled "Network Equipment ... and Data Centers - November 1996".
  • each document is presented on screen display 200 as a hypertext link, so that the purchaser needs only to "click on" the document, either its document reference number (e.g., GR-2930) or its title (e.g., "Network Equipment ... and Data Centers - November 1996"). It is further supposed that the purchaser calls into view the details of the single document under the "Generic Requirements" document category 230 by clicking on GR-2930.
  • the detailed information pertaining to this document presented to the purchaser as a result of clicking on GR-2930 is shown in screen display 300 of FIG. 3.
  • the ABSTRACT of the document is displayed in the upper portion of screen display 300.
  • reference numeral 310 is ORDERING INFORMATION for this document.
  • Reference numeral 311 points to the MEDIA box, and an associated box filled in with the term "Paper", which summarizes the medium in which the document is available. In addition, reference numeral 312 points to the PRICE box, and an associated box filled in with the term "$150.00", which summarizes the cost of the document.
  • Reference numeral 313 points to the PAGES box, and an associated box filled in with the term "300", which summarizes the size of the document.
  • reference numeral 314 points to the ACTION box, and an associated box having the term "Add Item to Shopping Basket" displayed, which summarizes a possible action which may be taken by the purchaser.
  • screen display 400 of FIG. 4 depicts the result of this click-on activity.
  • Reference numeral 401 indicates that the screen display is the SHOPPING BASKET for reminder purposes.
  • the portion below the heading SHOPPING BASKET displays the contents of the shopping basket, which to this point is the single document shown by its title "Network Equipment ... and Data Centers - November 1996"; each item displayed (reference numeral 411) is listed with its price (reference numeral 412). For this single document, the price is $150.00 (reference numeral 413), as also displayed earlier in FIG. 3.
  • the bottom half of screen display 400, under the heading RELATED ITEMS YOU MAY WISH TO CONSIDER (reference numeral 420), displays three system-recommended documents as generated by an algorithm carried out by the on-line merchandiser's system; the algorithm is transparent to the purchaser.
  • the recommendation for this illustration is based upon the latest document placed into the shopping basket by the purchaser. The method to arrive at the recommendation is discussed in detail in the sequel.
  • FIG. 5 there is shown flow diagram 500 which summarizes the sequence of steps carried out to present and display the information of FIGS. 1-4.
  • the processing blocks 510-570 are described as follows.
  • Processing Block 510: decide whether to use the content or text surrogates for the content.
  • the products/items are documents. If the system implementers do not have the full text of the documents, it is possible to use the abstracts of the documents as surrogates for the documents themselves.
  • Processing Block 520: decide what, if any, criteria will be used to determine if a given product will be indexed.
  • not all documents may be deemed as "good” recommendations. For instance, it may not be advantageous to recommend free sales material.
  • certain products may be available under different licensing terms - for instance, there may be separate products with different right-to-use clauses, e.g., there are separate product numbers for use by one person, use by 2-5 people, use by 10-100 people, and so forth, so it may be desirable to only recommend the use-by-one-person product.
  • Processing Block 530: assemble the content or text surrogates for the content for indexing.
  • the abstracts of the documents were embedded in a software file, which was a representation of the catalog available to purchasers.
  • the criteria as arrived at via processing block 520 are then used to filter out any documents that did not meet the criteria, that is, to arrive at "good" candidates for recommendations.
  • Processing Block 540: index, verify, and determine a threshold.
  • the abstracts were indexed using the LSI algorithm.
  • the LSI algorithm generates a new vector space with vector positions for all the indexed terms and documents, where the cosine between item vectors is a measure of the items' semantic similarity.
  • the set of documents is then sampled, and checked for the closest documents, to make sure that the parameter choices led to "reasonable" results. As part of this sampling process it was noted that items with a cosine score below 0.6 were unlikely to be relevant to the item, so a threshold of 0.6 was established.
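  • As an illustration of the scoring step described above, the following is a minimal sketch of a cosine score and the 0.6 cutoff. The function names and the two-dimensional vectors are hypothetical; the text does not specify how the scores are computed in the deployed system.

```python
import math

THRESHOLD = 0.6  # cutoff reported in the text


def cosine(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def passes_threshold(u, v, threshold=THRESHOLD):
    """True when the pair is similar enough to be kept as a candidate recommendation."""
    return cosine(u, v) >= threshold


# Hypothetical item vectors, for illustration only.
item_a = [1.0, 0.0]
item_b = [0.8, 0.6]  # close in direction to item_a
item_c = [0.0, 1.0]  # orthogonal to item_a
```

  • With these made-up vectors, item_b passes the cutoff relative to item_a and item_c does not.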
  • Processing Block 550: generate a Recommendations table.
  • a set of the ten closest items for each item in the catalog was generated. The threshold determined in processing block 540 was then applied to this set, eliminating any items with a score below 0.6. Next, a table was compiled where each row contained the item and the recommended items. For this example, one file for each item was compiled in the scaling that contains the recommended items. This file serves as the database of recommended items for the interface.
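  • The table-building step can be sketched as follows, assuming each catalog item already has a vector from the indexing step. The function name, the dictionary layout, and the item identifiers are assumptions made for illustration; only the "ten closest" and "0.6" figures come from the text.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def build_recommendation_table(vectors, top_n=10, threshold=0.6):
    """Map each item id to its closest items: take the top_n by cosine, then drop low scores."""
    table = {}
    for item_id, vec in vectors.items():
        scored = sorted(
            ((cosine(vec, other), other_id)
             for other_id, other in vectors.items() if other_id != item_id),
            key=lambda pair: (-pair[0], pair[1]),  # best score first, ties by id
        )
        closest = scored[:top_n]
        table[item_id] = [oid for score, oid in closest if score >= threshold]
    return table


# Hypothetical catalog vectors, for illustration only.
catalog = {"item-A": [1.0, 0.0], "item-B": [1.0, 1.0], "item-C": [0.0, 1.0]}
table = build_recommendation_table(catalog)
```

  • Each row of the resulting table could then be written out as one file per item, as the text describes.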
  • Processing Block 560: decide where in the shopping experience to provide recommendations.
  • the recommendation set determined by the newly added item was filtered to not include any items that were already in the customer's shopping basket.
  • the recommendations are displayed on the same screen display as the shopping basket and show the titles of the recommended items (e.g., 421-423 in FIG. 4), which are implemented as hyperlinks. Clicking on the link takes the customer to more information and the opportunity to purchase the recommended item.
  • space constraints on the shopping basket screen display e.g., a web page
  • Processing Block 570: implement Recommendations.
  • the recommendation files are accessed to retrieve the compiled recommendations from the stored files.
  • the recommendations are filtered to not repeat any items already in the basket, and trimmed to show at most 4 items.
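  • The display-time filtering just described (skip items already in the basket, show at most 4) can be sketched in a few lines. The function name and item identifiers are hypothetical.

```python
def recommendations_for_display(recommended, basket, max_items=4):
    """Drop items already in the basket, keep the original ranking, trim to max_items."""
    in_basket = set(basket)
    shown = [item for item in recommended if item not in in_basket]
    return shown[:max_items]


# Hypothetical identifiers, for illustration only.
stored = ["item-1", "item-2", "item-3", "item-4", "item-5", "item-6"]
basket = ["item-2"]
shown = recommendations_for_display(stored, basket)
```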
  • The contents of Table 1 are used to illustrate how semantic structure analysis works and to point out the differences between this method and conventional keyword matching.
  • c1 Human machine interface for Lab ABC computer applications
  • c2 A survey of user opinion of computer response time
  • c3 The EPS user interface management system
  • c4 Systems and human systems engineering testing of EPS-2
  • c5 Relation of user-perceived response time to error measurement
  • m1 The generation of random, binary, unordered trees
  • m2 The intersection graph of paths in trees
  • m3 Graph minors IV: widths of trees and well-quasi-ordering
  • m4 Graph minors: a survey
  • a file of text objects is composed of nine technical documents with titles c1-c5 concerned with human/computer interactions, and titles m1-m4 concerned with mathematical graph theory.
  • c1, c2, and c4 would be returned since these titles contain at least one keyword from the user request.
  • c3 and c5, while related to the query, would not be returned since they share no words in common with the request. It is now shown how latent semantic structure analysis treats this request to return titles c3 and c5.
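  • The keyword-matching behavior described above can be reproduced with a small sketch. The query string "human computer interaction" is an assumption (the text only states that "human" and "computer" are the query terms present in the space), and the tokenizer is a deliberately simple stand-in for whatever a conventional keyword system would use.

```python
import re

# Titles from Table 1.
TITLES = {
    "c1": "Human machine interface for Lab ABC computer applications",
    "c2": "A survey of user opinion of computer response time",
    "c3": "The EPS user interface management system",
    "c4": "Systems and human systems engineering testing of EPS-2",
    "c5": "Relation of user-perceived response time to error measurement",
    "m1": "The generation of random, binary, unordered trees",
    "m2": "The intersection graph of paths in trees",
    "m3": "Graph minors IV: widths of trees and well-quasi-ordering",
    "m4": "Graph minors: a survey",
}


def tokenize(text):
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def keyword_match(query, titles):
    """Return ids of titles sharing at least one exact word with the query."""
    query_words = tokenize(query)
    return [doc_id for doc_id, title in titles.items() if query_words & tokenize(title)]


# Assumed query, for illustration only.
matches = keyword_match("human computer interaction", TITLES)
```

  • Under these assumptions the exact-word match finds c1, c2, and c4 but misses c3 and c5, which is the gap the latent semantic analysis is meant to close.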
  • Table 2 depicts the "term-by-document" matrix for the nine technical document titles. Each cell, (i,j), is the frequency of occurrence of term i in document j. This basic term-by-document matrix or a mathematical transformation thereof is used as input to the statistical procedure described below.
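  • A term-by-document frequency matrix of the kind described can be built as in the sketch below. Keeping only terms that occur in at least two documents, and doing no stop-word removal, are assumptions made for this example; the text does not state the term-selection rule used for Table 2.

```python
import re
from collections import Counter


def term_by_document(docs, min_docs=2):
    """Return (terms, matrix) where matrix[i][j] is the count of terms[i] in docs[j]."""
    tokenized = [re.findall(r"[a-z0-9]+", doc.lower()) for doc in docs]
    # Number of documents each term appears in (assumed selection criterion).
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    terms = sorted(term for term, n in doc_freq.items() if n >= min_docs)
    counts = [Counter(tokens) for tokens in tokenized]
    matrix = [[count[term] for count in counts] for term in terms]
    return terms, matrix


# Three short documents, for illustration only.
docs = [
    "graph minors a survey",
    "the intersection graph of paths in trees",
    "a survey of user opinion",
]
terms, Y = term_by_document(docs)
```

  • Rows of Y correspond to terms and columns to documents, matching the (i,j) cell convention in the text.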
  • FIG. 6 is a two-dimensional graphical representation of the two largest dimensions resulting from the statistical processing via singular value decomposition (SVD). Both document titles and the terms used in them are fit into the same space. Terms are shown as circles and labeled by number. Document titles are represented by squares with the numbers of constituent terms indicated parenthetically. The cosine or dot product between two objects (terms or documents) describes their estimated similarity. In this representation, the two types of documents form two distinct groups: all the mathematical graph theory documents (m1-m4) occupy the same region in space (basically along Dimension 1 of FIG. 6), whereas a quite distinct group is formed for human/machine interaction titles (c1-c5) (essentially along Dimension 2 of FIG. 6).
  • SVD singular value decomposition
  • the query is first folded into this two-dimensional space using those terms that occur in the space (namely, "human” and "computer”).
  • the query vector is located in the direction of the weighted average of these constituent terms, and is denoted by a directional arrow labeled "Q" in FIG. 6.
  • a measure of the closeness or similarity is related to the angle between the query vector and any given term or document vector.
  • One such measure is the cosine between the query vector and a given term or document vector.
  • the cosine between the query vector and each of the c1-c5 titles is greater than 0.90; the angle corresponding to the cosine value of 0.90 with the query vector is shown by dashed lines in FIG. 6.
  • documents c3 and c5 would be returned as matches to the user query, even though they share no common terms with the query. This is because the latent semantic structure, as captured by the depiction of FIG. 6, fits the overall pattern of usage across documents.
  • the "term-by-document" matrix of Table 2 is decomposed using Singular Value Decomposition (SVD).
  • a reduced SVD is employed to approximate the original matrix in terms of a much smaller number of orthogonal dimensions. This reduced SVD is used for retrieval; it describes major associational structures in the matrix but it ignores small variations in word usage.
  • the number of dimensions to adequately represent a particular domain is largely an empirical matter. If the number of dimensions is too large, random noise or incidental variations in word usage will be modeled. If the number of dimensions is too small, significant semantic distinctions will remain un-captured. For diverse information sources, 100 or more dimensions may be needed.
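  • To make the idea of a reduced SVD concrete, the sketch below extracts the k largest singular triplets of a small matrix using power iteration with deflation. This is a swapped-in textbook technique chosen so that the example needs only the standard library; it is not stated to be the method used in the patent, and a production system would more likely call a numerical library routine. All names are hypothetical.

```python
import math
import random


def transpose(a):
    return [list(col) for col in zip(*a)]


def matvec(a, x):
    return [sum(row[j] * x[j] for j in range(len(x))) for row in a]


def top_singular_triplets(y, k, iters=200, seed=0):
    """Return up to k (sigma, u, v) triplets of y, largest singular value first."""
    rng = random.Random(seed)
    residual = [row[:] for row in y]
    triplets = []
    for _ in range(k):
        rt = transpose(residual)
        v = [rng.random() + 0.1 for _ in range(len(rt))]  # positive start vector
        for _ in range(iters):
            w = matvec(rt, matvec(residual, v))  # one step of power iteration on R^T R
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        u_raw = matvec(residual, v)
        sigma = math.sqrt(sum(x * x for x in u_raw))
        if sigma < 1e-12:  # nothing left to extract
            break
        u = [x / sigma for x in u_raw]
        triplets.append((sigma, u, v))
        # Deflate: subtract the rank-1 component just found.
        residual = [[residual[i][j] - sigma * u[i] * v[j] for j in range(len(v))]
                    for i in range(len(u))]
    return triplets


# Tiny example matrix whose singular values are 3 and 1.
triplets = top_singular_triplets([[3.0, 0.0], [0.0, 1.0]], k=2)
```

  • Keeping only the first k triplets corresponds to choosing the dimensionality of the reduced space; how large k should be remains the empirical question raised in the text.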
  • the "term-by-document" matrix, denoted Y, is decomposed into three other matrices, namely, the term matrix (TERM), the document matrix (DOCUMENT), and a diagonal matrix of singular values (DIAGONAL), as follows:
  • $Y_{t,d} = \mathrm{TERM}_{t,m}\,\mathrm{DIAGONAL}_{m,m}\,\mathrm{DOCUMENT}_{m,d}$,
  • where Y is the original t-by-d matrix,
  • TERM is the t-by-m term matrix with unit-length orthogonal columns,
  • DOCUMENT is the m-by-d document matrix with unit-length orthogonal rows,
  • DIAGONAL is the m-by-m diagonal matrix of singular values, typically ordered by magnitude, and
  • the dimensionality of the full solution, denoted m, is the rank of the t-by-d matrix, that is, $m \le \min(t,d)$.
  • Tables 3, 4, and 5 below show the TERM and DOCUMENT matrices and the diagonal elements of the DIAGONAL matrix, respectively, as found via SVD.
  • Any rectangular matrix Y of t rows and d columns can be decomposed into a product of three other matrices: $Y = T_0 S_0 D_0^{T}$,
  • where $D_0^{T}$ is the transpose of $D_0$, and such that $T_0$ and $D_0$ have unit-length orthogonal columns
  • $T_0$ and $D_0$ are the matrices of left and right singular vectors and $S_0$ is the diagonal matrix of singular values
  • the new matrix $Y_R$ is the matrix of rank k closest in the least squares sense to Y
  • k is chosen for each application; it is generally such that k > 100 for a collection of 100-3000 text objects (e.g., documents).
  • the rows of the reduced matrices T and D may be taken as vectors representing terms and documents, respectively, in a k-dimensional space.
  • dot products between points in the space can be used to access and compare objects.
  • There are basically three types of comparisons of interest: (i) those comparing two terms; (ii) those comparing two documents or text objects; and (iii) those comparing a term and a document or text object.
  • a text object or a data object is general, whereas a document is a specific instance of a text object or a data object.
  • text or data objects are stored in the computer system in files.
  • the rows of the DS matrix are taken as vectors representing the documents, and the comparison is via the dot product between the rows of the DS matrix.
  • the i,j cell of YR may
  • $D_q$ is derived such that $D_q$ can be used just like a row of D in the
  • $D_q$ may be used like any row of D and, appropriately scaled by $S$ or $S^{1/2}$, can be used like a usual document vector for making "within" and "between" comparisons. It is to be noted that if the measure of similarity to be used in comparing the query against all documents is one in which only the angle between the vectors is important (such as the cosine measure), there is no difference for comparison purposes between placing the query at the vector average or the vector sum of the terms.
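  • For reference, the fold-in of a query as a pseudo-document is usually written in the latent semantic indexing literature as shown below. The symbol $Y_q$ (the t-by-1 vector of term frequencies in the query) is assumed notation, since the defining expression does not survive in this extract. The second line spells out why the vector sum and the vector average are interchangeable under the cosine measure.

```latex
% Fold-in of a query as a pseudo-document (standard LSI form; Y_q is assumed notation).
% T : t-by-k reduced term matrix,  S : k-by-k diagonal matrix of singular values
D_q = Y_q^{T}\, T\, S^{-1}

% Cosine is unchanged by a positive rescaling of one vector, so the sum x of the
% n constituent term vectors and their average x/n give the same score against y:
\cos\!\left(\tfrac{1}{n}\,x,\; y\right)
  = \frac{\tfrac{1}{n}\; x \cdot y}{\tfrac{1}{n}\,\lVert x \rVert\,\lVert y \rVert}
  = \cos(x, y), \qquad n > 0
```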
  • FIG. 7 amplifies on and/or encapsulates certain method steps of FIG. 5 that are particular to the illustrative example of FIGS. 1-4.
  • Processing block 710 depicts that the starting point in the process of FIG. 7 is a catalog of abstracts, with each abstract being representative of a corresponding item (e.g., a full document).
  • processing block 720 is executed to filter the catalog of abstracts to yield a reduced set of abstracts for processing by the latent semantic indexing algorithm — recall that documents are culled so that only "good" recommendations are offered to the purchaser.
  • processing block 730 is invoked to apply the latent semantic indexing algorithm to the reduced set of abstracts to produce a vector space representation of the reduced set of abstracts.
  • the "term-by-document" matrix Y is formed from the terms in the reduced set of abstracts (which are now the documents). Then Singular Value Decomposition is applied, and the dimensionality of the space is selected to generate the vector space representation of the reduced set of abstracts, that is, the k largest singular values are selected to yield the approximation matrix YR .
  • processing block 740 is used to find so-called “nearest” abstracts for each abstract in the reduced set of abstracts.
  • the type of comparison utilized is the "Two Documents" comparison already discussed above. Recall in this case, the dot product is between two column vectors of Y.
  • the document-by-document dot product is approximated by:
  • the rows of the DS matrix are taken as vectors representing the documents, and the comparison is via the dot product between the rows of the DS matrix.
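  • The expression for the document-by-document comparison does not survive in this extract. Assuming the reduced model $Y_R = T S D^{T}$ with $T^{T} T = I$ and $S$ diagonal, as in the standard formulation, it can be recovered as follows, which also shows why the rows of the DS matrix serve as document vectors.

```latex
% Document-by-document dot products under the rank-k model Y_R = T S D^T.
Y_R^{T}\, Y_R
  = (T S D^{T})^{T} (T S D^{T})
  = D\, S\, T^{T} T\, S\, D^{T}
  = D\, S^{2} D^{T}
  = (D S)(D S)^{T}

% Hence cell (i, j) is the dot product of rows i and j of the matrix D S.
```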
  • the cosine measure is used to gauge the closeness of all other abstracts to the given abstract under consideration, that is, one-by-one each abstract is taken as a reference document and the cosine measure of all other abstracts to the given abstract is computed.
  • the "nearest" abstracts are determined based upon pre-determined criteria, such as, the cosine being no less than 0.6 and selection of only the four closest abstracts.
  • the "nearest" abstracts are stored in a file for later recall during the actual "search” activity by the purchaser, as evidenced by processing block 750. Recommendations to a purchaser are expedited because the "nearest" abstracts file is generated off-line and stored, that is, the only real-time execution activity required of the on-line system is an access to the file of stored "nearest" abstracts when a purchaser, for example, adds an item to the shopping basket. It is also clear that if a new document is entered into the system and made available to the purchaser, the system is scalable in that the abstract of the new document can be considered as a pseudo-object and the abstracts "nearest" the pseudo-object can serve as recommendations to the purchaser.
  • the final processing step is that of outputting to the purchaser the recommended list of "nearest" abstracts as an item is added to the shopping basket.
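  • The split between off-line generation and on-line lookup can be sketched as follows. Storing the "nearest" abstracts as a single JSON file is an assumption made for this example, as are the function names and identifiers; the text only says that the results are stored in a file and accessed when an item is added to the basket.

```python
import json
import os
import tempfile


def save_nearest(nearest, path):
    """Off-line step: persist the precomputed 'nearest' lists."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(nearest, handle)


def load_nearest(path):
    """On-line step: read the stored lists back."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def recommend_on_add(nearest, added_item):
    """Return the stored recommendations for the item just added (empty if unknown)."""
    return nearest.get(added_item, [])


# Hypothetical precomputed data, for illustration only.
nearest = {"doc-1": ["doc-2", "doc-3"], "doc-2": ["doc-1"]}
path = os.path.join(tempfile.mkdtemp(), "nearest.json")
save_nearest(nearest, path)

stored = load_nearest(path)
result = recommend_on_add(stored, "doc-1")
```

  • Because the similarity computation happens before any purchaser arrives, the request-time cost in this sketch is a single dictionary lookup.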
  • flow diagram 800 of FIG. 8 depicts the most general processing in accordance with the method aspect of the present invention when a file of "nearest" items is generated, usually off-line, and then stored.
  • Processing block 810 applies a latent semantic algorithm to the items to determine a conceptual similarity among the items. It is implicit that the items form a catalog in the generic sense, and that each of the items has an associated textual description. Thus the catalog of items is not necessarily composed of documents, but can be composed of, as suggested earlier, audio tape listings, video tape listings, works-of-art, electronic product listings, and so forth; however, each item has an associated written description that can be used with a latent semantic algorithm to find the conceptual similarity among the items (e.g., inner product or dot product with the cosine measure).
  • processing block 820 is invoked to find, for each item, the "nearest” items using the conceptual similarity as a measure of "nearness".
  • the file is stored for later recall during the shopping experience of an on-line purchaser.
  • processing block 830 is executed so that, whenever each on-line purchaser adds a "latest" item to the shopping cart, the file of "nearest" items determined by processing block 820 is accessed to provide a recommendation of items "nearest" to the item added to the shopping cart. (Of course it is possible to return the "nearest" items to the purchaser at other points in the shopping experience, not just at the time the purchaser selects a "latest" item. For example, the "nearest" items for each item in the shopping basket could be displayed if there is sufficient screen display area to accomplish this display).
  • flow diagram 900 of FIG. 9 depicts the most general processing in accordance with the method aspect of the present invention when a list of "nearest" items is dynamically generated in response to a purchaser's request — the processing by flow diagram 900 does not require storing a file of conceptual similarity among textual items.
  • Processing block 910 applies, whenever the on-line purchaser adds a "latest" item to the shopping cart, a latent semantic algorithm to the items to determine a conceptual similarity among the items and the "latest" item. It is implicit that the items form a catalog in the generic sense, and that each of the items has an associated textual description.
  • the matrix YR is
  • Processing block 920 is then executed so that a recommendation of items "nearest" to the item added to the shopping cart is generated. (Of course it is possible to return the "nearest" items to the purchaser at other points in the shopping experience, not just at the time the purchaser selects a "latest" item. For example, the "nearest" items for each item in the shopping basket could be displayed if there is sufficient screen display area to accomplish this display).
  • system 1000 includes: (a) Web server 1010; (b) application server 1020; and (c) storage file 1030.
  • System 1000 is coupled to conventional Internet network or "cloud" 1005. Moreover, access to Internet
  • the purchaser is presented with a Web page in HTML format on the display of PC 1001 — depicted as Web page 1002 which conveys purchaser input to system 1000, and as Web page 1003 which conveys a system response to the purchaser.
  • the purchaser requests information from system 1000 such as by typing and/or clicking on links on input Web page 1002
  • the request for information is transmitted using the "https" protocol to system 1000.
  • the purchaser requests system responses in the usual manner by typing, pointing and/or clicking on HTML Web pages.
  • Web server 1010 passes the purchaser's input information, such as "search” keywords entered or a Web page link clicked upon by the purchaser, depending upon the stage of the shopping experience, to input web page processor 1011 which parses the Web page to obtain information to pass along to application server 1020. If the purchaser has entered "search" keywords, then application server 1020 consults storage file 1030 to obtain data to return a response Web page.
  • Output Web page processor 1021 receives the response data, and prepares a Web page in HTML format for transmission, via server 1010 and Internet 1005, to PC 1001 as Web page 1003.
  • application server 1020 accesses that part of storage file 1030 that stores the file of "nearest” items to the clicked-upon item.
  • the output of application server 1020 is a set of "nearest" items, which is again placed in HTML format and delivered to PC 1001, via Web server 1010 and Internet 1005, as response Web page 1003.
  • system 1000 is illustrated as operating in the Internet environment with only a single server, and initially elucidates the set of services embodied in the product-purchase experience.
  • a general computer network implementation imbued with the structure and characteristics heretofore described can effect the applications in accordance with the present invention.
  • the product-purchase experience can be implemented locally as well, that is, the client-server may be interconnected, for example, via a local area network (LAN) which is not coupled to the Internet. All of the aforementioned benefits apply to this local system so as to realize a product selection experience.
  • filtering may be imposed so as to generate the recommendations provided to the purchaser; such filtering may be accomplished, illustratively, by processing blocks 830 or 920.
  • items below a certain price may not warrant a recommendation, e.g., the price of the item placed in the basket.
  • the "nearest" items list could be filtered based on something known about the purchaser, e.g., no "adult” content for kids, or only content written in languages x and y but not language z.
  • This post-filter processing complements the pre-filter processing already discussed.
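  • A post-filter of the kind outlined above might look like the following sketch. The Item fields, the parameter names, and the sample data are all hypothetical; the text gives only the examples of a price floor, "adult" content, and language restrictions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    item_id: str
    price: float
    language: str
    adult: bool = False


def post_filter(candidates, min_price=0.0, allowed_languages=None, allow_adult=True):
    """Keep only candidates that satisfy the price, language, and content rules."""
    kept = []
    for item in candidates:
        if item.price < min_price:
            continue  # too cheap to warrant a recommendation
        if allowed_languages is not None and item.language not in allowed_languages:
            continue  # purchaser does not read this language
        if item.adult and not allow_adult:
            continue  # content not suitable for this purchaser
        kept.append(item)
    return kept


# Hypothetical "nearest" list, for illustration only.
nearest = [
    Item("item-1", 150.0, "en"),
    Item("item-2", 5.0, "en"),
    Item("item-3", 200.0, "fr"),
    Item("item-4", 180.0, "en", adult=True),
]
kept = post_filter(nearest, min_price=50.0, allowed_languages={"en"}, allow_adult=False)
```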
  • an extension to the illustrative embodiment is that of using all the items in the shopping basket or a subset of items in the shopping basket, in contrast to the latest item, to generate a recommendation.
  • the technique for accommodating multiple items as input is normally implemented in real-time since it would be virtually impossible to generate and store a "nearest" items file using permutations of all items in the catalog to form composite pseudo-objects.
  • the recommended list could be e-mailed to the purchaser rather than displayed immediately on the screen. This may occur when, for example, the recommended list is too large to be conveniently displayed on the screen. Also, a recommendation could be e-mailed to the purchaser when a new item is added to the catalog and that item, had it been available during the prior interaction with the purchaser, would have been included in the list of recommended items.
  • the e-mail functionality may be implemented by application server 1020.
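
The basket-based extension and the post-filtering described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of processing blocks 830 or 920: the composite pseudo-object is assumed to be the centroid of the basket items' latent vectors, and the names `recommend_for_basket`, `vectors`, `attrs`, and the keys `price`, `adult`, `language`, `min_price`, `is_minor`, and `languages` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def recommend_for_basket(basket, vectors, attrs, user, n=5):
    """Rank catalog items against a composite of the basket, then post-filter.

    basket  : list of item ids currently in the shopping basket
    vectors : item id -> latent-space vector (assumed precomputed)
    attrs   : item id -> {"price", "adult", "language"} (hypothetical schema)
    user    : {"min_price", "is_minor", "languages"} (hypothetical schema)
    """
    dims = len(next(iter(vectors.values())))
    # Composite pseudo-object: assumed here to be the mean of the basket vectors.
    pseudo = [sum(vectors[i][d] for i in basket) / len(basket) for d in range(dims)]

    # Computed at request time, since all basket combinations cannot be stored.
    ranked = sorted(
        (i for i in vectors if i not in basket),
        key=lambda i: cosine(pseudo, vectors[i]),
        reverse=True,
    )

    def allowed(item):
        a = attrs[item]
        if a["price"] < user["min_price"]:       # price-threshold filter
            return False
        if user["is_minor"] and a["adult"]:      # purchaser-based content filter
            return False
        return a["language"] in user["languages"]  # language filter

    return [i for i in ranked if allowed(i)][:n]
```

Passing a one-item basket reduces this to the single-item case; in that case a stored "nearest" items file could replace the ranking step and only the filter would run at request time.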

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to techniques for using the latent semantic structure (810) of textual content ascribed to items in order to provide automatic recommendations to a user (1002). The user enters (1002) a selected item (102), and a processor then applies a latent semantic algorithm to the user's selection and to the textual content of the items in a database (1030) to establish a conceptual similarity between the selection and the items (920). A set of items (740) nearest to the selected item is provided to the user as a recommendation (910) of other items; this set may be of particular relevance or interest with respect to the user's original selection, as determined by the measure (820) of conceptual similarity.
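
The flow summarized in the abstract can be illustrated with a small sketch. It assumes the latent semantic algorithm is a truncated singular value decomposition of a term-by-item matrix and that conceptual similarity is measured by cosine in the reduced space. The toy catalog, the raw term counts (no term weighting), the choice k=2, and the function names `build_lsi` and `nearest_items` are all hypothetical and are not taken from the patent.

```python
import numpy as np

# Hypothetical toy catalog: item id -> descriptive text (illustrative only).
CATALOG = {
    "book_a": "latent semantic indexing for text retrieval",
    "book_b": "semantic retrieval of text documents",
    "book_c": "gardening roses and garden soil",
    "book_d": "soil care for the rose garden",
}


def build_lsi(catalog, k=2):
    """Return (item_ids, item_vectors) with items placed in a k-dim latent space."""
    ids = list(catalog)
    vocab = sorted({w for text in catalog.values() for w in text.split()})
    index = {w: i for i, w in enumerate(vocab)}
    # Term-by-item count matrix: rows are terms, columns are items.
    A = np.zeros((len(vocab), len(ids)))
    for j, item in enumerate(ids):
        for w in catalog[item].split():
            A[index[w], j] += 1.0
    # Truncated SVD, A ~= U_k S_k V_k^T; item coordinates are the rows of V_k S_k.
    _, s, Vt = np.linalg.svd(A, full_matrices=False)
    item_vecs = (Vt[:k, :] * s[:k, None]).T
    return ids, item_vecs


def nearest_items(selected, ids, item_vecs, n=3):
    """Rank the other items by cosine similarity to the selected item."""
    q = item_vecs[ids.index(selected)]
    norms = np.linalg.norm(item_vecs, axis=1) * np.linalg.norm(q)
    sims = item_vecs @ q / np.where(norms == 0, 1.0, norms)
    ranked = sorted(zip(ids, sims), key=lambda pair: -pair[1])
    return [(i, float(c)) for i, c in ranked if i != selected][:n]


ids, item_vecs = build_lsi(CATALOG, k=2)
print(nearest_items("book_a", ids, item_vecs))
```

Because the ranking depends only on the catalog, the "nearest" list for each item could be computed ahead of time and stored, leaving only a lookup for the moment the user makes a selection.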
PCT/US2001/025899 2000-09-01 2001-08-17 Automatic recommendation of products using latent semantic indexing of content WO2002021335A1 (fr)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US65302500A 2000-09-01 2000-09-01
US09/653,917 US6615208B1 (en) 2000-09-01 2000-09-01 Automatic recommendation of products using latent semantic indexing of content
US09/653,917 2000-09-01
US09/653,025 2000-09-01

Publications (1)

Publication Number Publication Date
WO2002021335A1 (fr) 2002-03-14

Family

ID=27096424

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2001/025899 WO2002021335A1 (fr) 2000-09-01 2001-08-17 Automatic recommendation of products using latent semantic indexing of content

Country Status (1)

Country Link
WO (1) WO2002021335A1 (fr)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4839853A (en) * 1988-09-15 1989-06-13 Bell Communications Research, Inc. Computer information retrieval using latent semantic structure
US5278980A (en) * 1991-08-16 1994-01-11 Xerox Corporation Iterative technique for phrase query formation and an information retrieval system employing same
US5301109A (en) * 1990-06-11 1994-04-05 Bell Communications Research, Inc. Computerized cross-language document retrieval using latent semantic indexing
US5987446A (en) * 1996-11-12 1999-11-16 U.S. West, Inc. Searching large collections of text using multiple search engines concurrently

Cited By (42)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2391967A (en) * 2002-08-16 2004-02-18 Canon Kk Information analysing apparatus
US8285595B2 (en) 2006-03-29 2012-10-09 Napo Enterprises, Llc System and method for refining media recommendations
US8903843B2 (en) 2006-06-21 2014-12-02 Napo Enterprises, Llc Historical media recommendation service
US8327266B2 (en) 2006-07-11 2012-12-04 Napo Enterprises, Llc Graphical user interface system for allowing management of a media item playlist based on a preference scoring system
US10469549B2 (en) 2006-07-11 2019-11-05 Napo Enterprises, Llc Device for participating in a network for sharing media consumption activity
US9292179B2 (en) 2006-07-11 2016-03-22 Napo Enterprises, Llc System and method for identifying music content in a P2P real time recommendation network
US9003056B2 (en) 2006-07-11 2015-04-07 Napo Enterprises, Llc Maintaining a minimum level of real time media recommendations in the absence of online friends
US7970922B2 (en) 2006-07-11 2011-06-28 Napo Enterprises, Llc P2P real time media recommendations
US8805831B2 (en) 2006-07-11 2014-08-12 Napo Enterprises, Llc Scoring and replaying media items
US8620699B2 (en) 2006-08-08 2013-12-31 Napo Enterprises, Llc Heavy influencer media recommendations
US8090606B2 (en) 2006-08-08 2012-01-03 Napo Enterprises, Llc Embedded media recommendations
US8874655B2 (en) 2006-12-13 2014-10-28 Napo Enterprises, Llc Matching participants in a P2P recommendation network loosely coupled to a subscription service
US9224427B2 (en) 2007-04-02 2015-12-29 Napo Enterprises LLC Rating media item recommendations using recommendation paths and/or media item usage
US9081780B2 (en) 2007-04-04 2015-07-14 Abo Enterprises, Llc System and method for assigning user preference settings for a category, and in particular a media category
US8434024B2 (en) 2007-04-05 2013-04-30 Napo Enterprises, Llc System and method for automatically and graphically associating programmatically-generated media item recommendations related to a user's socially recommended media items
US8112720B2 (en) 2007-04-05 2012-02-07 Napo Enterprises, Llc System and method for automatically and graphically associating programmatically-generated media item recommendations related to a user's socially recommended media items
US9275055B2 (en) 2007-06-01 2016-03-01 Napo Enterprises, Llc Method and system for visually indicating a replay status of media items on a media device
US9037632B2 (en) 2007-06-01 2015-05-19 Napo Enterprises, Llc System and method of generating a media item recommendation message with recommender presence information
US9164993B2 (en) 2007-06-01 2015-10-20 Napo Enterprises, Llc System and method for propagating a media item recommendation message comprising recommender presence information
US9448688B2 (en) 2007-06-01 2016-09-20 Napo Enterprises, Llc Visually indicating a replay status of media items on a media device
US8954883B2 (en) 2007-06-01 2015-02-10 Napo Enterprises, Llc Method and system for visually indicating a replay status of media items on a media device
US8839141B2 (en) 2007-06-01 2014-09-16 Napo Enterprises, Llc Method and system for visually indicating a replay status of media items on a media device
US8983950B2 (en) 2007-06-01 2015-03-17 Napo Enterprises, Llc Method and system for sorting media items in a playlist on a media device
US8285776B2 (en) 2007-06-01 2012-10-09 Napo Enterprises, Llc System and method for processing a received media item recommendation message comprising recommender presence information
US7865522B2 (en) 2007-11-07 2011-01-04 Napo Enterprises, Llc System and method for hyping media recommendations in a media recommendation system
US9060034B2 (en) 2007-11-09 2015-06-16 Napo Enterprises, Llc System and method of filtering recommenders in a media item recommendation system
US8224856B2 (en) 2007-11-26 2012-07-17 Abo Enterprises, Llc Intelligent default weighting process for criteria utilized to score media content items
US9164994B2 (en) 2007-11-26 2015-10-20 Abo Enterprises, Llc Intelligent default weighting process for criteria utilized to score media content items
US8874574B2 (en) 2007-11-26 2014-10-28 Abo Enterprises, Llc Intelligent default weighting process for criteria utilized to score media content items
US9224150B2 (en) 2007-12-18 2015-12-29 Napo Enterprises, Llc Identifying highly valued recommendations of users in a media recommendation network
US9071662B2 (en) 2007-12-20 2015-06-30 Napo Enterprises, Llc Method and system for populating a content repository for an internet radio service based on a recommendation network
US9734507B2 (en) 2007-12-20 2017-08-15 Napo Enterprise, Llc Method and system for simulating recommendations in a social network for an offline user
US8396951B2 (en) 2007-12-20 2013-03-12 Napo Enterprises, Llc Method and system for populating a content repository for an internet radio service based on a recommendation network
US8060525B2 (en) 2007-12-21 2011-11-15 Napo Enterprises, Llc Method and system for generating media recommendations in a distributed environment based on tagging play history information with location information
US9275138B2 (en) 2007-12-21 2016-03-01 Lemi Technology, Llc System for generating media recommendations in a distributed environment based on seed information
US8117193B2 (en) 2007-12-21 2012-02-14 Lemi Technology, Llc Tunersphere
US8983937B2 (en) 2007-12-21 2015-03-17 Lemi Technology, Llc Tunersphere
US9552428B2 (en) 2007-12-21 2017-01-24 Lemi Technology, Llc System for generating media recommendations in a distributed environment based on seed information
US8874554B2 (en) 2007-12-21 2014-10-28 Lemi Technology, Llc Turnersphere
US8880599B2 (en) 2008-10-15 2014-11-04 Eloy Technology, Llc Collection digest for a media sharing system
US9367808B1 (en) 2009-02-02 2016-06-14 Napo Enterprises, Llc System and method for creating thematic listening experiences in a networked peer media recommendation environment
US9824144B2 (en) 2009-02-02 2017-11-21 Napo Enterprises, Llc Method and system for previewing recommendation queues

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): CA IN JP

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR

121 Ep: the epo has been informed by wipo that ep was designated in this application
122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase

Ref country code: JP