WO2011109516A2 - Document processing using retrieval path data - Google Patents
Document processing using retrieval path data
- Publication number
- WO2011109516A2 (PCT/US2011/026867)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- user
- request
- document
- requests
- event
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/954—Navigation, e.g. using categorised browsing
Definitions
- the subject matter disclosed herein generally relates to the processing of data. Specifically, the present disclosure addresses systems and methods involving document processing, document presentation, or both, using retrieval path data.
- a web server machine may receive a request from a user to retrieve a document stored in a database of the web server machine, and the web server machine may provide the document to a web client machine (e.g., the user's computer) in response to the request.
- the request may be a click made by the user on a hyperlink displayed in a web page, where the hyperlink references another web page.
- the web server machine may respond to the click by retrieving the latter web page and providing it to the web client machine.
- a machine may be used to facilitate a presentation of a document that references a product available for selection by the user.
- the web server machine may cause an electronic storefront to be displayed in the document, and the electronic storefront may present the available product. If the user is interested in the product, the user may use the electronic storefront to select that product for purchase or to obtain further information about the product.
- FIG. 1 is an event diagram illustrating events in a retrieval path of a document, according to some example embodiments.
- FIG. 2 is an event diagram illustrating requests included within an intent boundary and requests outside the intent boundary, according to some example embodiments.
- FIG. 3 is a diagram illustrating augmentation of a document with event metadata and intent metadata, according to some example embodiments.
- FIG. 4 is a diagram illustrating a web page with some event metadata and some intent metadata, according to some example embodiments.
- FIG. 5 is a network diagram illustrating a network environment of a document processing and presentation machine, according to some example embodiments.
- FIG. 6 is a block diagram illustrating modules of a document processing and presentation machine, according to some example embodiments.
- FIG. 7 is a flow chart illustrating a method of document processing using retrieval path data, according to some example embodiments.
- FIGS. 8-9 are flow charts illustrating a method of processing retrieval path data of a document, according to some example embodiments.
- FIG. 10 is a flow chart illustrating a method of document presentation using retrieval path data, according to some example embodiments.
- FIG. 11 is a block diagram illustrating components of a machine, according to some example embodiments, able to read instructions from a machine-readable medium and perform any one or more of the methodologies discussed herein.
- Example methods and systems are directed to document processing, document presentation, or both, using retrieval path data. Examples merely
- a user who is browsing through documents generally has some intent for engaging in the browsing.
- the user's browsing activity may involve requesting retrieval of one or more documents and, based on a reading of one or more documents, requesting retrieval of further documents.
- intent refers to a goal, purpose, objective, or desire that motivates browsing activity.
- the intent of the user may be to find a recipe for beef noodle soup.
- the intent may be to shop for an espresso machine that is simple to clean.
- the intent may be to find an inexpensive camera suitable for outdoor photography.
- the intent may be to research potential gifts suitable for a seven-year old nephew.
- the browsing activity of the user can be viewed as events that constitute a "retrieval path," which is to say, a path of events leading to, though not necessarily ending with, a retrieval of a particular document that satisfies the user's intent, at least partially if not fully.
- the events in the retrieval path may include requests for information (e.g., documents, questions, or queries), as well as results of those requests (e.g., document presentation, document denial, answers to questions, or search results).
- “retrieval path data” refers to information that describes a retrieval path.
- retrieval path data may include event data (e.g., data from one or more events constituting the retrieval path).
- the retrieval path may be short or direct, allowing the user to find a satisfactory document quickly.
- the user may search for an "iPhone," and the returned search results may include a link to an electronic storefront that sells exactly the kind of iPhone™ desired by the user. If the user clicks on the link and purchases the iPhone™, it may be inferred that the user's intent was to purchase an iPhone™ of that kind.
- the path of events leading to the electronic storefront includes a request, specifically, a request to search for "iPhone,” that led to the retrieval of the electronic storefront.
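- The notions of events and a retrieval path described above can be modeled with a small data structure. The following is a minimal sketch, not the disclosed implementation: the class names `EventKind`, `Event`, and `RetrievalPath`, the `timestamp` field, and the identifier `storefront-iphone` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class EventKind(Enum):
    REQUEST = "request"  # e.g., a query or a request to view a document
    RESULT = "result"    # e.g., search results or a presented document


@dataclass(frozen=True)
class Event:
    kind: EventKind
    description: str   # free-text summary of the request or result
    timestamp: float   # assumed field: seconds since an arbitrary start


@dataclass
class RetrievalPath:
    document_id: str                       # document the path leads to
    events: List[Event] = field(default_factory=list)

    def requests(self) -> List[Event]:
        """Return only the request events, in chronological order."""
        found = [e for e in self.events if e.kind is EventKind.REQUEST]
        return sorted(found, key=lambda e: e.timestamp)


# The short, direct path from the "iPhone" example.
path = RetrievalPath(
    document_id="storefront-iphone",  # hypothetical identifier
    events=[
        Event(EventKind.REQUEST, 'search for "iPhone"', 1.0),
        Event(EventKind.RESULT, "search results with storefront link", 2.0),
        Event(EventKind.REQUEST, "view electronic storefront", 3.0),
    ],
)
```

- In this sketch, `path.requests()` yields the search request followed by the request to view the storefront, which is the request data that would later be stored as event metadata.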
- the retrieval path may be long or indirect, retrieving the satisfactory document for the user after multiple attempts to seek the document.
- the user may search for a "tent for burning man,” in contemplation of attending an annual outdoor festival in the Nevada desert known as "The Burning Man.”
- the search engine, being untrained with respect to this festival, may provide generic results for "tent" or may provide no results at all, thus frustrating the user.
- the user may persist and modify his search, requesting a second query for a "tent for the desert.”
- the search engine may then return results useful to the user, such as links (e.g., hyperlinks) to product information in the form of, for example, documents (e.g., product web pages), news articles, consumer reviews, frequently asked questions (FAQs), advertisements, and shopping interfaces (e.g., an electronic storefront), all related to tents usable in desert conditions.
- the user may request and read several documents (e.g., multiple reviews of tents) before requesting an electronic storefront to purchase a particular tent.
- the retrieval path of the electronic storefront includes multiple requests, including the request to search for a "tent for burning man," that led to the retrieval of the electronic storefront.
- a system may process the metadata to determine an intent.
- This intent is inferred from the retrieval path, and the inferred intent may be ascribed to the user. While the system does not purport to read the mind of the user and thereby discover the actual intent contemplated by the user, the system may process an aggregate of retrieval paths from multiple users for multiple documents and infer a statistically likely intent of the user.
- the inferred intent may be stored by the system as further metadata (e.g., metadata relating to the intent) of the document.
- the system indexes at least some of the metadata, hence enabling the system to provide the document to another user whose retrieval path intersects with the previously processed retrieval path. Accordingly, the system shortens the retrieval path for the latter user.
- the system may also present some of the metadata of the document. For example, the system may generate and provide a web page that includes the document and some metadata. As another example, the system may alter the document to display some of the metadata within the document itself.
- Metadata relating to events in the retrieval path is referred to herein as “event metadata.”
- Metadata relating to inferred intent is referred to herein as “intent metadata.”
- the system may show the latter user activities performed (e.g., requests made) by other users prior to retrieving the document, as well as links to further documents that the other users subsequently retrieved.
- the system may show the latter user one or more intents likely held by other users when retrieving the document. Accordingly, the system may assist the latter user in pursuing his or her actual intent by providing shortcuts to documents ultimately retrieved by the other users in pursuit of their actual intents.
- Multiple retrieval paths may be represented within the event metadata, and multiple intents may be represented within the intent metadata.
- the system may, however, process metadata to identify a single event or a single intent. For example, the system may perform a semantic analysis (e.g., a latent semantic analysis) of event data to determine (e.g., infer) boundaries between individual intents included in a long retrieval path (e.g., event data from a long chain of events). Accordingly, the system may determine that the intent corresponds to a request to retrieve a particular document.
- FIG. 1 is an event diagram illustrating events 101-109 in a retrieval path 110 of a document, according to some example embodiments. Also shown are events 151-152. The events 101-109 and 151-152 are ordered in time and are shown in chronological sequence, as indicated by arrows. However, alternative example embodiments may order events using any dimension (e.g., according to mathematically calculated vector distances in an n-dimensional space). Events 101-109 occur prior to processing the retrieval path 110 and are associated with a first user interacting with a network-based publication system from a first client device of the first user (e.g., a computer or a phone). Events 151-152 occur after the processing of the retrieval path 110 and are associated with a second user interacting with the system from a second client device.
- Event 101 is a request in which the first user submits a query for a "tent for burning man.”
- the first user may access a network-based publication system (e.g., an online shopping web server, an inventory control server, or a classified ad web server) and use its search engine to search for "tent for burning man.”
- Event 102 is a response in which no results are found.
- the network-based publication system may respond to the first user with a message (e.g., in a web page) indicating that the search returned zero results.
- Event 103 is a request in which the first user re-formulates his query and submits a new query for a "tent for the desert.”
- Not shown in FIG. 1 is a response event in which the network-based publication system provides a web page containing several search results in response to event 103.
- the search results may include links to a product page for "tent A,” a product page for "tent B,” a product review of "tent B,” and a product review of "tent C.”
- Event 104 is a request by the first user to view the product page for "tent
- Event 105 is a request by the first user to view the product review of "tent B," and event 106 is a request to view the product review of "tent C." Not shown in FIG. 1 are responses to these requests, in which the network-based publication system provides the requested information (e.g., the product review of "tent B").
- Event 107 is a request by the first user to view the product page for "tent
- event 109 is a request by the first user to purchase "tent B.”
- event 109 may be a request submitted via an electronic storefront to initiate a purchase transaction for a specimen of "tent B."
- event 109 may be a
- event 109 is a "positive event," which is to say, an event that indicates an affirmation of the first user's intent.
- the network-based publication system may infer from events 101-109 that the first user intended to purchase a particular kind of tent, namely, a kind of tent satisfied by "tent B." After requesting two searches and four documents, the first user purchased the product shown in one particular document, the product page for "tent B."
- the retrieval path 110 may be associated with the product page for "tent B" (e.g., as event metadata) for future use with respect to other users.
- Events 151 and 152 occur after the processing of the retrieval path 110. The processing of the retrieval path 110 associates the retrieval path 110 with a particular document, namely, the product page for "tent B."
- the retrieval path 110 may be stored as event metadata of the product page for "tent B," and the event metadata may be indexed to facilitate identification of the product page for "tent B" in future searches.
- the events 151 and 152 are associated with the second user interacting with the network-based publication system from the second client device (e.g., a computer or a phone).
- Event 151 is a request in which the second user submits a query for a "tent for burning man,” similar to the first user's request in event 101.
- with the retrieval path 110 now stored as event metadata of the product page for "tent B," the network-based publication system no longer responds with zero results, as in event 102. Instead, the system responds to the second user with a document likely to satisfy the inferred intent motivating a search for a "tent for burning man." In other words, the system ascribes this intent to the second user and selects the product page for "tent B" for presentation to the second user.
- Event 152 is a response in which the network-based publication system presents the product page for "tent B" to the second user. Additionally, in event 152, the product page for "tent B" is augmented with retrieval path data (e.g., event metadata or intent metadata). For example, the product page may be supplemented with a system-generated statement that the first user also searched for a "tent for burning man" and ultimately purchased "tent B." Thus, the second user may experience a more direct and satisfying fulfillment of his actual intent.
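- The effect of indexing the retrieval path 110 can be illustrated with a toy lookup table. This is a hedged sketch under simple assumptions, not the disclosed indexing scheme: the class `EventMetadataIndex`, exact-match lookup on normalized query text, and the identifier `product-page-tent-B` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set


class EventMetadataIndex:
    """Maps normalized query text from stored retrieval paths to document ids."""

    def __init__(self) -> None:
        self._by_query: Dict[str, Set[str]] = defaultdict(set)

    @staticmethod
    def _normalize(query: str) -> str:
        # Lowercase and collapse whitespace so trivially different queries match.
        return " ".join(query.lower().split())

    def add_path(self, document_id: str, queries: List[str]) -> None:
        """Index every query in a retrieval path under the given document."""
        for query in queries:
            self._by_query[self._normalize(query)].add(document_id)

    def lookup(self, query: str) -> List[str]:
        """Return ids of documents whose stored paths contain this query."""
        return sorted(self._by_query.get(self._normalize(query), set()))


index = EventMetadataIndex()

# Before processing (compare event 102): the query matches nothing.
before = index.lookup("tent for burning man")

# Processing the first user's retrieval path associates both queries
# with the product page that was ultimately purchased from.
index.add_path(
    "product-page-tent-B",  # hypothetical identifier
    ["tent for burning man", "tent for the desert"],
)

# After processing (compare events 151-152): the same query finds the page.
after = index.lookup("Tent for Burning Man")
```

- Exact matching keeps the sketch short; a real system would more plausibly use a full-text or semantic index so that near-identical queries also intersect the stored path.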
- FIG. 2 is an event diagram illustrating requests 205-208 included within an intent boundary 210 and requests 201-204 outside the intent boundary 210, according to some example embodiments. Also shown are events 251 and 252. The events 201-208 and 251-252 are ordered in time and shown in chronological sequence, as indicated by arrows. However, alternative embodiments may order events using any dimension. Events 201-208 occur prior to processing of events 205-208, and are associated with a first user interacting with a network-based publication system from a first client device of the first user (e.g., a computer or a phone). Events 251-252 occur after the processing of events 205-208 and are associated with a second user interacting with the system from a second client device.
- Events 201-208 constitute a retrieval path that expresses multiple intents (e.g., two intents).
- Event 201 is a request in which the first user submits a query for an "espresso machine.”
- Not shown in FIG. 2 is a response event in which the system provides a web page containing several search results in response to event 201.
- the search results may include links to product information for various espresso machines.
- Event 202 is a request by the first user to view a product page for "espresso machine A" (e.g., an advertisement, a description, or technical specifications).
- Event 203 is a request by the first user to search for a product review of "espresso machine B" (e.g., a professional review, an amateur review, consumer poll results, a ranked "top-ten” list, or an aggregate rating).
- Event 204 is a request by the first user to view the product news pertaining to "espresso machine C" (e.g., consumer safety news, product recall news, or celebrity endorsement news).
- Event 205 is a request in which the first user searches for a new topic unrelated to espresso machines, namely, a "gym bag.”
- the search results may include links to product information for various gym bags (e.g., sports bags, exercise bags, duffel bags, or athletic bags).
- Event 206 is a request by the first user to view a product review of "gym bag X.”
- Event 207 is a request by the first user to view a product page describing "gym bag Y.”
- Event 208 is a request by the first user to purchase "gym bag Y,” and accordingly, event 208 is a positive event that indicates an affirmation of the first user's intent. Similar to event 109, event 208 may be a submission via an electronic storefront to commit the first user to a purchase transaction.
- Events 201-204 relate to espresso machines, while events 205-208 relate to gym bags. Events 201-204 thus indicate one intent (e.g., shopping for an espresso machine), and events 205-208 indicate another intent (e.g., shopping for a gym bag).
- a network-based publication system may determine the intent boundary 210 that separates the former intent from the latter intent within a given retrieval path (e.g., events 201-208).
- the system includes the events associated with a particular intent (e.g., events 205-208 as indicative of shopping for a gym bag) as event metadata to be associated with the product page of "gym bag Y.”
- the system excludes events 201-204 from the event metadata, because the excluded events indicate an unrelated intent (e.g., shopping for an espresso machine).
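- The inclusion and exclusion of events around the intent boundary 210 can be sketched as follows. This is an illustration only: it replaces the semantic analysis mentioned in the text with a crude word-overlap test, walks backward from the positive event, and stops at the first event that shares no word with the events already included. The function names and the subject strings assigned to each event are hypothetical.

```python
from typing import List, Set, Tuple

Event = Tuple[int, str]  # (event number, subject text); subjects are illustrative


def tokens(text: str) -> Set[str]:
    """Split a subject into lowercase words."""
    return set(text.lower().split())


def intent_start(events: List[Event]) -> int:
    """Return the index of the earliest event sharing the final event's intent.

    Word overlap stands in for semantic analysis: an event joins the intent
    if it shares at least one word with the events already included.
    """
    vocabulary = tokens(events[-1][1])
    start = len(events) - 1
    for i in range(len(events) - 2, -1, -1):
        current = tokens(events[i][1])
        if not current & vocabulary:
            break  # first unrelated event marks the boundary
        vocabulary |= current
        start = i
    return start


EVENTS: List[Event] = [
    (201, "espresso machine"),
    (202, "espresso machine A"),
    (203, "espresso machine B"),
    (204, "espresso machine C"),
    (205, "gym bag"),
    (206, "gym bag X"),
    (207, "gym bag Y"),
    (208, "gym bag Y"),  # positive event: purchase
]

start = intent_start(EVENTS)
included = [number for number, _ in EVENTS[start:]]  # kept as event metadata
excluded = [number for number, _ in EVENTS[:start]]  # unrelated intent
```

- Under these assumptions the boundary falls between events 204 and 205, matching FIG. 2. Word overlap would fail on paraphrases such as "duffel" versus "gym bag," which is one reason a semantic analysis is the more plausible choice in practice.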
- the system stores the event metadata with the product page of "gym bag Y" (e.g., in a common database).
- the system further may index the event metadata to enable efficient retrieval of the product page based on the event metadata.
- the system generates intent metadata to be associated with the product page of "gym bag Y.”
- the system may generate one or more text phrases, such as "gym bag," "bag for gym," "bag for working out," "bag for exercising," and "bag for exercise class" as the intent metadata.
- the system may then store the intent metadata with the product page of "gym bag Y" (e.g., in the common database).
- the intent metadata may be generated based on a semantic analysis of requests (e.g., events 205-208) submitted by one or more users (e.g., the first user).
- the system may also index the intent metadata to enable efficient retrieval of the product page based on the intent metadata.
- Events 251 and 252 occur after the processing of events 205-208 to associate the event metadata and the intent metadata with the product page of "gym bag Y."
- Event 251 is a request in which a second user submits a query for a "bag for exercise." Based on the event metadata, the intent metadata, or both, the network-based publication system selects the product page for "gym bag Y" for presentation to the second user.
- Event 252 is a response in which the system presents the product page for "gym bag Y" to the second user. Similar to event 152, in event 252, the system may present some retrieval path data (e.g., event metadata, intent metadata, or both) to augment the product page for "gym bag Y." For example, the product page may be supplemented with a machine-generated statement that the first user searched for a "gym bag" and eventually purchased "gym bag Y." This may have the effect of saving the second user the time and inconvenience of reviewing the product review of "gym bag X," resulting in a more direct and satisfying fulfillment of his intent.
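- The selection of a document from intent metadata, as in event 251, might look like the sketch below. It is an assumption-laden illustration: the Jaccard word-overlap score, the function names, and the document identifiers are hypothetical, and only the gym bag phrases come from the text above.

```python
from typing import Dict, List, Optional

# Intent metadata per document. The gym bag phrases are those listed in the
# text; the identifiers and the second entry are hypothetical.
INTENT_METADATA: Dict[str, List[str]] = {
    "product-page-gym-bag-Y": [
        "gym bag",
        "bag for gym",
        "bag for working out",
        "bag for exercising",
        "bag for exercise class",
    ],
    "product-page-espresso-machine-A": ["espresso machine"],
}


def score(query: str, phrases: List[str]) -> float:
    """Best Jaccard word overlap between the query and any intent phrase."""
    query_words = set(query.lower().split())
    best = 0.0
    for phrase in phrases:
        phrase_words = set(phrase.lower().split())
        union = query_words | phrase_words
        if union:
            best = max(best, len(query_words & phrase_words) / len(union))
    return best


def select_document(query: str) -> Optional[str]:
    """Return the best-matching document id, or None if nothing overlaps."""
    ranked = sorted(
        ((score(query, phrases), doc) for doc, phrases in INTENT_METADATA.items()),
        reverse=True,
    )
    best_score, best_doc = ranked[0]
    return best_doc if best_score > 0 else None


selected = select_document("bag for exercise")
```

- Here the query shares three of four words with "bag for exercise class," so the gym bag page is selected even though the second user never typed "gym."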
- FIG. 3 is a diagram illustrating augmentation of a document 310 with event metadata 335 and intent metadata 340, according to some example embodiments.
- Event data 320 represents one or more requests made by a user (e.g., a first user) to a network-based publication system. The requests include a request to retrieve the document 310.
- the document 310 is a document available from the network-based publication system.
- the document 310 may be, or include: a listing of an item available for sale (e.g., a specimen of a product available for sale), an electronic storefront that is operable by a user (e.g., the first user) to initiate a purchase of the item, a description of the product available for sale, a review of the product, a buying guide that references the product, a question pertinent to the product (e.g., a frequently asked question (FAQ)), an answer to the question, or any suitable combination thereof.
- the event data 320 may also include: a request to execute a query generated by a user (e.g., the first user), a request to view a search result provided to a client device by the network-based publication system (e.g., in response to the query), a request to view a page devoid of references to an item available for sale that is referenced by the document 310 (e.g., a web page unrelated to the item available for sale), a request to initiate a purchase of the item (e.g., a purchase confirmation), or any suitable combination thereof.
- a request to initiate a purchase of the item may be the final request in a sequence of requests ordered in time, but such a request need not be the final request in all example embodiments.
- the event data 320 may include one or more timestamps corresponding respectively to one or more requests.
- a request to view a product page may include a timestamp indicating when the user submitted the request to the network-based publication system.
- the document 310 and the event data 320 may be combined together (e.g., by a document processing and presentation machine within the network-based publication system), and the event data 320 may become event metadata 330 of the document 310.
- the document 310 may be stored with the event metadata 330.
- a document processing and presentation machine within the network-based publication system may store the document 310 and the event metadata 330 in a database of the network-based publication system.
- the document processing and presentation machine may perform a semantic analysis 360 of the event metadata 330. Based on the semantic analysis 360, the machine may modify (e.g., truncate) the event metadata 330 to obtain a portion 335 of the event metadata 330 (e.g., a portion limited to events representing a single intent). Moreover, the document processing and presentation machine may determine intent metadata 340 based on the event metadata 330. The portion 335 of the event metadata 330 and the intent metadata 340 may be stored with a document (e.g., by the document processing and presentation machine) in a database. Furthermore, the portion 335 of the event metadata 330, the intent metadata 340, or both, may be indexed to facilitate retrieval of the document 310. For example, the document processing and presentation machine may perform the indexing to optimize retrieval of the document 310 based on some of the event metadata 335, some of the intent metadata 340, or any suitable combination thereof.
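- The augmentation shown in FIG. 3 can be pictured as a record that carries a document together with its metadata. The following is a minimal sketch: the names `Request`, `DocumentRecord`, `attach_event_data`, and `truncate_event_metadata` are hypothetical, and the index at which to truncate is supplied by the caller rather than derived from a semantic analysis.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Request:
    text: str
    timestamp: int  # assumed: seconds since an arbitrary start


@dataclass
class DocumentRecord:
    document_id: str
    body: str
    event_metadata: List[Request] = field(default_factory=list)
    intent_metadata: List[str] = field(default_factory=list)


def attach_event_data(record: DocumentRecord, event_data: List[Request]) -> None:
    """Store event data as event metadata, ordered by timestamp."""
    record.event_metadata = sorted(event_data, key=lambda r: r.timestamp)


def truncate_event_metadata(record: DocumentRecord, first_kept: int) -> None:
    """Keep only the portion of the event metadata from index first_kept on."""
    record.event_metadata = record.event_metadata[first_kept:]


record = DocumentRecord("gym-bag-Y", "Product page for gym bag Y")
attach_event_data(
    record,
    [
        Request("view gym bag Y", 30),
        Request("search espresso machine", 10),
        Request("search gym bag", 20),
    ],
)
truncate_event_metadata(record, 1)   # drop the unrelated espresso request
record.intent_metadata = ["gym bag"]  # illustrative intent phrase
```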
- FIG. 4 is a diagram illustrating a web page 400 with some event metadata 410 and 430 and some intent metadata 420, according to some example embodiments.
- the web page 400 is an example of a document available from a network-based publication server. In particular, the web page 400 is a product page for a digital camera (e.g., a "Canon™ Powershot™ 10.0 Megapixel Digital ELPH™ camera") and hence includes some information describing the digital camera.
- Event metadata 410 is an aggregate of event data (e.g., requests for documents) from multiple users.
- the event metadata 410 indicates statistical behavior of other users who ultimately purchased this digital camera. For example, the event metadata 410 indicates that 32% of the users requested a product review (e.g., of this digital camera), while 10% of the users requested product information (e.g., product pages) of alternatives (e.g., other digital cameras).
- Event metadata 430 is an aggregate of event data (e.g., requests to purchase items) from multiple users.
- the event metadata 430 indicates statistical behavior of other users in purchasing digital cameras. For example, the event metadata 430 indicates that 67% of the users chose to purchase this digital camera, while 10% of the users chose to purchase a different digital camera (e.g., a "Nikon™ CoolPix™" camera).
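- Aggregate figures of this kind can be derived by counting outcomes across users. The sketch below is illustrative: the function name `purchase_shares` is hypothetical, and the sample of 100 users, including the 23 users in an "other outcome" bucket, is invented solely so that the percentages in the text can be reproduced.

```python
from collections import Counter
from typing import Dict, Iterable


def purchase_shares(outcomes: Iterable[str]) -> Dict[str, int]:
    """Return, per outcome, the share of users as a rounded percentage."""
    counts = Counter(outcomes)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {outcome: round(100 * n / total) for outcome, n in counts.items()}


# Hypothetical sample: one outcome per user.
outcomes = (
    ["this camera"] * 67
    + ["different camera"] * 10
    + ["other outcome"] * 23  # assumed bucket for the remaining users
)
shares = purchase_shares(outcomes)
```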
- Intent metadata 420 is an aggregate of intent metadata generated based on the event data from the multiple users.
- the intent metadata 420 includes machine-generated statements describing contexts (e.g., conditions) suitable for this digital camera. For example, the intent metadata 420 includes the statement, "It's good for . . . Amateurs.”
- the intent metadata 420 also includes machine-generated statements describing positive features of this digital camera (e.g., "Pros . . . Bright LCD.").
- the intent metadata 420 further includes machine-generated statements describing negative features of this digital camera (e.g., "Cons . . . Lack of storage."). These statements do not need to be machine-generated. Any one or more of the statements may be generated by a user and used in the intent metadata 420.
- the event data from the multiple users may include requests by some of the users to submit a statement (e.g., a comment) pertaining to this digital camera.
- the intent metadata 420 may be based on inferred intent (e.g., as described herein), explicit intent (e.g., as submitted by users), or any suitable combination thereof.
- FIG. 5 is a network diagram illustrating a network environment 500 of a document processing and presentation machine 510, according to some example embodiments.
- the network environment 500 includes the document processing and presentation machine 510, a database 520, a first client device 580, and a second client device 590, all connected to a network 550 and configured to communicate with each other via the network 550.
- the document processing and presentation machine 510 includes a processor and may be implemented using a computer that has been programmed by software, resulting in a special-purpose computer to perform document processing and presentation using retrieval path data.
- An example of physical structures of a general-purpose computer is described below with respect to FIG. 11.
- the database 520 is a repository of data and stores information on a machine-readable storage medium.
- the database 520 may be a database server machine (e.g., a server computer) and may store documents (e.g., document 310) with their associated event metadata (e.g., event metadata 410 and 430) and intent metadata (e.g., intent metadata 420).
- the network 550 may be any network that enables communication between machines (e.g., the document processing and presentation machine 510 and the first client device 580). Accordingly, the network 550 may be a wired network, a wireless network, or any suitable combination thereof. The network 550 may include one or more portions that constitute a private network, a public network (e.g., the Internet), or any suitable combination thereof.
- the first client device 580 is associated with a first user and may be a machine of the first user (e.g., a personal computer, a cellular phone, or a web appliance).
- the second client device 590 is associated with a second user and may be a machine of the second user.
- Any of the machines shown in FIG. 5 may be implemented using a general-purpose computer modified (e.g., programmed) by special-purpose software to be a special-purpose computer to perform the functions described herein for that machine.
- a computer system able to implement any one or more of the methodologies described herein is discussed below with respect to FIG. 11.
- any two or more of the machines illustrated in FIG. 5 may be combined into a single machine, and the functions described herein for a single machine may be subdivided among multiple machines.
- FIG. 6 is a block diagram illustrating modules of a document processing and presentation machine 510, according to some example embodiments.
- the document processing and presentation machine 510 includes an access module 610, a storage module 620, a server module 630, a determination module 640, an index module 650, a reception module 660, and a generator module 670, all configured to communicate with each other (e.g., via a bus, a shared memory, or a switch). Any of these modules may be implemented using hardware, as described below with respect to FIG. 11. Moreover, any two or more of these modules may be combined into a single module, and the functions described herein for a single module may be subdivided among multiple modules.
- the functionality of modules 610-670 is described below with respect to FIGS. 7-10.
- FIG. 7 is a flow chart illustrating a method 700 of document processing using retrieval path data, according to some example embodiments.
- the method 700 includes operations 710-750.
- the reception module 660 receives at least some of the event data 320 from the first client device 580 (e.g., from the first user).
- the event data 320 represents one or more requests, at least one of which is a request to retrieve the document 310 (e.g., event 207, the request to view the product page of "gym bag Y").
- the first client device 580 may collect the event data 320 over a period of time (e.g., one hour, or one day) and upload the event data 320 to the document processing and presentation machine 510.
- the document processing and presentation machine 510 may monitor communications from the first client device 580 to the network-based publication system and accordingly accumulate the event data 320 request by request.
- the determination module 640 may filter requests (e.g., events 201-207) received from the first client device 580 to limit the event data 320.
- the determination module 640 may filter the requests based on a period of time (e.g., selecting only those requests made by the user during the period of time).
- the determination module may filter the requests based on a total number of requests to be included in the event data 320 (e.g., selecting only the most recent 100 requests made by the user).
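- The two filtering criteria described above, a period of time and a total number of requests, can be sketched as follows. The function names and the representation of a request as a (timestamp, text) pair are assumptions made for illustration.

```python
from typing import List, Tuple

Request = Tuple[float, str]  # (timestamp in seconds, request text); assumed shape


def filter_by_period(requests: List[Request], start: float, end: float) -> List[Request]:
    """Keep only requests made within the inclusive period [start, end]."""
    return [r for r in requests if start <= r[0] <= end]


def filter_most_recent(requests: List[Request], limit: int) -> List[Request]:
    """Keep only the `limit` most recent requests, in chronological order."""
    if limit <= 0:
        return []
    return sorted(requests, key=lambda r: r[0])[-limit:]


requests: List[Request] = [
    (10.0, "search espresso machine"),
    (20.0, "view espresso machine A"),
    (30.0, "search gym bag"),
    (40.0, "view gym bag Y"),
]

in_period = filter_by_period(requests, 15.0, 35.0)
latest_two = filter_most_recent(requests, 2)
```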
- the access module 610 accesses the event data 320 (e.g., by accessing the database 520, or by reading the event data 320 from a computer memory).
- the event data 320 includes a request to retrieve the document 310 (e.g., event 207, the request to view the product page of "gym bag Y").
- the storage module 620 stores the event data 320 as event metadata 330 (e.g., event metadata 410) of the document 310.
- the storage module 620 may store the event metadata 330 as a file linked to the document 310 in the database 520.
- the storage module 620 may write the event metadata 330 into a document header of the document 310.
- the server module 630 provides the document 310 to the first client device 580 in response to the request to retrieve the document 310 (e.g., event 207).
- the server module 630 may be a web server module and serve the document 310 using any Internet protocol (e.g., Hypertext Transfer Protocol (HTTP)).
- the index module 650 indexes the event data 320 stored as the event metadata 330 in the database 520.
- the index module 650 may use any indexing algorithm to perform operation 750.
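As one possible illustration of storing event data as event metadata of a document and then indexing it, the following sketch uses an in-memory inverted index. The source allows any indexing algorithm, so this choice, the `MetadataStore` class, and the identifier `"doc-310"` are assumptions made for the example only.

```python
from collections import defaultdict


class MetadataStore:
    """Toy stand-in for a database holding documents and their linked metadata."""

    def __init__(self):
        self.event_metadata = {}        # document id -> list of request texts
        self.index = defaultdict(set)   # token -> set of document ids

    def store_event_metadata(self, doc_id, request_texts):
        """Store event data as event metadata linked to the document."""
        self.event_metadata[doc_id] = list(request_texts)

    def index_document(self, doc_id):
        """Add inverted-index entries for every token in the stored metadata."""
        for text in self.event_metadata.get(doc_id, []):
            for token in text.lower().split():
                self.index[token].add(doc_id)

    def lookup(self, query):
        """Return ids of documents whose event metadata contains every query token."""
        tokens = query.lower().split()
        if not tokens:
            return set()
        hits = [self.index.get(token, set()) for token in tokens]
        return set.intersection(*hits)


store = MetadataStore()
store.store_event_metadata("doc-310", ["bag for exercise", "gym bag", "gym bag Y"])
store.index_document("doc-310")
```

With this index in place, `store.lookup("gym bag")` finds the document through the requests that previously led to it, even if the document text itself never uses those words.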
- FIGS. 8-9 are flowcharts illustrating a method 800 of processing retrieval path data of a document, according to some example embodiments.
- the method 800 includes operations 810-860 and operations 910-930.
- the reception module 660 receives at least some of the event data 320 from the first client device 580. This may be performed in a manner similar to operation 710 of method 700.
- the access module 610 accesses the event data 320. This may be performed in a manner similar to operation 720 of method 700.
- the event data 320 may be stored (e.g., by the storage module 620) in the database 520 as the event metadata 330 of the document 310.
- the access module 610 may access (e.g., read from the database 520) the event metadata 330 to access the event data 320.
- the determination module 640 determines the portion 335 of the event metadata 330 and determines intent data based on the portion 335. For example, the determination module 640 may modify (e.g., truncate) the event metadata 330 to determine the portion 335. The determination of the portion 335 may be based on the semantic analysis 360 of the event metadata 330. As noted above, the portion 335 includes a request (e.g., event 207) to retrieve the document 310. Based on the portion 335 of the event metadata 330, the determination module 640 determines the intent data. For example, the determination module 640 may extract textual information (e.g., keywords) from the portion 335 that are statistically likely to indicate an intent ascribable to the user (e.g., the first user).
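A minimal sketch of keyword extraction from the portion is given below. It uses plain term frequency with a small stopword list as a simple stand-in for "statistically likely" keywords; the source does not prescribe this statistic, and the function name, stopword list, and sample request texts are illustrative assumptions.

```python
from collections import Counter

# Illustrative stopword list; a real system would likely use a larger one.
STOPWORDS = {"a", "an", "and", "for", "of", "the", "to", "with"}


def extract_intent_keywords(portion, top_n=3):
    """Return the most frequent non-stopword tokens across the request texts."""
    counts = Counter(
        token
        for text in portion
        for token in text.lower().split()
        if token not in STOPWORDS
    )
    return [word for word, _ in counts.most_common(top_n)]


# Hypothetical request texts standing in for the portion of the event metadata.
portion = ["bag for exercise", "gym bag", "gym bag Y"]
keywords = extract_intent_keywords(portion, top_n=2)
```

The resulting keywords could then be stored as intent metadata of the document.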
- Operation 910 involves performing a semantic analysis of the event metadata 330.
- the semantic analysis may be a latent semantic analysis.
- the semantic analysis may include operation 920, which involves performing a comparison of textual information (e.g., text data) included in the event metadata 330.
- the determination module 640 may compare the phrase "espresso machine" (e.g., from event 201) to the phrase "gym bag" (e.g., from event 205) in performing the semantic analysis.
- the semantic analysis may include operation 930, which involves processing an aggregate of event metadata (e.g., event metadata 330) for multiple documents (e.g., document 310).
- the aggregate of event metadata may be received (e.g., by the reception module 660) from multiple client devices (e.g., the second client device 590) associated with multiple users (e.g., the second user).
- the reception module 660 may accumulate the aggregate over a period of time (e.g., three months), and the determination module 640 may process the accumulated aggregate at the end of the period.
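The comparison of textual information can be sketched with a simple token-overlap (Jaccard) measure. This is a deliberately simpler technique than latent semantic analysis and is shown only to make the comparison step concrete; the function name is an assumption.

```python
def jaccard_similarity(text_a, text_b):
    """Share of distinct tokens that two phrases have in common (0.0 to 1.0)."""
    tokens_a = set(text_a.lower().split())
    tokens_b = set(text_b.lower().split())
    union = tokens_a | tokens_b
    if not union:
        return 0.0
    return len(tokens_a & tokens_b) / len(union)


unrelated = jaccard_similarity("espresso machine", "gym bag")
related = jaccard_similarity("gym bag", "gym bag Y")
```

Under this measure, "espresso machine" and "gym bag" share no tokens, which suggests that the two requests reflect different intents, whereas "gym bag" and "gym bag Y" overlap strongly.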
- the determination module 640 determines the intent boundary 210 and accordingly determines that a subset of the events (e.g., requests) represented in the event metadata 330 correspond to the intent data and that the remainder of the events do not correspond to the intent data.
- the subset of the events is represented by the portion 335 of the event metadata 330.
- Operations 830 and 840 may be performed by the determination module 640 iteratively.
- the determination module 640 may initially estimate the intent boundary 210 using operation 830 and perform the semantic analysis 360 to determine the intent boundary 210.
- the determination module 640 may determine intent data for all of the event metadata 330 and accordingly determine the intent boundary 210 as a boundary of the portion 335, thus defining the intent boundary 210 and the portion 335 contemporaneously.
- the storage module 620 stores the intent data in the database 520 as the intent metadata 340 (e.g., intent metadata 420) of the document 310.
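One way to picture the intent boundary is as the earliest event in a contiguous run, ending at the document request, that still resembles that request. The sketch below walks backward from the final event and stops at the first event whose token overlap falls below a threshold. The overlap measure, the threshold value, and the sample event texts are assumptions for illustration, not the method of the source.

```python
def find_intent_boundary(event_texts, threshold=0.15):
    """Return the index of the first event that belongs to the intent portion.

    Walks backward from the final event (the request for the document) and
    stops at the first event whose token overlap with it is below threshold.
    """
    def overlap(a, b):
        ta, tb = set(a.lower().split()), set(b.lower().split())
        return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

    if not event_texts:
        return 0
    target = event_texts[-1]
    boundary = len(event_texts) - 1
    for i in range(len(event_texts) - 2, -1, -1):
        if overlap(event_texts[i], target) < threshold:
            break
        boundary = i
    return boundary


# Hypothetical texts for seven events; the last one requests the document.
events = [
    "espresso machine",
    "espresso machine X",
    "coffee grinder",
    "bag for exercise",
    "gym bag",
    "gym bag reviews",
    "gym bag Y",
]
boundary = find_intent_boundary(events)
portion = events[boundary:]
```

Events before the boundary (the espresso-related requests here) are treated as not corresponding to the intent data, and the remaining events form the portion.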
- the storage module 620 may store the intent metadata 340 as a file linked to the document 310 in the database 520.
- the storage module 620 may write the intent metadata 340 into the document header of the document 310.
- the index module 650 indexes the intent data stored as the intent metadata 340 in the database 520.
- the index module 650 may use any indexing algorithm to perform operation 860.
- FIG. 10 is a flow chart illustrating a method 1000 of document presentation using retrieval path data, according to some example embodiments.
- the method 1000 includes operations 1010-1060.
- the document 310 has been augmented using retrieval path data from a first user of the first client device 580.
- Methods 700 and 800 have been performed as described above.
- the document 310 has been stored in the database 520 with the portion 335 of the event metadata 330 and with the intent metadata 340.
- the document 310 and its metadata have been indexed by the index module 650.
- the retrieval path data is available for use by another user (e.g., a further user).
- a second user of the second client device 590 may submit a new request (e.g., a further request) to the network-based publication system.
- Event 251 is an example of such a new request.
- the document processing and presentation machine 510 responds to the new request and uses the retrieval path data (e.g., the portion 335 of the event metadata 330, or the intent metadata 340) to select the document 310 for presentation to the second user.
- the reception module 660 receives the new request from the second client device 590. This may be performed in a manner similar to operation 710 of method 700.
- the access module 610 accesses the intent metadata 340 of the document 310.
- the access module 610 accesses the portion 335 of the event metadata 330 of the document 310. Operation 1020, operation 1030, or both, may be performed in a manner similar to operation 720 of method 700. In the context of method 1000, the portion 335 includes a first request (e.g., event 207) made by the first user to retrieve the document 310 (e.g., the product page for "gym bag Y") to the first client device 580.
- the determination module 640 determines that the new request (e.g., event 251, the request to search for "gym bag") made by the second user is a variant of the first request (e.g., event 207, the request to search for "bag for exercise") made by the first user. This determination may be made based on the intent metadata 340, the portion 335 of the event metadata 330, or both. In alternative example embodiments, the determination module 640 determines that the new request is the same as the first request (e.g., the new request is a request for a search that uses the same search terms as the first request).
- the new request is similar to the first request, differing only in time (e.g., timestamp) and in destination. For example, where the first request was a request to retrieve a body of information to the first client device 580 on a Monday, the new request may be a request to retrieve the same body of information to the second client device 590 on the following Tuesday.
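The determination that a new request is the same as, or a variant of, the first request can be sketched as a comparison against both the first request and the intent keywords. The function name, the three labels, and the token-based matching rule are illustrative assumptions; the source leaves the matching technique open.

```python
def classify_request(new_request, first_request, intent_keywords):
    """Label a new request as 'same', 'variant', or 'unrelated'."""
    new_tokens = set(new_request.lower().split())
    first_tokens = set(first_request.lower().split())
    keyword_tokens = {keyword.lower() for keyword in intent_keywords}

    if new_tokens == first_tokens:
        return "same"
    # A shared term with the first request or with the intent keywords is
    # treated here as evidence of a variant request.
    if new_tokens & first_tokens or new_tokens & keyword_tokens:
        return "variant"
    return "unrelated"


label = classify_request("gym bag", "bag for exercise", ["gym", "bag"])
```

In this example, the search for "gym bag" is classified as a variant of the search for "bag for exercise", which is the condition under which the document would be selected for the second user.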
- the generator module 670 generates a web page (e.g., web page 400) that includes the document 310, some intent metadata (e.g., intent metadata 420), and some event metadata (e.g., event metadata 410). The effect of this is to allow the second user to view some retrieval path data when viewing the document 310.
- the server module 630 provides the generated web page (e.g., web page 400) to the second client device 590 in response to the determination performed in operation 1040.
- the server module 630 may be a web server module and serve the web page in a manner similar to providing the document 310 in operation 740 of method 700. Accordingly, the second user is presented with the document 310, augmented with retrieval path data, without having to follow the retrieval path of the first user.
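Generation of a web page that combines the document with some intent metadata and some event metadata might look like the following sketch. The page layout, the section headings, and the function name are assumptions; only the idea of including all three parts comes from the description above.

```python
from html import escape


def generate_web_page(document_html, intent_keywords, event_texts):
    """Assemble a page holding the document plus retrieval path data."""
    # Metadata is user-derived text, so it is escaped before insertion.
    intent_items = "".join(f"<li>{escape(k)}</li>" for k in intent_keywords)
    event_items = "".join(f"<li>{escape(e)}</li>" for e in event_texts)
    return (
        "<html><body>"
        f"<main>{document_html}</main>"
        f"<aside><h2>Related intent</h2><ul>{intent_items}</ul>"
        f"<h2>How others got here</h2><ol>{event_items}</ol></aside>"
        "</body></html>"
    )


page = generate_web_page(
    "<h1>Gym bag Y</h1>",
    ["gym", "bag"],
    ["bag for exercise", "gym bag <sale>"],
)
```

The document markup is passed through unchanged because it is assumed to be trusted content served by the publication system.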
- the method 1000 proceeds directly from operation 1010 to operation 1050.
- the reception module 660 may receive the new request from the second client device 590, and the new request may be a straightforward request to retrieve the document 310.
- a third-party web site may recommend the document 310 to its users and provide a direct hyperlink to the document 310, which is being served by the network-based publication system (e.g., the server module 630 of the document processing and presentation machine 510). From operation 1010, as indicated by an arrow in FIG. 10, the method 1000 proceeds to operation 1050, in which the generator module 670 generates the web page (e.g., web page 400).
- the generator module 670 may access the database 520 and accordingly perform operation 1020, operation 1030, or both. According to various example embodiments, the generator module 670 may cause the access module 610 to perform operation 1020, operation 1030, or both.
- the web page may have been previously generated by the generator module 670 and stored by the storage module 620 for future use (e.g., in a cache memory, or in the database 520).
- the method 1000 may proceed directly from operation 1010 to operation 1060, in which the server module 630 provides the web page to the second client device 590.
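The alternative paths through method 1000 can be summarized as a small dispatch function. This is a sketch under one reading of the description: it assumes that the cached-page shortcut applies to direct requests for the document, which the source does not state explicitly, and the function and parameter names are hypothetical.

```python
def plan_operations(is_direct_request, page_cached):
    """Return the sequence of operation numbers for one pass of method 1000."""
    if is_direct_request and page_cached:
        # Previously generated page: receive the request, then serve the page.
        return [1010, 1060]
    if is_direct_request:
        # Direct request: generate the page (the generator may itself access
        # the metadata, i.e., perform operations 1020 and 1030), then serve it.
        return [1010, 1050, 1060]
    # Search-like request: access metadata, test for a variant, generate, serve.
    return [1010, 1020, 1030, 1040, 1050, 1060]


full_path = plan_operations(is_direct_request=False, page_cached=False)
shortcut = plan_operations(is_direct_request=True, page_cached=True)
```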
- one or more of the methodologies described herein may facilitate an enhanced user experience for the second user by reducing time, effort, computing resources, network traffic, power usage, or any combination thereof, associated with browsing activities of the second user.
- the document processing and presentation machine 510 correlates a likely intent of the first user with a likely intent of the second user.
- the document processing and presentation machine 510 accordingly offers the second user a shortcut that abbreviates the retrieval path of the first user and leads the second user directly to the document 310.
- the second user may be able to satisfy his intent with significantly less browsing activity (e.g., requests) compared to the first user.
- all subsequent users may gain similar benefits.
- FIG. 11 illustrates components of a machine 1100, according to some example embodiments, that is able to read instructions from a machine-readable medium (e.g., machine-readable storage medium) and perform any one or more of the methodologies discussed herein.
- FIG. 11 shows a diagrammatic representation of the machine 1100 in the example form of a computer system and within which instructions 1124 (e.g., software) for causing the machine 1100 to perform any one or more of the methodologies discussed herein may be executed.
- the machine 1100 operates as a standalone device or may be connected (e.g., networked) to other machines.
- the machine 1100 may operate in the capacity of a server machine or a client machine in a server-client network environment, or as a peer machine in a peer-to-peer (or distributed) network environment.
- the machine 1100 may be a server computer, a client computer, a personal computer (PC), a tablet computer, a laptop computer, a netbook, a set-top box (STB), a personal digital assistant (PDA), a cellular telephone, a smartphone, a web appliance, a network router, a network switch, a network bridge, or any machine capable of executing the instructions 1124 (sequentially or otherwise) that specify actions to be taken by that machine.
- the machine 1100 includes a processor 1102 (e.g., a central processing unit (CPU), a graphics processing unit (GPU), a digital signal processor (DSP), an application specific integrated circuit (ASIC), a radio-frequency integrated circuit (RFIC), or any suitable combination thereof), a main memory 1104, and a static memory 1106, which are configured to communicate with each other via a bus 1108.
- the machine 1100 may further include a graphics display 1110 (e.g., a plasma display panel (PDP), a liquid crystal display (LCD), a projector, or a cathode ray tube (CRT)).
- the machine 1100 may also include an alphanumeric input device 1112 (e.g., a keyboard), a cursor control device 1114 (e.g., a mouse, a touchpad, a trackball, a joystick, a motion sensor, or other pointing device), a storage unit 1116, a signal generation device 1118 (e.g., a speaker), and a network interface device 1120.
- the storage unit 1116 includes a machine-readable medium 1122 on which is stored the instructions 1124 (e.g., software) embodying any one or more of the methodologies or functions described herein.
- the instructions 1124 may also reside, completely or at least partially, within the main memory 1104, within the processor 1102 (e.g., within the processor's cache memory), or both, during execution thereof by the machine 1100. Accordingly, the main memory 1104 and the processor 1102 may be considered as machine-readable media.
- the instructions 1124 may be transmitted or received over a network 1126 (e.g., network 550) via the network interface device 1120.
- the term "memory" refers to a machine-readable medium able to store data temporarily or permanently and may be taken to include, but not be limited to, random-access memory (RAM), read-only memory (ROM), buffer memory, flash memory, and cache memory. While the machine-readable medium 1122 is shown in an example embodiment to be a single medium, the term "machine-readable medium" should be taken to include a single medium or multiple media (e.g., a centralized or distributed database, or associated caches and servers) able to store instructions (e.g., instructions 1124).
- the term "machine-readable medium" shall also be taken to include any medium that is capable of storing instructions (e.g., software) for execution by the machine, such that the instructions, when executed by one or more processors of the machine (e.g., processor 1102), cause the machine to perform any one or more of the methodologies described herein.
- the term “machine-readable medium” shall accordingly be taken to include, but not be limited to, a data repository in the form of a solid-state memory, an optical medium, a magnetic medium, or any suitable combination thereof.
- Modules may constitute either software modules (e.g., code embodied on a machine-readable medium or in a transmission signal) or hardware modules.
- a "hardware module" is a tangible unit capable of performing certain operations and may be configured or arranged in a certain physical manner.
- one or more computer systems (e.g., a standalone computer system, a client computer system, or a server computer system) or one or more hardware modules of a computer system (e.g., a processor or a group of processors) may be configured by software (e.g., an application or application portion) as a hardware module that operates to perform certain operations.
- a hardware module may be implemented mechanically, electronically, or any suitable combination thereof.
- a hardware module may include dedicated circuitry or logic that is permanently configured to perform certain operations.
- a hardware module may be a special-purpose processor, such as a field programmable gate array (FPGA) or an application-specific integrated circuit (ASIC).
- a hardware module may also include programmable logic or circuitry that is temporarily configured by software to perform certain operations.
- a hardware module may include software encompassed within a general-purpose processor or other programmable processor. It will be appreciated that the decision to implement a hardware module mechanically, in dedicated and permanently configured circuitry, or in temporarily configured circuitry (e.g., configured by software) may be driven by cost and time considerations.
- the term "hardware module" should be understood to encompass a tangible entity, be that an entity that is physically constructed, permanently configured (e.g., hardwired), or temporarily configured (e.g., programmed) to operate in a certain manner or to perform certain operations described herein.
- “hardware-implemented module” refers to a hardware module. Considering embodiments in which hardware modules are temporarily configured (e.g., programmed), each of the hardware modules need not be configured or instantiated at any one instance in time. For example, where the hardware modules comprise a general-purpose processor configured using software, the general-purpose processor may be configured as respective different hardware modules at different times. Software may accordingly configure a processor, for example, to constitute a particular hardware module at one instance of time and to constitute a different hardware module at a different instance of time.
- Hardware modules can provide information to, and receive information from, other hardware modules. Accordingly, the described hardware modules may be regarded as being communicatively coupled. Where multiple hardware modules exist contemporaneously, communications may be achieved through signal transmission (e.g., over appropriate circuits and buses) that connects the hardware modules. In embodiments in which multiple hardware modules are configured or instantiated at different times, communications between such hardware modules may be achieved, for example, through the storage and retrieval of information in memory structures to which the multiple hardware modules have access. For example, one hardware module may perform an operation and store the output of that operation in a memory device to which it is communicatively coupled. A further hardware module may then, at a later time, access the memory device to retrieve and process the stored output. Hardware modules may also initiate communications with input or output devices, and can operate on a resource (e.g., a collection of information).
- processors may be temporarily configured (e.g., by software) or permanently configured to perform the relevant operations. Whether temporarily or permanently configured, such processors may constitute processor-implemented modules that operate to perform one or more operations or functions described herein.
- processor-implemented module refers to a hardware module implemented using one or more processors.
- the methods described herein may be at least partially processor-implemented. For example, at least some of the operations of a method may be performed by one or more processors or processor-implemented modules. The performance of certain of the operations may be distributed among the one or more processors, not only residing within a single machine, but deployed across a number of machines. In some example embodiments, the processor or processors may be located in a single location (e.g., within a home environment, an office environment or as a server farm), while in other embodiments the processors may be distributed across a number of locations.
- the one or more processors may also operate to support performance of the relevant operations in a "cloud computing" environment or as a “software as a service” (SaaS). For example, at least some of the operations may be performed by a group of computers (as examples of machines including processors), these operations being accessible via a network (e.g., the Internet) and via one or more appropriate interfaces (e.g., an application program interface (API)).
- the performance of certain of the operations may be distributed among the one or more processors, not only residing within a single machine, but deployed across a number of machines.
- the one or more processors or processor-implemented modules may be located in a single geographic location (e.g., within a home environment, an office environment, or a server farm). In other example embodiments, the one or more processors or processor-implemented modules may be distributed across a number of geographic locations.
- displaying may refer to actions or processes of a machine (e.g., a computer) that manipulates or transforms data represented as physical (e.g., electronic, magnetic, or optical) quantities within one or more memories (e.g., volatile memory, non-volatile memory, or any suitable combination thereof), registers, or other machine components that receive, store, transmit, or display information.
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Radar, Positioning & Navigation (AREA)
- Remote Sensing (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
According to the invention, the browsing activity of a first user is motivated by a certain intent. The first user requests retrieval of a particular document while browsing. A document processing and presentation machine associates the document with a retrieval path taken by the first user. Using the retrieval path data of the document, the document processing and presentation machine infers an intent likely to have motivated the first user. When a second user makes a request similar to a request in the retrieval path, the machine presents the second user with the document and some of the retrieval path data, thereby providing the second user with a shortcut that leads the second user directly to the document. Thus, the second user may be able to satisfy his intent with significantly less browsing activity compared to the first user.
Applications Claiming Priority (6)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US12/717,082 US20110219029A1 (en) | 2010-03-03 | 2010-03-03 | Document processing using retrieval path data |
US12/717,088 US20110218883A1 (en) | 2010-03-03 | 2010-03-03 | Document processing using retrieval path data |
US12/717,091 US20110219030A1 (en) | 2010-03-03 | 2010-03-03 | Document presentation using retrieval path data |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2011109516A2 (fr) | 2011-09-09 |
WO2011109516A3 (fr) | 2012-01-05 |
Family
ID=44542823
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2011/026867 WO2011109516A2 (fr) | Document processing using retrieval path data | 2010-03-03 | 2011-03-02 |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2011109516A2 (fr) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO1997045786A1 (fr) * | 1996-05-24 | 1997-12-04 | V-Cast, Inc. | Systeme client/serveur, destine a fournir des informations en direct |
JP2002092379A (ja) * | 2000-09-20 | 2002-03-29 | Nec Corp | インターネットを用いた電子取引システム及びその方法 |
WO2004066163A1 (fr) * | 2003-01-24 | 2004-08-05 | British Telecommunications Public Limited Company | Dispositifs et procedes de recherche |
US20070136272A1 (en) * | 2005-12-14 | 2007-06-14 | Amund Tveit | Ranking academic event related search results using event member metrics |
US20080040321A1 (en) * | 2006-08-11 | 2008-02-14 | Yahoo! Inc. | Techniques for searching future events |
- 2011-03-02 WO PCT/US2011/026867 patent/WO2011109516A2/fr active Application Filing
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2014116361A1 (fr) * | 2013-01-25 | 2014-07-31 | Ebay Inc. | Systèmes et procédés pour apparier des états de page |
US10025760B2 (en) | 2013-01-25 | 2018-07-17 | Ebay Inc. | Mapping page states to URLs |
Also Published As
Publication number | Publication date |
---|---|
WO2011109516A3 (fr) | 2012-01-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11449719B2 (en) | Image evaluation | |
JP5945332B2 (ja) | パーソナライズ情報転送方法および装置 | |
US11829430B2 (en) | Methods and systems for social network based content recommendations | |
US10636075B2 (en) | Methods and apparatus for querying a database for tail queries | |
US20150310392A1 (en) | Job recommendation engine using a browsing history | |
US11526570B2 (en) | Page-based prediction of user intent | |
US9552144B2 (en) | Item preview with aggregation to a list | |
US11416482B2 (en) | Adaptive search refinement | |
WO2015191622A1 (fr) | Systèmes et procédés de traitement de filtres | |
US20130263044A1 (en) | Method and system to provide a scroll map | |
US20110219030A1 (en) | Document presentation using retrieval path data | |
EP2778979A1 (fr) | Classement de résultats de recherche par marque | |
US9984403B2 (en) | Electronic shopping cart processing system and method | |
US20230177087A1 (en) | Dynamic content delivery search system | |
US10354318B2 (en) | Providing an image of an item to advertise the item | |
US9881027B2 (en) | Image appended search string | |
US20110218883A1 (en) | Document processing using retrieval path data | |
US20120159368A1 (en) | Search history navigation | |
US20150221014A1 (en) | Clustered browse history | |
US20110219029A1 (en) | Document processing using retrieval path data | |
US20140358819A1 (en) | Tying Objective Ratings To Online Items | |
US10185982B1 (en) | Service for notifying users of item review status changes | |
WO2011109516A2 (fr) | Document processing using retrieval path data | |
US20130262447A1 (en) | Method and system to provide inline refinement of on-line searches | |
US10891659B2 (en) | Placing resources in displayed web pages via context modeling |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| NENP | Non-entry into the national phase in: | Ref country code: DE |
| 122 | Ep: pct application non-entry in european phase | Ref document number: 11751288 Country of ref document: EP Kind code of ref document: A2 |