US20100138403A1 - System for modifying queries before presentation to a sponsored search generator or other matching system where modifications improve coverage without a corresponding reduction in relevance - Google Patents
- Publication number
- US20100138403A1
- Authority
- US
- United States
- Prior art keywords
- query
- search
- modified
- units
- sponsored
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9532—Query formulation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10—TECHNICAL SUBJECTS COVERED BY FORMER USPC
- Y10S—TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10S707/00—Data processing: database and file management or data structures
- Y10S707/99931—Database or file accessing
- Y10S707/99933—Query processing, i.e. searching
Definitions
- the present invention relates to search systems generally, wherein a query is processed to return search results, and more particularly to techniques for modifying queries before presentation to a sponsored search generator or other matching system where modifications improve coverage without a corresponding reduction in relevance.
- a search process involves a user inputting a query to the search system and the search system returning one or more search results (“hits”) that are deemed responsive to the query.
- Many search providers also display sponsored links along with the search results, where the main search results result from searching a corpus such as a collection of Web pages referenced by an index and where the sponsored links are found in a database of sponsored links set up to supply relevant links to searchers on behalf of sponsors.
- the sponsored links that are provided are relevant to the query. For example, if a searcher (which can be a person, a person using a computer, or a computer) submits a search query in the form of a search query string such as “European vacation”, the search engine might find pages from the Web that are deemed to relate to vacationing in Europe.
- the sponsored search links might be found from the sponsored link database according to purchased keywords.
- a travel agent sponsor might pay to have a link they devise presented to queriers that use “European vacation” in their search query.
- sponsored advertising links are sold using a “pay per click” model, wherein the search system might present a sponsored link, but the sponsor only pays the search system operator when and if the querier clicks on the sponsored link.
- the search system operator would like to ensure that the sponsored links are relevant to the search. If, for example, sponsored links for auto repair are displayed with search results for vacations, it is not likely that the reader will be interested, and such links would have a very low click-through rate and the search system operator would not see much revenue. On the other end, if the search system is too strict about what it shows, insufficient coverage might result.
- a sponsor typically identifies in advance one or more search query strings that should trigger the display of the sponsor's presentation.
- Each sponsor's presentation might be indexed against one or more of these pre-selected search query strings.
- the search system attempts to match the search query with as many of the search query strings that have been pre-selected by the sponsors as possible.
- a sponsor presentation could be displayed along with the other search results.
- there might be search queries that would have no matching presentations.
- the search query “John Q. Public's Daily Breakfast Menu” might not attract any interested sponsors, so users submitting that as a search would not see any sponsored links.
- ideally, the “coverage” of search queries would be such that a large proportion of the searches performed would be covered by at least one relevant sponsored presentation. Otherwise, where search queries are not covered by any sponsored presentations, the search system operator would not see any sponsored presentation revenue for those search queries.
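The notion of coverage described above can be made concrete with a small calculation. The following is a minimal sketch, assuming coverage is measured as the fraction of queries that exactly match at least one sponsor-selected string; the function name `coverage`, the case-insensitive comparison, and the sample data are illustrative assumptions, not taken from the application.

```python
def coverage(queries, sponsored_strings):
    """Fraction of queries that exactly match at least one sponsored string."""
    if not queries:
        return 0.0
    # Assumption: matching ignores letter case.
    sponsored = {s.lower() for s in sponsored_strings}
    covered = sum(1 for q in queries if q.lower() in sponsored)
    return covered / len(queries)


# Hypothetical data for illustration only.
queries = ["European vacation", "John Q. Public's Daily Breakfast Menu"]
sponsored_strings = ["european vacation", "printers"]
print(coverage(queries, sponsored_strings))
```

Under this definition, any query modification that turns an unmatched query into a matched one raises the ratio; whether relevance is preserved has to be judged separately.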
- the present invention provides techniques for modifying queries before presentation to a sponsored search generator or other matching system where modifications improve coverage without a corresponding reduction in relevance.
- a query modification system might be used to process a user's query to form a modified query that is in turn submitted to a sponsored search system to return sponsored searches with improved coverage while maintaining relevance.
- the techniques can be used where the modified queries are submitted to other than a sponsored search system.
- the modified queries might be used to improve matchmaking such as finding a potential customer for a sponsor or finding a potential provider of suitable products and/or services for a potential customer.
- the modified query can then be used to obtain sponsored presentations by matching against a listing of search query strings that have been pre-selected by sponsors or other methods. Each pre-selected search query strings might correspond to one or more sponsored web links. If the modified query matches one of the pre-selected search query strings, corresponding sponsored web links are returned and displayed to the user.
- the input to the sponsored search system can be the modified query alone or the modified query and the original query.
- a modified query might be generated by leaving off words, substituting phrases, differentially weighting “units” of a search, and/or using associations between units. The weighting of units might be done based on how frequently units appeared in previous search queries, the length of the units and associations between units. In some cases, weighting of units is leveraged to decide which unit to drop from the search query string to form the modified query string.
- FIG. 1 is a diagram of an Internet communications system that can implement embodiments of the present invention.
- FIG. 2A is a generalized diagram illustrating how a query modification system interacts with a web search system according to an embodiment of the present invention.
- FIG. 2B is a diagram illustrating a specific example of how a query modification system modifies queries after they are transmitted to a sponsored listings system according to an embodiment of the present invention.
- FIG. 2C illustrates another specific example of how a query modification system modifies queries before they are transmitted to a sponsored listings system according to an embodiment of the present invention.
- FIG. 3 is a flowchart that illustrates a general methodology for modifying search queries to increase the number of matching sponsored listings according to the present invention.
- FIG. 4 comprises flowcharts illustrating more specific examples of methodologies for modifying search queries;
- FIG. 4A illustrates a method of increasing the number of matching sponsored listings by identifying more specific units in the search query;
- FIG. 4B illustrates a method of identifying longer sets of units in the search query;
- FIG. 4C illustrates a method of identifying frequently occurring unit associations in the search query.
- FIG. 5 comprises flowcharts for a process of evaluating a query to determine matches for matching against bidded terms; FIGS. 5A and 5B together form FIG. 5 .
- FIG. 6 comprises flowcharts for a process of evaluating a query to determine matches for matching against bidded terms using a plurality of units for checking against;
- FIGS. 6A and 6B together form FIG. 6 .
- FIG. 1 illustrates a general overview of an information retrieval and communication network 100 including a client system 120 according to an embodiment of the present invention.
- client system 120 can communicate through the Internet 140, or other communication network, e.g., over any LAN or WAN connection, with a plurality of server systems 150_1 to 150_N.
- server systems 150_1 to 150_N can communicate with search result server 160.
- client system 120 is configured according to the present invention to communicate with any of server systems 150_1 to 150_N and 160, e.g., to access, receive, retrieve and display media content and other information such as web pages and web sites.
- client system 120 could include a desktop personal computer, workstation, laptop, PDA, cell phone, or any WAP-enabled device or any other computing device capable of interfacing directly or indirectly to the Internet.
- Client system 120 typically runs an HTTP client, e.g., a browsing program, such as Microsoft's Internet Explorer™ browser, Netscape Navigator™ browser, Mozilla™ browser, Opera™ browser, or a WAP-enabled browser in the case of a cell phone, PDA or other wireless device, or the like, allowing a user of client system 120 to access, process and view information and pages available to it from server systems 150_1 to 150_N over Internet 140.
- Client system 120 also typically includes one or more user interface devices 122, such as a keyboard, a mouse, touch-screen, pen or the like, for interacting with a graphical user interface (GUI) provided by the browser on a display (e.g., monitor screen, LCD display, etc.), in conjunction with pages, forms and other information provided by server systems 150_1 to 150_N or other servers.
- the present invention is suitable for use with the Internet, which refers to a specific global internetwork of networks. However, it should be understood that other networks can be used instead of or in addition to the Internet, such as an intranet, an extranet, a virtual private network (VPN), a non-TCP/IP based network, any LAN or WAN or the like.
- client system 120 and all of its components are operator configurable using an application including computer code run using a central processing unit such as an Intel Pentium™ processor, AMD Athlon™ processor, or the like, or multiple processors.
- Computer code for operating and configuring client system 120 to communicate, process and display data and media content as described herein is preferably downloaded and stored on a hard disk, but the entire program code, or portions thereof, may also be stored in any other volatile or non-volatile memory medium or device as is well known, such as a ROM or RAM, or provided on any media capable of storing program code, such as a compact disk (CD) medium, a digital versatile disk (DVD) medium, a floppy disk, and the like.
- the entire program code, or portions thereof, may be transmitted and downloaded from a software source, e.g., from one of server systems 150_1 to 150_N to client system 120 over the Internet as is well known, or transmitted over any other conventional network connection as is well known (e.g., extranet, VPN, LAN, etc.) using any communication medium and protocols (e.g., TCP/IP, HTTP, HTTPS, Ethernet, etc.) as are well known.
- computer code for implementing aspects of the present invention can be implemented in any programming language that can be executed on a client system such as, for example, C, C++, HTML, XML, Java, JavaScript, or any scripting language such as VBScript.
- no code is downloaded to client system 120 , and needed code is executed by a server, or code already present at client system 120 is executed.
- a client application (represented as module 125) executing on client system 120 includes instructions for controlling client system 120 and its components to communicate with server systems 150_1 through 150_N and 160 and to process and display data content received therefrom.
- client application module 125 includes various software modules for processing data and media content.
- application module 125 can include one or more of a search module 126 for processing search requests and search result data, a user interface module 127 for rendering data and media content in text and data frames and active windows, e.g., browser windows and dialog boxes, and an application interface module 128 for interfacing and communicating with various applications executing on client 120 .
- interface module 127 can include a browser, such as a default browser configured on client system 120 or a different browser.
- search result server 160 is configured to provide search result data and media content to client system 120
- server systems 150 are configured to provide data and media content such as web pages to client system 120 , for example, in response to links selected in search result pages provided by server system 160
- Server system 160 in one embodiment references various collection technologies for collecting information from the World Wide Web and for populating one or more indexes with, for example, pages, links to pages, etc. Such collection technologies include automatic web crawlers, spiders, etc., as well as manual or semi-automatic classification algorithms and interfaces for classifying and ranking web pages within a hierarchical structure.
- server 160 is also configured with search related algorithms for processing and ranking web pages.
- Server 160 is also preferably configured to record user query activity in the form of query log files.
- Server system 160 in one aspect, is configured to provide data responsive to various search requests received from a client system, in particular search module 126 .
- Server systems 150 and 160 can be part of a single organization, e.g., a distributed server system such as that provided to users by Yahoo! Inc., or they can be part of disparate organizations.
- Server systems 150 and server system 160 each includes at least one server and an associated database system, and may include multiple servers and associated database systems, and although shown as a single block, may be geographically distributed.
- the servers of server system 160 can be located in close proximity to one another (e.g., in a server farm located in a single building or campus), or they may be distributed at locations remote from one another (e.g., one or more servers located in city A and one or more servers located in city B).
- server system will typically include one or more logically and/or physically connected servers distributed locally or across one or more geographic locations.
- server typically includes a computer system and an associated storage system and database application as is well known in the art. The terms “server” and “server system” will be used interchangeably herein.
- server 160 includes algorithms that provide search results to users in response to search queries received from client system 120 .
- server system 160 is configured to increase coverage of search queries received from client system 120 without a corresponding decrease in relevance.
- FIG. 2A is a generalized diagram illustrating how a query modification system 174 interacts with a web search system according to an embodiment of the present invention.
- a search query 170 is transmitted to a search engine 175 to initiate a search of the Internet.
- Search engine 175 can implement any Internet or web searching methods such as a crawling indexer.
- Search engine 175 locates web content matching search query 170 from a search corpus 190 .
- Search corpus 190 can store copies of content that is accessible via the World Wide Web, the Internet, intranets, local networks, and wide area networks.
- Search engine 175 retrieves content from search corpus 190 matching search query 170 and transmits the matching content (i.e., search results) to a page assembler 180 .
- Page assembler 180 displays the search results in a readable format.
- the search results are displayed to a user as a listing of web search results in search result display screen 185 .
- Search queries that are transmitted to search engine 175 are also sent to sponsored listings system 179 through query modification system 174 .
- Sponsored listings system 179 selects sponsored web links to display in response to receiving a search query.
- the sponsored web links from a sponsored listing database 178 are sent to page assembler 180 and displayed in a portion of search result display screen 185 .
- sponsors' web sites can be displayed in a sponsor section of screen 185 when a search query matches a predefined search query string.
- for example, a hardware vendor might want to promote its printers by having a web page about printers from that vendor's website be pointed to by a sponsored link that would appear in the sponsored links region of display screen 185 when a web user enters the term “printers” as a search query.
- the hardware vendor would pay the search system operator for each time a user clicks on the printers sponsored link as that link is displayed in the sponsored links portion of display screen 185 .
- Sponsored listings database 178 might contain records mapping search query strings to sponsors and sponsor's presentations, wherein a presentation might be a short text sequence and a web link. The mappings might be determined by a bidding process or other process used to assign search query strings to sponsors. Using that database 178 , when sponsored listings system 179 receives a search query, modified or otherwise, sponsored listings system 179 determines whether the search query matches one of the predefined search query strings that are in the database.
- sponsored listings system 179 retrieves the sponsored web links that are indexed with that search query string. The selected sponsored web links are transmitted to page assembler 180 . If the search query does not exactly match one of the indexed search query strings, sponsored listings system 179 does not return any sponsored web links. Thus, failing to locate an exact match between a search query entered by a user and one of the indexed search query strings prevents the search provider from receiving revenue from a sponsor.
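The exact-match behavior described above can be sketched as a simple keyed lookup. This is a minimal illustration under stated assumptions, not the actual design of sponsored listings system 179: the class name `SponsoredListings`, its methods, the whitespace and case normalization, and the example URL are all hypothetical.

```python
from collections import defaultdict


class SponsoredListings:
    """Toy stand-in for a sponsored listings database with exact matching."""

    def __init__(self):
        # Maps a normalized pre-selected query string to its sponsored links.
        self._index = defaultdict(list)

    @staticmethod
    def _normalize(text):
        # Assumption: matching ignores case and repeated whitespace.
        return " ".join(text.lower().split())

    def add(self, query_string, link):
        """Index a sponsored link under a sponsor-selected query string."""
        self._index[self._normalize(query_string)].append(link)

    def lookup(self, query):
        """Return links for an exact match, or an empty list otherwise."""
        return list(self._index.get(self._normalize(query), []))


listings = SponsoredListings()
listings.add("printers", "https://vendor.example/printers")
print(listings.lookup("Printers"))        # exact match after normalization
print(listings.lookup("laser printers"))  # no exact match, so no links
```

The second lookup shows the coverage problem: a query that is clearly about printers returns nothing because it is not identical to an indexed string.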
- the query modification system attempts to mitigate this problem by modifying search queries that are transmitted to sponsored listings system 179 when the search queries do not have a match.
- sponsors pre-select one or more search query strings.
- the search query strings that a sponsor selects are indexed with that sponsor's web link and when a user submits a search query, if there are no matches, sponsored listings system 179 would otherwise not return any sponsor web links or presentations.
- Query modification system 174 generates modified search queries from the search queries where the modified search queries are more likely to match one or more of the indexed search query strings pre-selected by the sponsors, thus increasing coverage, but do so in a way that there is not a corresponding decrease in relevance. Further details of how query modification system 174 might modify queries are described below with respect to FIGS. 4A-4C .
- FIG. 2B illustrates one particular system for modifying search queries so that they are more likely to match one of the sponsor-selected search query strings.
- search query modification system 174 initially forwards all search queries 170 directly to sponsored listings system 179 without modifying them.
- System 174 also stores copies of the search queries it sends to system 179 .
- Sponsored listings system 179 attempts to match the search query with one of the indexed search query strings as discussed above and returns the corresponding sponsored web links to query modification system 174 . If sponsored listings system 179 returns at least a predetermined number of sponsored web links, these links are sent directly to page assembler 180 .
- query modification system 174 then changes search query 170 into a new query to increase the chance that the new query will match more of the pre-selected sponsored query strings.
- the new query is transmitted from system 174 back to sponsored listings system 179 .
- System 179 attempts to match the new query against the sponsored search query strings. If a new set of sponsored links are identified as matching the new query, the new set of sponsored links are transmitted to page assembler 180 .
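The FIG. 2B flow, in which the original query is tried first and a modified query is submitted only when too few links come back, might be sketched as follows. The function name, the `lookup` and `modify` callables, the `min_links` threshold, and the choice to fall back to the original links when the retry finds nothing are assumptions made for illustration.

```python
def sponsored_links_with_fallback(query, lookup, modify, min_links=1):
    """Try the original query first; retry with a modified query if needed."""
    links = lookup(query)
    if len(links) >= min_links:
        return links  # enough links: send them straight to the page assembler
    new_query = modify(query)
    if new_query != query:
        new_links = lookup(new_query)
        if new_links:
            return new_links
    # Assumption: if the retry finds nothing, keep whatever the original returned.
    return links


# Hypothetical index and modification rule for illustration only.
index = {"trip to europe": ["https://travel.example/europe"]}
lookup = lambda q: index.get(q.lower(), [])
modify = lambda q: q.replace("10 day ", "")
print(sponsored_links_with_fallback("10 day trip to Europe", lookup, modify))
```

Because the wrapper only calls `lookup`, the sponsored listings system itself needs no internal changes.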
- FIG. 2C illustrates another system for modifying search queries so that they are more likely to match more of the sponsor-selected search query strings.
- query modification system 174 modifies all search queries 170 that it receives before the queries are transmitted to sponsored listings system 179 , using a knowledge base 199 .
- Knowledge base 199 stores sets of rules that are used to increase the coverage of search queries with respect to the sponsored links.
- the modified query is transmitted to sponsored listings system 179 .
- System 179 locates sponsored links that match the modified query and transmits the results to page assembler 180 .
- the original query and the modified query are provided to sponsored listings system 179 .
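The FIG. 2C arrangement, where every query is rewritten using rules from a knowledge base before it reaches the sponsored listings system, could look roughly like the sketch below. The application does not specify the rule format, so the dictionary layout of `KNOWLEDGE_BASE`, the two rule types (word dropping and phrase substitution), and all sample rules are purely hypothetical.

```python
import re

# Hypothetical rule set; the real contents of knowledge base 199 are not given.
KNOWLEDGE_BASE = {
    "drop_words": {"the", "best"},
    "substitutions": {"nyc": "new york city"},
}


def apply_rules(query, kb):
    """Drop listed words, then substitute listed phrases (whole words only)."""
    words = query.lower().split()
    text = " ".join(w for w in words if w not in kb["drop_words"])
    for phrase, replacement in kb["substitutions"].items():
        text = re.sub(rf"\b{re.escape(phrase)}\b", replacement, text)
    return text


def modify_before_lookup(query, kb):
    """Return (original, modified) so a caller can submit either or both."""
    return query, apply_rules(query, kb)


print(modify_before_lookup("the best hotels in NYC", KNOWLEDGE_BASE))
```

Returning both queries leaves the caller free to submit the modified query alone or together with the original.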
- queries are modified by the query modification system while the original query is submitted to a search engine.
- the query modification occurs at the client side, in others it occurs at the location of the search engine and in yet others, it occurs at a different place in a network.
- the search system operator can provide the query modification system external to the sponsored listings system and treat the sponsored listing system as a “black box” with no internal modifications.
- one enhancement provides feedback by noting the click-throughs that occur for particular search queries and using that information in deciding how to modify search query terms.
- the click-through rates are an indication of relevance and those indications can be used to select from among several options for modified queries.
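One way to use click-through feedback when choosing among candidate modified queries is sketched below. It is an assumption-laden illustration: the function name, the `stats` mapping of query to (clicks, impressions), and the sample numbers are invented for the example.

```python
def pick_by_click_through(candidates, stats):
    """Pick the candidate query with the highest observed click-through rate.

    stats maps a query string to a (clicks, impressions) pair; queries with
    no recorded impressions are treated as having a rate of 0.0.
    candidates must be non-empty.
    """
    def ctr(query):
        clicks, impressions = stats.get(query, (0, 0))
        return clicks / impressions if impressions else 0.0

    return max(candidates, key=ctr)


# Hypothetical click statistics for illustration only.
stats = {"trip to europe": (30, 1000), "10 day": (2, 1000)}
print(pick_by_click_through(["10 day", "trip to europe"], stats))
```

Ties resolve to the earliest candidate, so the caller can order candidates by preference.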
- FIG. 3 is a flowchart that illustrates a general methodology for modifying search queries to increase coverage.
- query modification system 174 receives a search query from a user at step 301 .
- query modification system 174 modifies the search query using rules designed to increase the number of sponsored search query strings that the search query received at step 301 matches, without corresponding loss of relevance. Many embodiments of these rules are possible. Examples of rules that can increase the number of matching sponsored search query strings are described below with respect to FIGS. 4A-4C .
- Each sponsored search query string is indexed with one or more sponsored links in sponsored listing system 179 .
- Query modification system 174 can modify search queries before or after they have been sent to system 179, as discussed above with respect to FIGS. 2B-2C.
- system 179 attempts to locate sponsored search query strings that match the modified search query. If sponsored search query strings are matched at step 303 , system 179 returns the sponsored links that correspond to the matched query strings at step 304 .
- the number of matching sponsored listings can be increased by identifying units in a search query that appeared less frequently in previous search queries.
- Search queries can be decomposed into constituent parts referred to as units.
- a query processing engine can decompose a search query into one or more constituent units using statistical methods.
- a unit is a sequence of one or more words that typically corresponds to a natural concept such as “New York City” or “bird of prey.” Further details of techniques for generating concept units from search queries are discussed in co-pending and commonly-assigned U.S. patent application Ser. No. 10/713,576, filed Nov. 12, 2003, which is incorporated by reference herein.
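To illustrate what decomposing a query into units produces, the sketch below uses a greedy longest-match against a set of already known multi-word units. This is a simplified stand-in, not the statistical method of the referenced application; the function name, the `known_units` set, and the `max_len` limit are assumptions.

```python
def split_into_units(query, known_units, max_len=4):
    """Greedily split a query into the longest known units, left to right."""
    words = query.lower().split()
    units = []
    i = 0
    while i < len(words):
        # Try the longest span first; a single word is always accepted.
        for n in range(min(max_len, len(words) - i), 0, -1):
            candidate = " ".join(words[i:i + n])
            if n == 1 or candidate in known_units:
                units.append(candidate)
                i += n
                break
    return units


# Hypothetical unit dictionary for illustration only.
known_units = {"new york city", "bird of prey"}
print(split_into_units("New York City hotels", known_units))
```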
- each of the units in a search query is compared to previously submitted search queries.
- previously submitted search queries are stored for later use.
- Weight values are assigned to units in the search query based on the relative frequency that the units appeared in previously submitted search queries. Units that appeared less frequently in previous searches are given a higher weight, and units that appeared more frequently in previous search queries are given a lower weight.
- Units that have appeared less frequently in past search queries correspond to more specific concepts. The less frequently occurring units are more likely to be a good approximation of the user's true intent in entering the search query. The units that appeared more frequently in previous search queries are more generic and less likely to be a good approximation of the user's intent.
- Query modification system 174 drops units in a search query that have lower weights.
- the original search query is modified to contain only the units in the original query that appeared less frequently in previous search queries relative to the other units in the original query. This feature allows more frequently occurring units in a search query to be filtered out to increase the coverage of sponsored listings.
- the modified search query has fewer units. Queries shortened in this way have an increased chance of matching a larger number of sponsored listings in system 179 .
- when sponsored listings system 179 processes the modified search query, it is likely to return more sponsored links than when it processes the original query.
- this embodiment generally increases the coverage of sponsored links that are returned by system 179 without a corresponding decrease in relevance.
- a user can enter a search query for a “10 day trip to Europe” to locate travel information to help plan a European vacation.
- This search query includes two concepts, “10 day” and “trip to Europe.” However, the concept “trip to Europe” is more relevant to the user's intent (planning a European vacation) than the concept “10 day.” Many travel web sites relating to European vacations do not include the phrase “10 day.”
- Sponsored listings system 179 may not return sponsored links to European travel web sites that do not mention “10 day.”
- the units “10 day” and “trip to Europe” are compared to previous search queries to determine how frequently these units appeared. Because “10 day” appears more frequently than “trip to Europe,” the unit “10 day” is dropped from the search query.
- the modified search query only contains “trip to Europe.” The modified query “trip to Europe” has a greater chance of exactly matching more sponsored search query strings than “10 day trip to Europe.”
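The weighting and dropping steps in the “10 day trip to Europe” example can be sketched as below. The application only says that less frequent units get higher weights; the specific formula `1 / (1 + count)`, the function names, and the counts are assumptions chosen for illustration.

```python
def weight_units(units, unit_counts):
    """Give rarer units a higher weight (assumed formula: 1 / (1 + count))."""
    return {u: 1.0 / (1 + unit_counts.get(u, 0)) for u in units}


def drop_lowest_weight(units, unit_counts):
    """Remove the unit with the lowest weight, i.e. the most frequent one."""
    if len(units) <= 1:
        return list(units)  # never reduce a query to nothing
    weights = weight_units(units, unit_counts)
    lowest = min(units, key=weights.get)
    return [u for u in units if u != lowest]


# Hypothetical counts from previously submitted queries.
unit_counts = {"10 day": 5000, "trip to europe": 800}
print(drop_lowest_weight(["10 day", "trip to europe"], unit_counts))
```

With these counts, “10 day” is the more frequent unit, so it receives the lower weight and is dropped, leaving “trip to europe” as the modified query.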
- FIG. 4A illustrates a methodology according to this embodiment of the present invention.
- query modification system 174 receives a search query.
- system 174 modifies the query by dropping the units that appear more frequently in previously submitted queries.
- the modified search query only contains the units that appeared less frequently in previous queries relative to other units in the original search query.
- sponsored listings system 179 attempts to locate sponsored search query strings that match the modified search query.
- system 179 returns a list of sponsored web links corresponding to the matched search query strings.
- the units in a search query are compared with previously submitted search queries to determine how often groups of units in the search query appear in the previous search queries.
- a log of queries can be used to determine how frequently units occur, how frequently they occur in various combinations, etc. and that information can be used to determine how best to modify the search query to improve coverage without a corresponding decrease in relevance.
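A query log can be turned into the frequency tables mentioned above with a few counters. This is a minimal sketch, assuming each logged query has already been split into units; the function name `count_units` and the sample log are hypothetical, and only pairs of units are counted for brevity.

```python
from collections import Counter
from itertools import combinations


def count_units(query_log):
    """Count how often each unit and each pair of units appears in a log.

    query_log is an iterable of queries, each given as a list of units.
    A unit or pair is counted at most once per query.
    """
    single = Counter()
    pairs = Counter()
    for units in query_log:
        unique = sorted(set(units))
        single.update(unique)
        # Pairs are stored as alphabetically ordered tuples.
        pairs.update(combinations(unique, 2))
    return single, pairs


# Hypothetical log for illustration only.
log = [["seattle", "cheap", "hotel"], ["seattle", "hotel"], ["cheap", "flights"]]
single, pairs = count_units(log)
print(single["seattle"], pairs[("hotel", "seattle")])
```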
- query modification system 174 might modify a search query by eliminating units or groups of units that appeared less frequently. System 174 might also drop shorter groups of units from the search query.
- for a query such as “Seattle cheap hotel”, query modification system 174 might use query logs to determine the frequency of each combination of units in the query in previous queries and find that “cheap hotel” and “Seattle cheap hotel” appear more frequently than “Seattle cheap”. In that case, query modification system 174 would not modify the query because the longest string is also one of the most frequent. However, if “Seattle hotel” appears much more frequently than “Seattle cheap hotel”, query modification system 174 might modify the query to be “Seattle hotel”.
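The “Seattle cheap hotel” decision can be sketched as a comparison of combination frequencies. The application does not quantify “much more frequently”, so the `ratio` threshold, the function name, and the counts below are assumptions for illustration only.

```python
def choose_combination(full_query, combo_counts, ratio=5.0):
    """Keep the full query unless a shorter combination is far more frequent.

    combo_counts maps combinations of the query's units to their frequency in
    previous queries. A shorter combination wins only if its count exceeds
    ratio times the count of the full query (ratio is an assumed threshold).
    """
    full_count = combo_counts.get(full_query, 0)
    best, best_count = full_query, full_count
    for combo, count in combo_counts.items():
        if combo != full_query and count > ratio * full_count and count > best_count:
            best, best_count = combo, count
    return best


# Hypothetical counts: the longest string is also among the most frequent.
print(choose_combination("seattle cheap hotel",
                         {"cheap hotel": 900, "seattle cheap hotel": 800, "seattle cheap": 50}))
# Hypothetical counts: "seattle hotel" is much more frequent than the full query.
print(choose_combination("seattle cheap hotel",
                         {"seattle cheap hotel": 100, "seattle hotel": 2000, "seattle cheap": 50}))
```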
- FIG. 4B illustrates another methodology.
- query modification system 174 receives a search query.
- system 174 modifies the query by eliminating groups of units in the search query that do not appear more frequently than by chance in previously submitted search queries. These groups of units are less likely to match relevant sponsored listings.
- query modification system 174 modifies the search query again by eliminating shorter sets of the remaining units. The longer units are also more likely to match relevant sponsored listings.
- query modification system 174 locates sponsored search query strings that match the modified search query. Because, in general, groups of units are eliminated from the search query, the modified search query is less specific, and therefore has an increased chance of matching more sponsored search query strings. Eliminating groups of units from queries is another way that the present invention increases the coverage of search queries with respect to sponsored listings.
- sponsored listings system 179 returns a list of sponsored web links that correspond to the matched search query strings.
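The test for groups of units that “do not appear more frequently than by chance” is not spelled out in the text. One plausible reading, sketched here as an assumption, compares the observed number of co-occurrences with the number expected if the two units appeared independently; the function name and the sample counts are hypothetical.

```python
def cooccurs_more_than_chance(count_a, count_b, count_ab, total_queries):
    """True if units A and B co-occur more often than independence predicts."""
    if total_queries == 0:
        return False
    # Expected co-occurrences if A and B were independent: N * P(A) * P(B).
    expected = (count_a / total_queries) * (count_b / total_queries) * total_queries
    return count_ab > expected


# Hypothetical counts out of 100,000 logged queries.
print(cooccurs_more_than_chance(1000, 800, 400, 100_000))     # expected is about 8
print(cooccurs_more_than_chance(50_000, 40_000, 100, 100_000))  # expected is about 20,000
```

Groups that fail this test would be candidates for elimination in the first modification step of FIG. 4B.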
- Associated units in a current search query might be compared with previously submitted search queries to determine if the associated units occur together more or less frequently in past search queries.
- Associated units are groups of units that are not sufficiently related to form a new unit. Associated units that appear together more frequently in previous search queries are probably more likely to match relevant sponsored links than less frequently occurring associated units. Thus, a search query can be modified by eliminating the unit associations that appeared less frequently in past queries.
- Query modification system 174 determines how frequently each of these unit associations appeared in the previously submitted search queries. If the unit association “pregnancy nausea” appeared more frequently in previous search queries than “first trimester”, then the search query might be modified by eliminating the unit association “first trimester.” The modified search query would then be “pregnancy nausea.” In this manner, a modified search query, “pregnancy nausea”, is submitted that has more coverage (as it is more likely that a sponsor would sponsor the search query “pregnancy nausea” than the search query “first trimester pregnancy nausea”).
- the relevance does not decline much for the original search query and the modified search query, as might occur if the modified search query were “first” (which might have wide coverage, but low relevancy), or “first trimester” (which might have low relevancy as the search is mostly about nausea during pregnancy).
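- A minimal sketch of eliminating less frequent unit associations follows. The function name `drop_rare_associations`, the substring test used to count occurrences, and the tiny query log are assumptions made for illustration; a real system would likely count over decomposed units rather than raw substrings.

```python
def drop_rare_associations(associations, past_queries):
    """Keep only the unit association(s) that appeared most often in
    previously submitted queries; drop the rest."""
    if not associations:
        return ""
    # Count how many past queries contain each association (naive substring match).
    counts = {a: sum(1 for q in past_queries if a in q) for a in associations}
    best = max(counts.values())
    kept = [a for a in associations if counts[a] == best]
    return " ".join(kept)

# Hypothetical query log.
past_queries = [
    "pregnancy nausea remedies",
    "pregnancy nausea",
    "first trimester symptoms",
]
modified = drop_rare_associations(
    ["first trimester", "pregnancy nausea"], past_queries
)
```

- Here “first trimester” is dropped and the modified query is “pregnancy nausea”.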
- FIG. 4C illustrates another methodology.
- query modification system 174 receives a search query.
- system 174 modifies the query by eliminating associated units in the query that appeared less frequently in previous queries than other associated units in the current query. The less frequently occurring associated units are less likely to match relevant sponsored listings.
- query modification system 174 locates sponsored search query strings that match the modified search query. Because associated units have been removed from the search query at step 332 , the modified search query is less specific and therefore likely to match more sponsored search query strings. Eliminating associated units from queries is another way that the present invention increases the coverage of search queries with respect to sponsored listings.
- sponsored listings system 179 returns a list of sponsored web links that correspond to the matched search query strings.
- By modifying a search query to include fewer units, groups of units, or associated units, the present invention increases the coverage of matching sponsored links. Units, groups of units, and associated units are dropped from the search query. By dropping less relevant units, the present invention increases the coverage of sponsored links that are returned in response to a query.
- the units, groups of units, and unit associations that are dropped from the search query are identified as being less likely to be a good approximation of a user's intent based on predefined sets of rules.
- query modification system 174 might also provide filtering functions. For example, it might modify queries to provide adult filtering, brand name filtering, etc. With such filtering, some terms that might have been eliminated are left in. For example, if someone searched for “brand X shoes”, the relevant portion of the string would be “shoes”, but if the “brand X” portion were left off, it would too greatly modify the results, so it should be left in. More generally, selector words are identified and left in the query even if other measures would have shortened a query by removing those words.
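- The selector-word rule can be pictured as a guard applied while units are being dropped. The sketch below is hypothetical: `drop_units`, `droppable`, and `selectors` are illustrative names, and how selector words are identified is not covered here.

```python
def drop_units(units, droppable, selectors):
    """Remove units marked droppable, except those identified as selector words."""
    return [u for u in units if u not in droppable or u in selectors]

units = ["brand X", "shoes"]

# Without selector protection, "brand X" would be removed.
unprotected = drop_units(units, droppable={"brand X"}, selectors=set())

# With "brand X" treated as a selector word, it is left in the query.
protected = drop_units(units, droppable={"brand X"}, selectors={"brand X"})
```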
- a query is modified by substituting the query with a synonym or a preferred form of a query.
- the synonyms and preferred forms correspond to predefined query strings that have been selected by sponsors to correspond to sponsored links.
- This embodiment of the present invention allows a modified query to match a sponsor listing, even if the original query does not exactly match a predefined query string linked to the sponsored listing.
- the query “NYC restaurants” can be replaced with the modified query “New York City restaurants,” if “New York City restaurants” is a predefined query string that has been selected by a sponsor, but “NYC restaurants” has not been selected by a sponsor.
- the query “autos repair” can be modified into the query “car repair,” by appropriately modifying the original query to generate a synonym or a preferred form.
- the phrase “wood work” can be a preferred form of “woodwork.”
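- Substitution by synonym or preferred form might be sketched as a table lookup that is applied only when the original query has not been selected by a sponsor but its preferred form has. The names `substitute_preferred`, `preferred_forms`, and `sponsored_strings`, and the table contents, are assumptions for illustration.

```python
def substitute_preferred(query, preferred_forms, sponsored_strings):
    """Return the preferred form of a query if that form is a
    sponsor-selected string and the original query is not."""
    if query in sponsored_strings:
        return query  # already matches; nothing to do
    candidate = preferred_forms.get(query)
    if candidate in sponsored_strings:
        return candidate
    return query

# Hypothetical tables.
preferred_forms = {
    "NYC restaurants": "New York City restaurants",
    "autos repair": "car repair",
}
sponsored_strings = {"New York City restaurants"}

replaced = substitute_preferred("NYC restaurants", preferred_forms, sponsored_strings)
left_alone = substitute_preferred("autos repair", preferred_forms, sponsored_strings)
```

- In this example “NYC restaurants” is replaced, while “autos repair” is left as entered because no sponsor has selected “car repair” in the hypothetical table.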
Abstract
Techniques are provided for modifying queries to increase the number of sponsored links that are returned in response to the queries. A query modification system uses a predefined set of rules that are designed to modify a query to increase the chance that the modified query will match more sponsored links. The modified query is then matched against a listing of search query strings that have been pre-selected by sponsors. Each pre-selected search query string corresponds to one or more sponsored web links. If the modified query matches one of the pre-selected search query strings, the corresponding sponsored web links are returned and displayed to the user.
Description
- This application claims benefit as a Continuation of application Ser. No. 11/077,968, filed Mar. 10, 2005, the entire contents of which is hereby incorporated by reference as if fully set forth herein, under 35 U.S.C. §120. The applicant(s) hereby rescind any disclaimer of claim scope in the parent application(s) or the prosecution history thereof and advise the USPTO that the claims in this application may be broader than any claim in the parent application(s).
- The present invention relates to search systems generally, wherein a query is processed to return search results, and more particularly to techniques for modifying queries before presentation to a sponsored search generator or other matching system where modifications improve coverage without a corresponding reduction in relevance.
- With the advent of the Internet and the multitude of web pages and media content available to a user over the World Wide Web (web), there has become a need to provide users with streamlined approaches to filter and obtain desired information from the web. Search systems and processes have been developed to meet the needs of users to obtain desired information. Examples of such technologies can be accessed through Yahoo!'s website, Google's website and other sites.
- Typically, a search process involves a user inputting a query to the search system and the search system returning one or more search results (“hits”) that are deemed responsive to the query. Many search providers also display sponsored links along with the search results, where the main search results result from searching a corpus such as a collection of Web pages referenced by an index and where the sponsored links are found in a database of sponsored links set up to supply relevant links to searchers on behalf of sponsors.
- Ideally, the sponsored links that are provided are relevant to the query. For example, if a searcher (which can be a person, a person using a computer, or a computer) submits a search query in the form of a search query string such as “European vacation”, the search engine might find pages from the Web that are deemed to relate to vacationing in Europe. The sponsored search links might be found from the sponsored link database according to purchased keywords.
- Thus, a travel agent sponsor might pay to have a link they devise presented to queriers that use “European vacation” in their search query. Often, sponsored advertising links are sold using a “pay per click” model, wherein the search system might present a sponsored link, but the sponsor only pays the search system operator when and if the querier clicks on the sponsored link.
- With a pay per click model, the search system operator would like to ensure that the sponsored links are relevant to the search. If, for example, sponsored links for auto repair are displayed with search results for vacations, it is not likely that the reader will be interested, and such links would have a very low click-through rate and the search system operator would not see much revenue. On the other end, if the search system is too strict about what it shows, insufficient coverage might result.
- A sponsor typically identifies in advance one or more search query strings that should trigger the display of the sponsor's presentation. Each sponsor's presentation might be indexed against one or more of these pre-selected search query strings. Each time a search query is entered, the search system attempts to match the search query with as many of the search query strings that have been pre-selected by the sponsors as possible. When a search query submitted by a user is relevant to one of the pre-selected search query strings, a sponsor presentation could be displayed along with the other search results.
- As sponsors typically indicate the keywords that are needed in a search query and expect that their sponsored presentation would not be shown at random, there are some search queries that would have no matching presentations. For example, the search query “John Q. Public's Daily Breakfast Menu” might not attract any interested sponsors, so users submitting that as a search would not see any sponsored links. Ideally, the “coverage” of search queries would be such that a large proportion of the searches performed would be covered by at least one relevant sponsored presentation. Otherwise, where search queries are not covered by any sponsored presentations, the search system operator would not see any sponsored presentation revenue for those search queries. Thus, there is a tension between casting too wide a net and having possibly irrelevant sponsored links, which would over time cause users to ignore them, and casting so narrowly that insufficient coverage results.
- It would therefore be desirable to provide techniques for increasing the coverage of sponsored presentations that are returned in response to search queries while maintaining relevance or lowering instances where sponsored presentations might be deemed to be less relevant.
- The present invention provides techniques for modifying queries before presentation to a sponsored search generator or other matching system where modifications improve coverage without a corresponding reduction in relevance. A query modification system might be used to process a user's query to form a modified query that is in turn submitted to a sponsored search system to return sponsored searches with improved coverage while maintaining relevance.
- In variations, the techniques can be used where the modified queries are submitted to other than a sponsored search system. Thus, the modified queries might be used to improve matchmaking such as finding a potential customer for a sponsor or finding a potential provider of suitable products and/or services for a potential customer.
- The modified query can then be used to obtain sponsored presentations by matching against a listing of search query strings that have been pre-selected by sponsors or other methods. Each pre-selected search query string might correspond to one or more sponsored web links. If the modified query matches one of the pre-selected search query strings, corresponding sponsored web links are returned and displayed to the user.
- The input to the sponsored search system can be the modified query alone or the modified query and the original query. According to one embodiment of the present invention, a modified query might be generated by leaving off words, substituting phrases, differentially weighting “units” of a search, and/or using associations between units. The weighting of units might be done based on how frequently units appeared in previous search queries, the length of the units and associations between units. In some cases, weighting of units is leveraged to decide which unit to drop from the search query string to form the modified query string.
- Other objects, features, and advantages of the present invention will become apparent upon consideration of the following detailed description and the accompanying drawings, in which like reference designations represent like features throughout the figures.
- FIG. 1 is a diagram of an Internet communications system that can implement embodiments of the present invention.
- FIG. 2A is a generalized diagram illustrating how a query modification system interacts with a web search system according to an embodiment of the present invention.
- FIG. 2B is a diagram illustrating a specific example of how a query modification system modifies queries after they are transmitted to a sponsored listings system according to an embodiment of the present invention.
- FIG. 2C illustrates another specific example of how a query modification system modifies queries before they are transmitted to a sponsored listings system according to an embodiment of the present invention.
- FIG. 3 is a flowchart that illustrates a general methodology for modifying search queries to increase the number of matching sponsored listings according to the present invention.
- FIG. 4 comprises flowcharts illustrating more specific examples of methodologies for modifying search queries; FIG. 4A illustrates a method of increasing the number of matching sponsored listings by identifying more specific units in the search query; FIG. 4B illustrates a method of identifying longer sets of units in the search query; and FIG. 4C illustrates a method of identifying frequently occurring unit associations in the search query.
- FIG. 5 comprises flowcharts for a process of evaluating a query to determine matches for matching against bidded terms; FIGS. 5A and 5B together form FIG. 5.
- FIG. 6 comprises flowcharts for a process of evaluating a query to determine matches for matching against bidded terms using a plurality of units for checking against; FIGS. 6A and 6B together form FIG. 6.
- FIG. 1 illustrates a general overview of an information retrieval and communication network 100 including a client system 120 according to an embodiment of the present invention. In computer network 100, client system 120 can communicate through the Internet 140, or other communication network, e.g., over any LAN or WAN connection, with a plurality of server systems 150 1 to 150 N. For example, client system 120 can communicate with search result server 160. As described herein, client system 120 is configured according to the present invention to communicate with any of server systems 150 1 to 150 N and 160, e.g., to access, receive, retrieve and display media content and other information such as web pages and web sites.
- Several elements in the system shown in FIG. 1 include conventional, well-known elements that need not be explained in detail here. For example, client system 120 could include a desktop personal computer, workstation, laptop, PDA, cell phone, or any WAP-enabled device or any other computing device capable of interfacing directly or indirectly to the Internet. Client system 120 typically runs an HTTP client, e.g., a browsing program, such as Microsoft's Internet Explorer™ browser, Netscape Navigator™ browser, Mozilla™ browser, Opera™ browser, or a WAP-enabled browser in the case of a cell phone, PDA or other wireless device, or the like, allowing a user of client system 120 to access, process and view information and pages available to it from server systems 150 1 to 150 N over Internet 140.
- Client system 120 also typically includes one or more user interface devices 122, such as a keyboard, a mouse, touch-screen, pen or the like, for interacting with a graphical user interface (GUI) provided by the browser on a display (e.g., monitor screen, LCD display, etc.), in conjunction with pages, forms and other information provided by server systems 150 1 to 150 N or other servers. The present invention is suitable for use with the Internet, which refers to a specific global internetwork of networks. However, it should be understood that other networks can be used instead of or in addition to the Internet, such as an intranet, an extranet, a virtual private network (VPN), a non-TCP/IP based network, any LAN or WAN or the like.
- According to one embodiment, client system 120 and all of its components are operator configurable using an application including computer code run using a central processing unit such as an Intel Pentium™ processor, AMD Athlon™ processor, or the like or multiple processors. Computer code for operating and configuring client system 120 to communicate, process and display data and media content as described herein is preferably downloaded and stored on a hard disk, but the entire program code, or portions thereof, may also be stored in any other volatile or non-volatile memory medium or device as is well known, such as a ROM or RAM, or provided on any media capable of storing program code, such as a compact disk (CD) medium, a digital versatile disk (DVD) medium, a floppy disk, and the like.
- Additionally, the entire program code, or portions thereof, may be transmitted and downloaded from a software source, e.g., from one of server systems 150 1 to 150 N to client system 120 over the Internet as is well known, or transmitted over any other conventional network connection as is well known (e.g., extranet, VPN, LAN, etc.) using any communication medium and protocols (e.g., TCP/IP, HTTP, HTTPS, Ethernet, etc.) as are well known. It will also be appreciated that computer code for implementing aspects of the present invention can be implemented in any programming language that can be executed on a client system such as, for example, C, C++, HTML, XML, Java, JavaScript, or any scripting language, such as VBScript. In some embodiments, no code is downloaded to client system 120, and needed code is executed by a server, or code already present at client system 120 is executed.
- According to one embodiment, a client application (represented as module 125) executing on client system 120 includes instructions for controlling client system 120 and its components to communicate with server systems 150 1 through 150 N and 160 and to process and display data content received therefrom. Additionally, client application module 125 includes various software modules for processing data and media content. For example, application module 125 can include one or more of a search module 126 for processing search requests and search result data, a user interface module 127 for rendering data and media content in text and data frames and active windows, e.g., browser windows and dialog boxes, and an application interface module 128 for interfacing and communicating with various applications executing on client 120. Further, interface module 127 can include a browser, such as a default browser configured on client system 120 or a different browser.
- According to one embodiment, search result server 160 is configured to provide search result data and media content to client system 120, and server systems 150 are configured to provide data and media content such as web pages to client system 120, for example, in response to links selected in search result pages provided by server system 160. Server system 160 in one embodiment references various collection technologies for collecting information from the World Wide Web and for populating one or more indexes with, for example, pages, links to pages, etc. Such collection technologies include automatic web crawlers, spiders, etc., as well as manual or semi-automatic classification algorithms and interfaces for classifying and ranking web pages within a hierarchical structure. In certain aspects, server 160 is also configured with search related algorithms for processing and ranking web pages. Server 160 is also preferably configured to record user query activity in the form of query log files.
- Server system 160, in one aspect, is configured to provide data responsive to various search requests received from a client system, in particular search module 126. Server systems 150 and 160 can be part of a single organization, e.g., a distributed server system such as that provided to users by Yahoo! Inc., or they can be part of disparate organizations. Server systems 150 and server system 160 each includes at least one server and an associated database system, and may include multiple servers and associated database systems, and although shown as a single block, may be geographically distributed. For example, all servers of server system 160 can be located in close proximity to one another (e.g., in a server farm located in a single building or campus), or they may be distributed at locations remote from one another (e.g., one or more servers located in city A and one or more servers located in city B). As used herein, the term “server system” will typically include one or more logically and/or physically connected servers distributed locally or across one or more geographic locations. Additionally, the term “server” typically includes a computer system and an associated storage system and database application as is well known in the art. The terms “server” and “server system” will be used interchangeably herein.
- According to one embodiment, server 160 includes algorithms that provide search results to users in response to search queries received from client system 120. According to an embodiment of the present invention, server system 160 is configured to increase coverage of search queries received from client system 120 without a corresponding decrease in relevance.
- FIG. 2A is a generalized diagram illustrating how a query modification system 174 interacts with a web search system according to an embodiment of the present invention. A search query 170 is transmitted to a search engine 175 to initiate a search of the Internet. Search engine 175 can implement any Internet or web searching methods such as a crawling indexer.
- Search engine 175 locates web content matching search query 170 from a search corpus 190. Search corpus 190 can store copies of content that is accessible via the World Wide Web, the Internet, intranets, local networks, and wide area networks.
- Search engine 175 retrieves content from search corpus 190 matching search query 170 and transmits the matching content (i.e., search results) to a page assembler 180. Page assembler 180 displays the search results in a readable format. The search results are displayed to a user as a listing of web search results in search result display screen 185.
- Search queries that are transmitted to search engine 175 are also sent to sponsored listings system 179 through query modification system 174. Sponsored listings system 179 selects sponsored web links to display in response to receiving a search query. The sponsored web links from a sponsored listing database 178 are sent to page assembler 180 and displayed in a portion of search result display screen 185.
- With sponsored listing system 179, sponsors' web sites can be displayed in a sponsor section of screen 185 when a search query matches a predefined search query string. For example, a hardware vendor might want to promote its printers by having a web page about printers from that vendor's website be pointed to by a sponsored link that would appear in the sponsored links region of display screen 185 when a web user enters the term “printers” as a search query. The hardware vendor would pay the search system operator for each time a user clicks on the printers sponsored link as that link is displayed in the sponsored links portion of display screen 185.
- Sponsored listings database 178 might contain records mapping search query strings to sponsors and sponsor's presentations, wherein a presentation might be a short text sequence and a web link. The mappings might be determined by a bidding process or other process used to assign search query strings to sponsors. Using that database 178, when sponsored listings system 179 receives a search query, modified or otherwise, sponsored listings system 179 determines whether the search query matches one of the predefined search query strings that are in the database.
- Typically, if the search query exactly matches one of the indexed search query strings, sponsored listings system 179 retrieves the sponsored web links that are indexed with that search query string. The selected sponsored web links are transmitted to page assembler 180. If the search query does not exactly match one of the indexed search query strings, sponsored listings system 179 does not return any sponsored web links. Thus, failing to locate an exact match between a search query entered by a user and one of the indexed search query strings prevents the search provider from receiving revenue from a sponsor.
- The query modification system attempts to mitigate this problem by modifying search queries that are transmitted to sponsored listings system 179 when the search queries do not have a match. As discussed above, sponsors pre-select one or more search query strings. The search query strings that a sponsor selects are indexed with that sponsor's web link and when a user submits a search query, if there are no matches, sponsored listings system 179 would otherwise not return any sponsor web links or presentations. Query modification system 174 generates modified search queries from the search queries where the modified search queries are more likely to match one or more of the indexed search query strings pre-selected by the sponsors, thus increasing coverage, but does so in a way that there is not a corresponding decrease in relevance. Further details of how query modification system 174 might modify queries are described below with respect to FIGS. 4A-4C.
- FIG. 2B illustrates one particular system for modifying search queries so that they are more likely to match one of the sponsor-selected search query strings. In the embodiment of FIG. 2B, search query modification system 174 initially forwards all search queries 170 directly to sponsored listings system 179 without modifying them. System 174 also stores copies of the search queries it sends to system 179.
- Sponsored listings system 179 then attempts to match the search query with one of the indexed search query strings as discussed above and returns the corresponding sponsored web links to query modification system 174. If sponsored listings system 179 returns at least a predetermined number of sponsored web links, these links are sent directly to page assembler 180.
- If sponsored listings system 179 returns fewer than the predetermined number of sponsored web links (e.g., fewer than 1 or fewer than 2), query modification system 174 then changes search query 170 into a new query to increase the chance that the new query will match more of the pre-selected sponsored query strings.
- The new query is transmitted from system 174 back to sponsored listings system 179. System 179 then attempts to match the new query against the sponsored search query strings. If a new set of sponsored links is identified as matching the new query, the new set of sponsored links is transmitted to page assembler 180.
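- The FIG. 2B flow (try the original query first, and modify it only when too few sponsored links come back) can be sketched as follows. The function `sponsored_links_with_fallback`, the callables `lookup` and `modify`, and the sample index are hypothetical stand-ins for systems 179 and 174; they are not interfaces described in the text.

```python
def sponsored_links_with_fallback(query, lookup, modify, min_links=1):
    """Send the unmodified query first; retry with a modified query only
    if fewer than `min_links` sponsored links are returned."""
    links = lookup(query)
    if len(links) >= min_links:
        return links
    return lookup(modify(query))

# Hypothetical exact-match index of sponsor-selected query strings.
index = {"trip to Europe": ["https://example.com/europe-tours"]}

def lookup(q):
    return index.get(q, [])

def modify(q):
    # Stand-in for the rule-based modification; here it strips a leading "10 day ".
    return q.removeprefix("10 day ")

links = sponsored_links_with_fallback("10 day trip to Europe", lookup, modify)
```

- The original query has no exact match in the hypothetical index, so the modified query “trip to Europe” is looked up and its link is returned. (`str.removeprefix` requires Python 3.9 or later.)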
- FIG. 2C illustrates another system for modifying search queries so that they are more likely to match more of the sponsor-selected search query strings. In the embodiment of FIG. 2C, query modification system 174 modifies all search queries 170 that it receives before the queries are transmitted to sponsored listings system 179, using a knowledge base 199. Knowledge base 199 stores sets of rules that are used to increase the coverage of search queries with respect to the sponsored links. After a query has been modified, the modified query is transmitted to sponsored listings system 179. System 179 locates sponsored links that match the modified query and transmits the results to page assembler 180. In some embodiments, the original query and the modified query are provided to sponsored listings system 179.
- As illustrated in various figures, queries are modified by the query modification system while the original query is submitted to a search engine. In some embodiments, the query modification occurs at the client side, in others it occurs at the location of the search engine and in yet others, it occurs at a different place in a network. Where full access to the sponsored listings system is not available to the search system operator, the search system operator can provide the query modification system external to the sponsored listings system and treat the sponsored listing system as a “black box” with no internal modifications.
- Additional components might be added to a basic system. For example, one enhancement provides feedback by noting the click-throughs that occur for particular search queries and uses that information in deciding how to modify search query terms. In some cases, the click-through rates are an indication of relevance and those indications can be used to select from among several options for modified queries.
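- One way to picture the click-through feedback is as a tie-breaker among candidate modified queries. The sketch below assumes a hypothetical table of observed click-through rates keyed by query string; the function name and the figures are illustrative only.

```python
def pick_modified_query(candidates, click_through_rate):
    """Return the candidate with the highest observed click-through rate,
    treating unseen candidates as having a rate of 0."""
    if not candidates:
        return None
    return max(candidates, key=lambda q: click_through_rate.get(q, 0.0))

# Hypothetical click-through rates gathered from earlier sponsored-link displays.
ctr = {"trip to Europe": 0.042, "10 day": 0.003}
chosen = pick_modified_query(["10 day", "trip to Europe"], ctr)
```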
- Various embodiments of methods for modifying queries to increase coverage without a corresponding loss of relevance are now described in detail with respect to FIGS. 4A-4C. FIG. 3 is a flowchart that illustrates a general methodology for modifying search queries to increase coverage.
- Referring to FIG. 3, query modification system 174 receives a search query from a user at step 301. At step 302, query modification system 174 modifies the search query using rules designed to increase the number of sponsored search query strings that the search query received at step 301 matches, without corresponding loss of relevance. Many embodiments of these rules are possible. Examples of rules that can increase the number of matching sponsored search query strings are described below with respect to FIGS. 4A-4C.
- Each sponsored search query string is indexed with one or more sponsored links in sponsored listing system 179. Query modification system 174 can modify search queries before or after they have been sent to system 179, as discussed above with respect to FIGS. 2A-2B.
- At step 303, system 179 attempts to locate sponsored search query strings that match the modified search query. If sponsored search query strings are matched at step 303, system 179 returns the sponsored links that correspond to the matched query strings at step 304.
- According to a more specific embodiment of the present invention illustrated in FIG. 4A, the number of matching sponsored listings can be increased by identifying units in a search query that appeared less frequently in previous search queries.
- Search queries can be decomposed into constituent parts referred to as units. A query processing engine can decompose a search query into one or more constituent units using statistical methods. A unit is a sequence of one or more words that typically corresponds to a natural concept such as “New York City” or “bird of prey.” Further details of techniques for generating concept units from search queries are discussed in co-pending and commonly-assigned U.S. patent application Ser. No. 10/713,576, filed Nov. 12, 2003, which is incorporated by reference herein.
- According to the embodiment of FIG. 4A, each of the units in a search query is compared to previously submitted search queries. In some cases, previously submitted search queries are stored for later use.
- Weight values are assigned to units in the search query based on the relative frequency that the units appeared in previously submitted search queries. Units that appeared less frequently in previous searches are given a higher weight, and units that appeared more frequently in previous search queries are given a lower weight.
- Units that have appeared less frequently in past search queries correspond to more specific concepts. The less frequently occurring units are more likely to be a good approximation of the user's true intent in entering the search query. The units that appeared more frequently in previous search queries are more generic and less likely to be a good approximation of the user's intent.
- Query modification system 174 drops units in a search query that have lower weights. Thus, the original search query is modified to contain only the units in the original query that appeared less frequently in previous search queries relative to the other units in the original query. This feature allows more frequently occurring units in a search query to be filtered out to increase the coverage of sponsored listings.
- Because some of the units in the original query are eliminated, the modified search query has fewer units. Queries shortened in this way have an increased chance of matching a larger number of sponsored listings in system 179. In general, when sponsored listings system 179 processes the modified search query, it is likely to return more sponsored links than when it processes the original query. Thus, this embodiment generally increases the coverage of sponsored links that are returned by system 179 without a corresponding decrease in relevance.
- For example, a user can enter a search query for a “10 day trip to Europe” to locate travel information to help plan a European vacation. This search query includes two concepts, “10 day” and “trip to Europe.” However, the concept “trip to Europe” is more relevant to the user's intent (planning a European vacation) than the concept “10 day.” Many travel web sites relating to European vacations do not include the phrase “10 day.” Sponsored listings system 179 may not return sponsored links to European travel web sites that do not mention “10 day.”
- According to an embodiment of the present invention, the units “10 day” and “trip to Europe” are compared to previous search queries to determine how frequently these units appeared. Because “10 day” appears more frequently than “trip to Europe,” the unit “10 day” is dropped from the search query. The modified search query only contains “trip to Europe.” The modified query “trip to Europe” has a greater chance of exactly matching more sponsored search query strings than “10 day trip to Europe.”
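The weighting and dropping steps described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the implementation of the described system: the weight formula `1 / (1 + count)`, the function names `unit_weights` and `drop_frequent_units`, the `keep` parameter, and the frequency counts are all hypothetical, and the query is assumed to have been segmented into units already.

```python
from collections import Counter


def unit_weights(units, unit_counts):
    """Assign a higher weight to units that appeared less often in past queries.

    The inverse-frequency formula is an assumption made for illustration.
    """
    return {u: 1.0 / (1 + unit_counts.get(u, 0)) for u in units}


def drop_frequent_units(units, unit_counts, keep=1):
    """Keep the `keep` highest-weight (rarest) units, preserving query order."""
    weights = unit_weights(units, unit_counts)
    ranked = sorted(units, key=lambda u: weights[u], reverse=True)
    kept = set(ranked[:keep])
    return [u for u in units if u in kept]


# Hypothetical frequencies taken from a log of previously submitted queries.
unit_counts = Counter({"10 day": 5000, "trip to Europe": 1200})
units = ["10 day", "trip to Europe"]

modified_query = " ".join(drop_frequent_units(units, unit_counts))
print(modified_query)  # prints: trip to Europe
```

With these hypothetical counts, “10 day” receives the lower weight and is dropped, so the modified query is “trip to Europe.”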
- FIG. 4A illustrates a methodology according to this embodiment of the present invention. At step 311, query modification system 174 receives a search query. At step 312, system 174 modifies the query by dropping the units that appear more frequently in previously submitted queries. The modified search query only contains the units that appeared less frequently in previous queries relative to other units in the original search query.
- At step 313, sponsored listings system 179 attempts to locate sponsored search query strings that match the modified search query. At step 314, system 179 returns a list of sponsored web links corresponding to the matched search query strings.
- According to another embodiment, the units in a search query are compared with previously submitted search queries to determine how often groups of units in the search query appear in the previous search queries. Thus, a log of queries can be used to determine how frequently units occur, how frequently they occur in various combinations, etc., and that information can be used to determine how best to modify the search query to improve coverage without a corresponding decrease in relevance.
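One way such a query log could be summarized is sketched here in Python. The function name `count_units_and_groups`, the representation of the log as lists of already-segmented units, and the sample log are assumptions for illustration; the source does not specify how the log is stored or counted.

```python
from collections import Counter
from itertools import combinations


def count_units_and_groups(segmented_log):
    """Count how often each unit, and each order-preserving group of two or
    more units, appears across a log of previously submitted queries."""
    unit_counts, group_counts = Counter(), Counter()
    for units in segmented_log:
        unit_counts.update(set(units))  # count each unit once per query
        for n in range(2, len(units) + 1):
            group_counts.update(combinations(units, n))
    return unit_counts, group_counts


# Hypothetical log; each query is assumed to be segmented into units already.
log = [
    ["Seattle", "cheap", "hotel"],
    ["Seattle", "hotel"],
    ["cheap", "hotel"],
]
unit_counts, group_counts = count_units_and_groups(log)
print(unit_counts["hotel"], group_counts[("Seattle", "hotel")])
```

The resulting counters give the per-unit and per-combination frequencies that the modification rules consult.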
- For example, query modification system 174 might modify a search query by eliminating units or groups of units that appeared less frequently. System 174 might also drop shorter groups of units from the search query.
- To illustrate this with an example, consider the search query “Seattle cheap hotel.” For purposes of this example, suppose that each of the three words in the query was found to be a separate unit. Query modification system 174 might use query logs to determine the frequency of each combination of units in the query in previous queries and find that “cheap hotel” and “Seattle cheap hotel” appear more frequently than “Seattle cheap”. In that case, query modification system 174 would not modify the query because the longest string is also one of the most frequent. However, if “Seattle hotel” appears much more frequently than “Seattle cheap hotel”, query modification system 174 might modify the query to be “Seattle hotel”.
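The “Seattle cheap hotel” example can be sketched as follows. This is only an illustration: the source says a shorter group is chosen when it appears “much more frequently” but gives no threshold, so the `ratio` parameter, the function name `modify_by_group_frequency`, and all counts are assumptions.

```python
from itertools import combinations


def modify_by_group_frequency(units, group_counts, ratio=5.0, min_len=2):
    """Return a shorter group of units only if it is much more frequent in
    past queries than the full query; otherwise return the query unchanged."""
    original = tuple(units)
    base = group_counts.get(original, 0)
    best, best_count = original, base
    for n in range(len(units) - 1, min_len - 1, -1):
        for combo in combinations(units, n):  # order-preserving sub-groups
            count = group_counts.get(combo, 0)
            # `ratio` is a hypothetical stand-in for "much more frequently".
            if count > ratio * max(base, 1) and count > best_count:
                best, best_count = combo, count
    return list(best)


query = ["Seattle", "cheap", "hotel"]

# Case 1: the longest string is also among the most frequent, so no change.
counts_a = {("Seattle", "cheap", "hotel"): 900,
            ("cheap", "hotel"): 1000,
            ("Seattle", "cheap"): 50}
unchanged = modify_by_group_frequency(query, counts_a)

# Case 2: "Seattle hotel" is far more frequent than the full query.
counts_b = {("Seattle", "cheap", "hotel"): 100,
            ("Seattle", "hotel"): 2000,
            ("cheap", "hotel"): 300}
shortened = modify_by_group_frequency(query, counts_b)
print(unchanged, shortened)
```

In the first case the query is left as “Seattle cheap hotel”; in the second it becomes “Seattle hotel.”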
FIG. 4B illustrates another methodology. At step 321, query modification system 174 receives a search query. At step 322, system 174 modifies the query by eliminating groups of units in the search query that do not appear more frequently than by chance in previously submitted search queries. These groups of units are less likely to match relevant sponsored listings.
- At step 323, query modification system 174 modifies the search query again by eliminating shorter sets of the remaining units. The longer units are also more likely to match relevant sponsored listings.
- At step 324, query modification system 174 locates sponsored search query strings that match the modified search query. Because, in general, groups of units are eliminated from the search query, the modified search query is less specific, and therefore has an increased chance of matching more sponsored search query strings. Eliminating groups of units from queries is another way that the present invention increases the coverage of search queries with respect to sponsored listings. At step 325, sponsored listings system 179 returns a list of sponsored web links that correspond to the matched search query strings.
- Associated units in a current search query might be compared with previously submitted search queries to determine whether the associated units occur together more or less frequently in past search queries.
- Associated units are groups of units that are not sufficiently related to form a new unit. Associated units that appear together more frequently in previous search queries are probably more likely to match relevant sponsored links than less frequently occurring associated units. Thus, a search query can be modified by eliminating the unit associations that appeared less frequently in past queries.
- To illustrate an example of this, consider the search query “first trimester pregnancy nausea.” The query contains two unit associations which are “first trimester” and “pregnancy nausea.” In this example, the words in both of these associations are not sufficiently related to be new two-word units.
- Query modification system 174 determines how frequently each of these unit associations appeared in the previously submitted search queries. If the unit association “pregnancy nausea” appeared more frequently in previous search queries than “first trimester,” then the search query might be modified by eliminating the unit association “first trimester.” The modified search query would then be “pregnancy nausea.” In this manner, a modified search query, “pregnancy nausea,” is submitted that has more coverage (as it is more likely that a sponsor would sponsor the search query “pregnancy nausea” than the search query “first trimester pregnancy nausea”). Notably, the relevance does not decline much between the original search query and the modified search query, as might occur if the modified search query were “first” (which might have wide coverage, but low relevancy), or “first trimester” (which might have low relevancy as the search is mostly about nausea during pregnancy).
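A short Python sketch of this rule follows. Unlike the unit-weighting embodiment of FIG. 4A, which keeps the rarer units, this rule keeps the unit associations that appeared more often in past queries. The function name `drop_rare_associations`, the `keep` parameter, and the counts are hypothetical; the query is assumed to have been split into unit associations already.

```python
def drop_rare_associations(associations, association_counts, keep=1):
    """Keep the `keep` most frequent unit associations, preserving query order."""
    ranked = sorted(associations,
                    key=lambda a: association_counts.get(a, 0),
                    reverse=True)
    kept = set(ranked[:keep])
    return [a for a in associations if a in kept]


# Hypothetical frequencies from previously submitted queries.
association_counts = {"first trimester": 800, "pregnancy nausea": 3000}
associations = ["first trimester", "pregnancy nausea"]

modified_query = " ".join(drop_rare_associations(associations, association_counts))
print(modified_query)  # prints: pregnancy nausea
```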
FIG. 4C illustrates another methodology. At step 331, query modification system 174 receives a search query. At step 332, system 174 modifies the query by eliminating associated units in the query that appeared less frequently in previous queries than other associated units in the current query. The less frequently occurring associated units are less likely to match relevant sponsored listings.
- At step 333, query modification system 174 locates sponsored search query strings that match the modified search query. Because associated units have been removed from the search query at step 332, the modified search query is less specific and therefore likely to match more sponsored search query strings. Eliminating associated units from queries is another way that the present invention increases the coverage of search queries with respect to sponsored listings. At step 334, sponsored listings system 179 returns a list of sponsored web links that correspond to the matched search query strings.
- By modifying a search query to include fewer units, groups of units, or associated units, the present invention increases the coverage of matching sponsored links. Units, groups of units, and associated units are dropped from the search query. By dropping less relevant units, the present invention increases the coverage of sponsored links that are returned in response to a query. In general, the units, groups of units, and unit associations that are dropped from the search query are identified as being less likely to be a good approximation of a user's intent based on predefined sets of rules.
- In addition to modifying queries to improve coverage without a corresponding reduction in relevance, query modification system 174 might also provide filtering functions. For example, it might modify queries to provide adult filtering, brand name filtering, etc. With such filtering, some terms that might have been eliminated are left in. For example, if someone searched for “brand X shoes”, the relevant portion of the string would be “shoes”, but if the “brand X” portion were left off, it would too greatly modify the results, so it should be left in. More generally, selector words are identified and left in the query even if other measures would have shortened a query by removing those words.
- According to another embodiment of the present invention, a query is modified by substituting the query with a synonym or a preferred form of a query. The synonyms and preferred forms correspond to predefined query strings that have been selected by sponsors to correspond to sponsored links. This embodiment of the present invention allows a modified query to match a sponsor listing, even if the original query does not exactly match a predefined query string linked to the sponsored listing.
- For example, the query “NYC restaurants” can be replaced with the modified query “New York City restaurants,” if “New York City restaurants” is a predefined query string that has been selected by a sponsor, but “NYC restaurants” has not been selected by a sponsor. As another example, the query “autos repair” can be modified into the query “car repair,” by appropriately modifying the original query to generate a synonym or a preferred form. As yet another example, the phrase “wood work” can be a preferred form of “woodwork.”
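The substitution rule can be sketched in Python as a table lookup. This is an assumption-laden illustration: the source does not say how synonyms or preferred forms are stored, so the `preferred_forms` dictionary, the `sponsored_strings` set, the lowercase normalization, and the function name `substitute_preferred_form` are all hypothetical.

```python
def substitute_preferred_form(query, preferred_forms, sponsored_strings):
    """Replace a query with its synonym or preferred form when that form,
    and not the original query, is a sponsor-selected query string."""
    normalized = query.strip().lower()
    if normalized in sponsored_strings:
        return query  # already matches a predefined query string
    candidate = preferred_forms.get(normalized)
    if candidate is not None and candidate in sponsored_strings:
        return candidate
    return query  # no suitable substitution; leave the query unchanged


# Hypothetical tables built from the examples in the text.
preferred_forms = {
    "nyc restaurants": "new york city restaurants",
    "autos repair": "car repair",
    "woodwork": "wood work",
}
sponsored_strings = {"new york city restaurants", "car repair", "wood work"}

print(substitute_preferred_form("NYC restaurants", preferred_forms, sponsored_strings))
```

A query with no entry in the table, or whose preferred form has not been selected by any sponsor, is returned unchanged.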
- While the present invention has been described herein with reference to particular embodiments thereof, a latitude of modification, various changes, and substitutions are intended in the present invention. In some instances, features of the invention can be employed without a corresponding use of other features, without departing from the scope of the invention as set forth. Therefore, many modifications may be made to adapt a particular configuration or method disclosed, without departing from the essential scope and spirit of the present invention. It is intended that the invention not be limited to the particular embodiments disclosed, but that the invention will include all embodiments and equivalents falling within the scope of the claims.
Claims (26)
1. A method for modifying a query before presenting the query to a matching system, comprising:
storing a plurality of sponsored links;
receiving selection information that identifies which of a plurality of predefined query strings are to be associated with each of the plurality of sponsored links;
based on the selection information, storing data that indicates a correspondence between said plurality of predefined query strings and said plurality of sponsored links;
after (i) storing the plurality of sponsored links, (ii) receiving the selection information, and (iii) storing the data, receiving a query to initiate a performance of a search;
modifying the query, to produce a modified query, using rules designed to increase a chance that the modified query matches one or more of the plurality of predefined query strings;
identifying at least one predefined query string, of the plurality of predefined query strings, that matches the modified query;
returning at least one of the sponsored links to which said at least one predefined query string corresponds;
wherein modifying the query is performed by one or more computing devices.
2. The method of claim 1, wherein each of the plurality of sponsored links comprises at least one of: a link to a presentation, a link to an advertisement, or a web link.
3. The method of claim 1, wherein:
the query identifies a set of units; and
each unit, of said set of units, corresponds to one or more search terms directed towards a single concept.
4. The method of claim 1, wherein modifying the query, to produce a modified query, includes:
determining how frequently each unit identified by the query appears in previously-submitted queries; and
eliminating one or more units associated with the query that appear more frequently in the previously-submitted queries relative to other units in the query.
5. The method of claim 1, wherein modifying the query to produce a modified query includes:
determining how frequently each unit associated with the query appears in previously-submitted queries; and
eliminating one or more units associated with the query that appear less frequently in the previously submitted queries relative to other units associated with the query.
6. The method of claim 1, wherein modifying the query to produce a modified query includes:
determining how frequently each unit associated with the query appears in previously-submitted queries; and
eliminating one or more groups of units associated with the query that appear less frequently in the previously-submitted queries, relative to other units associated with the query, to produce a preliminary modified query.
7. The method of claim 1, wherein modifying the query to produce a modified query includes eliminating a unit associated with the query that corresponds to fewer search terms relative to other units associated with the query.
8. The method of claim 1, further comprising:
before the query is modified to produce the modified query, attempting to identify which of the predefined query strings match the query; and
returning a result that includes a subset of the sponsored links, wherein the modified query is generated in response to determining that the result includes less sponsored links than a predetermined number.
9. The method of claim 1, wherein the query is modified to produce the modified query before an attempt is made to match the query to the predefined query strings.
10. The method of claim 1, wherein modifying the query to produce the modified query using the rules includes determining how frequently units, groups of units, or associated units in the query appear in previously-submitted queries.
11. The method of claim 1, further comprising:
performing the search based on the query using a search engine to generate search results; and
causing said at least one of the sponsored links to be displayed on a display screen along with the search results.
12. The method of claim 11, further comprising:
causing the display screen to be visually divided into a sponsor section and a search-result section; and
causing said at least one of the sponsored links to be displayed within the sponsor section and the search results to be displayed within the search-result section.
13. The method of claim 1, wherein modifying the query to produce the modified query includes substituting the query with a synonym or a preferred form that corresponds to one of the predefined query strings.
14. A computer system designed to modify a query before presenting the query to a matching system, the computer system comprising:
one or more processors; and
one or more storage media storing instructions which, when executed by the one or more processors, causes:
storing a plurality of sponsored links;
receiving selection information that identifies which of a plurality of predefined query strings are to be associated with each of the plurality of sponsored links;
based on the selection information, storing data that indicates a correspondence between said plurality of predefined query strings and said plurality of sponsored links;
after (i) storing the plurality of sponsored links, (ii) receiving the selection information, and (iii) storing the data, receiving a query to initiate a performance of a search;
modifying the query, to produce a modified query, using rules designed to increase a chance that the modified query matches one or more of the plurality of predefined query strings;
identifying at least one predefined query string, of the plurality of predefined query strings, that matches the modified query;
returning at least one of the sponsored links to which said at least one predefined query string corresponds.
15. The computer system of claim 14, wherein each of the plurality of sponsored links comprises at least one of: a link to a presentation, a link to an advertisement, or a web link.
16. The computer system of claim 14, wherein:
the query identifies a set of units; and
each unit, of said set of units, corresponds to one or more search terms directed towards a single concept.
17. The computer system of claim 14, wherein modifying the query to produce a modified query includes:
determining how frequently each unit identified by the query appears in previously-submitted queries; and
eliminating one or more units associated with the query that appear more frequently in the previously-submitted queries relative to other units in the query.
18. The computer system of claim 14, wherein modifying the query, to produce a modified query, includes:
determining how frequently each unit associated with the query appears in previously-submitted queries; and
eliminating one or more units associated with the query that appear less frequently in the previously submitted queries relative to other units associated with the query.
19. The computer system of claim 14, wherein modifying the query, to produce a modified query, includes:
determining how frequently each unit associated with the query appears in previously-submitted queries; and
eliminating one or more groups of units associated with the query that appear less frequently in the previously-submitted queries, relative to other units associated with the query, to produce a preliminary modified query.
20. The computer system of claim 14, wherein modifying the query to produce a modified query includes eliminating a unit associated with the query that corresponds to fewer search terms relative to other units associated with the query.
21. The computer system of claim 14, wherein the instructions, when executed by the one or more processors, further cause:
before the query is modified to produce the modified query, attempting to identify which of the predefined query strings match the query; and
returning a result that includes a subset of the sponsored links, wherein the modified query is generated in response to determining that the result includes less sponsored links than a predetermined number.
22. The computer system of claim 14, wherein the query is modified to produce the modified query before an attempt is made to match the query to the predefined query strings.
23. The computer system of claim 14, wherein modifying the query to produce the modified query using the rules includes determining how frequently units, groups of units, or associated units in the query appear in previously-submitted queries.
24. The computer system of claim 14, wherein the instructions, when executed by the one or more processors, further cause:
performing the search based on the query using a search engine to generate search results; and
causing said at least one of the sponsored links to be displayed on a display screen along with the search results.
25. The computer system of claim 24, wherein the instructions, when executed by the one or more processors, further cause:
causing the display screen to be visually divided into a sponsor section and a search-result section; and
causing said at least one of the sponsored links to be displayed within the sponsor section and the search results to be displayed within the search-result section.
26. The computer system of claim 14, wherein modifying the query to produce the modified query includes substituting the query with a synonym or a preferred form that corresponds to one of the predefined query strings.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US12/697,922 US20100138403A1 (en) | 2005-03-10 | 2010-02-01 | System for modifying queries before presentation to a sponsored search generator or other matching system where modifications improve coverage without a corresponding reduction in relevance |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/077,968 US7668808B2 (en) | 2005-03-10 | 2005-03-10 | System for modifying queries before presentation to a sponsored search generator or other matching system where modifications improve coverage without a corresponding reduction in relevance |
US12/697,922 US20100138403A1 (en) | 2005-03-10 | 2010-02-01 | System for modifying queries before presentation to a sponsored search generator or other matching system where modifications improve coverage without a corresponding reduction in relevance |
Related Parent Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/077,968 Continuation US7668808B2 (en) | 2005-03-10 | 2005-03-10 | System for modifying queries before presentation to a sponsored search generator or other matching system where modifications improve coverage without a corresponding reduction in relevance |
Publications (1)
Publication Number | Publication Date |
---|---|
US20100138403A1 (en) | 2010-06-03 |
Family
ID=36526988
Family Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/077,968 Expired - Fee Related US7668808B2 (en) | 2005-03-10 | 2005-03-10 | System for modifying queries before presentation to a sponsored search generator or other matching system where modifications improve coverage without a corresponding reduction in relevance |
US12/697,922 Abandoned US20100138403A1 (en) | 2005-03-10 | 2010-02-01 | System for modifying queries before presentation to a sponsored search generator or other matching system where modifications improve coverage without a corresponding reduction in relevance |
Family Applications Before (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/077,968 Expired - Fee Related US7668808B2 (en) | 2005-03-10 | 2005-03-10 | System for modifying queries before presentation to a sponsored search generator or other matching system where modifications improve coverage without a corresponding reduction in relevance |
Country Status (2)
Country | Link |
---|---|
US (2) | US7668808B2 (en) |
WO (1) | WO2006099116A1 (en) |
Families Citing this family (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7689554B2 (en) * | 2006-02-28 | 2010-03-30 | Yahoo! Inc. | System and method for identifying related queries for languages with multiple writing systems |
US7483894B2 (en) * | 2006-06-07 | 2009-01-27 | Platformation Technologies, Inc | Methods and apparatus for entity search |
US7774198B2 (en) * | 2006-10-06 | 2010-08-10 | Xerox Corporation | Navigation system for text |
US7672937B2 (en) * | 2007-04-11 | 2010-03-02 | Yahoo, Inc. | Temporal targeting of advertisements |
US8112435B2 (en) * | 2007-04-27 | 2012-02-07 | Wififee, Llc | System and method for modifying internet traffic and controlling search responses |
US20090006311A1 (en) * | 2007-06-28 | 2009-01-01 | Yahoo! Inc. | Automated system to improve search engine optimization on web pages |
US9122743B2 (en) * | 2008-01-30 | 2015-09-01 | International Business Machines Corporation | Enhanced search query modification |
US8312095B2 (en) | 2008-01-30 | 2012-11-13 | International Business Machines Corporation | Tracking interactive text-message communications |
US20090248627A1 (en) * | 2008-03-27 | 2009-10-01 | Yahoo! Inc. | System and method for query substitution for sponsored search |
US20100125597A1 (en) * | 2008-11-14 | 2010-05-20 | Yahoo! Inc. | System and method for determining search terms for use in sponsored searches |
US8543381B2 (en) * | 2010-01-25 | 2013-09-24 | Holovisions LLC | Morphing text by splicing end-compatible segments |
US8161073B2 (en) | 2010-05-05 | 2012-04-17 | Holovisions, LLC | Context-driven search |
US20110313756A1 (en) * | 2010-06-21 | 2011-12-22 | Connor Robert A | Text sizer (TM) |
US8819000B1 (en) * | 2011-05-03 | 2014-08-26 | Google Inc. | Query modification |
US20140172821A1 (en) * | 2012-12-19 | 2014-06-19 | Microsoft Corporation | Generating filters for refining search results |
DE102013003055A1 (en) * | 2013-02-18 | 2014-08-21 | Nadine Sina Kurz | Method and apparatus for performing natural language searches |
Citations (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5201048A (en) * | 1988-12-01 | 1993-04-06 | Axxess Technologies, Inc. | High speed computer system for search and retrieval of data within text and record oriented files |
US5864846A (en) * | 1996-06-28 | 1999-01-26 | Siemens Corporate Research, Inc. | Method for facilitating world wide web searches utilizing a document distribution fusion strategy |
US6282537B1 (en) * | 1996-05-30 | 2001-08-28 | Massachusetts Institute Of Technology | Query and retrieving semi-structured data from heterogeneous sources by translating structured queries |
US20020100055A1 (en) * | 2001-01-22 | 2002-07-25 | Zeidman Robert M. | Method for advertisers to sponsor broadcasts without commercials |
US20020169760A1 (en) * | 1999-05-28 | 2002-11-14 | Cheung Dominic Dough-Ming | System and method for providing place and price protection in a search result list generated by a computer network search engine |
US6671681B1 (en) * | 2000-05-31 | 2003-12-30 | International Business Machines Corporation | System and technique for suggesting alternate query expressions based on prior user selections and their query strings |
US6745178B1 (en) * | 2000-04-28 | 2004-06-01 | International Business Machines Corporation | Internet based method for facilitating networking among persons with similar interests and for facilitating collaborative searching for information |
US6816857B1 (en) * | 1999-11-01 | 2004-11-09 | Applied Semantics, Inc. | Meaning-based advertising and document relevance determination |
US20050033641A1 (en) * | 2003-08-05 | 2005-02-10 | Vikas Jha | System, method and computer program product for presenting directed advertising to a user via a network |
US20050232131A1 (en) * | 2004-04-15 | 2005-10-20 | Bulleit Douglas A | Systems, methods and computer program products for providing sponsored proactive searches for sponsored quality of service network connections |
US7133866B2 (en) * | 2002-10-02 | 2006-11-07 | Hewlett-Packard Development Company, L.P. | Method and apparatus for matching customer symptoms with a database of content solutions |
US7428529B2 (en) * | 2004-04-15 | 2008-09-23 | Microsoft Corporation | Term suggestion for multi-sense query |
US7444324B2 (en) * | 1998-07-15 | 2008-10-28 | A9.Com, Inc. | Search query processing to identify search string corrections that reflect past search query submissions of users |
US20110093709A1 (en) * | 2004-06-14 | 2011-04-21 | Christopher Lunt | Providing Social-Network Information to Third-Party Systems |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1665093A4 (en) | 2003-08-21 | 2006-12-06 | Idilia Inc | System and method for associating documents with contextual advertisements |
- 2005-03-10: US application US11/077,968, published as US7668808B2 (not active; Expired - Fee Related)
- 2006-03-09: WO application PCT/US2006/008563, published as WO2006099116A1 (active; Application Filing)
- 2010-02-01: US application US12/697,922, published as US20100138403A1 (not active; Abandoned)
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100114863A1 (en) * | 2007-09-07 | 2010-05-06 | Ryan Steelberg | Search and storage engine having variable indexing for information associations |
US8751479B2 (en) * | 2007-09-07 | 2014-06-10 | Brand Affinity Technologies, Inc. | Search and storage engine having variable indexing for information associations |
Also Published As
Publication number | Publication date |
---|---|
WO2006099116A1 (en) | 2006-09-21 |
US20060206474A1 (en) | 2006-09-14 |
US7668808B2 (en) | 2010-02-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US7668808B2 (en) | | System for modifying queries before presentation to a sponsored search generator or other matching system where modifications improve coverage without a corresponding reduction in relevance |
US7698331B2 (en) | | Matching and ranking of sponsored search listings incorporating web search technology and web content |
KR100851710B1 (en) | | Lateral search |
US8019749B2 (en) | | System, method, and user interface for organizing and searching information |
US8014997B2 (en) | | Method of search content enhancement |
US9015176B2 (en) | | Automatic identification of related search keywords |
JP5114380B2 (en) | | Reranking and enhancing the relevance of search results |
US6718365B1 (en) | | Method, system, and program for ordering search results using an importance weighting |
KR100699977B1 (en) | | Method and apparatus for identifying related searches in a database search system |
US7962463B2 (en) | | Automated generation, performance monitoring, and evolution of keywords in a paid listing campaign |
US20130060747A1 (en) | | Web search system with group interaction support |
US20030046098A1 (en) | | Apparatus and method that modifies the ranking of the search results by the number of votes cast by end-users and advertisers |
US8375048B1 (en) | | Query augmentation |
AU2005267370A1 (en) | | Results based personalization of advertisements in a search engine |
US9300757B1 (en) | | Personalizing aggregated news content |
US9305088B1 (en) | | Personalized search results |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |
| | AS | Assignment | Owner name: YAHOO HOLDINGS, INC., CALIFORNIA; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNOR: YAHOO! INC.; REEL/FRAME: 042963/0211; Effective date: 20170613 |
| | AS | Assignment | Owner name: OATH INC., NEW YORK; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNOR: YAHOO HOLDINGS, INC.; REEL/FRAME: 045240/0310; Effective date: 20171231 |