CN1430751A - Method and apparatus for identifying related searches in a database search system - Google Patents

Method and apparatus for identifying related searches in a database search system

Info

Publication number
CN1430751A
Authority
CN
China
Prior art keywords
search
relevant
search listing
addelement
url
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN01809998A
Other languages
English (en)
Other versions
CN1430751B (zh)
Inventor
P·G·罗雷克斯
T·A·索拉尼勒
B·R·豪加尔德
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fly Upward Management Co Ltd
Original Assignee
Overture Services Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Overture Services Inc
Publication of CN1430751A
Application granted
Publication of CN1430751B
Anticipated expiration
Legal status: Expired - Lifetime

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9538Presentation of query results
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/02Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241Advertisements
    • G06Q30/0273Determination of fees for advertising
    • G06Q30/0275Auctions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/951Indexing; Web crawling techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/02Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241Advertisements
    • G06Q30/0277Online advertisement
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/04Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10STECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00Data processing: database and file management or data structures
    • Y10S707/99931Database or file accessing
    • Y10S707/99933Query processing, i.e. searching
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10STECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00Data processing: database and file management or data structures
    • Y10S707/99941Database schema or data structure
    • Y10S707/99942Manipulating data structure, e.g. compression, compaction, compilation
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10STECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00Data processing: database and file management or data structures
    • Y10S707/99941Database schema or data structure
    • Y10S707/99943Generating database or data structure, e.g. via user interface

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Theoretical Computer Science (AREA)
  • Development Economics (AREA)
  • Strategic Management (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Economics (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Game Theory and Decision Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Technology Law (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Transition And Organic Metals Composition Catalysts For Addition Polymerization (AREA)

Abstract

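A method of generating a search result list that also provides related searches to a searcher. Search listings that match a search request submitted by the searcher are identified in a pay-for-performance database containing a plurality of search listings. Related search listings, contained in a related search database generated from the pay-for-performance database, are identified as relevant to the search request. A search result list that includes the identified search listings and one or more of the identified related search listings is returned to the searcher.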

Description

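Method and apparatus for identifying related searches in a database search system
Appendix/Copyright Reference
A portion of the disclosure of this patent document contains material that is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the U.S. Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights.
An appendix containing computer program source code is attached. The appendix is expressly incorporated herein by reference and contains material subject to copyright protection as described above.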
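Background of the Invention
The present invention relates generally to methods and systems for generating a search result list, for example with an Internet-based search engine. More particularly, it relates to methods and systems for generating search results from a pay-for-performance database and generating a list of related searches from a related search database.
Search engines are commonly used to search the information available on computer networks such as the World Wide Web, so that users can locate information of interest stored within the network. To use a search engine, a user or searcher typically enters one or more search terms, which the search engine uses to generate a listing of information, such as web pages, that the searcher can then access and use. The information produced by the search is commonly identified as a result of an association established between the information and the one or more search terms entered by the user. Different search engines use different techniques to associate information with search terms and to identify relevant information. They also use different techniques to present the identified information to the user. Accordingly, the likelihood that a given piece of information is found as a result of a search varies with the search engine used to perform the search.
This uncertainty is of particular concern to web page operators who make information available on the World Wide Web. In this setting, several web page operators or advertisers typically compete for the same group of potential viewers or users. The ability to have a web page identified as a search result is therefore often important to the success of the page, and web page operators often seek to increase the likelihood that their pages will be seen as a search result.
One type of search engine that gives web page operators a more predictable way of being seen as the result of a search is a "pay for performance" arrangement, in which web pages are displayed based at least in part on a monetary sum that the advertiser or web page operator agrees to pay to the search engine operator. The web page operator agrees to pay an amount of money, commonly referred to as the bid amount, in exchange for a particular position in the set of search results generated in response to a search term entered by a user. A higher bid amount yields a more prominent placement in the set of search results. A web page operator may therefore place high bids on one or more search terms to increase the likelihood that its page will be seen as a search result for that term. However, there are many similar search terms, and it is difficult for a web page operator to bid on every potentially relevant search term. Likewise, it is unlikely that a bid will be placed on every search term. A search engine operator may therefore receive no revenue from searches performed with search terms on which no bid has been placed.
In addition, because the number of existing web pages keeps growing, it is becoming more difficult for a user to find relevant search results. The difficulty is compounded because search engines depend on the search terms the user enters: the search results a user receives depend directly on those terms. Entering one search term may produce no relevant results, while entering an only slightly different search term may produce relevant results. The choice of search terms is therefore often an important part of the search process. Recommending related searches to the searcher could benefit both the searcher and the advertisers who supply listings to the search engine. However, current search engines do not allow a search engine operator to provide a user with related search terms, such as terms that would produce relevant search results. A system that overcomes these deficiencies is needed.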
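Summary
By way of introduction only, in one embodiment of the invention a search request is received from a searcher and used to perform a search on a pay-for-performance database. The pay-for-performance database stores search listings that include web page locators and the bid amounts payable by the operators of the listed web pages. The search of the pay-for-performance database produces search results that are presented to the searcher. The search request is also used to perform a search on a related search database, which has been formed at least in part from the contents of the pay-for-performance database. The search of the related search database produces a list of related searches that is presented to the searcher.
According to a second embodiment, a database of related searches is created using a pay-for-performance database. All text from the web pages referenced by the pay-for-performance database is stored and used to generate an inverted index. Additional indexes are used to improve the relevance and spread of the related search results obtained using the database.
The foregoing discussion of illustrative embodiments is provided only by way of introduction. Nothing in this section should be taken as a limitation on the following claims, which define the scope of the invention.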
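Brief Description of the Several Views of the Drawings
FIG. 1 is a block diagram illustrating a database search system in conjunction with a computer network;
FIG. 2 is a flow diagram illustrating a method for operating the database search system of FIG. 1;
FIG. 3 is a flow diagram illustrating a method for operating the database search system of FIG. 1;
FIG. 4 is a flow diagram illustrating in more detail a portion of the method shown in FIG. 2;
FIG. 5 is a flow diagram illustrating in more detail a portion of the method shown in FIG. 2;
FIG. 6 is a flow diagram illustrating a method for forming a related search database; and
FIG. 7 is a flow diagram illustrating a method for deleting similar page information from a database.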
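Detailed Description of the Presently Preferred Embodiments
Referring now to the drawings, FIG. 1 is a block diagram of a database search system 100 shown in conjunction with a computer network 102.
The database search system 100 includes a pay-for-performance database 104, a related search database 106, a search engine web server 108, a related search web server 110, and a search engine web page 114. The servers 104, 106, 108 are accessible over the network 102 by an advertiser web server 120 or a client computer 122.
In the illustrated embodiment the network 102 is the Internet and provides data communication according to appropriate standards such as the Internet Protocol (IP). In other embodiments, other network systems may be used alone or in conjunction with the Internet. Communication in the network 102 is preferably according to the Internet Protocol or a similar data communication standard; other data communication standards may also be used to ensure reliable data communication.
The database search system 100 is configured as part of a client and server architecture. In the context of a computer network such as the Internet, a client is a process, such as a program, task, or application, that requests a service provided by another process, known as a server program. The client process uses the requested service without needing to know any working details about the server program or the server itself. In networked systems, a client process usually runs on a computer that accesses shared network resources provided by another computer running a corresponding server process. A server is typically a remote computer system that is accessible over a communication medium such as a network, and it acts as an information provider for a computer network. The system 100 thus operates as a server accessed by clients such as the client computer 122 and the advertiser web server 120.
The client computer 122 may be a conventional personal computer, a workstation, or a computer system of any size. Each client computer 122 typically includes one or more processors, memory, input and output devices, and a network interface such as a modem. The advertiser web server 120, the search engine web server 108, the related search web server 110, and the account management web server 112 may be similarly configured. However, each of these servers may include multiple computers connected by a separate private network.
The client computer 122 executes a World Wide Web ("web") browser program 124. Examples of such programs are Navigator, available from Netscape Communications Corporation, and Internet Explorer, available from Microsoft Corporation. A user employs the browser program 124 to enter the addresses of specific web pages to be retrieved. These addresses are referred to as Uniform Resource Locators (URLs). In addition, once a page has been retrieved, the browser program 124 can provide access to other pages or records when the user clicks on hyperlinks to other web pages contained in that page. Such hyperlinks give the user an automated way to enter the URL of another page and retrieve it. The pages can be data records containing content such as plain textual information, or more complex digitally encoded multimedia content such as software programs, graphics, audio data, video data, and so forth.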
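The client computer 122 communicates through the network 102 with various network information providers, including the advertiser web server 120, the account management server 112, the search engine server 108, and the related search web server 110. Preferably, communication functionality is provided by the HyperText Transfer Protocol (HTTP), although other communication protocols such as FTP, SNMP, Telnet, and others known in the art may be used. Preferably, the search engine server 108, the related search server 110, the account management server 112, and the advertiser server 120 are located on the World Wide Web. U.S. patent application Ser. No. 09/322,627, filed May 29, 1999, entitled "System and Method for Influencing a Position on a Search Result List Generated by a Computer Network Search Engine", and U.S. patent application Ser. No. 09/494,818, filed January 31, 2000, entitled "Method and System for Generating a Set of Search Terms", are both assigned to the assignee of the present application and are incorporated herein by reference. These applications disclose other aspects of search engine systems.
In the illustrated embodiment, the account management web server 112 includes a computer storage medium, such as a magnetic disk system, and a processing system. A database is stored on the storage medium and contains advertiser account information. A conventional browser program 124 running on the client computer 122 may be used to access the advertiser account information stored on the account management server 112.
The search engine web server 108 allows network users, upon navigating to the search engine web server URL or to sites on other web servers capable of submitting queries to the search engine web server 108 through a browser program 124, to type keyword queries that identify pages of interest among the millions of pages available on the web. In one embodiment of the invention, the search engine web server 108 generates a search result list that includes, at least in part, relevant entries obtained from and formatted by the results of the bidding process conducted by the account management server 112. The search engine web server 108 generates a list of hypertext links to documents that contain information relevant to the search terms entered by the user at the client computer 122. The search engine web server transmits this list to the network user in the form of a web page 114, which is displayed on the browser 124 running on the client computer 122. One embodiment of the search engine web server may be found by navigating to the web page at URL http://www.goto.com/.
The search engine web server 108 is connected to the network 102. In one embodiment of the invention, the search engine web server 108 includes a pay-for-performance database 104 containing a plurality of search listings. The database 104 contains an ordered collection of search listing records used to generate search results in response to user queries. Each search listing record contains the URL of an associated web page or document, a title, descriptive text, and a bid amount. In addition, the search engine web server 108 may also be connected to the account management server 112, which may in turn be connected to the network 102.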
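In addition, in the illustrated embodiment of FIG. 1, the database search system 100 further includes a related search web server 110 and an associated related search database 106. The related search web server 110 and database 106 provide a searcher with suggested, related searches along with the search results responsive to the query. Users who search with a search engine web server such as the server 108 often perform searches that are poorly focused compared with the data indexed by the site's search engine. Users may use search terms that are either too vague and general, such as "music", or too specific and focused, such as "hot jazz from New Orleans in the period before the 1950s". Some users need help refining their queries to get more useful information from the search engine. The related search web server 110 provides the user with query suggestions that are well suited to the capabilities of the pay-for-performance database 104.
In the illustrated embodiment, the pay-for-performance database 104 is built in conjunction with advertisers who operate web servers such as the advertiser web server 120. Advertiser web pages 121 are displayed on the advertiser web server 120. An advertiser or web site promoter participates in a competitive bidding process with other advertisers through an account residing on the account management server 112. An advertiser may bid on any number of search terms relevant to the content of the advertiser's web site.
The bids submitted by web site promoters are used to control the presentation of search results to a searcher using the client computer 122. Higher bids receive more advantageous placement on the search result list generated by the search engine web server 108 when a search using the search term bid on by the advertiser is executed. In one embodiment, the amount bid by an advertiser comprises a money amount that is deducted from the advertiser's account each time the advertiser's web site is accessed via a hyperlink on the search result list page. A searcher clicks on the hyperlink with a computer input device such as a mouse to initiate a retrieval request for the information associated with the advertiser's hyperlink. Preferably, each access or click on a search result list hyperlink is redirected to the search engine web server 108 to associate the click with the account identifier for the advertiser. This redirect action, which is not apparent to the searcher, accesses account identification information coded into the search result page before the advertiser's URL is accessed using the search result list hyperlink clicked on by the searcher. In the illustrated embodiment, the advertiser's web site description and hyperlink on the search result list page are accompanied by an indication that the advertiser's listing is a paid listing. Each paid listing displays an amount corresponding to the price payable by the advertiser for each referral to the advertiser's site through the search result list.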
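The per-click charging described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class `ClickLedger`, its method names, the use of integer cents, and the policy of rejecting a click when the balance is too low are all hypothetical and do not come from the patent.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical ledger: charges an advertiser's account once per click-through. */
final class ClickLedger {
    // Advertiser account identifier -> remaining balance in cents (assumed unit).
    private final Map<String, Long> balanceCents = new HashMap<>();

    void deposit(String advertiserId, long cents) {
        balanceCents.merge(advertiserId, cents, Long::sum);
    }

    /**
     * Records one click: deducts the bid amount from the advertiser's account,
     * then returns the advertiser URL the searcher should be redirected to.
     * Rejecting clicks on an exhausted account is an assumption of this sketch.
     */
    String recordClick(String advertiserId, long bidCents, String advertiserUrl) {
        long balance = balanceCents.getOrDefault(advertiserId, 0L);
        if (balance < bidCents) {
            throw new IllegalStateException("insufficient funds for " + advertiserId);
        }
        balanceCents.put(advertiserId, balance - bidCents);
        return advertiserUrl;
    }

    long balance(String advertiserId) {
        return balanceCents.getOrDefault(advertiserId, 0L);
    }
}
```

The redirect is modelled simply as the returned URL; in a deployed system the charge and the HTTP redirect would happen in the same request handler.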
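The searcher may click on the hypertext link associated with each listing in the search result page to access the corresponding web page. The hypertext links may access web pages anywhere on the Internet and include paid listings for advertiser web pages located on the advertiser web server 120. In one embodiment of the invention, the search result list also includes unpaid listings that are not placed as a result of advertiser bids and that are generated by a conventional search engine such as the Inktomi, Lycos, or Yahoo! search engines. The unpaid hypertext links may also include links manually indexed into the pay-for-performance database 104 by an editorial team. Preferably, the unpaid listings follow the paid advertiser listings on the search results page.
The related search web server 110 receives the search request that the searcher at the client computer 122 entered using the search engine web page 114. In the related search database 106, which contains related search listings generated from the pay-for-performance database 104, the related search web server 110 identifies related search listings relevant to the search request. In conjunction with the search engine web server 108, the related search web server 110 returns to the searcher a search result list that includes the identified search listings located in the pay-for-performance database and one or more of the identified related search listings located in the related search database 106. Operation of the related search web server 110 together with the related search database is described below in conjunction with FIGS. 2-5. Formation of the related search database 106 is described below in conjunction with FIG. 6.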
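FIG. 2 is a flow diagram illustrating a method for operating the database search system 100 of FIG. 1. The method begins at block 200. Java source code for implementing the method of FIG. 2 and the other method steps described herein is included as an appendix.
At block 202, a search request is received. The search request may be received in any suitable manner. It is envisioned that a search request will be initiated by a searcher using a client computer to access the search engine web page of the database search system that implements the method of FIG. 2. A search request may be typed in as input text, and a hypertext link may then be clicked to initiate the search request and the search process.
After block 202, two parallel processes are initiated. At block 204, the search engine web server of the database search system identifies matching search listings in the system's pay-for-performance database. The search engine web server may additionally identify unpaid search listings.
Similarly, at block 206, a related search web server initiates a search to identify matching related search listings in the related search database. Matching search listings means that the respective search engine identifies search listings contained in the respective database that generate a match with the search request. A match may be generated if there is an exact, letter-for-letter text match between a bidded keyword and a search term. In other embodiments, a match may be generated if a bidded keyword has a predetermined relationship with a search term. For example, the predetermined relationship may include matching the root of a word that has been stripped of suffixes; in a multiple-word query, matching several but not all of the words; or locating the words of a multiple-word query within a predetermined number of words of each other.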
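A minimal Java sketch of three of the matching relationships just listed (exact match, suffix-stripped root match, and partial match on a multiple-word query) follows. The class `KeywordMatcher`, the tiny suffix list, and the `minShared` threshold are illustrative assumptions; the patent does not specify a stemming algorithm or a threshold.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Hypothetical matcher illustrating exact, root, and partial keyword matches. */
final class KeywordMatcher {
    private KeywordMatcher() {}

    // Assumed suffix list for illustration only; a real system would use a proper stemmer.
    private static final String[] SUFFIXES = {"ing", "ers", "er", "es", "s"};

    static List<String> tokens(String text) {
        return Arrays.asList(text.toLowerCase(Locale.ROOT).trim().split("\\s+"));
    }

    /** Exact, letter-for-letter match between bidded keyword and search term. */
    static boolean exactMatch(String keyword, String query) {
        return keyword.equals(query);
    }

    /** Strips the first matching suffix, keeping at least three characters of root. */
    static String stem(String word) {
        for (String suffix : SUFFIXES) {
            if (word.length() > suffix.length() + 2 && word.endsWith(suffix)) {
                return word.substring(0, word.length() - suffix.length());
            }
        }
        return word;
    }

    /** Match on word roots, position by position. */
    static boolean stemMatch(String keyword, String query) {
        List<String> k = tokens(keyword);
        List<String> q = tokens(query);
        if (k.size() != q.size()) {
            return false;
        }
        for (int i = 0; i < k.size(); i++) {
            if (!stem(k.get(i)).equals(stem(q.get(i)))) {
                return false;
            }
        }
        return true;
    }

    /** Match when at least minShared distinct words are common to both phrases. */
    static boolean partialMatch(String keyword, String query, int minShared) {
        Set<String> shared = new HashSet<>(tokens(keyword));
        shared.retainAll(tokens(query));
        return shared.size() >= minShared;
    }
}
```

Proximity matching is omitted here because it needs word positions within a document rather than two short phrases.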
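If search results have been located, at block 208 the search results from the pay-for-performance database are combined with the search results from the related search database. At block 210, a search result list is returned to the searcher, for example by displaying the identified search listings on the search engine web page and transmitting the web page data over the network to the client computer. The search results and related search results may be displayed in any convenient manner.
An example of a search result list display used in one embodiment of the invention is shown in FIG. 3, which is a display of the first several entries resulting from a search for the term "CD burner". The exemplary display of FIG. 3 shows a portion of a search result list including a plurality of entries 310a, 310b, 310c, 310d, 310e, 310f, 310g, 310h, 310i, a list 312 of other search categories, and a related search list 314.
As shown in FIG. 3, a single entry, such as entry 310a in the search result list, consists of a description 320 of the web site, preferably comprising a title and a short textual description, and a hyperlink 330 which, when clicked by a searcher, directs the searcher's browser to the URL where the described web site is located. The URL 340 may also be displayed in the search result list entry 310a, as shown in FIG. 3. A "click through" of a search result item occurs when the remote searcher viewing the search result item display 310 of FIG. 3 selects, or clicks on, the hyperlink 330 of the search result item display 310.
Search result list entries 310a-310h may also show the rank value 360a, 360b, 360c, 360d, 360e, 360f, 360g, 360h, 360i of the advertiser's search listing. The rank value 360a-360i is an ordinal value, preferably a number, generated and assigned to the search listing by the processing system of the search engine web server. Preferably, the rank value 360a-360i is assigned in a process, implemented in software, that establishes an association between the bid amount, the rank, and the search term of a search listing. The process gathers all search listings that match a particular search term, sorts them in order from highest to lowest bid amount, and assigns a rank value to each search listing in order. The highest bid amount receives the highest rank value, the next highest bid amount receives the next highest rank value, and so on down to the lowest bid amount, which receives the lowest rank value. The correlation between rank value and bid amount is illustrated in FIG. 3, where each paid search list entry 310a-310h displays the advertiser's bid amount 350a, 350b, 350c, 350d, 350e, 350f, 350g, 350h, 350i for that entry. If two search listings with the same search term also have the same bid amount, the bid that was received earlier in time is assigned the higher rank value.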
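The rank assignment described above (sort by bid amount from highest to lowest, break ties in favor of the earlier bid, then number the listings in order) can be sketched as follows. The names `BidRanker` and `Listing`, the use of cents and millisecond timestamps, and the convention that rank 1 denotes the most prominent position are assumptions of this sketch.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical rank assignment for listings that match one search term. */
final class BidRanker {
    private BidRanker() {}

    static final class Listing {
        final String url;
        final long bidCents;          // bid amount (assumed unit: cents)
        final long receivedAtMillis;  // when the bid was received
        int rank;                     // assigned by rank(); 1 = most prominent

        Listing(String url, long bidCents, long receivedAtMillis) {
            this.url = url;
            this.bidCents = bidCents;
            this.receivedAtMillis = receivedAtMillis;
        }
    }

    /** Returns a new list ordered by descending bid, earlier bid first on ties. */
    static List<Listing> rank(List<Listing> matches) {
        List<Listing> sorted = new ArrayList<>(matches);
        sorted.sort(Comparator.comparingLong((Listing l) -> l.bidCents).reversed()
                .thenComparingLong(l -> l.receivedAtMillis));
        for (int i = 0; i < sorted.size(); i++) {
            sorted.get(i).rank = i + 1;
        }
        return sorted;
    }
}
```

Note that `reversed()` is applied before the tie-breaker is chained, so only the bid comparison is descending and the timestamp comparison stays ascending.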
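The search result list of FIG. 3 does not include unpaid listings. In the preferred embodiment, unpaid listings do not display a bid amount and are displayed after the lowest-ranked paid listing. Unpaid listings are generated by a search engine that uses an object distributed database and text searching algorithms known in the art. An example of such a search engine is the one operated by Inktomi Corporation. The original search query entered by the remote searcher is used to generate unpaid listings through the conventional search engine.
The list 312 of other search categories shows other possible categories for searching that may be related to the searcher's input search term 316. The other search categories are selected for display by identifying a group, for example Computer Hardware, that contains the input search term 316. The categories in that group are then displayed as hyperlinks that the searcher can click for additional searching. This improves user convenience in cases where the user's input search does not find suitable search results.
The related search list 314 displays six entries 318 for related searches determined using the related search database described herein. In other embodiments, other numbers of related search entries may be displayed. In addition, a link 320 labelled "more" allows the user to display additional related search entries. In the illustrated embodiment, the displayed entries 318 are the top six most relevant and most bid-upon terms in the related search database.
Referring now to FIG. 4, in one embodiment the act of identifying matching related search listings in a related search database (FIG. 2, act 206) includes the following acts. At block 400, an inverted index is searched that contains all data from all web pages contained in the pay-for-performance database of the database search system. The inverted index is stored in the related search database. In an inverted index, a single index entry is used to reference many database records. Finding multiple matches is usually faster with an inverted index because each index entry may reference many database records. The inverted index lists the words that can be searched, for example in alphabetical order, and accompanying each word are pointers that identify the particular documents containing the word and the locations within each document at which the word occurs. To perform a search, instead of searching through the documents in word order, the computer locates the pointers for the particular words identified in a search query and processes them. The computer identifies those documents in which the search query terms have the required order and proximity relationships.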
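The inverted index described above, with each word pointing to the documents that contain it and to the word's positions within each document, can be sketched as a small in-memory structure. The class `InvertedIndex`, its tokenization on non-word characters, and the `inOrderWithin` check are assumptions made for illustration; the described embodiment relies on a vendor database engine rather than code like this.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical positional inverted index: word -> (document id -> word positions). */
final class InvertedIndex {
    private final Map<String, Map<Integer, List<Integer>>> postings = new HashMap<>();

    /** Indexes every word of the text under the given document id. */
    void add(int docId, String text) {
        int pos = 0;
        for (String w : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (w.isEmpty()) {
                continue;
            }
            postings.computeIfAbsent(w, k -> new HashMap<>())
                    .computeIfAbsent(docId, k -> new ArrayList<>())
                    .add(pos);
            pos++;
        }
    }

    /** Ids of all documents containing the word, found without scanning any document. */
    Set<Integer> documentsContaining(String word) {
        Map<Integer, List<Integer>> m = postings.get(word.toLowerCase(Locale.ROOT));
        if (m == null) {
            return Collections.emptySet();
        }
        return new TreeSet<>(m.keySet());
    }

    /** True if 'second' occurs after 'first' within maxGap positions in the document. */
    boolean inOrderWithin(int docId, String first, String second, int maxGap) {
        for (int p : positions(docId, first)) {
            for (int q : positions(docId, second)) {
                if (q > p && q - p <= maxGap) {
                    return true;
                }
            }
        }
        return false;
    }

    private List<Integer> positions(int docId, String word) {
        Map<Integer, List<Integer>> m = postings.get(word.toLowerCase(Locale.ROOT));
        if (m == null || !m.containsKey(docId)) {
            return Collections.emptyList();
        }
        return m.get(docId);
    }
}
```

Storing positions alongside document ids is what makes the order and proximity checks possible from the index alone.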
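At block 402, meta-information is also searched for the received search term. Meta-information is abstract, once-removed information about the received data itself that forms a description of the data. Meta-information is derived from the information and from related information. Meta-information for a listing describes the relationship of the listing to other listings, and it describes the relationship between the advertiser sponsoring the listing and other advertisers.
Meta-information is obtained by using scripts of commands to analyze the pay-for-performance database and determine the information and relationships that appear in the data. Meta-information is collected for each row of data in the database and added to that row. In one embodiment, the script is run once as a batch process after the data has been collected in the database. In other embodiments, the script is rerun periodically to update the meta-information.
The meta-information about the web pages and keywords contained in the pay-for-performance database includes information such as the frequency with which similar keywords occur among different web page domains and the number of different keywords associated with a single web page. The meta-information may further include fielded advertiser data: the information contained in each search listing supplied by the web page promoters who have bid on search terms in the pay-for-performance database, advertiser identification information, web site themes such as gambling or adult content, and derived themes. Preferably, the meta-information is combined in a common inverted index with the stored web page data searched at block 400.
The result of the searches at block 400 and block 402 is a list of rows of the inverted index, or of the index containing the searched information. Each row contains the information associated with one search listing of the pay-for-performance database, together with all text of the web page associated with that search listing. In the illustrated embodiment, a search listing includes the advertiser's search term, the URL of the web page, a title, and descriptive text.
At block 404, the returned related search results are sorted by relevance. Any suitable sorting routine may be used. A preferred process for sorting search results by relevance is illustrated in more detail in FIG. 5.
At block 406, the six most relevant related search results are provided. Any suitable number of search results may be provided; the choice of offering six related searches as suggestions to a searcher is arbitrary. After block 406, control proceeds to block 208 of FIG. 2.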
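FIG. 5 is a flow diagram illustrating a method for sorting by relevance the search results obtained from a related search database, corresponding to block 404 of FIG. 4. In the embodiment of FIG. 5, a relevance value is maintained for each returned listing. The relevance value is adjusted according to the specific relevance factors defined in FIG. 5. Other relevance factors may also be used. After the relevance values have been adjusted, a final sort is performed and the highest-valued listings are returned.
At block 500, the relevance value of each individual record located during the search (blocks 400, 402, FIG. 4) is increased according to the frequency with which the queried search term occurs in that record. For example, if the queried search term occurs frequently in the text associated with the search listing, the relevance of that listing is increased. If the queried search term occurs rarely or not at all in the listing, the relevance value of that listing is not increased or is decreased.
At block 502, it is determined whether the search query submitted by the searcher contains multiple search terms. If not, control proceeds to block 506. If there are multiple search terms, at block 504 the relevance of an individual search result is increased according to the proximity of the search terms within the located record. Thus, if two search terms are directly adjacent, the relevance value for that record may be increased substantially, suggesting that the identified search listing is highly relevant to the search query submitted by the searcher. On the other hand, if the two search terms occur, for example, in the same sentence but not close together, the relevance of the record may be increased only slightly, reflecting the lower relevance implied by the reduced proximity of the search terms.
At block 506, it is determined whether the located record contains a bidded search term. Search terms are bid on by advertisers, and the bids are used in displaying search results from the search engine web server that uses the pay-for-performance database. If the search result does include a bidded search term, the relevance of the record is adjusted at block 508. If it does not include one or more bidded search terms, control proceeds to block 510.
At block 510, it is determined whether the search term is in the description of the search listing. As shown in FIG. 3, each such listing includes a textual description of the contents of the listing's web site. If the search term is not included in the description, control proceeds to block 514. If the search term is included in the description, at block 512 the relevance of the located record is adjusted accordingly.
At block 514, it is determined whether the search term is located in the title of the search listing. As shown in FIG. 3, each search listing includes a title 360. If the search term is included in the title of a record, at block 516 the relevance of the record is adjusted accordingly. If the search term is not included in the title, control proceeds to block 518.
At block 518, it is determined whether the search term is included in the metatags of the search listing. Metatags are textual information included in a web site that is typically not displayed to users. However, the search listings contained in the pay-for-performance database include the metatags for searching and other purposes. If the search term is not included in the metatags, control proceeds to block 522. On the other hand, if the search term is included in the metatags of one or more search listings, at block 520 the relevance of the record is adjusted accordingly.
At block 522, it is determined whether the user's search term is included in the text of the bidded web page. If not, control proceeds to block 406, FIG. 4. However, if the search term is included in the text of the web page, at block 524 the relevance of the search listing record is adjusted accordingly.
Following the steps shown in FIG. 5, one or more, and preferably six, of the most relevant related search listings are returned and displayed to the searcher together with the search results from the pay-for-performance database.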
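The sequence of relevance adjustments in FIG. 5 can be sketched as a single additive scoring function. The patent does not give numeric weights, so every weight below, as well as the class `RelevanceScorer`, the `Listing` fields, and the substring-based tests, is a hypothetical stand-in chosen only to show the order of the checks.

```java
import java.util.Locale;

/** Hypothetical relevance scorer following the order of checks in FIG. 5. */
final class RelevanceScorer {
    private RelevanceScorer() {}

    static final class Listing {
        final String bidTerm, title, description, metaTags, pageText;

        Listing(String bidTerm, String title, String description,
                String metaTags, String pageText) {
            this.bidTerm = bidTerm;
            this.title = title;
            this.description = description;
            this.metaTags = metaTags;
            this.pageText = pageText;
        }
    }

    private static int count(String haystack, String needle) {
        if (needle.isEmpty()) {
            return 0;
        }
        int n = 0;
        for (int i = haystack.indexOf(needle); i >= 0;
                i = haystack.indexOf(needle, i + needle.length())) {
            n++;
        }
        return n;
    }

    private static boolean has(String field, String q) {
        return field.toLowerCase(Locale.ROOT).contains(q);
    }

    /** All weights are illustrative assumptions, not values from the patent. */
    static double score(Listing l, String query) {
        String q = query.trim().toLowerCase(Locale.ROOT);
        String[] terms = q.split("\\s+");
        String all = (l.bidTerm + " " + l.title + " " + l.description + " "
                + l.metaTags + " " + l.pageText).toLowerCase(Locale.ROOT);
        double score = 0.0;
        for (String t : terms) {
            score += count(all, t);                 // block 500: term frequency
        }
        if (terms.length > 1) {                     // block 502: multiple terms?
            boolean allPresent = true;
            for (String t : terms) {
                allPresent &= all.contains(t);
            }
            if (all.contains(q)) {
                score += 10.0;                      // block 504: terms adjacent
            } else if (allPresent) {
                score += 2.0;                       // block 504: present but apart
            }
        }
        if (has(l.bidTerm, q)) score += 8.0;        // blocks 506-508: bidded term
        if (has(l.description, q)) score += 3.0;    // blocks 510-512: description
        if (has(l.title, q)) score += 5.0;          // blocks 514-516: title
        if (has(l.metaTags, q)) score += 2.0;       // blocks 518-520: metatags
        if (has(l.pageText, q)) score += 1.0;       // blocks 522-524: page text
        return score;
    }
}
```

Sorting the returned rows by this score in descending order and keeping the first six would correspond to blocks 404 and 406.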
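FIG. 6 illustrates a method for forming a related search database for use in the database search system of FIG. 1. The method begins at block 600.
At block 602, all text is extracted for all web pages in the pay-for-performance database. This includes metatags and other non-displayed textual information contained in the web page referenced by a URL that is contained in the pay-for-performance database. At block 604, text from similar pages is omitted. This reduces the total amount of data that must be processed to form the related search database, and it greatly increases the speed with which the related search database is produced. One embodiment of a method for performing this act is described below in conjunction with FIG. 7. At block 606, the text is stored in the related search database.
At block 608, an inverted index is created that indexes the search listing data stored at block 606 and the text extracted at block 602. The resulting inverted index contains a plurality of rows of data, each row including a keyword and all text from the database associated with that keyword.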
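An illustrative example of a structure for the contents of the related search database follows. Each row of the database includes the following parts:
canon_cnt           integer           # number of different search listings bid on this search result
advertiser_cnt      integer           # number of different advertisers bidding on this search result
related_result      varchar(50)       # related result (bidded search term), canonicalized and depluralized
raw_search_text     varchar(50)       # raw bidded search term
advertiser_ids      varchar(4096)     # explicit list of all advertisers bidding on this search result
words               varchar(65536+)   # full text of all crawled web pages, including hand-coded scripts
theme               varchar(50)
directory_taxonomy  varchar(200)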
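The number canon_cnt differs from the number advertiser_cnt because multiple different web pages in different domains may be bid against the same bidded search term, or multiple different advertisers may bid on just one search term. Theme-specific keywords are included in the database with "flags" inserted in the advertiser_cnt field. If advertiser_cnt = 999999999, the incoming query is an adult-oriented query. In this implementation, an optional enhancement is to suppress related results in that case. The numbers canon_cnt and advertiser_cnt are the currently derived data fields. Additional fields such as theme and directory_taxonomy_category can optionally be added to give more refined relevance to related result matches, although they are not used in the illustrated embodiment.
In one embodiment, the inverted index that is queried to obtain related results is created with the following SQL command:
SQL> create metamorph inverted index mm_index02 on line_ad02(words);
This is the vendor-specific method (using the Texis relational database management system supplied by Thunderstone-EPI, Inc.) for creating a free text search index (mm_index02) on a document (here, the document is contained in a database column (words)) that will be searched via the following Texis Thunderstone SQL command from RelatedSearcherCore.java: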
  "SELECT "
  + "$rank, "               // Num  getRow() arg position 0
  + "canon_cnt, "           // Int  getRow() arg position 1
  + "raw_search_text, "     // Stri getRow() arg position 2
  + "cannon_search_text, "  // Stri getRow() arg position 3
  + "advertiser_ids, "      // Stri getRow() arg position 4
  + "advertiser_cnt "       // Int  getRow() arg position 5
  + "FROM line_ad02 " + "WHERE words "
  + "LIKEP $query ORDER BY 1 desc, advertiser_cnt desc;";
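$rank is a vendor-supplied virtual data field that can be programmed to contain the "relevance" of the search result, based on the frequency with which the search phrase occurs in the "words" field, the proximity of the parts of the query phrase to one another in the indexed words field, and the word order of the query (if it has more than one query phrase word) compared with the order of words in the "words" field.
The "rank" is vendor specific and is derived by the various algorithms of different Free Text Search Engine providers, although in practice any vendor's free text search engine works similarly enough to implement the related search functionality.
"ORDER BY 1 desc[ending], advertiser_cnt desc[ending]" orders the query results first by relevance (field "1" == $rank) and then by the derived field "advertiser_cnt", the number of advertisers bidding on that particular related_search_result. Thus "relevance" is the primary selection criterion and "popularity" is the secondary selection criterion.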
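For readers more familiar with Java than with SQL, the two-level ordering expressed by that ORDER BY clause (relevance first, then number of bidding advertisers, both descending) can be sketched with a chained comparator. The class `RelatedResultOrdering` and its `Row` type are hypothetical; in the described embodiment the database engine performs this ordering itself.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical in-memory equivalent of "ORDER BY 1 desc, advertiser_cnt desc". */
final class RelatedResultOrdering {
    private RelatedResultOrdering() {}

    static final class Row {
        final int rank;           // relevance, as in the $rank column
        final int advertiserCnt;  // number of advertisers bidding on the term
        final String term;

        Row(int rank, String term, int advertiserCnt) {
            this.rank = rank;
            this.term = term;
            this.advertiserCnt = advertiserCnt;
        }
    }

    /** Returns a sorted copy: relevance descending, then popularity descending. */
    static List<Row> order(List<Row> rows) {
        List<Row> sorted = new ArrayList<>(rows);
        sorted.sort(Comparator.comparingInt((Row r) -> r.rank).reversed()
                .thenComparing(Comparator.comparingInt((Row r) -> r.advertiserCnt).reversed()));
        return sorted;
    }
}
```

Each key gets its own `reversed()` so that both levels sort from highest to lowest, matching the two `desc` keywords.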
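At block 610, additional indexes are created and stored together with the inverted index created at block 608. The additional indexes are created using key information associated with each search listing. The key information includes, for example, fielded advertiser data such as the advertiser's identification, and derived themes such as gambling. The method then ends at block 612.
FIG. 7 is a flow diagram illustrating a method for removing similar page information from a database. In the described implementation, the method continues the operation of act 602 of FIG. 6.
At block 702, the pay-for-performance database (also called the bidded search listing database) is examined for URL data, and all URLs are extracted from the database and formed into a list. At block 704, the list is stored and any exact duplicates are removed.
At block 706, a URL in the list is selected and it is determined whether the selected URL is similar to the preceding URL in the list. Similarity may be determined by any suitable method, such as the number or percentage of identical characters or fields in the URLs, or a common root, string, or field.
At block 708, if the selected URL is similar to the preceding URL, the selected URL is added to a list of candidate duplicate URLs. At block 710, a predetermined number of the possible duplicate URLs are crawled. In the described embodiment, the predetermined number is the first two possible duplicate URLs. Crawling is preferably performed by program code known as a crawler. A crawler is a program that visits web sites and reads their pages and other information. Such programs are well known and are also called "spiders" or "bots". Entire sites or specific pages can be selectively visited and indexed by a crawler. In another embodiment, a subset of each site referenced by a URL, rather than the whole site, may be crawled and compared for similarity.
At block 712, the data returned by the crawler is examined. This data may be called the URL body and includes data from the site identified by the URL and from all accessible pages of that site. It is determined whether the data, including the text and other information contained in the URL body, is sufficiently similar to the data contained in the preceding URL body. Again, similarity may be determined by any suitable method, such as a statistical comparison of the text content of each page. If they are sufficiently similar, control proceeds to block 714 and the URL is assumed to be the same as the preceding URL. The body of text and other information is assigned to the remainder of the similar URLs.
If at block 706 the selected URL is determined not to be similar to the preceding URL, or if at block 712 the URL body is determined not to be sufficiently similar to the preceding URL body, control proceeds to block 718. At block 718, the URL is added to a list of URLs to be crawled. At block 720, all URLs on that list are crawled to retrieve and store the information contained on the site designated by each URL.
At block 716, the information from each crawled URL is loaded into the related search database (also called the free text database). The information is combined with the search listing data already included in the related search database. The method steps described in FIG. 7 thus reduce the total amount of data contained in the related search database by reducing the number of URLs that are crawled and stored. Duplicate URLs are eliminated from the process, and near-duplicate URLs are checked for similarity of content. The result is a lower storage requirement for the resulting database and faster, more efficient queries on it. The improved performance in turn improves convenience for the user.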
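The duplicate-elimination flow of FIG. 7 can be sketched as follows. Several simplifications are assumptions of this sketch rather than details from the patent: URLs are sorted before grouping, two URLs count as "similar" when they share a host, two bodies count as "sufficiently similar" only when they are identical strings, and the crawler is passed in as a `Function` so the example needs no network access. The names `UrlDeduper`, `host`, and `crawlDeduplicated` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.function.Function;

/** Hypothetical sketch of crawling only as many near-duplicate URLs as needed. */
final class UrlDeduper {
    private UrlDeduper() {}

    /** Assumed similarity key: the host part of the URL. */
    static String host(String url) {
        String s = url.replaceFirst("^[a-zA-Z]+://", "");
        int slash = s.indexOf('/');
        return (slash < 0 ? s : s.substring(0, slash)).toLowerCase(Locale.ROOT);
    }

    /** Returns URL -> body, reusing one body for URLs judged to be duplicates. */
    static Map<String, String> crawlDeduplicated(Collection<String> urls,
                                                 Function<String, String> crawler) {
        // Blocks 702-704: list the URLs and drop exact duplicates (TreeSet also sorts).
        Map<String, List<String>> groups = new TreeMap<>();
        for (String url : new TreeSet<>(urls)) {
            groups.computeIfAbsent(host(url), h -> new ArrayList<>()).add(url);
        }
        Map<String, String> bodies = new LinkedHashMap<>();
        for (List<String> group : groups.values()) {
            String first = crawler.apply(group.get(0));
            bodies.put(group.get(0), first);
            if (group.size() == 1) {
                continue;
            }
            // Block 710: crawl the first two candidate duplicates.
            String second = crawler.apply(group.get(1));
            bodies.put(group.get(1), second);
            // Block 712: stand-in for a statistical comparison of the two bodies.
            boolean duplicate = first.equals(second);
            for (String url : group.subList(2, group.size())) {
                // Block 714: reuse the body, or blocks 718-720: crawl the URL after all.
                bodies.put(url, duplicate ? first : crawler.apply(url));
            }
        }
        return bodies;
    }
}
```

With three similar URLs whose first two bodies agree, only two crawls are made for that group, which is the saving the text above describes.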
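From the foregoing, it can be seen that the present invention provides an improved method and apparatus for generating related searches for presentation to a searcher who is searching a pay-for-performance database. The related searches are performed in a related search database formed using the pay-for-performance database. Search results from the related search database are ordered by relevance for presentation to the user. Thus, if a user's original search is too narrow or too broad, the user has available related searches that can be used to produce more useful results. In addition, the related searches have been produced using search listings referenced by bidded search terms. This benefits the advertisers who pay for advertising in the database search system, because it increases the likelihood that a searcher using the database system will visit an advertiser's web site.
While a particular embodiment of the present invention has been shown and described, modifications may be made. The appended claims are therefore intended to cover all such changes and modifications that fall within the true spirit and scope of the invention.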
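Source Code Appendix
This is the Java portion that actually executes the core of the free text search engine (the Texis RDBMS from Thunderstone-EPI, Inc.) and post-processes the results, filtering on the frequency of advertiser occurrence and on theme ("adult"-ness).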
  package com.go2.search.related;

  import java.util.Vector;
  import java.util.Hashtable;
  import java.util.StringTokenizer;
  import java.rmi.RemoteException;
  import com.go2.texis.*;

  /**
   * @author Phil Rorex
   * @version
   */
  class Callback implements TErrorMsgIF {
    private static int err115 = 0;

    public int getErr115() {
      return (err115);
    }

    public void ErrorMsgDelivery(String msg, int level, int msgNumber) {
      switch (msgNumber) {
        case 2: {
          System.out.println("FATAL:msg:" + msg + "level:" + level
              + "msgNumber:" + msgNumber);
          System.exit(2);
        }
        case 100:
          break;
        case 115:
          err115++;
          break;
        default:
          System.out.println("UNUSUAL:msg:" + msg + "level:" + level
              + "msgNumber:" + msgNumber);
      }
    }
  }

  /**
   * Run as a stand-alone JVM, since the Free Text Searcher
   * being used is best connected with as a JNI-based C language
   * library interface API.
   */
  public class RelatedSearcherCore implements Runnable {

    // cache an instance of Texis server and Query
    private static Server texis = null;
    private static Query texisQuery = null;
    private static Query texisPlurQuery = null;
    private static Query texisAdultQuery = null;
    private static long timeClock;

    // used to coordinate time-outs on extra long queries
    private static final Integer PRE_QUERY = new Integer(1);
    private static final Integer MID_QUERY = new Integer(2);
    private static final Integer POST_QUERY = new Integer(3);
    private static Integer semaphore = PRE_QUERY;

    // time out process
    Thread watchDog;

    // thread starts out waiting a very long time, which may never be used
    // if Core() does not exceed the time-out set for it
    long globalTimeOut = 0;

    // The magic adult flag.
    // If a related search free-text search returns a row
    // which has this field set, it's automatically "themed"
    // as an adult-oriented related search.
    // This particular "magic data row" is pre-loaded with
    // all the "adult-oriented" terms which are typical
    // in this theme. Same should be done for CASINO_FLAG,
    // CURRENT_NEWS_FLAG, and any other theme desired.
    private static final int ADULT_FLAG = 999999999;

    // How many pluralized tries of the query
    // used to search for singular and plural
    // versions of up to [square root of] MAX_PLURAL_QRY terms
    private static final int MAX_PLURAL_QRY = 4;

    // Limit Texis to this many rows.
    // This is the initial # of pre-filtered free-text
    // searched rows coming back from the search engine.
    private static final int MAX_ROWS = 60;

    // Controls the 'looseness' of the post-search filter
    // that filters out related searches based on the
    // derived-data element of (#-of-different-advertisers
    // bidding on this related search term).
    // This element means "how many times we can ignore seeing
    // the identical advertiser before we start ignoring
    // related searches bid on by him". 0 is strongest reject,
    // larger numbers reject less stringently (usu. not > than
    // 1, if ratio of webpages:related search terms is > than
    // about 10).
    private static final int ADVERTISER_THRESHOLD = 0;

    // the SQL query used to talk to Texis (the FTS engine)
    private static final String TEXIS_SQL = "SELECT "
        + "$rank, "               // Num  getRow() arg position 0
        + "canon_cnt, "           // Int  getRow() arg position 1
        + "raw_search_text, "     // Stri getRow() arg position 2
        + "cannon_search_text, "  // Stri getRow() arg position 3
        + "advertiser_ids, "      // Stri getRow() arg position 4
        + "advertiser_cnt "       // Int  getRow() arg position 5
        + "FROM line_ad02 " + "WHERE words "
        + "LIKEP ? ORDER BY 1 desc, advertiser_cnt desc;";
        // + "LIKEP ?;";
        // + "LIKEP ? ORDER BY advertiser_cnt desc;";

    private static final String TEXIS_PLUR_SQL = "SELECT plural "
        + "FROM plurals " + "WHERE singular " + "= ?;";

    private static final String TEXIS_ADULT_SQL = "SELECT cannon_search_text "
        + "FROM adult " + "WHERE words " + "LIKE ?;";

    private static Callback cb = new Callback();

    public void init(String texisHome, long timeOut) {
      globalTimeOut = timeOut;
      init(texisHome);
    }

    public void init(String texisHome) {
      /**
       * Instantiate Texis connection object and perform
       * Texis query initialization.
       * Called one time to set up the Related Search query.
       * Must be called before findRelated is ever called.
       */
      // Perform Texis initialization and precache an instance
      // of Texis Server and Texis Query
      try {
        Texis texisRDBMS = new Texis();
        texis = (Server) texisRDBMS.createServer(texisHome);
        Vector n = new Vector(200);
        // Vector n = texis.getNoise();
        n.addElement("a"); n.addElement("about"); n.addElement("after"); n.addElement("again");
        n.addElement("ago"); n.addElement("all"); n.addElement("almost"); n.addElement("also");
        n.addElement("always"); n.addElement("am"); n.addElement("an"); n.addElement("and");
        n.addElement("another"); n.addElement("any"); n.addElement("anybody"); n.addElement("anyhow");
        n.addElement("anyone"); n.addElement("anything"); n.addElement("anyway"); n.addElement("are");
        n.addElement("as"); n.addElement("at"); n.addElement("away"); n.addElement("be");
        n.addElement("became"); n.addElement("because"); n.addElement("been"); n.addElement("before");
        n.addElement("being"); n.addElement("but"); n.addElement("by"); n.addElement("came");
        n.addElement("can"); n.addElement("cannot"); n.addElement("com"); n.addElement("come");
        n.addElement("could"); n.addElement("de"); n.addElement("del"); n.addElement("der");
        n.addElement("did"); n.addElement("do"); n.addElement("does"); n.addElement("doing");
        n.addElement("done"); n.addElement("down"); n.addElement("each"); n.addElement("else");
        n.addElement("even"); n.addElement("ever"); n.addElement("every"); n.addElement("everyone");
        n.addElement("everything"); n.addElement("for"); n.addElement("from"); n.addElement("front");
        n.addElement("get"); n.addElement("getting"); n.addElement("go"); n.addElement("goes");
        n.addElement("going"); n.addElement("gone"); n.addElement("got"); n.addElement("goten");
        n.addElement("had"); n.addElement("has"); n.addElement("have"); n.addElement("having");
        n.addElement("he"); n.addElement("her"); n.addElement("here"); n.addElement("him");
        n.addElement("his"); n.addElement("how"); n.addElement("i"); n.addElement("if");
        n.addElement("in"); n.addElement("into"); n.addElement("is"); n.addElement("isn't");
        n.addElement("it"); n.addElement("jpg"); n.addElement("just"); n.addElement("last");
        n.addElement("least"); n.addElement("left"); n.addElement("less"); n.addElement("let");
        n.addElement("like"); n.addElement("make"); n.addElement("many"); n.addElement("may");
        n.addElement("maybe"); n.addElement("me"); n.addElement("mine"); n.addElement("more");
        n.addElement("most"); n.addElement("much"); n.addElement("my"); n.addElement("myself");
        n.addElement("net"); n.addElement("never"); n.addElement("no"); n.addElement("none");
        n.addElement("not"); n.addElement("now"); n.addElement("of"); n.addElement("off");
        n.addElement("on"); n.addElement("one"); n.addElement("onto"); n.addElement("org");
        n.addElement("our"); n.addElement("ourselves"); n.addElement("out"); n.addElement("over");
        n.addElement("per"); n.addElement("put"); n.addElement("putting"); n.addElement("same");
        n.addElement("saw"); n.addElement("see"); n.addElement("seen"); n.addElement("shall");
        n.addElement("she"); n.addElement("should"); n.addElement("so"); n.addElement("some");
        n.addElement("somebody"); n.addElement("someone"); n.addElement("something"); n.addElement("stand");
        n.addElement("such"); n.addElement("sure"); n.addElement("take"); n.addElement("than");
        n.addElement("that"); n.addElement("the"); n.addElement("their"); n.addElement("them");
        n.addElement("then"); n.addElement("there"); n.addElement("these"); n.addElement("they");
        n.addElement("this"); n.addElement("those"); n.addElement("through"); n.addElement("till");
        n.addElement("to"); n.addElement("too"); n.addElement("two"); n.addElement("unless");
        n.addElement("until"); n.addElement("up"); n.addElement("upon"); n.addElement("us");
        n.addElement("very"); n.addElement("was"); n.addElement("we"); n.addElement("went");
        n.addElement("were"); n.addElement("what"); n.addElement("what's"); n.addElement("whatever");
        n.addElement("when"); n.addElement("where"); n.addElement("whether"); n.addElement("which");
        n.addElement("while"); n.addElement("who"); n.addElement("whoever"); n.addElement("whom");
        n.addElement("whose"); n.addElement("why"); n.addElement("will"); n.addElement("with");
        n.addElement("within"); n.addElement("without"); n.addElement("won't"); n.addElement("would");
        n.addElement("wouldn't"); n.addElement("www"); n.addElement("yet"); n.addElement("you");
        n.addElement("your");
        texis.setNoise(n);
        texisQuery = (Query) texis.createQuery();
        texisPlurQuery = (Query) texis.createQuery();
        texisAdultQuery = (Query) texis.createQuery();

        /*
         * Query.api()'s affect ALL queries, not just ones set on
         */
        texisQuery.setlikeprows(MAX_ROWS);
        texisQuery.allinear(0);
        texisQuery.alpostproc(0);
        texisQuery.prepSQL(TEXIS_SQL);
        texisPlurQuery.prepSQL(TEXIS_PLUR_SQL);
        texisAdultQuery.prepSQL(TEXIS_ADULT_SQL);

        TErrorMsg.RegisterMsgDelivery(cb);
        watchDog = new Thread(this);
        watchDog.setPriority(Thread.NORM_PRIORITY + 1);
        watchDog.start();
      } catch (TException te) {
        te.printStackTrace();
        throw new RuntimeException("Could not initialize Texis:Failed with:"
            + te.getMsg() + "code:" + te.getErrorCode());
      } catch (RemoteException re) {
        throw new RuntimeException("Unexpected RemoteException:" + re);
      }
    }

    /**
     * Perform a Texis related search query and package results (if any)
     * into an array of RelatedResult objects
     */
    public RelatedResult[] findRelated(String rawQuery, String canonQuery,
        int maxResults, int maxResultLength) throws Exception {
      try {
        return findRelated(rawQuery, canonQuery, maxResults, maxResultLength, 2000);
      } catch (Exception e) {
        e.printStackTrace();
        throw new Exception("overloaded findRelated" + e.getMessage());
      }
    }

    public RelatedResult[] findRelated(String rawQuery, String canonQuery,
        int maxResults, int maxResultLength, long timeOut)
        throws RelatedSearchException {
      // local vars

      Vector resultVector = new Vector();
      Vector thisRow = new Vector();
      RelatedResult[] results = null;
      int resultCount = 0;
      int rank = 0;
      int canonCnt = 0;
      Integer advertiser_id = null;
      int advertiserCnt = 0;
      String advertiserIds = null;
      String rawSearchText = null;
      String canonSearchText = null;
      Runtime rt = Runtime.getRuntime();
      long mem = rt.totalMemory();
      long free = rt.freeMemory();
      // System.err.println("totalMemory():" + mem);
      // System.err.println("freeMemory():" + free);
      try {
        if (canonQuery == null) return (null);

    Vector queryArgs = new Vector();
    String newQuery;
    if (timeOut != 0) {
      // get the loop out of wait() mode
      synchronized (watchDog) {
        globalTimeOut = timeOut;
        // System.err.println(System.currentTimeMillis() + " - " + timeClock + " = "
        //     + (System.currentTimeMillis() - timeClock) + " Core: setting MID_QUERY");
        semaphore = MID_QUERY;
        timeClock = System.currentTimeMillis();
        // thread better be waiting on eternity
        watchDog.notify();
      }
    }
    // if there is no raw query, we probably have the canonicalized version only
    if (rawQuery != null) {
      // usual serving site: don't re-pluralize, just use the raw query
      newQuery = stripNoiseChars(rawQuery);
    } else {
      // only have a canonQuery to work with, so make a rough approximation
      // of a raw term to include in the search: try to generate queries that
      // cover up to MAX_PLURAL_QRY possible re-pluralized forms of the query
      newQuery = pluralize(stripNoiseChars(canonQuery));
    } // if
    if (newQuery == null) return null;
    if (isAdult(newQuery)) return null;

    // Set up the (stack allocated) query parameters
    queryArgs.removeAllElements();
    queryArgs.addElement(newQuery);
    // perform JNI calls here
    texisQuery.setParam(queryArgs);
    texisQuery.execSQL();
    // Iterate over the rows
    String lastCanon = rawQuery;
    Hashtable advertisers = new Hashtable(MAX_ROWS * 200);
    Hashtable used = new Hashtable(MAX_ROWS);
    Vector resultSet = texisQuery.getRows();
    // Vector resultSet = getRowsLocal();
    // Make 2 passes:
    // the first pass de-dups on advertisers,
    // the second pass does not de-dup on advertisers.
    // System.out.println("got rows: " + resultSet.size());
    for (int pass = 0; pass < 2; pass++) {
      if (resultCount >= maxResults) break;
      for (int i = 0; i < resultSet.size(); i++) {
        thisRow = (Vector) resultSet.elementAt(i);
        if (thisRow.size() == 6) {
          rank = ((Number) (thisRow.elementAt(0))).intValue();
          // System.out.println(thisRow.elementAt(0).getClass().toString());
          canonCnt = ((Integer) (thisRow.elementAt(1))).intValue();
          rawSearchText = (String) thisRow.elementAt(2);
          canonSearchText = (String) thisRow.elementAt(3);
          advertiserIds = (String) thisRow.elementAt(4);
          advertiserCnt = ((Integer) thisRow.elementAt(5)).intValue();
          // Drop out early if we detect the magic ADULT_FLAG
          if (advertiserCnt == ADULT_FLAG) return null;
          if (canonCnt == ADULT_FLAG) return null;
          if (false) {
            System.out.println("rank: " + rank + " cnt: " + canonCnt
                + " rst: " + rawSearchText + " cst: " + canonSearchText
                + " aids: " + advertiserIds + " adcnt: " + advertiserCnt);
          }
        } else {
          throw new RelatedSearchException("Texis query failed, protocol violation");
        }

        // De-dup the results, and also don't return a related
        // search term which canonically matches the original query
        if ((!canonSearchText.equalsIgnoreCase(rawQuery))
            && (!canonSearchText.equals(canonQuery))
            && (!rawSearchText.equalsIgnoreCase(rawQuery))
            && (!rawSearchText.equalsIgnoreCase(canonQuery))
            && (canonSearchText.length() <= maxResultLength)) {
          // System.out.println("got cst: " + canonSearchText);

          // Look for this advertiser in the hash table.
          // If there, increment the occurrences count, and if above the
          // threshold, we've seen enough terms suggested by this advertiser,
          // so go to the next term. If we have not seen this advertiser yet,
          // put it in the hashtable and process.
          StringTokenizer st = new StringTokenizer(advertiserIds, " ");
          // if (st.countTokens() != advertiserCnt) {
          //   System.out.println("toks: " + st.countTokens() + " Cnt: " + advertiserCnt);
          //   throw new RelatedSearchException("Texis query suspect, wrong advertiser count");
          // }
          if (pass == 0) {
            int dupAdvCnt = 0;
            boolean Next = false;
            // Parse all the advertiser IDs out of the returned row
            while (st.hasMoreTokens()) {
              Integer advertiserId = Integer.valueOf(st.nextToken());
              // if (!advertisers.containsKey(advertiserId))
              // if this advertiser is new to us (over the whole query),
              // put it in the hash
              Integer cnt = (Integer) advertisers.get(advertiserId);
              if (cnt == null) {
                advertisers.put(advertiserId, new Integer(0));
              } else {
                // Seen this advertiser before, so increment his tally
                advertisers.put(advertiserId, new Integer(cnt.intValue() + 1));
                // System.out.println(advertiserId + " dups: " + (cnt.intValue() + 1));
                // If he's (now) past the threshold, don't use the bidded term (yet)
                if (cnt.intValue() >= ADVERTISER_THRESHOLD) {
                  Next = true;
                  break;
                }
                dupAdvCnt++;
              }
            }
            if (Next == true) {
              continue;
            } else {
              if (!used.containsKey(canonSearchText)) {
                used.put(canonSearchText, new Boolean(true));
              }
            }
          } else {
            if (!used.containsKey(canonSearchText)) {
              used.put(canonSearchText, new Boolean(true));
            } else {
              continue;
            }
          }
//if(dupAdvCnt>=ADVERTISER_THRESHOLD)

  //continue;继续

  //System.out.println (″dups:″+dupAdvCnt);//}/**
if(pass==0)

  //first time thru see if we’ve used this advertiser第一
次,通过查看我们是否已经使用该广告商

  Integer cnt=(Integer)advertisers.get(advertiser_id);
if(cnt==null)(advertisers.put(advertiser_id,new Integer
(0));}else{advertisers.put(advertiser_id,new

  Integer(cnt.intValue()+1));if(cnt.intValue()>=
ADVERTISER_THRESHOLD){continue;}}if(!used.
containsKey(canonSearchText)){used.put(canonSearchText,new
Boolean(true));}}else(//this is a second(or more)
time thru.

  //see if we’ve already used this term这是第一次(或更
多)次通过查看我们是否已经使用过该术语

  if(!used.containsKey(canonSearchText)){used.put
(canonSearchText,new Boolean(true));}else{continue;}}
**/
          if (resultCount < maxResults) {
            resultVector.addElement(
                new RelatedResult(rawSearchText, RelatedResult.NON_CACHED));
            resultCount++;
          } else {
            break;
          } // if-else
        } // if
      } // for
    } // for
  } catch (TException te) {
    throw new RelatedSearchException("Texis interface failed with: " + te.getMsg(), te);
  } catch (Throwable t) {
    t.printStackTrace();
    throw new RelatedSearchException("Unexpected Texis failure with: " + t.getMessage(), t);
  } finally {
    if (timeOut != 0) {
      synchronized (watchDog) {
        // System.err.println(System.currentTimeMillis() + " - " + timeClock + " = "
        //     + (System.currentTimeMillis() - timeClock) + " Core: setting POST_QUERY");
        semaphore = POST_QUERY;
        timeClock = System.currentTimeMillis();
        // cause thread to wait on eternity
        globalTimeOut = 0;
        watchDog.notify();
        // System.err.println(System.currentTimeMillis() + " - " + timeClock + " = "
        //     + (System.currentTimeMillis() - timeClock) + " Core: done with calling notify");
      }
    }
    // System.out.println("INFO: 115 err's: " + cb.getErr115());
    if (cb.getErr115() > 100) {
      System.out.println("FATAL: Too many Err115's");
      System.exit(3);
    }
  }
  if (resultVector.size() == 0) return null;
  else {
    resultVector.copyInto(results = new RelatedResult[resultVector.size()]);
    return results;
  }
}
private String stripNoiseChars(String term){//Clean up the
query a bit if(term.length()<2)return(null);char[]
buf=new char[term.length()];int firstChar=0;term.
getChars(0,buf.length,buf,0);for(int i=0;i<buf.
length;i++){if(buf[i]<0x20‖buf[i]>0x7e)return
(null);switch(buf[i]){case(’-’):case(”):case case case
(’#’):case(’$’):case case case case case case case(’-’)case
case case case case case(’)’}:case case case case case case
(’″’):case(’\\’):case(’>’):  case(’,’):case case(’.’):
case case case buf[i]=”;//System.out.println(″i:″//+
i+″firstChar:″+firstChar//+″setting buf [i]:″//+
(String.valueOf(buf[i]))+″setting to space″);if(firstChar
==i)firstChar=i+l;}}}//only spaces left只留下
空间

  if (firstChar == buf.length) return (null);
  term = term == null ? null
      : String.valueOf(buf, firstChar, buf.length - firstChar).trim();
  switch (term.length()) {
    case 0:
    case 1:
      return (null);
    default: {
      switch (buf[firstChar]) {
        case ('h'):
        case ('H'):
        case ('w'):
        case ('W'): {
          String lowerTerm = term.toLowerCase();
          // Use the lowercase version of the string for testing,
          // but make sure to SET the original string to return.
          // The runs of spaces below stand for URL punctuation that was
          // already replaced by spaces; their widths are inferred from
          // the substring offsets.
          if (lowerTerm.startsWith("http   www")) term = term.substring(10);
          else if (lowerTerm.startsWith("http  www")) term = term.substring(9);
          else if (lowerTerm.startsWith("http www")) term = term.substring(8);
          else if (lowerTerm.startsWith("hhttp   www")) term = term.substring(11);
          else if (lowerTerm.startsWith("http ")) term = term.substring(5);
          else if (lowerTerm.startsWith("http")) term = term.substring(4);
          else if (lowerTerm.startsWith("www ")) term = term.substring(4);
          else if (lowerTerm.startsWith("www")) term = term.substring(3);
        }
      }
    }
  }
  switch (term.length()) {
    case 0:
    case 1:
      return (null);
    default: {
      switch (term.charAt(term.length() - 1)) {
        case ('m'):
        case ('M'):
        case ('t'):
        case ('T'):
        case ('g'):
        case ('G'):
        case ('f'):
        case ('F'): {
          String lowerTerm = term.toLowerCase();
          if (lowerTerm.endsWith("dot com"))
            term = term.length() > 8 ? term.substring(0, term.length() - 8) : null;
          else if (lowerTerm.endsWith("dotcom"))
            term = term.length() > 7 ? term.substring(0, term.length() - 7) : null;
          else if (lowerTerm.endsWith("com"))
            term = term.length() > 4 ? term.substring(0, term.length() - 4) : null;
          else if (lowerTerm.endsWith("net"))
            term = term.length() > 4 ? term.substring(0, term.length() - 4) : null;
          else if (lowerTerm.endsWith("org"))
            term = term.length() > 4 ? term.substring(0, term.length() - 4) : null;
          else if (lowerTerm.endsWith("gif"))
            term = term.length() > 4 ? term.substring(0, term.length() - 4) : null;
          else if (lowerTerm.endsWith("jpg"))
            term = term.length() > 4 ? term.substring(0, term.length() - 4) : null;
        }
      }
    }
  }
  // Debug: System.out.println("term: [" + term.trim() + "]");
  return (term == null ? null : term.trim());
}

private boolean isAdult(String query) throws RelatedSearchException {
  if (query == null) return false;
  Vector queryArgs = new Vector();
  Vector thisRow = new Vector();
  queryArgs.addElement(query);
  try {
    // perform JNI calls
    texisAdultQuery.setParam(queryArgs);
    texisAdultQuery.execSQL();
    if ((thisRow = texisAdultQuery.getRow()).size() != 0) {
      return (true);
    } else {
      return (false);
    }
  } catch (TException te) {
    throw new RelatedSearchException("Texis interface failed with: " + te.getMsg(), te);
  } catch (RemoteException re) {
    throw new RelatedSearchException("Got a RemoteException that should never occur: " + re);
  }
}

private String pluralize(String token) throws RelatedSearchException {
  if (token == null) return null;

  Vector queryArgs = new Vector();
  String pluralToken;
  Vector thisRow = new Vector();
  StringTokenizer st0 = new StringTokenizer(token, " ");
  String[] terms = new String[st0.countTokens()];
  String[] fullQuery = new String[MAX_PLURAL_QRY];
  int fullQueryCnt = 0;
  // Iterate over each token to see if there's a plural version
  for (int ele0 = 0; st0.hasMoreTokens(); ele0++) {
    terms[ele0] = st0.nextToken();
  }
  for (int element = 0;
      element < terms.length && fullQueryCnt < MAX_PLURAL_QRY;
      element++) {
    // Do the plurals lookup on this term from the Texis db
    queryArgs.removeAllElements();
    queryArgs.addElement(terms[element]);
    try {
      // perform JNI calls
      texisPlurQuery.setParam(queryArgs);
      texisPlurQuery.execSQL();
      // retrieve the row
      if ((thisRow = texisPlurQuery.getRow()).size() != 0) {
        String term = null;
        // loop thru the terms, substituting the plural for this element
        for (int ele1 = 0; ele1 < terms.length; ele1++) {
          if (ele1 == element) {
            if (ele1 == 0) {
              term = (String) (thisRow.elementAt(0));
            } else {
              term += " " + (String) (thisRow.elementAt(0));
            }
          } else {
            if (ele1 == 0) term = terms[ele1];
            else term += " " + terms[ele1];
          }
        }
        fullQuery[fullQueryCnt] = term;
        fullQueryCnt++;
      }
    } catch (TException te) {
      throw new RelatedSearchException("Texis interface failed with: " + te.getMsg(), te);
    } catch (RemoteException re) {
      throw new RelatedSearchException("Got a RemoteException that should never occur: " + re);
    }
  }
  // Build the new expanded query
  if (fullQueryCnt > 0) {
    pluralToken = "(" + token;
    for (int i = 0; i < fullQueryCnt; i++) {
      pluralToken = pluralToken + "," + fullQuery[i];
    }
    pluralToken = pluralToken + ")";
  } else {
    pluralToken = token;
  }
  return
(pluralToken);
}

public Vector getRowsLocal() throws TException, RemoteException {
  Vector set = new Vector();
  int e;
  synchronized (APIToken.Lock) {
    while (true) {
      Vector row = new Vector();
      row = texisQuery.getRow();
      if (row.size() == 0) break;
      set.addElement(row);
    }
  }
  return set;
}

public synchronized void run() {
  while (true) {
    try {
      // start our timeout
      synchronized (watchDog) {
        // System.err.println(System.currentTimeMillis() + " - " + timeClock + " = "
        //     + (System.currentTimeMillis() - timeClock)
        //     + " run: starting wait of " + globalTimeOut);
        watchDog.wait(globalTimeOut);
        // just got woken up, see why
        if (semaphore.equals(PRE_QUERY)) {
          // System.err.println(... + " run: got PRE_QUERY");
          continue;
        } else if (semaphore.equals(POST_QUERY)) {
          // System.err.println(... + " run: got POST_QUERY");
          continue;
        } else if (semaphore.equals(MID_QUERY)) {
          if (System.currentTimeMillis() - timeClock >= globalTimeOut) {
            // we timed out, but the semaphore wasn't set, so hose ourselves
            System.err.println(System.currentTimeMillis() + " - " + timeClock + " = "
                + (System.currentTimeMillis() - timeClock)
                + " Fatal: timeout " + globalTimeOut + " usec exceeded");
            System.exit(1);
          } else {
            // System.err.println(... + " run: got MID_QUERY, but OK!");
          }
        } else {
          System.err.println(System.currentTimeMillis() + " - " + timeClock + " = "
              + (System.currentTimeMillis() - timeClock)
              + " run: ARGHH got no_QUERY, Hmmmmm!");
        }
      }
    } catch (Exception e) {
      System.err.println("got wait() exception");
    }
  }
}
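
  The two-pass selection performed by findRelated above can be illustrated in isolation. The following is a minimal sketch, not the patent's implementation: the class TwoPassDedup, the Row type, and the select method are hypothetical names introduced here, and plain in-memory rows stand in for the Texis result set. It assumes the counting rule of the listing (first sighting stores 0, later sightings increment, and a row is deferred when the previous count has reached the threshold); unlike the listing, it also skips already-used terms during the first pass.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class TwoPassDedup {
    /** Hypothetical row: a related term plus the advertiser IDs that bid on it. */
    public static final class Row {
        final String term;
        final int[] advertiserIds;

        public Row(String term, int... advertiserIds) {
            this.term = term;
            this.advertiserIds = advertiserIds;
        }
    }

    /**
     * Pass 0 defers rows whose advertisers were already seen too often;
     * pass 1 fills any remaining slots with the deferred rows.
     */
    public static List<String> select(List<Row> rows, int maxResults, int advertiserThreshold) {
        List<String> out = new ArrayList<>();
        Map<Integer, Integer> seen = new HashMap<>(); // advertiser ID -> tally
        Set<String> used = new HashSet<>();           // terms already returned
        for (int pass = 0; pass < 2 && out.size() < maxResults; pass++) {
            for (Row row : rows) {
                if (out.size() >= maxResults) break;
                if (used.contains(row.term)) continue;
                if (pass == 0 && overThreshold(row, seen, advertiserThreshold)) continue;
                used.add(row.term);
                out.add(row.term);
            }
        }
        return out;
    }

    /** Updates the tallies for this row and reports whether any advertiser is past the threshold. */
    private static boolean overThreshold(Row row, Map<Integer, Integer> seen, int threshold) {
        for (int id : row.advertiserIds) {
            Integer cnt = seen.get(id);
            if (cnt == null) {
                seen.put(id, 0);        // first sighting of this advertiser
                continue;
            }
            seen.put(id, cnt + 1);      // seen before: increment the tally
            if (cnt >= threshold) return true;
        }
        return false;
    }
}
```

  With a threshold of 1, an advertiser contributes at most two terms in the first pass; further terms from the same advertiser are only used in the second pass if result slots remain.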

  The following code implements the cached-results lookup: it first checks whether this related search has been seen before, which saves time by avoiding the algorithmic lookup during related-search execution.

  package com.go2.search.related;

  // import atg.nucleus.GenericRMIService;
  import atg.nucleus.GenericService;
  import atg.nucleus.ServiceException;
  import atg.service.resourcepool.JDBCConnectionPool;
  import atg.service.resourcepool.ResourceObject;
  import atg.service.resourcepool.ResourcePoolException;
  import java.rmi.RemoteException;
  import java.net.*;
  import java.io.*;
  import java.sql.*;
  import java.util.Vector;

  /**
   * This is the top level interface to the related search system.
   * It is meant to be used as a dynamo service available to other
   * dynamo services.
   */
  // public class RelatedSearcherImpl extends GenericRMIService
  public class RelatedSearcherImpl extends GenericService
      implements RelatedSearcher {
  // my pool of Texis/UDP connections
  private TexisUDPConnectionPool texisUDPConnectionPool;
  // my pool of connections to the Oracle cache
  private JDBCConnectionPool relatedCacheConnectionPool;
  // Statistics properties
				
  private int requestCount = 0;
  private int oracleCacheHits = 0;
  private int texisRequests = 0;
  private int texisTimeoutMillis = 0;
  private int slowTexisRequestCount = 0;
  // private constants
  private static String CACHE_SQL =
      "SELECT * FROM RESEARCH WHERE CANON_QUERY = ?";
  private static int BUFFER_SIZE = 512;
  // parameters
  private boolean texisEnabled = false;
  private boolean oracleEnabled = false;
  private boolean systemEnabled = false;
  private long cummulativeOracleTime = 0;
  private long cummulativeTexisTime = 0;

  /**
   * Create and export an instance of RelatedSearcher over RMI
   */
  public RelatedSearcherImpl() throws RemoteException {
    super();
    // java.rmi.registry.LocateRegistry.createRegistry(1111).rebind("RelatedSearcher", this);
  }
/**

  *This method was created in VisualAge.该方法用VisualAge
创建

  *&amp;commat;return RelatedResult[]* &amp;commat;param
canonQuery java.lang.String * &amp;commat;param maxResults int
* &amp;commat;param maxLength int */private RelatedResult[]
findFromCache(String canonQuery,int maxResults,int maxLength)
throws RelatedSearchException

  Vector resultVector=new Vector();

  RelatedResult[]results=null;

  PreparedStatement ps=null;

  ResultSet rs=null;try{//Get a Connection得到一连
接

  ResourceObject resource=null;try{.resource=
getRelatedCacheConnectionPool().checkout(getAbsoluteName
());
				
  Connection conn=(Connection)resource.getResource()
boolean success=false;try{//Here’s where we get the goods
from Oracle这里我们可从Oracle获得商品

  ps=conn.prepareStatement(CACHE_SQL);ps.setString(l,
canonQuery);rs=ps.executeQuery();//prime the cursor to
point to the one and only row we//expect from Oracle if now
matching rows were found //then we’ll simply drop thru to the
end最初将光标指向我们期望的来自Oracle的一个或仅一行,如果
现有匹配行已经找到,那么我们将简单地落到结尾

  if(rs.next()){//Extract the data we need if there was
something如果有,抽取我们需要的数据

  int numTerms=rs.getInt(2);if(numTerms==0)//The
cache tells us that there won’t be //any results so we’ll bail
early throw new高速缓存器告诉我们没有任何结果,因此我们将先
委托扔出新的

  RelatedSearchException(″No related Results″);int
cacheFlag=rs.getInt (3);//iterate over results retrieving
upto maxResults//of those of them that are maxLength or
smaller重复有关是maxLength或更小的那些maxResults检索的结
果

  int resultCount=0,rowCount=0;while(resultCount<
maxResults &amp; &amp; rowCount<numTerms){

  String term=rs.getString(4+rowCount);//push this
term into the result vector if its good将该术语推入该结果矢
量中,如果它的商品

  if(term.length()<=maxLength){resultVector.
addElement(new

  RelatedResult(term,cacheFlag));resultCount++;)
rowCount++;}}conn.commit();success=true;}
finally{//Cleanup result set整理结果集合

  if(rs!=null)rs.close ();//Cleanup prepared statement
整理准备的语句
				
  if(ps !=null)ps.close();//Cleanup connection整
理连接

  if(!success &amp; &amp; conn !=null)conn.rollback()
i}//try-finally最后一次尝试

  } finally{//Check the Connection back in再次登记该
连接

  if(resource!=null)getRelatedCacheConnectionPool().
checkIn(resource);}//try-finally最后一次尝试

  )catch(ResourcePoolException exc){if
(isLoggingError()){logError(″Unable to get Oracle cache
connection″,exc);}throw new RelatedSearchException
(″Unable to get Oracle cache connection″,exc);}catch
(SQLException se){if(isLoggingError()){logError
(″Interface with Oracle cache failed″,se);}throw new
RelatedSearchException(″Interface with Oracle cache failed″,
se);-}//try-catch尝试捕获

  if(resultVector.size()==0)return null;else
{resultVector.copyInto(results=new

  RelatedResult[resultVector.size()]);return
results;}}

  *Communicate to Texis thru TexisConnectionPool通过
TexisConnectionPool连接到Texis

  * &amp;commat;return RelatedResult[]*  &amp;commat;param
canonQuery java.lang.String* &amp;commat;param maxResults int
* &amp;commat;param maxLength int*/private RelatedResult[]
findFromUDPTexis(String rawQuery,String canonQuery,int
maxResults,int maxLength)throws RelatedSearchException{

  RelatedResult[]results=null;//Get a
UDPTexisConnection获得一UDPTexisConnection

  ResourceObject resource=null;

  TexisUDPConnection tc=null;try{resource=
getTexisUDPConnectionPool().checkOut(getAbsoluteName());
				
tc=(TexisUDPConnection)resource.getResource();

  DatagramSocket socket=tc.getSocket();//do this at run
time to be able to switch Dynamo at run time在运行时执行该操
作以便能在运行时转换Dynamo

  socket.setSoTimeout(getTexisTimeoutMillis());
//package data to send打包数据以便发送

  TexisRequest request=new TexisRequest();request.
setRawQuery(rawQuery);request.setCanonQuery(canonQuery);
request.setMaxResults(maxResults);request.setMaxChars
(maxLength);request;setSequenceNumber(++tc.
sequenceNumber);request.setTimeout(getTexisTimeoutMillis
());

  ByteArrayOutputStream baos=new ByteArrayOutputStream();
ObjectOutputStream ous=new ObjectOutputStream(baos);ous.
writeObject(request);ous.flush();baos.close();byte
[]sendData=baos.toByteArray();//send it off to the
server从服务器发送它

  if(isLoggingDebug()){logDebug(″About to send to Texis
at endpoint:″+tc.getHostO+″:″+tc.getPort());}
//send it发送它

  DatagramPacket sendPacket=new DatagramPacket(sendData,
sendData.length,tc.getHost(),tc.getPort());
socket.send(sendPacket);//wait for a reply upto timeOut
等待有关超过毫秒的一响应

  long startWait=System.currentTimeMillis();while
(true){//pull off inboud packets and check them the the right
sequenceNumber完成inboud信息包并用正确的顺序号核对它们
//throws a java.io.InterruptedIOExeption on timeout

  DatagramPacket receivePacket=new

  DatagramPacket(new byte[BUFFER_SIZE],BUFFER_SIZE);
socket.receive(receivePacket);

  ObjectInputStream ois=new ObjectInputStream(new
				
  ByteArrayInputStream(receivePacket.getData()));

  TexisResponse response=(TexisResponse)ois.readObject
();ois.close();if(response.getSequenceNumber()!=tc.
sequenceNumber){//we got a stale response我们获得一失效
的响应

  long midPoint=System.currentTimeMillis();int
remainder=(int)(getTexisTimeoutMillis()-(midPoint-
startWait));if(remainder>0){//if we can still wait some
more before a timeout如果我们在一超时前还能等待更久//reset
socket timeOut to the remaining time对剩余的时间重新设置插
槽的timeOut

  socket.setSoTimeout(remainder);}else{//give up at
this point break;在该点中止

  }}//if-wrong-sequence-number else{results=response.
getResults();break;})//while}catch
(ResourcePoolException rpe){if(isLoggingError()){logError
(″Unable to get or checkin a Texis connection″,rpe);)}
catch(ClassNotFoundException cnfe){if(isLoggingError())
{logError(″Class not found Exception″,cnfe);}}catch
(SocketException se){if(isLoggingError())logError
(″Socket Exception talking to Texis″,se);}}catch
(StreamCorruptedException sce){if(isLoggingError())
{logError(″Corrupted return from Texis″,sce);)}
catch(InterruptedIOException ioie){if(isLoggingDebug())
{logDebug(″Timed out talking to Texis″,ioie);}}catch
(IOException ioe){if(isLoggingDebug f logDebug(″Timed
out talking to Texis″,ioe);})finally{//Check the
Connection back in if we got it in the first place try{if
(resource !=null)  getTexisUDPConnectionPool().checkIn
(resource);}catch(ResourcePoolException rpe){/*ignore this
one忽略这一个*/}}return results;}* &amp;commat;return
RelatedResult[]-an array of RelatedResult objects
				
RelatedResult对象的一个数组*which is ordered by relebance from
high to low or null if the system is disable or no*related
results were found按关联从高到低或如果该系统被禁止或没有
找到相关的结果的空排序* &amp;commat;param rawQuery java.lang.
String-raw query for which related searches are needed用于
相关的搜索所需的原始查询* &amp;commat;param canonQuery java.lang.
String-canonocalized for of the raw query原始查询的规范化
* &amp;commat;param maxResults int-maximum number of results
requested所请求的最大结果数据* &amp;commat;param
maxResultLenght int-maximum lenght of a result in characters
在字符中一结果的最大长度*/public RelatedResult[]findRelated
(String rawQuery,String canonQuery,int maxResults,int
maxResultLength) // throws RelatedSearchException
    throws RelatedSearchException, RemoteException {
  requestCount++;
  // Return fast if the system is disabled
  if (!getSystemEnabled()) return null;
  RelatedResult[] results = null;
  // first try getting data from the Oracle pool (if enabled)
  if (getOracleEnabled()) {
    try {
      long startOracle = System.currentTimeMillis();
      // keep timing stats
      results = findFromCache(canonQuery, maxResults, maxResultLength);
      oracleCacheHits++; // fixed statistics bug
      cummulativeOracleTime += (System.currentTimeMillis() - startOracle);
    } catch (RelatedSearchException rse) {
      // If Oracle told us that this search has no related results,
      // i.e. editorially-excluded porn, then drop out early
      if (rse.getRootCause() == null) {
        return null;
      } else {
        // log it otherwise for post mortem
        if (isLoggingError())
          logError("Failed to interface to Oracle cache, will try Texis", rse);
      }
    } catch (Exception e) {
      if (isLoggingError()) logError("Failed to interface to Oracle", e);
    }
  } // if Oracle enabled
  // if unsuccessful then try the Texis pool, if enabled
  if (getTexisEnabled() && results == null) {
    try {
      long startTexisQuery = System.currentTimeMillis();
      // keep texis timing stats
      texisRequests++;
      results = findFromUDPTexis(rawQuery, canonQuery, maxResults, maxResultLength);
      long texisQueryMillis = System.currentTimeMillis() - startTexisQuery;
      // log abnormally long request time
      if (texisQueryMillis > getTexisTimeoutMillis()) slowTexisRequestCount++;
      cummulativeTexisTime += texisQueryMillis;
    } catch (Exception e) {
      if (isLoggingError())
        logError("Texis interface failed with: " + e.getMessage(), e);
    }
  } // if texisEnabled
  return results;
}

/**
 * Stats accessor
 *
&amp;commat;return String */public String
getCummulativeOracleTime(){ return
(cummulativeOracleTime/1000.0)+″seconds″  }/**  *Stats
accessor  Stats访问程序* &amp;commat;return long  */public
String getCummulativeTexisTime () {return
(cummulativeTexisTime/1000.0)+″seconds″;}/**  *Stats
accessor Stats访问程序* &amp;commat;return int*/public int
getOracleCacheHits(){return oracleCacheHits;}/** *
Stats accessor Stats访问程序* &amp;commat;return boolean */
public boolean getOracleEnabled f return oracleEnabled;}
/** *Accessor for relatedCacheConnectionPool用于
relatedCacheConnectionPool的访问程序* &amp;commat;return atg.
service.resourcepool.JDBCConnectionPool */public
				
JDBCConnectionPool getRelatedCacheConnectionPool(){return
relatedCacheConnectionPool;}/** *Stats accessor Stats
访问程序

  &amp;commat;return int */public int getRequestCount()
{return requestCount;}/** *Stats accessor Stats访
问程序* &amp;commat;return int */public int
getslowTexisRequestCount(){return slowTexisRequestCount;}
/** *Stats accessor Stats访问程序* &amp;commat;return
boolean*/public boolean getSystemEnabled(){return
systemEnabled}/** *Stats accessor Stats访问程序*
&amp;commat;return boolean*/public boolean getTexisEnabled()
{return texisEnabled;}/**  *stats accessor Stats访
问程序* &amp;commat;return int*/public int getTexisRequests()
{return texisRequests;}/** *configu param accessor
配置参数访问程序* &amp;commat;return int */public int
getTexisTimeoutMillis(){return texisTimeoutMillis;}/**
*This method was created in VisualAge.该方法用VisualAge创
建

  * &amp;commat;return com.go2.search.related.
TexisUDPConnectionPool */public TexisUDPConnectionPool
getTexisUDPConnectionPool(){return texisUDPConnectionPool;}
/**  *mutator* &amp;commat;param newValue boolean*/ public
void setOracleEnabled(boolean newValue){this.oracleEnabled
=newValue;)/** *Mutator for relatedCacheConnectionPool
用于relatedCacheConnectionPool的Mutator* &amp;commat;param
newValue atg.service.resourcepool.JDBCConnectionPool*/
public void setRelatedCacheConnectionPool(JDBCConnectionPool
newValue){this.relatedCacheConnectionPool=newValue;}
/** *mutator* &amp;commat;param newValue boolean*/public
void setSystemEnabled(boolean newValue){this.systemEnabled
=newValue;}/** *mutator* &amp;commat;param newValue
boolean*/public void setTexisEnabled(boolean newValue)
				
{this.texisEnabled=newValue;}/** *parameter mutator
参数mutator* &amp;commat;param newValue int */public void
setTexisTimeoutMillis(int newValue){this.
texisTimeoutMillis=newValue;}/**  *This method was
created in VisualAge.该方法是用VisualAge创建

  *&amp;commat;param newValue com.go2.search.related.
TexisUDPConnectionPool */ public void
setTexisUDPConnectionPool(TexisUDPConnectionPool newValue)
{this.texisUDPConnectionPool=newValue;}}
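
  The control flow of RelatedSearcherImpl.findRelated (consult the cache first, fall back to the algorithmic lookup only when the cache has nothing usable) can be sketched without the Oracle and Texis dependencies. This is a simplified illustration under stated assumptions: CacheFirstLookup, its Map-backed cache, and the Function-based fallback are hypothetical stand-ins introduced here, and an empty cached list plays the role of the "No related Results" signal in the listing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class CacheFirstLookup {
    private final Map<String, List<String>> cache;            // stand-in for the Oracle cache
    private final Function<String, List<String>> fallback;    // stand-in for the Texis lookup
    private int cacheHits = 0;
    private int fallbackRequests = 0;

    public CacheFirstLookup(Map<String, List<String>> cache,
                            Function<String, List<String>> fallback) {
        this.cache = cache;
        this.fallback = fallback;
    }

    /** Returns related terms, or null when none are available. */
    public List<String> findRelated(String canonQuery, int maxResults, int maxLength) {
        List<String> cached = cache.get(canonQuery);
        if (cached != null) {
            cacheHits++;
            // An empty cached entry means "known to have no related results":
            // return early and do not consult the fallback.
            if (cached.isEmpty()) return null;
            List<String> fromCache = filter(cached, maxResults, maxLength);
            if (!fromCache.isEmpty()) return fromCache;
        }
        // Cache miss (or nothing short enough): try the algorithmic lookup.
        fallbackRequests++;
        List<String> computed = fallback.apply(canonQuery);
        if (computed == null) return null;
        List<String> fromFallback = filter(computed, maxResults, maxLength);
        return fromFallback.isEmpty() ? null : fromFallback;
    }

    /** Keeps at most maxResults terms whose length does not exceed maxLength. */
    private static List<String> filter(List<String> terms, int maxResults, int maxLength) {
        List<String> out = new ArrayList<>();
        for (String t : terms) {
            if (out.size() >= maxResults) break;
            if (t.length() <= maxLength) out.add(t);
        }
        return out;
    }

    public int getCacheHits() { return cacheHits; }

    public int getFallbackRequests() { return fallbackRequests; }
}
```

  Treating "known empty" differently from "not cached" is the point of the early return: it lets editorially excluded queries be answered from the cache alone, without paying for the slower lookup.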

  The following control script dumps the search listing database, loads the crawled text, and builds the inverted index for all of the related-search indexing, including building the 'derived-data' elements:

  #!/bin/ksh -x
  export PATH=/usr/local/morph3/bin:.:$PATH
  #../.zshrc
  export TMP=/export/home/goto/tmp
  export TEMP=$TMP
  export TEMPDIR=$TMP
  export TMPDIR=$TMP
TMPTABLE=lineadO

  TMPTABLE2=linead

  TERMSTABLE=terms

  INC=02           NEWTABBE=line_ad$                {INC}
CRAWLDATA=/home/goto/rs/DONE/ALL.UNIQ CRAWLTABLE=line ad4

  SPOOL=/home/goto/list

  DB=/home/goto/crawldb###############

  Log(){echo’\n####’$(date″+%m/%d %H∶%M∶%S″):″${*}″
’####’}log 0.timport crawled data

  Log 0.1 Create line_ad4 and unique index in preparation
for’crawl’import tsql-d$DB#drop table line_ad4;create table
linead4(id counter,

  ad_url varchar(300),crawltitle varchar(750),crawlmeta
varchar(500),crawlbody varchar(8000))i drop index
				
idx4ad_url;create unique index idx4ad_url on line_ad4
(adurl);!timport-database $DB-table $CRAWLTABLE-s
/home/goto/rs/DONE/crawl.sch-file $CRAWLDATA

  Log 1.extract line_ads from live_ADMN into column delimited
spool file umpadm $SPOOL

  Log 2.timport-database $DB-table $TMPTABLE-s newrs.sch-
file ${SPOOL}timport-database $DB-table $TMPTABLE-s newrs.
sch-file ${SPOOL}

  Log 3.build canon index on $TMPTABLE##tsql-d${DB}#
drop index idxOcst;create index idxOcst on$(TMPTABLE}
(cannon search text);!

  Log 4.add counts of canons to $TMPTABLE texis DB=$DB
TMPTABLE=$TMPTABLE updatecnt

  Log 5.build url index on $TMPTABLE tsql-d${DB}< < !
drop index idxOurl;create index idxOurl on ${TMPTABLE}
(ad_url);

  Log 6.merge crawled text w/original tsql-d${DB}<<!
drop index idx4url;create index idx4url on ${CRAWLTABLE}
(ad_url);drop table $TMPTABLE2;

  CREATE TABLE $TMPTABLE2

  AS

  SELECT a.price price,a.rating rating,a.ad_id ad_id,a.
bit-date bid-date,a.raw_search_text raw_search_text,a.
cannon_search_text cannon_search_text.a.adspectitle
adspectitle,a.ad_spec_desc ad_spec_desc,a.ad_url ad_url,
a.resource resource_id,b.crawltitle crawltitle,b.crawlmeta
crawlmeta,b.crawlbody crawlbody,a.canon cntfrom $TMPTABLE a,
$CRAWLTABLE b where a.adurl=b.adurl order by price desc;#texis
DB=$DB CRAWLTABLE=$CRAWLTABLE TMPTABLE=$TMPTABLE updateit

  Log 7.collapse 0 onto 01

  Log 7.1 first-make the table tsql-d$(DB}#!drop table
${NEWTABLE};tsql-d${DB)<<!create table${NEWTABLE}
				
(canon cnt integer,cannon_search_text varchar(50),
raw_search_text varchar(50),advertiser_ids varchar(4096),
advertiser_cnt integer,words varchar(65536));!

  Log 7.2 second,build the uniq,sorted list of search terms

  Log 7.2 select cannon-search-text from ${TMPTABLE};
tsql-d${DB}<<!#sort-u I timport-s termstable.sch-
fileselect cannonsearchtext from ${TMPTABLE};

  Log 7.3 third,build uniq index on terms table

  Log 7.3 create unique index idxterm on terms term tsql-d$
(DB}<<!create unique index idxterm on terms(term);

  Log 7.3.9 prepare for collapse

  Log 7.3.9 create index idxcstcol on $TMPTABLE2
cannon_search_text

  Log 7.3.9 create index idxadidcol on $TMPTABLE2 adid tsql-d
${DB}<<!create index idxcstcol on $TMPTABLE2
(cannon_search_text);create index idxadidcol on $TMPTABLE2
(ad_id);!

  Log 7.4 fourth collapse around csts

  Log 7.4 texis SRCTABLE=$TMPTABLE2 TGTTABLE=$NEWTABLE

  TERMSTABLE=$TERMSTABLE collapse texis db=$DB
SRCTABLE=$TMPTABLE2 TGTTABLE=$NEWTABLE

  TERMSTABLE=$TERMSTABLE collapse

  Log 8 do the porn line buildporn#t import-database$DB-table
$NEWTABLE-s rsporn.schfile

  Log 9 do the porn table newporn #timport-database $DB-s
rsnewporn.sch-file

  Log 10 metamorph index words column tsql-d $DB<<!
create metamorph inverted index mmx ${INC)w on ${NEWTABLE}
(words);

  Log 10 all done

  Dumps bid-for-placement search listings data:

  #!/bin/ksh

  TXSORAUSER=XXXXXXX TXSORAPWD=XXXXXXX
				
  LVSRVPWD=XXXXXX #SPOOL=pipe

  SPOOL=${1}

  SERVER=XXXXXX

  Log(){echo’\n####’$(date″+%m/%d %H∶%M∶%S″):″${1}″
’####’)

  Log″start dump″sqlplus-S${TXSORAUSER}/${TXSORAPWD}
&amp;commat;1#{SERVER}>/dev/null!//set heading off set
linesize 750 set pagesize 0 set arraysize 1 set maxdata 50000 set
buffer 50000 set crt off;set termout off spool $(SPOOL)select
rpad(to_char(advertiser_id),8)11 rpad(raw_search,30)#
rpad(canon-search,30)# rpad(title,100)#rpad
(description,280)# rpad(url,200)# rpad(resource_id,20)
#rpad(to_char(price*100),5)11 rpad(rating,2)11 rpad
(to_char(search_id),8)# rpad(resource_id,18)#rpad
(to_char(line_ad_id),8)11 to_char(bid_date,’YYYYMMDD
HHMMSS’)from ads where status=5 and rating=’G’and canon_search
< >’grab bag’and rownum<10000;spool off;quit;

  Log″end dump″exit 0

  Counts # of occurrences of pagebids for each particular potential related search result

<script language=vortex>
<timeout=-1></timeout>
<a name=main>
<DB="/home/goto/crawldb">
<SQL ROW "select distinct cannon_search_text cst from " $TMPTABLE>
  <SQL ROW "select count(*) cnt from " $TMPTABLE " where cannon_search_text = $cst">
    <SQL NOVARS "update " $TMPTABLE " set canon_cnt = $cnt where cannon_search_text = $cst">
    </SQL>
  </SQL>
</SQL>
</a>
</script>
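
  As a rough illustration of what the counting script computes, the Java sketch below tallies how many rows share each canonical search text. It works on an in-memory list rather than the Texis table, and the names `CanonCounts` and `countByCanonText` are hypothetical, not part of the original listing.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for the per-term row count (canon_cnt).
final class CanonCounts {
    // Returns, for each canonical search text, the number of rows that carry it.
    static Map<String, Integer> countByCanonText(List<String> canonTexts) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String text : canonTexts) {
            counts.merge(text, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("used car", "car loan", "used car");
        System.out.println(countByCanonText(rows));
    }
}
```

  A single pass with a map avoids re-querying the table once per distinct term, which is what the nested SQL loops do.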

  Aggregates web page body text and listings based on the related-search result, while collecting and creating derived data: (1) how many different advertisers have web pages associated with the related-search result.

<script language=vortex>
<timeout=-1></timeout>
<DB=/home/goto/crawldb>
<a name=main>
<!-- get all canon-terms from tmp table -->
<SQL ROW "select term cst from " $TERMSTABLE>
  <$words=><$rsts=><$csts=><$asts=><$asds=><$cts=><$cms=><$cbs=>
  <$advs=><$last_adv=><$adv_cnt=0>
  <!-- get all rows w/this canon term from tmp table -->
  <SQL ROW "select canon_cnt cc, ad_id aid, raw_search_text rst, cannon_search_text cts, ad_spec_title ast, ad_spec_desc asd, crawltitle ct, crawlbody cb, crawlmeta cm from " $SRCTABLE " where cannon_search_text = $cst order by ad_id">
    <!-- aggregate the text to prepare for collapsed insert -->
    <$rsts=($rsts + " " + $rst)>
    <$csts=($csts + " " + $cst)>
    <$asts=($asts + " " + $ast)>
    <$asds=($asds + " " + $asd)>
    <$cts=($cts + " " + $ct)>
    <$cms=($cms + " " + $cm)>
    <$cbs=($cbs + " " + $cb)>
    <if $aid != $last_adv>
      <!-- add advertiser to list if not seen him before -->
      <$advs=($advs + " " + $aid)>
      <$adv_cnt=($adv_cnt + 1)>
      <$last_adv=$aid>
    </if>
  </SQL>
  <$canon_cnt=$loop>
  <$words=($rsts + " " + $csts + " " + $asts + " " + $asds + " " + $cms + " " + $cbs + " " + $cts)>
  <!-- pick off zeroeth element only from $rst array -->
  <loop $rst><$Rst=$rst><break></loop>
  <strlen $words><$wlen=$ret>
  <strlen $advs>
  <!-- display which row we're working on -->
  $wlen $ret $cst
  <!-- insert to collapsed row -->
  <SQL NOVARS "insert into " $TGTTABLE " (canon_cnt, cannon_search_text, raw_search_text, advertiser_ids, advertiser_cnt, words) VALUES ($cc, $cst, $Rst, $advs, $adv_cnt, $words)">
  </SQL>
  <!-- $words *** <$ret=(text2mm($words,50))> $ret -->
</SQL>
</a>
</script>
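
  The collapse step can be pictured with a small Java sketch. It groups rows by canonical search text, concatenates their text, and counts distinct advertisers. This is an illustration under assumptions: the `Collapse`, `Row`, and `Collapsed` types are hypothetical, each row carries a single text field instead of the separate title, description, and crawl columns, and a set replaces the script's compare-with-previous-row check.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory version of the collapse; not the original Vortex logic.
final class Collapse {
    static final class Row {
        final String canonText;
        final String advertiserId;
        final String text;
        Row(String canonText, String advertiserId, String text) {
            this.canonText = canonText;
            this.advertiserId = advertiserId;
            this.text = text;
        }
    }

    static final class Collapsed {
        final String canonText;
        final Set<String> advertiserIds = new LinkedHashSet<>();
        final StringBuilder words = new StringBuilder();
        int rowCount;
        Collapsed(String canonText) { this.canonText = canonText; }
    }

    // Builds one collapsed record per canonical search text.
    static Map<String, Collapsed> collapse(List<Row> rows) {
        Map<String, Collapsed> byTerm = new LinkedHashMap<>();
        for (Row r : rows) {
            Collapsed c = byTerm.computeIfAbsent(r.canonText, Collapsed::new);
            c.rowCount++;
            c.advertiserIds.add(r.advertiserId); // the set ignores repeats
            if (c.words.length() > 0) c.words.append(' ');
            c.words.append(r.text);
        }
        return byTerm;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
            new Row("used car", "a1", "cheap used cars"),
            new Row("used car", "a2", "pre-owned autos"));
        Collapsed c = collapse(rows).get("used car");
        System.out.println(c.rowCount + " rows, " + c.advertiserIds.size() + " advertisers: " + c.words);
    }
}
```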

  Database schema layout used to upload bidded search listings

database /home/goto/crawldb
#droptable line_ad1
droptable line_ad0
table line_ad0
createtable
col
#keepfirst
trimspace
#multiple
datefmt yyyymmdd HHMMSS
#     Name                Type           Tag        default val
field advertiser_id       varchar(8)     1-8
field raw_search_text     varchar(40)    9-48
field cannon_search_text  varchar(40)    49-88
field ad_spec_title       varchar(100)   89-188
field ad_spec_desc        varchar(2000)  189-2188
field ad_url              varchar(200)   2189-2388
field resource            varchar(20)    2389-2408
field price               integer        2409-2413  0
field rating              char(2)        2414-2415
field ad_id               integer        2416-2423  0
field bid_date            date           2424-2438  0
field canon_cnt           integer        -          0
field crawlwords          varchar(40)
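
  To make the column tags concrete, the following Java sketch slices a fixed-width record using the first four column ranges from the schema (1-8, 9-48, 49-88, 89-188) and trims trailing padding, much as `trimspace` suggests. The class `FixedWidthRecord` and its `parse` method are hypothetical; the remaining fields would follow the same pattern.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical reader for the fixed-width layout; only the first four fields are shown.
final class FixedWidthRecord {
    static final class Field {
        final String name;
        final int start; // 1-based, inclusive
        final int end;   // 1-based, inclusive
        Field(String name, int start, int end) {
            this.name = name;
            this.start = start;
            this.end = end;
        }
    }

    static final Field[] FIELDS = {
        new Field("advertiser_id", 1, 8),
        new Field("raw_search_text", 9, 48),
        new Field("cannon_search_text", 49, 88),
        new Field("ad_spec_title", 89, 188),
    };

    // Cuts each field out of the line by column range; short lines yield empty values.
    static Map<String, String> parse(String line) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Field f : FIELDS) {
            int from = Math.min(f.start - 1, line.length());
            int to = Math.min(f.end, line.length());
            out.put(f.name, line.substring(from, to).trim());
        }
        return out;
    }

    public static void main(String[] args) {
        String line = String.format("%-8s%-40s%-40s%-100s",
            "1234", "Used Cars", "used car", "Great used cars");
        System.out.println(parse(line));
    }
}
```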

  Manual join of search listing data with crawled web page data into a single merged table

<script language=vortex>
<timeout=-1></timeout>
<a name=main>
<DB="/home/goto/crawldb">
<SQL ROW "select ad_url myurl, crawltitle ct, crawlmeta cm, crawlbody cb from " $CRAWLTABLE>
  <SQL NOVARS "update " $TMPTABLE " set crawltitle = $ct, crawlmeta = $cm, crawlbody = $cb where ad_url = $myurl">
  </SQL>
</SQL>
</a>
</script>
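
  The join above copies crawled fields onto every listing whose URL matches a crawled page. A minimal Java sketch of the same idea follows, assuming in-memory lists and using hypothetical `Listing` and `Page` types. It indexes the pages by URL first, which is a substitution for the script's one-update-per-crawled-row approach.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory join of listings with crawled pages, keyed on the listing URL.
final class CrawlJoin {
    static final class Listing {
        final String adUrl;
        String crawlTitle;
        String crawlBody;
        Listing(String adUrl) { this.adUrl = adUrl; }
    }

    static final class Page {
        final String url;
        final String title;
        final String body;
        Page(String url, String title, String body) {
            this.url = url;
            this.title = title;
            this.body = body;
        }
    }

    // Copies title and body onto each matching listing; returns how many listings were updated.
    static int join(List<Listing> listings, List<Page> pages) {
        Map<String, Page> byUrl = new HashMap<>();
        for (Page p : pages) {
            byUrl.put(p.url, p);
        }
        int updated = 0;
        for (Listing l : listings) {
            Page p = byUrl.get(l.adUrl);
            if (p != null) {
                l.crawlTitle = p.title;
                l.crawlBody = p.body;
                updated++;
            }
        }
        return updated;
    }

    public static void main(String[] args) {
        List<Listing> listings = Arrays.asList(new Listing("http://www.example.com/cars"));
        List<Page> pages = Arrays.asList(new Page("http://www.example.com/cars", "Cars", "Used cars for sale"));
        System.out.println(join(listings, pages) + " listing(s) updated");
    }
}
```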

  Code for duplicate-URL crawl elimination

/**
 * Insert the type's description here.
 * Creation date: (02/18/2000 11:12:12 AM)
 * @author:
 */
import java.io.*; // import of java classes needed for input/output
import java.util.*;
//import corejava.*;
import java.lang.String;

public class Url {
    /*****************************************************************
     * Compare URLs Address
     *****************************************************************/
    public static void main(String args[]) throws Exception {
        // Declarations of the input and output files
        BufferedReader inputFile;
        PrintWriter nonDupFile;
        PrintWriter dupFile;
        // Initialization
        String firstUrl = "";
        String secondUrl = "";
        String urlBufferA, urlBufferB = "", urlBufferC = "";
        String compareDomainA = "";
        String compareDomainB = "";
        String compareDomainC = "";
        String newFlag = "false";
        inputFile = new BufferedReader(new FileReader("/home/lauw/urls.lau"));
        // Opening without append and closing again empties both output files
        nonDupFile = new PrintWriter(new FileWriter("/home/lauw/nonDupFile.real"));
        dupFile = new PrintWriter(new FileWriter("/home/lauw/dupFile.real"));
        nonDupFile.close();
        dupFile.close();
        firstUrl = inputFile.readLine();
        secondUrl = inputFile.readLine();
        urlBufferC = inputFile.readLine();
        urlBufferA = firstUrl;
        urlBufferB = secondUrl;
        do {
            Slash ccompareDomainA = new Slash();
            Slash ccompareDomainB = new Slash();
            Slash ccompareDomainC = new Slash();
            compareDomainA = ccompareDomainA.Slash(urlBufferA);
            compareDomainB = ccompareDomainB.Slash(urlBufferB);
            compareDomainC = ccompareDomainC.Slash(urlBufferC);
            Compare compareSub = new Compare();
            newFlag = compareSub.Compare(compareDomainA, compareDomainB,
                    compareDomainC, urlBufferB, newFlag);
            // Slide the three-URL window down by one line
            urlBufferA = urlBufferB;
            urlBufferB = urlBufferC;
            urlBufferC = inputFile.readLine();
        } while (urlBufferC != null);

        /////////////////////////// Loop for first Null value
        urlBufferC = firstUrl;
        Slash ccompareFirstNullDomainA = new Slash();
        Slash ccompareFirstNullDomainB = new Slash();
        Slash ccompareFirstNullDomainC = new Slash();
        compareDomainA = ccompareFirstNullDomainA.Slash(urlBufferA);
        compareDomainB = ccompareFirstNullDomainB.Slash(urlBufferB);
        compareDomainC = ccompareFirstNullDomainC.Slash(urlBufferC);
        Compare compareFirstNullSub = new Compare();
        newFlag = compareFirstNullSub.Compare(compareDomainA, compareDomainB,
                compareDomainC, urlBufferB, newFlag);

        ///////////////////////// Loop for last Null value
        urlBufferA = urlBufferB;
        urlBufferB = firstUrl;
        urlBufferC = secondUrl;
        Slash ccompareLastNullDomainA = new Slash();
        Slash ccompareLastNullDomainB = new Slash();
        Slash ccompareLastNullDomainC = new Slash();
        compareDomainA = ccompareLastNullDomainA.Slash(urlBufferA);
        compareDomainB = ccompareLastNullDomainB.Slash(urlBufferB);
        compareDomainC = ccompareLastNullDomainC.Slash(urlBufferC);
        Compare compareLastNullSub = new Compare();
        newFlag = compareLastNullSub.Compare(compareDomainA, compareDomainB,
                compareDomainC, urlBufferB, newFlag);
        inputFile.close();
    }
}

class Slash {

    String Slash(String buffer) {
        int domainSlashEnd = 0;
        int domainSlashStart = 0;
        boolean domainIndex = false;
        boolean startFound = false;
        boolean newFlag = false;
        String comparedomain;
        comparedomain = "";
        for (int domainSlashLoop = 8; domainSlashLoop <= (buffer.length() - 1); domainSlashLoop++) {
            if ((buffer.substring(domainSlashLoop, (domainSlashLoop + 1)).equals("/"))
                    || (buffer.substring(domainSlashLoop, (domainSlashLoop + 1)).equals("?"))) {
                if (startFound == false) {
                    /// Check the Urls with Domain Name only
                    if ((domainSlashLoop + 1) == buffer.length()) {
                        comparedomain = buffer.substring(0, (buffer.length()));
                        domainIndex = true;
                        domainSlashLoop = buffer.length() + 500;
                    } // end for domain name only
                    domainSlashStart = domainSlashLoop + 1;
                    startFound = true;
                } else {
                    domainSlashEnd = domainSlashLoop;
                    domainSlashLoop = buffer.length() + 500; /// add 500 to get out of the loop
                }
            }
        } // end for Loop
        if (domainSlashEnd == 0) {
            domainSlashEnd = buffer.length();
        }
        if (domainIndex == false) {
            comparedomain = buffer.substring(domainSlashStart, domainSlashEnd);
        }
        return comparedomain;
    }
}

// The listing repeats this import here, so Compare presumably lives in its own source file.
import java.io.*; // import of java classes needed for input/output

class Compare

{
    String Compare(String aCompareDomainA, String aCompareDomainB,
            String aCompareDomainC, String aUrlBufferB, String newFlag)
            throws Exception {
        PrintWriter nonDupFile;
        PrintWriter dupFile;
        // Append mode with autoflush, so each call adds to the existing files
        nonDupFile = new PrintWriter(new FileWriter("/home/lauw/nonDupFile.real", true), true);
        dupFile = new PrintWriter(new FileWriter("/home/lauw/dupFile.real", true), true);
        if (aCompareDomainC.equals(aCompareDomainB)) {
            if (newFlag.equals("true")) {
                dupFile.println("New");
                newFlag = "false";
            }
            System.out.println("Duplicate");
            dupFile.println(aUrlBufferB);
        } else {
            if (aCompareDomainB.equals(aCompareDomainA)) {
                if (newFlag.equals("true")) {
                    dupFile.println("New");
                    newFlag = "false";
                }
                System.out.println("print a Duplicat in second time");
                System.out.println("Sec Duplicate");
                dupFile.println(aUrlBufferB);
            } else {
                System.out.println("non Dup");
                nonDupFile.println(aUrlBufferB);
                newFlag = "true";
            }
        }
        System.out.println("***************************************");
        // Close the writers so repeated calls do not leak file handles
        nonDupFile.close();
        dupFile.close();
        return newFlag;
    }
}
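
  The comparison logic in the listing can be condensed into a file-free sketch. The Java below is an illustration, not the original program: `RootPath`, `rootPathKey`, and `possibleDuplicates` are hypothetical names. `rootPathKey` follows the same rule as `Slash` (the text between the first and second `/` or `?` found at index 8 or later, or the whole URL when there is no such segment), and `possibleDuplicates` flags a URL when its key equals that of the previous or next URL in the list, wrapping around at both ends as the listing does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical condensed version of the Slash/Compare logic, without file I/O.
final class RootPath {
    // Key used for comparison: the first path segment after the host, if there is one.
    static String rootPathKey(String url) {
        int start = -1;
        for (int i = 8; i < url.length(); i++) {
            char ch = url.charAt(i);
            if (ch == '/' || ch == '?') {
                if (start < 0) {
                    if (i + 1 == url.length()) {
                        return url; // URL ends right after the host's trailing slash
                    }
                    start = i + 1;
                } else {
                    return url.substring(start, i);
                }
            }
        }
        return start < 0 ? url : url.substring(start);
    }

    // Returns the URLs whose key matches a neighbour's key in the (circular) list.
    static List<String> possibleDuplicates(List<String> urls) {
        List<String> dups = new ArrayList<>();
        int n = urls.size();
        if (n < 2) {
            return dups;
        }
        for (int i = 0; i < n; i++) {
            String prev = rootPathKey(urls.get((i + n - 1) % n));
            String cur = rootPathKey(urls.get(i));
            String next = rootPathKey(urls.get((i + 1) % n));
            if (cur.equals(next) || cur.equals(prev)) {
                dups.add(urls.get(i));
            }
        }
        return dups;
    }

    public static void main(String[] args) {
        List<String> urls = Arrays.asList(
            "http://www.example.com/cars/a.html",
            "http://www.example.org/cars/b.html",
            "http://www.example.net/boats/c.html");
        System.out.println(possibleDuplicates(urls));
    }
}
```

  Because only neighbours are compared, the input list would need to be ordered so that similar URLs sit next to each other.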

Claims (25)

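1. A method for generating a search result list, the method comprising:
receiving a search request from a searcher; identifying, in a pay for performance database including a plurality of search listings, search listings that match the search request;
identifying, in a related search database including related search listings generated from the pay for performance database, related search listings relevant to the search request; and
returning to the searcher a search result list including the identified search listings and one or more of the identified related search listings.
2. The method of claim 1, wherein identifying related search listings comprises:
searching an inverted index of the pay for performance database; and
searching an index of meta-information based on the pay for performance database.
3. The method of claim 1, further comprising:
sorting the identified related search listings by relevance to the search request;
selecting a predetermined number of the identified related search listings as the most relevant related search listings; and
returning the most relevant related search listings in the search result list.
4. The method of claim 3, wherein sorting comprises:
selecting the identified related search listings according to the frequency of occurrence of a query term of the search request in the related search listings.
5. The method of claim 3, wherein sorting comprises:
selecting the identified related search listings according to the proximity of one or more query terms of the search request in the related search listings.
6. The method of claim 3, wherein sorting comprises:
weighting the related search listings according to predetermined weighting criteria; and
selecting the identified related search listings according to the weighting of the related search listings.
7. The method of claim 6, wherein weighting the related search listings comprises:
increasing the relative weighting of a related search listing that includes one or more bidded search terms identified by an advertiser.
8. The method of claim 6, wherein weighting the related search listings comprises:
increasing the relative weighting of a related search listing that is contained in a description of a search listing identified by an advertiser.
9. The method of claim 6, wherein weighting the related search listings comprises:
increasing the relative weighting of a related search listing that is contained in a title of a search listing identified by an advertiser.
10. The method of claim 6, wherein weighting the related search listings comprises:
increasing the relative weighting of a related search listing that is contained in the metatag keywords of a web page maintained by an advertiser.
11. The method of claim 6, wherein weighting the related search listings comprises:
increasing the relative weighting of a related search listing that is contained in the text data of a web page maintained by an advertiser.
12. The method of claim 3, wherein sorting comprises:
ranking the related search listings according to the distribution of the related search listings; and
selecting the identified related search listings according to the rank of the related search listings.
13. The method of claim 12, wherein ranking comprises:
identifying key information contained in the related search listings; and
increasing the rank of a related search listing according to the key information occurring in the related search listing.
14. The method of claim 13, wherein identifying key information comprises:
detecting fielded advertiser data in the related search listings; and
detecting crawled data in the related search listings.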
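15. A system comprising:
a pay for performance database;
a related search database formed at least in part using the pay for performance database; and
a server coupled with the pay for performance database and the related search database, the server being responsive to a search request from a searcher to select a first set of search results from the pay for performance database and a second set of search results from the related search database.
16. The system of claim 15, wherein the pay for performance database comprises:
a plurality of search listings, each search listing including:
a search term,
a bid amount, and
a uniform resource locator corresponding to an address of a document on a web server remote from the system.
17. The system of claim 16, wherein the related search database comprises:
a plurality of related search listings, each related search listing including:
a keyword associated with a document of the pay for performance database
and with the text of the document.
18. The system of claim 17, wherein each search listing of the plurality of search listings further includes:
descriptive text describing the document,
a title, and
metatags associated with the document.
19. The system of claim 18, wherein each search listing includes:
the descriptive text, associated with the document;
the title, associated with the document; and
the metatags, associated with the document.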
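20. A method for forming a related search database for identifying related searches in response to a search request to a pay for performance database that includes a plurality of search listings, the method comprising:
storing the text of each web page referenced by a search listing of the pay for performance database as a related search database entry;
creating an inverted index for the related search database entries; and
creating an index of key information associated with each search listing of the pay for performance database.
21. The method of claim 20, wherein storing comprises:
identifying similar web pages in response to the root path portion and the query arguments of the uniform resource locators of two or more web pages referenced by search listings of the pay for performance database.
22. The method of claim 21, wherein identifying similar web pages comprises:
identifying first keywords of a first web page;
identifying second keywords of a second web page; and
comparing the first keywords and the second keywords;
identifying the first web page and the second web page as similar web pages when the first keywords and the second keywords have a predetermined relationship.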
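23. A method for crawling data in a database that includes Internet data from Internet sites, the method comprising:
forming a list of uniform resource locators (URLs) related to Internet sites to be visited;
removing identical URLs from the list;
if a URL on the list is similar to another URL on the list, crawling a plurality of predetermined possibly identical URLs;
comparing the body of the URL on the list with the possibly identical URLs;
if the body of the URL on the list is similar to a possibly identical URL,
suspending the crawling of the possibly identical URLs, and
storing the body of the URL on the list in the database for later searching.
24. The method of claim 23, further comprising:
comparing a selected URL with the other URLs on the list; and
determining that the URL is similar to another URL on the list when the URL has a predetermined text portion in common with the other URL on the list.
25. The method of claim 23, wherein comparing the body of the URL on the list with the possibly identical URLs comprises:
comparing text from the URL on the list with text from a possibly identical URL; and
determining that the URL on the list is similar to the possibly identical URL when the text from the URL on the list and the text from the possibly identical URL have a predetermined text portion in common.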
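
Claims 3 and 6 to 11 describe sorting related search listings by a weighting and returning a predetermined number of them. The Java sketch below shows one way such a weighting could be arranged. It is an illustration only: the class `RelatedSearchRanker`, the `Signal` names, and the numeric weights are all hypothetical, since the claims say only that each signal increases the relative weighting.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

// Hypothetical weighting of related search candidates; the weights are made up.
final class RelatedSearchRanker {
    enum Signal { BIDDED_TERM, TITLE, DESCRIPTION, META_KEYWORDS, BODY_TEXT }

    static final class Candidate {
        final String term;
        final Set<Signal> signals = EnumSet.noneOf(Signal.class);
        Candidate(String term, Signal... found) {
            this.term = term;
            this.signals.addAll(Arrays.asList(found));
        }
    }

    // Sums an assumed weight for every place the candidate term was found.
    static int weight(Candidate c) {
        int total = 0;
        for (Signal s : c.signals) {
            switch (s) {
                case BIDDED_TERM: total += 5; break;
                case TITLE: total += 4; break;
                case DESCRIPTION: total += 3; break;
                case META_KEYWORDS: total += 2; break;
                case BODY_TEXT: total += 1; break;
            }
        }
        return total;
    }

    // Sorts by descending weight and keeps at most `limit` terms.
    static List<String> topRelated(List<Candidate> candidates, int limit) {
        List<Candidate> sorted = new ArrayList<>(candidates);
        sorted.sort((a, b) -> Integer.compare(weight(b), weight(a)));
        List<String> out = new ArrayList<>();
        for (Candidate c : sorted) {
            if (out.size() == limit) {
                break;
            }
            out.add(c.term);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Candidate> candidates = Arrays.asList(
            new Candidate("car rental", Signal.BODY_TEXT),
            new Candidate("used cars", Signal.BIDDED_TERM, Signal.TITLE),
            new Candidate("car loans", Signal.DESCRIPTION, Signal.META_KEYWORDS));
        System.out.println(topRelated(candidates, 2));
    }
}
```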
CN01809998XA 2000-05-22 2001-05-18 Method and apparatus for identifying related searches in a database search system Expired - Lifetime CN1430751B (zh)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US09/575,894 US6876997B1 (en) 2000-05-22 2000-05-22 Method and apparatus for indentifying related searches in a database search system
US09/575,894 2000-05-22
PCT/US2001/016161 WO2001090947A1 (en) 2000-05-22 2001-05-18 Method and apparatus for identifying related searches in a database search system

Publications (2)

Publication Number Publication Date
CN1430751A true CN1430751A (zh) 2003-07-16
CN1430751B CN1430751B (zh) 2010-05-26

Family

ID=24302119

Family Applications (1)

Application Number Title Priority Date Filing Date
CN01809998XA Expired - Lifetime CN1430751B (zh) 2000-05-22 2001-05-18 Method and apparatus for identifying related searches in a database search system

Country Status (11)

Country Link
US (3) US6876997B1 (zh)
EP (1) EP1297453B1 (zh)
JP (1) JP3860036B2 (zh)
KR (2) KR100719009B1 (zh)
CN (1) CN1430751B (zh)
AT (1) ATE465470T1 (zh)
AU (5) AU2001263275B2 (zh)
CA (1) CA2409642C (zh)
DE (2) DE10196212T1 (zh)
GB (1) GB2388678B (zh)
WO (1) WO2001090947A1 (zh)


Families Citing this family (273)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7702537B2 (en) * 1999-05-28 2010-04-20 Yahoo! Inc System and method for enabling multi-element bidding for influencing a position on a search result list generated by a computer network search engine
JP3621643B2 (ja) * 2000-05-30 2005-02-16 株式会社 ネットピア.コム 実名を利用した地域情報提供システム及びその方法
US7319975B2 (en) * 2000-07-24 2008-01-15 Emergency 24, Inc. Internet-based advertising and referral system
US7359951B2 (en) 2000-08-08 2008-04-15 Aol Llc, A Delaware Limited Liability Company Displaying search results
US7225180B2 (en) * 2000-08-08 2007-05-29 Aol Llc Filtering search results
US7007008B2 (en) * 2000-08-08 2006-02-28 America Online, Inc. Category searching
US7047229B2 (en) * 2000-08-08 2006-05-16 America Online, Inc. Searching content on web pages
CA2323883C (en) * 2000-10-19 2016-02-16 Patrick Ryan Morin Method and device for classifying internet objects and objects stored oncomputer-readable media
US7925967B2 (en) * 2000-11-21 2011-04-12 Aol Inc. Metadata quality improvement
US7356530B2 (en) * 2001-01-10 2008-04-08 Looksmart, Ltd. Systems and methods of retrieving relevant information
US7428496B1 (en) * 2001-04-24 2008-09-23 Amazon.Com, Inc. Creating an incentive to author useful item reviews
US7546287B2 (en) * 2001-06-18 2009-06-09 Siebel Systems, Inc. System and method to search a database for records matching user-selected search criteria and to maintain persistency of the matched records
US7213013B1 (en) * 2001-06-18 2007-05-01 Siebel Systems, Inc. Method, apparatus, and system for remote client search indexing
US7464072B1 (en) 2001-06-18 2008-12-09 Siebel Systems, Inc. Method, apparatus, and system for searching based on search visibility rules
WO2003017023A2 (en) 2001-08-14 2003-02-27 Quigo Technologies, Inc. System and method for extracting content for submission to a search engine
US7752266B2 (en) 2001-10-11 2010-07-06 Ebay Inc. System and method to facilitate translation of communications between entities over a network
US8590013B2 (en) 2002-02-25 2013-11-19 C. S. Lee Crawford Method of managing and communicating data pertaining to software applications for processor-based devices comprising wireless communication circuitry
EP1343098A1 (en) * 2002-03-07 2003-09-10 Hewlett-Packard Company Improvements relating to network environments and location of resources therein
US8078505B2 (en) 2002-06-10 2011-12-13 Ebay Inc. Method and system for automatically updating a seller application utilized in a network-based transaction facility
US8050970B2 (en) 2002-07-25 2011-11-01 Google Inc. Method and system for providing filtered and/or masked advertisements over the internet
EP1535211A4 (en) * 2002-08-30 2006-08-23 Miva Inc SYSTEM AND METHOD FOR PAYING QUALITY ADVERTISING USING MULTIPLE ASSEMBLIES OF ADVERTISING LISTS
JP5108227B2 (ja) 2002-12-14 2012-12-26 エヌエイチエヌ ビジネス プラットフォーム コーポレーション 検索結果リストの生成システム及び方法
US20040138988A1 (en) * 2002-12-20 2004-07-15 Bart Munro Method to facilitate a search of a database utilizing multiple search criteria
KR100485322B1 (ko) * 2003-03-08 2005-04-27 엔에이치엔(주) 검색 엔진에서 검색 결과 리스트를 생성하는 방법
US20050065928A1 (en) * 2003-05-02 2005-03-24 Kurt Mortensen Content performance assessment optimization for search listings in wide area network searches
US20030167212A1 (en) * 2003-05-15 2003-09-04 Emergency 24, Inc. Method and system for providing relevant advertisement internet hyperlinks
US7739295B1 (en) * 2003-06-20 2010-06-15 Amazon Technologies, Inc. Method and system for identifying information relevant to content
US7647299B2 (en) * 2003-06-30 2010-01-12 Google, Inc. Serving advertisements using a search of advertiser web information
US8438154B2 (en) * 2003-06-30 2013-05-07 Google Inc. Generating information for online advertisements from internet data and traditional media data
US7599938B1 (en) 2003-07-11 2009-10-06 Harrison Jr Shelton E Social news gathering, prioritizing, tagging, searching, and syndication method
US8321400B2 (en) * 2003-08-29 2012-11-27 Vortaloptics, Inc. Method, device and software for querying and presenting search results
US7440964B2 (en) * 2003-08-29 2008-10-21 Vortaloptics, Inc. Method, device and software for querying and presenting search results
US7505964B2 (en) * 2003-09-12 2009-03-17 Google Inc. Methods and systems for improving a search ranking using related queries
US20050060289A1 (en) * 2003-09-12 2005-03-17 Mark Keenan A method for relaxing multiple constraints in search and calculation and then displaying results
US7664770B2 (en) * 2003-10-06 2010-02-16 Lycos, Inc. Smart browser panes
US20050125397A1 (en) * 2003-12-04 2005-06-09 William Gross Transparent search engine
US20050144064A1 (en) * 2003-12-19 2005-06-30 Palo Alto Research Center Incorporated Keyword advertisement management
US20050137939A1 (en) * 2003-12-19 2005-06-23 Palo Alto Research Center Incorporated Server-based keyword advertisement management
US7523087B1 (en) * 2003-12-31 2009-04-21 Google, Inc. Determining and/or designating better ad information such as ad landing pages
US20070088683A1 (en) * 2004-08-03 2007-04-19 Gene Feroglia Method and system for search engine enhancement
WO2005070164A2 (en) * 2004-01-12 2005-08-04 Chromotopy, Inc. Method and system for search engine enhancement
US8595146B1 (en) 2004-03-15 2013-11-26 Aol Inc. Social networking permissions
US9189568B2 (en) 2004-04-23 2015-11-17 Ebay Inc. Method and system to display and search in a language independent manner
US20050267872A1 (en) * 2004-06-01 2005-12-01 Yaron Galai System and method for automated mapping of items to documents
US7487145B1 (en) * 2004-06-22 2009-02-03 Google Inc. Method and system for autocompletion using ranked results
US7836044B2 (en) 2004-06-22 2010-11-16 Google Inc. Anticipated query generation and processing in a search engine
KR100806862B1 (ko) * 2004-07-16 2008-02-26 (주)이네스트커뮤니케이션 웹 사이트에서의 1차 키워드 검색에 대해 관련성 있는 2차키워드의 리스트를 제공하는 방법 및 장치
US7904337B2 (en) 2004-10-19 2011-03-08 Steve Morsa Match engine marketing
US8799079B2 (en) * 2004-10-22 2014-08-05 Adknowledge, Inc. System for prioritizing advertiser communications over a network
US9015263B2 (en) 2004-10-29 2015-04-21 Go Daddy Operating Company, LLC Domain name searching with reputation rating
US7499940B1 (en) 2004-11-11 2009-03-03 Google Inc. Method and system for URL autocompletion using ranked results
WO2006055000A1 (en) * 2004-11-12 2006-05-26 Benninghoff Charles F Iii International system to allocate vendue exclusivity ranking (isvaer)
US20060106769A1 (en) 2004-11-12 2006-05-18 Gibbs Kevin A Method and system for autocompletion for languages having ideographs and phonetic characters
US7966310B2 (en) * 2004-11-24 2011-06-21 At&T Intellectual Property I, L.P. Method, system, and software for correcting uniform resource locators
CN1609859A (zh) * 2004-11-26 2005-04-27 孙斌 搜索结果聚类的方法
US20060122976A1 (en) 2004-12-03 2006-06-08 Shumeet Baluja Predictive information retrieval
US8364670B2 (en) 2004-12-28 2013-01-29 Dt Labs, Llc System, method and apparatus for electronically searching for an item
US7974962B2 (en) 2005-01-06 2011-07-05 Aptiv Digital, Inc. Search engine for a video recorder
US20110208732A1 (en) 2010-02-24 2011-08-25 Apple Inc. Systems and methods for organizing data items
US10482474B1 (en) 2005-01-19 2019-11-19 A9.Com, Inc. Advertising database system and method
WO2006085778A2 (en) * 2005-02-11 2006-08-17 Eurekster, Inc Information prioritisation system and method
US7801880B2 (en) * 2005-03-29 2010-09-21 Microsoft Corporation Crawling databases for information
US9134884B2 (en) 2005-03-30 2015-09-15 Ebay Inc. Methods and systems to process a selection of a browser back button
US20060271389A1 (en) * 2005-05-31 2006-11-30 Microsoft Corporation Pay per percentage of impressions
US20060271426A1 (en) * 2005-05-31 2006-11-30 Microsoft Corporation Posted price market for online search and content advertisements
US20060293950A1 (en) * 2005-06-28 2006-12-28 Microsoft Corporation Automatic ad placement
US7752220B2 (en) * 2005-08-10 2010-07-06 Yahoo! Inc. Alternative search query processing in a term bidding system
US7634462B2 (en) * 2005-08-10 2009-12-15 Yahoo! Inc. System and method for determining alternate search queries
US8131594B1 (en) 2005-08-11 2012-03-06 Amazon Technologies, Inc. System and method for facilitating targeted advertising
US7747639B2 (en) * 2005-08-24 2010-06-29 Yahoo! Inc. Alternative search query prediction
US7672932B2 (en) 2005-08-24 2010-03-02 Yahoo! Inc. Speculative search result based on a not-yet-submitted search query
US8503995B2 (en) 2005-09-14 2013-08-06 Jumptap, Inc. Mobile dynamic advertisement creation and placement
US8819659B2 (en) 2005-09-14 2014-08-26 Millennial Media, Inc. Mobile search service instant activation
US10038756B2 (en) 2005-09-14 2018-07-31 Millenial Media LLC Managing sponsored content based on device characteristics
US20070100650A1 (en) * 2005-09-14 2007-05-03 Jorey Ramer Action functionality for mobile content search results
US8238888B2 (en) 2006-09-13 2012-08-07 Jumptap, Inc. Methods and systems for mobile coupon placement
US7752209B2 (en) 2005-09-14 2010-07-06 Jumptap, Inc. Presenting sponsored content on a mobile communication facility
US7860871B2 (en) 2005-09-14 2010-12-28 Jumptap, Inc. User history influenced search results
US7603360B2 (en) * 2005-09-14 2009-10-13 Jumptap, Inc. Location influenced search results
US20070100805A1 (en) * 2005-09-14 2007-05-03 Jorey Ramer Mobile content cross-inventory yield optimization
US9058406B2 (en) 2005-09-14 2015-06-16 Millennial Media, Inc. Management of multiple advertising inventories using a monetization platform
US20070061242A1 (en) * 2005-09-14 2007-03-15 Jorey Ramer Implicit searching for mobile content
US8103545B2 (en) 2005-09-14 2012-01-24 Jumptap, Inc. Managing payment for sponsored content presented to mobile communication facilities
US8832100B2 (en) 2005-09-14 2014-09-09 Millennial Media, Inc. User transaction history influenced search results
US8229914B2 (en) 2005-09-14 2012-07-24 Jumptap, Inc. Mobile content spidering and compatibility determination
US8195133B2 (en) 2005-09-14 2012-06-05 Jumptap, Inc. Mobile dynamic advertisement creation and placement
US8156128B2 (en) 2005-09-14 2012-04-10 Jumptap, Inc. Contextual mobile content placement on a mobile communication facility
US7912458B2 (en) 2005-09-14 2011-03-22 Jumptap, Inc. Interaction analysis and prioritization of mobile content
US7577665B2 (en) * 2005-09-14 2009-08-18 Jumptap, Inc. User characteristic influenced search results
US8027879B2 (en) 2005-11-05 2011-09-27 Jumptap, Inc. Exclusivity bidding for mobile sponsored content
US20110313853A1 (en) 2005-09-14 2011-12-22 Jorey Ramer System for targeting advertising content to a plurality of mobile communication facilities
US8660891B2 (en) 2005-11-01 2014-02-25 Millennial Media Interactive mobile advertisement banners
US20070100651A1 (en) * 2005-11-01 2007-05-03 Jorey Ramer Mobile payment facilitation
US8209344B2 (en) 2005-09-14 2012-06-26 Jumptap, Inc. Embedding sponsored content in mobile applications
US8131271B2 (en) 2005-11-05 2012-03-06 Jumptap, Inc. Categorization of a mobile user profile based on browse behavior
US8290810B2 (en) 2005-09-14 2012-10-16 Jumptap, Inc. Realtime surveying within mobile sponsored content
US20070168354A1 (en) * 2005-11-01 2007-07-19 Jorey Ramer Combined algorithmic and editorial-reviewed mobile content search results
US9076175B2 (en) 2005-09-14 2015-07-07 Millennial Media, Inc. Mobile comparison shopping
US7676394B2 (en) 2005-09-14 2010-03-09 Jumptap, Inc. Dynamic bidding and expected value
US8688671B2 (en) 2005-09-14 2014-04-01 Millennial Media Managing sponsored content based on geographic region
US7660581B2 (en) 2005-09-14 2010-02-09 Jumptap, Inc. Managing sponsored content based on usage history
US9201979B2 (en) 2005-09-14 2015-12-01 Millennial Media, Inc. Syndication of a behavioral profile associated with an availability condition using a monetization platform
US9471925B2 (en) 2005-09-14 2016-10-18 Millennial Media Llc Increasing mobile interactivity
US8532633B2 (en) 2005-09-14 2013-09-10 Jumptap, Inc. System for targeting advertising content to a plurality of mobile communication facilities
US8311888B2 (en) 2005-09-14 2012-11-13 Jumptap, Inc. Revenue models associated with syndication of a behavioral profile using a monetization platform
US8615719B2 (en) 2005-09-14 2013-12-24 Jumptap, Inc. Managing sponsored content for delivery to mobile communication facilities
US7548915B2 (en) 2005-09-14 2009-06-16 Jorey Ramer Contextual mobile content placement on a mobile communication facility
US8805339B2 (en) 2005-09-14 2014-08-12 Millennial Media, Inc. Categorization of a mobile user profile based on browse and viewing behavior
US8364521B2 (en) 2005-09-14 2013-01-29 Jumptap, Inc. Rendering targeted advertisement on mobile communication facilities
US10911894B2 (en) 2005-09-14 2021-02-02 Verizon Media Inc. Use of dynamic content generation parameters based on previous performance of those parameters
US9703892B2 (en) 2005-09-14 2017-07-11 Millennial Media Llc Predictive text completion for a mobile communication facility
US20070073719A1 (en) * 2005-09-14 2007-03-29 Jorey Ramer Physical navigation of a mobile search application
US7702318B2 (en) 2005-09-14 2010-04-20 Jumptap, Inc. Presentation of sponsored content based on mobile transaction event
US8666376B2 (en) 2005-09-14 2014-03-04 Millennial Media Location based mobile shopping affinity program
US8364540B2 (en) * 2005-09-14 2013-01-29 Jumptap, Inc. Contextual targeting of content using a monetization platform
US10592930B2 (en) 2005-09-14 2020-03-17 Millenial Media, LLC Syndication of a behavioral profile using a monetization platform
US8302030B2 (en) 2005-09-14 2012-10-30 Jumptap, Inc. Management of multiple advertising inventories using a monetization platform
US8812526B2 (en) 2005-09-14 2014-08-19 Millennial Media, Inc. Mobile content cross-inventory yield optimization
US7769764B2 (en) 2005-09-14 2010-08-03 Jumptap, Inc. Mobile advertisement syndication
US8989718B2 (en) 2005-09-14 2015-03-24 Millennial Media, Inc. Idle screen advertising
US7725464B2 (en) * 2005-09-27 2010-05-25 Looksmart, Ltd. Collection and delivery of internet ads
US8676781B1 (en) 2005-10-19 2014-03-18 A9.Com, Inc. Method and system for associating an advertisement with a web page
US8015065B2 (en) * 2005-10-28 2011-09-06 Yahoo! Inc. Systems and methods for assigning monetary values to search terms
US8082516B2 (en) * 2005-11-01 2011-12-20 Lycos, Inc. Preview panel
US8175585B2 (en) 2005-11-05 2012-05-08 Jumptap, Inc. System for targeting advertising content to a plurality of mobile communication facilities
US8571999B2 (en) 2005-11-14 2013-10-29 C. S. Lee Crawford Method of conducting operations for a social network application including activity list generation
US20070143255A1 (en) * 2005-11-28 2007-06-21 Webaroo, Inc. Method and system for delivering internet content to mobile devices
US8903810B2 (en) 2005-12-05 2014-12-02 Collarity, Inc. Techniques for ranking search results
US8429184B2 (en) 2005-12-05 2013-04-23 Collarity Inc. Generation of refinement terms for search queries
US7788131B2 (en) * 2005-12-15 2010-08-31 Microsoft Corporation Advertising keyword cross-selling
US7792858B2 (en) * 2005-12-21 2010-09-07 Ebay Inc. Computer-implemented method and system for combining keywords into logical clusters that share similar behavior with respect to a considered dimension
US7752190B2 (en) * 2005-12-21 2010-07-06 Ebay Inc. Computer-implemented method and system for managing keyword bidding prices
US8036937B2 (en) 2005-12-21 2011-10-11 Ebay Inc. Computer-implemented method and system for enabling the automated selection of keywords for rapid keyword portfolio expansion
US20070174255A1 (en) * 2005-12-22 2007-07-26 Entrieva, Inc. Analyzing content to determine context and serving relevant content based on the context
US20070156654A1 (en) * 2005-12-29 2007-07-05 Kalpana Ravinarayanan Method for displaying search results and contextually related items
US20070192246A1 (en) * 2006-01-23 2007-08-16 Intersearch Group, Inc. System and method for redirecting internet traffic
US20080140491A1 (en) * 2006-02-02 2008-06-12 Microsoft Corporation Advertiser backed compensation for end users
US20070179853A1 (en) * 2006-02-02 2007-08-02 Microsoft Corporation Allocating rebate points
US20070179846A1 (en) * 2006-02-02 2007-08-02 Microsoft Corporation Ad targeting and/or pricing based on customer behavior
US20070179848A1 (en) * 2006-02-02 2007-08-02 Microsoft Corporation Employing customer points to confirm transaction
US20070179849A1 (en) * 2006-02-02 2007-08-02 Microsoft Corporation Ad publisher performance and mitigation of click fraud
US20080114651A1 (en) * 2006-02-02 2008-05-15 Microsoft Corporation Omaha - user price incentive model
US20070192179A1 (en) * 2006-02-15 2007-08-16 Van Luchene Andrew S Survey-Based Qualification of Keyword Searches
US7689554B2 (en) * 2006-02-28 2010-03-30 Yahoo! Inc. System and method for identifying related queries for languages with multiple writing systems
US7899818B2 (en) * 2006-03-29 2011-03-01 A9.Com, Inc. Method and system for providing focused search results by excluding categories
US20090055373A1 (en) * 2006-05-09 2009-02-26 Irit Haviv-Segal System and method for refining search terms
US20070271255A1 (en) * 2006-05-17 2007-11-22 Nicky Pappo Reverse search-engine
KR100824435B1 (ko) 2006-06-23 2008-04-22 (주)첫눈 검색광고 리스트 순위 결정방법 및 장치
US7792830B2 (en) * 2006-08-01 2010-09-07 International Business Machines Corporation Analyzing the ability to find textual content
US7716201B2 (en) * 2006-08-10 2010-05-11 Yahoo! Inc. Method and apparatus for reconstructing a search query
US8639782B2 (en) 2006-08-23 2014-01-28 Ebay, Inc. Method and system for sharing metadata between interfaces
US8442972B2 (en) 2006-10-11 2013-05-14 Collarity, Inc. Negative associations for search results ranking and refinement
US8661029B1 (en) 2006-11-02 2014-02-25 Google Inc. Modifying search result ranking based on implicit user feedback
US8195512B2 (en) * 2006-11-03 2012-06-05 Joseph Franklin Shuhy System and method for serving relevant question-based advertisements
US8515809B2 (en) * 2006-12-12 2013-08-20 International Business Machines Corporation Dynamic modification of advertisements displayed in response to a search engine query
US8463830B2 (en) 2007-01-05 2013-06-11 Google Inc. Keyword-based content suggestions
US20080215416A1 (en) * 2007-01-31 2008-09-04 Collarity, Inc. Searchable interactive internet advertisements
US7685084B2 (en) * 2007-02-09 2010-03-23 Yahoo! Inc. Term expansion using associative matching of labeled term pairs
US8938463B1 (en) 2007-03-12 2015-01-20 Google Inc. Modifying search result ranking based on implicit user feedback and a model of presentation bias
US8694374B1 (en) 2007-03-14 2014-04-08 Google Inc. Detecting click spam
WO2008122055A2 (en) * 2007-04-02 2008-10-09 Gigablast, Inc. A system and method for generating and paying for ad listings for association with search results or other content
KR100930786B1 (ko) * 2007-04-04 2009-12-09 엔에이치엔비즈니스플랫폼 주식회사 Method and system for generating an advertisement list
US20080256056A1 (en) * 2007-04-10 2008-10-16 Yahoo! Inc. System for building a data structure representing a network of users and advertisers
US9092510B1 (en) 2007-04-30 2015-07-28 Google Inc. Modifying search result ranking based on a temporal element of user feedback
US8176476B2 (en) * 2007-06-15 2012-05-08 Microsoft Corporation Analyzing software usage with instrumentation data
US7788284B2 (en) * 2007-06-26 2010-08-31 Yahoo! Inc. System and method for knowledge based search system
US7809745B2 (en) * 2007-08-09 2010-10-05 Yahoo! Inc. Method for generating structured query results using lexical clustering
US20090055436A1 (en) * 2007-08-20 2009-02-26 Olakunle Olaniyi Ayeni System and Method for Integrating on Demand/Pull and Push Flow of Goods-and-Services Meta-Data, Including Coupon and Advertising, with Mobile and Wireless Applications
US8694511B1 (en) 2007-08-20 2014-04-08 Google Inc. Modifying search result ranking based on populations
US8712758B2 (en) * 2007-08-31 2014-04-29 Microsoft Corporation Coreference resolution in an ambiguity-sensitive natural language processing system
US8868562B2 (en) * 2007-08-31 2014-10-21 Microsoft Corporation Identification of semantic relationships within reported speech
US8229970B2 (en) 2007-08-31 2012-07-24 Microsoft Corporation Efficient storage and retrieval of posting lists
US8209321B2 (en) * 2007-08-31 2012-06-26 Microsoft Corporation Emphasizing search results according to conceptual meaning
US8463593B2 (en) * 2007-08-31 2013-06-11 Microsoft Corporation Natural language hypernym weighting for word sense disambiguation
US8229730B2 (en) * 2007-08-31 2012-07-24 Microsoft Corporation Indexing role hierarchies for words in a search index
US20090070322A1 (en) * 2007-08-31 2009-03-12 Powerset, Inc. Browsing knowledge on the basis of semantic relations
US8346756B2 (en) * 2007-08-31 2013-01-01 Microsoft Corporation Calculating valence of expressions within documents for searching a document index
US8280721B2 (en) * 2007-08-31 2012-10-02 Microsoft Corporation Efficiently representing word sense probabilities
US8316036B2 (en) * 2007-08-31 2012-11-20 Microsoft Corporation Checkpointing iterators during search
US8108255B1 (en) 2007-09-27 2012-01-31 Amazon Technologies, Inc. Methods and systems for obtaining reviews for items lacking reviews
US8001003B1 (en) * 2007-09-28 2011-08-16 Amazon Technologies, Inc. Methods and systems for searching for and identifying data repository deficits
US10115124B1 (en) * 2007-10-01 2018-10-30 Google Llc Systems and methods for preserving privacy
US8909655B1 (en) 2007-10-11 2014-12-09 Google Inc. Time based ranking
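CN101159967B (zh) * 2007-10-29 2011-08-31 中国移动通信集团设计院有限公司 Method and device for using drive test data for propagation model correction
KR100903499B1 (ko) * 2007-12-27 2009-06-18 엔에이치엔비즈니스플랫폼 주식회사 Method for providing advertisements according to search intent classification and system for performing the method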
US8126877B2 (en) * 2008-01-23 2012-02-28 Globalspec, Inc. Arranging search engine results
US8595209B1 (en) * 2008-01-29 2013-11-26 Boundless Network, Inc. Product idea sharing algorithm
US8412571B2 (en) * 2008-02-11 2013-04-02 Advertising.Com Llc Systems and methods for selling and displaying advertisements over a network
US7991780B1 (en) 2008-05-07 2011-08-02 Google Inc. Performing multiple related searches
US8010544B2 (en) * 2008-06-06 2011-08-30 Yahoo! Inc. Inverted indices in information extraction to improve records extracted per annotation
US8606627B2 (en) * 2008-06-12 2013-12-10 Microsoft Corporation Sponsored search data structure
US8438178B2 (en) * 2008-06-26 2013-05-07 Collarity Inc. Interactions among online digital identities
US8521731B2 (en) 2008-07-09 2013-08-27 Yahoo! Inc. Systems and methods for query expansion in sponsored search
JP5576376B2 (ja) 2008-08-28 2014-08-20 ネイバー ビジネス プラットフォーム コーポレーション Search method and system using an expanded keyword pool
US20100057712A1 (en) * 2008-09-02 2010-03-04 Yahoo! Inc. Integrated community-based, contribution polling arrangement
US20100070334A1 (en) * 2008-09-08 2010-03-18 Dante Monteverde Method and system for location-based mobile device predictive services
EP2172853B1 (en) * 2008-10-01 2011-11-30 Software AG Database index and database for indexing text documents
US20100094835A1 (en) * 2008-10-15 2010-04-15 Yumao Lu Automatic query concepts identification and drifting for web search
US20100100563A1 (en) * 2008-10-18 2010-04-22 Francisco Corella Method of computing a cooperative answer to a zero-result query through a high latency api
US8468158B2 (en) * 2008-11-06 2013-06-18 Yahoo! Inc. Adaptive weighted crawling of user activity feeds
US8112393B2 (en) * 2008-12-05 2012-02-07 Yahoo! Inc. Determining related keywords based on lifestream feeds
US8396865B1 (en) 2008-12-10 2013-03-12 Google Inc. Sharing search engine relevance data between corpora
US8041729B2 (en) * 2009-02-20 2011-10-18 Yahoo! Inc. Categorizing queries and expanding keywords with a coreference graph
US9009146B1 (en) 2009-04-08 2015-04-14 Google Inc. Ranking search results based on similar queries
CN102483752A (zh) 2009-06-03 2012-05-30 谷歌公司 Autocompletion for partially entered queries
US8447760B1 (en) 2009-07-20 2013-05-21 Google Inc. Generating a related set of documents for an initial set of documents
US7831609B1 (en) * 2009-08-25 2010-11-09 Vizibility Inc. System and method for searching, formulating, distributing and monitoring usage of predefined internet search queries
US8498974B1 (en) 2009-08-31 2013-07-30 Google Inc. Refining search results
US8972391B1 (en) 2009-10-02 2015-03-03 Google Inc. Recent interest based relevance scoring
US8266006B2 (en) 2009-11-03 2012-09-11 Ebay Inc. Method, medium, and system for keyword bidding in a market cooperative
US8874555B1 (en) 2009-11-20 2014-10-28 Google Inc. Modifying scoring data based on historical changes
US7890602B1 (en) 2009-12-11 2011-02-15 The Go Daddy Group, Inc. Tools enabling preferred domain positioning on a registration website
US8370217B1 (en) * 2009-12-11 2013-02-05 Go Daddy Operating Company, LLC Methods for determining preferred domain positioning on a registration website
US8875038B2 (en) 2010-01-19 2014-10-28 Collarity, Inc. Anchoring for content synchronization
US8615514B1 (en) 2010-02-03 2013-12-24 Google Inc. Evaluating website properties by partitioning user feedback
US8706728B2 (en) * 2010-02-19 2014-04-22 Go Daddy Operating Company, LLC Calculating reliability scores from word splitting
US9058393B1 (en) 2010-02-19 2015-06-16 Go Daddy Operating Company, LLC Tools for appraising a domain name using keyword monetary value data
US9330168B1 (en) 2010-02-19 2016-05-03 Go Daddy Operating Company, LLC System and method for identifying website verticals
US8515969B2 (en) * 2010-02-19 2013-08-20 Go Daddy Operating Company, LLC Splitting a character string into keyword strings
US8909558B1 (en) 2010-02-19 2014-12-09 Go Daddy Operating Company, LLC Appraising a domain name using keyword monetary value data
US9311423B1 (en) 2010-02-19 2016-04-12 Go Daddy Operating Company, LLC System and method for website categorization
US20110213660A1 (en) * 2010-02-26 2011-09-01 Marcus Fontoura System and Method for Automatic Matching of Contracts in an Inverted Index to Impression Opportunities Using Complex Predicates with Multi-Valued Attributes
US8924379B1 (en) 2010-03-05 2014-12-30 Google Inc. Temporal-based score adjustments
US8959093B1 (en) 2010-03-15 2015-02-17 Google Inc. Ranking search results based on anchors
WO2011156605A2 (en) 2010-06-11 2011-12-15 Doat Media Ltd. A system and methods thereof for enhancing a user's search experience
US9069443B2 (en) 2010-06-11 2015-06-30 Doat Media Ltd. Method for dynamically displaying a personalized home screen on a user device
US10713312B2 (en) 2010-06-11 2020-07-14 Doat Media Ltd. System and method for context-launching of applications
US9552422B2 (en) 2010-06-11 2017-01-24 Doat Media Ltd. System and method for detecting a search intent
US9372885B2 (en) 2010-06-11 2016-06-21 Doat Media Ltd. System and methods thereof for dynamically updating the contents of a folder on a device
US20120226676A1 (en) * 2010-06-11 2012-09-06 Doat Media Ltd. System and methods thereof for adaptation of a free text query to a customized query set
US9639611B2 (en) 2010-06-11 2017-05-02 Doat Media Ltd. System and method for providing suitable web addresses to a user device
US9529918B2 (en) 2010-06-11 2016-12-27 Doat Media Ltd. System and methods thereof for downloading applications via a communication network
US9665647B2 (en) 2010-06-11 2017-05-30 Doat Media Ltd. System and method for indexing mobile applications
US9141702B2 (en) 2010-06-11 2015-09-22 Doat Media Ltd. Method for dynamically displaying a personalized home screen on a device
US9623119B1 (en) 2010-06-29 2017-04-18 Google Inc. Accentuating search results
US8832083B1 (en) 2010-07-23 2014-09-09 Google Inc. Combining user feedback
US8380493B2 (en) 2010-10-01 2013-02-19 Microsoft Corporation Association of semantic meaning with data elements using data definition tags
US9002867B1 (en) 2010-12-30 2015-04-07 Google Inc. Modifying ranking data based on document changes
WO2012088706A1 (zh) * 2010-12-31 2012-07-05 Xiao Yan Retrieval method and system
US9858342B2 (en) 2011-03-28 2018-01-02 Doat Media Ltd. Method and system for searching for applications respective of a connectivity mode of a user device
US9002926B2 (en) 2011-04-22 2015-04-07 Go Daddy Operating Company, LLC Methods for suggesting domain names from a geographic location data
US8868591B1 (en) * 2011-06-22 2014-10-21 Google Inc. Modifying a user query to improve the results
US10049377B1 (en) * 2011-06-29 2018-08-14 Google Llc Inferring interactions with advertisers
US8694507B2 (en) 2011-11-02 2014-04-08 Microsoft Corporation Tenantization of search result ranking
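CN102622410A (zh) * 2012-02-17 2012-08-01 百度在线网络技术(北京)有限公司 Method and device for introducing and invoking data resources
CN102646134A (zh) * 2012-03-29 2012-08-22 百度在线网络技术(北京)有限公司 Method and device for determining message sessions in message records
CN103377240B (zh) * 2012-04-26 2017-03-01 阿里巴巴集团控股有限公司 Information providing method, processing server, and merging server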
US9477711B2 (en) 2012-05-16 2016-10-25 Google Inc. Knowledge panel
US9020927B1 (en) * 2012-06-01 2015-04-28 Google Inc. Determining resource quality based on resource competition
US10037543B2 (en) * 2012-08-13 2018-07-31 Amobee, Inc. Estimating conversion rate in display advertising from past performance data
US10007731B2 (en) * 2012-09-12 2018-06-26 Google Llc Deduplication in search results
US9275040B1 (en) 2012-09-14 2016-03-01 Go Daddy Operating Company, LLC Validating user control over contact information in a domain name registration database
US8938438B2 (en) 2012-10-11 2015-01-20 Go Daddy Operating Company, LLC Optimizing search engine ranking by recommending content including frequently searched questions
US9900314B2 (en) 2013-03-15 2018-02-20 Dt Labs, Llc System, method and apparatus for increasing website relevance while protecting privacy
US9183499B1 (en) 2013-04-19 2015-11-10 Google Inc. Evaluating quality based on neighbor features
US9633080B2 (en) 2013-05-28 2017-04-25 Microsoft Technology Licensing, Llc Hierarchical entity information for search
US9904944B2 (en) 2013-08-16 2018-02-27 Go Daddy Operating Company, Llc. System and method for domain name query metrics
US9684918B2 (en) 2013-10-10 2017-06-20 Go Daddy Operating Company, LLC System and method for candidate domain name generation
US9715694B2 (en) 2013-10-10 2017-07-25 Go Daddy Operating Company, LLC System and method for website personalization from survey data
TW201518963A (zh) * 2013-11-05 2015-05-16 Richplay Information Co Ltd Method for recommending browsing objects
US9922361B2 (en) * 2014-08-18 2018-03-20 Excalibur Ip, Llc Content suggestions
US9953105B1 (en) 2014-10-01 2018-04-24 Go Daddy Operating Company, LLC System and method for creating subdomains or directories for a domain name
US9779125B2 (en) 2014-11-14 2017-10-03 Go Daddy Operating Company, LLC Ensuring accurate domain name contact information
US9785663B2 (en) 2014-11-14 2017-10-10 Go Daddy Operating Company, LLC Verifying a correspondence address for a registrant
US9865011B2 (en) 2015-01-07 2018-01-09 Go Daddy Operating Company, LLC Notifying registrants of domain name valuations
US10296506B2 (en) 2015-01-07 2019-05-21 Go Daddy Operating Company, LLC Notifying users of available searched domain names
US9972041B2 (en) 2015-02-18 2018-05-15 Go Daddy Operating Company, LLC Earmarking a short list of favorite domain names or searches
US10083464B1 (en) * 2015-04-27 2018-09-25 Google Llc System and method of detection and recording of realization actions in association with content rendering
US11366872B1 (en) * 2017-07-19 2022-06-21 Amazon Technologies, Inc. Digital navigation menus with dynamic content placement
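KR102247067B1 (ko) * 2019-03-28 2021-05-03 네이버클라우드 주식회사 Method, apparatus, and computer program for processing URLs collected from websites
CN110908972B (zh) * 2019-11-19 2022-09-02 加和(北京)信息科技有限公司 Log data preprocessing method and apparatus, electronic device, and storage medium
KR102588127B1 (ko) * 2021-04-02 2023-10-12 (주)피큐레잇 Personalized content curation system based on bookmark history and content suggestion method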
US20230176902A1 (en) * 2021-12-08 2023-06-08 Jpmorgan Chase Bank, N.A. System and method for automated onboarding

Family Cites Families (76)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0380239A3 (en) 1989-01-18 1992-04-15 Lotus Development Corporation Search and retrieval system
US5664111A (en) 1994-02-16 1997-09-02 Honicorp, Inc. Computerized, multimedia, network, real time, interactive marketing and transactional system
US5822749A (en) 1994-07-12 1998-10-13 Sybase, Inc. Database system with methods for improving query performance with cache optimization strategies
US5812996A (en) 1994-07-12 1998-09-22 Sybase, Inc. Database system with methods for optimizing query performance with a buffer manager
EP0792493B1 (en) 1994-11-08 1999-08-11 Vermeer Technologies, Inc. An online service development tool with fee setting capabilities
US6029195A (en) 1994-11-29 2000-02-22 Herz; Frederick S. M. System for customized electronic identification of desirable objects
JP3282937B2 (ja) 1995-01-12 2002-05-20 日本アイ・ビー・エム株式会社 Information retrieval method and system
US5809144A (en) 1995-08-24 1998-09-15 Carnegie Mellon University Method and apparatus for purchasing and delivering digital goods over a network
US5794210A (en) 1995-12-11 1998-08-11 Cybergold, Inc. Attention brokerage
US5826260A (en) 1995-12-11 1998-10-20 International Business Machines Corporation Information retrieval system and method for displaying and ordering information based on query element contribution
US5778367A (en) 1995-12-14 1998-07-07 Network Engineering Software, Inc. Automated on-line information service and directory, particularly for the world wide web
US5799284A (en) 1996-03-13 1998-08-25 Roy E. Bourquin Software and hardware for publishing and viewing products and services for sale
US5850442A (en) 1996-03-26 1998-12-15 Entegrity Solutions Corporation Secure world wide electronic commerce over an open network
US6243691B1 (en) * 1996-03-29 2001-06-05 Onsale, Inc. Method and system for processing and transmitting electronic auction information
US5826261A (en) 1996-05-10 1998-10-20 Spencer; Graham System and method for querying multiple, distributed databases by selective sharing of local relative significance information for terms related to the query
JP3108015B2 (ja) * 1996-05-22 2000-11-13 松下電器産業株式会社 Hypertext retrieval device
US5802515A (en) 1996-06-11 1998-09-01 Massachusetts Institute Of Technology Randomized query generation and document relevance ranking for robust information retrieval from a database
US5915249A (en) 1996-06-14 1999-06-22 Excite, Inc. System and method for accelerated query evaluation of very large full-text databases
US5864845A (en) 1996-06-28 1999-01-26 Siemens Corporate Research, Inc. Facilitating world wide web searches utilizing a multiple search engine query clustering fusion strategy
US5864846A (en) 1996-06-28 1999-01-26 Siemens Corporate Research, Inc. Method for facilitating world wide web searches utilizing a document distribution fusion strategy
US5987460A (en) 1996-07-05 1999-11-16 Hitachi, Ltd. Document retrieval-assisting method and system for the same and document retrieval service using the same with document frequency and term frequency
US5913208A (en) * 1996-07-09 1999-06-15 International Business Machines Corporation Identifying duplicate documents from search results without comparing document content
US5862223A (en) 1996-07-24 1999-01-19 Walker Asset Management Limited Partnership Method and apparatus for a cryptographically-assisted commercial network system designed to facilitate and support expert-based commerce
EP0822502A1 (en) * 1996-07-31 1998-02-04 BRITISH TELECOMMUNICATIONS public limited company Data access system
US5864863A (en) 1996-08-09 1999-01-26 Digital Equipment Corporation Method for parsing, indexing and searching world-wide-web pages
US5842206A (en) 1996-08-20 1998-11-24 Iconovex Corporation Computerized method and system for qualified searching of electronically stored documents
US5819255A (en) 1996-08-23 1998-10-06 Tandem Computers, Inc. System and method for database query optimization
EP0829811A1 (en) 1996-09-11 1998-03-18 Nippon Telegraph And Telephone Corporation Method and system for information retrieval
US6253188B1 (en) 1996-09-20 2001-06-26 Thomson Newspapers, Inc. Automated interactive classified ad system for the internet
US5870740A (en) 1996-09-30 1999-02-09 Apple Computer, Inc. System and method for improving the ranking of information retrieval results for short queries
US5987446A (en) 1996-11-12 1999-11-16 U.S. West, Inc. Searching large collections of text using multiple search engines concurrently
US6032207A (en) 1996-12-23 2000-02-29 Bull Hn Information Systems Inc. Search mechanism for a queue system
US5950189A (en) 1997-01-02 1999-09-07 At&T Corp Retrieval system and method
US6285987B1 (en) 1997-01-22 2001-09-04 Engage, Inc. Internet advertising system
US6098065A (en) * 1997-02-13 2000-08-01 Nortel Networks Corporation Associative search engine
US5875446A (en) 1997-02-24 1999-02-23 International Business Machines Corporation System and method for hierarchically grouping and ranking a set of objects in a query context based on one or more relationships
US6016487A (en) 1997-03-26 2000-01-18 National Research Council Of Canada Method of searching three-dimensional images
US5950206A (en) 1997-04-23 1999-09-07 Krause; Gary Matthew Method and apparatus for searching and tracking construction projects in a document information database
US6006222A (en) * 1997-04-25 1999-12-21 Culliss; Gary Method for organizing information
US5924090A (en) 1997-05-01 1999-07-13 Northern Light Technology Llc Method and apparatus for searching a database of records
US5940821A (en) 1997-05-21 1999-08-17 Oracle Corporation Information presentation in a knowledge base search and retrieval system
US6012053A (en) 1997-06-23 2000-01-04 Lycos, Inc. Computer system with user-controlled relevance ranking of search results
US6233575B1 (en) 1997-06-24 2001-05-15 International Business Machines Corporation Multilevel taxonomy based on features derived from training documents classification using fisher values as discrimination values
JP3607462B2 (ja) * 1997-07-02 2005-01-05 松下電器産業株式会社 Automatic related-keyword extraction device and document retrieval system using the same
US5933822A (en) 1997-07-22 1999-08-03 Microsoft Corporation Apparatus and methods for an information retrieval system that employs natural language processing of search results to improve overall precision
US6081805A (en) * 1997-09-10 2000-06-27 Netscape Communications Corporation Pass-through architecture via hash techniques to remove duplicate query results
US5845278A (en) 1997-09-12 1998-12-01 Infoseek Corporation Method for automatically selecting collections to search in full text searches
US5903887A (en) 1997-09-15 1999-05-11 International Business Machines Corporation Method and apparatus for caching result sets from queries to a remote database in a heterogeneous database system
US5999929A (en) * 1997-09-29 1999-12-07 Continuum Software, Inc World wide web link referral system and method for generating and providing related links for links identified in web pages
US6026398A (en) 1997-10-16 2000-02-15 Imarket, Incorporated System and methods for searching and matching databases
US6006217A (en) 1997-11-07 1999-12-21 International Business Machines Corporation Technique for providing enhanced relevance information for documents retrieved in a multi database search
US5953718A (en) 1997-11-12 1999-09-14 Oracle Corporation Research mode for a knowledge base search and retrieval system
AU3292699A (en) * 1998-02-13 1999-08-30 Yahoo! Inc. Search engine using sales and revenue to weight search results
US6421675B1 (en) * 1998-03-16 2002-07-16 S. L. I. Systems, Inc. Search engine
US6154738A (en) 1998-03-27 2000-11-28 Call; Charles Gainor Methods and apparatus for disseminating product information via the internet using universal product codes
US6128623A (en) 1998-04-15 2000-10-03 Inktomi Corporation High performance object cache
US6212522B1 (en) 1998-05-15 2001-04-03 International Business Machines Corporation Searching and conditionally serving bookmark sets based on keywords
US6006225A (en) * 1998-06-15 1999-12-21 Amazon.Com Refining search queries by the suggestion of correlated terms from prior searches
US6401118B1 (en) * 1998-06-30 2002-06-04 Online Monitoring Services Method and computer program product for an online monitoring search engine
US6363377B1 (en) * 1998-07-30 2002-03-26 Sarnoff Corporation Search data processor
US6078866A (en) * 1998-09-14 2000-06-20 Searchup, Inc. Internet site searching and listing service based on monetary ranking of site listings
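JP3645431B2 (ja) 1998-10-02 2005-05-11 富士通株式会社 Information retrieval support device and storage medium for an information retrieval support program
KR100318015B1 (ko) 1998-10-22 2002-04-22 박화자 Construction of a concept map using hyperlink information of web documents and Internet search method using the same
DE19904261A1 (de) 1999-02-03 2000-08-10 Basf Ag Process for the preparation of dimethyl sulfite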
US20030110161A1 (en) * 1999-04-05 2003-06-12 Eric Schneider Method, product, and apparatus for providing search results
US7835943B2 (en) * 1999-05-28 2010-11-16 Yahoo! Inc. System and method for providing place and price protection in a search result list generated by a computer network search engine
US6269361B1 (en) 1999-05-28 2001-07-31 Goto.Com System and method for influencing a position on a search result list generated by a computer network search engine
KR100337810B1 (ko) 1999-11-06 2002-05-23 유진우 Search-specialized website on the Internet and search method thereof
US20020004735A1 (en) 2000-01-18 2002-01-10 William Gross System and method for ranking items
US7225151B1 (en) * 2000-01-27 2007-05-29 Brad S Konia Online auction bid management system and method
KR100382600B1 (ko) 2000-01-31 2003-05-01 주식회사 제이.이.씨 Method for providing an integrated web search service using a network system, and computer-readable recording medium recording the method
US20010051911A1 (en) * 2000-05-09 2001-12-13 Marks Michael B. Bidding method for internet/wireless advertising and priority ranking in search results
CA2415167C (en) * 2000-07-05 2017-03-21 Paid Search Engine Tools, L.L.C. Paid search engine bid management
HUP0002950A2 (hu) * 2000-07-27 2002-01-28 Tamás Lajtner Method for operating a system that works with paying data carriers, and system for this purpose
WO2002021292A1 (en) * 2000-09-01 2002-03-14 Search123.Com, Inc. Auction-based search engine
EP1535211A4 (en) * 2002-08-30 2006-08-23 Miva Inc SYSTEM AND METHOD FOR PAYING QUALITY ADVERTISING USING MULTIPLE ASSEMBLIES OF ADVERTISING LISTS

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2011044780A1 (zh) * 2009-10-13 2011-04-21 腾讯科技(深圳)有限公司 Search engine system and information search method
RU2534006C2 (ru) * 2009-10-13 2014-11-27 Тенсент Текнолоджи (Шэньчжэнь) Компани Лимитед Search engine system and information search method
CN103092856A (zh) * 2011-10-31 2013-05-08 阿里巴巴集团控股有限公司 Search result ranking method and device, and search method and device
CN103092856B (zh) * 2011-10-31 2015-09-23 阿里巴巴集团控股有限公司 Search result ranking method and device, and search method and device
CN103092990A (zh) * 2013-02-14 2013-05-08 张康德 Search engine method for a database
CN105426536A (zh) * 2015-12-21 2016-03-23 北京奇虎科技有限公司 Method and device for presenting automobile search result pages
CN110543310A (zh) * 2019-08-08 2019-12-06 山东中创软件商用中间件股份有限公司 JSP compilation method, apparatus, device, and storage medium

Also Published As

Publication number Publication date
GB2388678B (en) 2004-12-01
WO2001090947A1 (en) 2001-11-29
AU6327501A (en) 2001-12-03
EP1297453B1 (en) 2010-04-21
CA2409642A1 (en) 2001-11-29
AU2008202363A1 (en) 2008-06-19
GB0229482D0 (en) 2003-01-22
ATE465470T1 (de) 2010-05-15
CA2409642C (en) 2010-11-02
KR100699977B1 (ko) 2007-03-27
US20050240557A1 (en) 2005-10-27
KR20050071717A (ko) 2005-07-07
CN1430751B (zh) 2010-05-26
KR100719009B1 (ko) 2007-05-17
EP1297453A4 (en) 2005-08-17
GB2388678A (en) 2003-11-19
DE60141904D1 (de) 2010-06-02
DE10196212T1 (de) 2003-08-21
AU2001263275B2 (en) 2005-07-21
JP2003534602A (ja) 2003-11-18
KR20030003739A (ko) 2003-01-10
AU2011201608A1 (en) 2011-04-28
US6876997B1 (en) 2005-04-05
AU2005225097A1 (en) 2005-11-10
US20100106706A1 (en) 2010-04-29
EP1297453A1 (en) 2003-04-02
US7657555B2 (en) 2010-02-02
JP3860036B2 (ja) 2006-12-20

Similar Documents

Publication Publication Date Title
CN1430751B (zh) Method and apparatus for identifying related searches in a database search system
AU2001263275A1 (en) Method and apparatus for identifying related searches in a database search system
US6311194B1 (en) System and method for creating a semantic web and its applications in browsing, searching, profiling, personalization and advertising
JP5033221B2 (ja) Electronic document repository management and access system
US20150199445A1 (en) Methods and systems for enhancing metadata
US20010044758A1 (en) Methods and systems for enabling efficient search and retrieval of products from an electronic product catalog
US20030018607A1 (en) Method of enabling browse and search access to electronically-accessible multimedia databases
JP2003228676A (ja) Pay-for-placement search system and method enabling advertisers to manage search listings using grouping
Pant et al. Panorama: extending digital libraries with topical crawlers
WO2007103191A2 (en) Comparative web search
US20090265321A1 (en) Internet book marking and search results delivery
US20040015485A1 (en) Method and apparatus for improved internet searching
Wen et al. A multi-paradigm querying approach for a generic multimedia database management system
CN1336610A (zh) Advertising method and system for online commercial transactions
Hu et al. World wide web search technologies
AU768160B2 (en) Method of enabling browse and search access to electronically-accessible multimedia databases
WO2001075681A1 (en) Method, apparatus, and system for creating and maintaining a shared hierarchical directory system
Svidzinska A world wide web meta search engine using an automatic query routing algorithm
Milios Information Retrieval by Semantic Similarity
WO2001075656A1 (en) Method, apparatus, and system for creating and maintaining a shared hierarchical directory system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C41 Transfer of patent application or patent right or utility model
TA01 Transfer of patent application right

Effective date of registration: 20090227

Address after: California, USA

Applicant after: Yahoo Corp.

Address before: California, USA

Applicant before: Overture Services Inc.

ASS Succession or assignment of patent right

Owner name: YAHOO! CO., LTD.

Free format text: FORMER OWNER: OVERTURE SERVICES INC.

Effective date: 20090227

C14 Grant of patent or utility model
GR01 Patent grant
ASS Succession or assignment of patent right

Owner name: FEIYANG MANAGEMENT CO., LTD.

Free format text: FORMER OWNER: YAHOO CORP.

Effective date: 20150331

TR01 Transfer of patent right

Effective date of registration: 20150331

Address after: Tortola, British Virgin Islands

Patentee after: Fly upward Management Co., Ltd

Address before: California, USA

Patentee before: Yahoo Corp.

CX01 Expiry of patent term

Granted publication date: 20100526
