CN101025737B - Attention degree based same source information search engine aggregation display method - Google Patents

Attention degree based same source information search engine aggregation display method

Info

Publication number
CN101025737B
Authority
CN
China
Prior art keywords
search
same
information
content
source
Prior art date
Application number
CN 200610007905
Other languages
Chinese (zh)
Other versions
CN101025737A (en
Inventor
王东
Original Assignee
王东
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 王东
Priority to CN 200610007905
Publication of CN101025737A
Application granted
Publication of CN101025737B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation

Abstract

The invention relates to an attention-degree-based method and system for the aggregated display of same-source information in a search engine. The search engine finds all target websites that meet the query conditions as the original search results; aggregates the original search results into a single title search result according to factors such as content quality, the account information of purchasers of display weighting power, and quality of service; and shows only the title search result to the inquirer as the final search result, showing the full set of results only when the inquirer asks to view them. The system uses a counting server that cooperates with the web browser: the browser converts all of the user's operations into a PageFocus score for a web page and transmits it back to the counting server to express the quality of the page's content, which gives the search engine a basis for selecting the "title search result" and for ordering the displayed results. The invention also relates to a method that automatically judges the user's state and provides a web page style and content suited to it.

Description

Attention degree based same source information search engine aggregation display method

Technical field

[0001] The present invention relates to computer network technology, and in particular to search engine technology that uses computers to provide search services on the Internet or on a corporate intranet. The invention further relates to a system for obtaining the degree of user attention paid to web pages, and to an apparatus and method for adapting the style of website content.

Background art

[0002] The Internet currently holds a large number of "web pages or network services of the same (or similar) origin", for example: 1. articles, opinions, and informational pages written by the same person or organization and copied in large numbers; 2. news report pages based on interviews conducted (or released) by the same person or organization and copied in large numbers; 3. reposts of posts made by the same person or organization on BBS forums; 5. multimedia files in different data formats and compression ratios produced by the same person or organization; 6. executable programs, data, and design files produced by the same person or organization; 7. information content produced in other ways and widely copied. Current search engines list each of these pages or services separately in their results, where they take up a great deal of space with near-identical content and make browsing inconvenient for the inquirer.

[0003] Current search engines and web page ranking services measure the popularity of a web page only by click traffic and page dwell time. The main approaches are: 1) Search engines: page popularity is computed from inquirers' clicks on search results, e.g. Google and Baidu. 2) Alexa-style site rankings: toolbar software embedded in the browser sends the user's hyperlink clicks and page dwell time back to a server (the parameters include the current page address and the time the page was opened), but no other evaluation method is included. For how Alexa works, see:

[0004] http://www.singtaonet.com/it/it sp/t20051110 43674.html,

[0005] http://www.people.com.cn/GB/it/8219/41552/41597/3109586.html.

[0006] Websites today fall into the following categories:

[0007] Category 1: all site content has the same style and content for every user at a given moment (e.g. news sites).

[0008] Category 2: the site can display different styles and content according to the user's settings (e.g. Google's news site).

[0009] However, these websites cannot present a different display style and content in real time according to the user's current state.

Summary of the invention

[0010] To remedy the shortcomings described above, the present invention provides a search method that takes search results which, because their content is identical, have the same practical value to the searcher and aggregates them into a single record, the "title search result", together with an apparatus and method for expanding the other results on demand. It also provides an apparatus and method that automatically spread clicks on the "title search result" across the other search result targets, so that the target server behind the "title search result" is not overloaded and brought down by frequent clicks. The invention further provides a system built on a web browser that cooperates with a statistics server on the network: the browser converts all of the user's operations into a score for the web page and sends it back to the statistics server as a rating of the attention paid to that page, which can then serve as a ranking method and tool for search engines. The invention also provides a method that uses whatever information can be obtained about the user's environment and state to present different display styles and content to users in different states at the same moment, within the same website, and even within the same page.

[0011] To achieve the above objects, a search method that aggregates and displays same-source information sites in a search engine comprises the following steps:

[0012] (1) The inquirer accesses the search engine through a Web browser or application software and enters the keywords to be queried;

[0013] (2) The search engine finds all target sites that meet the conditions as the original search results;

[0014] (3) The "same-source information processing module" looks up the account information of purchasers of the right to "become the title search result" and, combined with other decision rules, selects from the original search results the item to serve as the "title search result";

[0015] (4) The search engine's Web server or application server presents only the selected "title search result" to the inquirer as the search result, together with a "button" meaning "expand to view details or other information";

[0016] (5) The inquirer may also press the corresponding "button", whereupon the search engine shows the original search results found in step (2).

[0017] The "same-source information processing module" is composed of several "same-source information processing modules (one per information type)", for example: the "same-source web page processing module", "same-source multimedia processing module", "same-source picture processing module", "same-source document processing module", "same-source software processing module", "same-source data or database processing module", "same-source GIS information processing module", "same-value network service processing module", "same-value commercial information processing module", and so on.

[0018] The "same-source information processing module" works in the following steps:

[0019] (1) First, the "information type determination module" determines the type of the information received by the network searcher;

[0020] (2) Information of the same type is sent together to the "same-source information processing module (for that information type)";

[0021] (3) The search information processed by that module is archived into the "non-same-source result information database (for that information type)" or the "same-source result information database (for that information type)";

[0022] (4) The system publishes the "non-same-source result information database" and the "same-source result information database" to the Web server for inquirers to query. As an alternative implementation, a query service based on dynamic web pages can be provided to inquirers directly from these two databases.

[0023] The "same-source web page processing module" processes web page information in the following steps:

[0024] (1) When the "search engine search part" receives the keyword to be queried, the "decision unit for whether the search results have already been published on the Web server" first determines whether the keyword has recently been queried by someone else. If it has, and the results have already been published on the "search engine search results Web server", the search results are returned directly. In those results, web pages of the same origin have already been aggregated into a single search result; after clicking the "same-source pages" button, the inquirer can see on the "search engine search results Web server" another results page containing all of the search results, which completes the query;

[0025] (2) If, when the "search engine search part" receives the keyword, that decision unit determines that the keyword has not recently been queried by anyone else and that no corresponding results have been published on the "search engine search results Web server", then:

[0026] A. The "web page searcher" is started to search the "non-same-source web page result database" and the "same-source web page result database" for the addresses of web pages matching the search keyword and to fetch the content of those pages;

[0027] B. If the "web page searcher" finds no matching page address in either database, the inquirer is told that "no web pages meet the conditions", and the search keyword is added to the next round of updates of the two databases. If a matching page address is found during the update, it is stored in the "non-same-source web page result database" or the "same-source web page result database" according to whether it has same-source pages, so that the next person who searches for the same keyword can find a result;

[0028] (3) The "web page content separator" decomposes the content of the pages found, and their hyperlink targets, into types such as multimedia, pictures, text, and hyperlinks;

[0029] (4) The content decision units each produce a decision result:

[0030] A. The "multimedia content decision unit" produces the target page's "degree of same multimedia files, SMS (Same Media Score)";

[0031] B. The "picture content decision unit" produces the target page's "degree of same pictures, SPS (Same Photo Score)";

[0032] C. The "text content decision unit" produces the target page's "degree of same text, STS (Same Text Score)";

[0033] D. The "link content decision unit" produces the target page's "degree of same hyperlinks, SHS (Same Hyperlinks Score)";

[0034] (5) The "multimedia decision weight SMP", "picture decision weight SPP", "text decision weight STP", and "link decision weight SHP" are obtained from the "same-source web page decision rule base" and multiplied, respectively, by the SMS, SPS, STS, and SHS values generated in step (4);

[0035]-[0036] (6) The products obtained in step (5) are added to give the page's "degree of same source, SSS (Same Source Score)": SSS = (SMS*SMP) + (SPS*SPP) + (STS*STP) + (SHS*SHP);

[0037] (7) The page's SSS is compared with a threshold: if it exceeds the threshold, the page is judged to be a "same-source page" with respect to the other pages; if not, it is judged to be a "non-same-source page";

[0038] (8) The "non-same-source pages" produced in step (7) are stored by the "non-same-source web page processing module" in the "non-same-source web page result database"; the "same-source pages" produced in step (7) are stored by the "same-source web page processing module" in the "same-source web page result database";

[0039] (9) The "search results page publisher" generates static search result pages from the contents of the "same-source web page result database" and the "non-same-source web page result database" and publishes them to the "search engine search results Web server", from which they are presented to the querying user through the browser;

[0040] (10) As an alternative implementation of step (9), the results can be presented to the querying user directly through the browser by a "dynamic web page Web server".
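Steps (5) to (7) amount to a weighted sum followed by a threshold test. The following is a minimal Python sketch of that calculation only; the class names, function names, and the numeric values in the usage note are illustrative assumptions, since the text does not specify the weights or the threshold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SameScores:
    """Per-page scores from the four content decision units (step 4)."""
    sms: float  # Same Media Score
    sps: float  # Same Photo Score
    sts: float  # Same Text Score
    shs: float  # Same Hyperlinks Score


@dataclass(frozen=True)
class DecisionWeights:
    """Weights read from the decision rule base (step 5)."""
    smp: float  # multimedia decision weight
    spp: float  # picture decision weight
    stp: float  # text decision weight
    shp: float  # link decision weight


def same_source_score(s: SameScores, w: DecisionWeights) -> float:
    """SSS = (SMS*SMP) + (SPS*SPP) + (STS*STP) + (SHS*SHP), as in step (6)."""
    return s.sms * w.smp + s.sps * w.spp + s.sts * w.stp + s.shs * w.shp


def is_same_source(s: SameScores, w: DecisionWeights, threshold: float) -> bool:
    """Step (7): a page is 'same-source' only if SSS exceeds the threshold."""
    return same_source_score(s, w) > threshold
```

For example, with all four scores equal to 1 and all four weights equal to 0.25, the SSS is 1.0, which exceeds a hypothetical threshold of 0.5 but not a threshold of 1.0.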

[0041] The "same-source information processing module" may also work in the following steps:

[0042] (1) The inquirer's search keyword is received, and software determines from the keyword content and keyword syntax which files or network services need to be found;

[0043] (2) It is determined whether the content being searched for has already been published on the Web server. If the search target has already been published on the "search engine search results Web server", the search results are returned directly; in them, the access entries of files or network services that match the search conditions and share the same origin have already been aggregated into a single "title search result". After clicking the "same-source files" button, the inquirer can see on the "search engine search results Web server" another page containing all of the search results, so that every result matching the query is visible, and the search is complete. If the search target has not been published there, processing continues from step (3);

[0044] (3) The inquirer is shown the prompt "no results meet the conditions";

[0045] (4) The search keyword is added to the next round of updates of the "same-source information index database" and the "non-same-source information index database", and the update process for the two databases is started periodically;

[0046] (5) The update process for the "same-source information index database" and the "non-same-source information index database":

[0047] A. The searcher looks for target files or service entries newly appearing on web pages, and software follows each entry to obtain the file or network service;

[0048] B. The "content decision unit" determines whether the newly found information has the same content as an entry in the current "same-source information index database". If "yes", it is added as a new element to that category of the "same-source information index database"; if "no", the "content decision unit" determines whether it has the same content as an entry in the current "non-same-source information index database";

[0049] C. If "yes": a new category is created for the current information and the same-source information already stored in the "non-same-source information index database", and all of it is moved to the "same-source information index database";

[0050] D. If "no": a new category is created for the current information, and it is stored in the "non-same-source information index database";

[0051] (6) The "search results page publisher" generates static search result pages from the contents of the "same-source web page result database" and the "non-same-source web page result database" and publishes them to the "search engine search results Web server", from which they are presented through the browser to inquirers who come to search;

[0052] (7) As an alternative implementation of step (6), the results can be presented to the querying user directly through the browser by a "dynamic web page Web server".

[0053] When the same-source information processing module processes documents, the update process for the "same-source information index database" and the "non-same-source information index database" is:

[0054] A. The "document searcher" looks for document files or link entries newly appearing on web pages, and software follows each entry to obtain the file or service;

[0055] B. The "text content decision unit" and the "picture content decision unit" determine whether the newly found document has the same content as an entry in the current "same-source document index database". If "yes", it is added as a new element to that category of the "same-source document index database"; if "no", the "document content decision unit" determines whether it has the same content as an entry in the current "non-same-source document index database";

[0056] C. If "yes": a new category is created for the current document and the same-source documents already stored in the "non-same-source document index database", and all of them are moved to the "same-source document index database"; if "no": a new category is created for the current document, and it is stored in the "non-same-source document index database";

[0057] The content decision module works in the following steps:

[0058] (1) Receive the "objects to be judged": multimedia from multiple sources may be received, and the number of objects to be judged is recorded as InputQuantity;

[0059] (2) Look up a predefined comparable attribute of the objects and record the number of objects that have the same value for the current attribute, SameQuantity;

[0060] (3) Input the "weight" value Power of the current attribute in the decision process;

[0061] (4) Compute the degree of agreement of all objects on the current attribute: PSame = SameQuantity * Power;

[0062] (5) Return to (1) and perform (1) to (4) for the next attribute to obtain its PSame, until the PSame values of all attributes have been obtained;

[0063] (6) Compute and return the same-content degree of the objects: SameMediaPower = (sum of all PSame values) / InputQuantity.
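The loop in steps (1) to (6) can be sketched in Python as follows. The text does not say how SameQuantity is counted when several different values are each shared by some objects, so this sketch assumes it is the size of the largest group sharing one value (and 0 when no two objects agree). The function names and the dictionary representation of objects are hypothetical.

```python
from collections import Counter
from typing import Any, Mapping, Sequence


def same_quantity(values: Sequence[Any]) -> int:
    """Assumed reading of SameQuantity: size of the largest group of objects
    sharing one value for the attribute, or 0 if no two objects agree."""
    if not values:
        return 0
    largest_group = Counter(values).most_common(1)[0][1]
    return largest_group if largest_group > 1 else 0


def same_media_power(
    objects: Sequence[Mapping[str, Any]],
    attribute_weights: Mapping[str, float],
) -> float:
    """SameMediaPower = (sum over attributes of SameQuantity * Power) / InputQuantity."""
    input_quantity = len(objects)
    if input_quantity == 0:
        return 0.0
    total_psame = 0.0
    for attribute, power in attribute_weights.items():
        # Only objects that actually carry the attribute take part in the comparison.
        values = [obj[attribute] for obj in objects if attribute in obj]
        total_psame += same_quantity(values) * power  # PSame for this attribute
    return total_psame / input_quantity
```

With four objects of which three share a duration (weight 2.0) and two share a codec (weight 1.0), the result is (3*2.0 + 2*1.0) / 4 = 2.0.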

[0064] When the content decision module is a text content decision unit, it works in the following steps:

[0065] (1) Find the total length SameLenth of the portions of the text contents that contain the same words or sentences;

[0066] (2) Among the input text contents, find the length MinLenth of the shortest input text;

[0067] (3) Return the text similarity value SameTextPower = SameLenth / MinLenth.
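A rough Python sketch of this ratio for two texts is shown below. The text does not say how "the same words or sentences" are located; as a stand-in, this sketch uses the character-level matching blocks found by the standard library's `difflib.SequenceMatcher`, which is an assumption and not the patent's own matching procedure. The function name is hypothetical.

```python
from difflib import SequenceMatcher


def same_text_power(text_a: str, text_b: str) -> float:
    """SameTextPower = SameLenth / MinLenth for two input texts."""
    min_length = min(len(text_a), len(text_b))  # MinLenth
    if min_length == 0:
        return 0.0
    matcher = SequenceMatcher(None, text_a, text_b, autojunk=False)
    # SameLenth: total size of the matching blocks (the final sentinel block has size 0).
    same_length = sum(block.size for block in matcher.get_matching_blocks())
    return same_length / min_length
```

Because the ratio is taken against the shortest text, a short text fully contained in a longer one scores 1.0.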

[0068] When the content decision module is a link content decision unit, it works in the following steps:

[0069] (1) Receive the "objects to be judged": the URL addresses of several hyperlinks;

[0070] (2) Compute the similarity of the objects: SameURLPower = the number of target URL addresses that appear on every one of the pages pointed to by the hyperlinks being judged;

[0071] (3) Return SameURLPower.
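Counting the URLs that appear on every judged page is a set intersection. A minimal Python sketch, assuming the outgoing links of each page have already been extracted into a set (the extraction itself is not shown, and the function name is hypothetical):

```python
from typing import Iterable, Set


def same_url_power(links_per_page: Iterable[Set[str]]) -> int:
    """Number of target URLs that occur on every one of the judged pages."""
    pages = list(links_per_page)
    if not pages:
        return 0
    common = set(pages[0])
    for links in pages[1:]:
        common &= links  # keep only URLs also present on this page
    return len(common)
```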

[0072] When the content decision module is a commercial information content decision unit, it works in the following steps:

[0073] (1) Check whether the commercial information items being compared concern the same product or service. If "no", return "inconsistent"; if "yes", go to step (2).

[0074] (2) Determine whether the commercial information being compared is location-sensitive. If "no", return the result "consistent"; if "yes", go to step (3).

[0075] (3) Determine whether the providers of the commercial information being compared are in the same city or region. If "no", return the result "inconsistent"; if "yes", return the result "consistent".
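The three checks form a short decision chain, sketched here in Python for a pair of items. The record fields and names are hypothetical, and the sketch assumes that a comparison counts as location-sensitive if either item is; the text does not settle that point.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommercialInfo:
    product: str              # product or service being offered
    location_sensitive: bool  # whether the offer depends on where the provider is
    region: str               # provider's city or region


def commercial_info_consistent(a: CommercialInfo, b: CommercialInfo) -> bool:
    """Return True for 'consistent' and False for 'inconsistent'."""
    # Step (1): different product or service means inconsistent.
    if a.product != b.product:
        return False
    # Step (2): if location does not matter, the items are consistent.
    if not (a.location_sensitive or b.location_sensitive):
        return True
    # Step (3): location matters, so the providers must share a city or region.
    return a.region == b.region
```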

[0076] The "title search result" is selected as follows:

[0077] (1) Compute, for each "same-source search result", the probability weight PWn of becoming the "title search result":

[0078] PWn = TP * PageFocus / (RespDelay - K)

[0079] n: the search result is the n-th item

[0080] When (RespDelay - K) is less than or equal to zero, (RespDelay - K) is taken to be 1

[0081] PageFocus: page attention value

[0082] RespDelay: response delay of the page's service

[0083] K: service response constant; a value of 50 milliseconds (ms) is recommended.

[0084] TP: title search result right

[0085] (2) Sum the probability weights PWn of all original "same-source search results" to obtain the total probability weight PWall;

[0086] (3) Compute the probability that each "same-source search result" becomes the "title search result": Pn = PWn / PWall;

[0087] (4) As searchers access the results, the "title search result" is chosen dynamically at random according to the probabilities Pn and presented to the searcher.
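Steps (1) to (4) describe a weighted random choice. The Python sketch below uses the main formula PWn = TP * PageFocus / (RespDelay - K) with the denominator replaced by 1 when it is not positive. Two details are assumptions not covered by the text: negative weights (PageFocus can be negative) are clamped to zero, and a uniform choice is used when every weight is zero. All names are hypothetical.

```python
import random
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class SameSourceResult:
    url: str
    tp: float             # title search result right
    page_focus: float     # page attention value
    resp_delay_ms: float  # response delay of the page's service


def probability_weight(r: SameSourceResult, k_ms: float = 50.0) -> float:
    """PWn = TP * PageFocus / (RespDelay - K), denominator taken as 1 if <= 0."""
    denominator = r.resp_delay_ms - k_ms
    if denominator <= 0:
        denominator = 1.0
    return r.tp * r.page_focus / denominator


def selection_probabilities(results: Sequence[SameSourceResult]) -> List[float]:
    """Pn = PWn / PWall, with negative weights clamped to 0 (assumption)."""
    weights = [max(0.0, probability_weight(r)) for r in results]
    pw_all = sum(weights)
    if pw_all == 0:
        return [1.0 / len(results)] * len(results)  # uniform fallback (assumption)
    return [w / pw_all for w in weights]


def pick_title_result(results: Sequence[SameSourceResult], rng=random) -> SameSourceResult:
    """Draw one result at random according to the probabilities Pn."""
    probabilities = selection_probabilities(results)
    return rng.choices(list(results), weights=probabilities, k=1)[0]
```

Drawing a fresh title result on each access is what spreads clicks across the same-source targets in proportion to Pn instead of sending them all to one server.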

[0088] The probability weight PWn of the "title search result" may also be computed as:

[0089] a. PWn = (TP + PageFocus) / (RespDelay - K), or

[0090] b. PWn = (TP + PageFocus) / RespDelay / K, or

[0091] c. PWn = TP * PageFocus / RespDelay / K.

[0092] The "same-source information processing module":

[0093] A. may be embedded in the search engine;

[0094] B. may be placed between the "search engine" and the "search engine search results Web server";

[0095] C. may also be placed, as a preprocessing module, between the "search engine" and the sites being searched.

[0096] The button meaning "expand to view details or other information" may be a hyperlink or any kind of software interface control.

[0097] A system for obtaining the degree of user attention to web page search results comprises a PageFocus network server, a PageFocus web browser, and a web page scoring server.

[0098] The PageFocus network server comprises a PageFocus browser ID registration server, the PageFocusAccServer web page attention statistics server, a PageFocus browser online upgrade server, and a data encryption/decryption module;

[0099]-[0100] The PageFocus web browser comprises a PageFocus browser ID registration module and a PageFocus attention score calculation module.

[0101] It works in the following steps:

[0102] (1) Each "PageFocus web browser" either has a globally unique ID number at installation time or, when in use, actively seeks out the "PageFocus browser ID registration server" on the network to obtain a globally unique ID number;

[0103] (2) The "PageFocus web browser" has the functions of a conventional web browser. It also converts the user's operations on the browser and on web pages, according to weights, into the page's "PageFocus attention score", forms a "PageFocus data packet", and transmits it in encrypted form over a network protocol to this search engine's "PageFocusAccServer web page attention statistics server";

[0104] (3) On receiving a "PageFocus data packet" from any "PageFocus web browser" worldwide, the "PageFocusAccServer web page attention statistics server" adds the "PageFocus attention score" it contains to the corresponding web page;

[0105] (4) The "PageFocusAccServer web page attention statistics server" holds the "PageFocus attention score" of every web page worldwide. Through various kinds of processing, this information can provide the basis on which a search engine ranks pages, the basis on which a search engine selects the "title search result" from among results with the same content, or can be published directly as a "web page popularity ranking" service.

[0106] The PageFocusAccServer web page attention statistics server may record scores using logarithms or scientific notation.

[0107] The PageFocus data packet may be formed when the browser completely closes the page, at timed intervals, or once the accumulated score reaches a certain value.

[0108] The PageFocus attention score is formed according to the weights listed in the following table:

[0109] Figure CN101025737BD00131

[0110] Figure CN101025737BD00141

[0111] Figure CN101025737BD00151

This page and pages whose titles carry a sequence-number meaning, such as "1214", and whose targets all point to the same URL directory are usually the paginated display of a single article. Whatever score (including a negative score) any page of the article receives, the other pages receive the same score even if they were never opened.

1. The browser offers a right-click menu on any part of a web page, containing the options "Vote 10 points", "Vote 5 points", "Vote 1 point", "Vote -1 point", "Vote -5 points", and "Vote -10 points".

2. When the user votes on the background of the current page through the right-click menu, the page's PageFocus score = "current PageFocus score" * weight.

3. When the user votes on a page element of the current page through the right-click menu, the page that the element's hyperlink points to receives PageFocus score = the current page's "current PageFocus score" * weight.

[0112] Notes:

[0113] 1. The weight values in the table are one embodiment; other values may also be used, and all fall within the scope of the invention.

[0114] The text reading speed is calculated as follows:

[0115] A. Mouse wheel scrolling: text reading speed = (display area width / font width) * number of text lines per scroll / time interval between scrolls;

[0116] B. Keyboard paging: text reading speed = (display area width / font width) * number of text lines per page turn / time interval between page turns;

[0117] C. Window scroll bar scrolling: text reading speed = (display area width / font width) * number of text lines per scroll / time interval between scrolls.
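The three cases above share one formula and differ only in which user action defines a "step". A minimal sketch follows; the function and parameter names are illustrative assumptions and do not appear in the patent, and the units (pixels, seconds) are likewise assumed.

```python
def reading_speed(display_width: float, font_width: float,
                  lines_per_step: float, step_interval: float) -> float:
    """Characters read per unit time for one scroll or page-turn step.

    display_width / font_width approximates the characters on one line;
    lines_per_step is the number of text lines moved by one wheel scroll,
    page turn, or scroll-bar movement; step_interval is the time between
    two such steps.
    """
    if font_width <= 0 or step_interval <= 0:
        raise ValueError("font_width and step_interval must be positive")
    chars_per_line = display_width / font_width
    return chars_per_line * lines_per_step / step_interval
```

For example, an 800-pixel display area with 16-pixel characters, scrolled 3 lines every 2 seconds, gives 50 characters per line and a speed of 75 characters per second.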

[0118] The PageFocus data packet contains the following fields: PageFocus browser ID, web page URL, and the web page's PageFocus score.
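The packet fields listed here and the server-side accumulation described earlier can be sketched as follows. This is a toy in-memory model under stated assumptions: the class and method names are hypothetical, and the patent does not specify a storage layout or transport.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PageFocusPacket:
    browser_id: str  # PageFocus browser ID
    url: str         # web page URL
    score: float     # PageFocus score earned on that page (may be negative)


class PageFocusAccServer:
    """Hypothetical in-memory stand-in for the statistics server."""

    def __init__(self):
        self._totals = defaultdict(float)

    def receive(self, packet):
        # Add the score carried by the packet to the page it refers to.
        self._totals[packet.url] += packet.score

    def score(self, url):
        # Pages that never received a packet have a total of 0.
        return self._totals.get(url, 0.0)

    def popularity_ranking(self):
        # (url, total) pairs, highest accumulated score first.
        return sorted(self._totals.items(), key=lambda kv: kv[1], reverse=True)
```

The ranking method corresponds to the "web page popularity ranking" use of the accumulated scores; a production server would need persistence and protection against forged packets, which this sketch leaves out.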

[0119] When a web page that has "same-source pages" takes part in a search engine's page ranking, the sum of the user attention (PageFocus) scores earned by all of its "same-source pages" may be used as the ranking basis. That is: A. the "title search result" of a group of "same-source pages" may be ranked by the sum of the PageFocus scores earned by every page in the group; B. each individual page within a group of "same-source pages" may likewise be ranked by the sum of the PageFocus scores earned by every page in the group it belongs to.
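The summed ranking basis can be illustrated with a short sketch. The data shapes are assumptions made for the example: a mapping from URL to PageFocus score, and a mapping from a group name to the URLs of its same-source pages.

```python
def group_total(page_scores, group_urls):
    """Sum the PageFocus scores of every page in one same-source group."""
    return sum(page_scores.get(url, 0.0) for url in group_urls)


def rank_groups(page_scores, groups):
    """Return group names ordered by summed PageFocus score, highest first."""
    return sorted(
        groups,
        key=lambda name: group_total(page_scores, groups[name]),
        reverse=True,
    )
```

With scores `{"a1": 4, "a2": 3, "b1": 6}` and groups `{"A": ["a1", "a2"], "B": ["b1"]}`, group A totals 7 and ranks ahead of group B even though no single page in A outscores `b1`.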

[0120] A method for automatically determining the user's state and providing a suitable web page style and content, comprising the following steps:

[0121] (1) When the "website server cluster entrance" receives a user's first request for a page of the website, it first obtains the user's IP address from the access protocol or the IP-layer protocol;

[0122] (2) The IP address is looked up in the "IP address attribute database" to determine whether it is a "workplace IP address" or a "private or leisure IP address"; for a "workplace IP address" proceed to step (3), and for a "private or leisure IP address" proceed to step (4);

[0123] (3) The geographic location of the "workplace IP address" is obtained, together with the official local time of that region. If the region is currently within working hours, the visit is assigned to the "work-style server", which serves pages suited to the workplace; otherwise proceed to step (4);

[0124] (4) The visit is assigned to the "personal and leisure style server", which serves pages suited to personal and leisure use.
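Steps (2) through (4) amount to a small routing decision, sketched below. Several things are assumptions for illustration: the "IP address attribute database" is modeled as a plain dictionary, the caller is assumed to have already converted the time to the local time of the IP address's region, and working hours are taken as Monday through Friday, 9:00 to 18:00, as stated elsewhere in this description.

```python
from datetime import datetime

WORK_SERVER = "work-style server"
LEISURE_SERVER = "personal and leisure style server"


def is_working_time(local_time):
    # Monday..Friday are weekday() 0..4; hours from 9:00 up to 18:00.
    return local_time.weekday() < 5 and 9 <= local_time.hour < 18


def choose_server(ip, ip_attributes, local_time):
    # Step (2): is this a workplace address?
    if ip_attributes.get(ip) == "workplace":
        # Step (3): workplace address during local working hours.
        if is_working_time(local_time):
            return WORK_SERVER
    # Step (4): private addresses, unknown addresses, and off-hours visits.
    return LEISURE_SERVER
```

Unknown addresses fall through to the leisure server here; that default is a design choice of the sketch, since the text only describes the two known address classes.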

[0125] With the above scheme, search results that have the same content and the same use value to the searcher can be aggregated into a single record, the title search result, with devices and methods for expanding it to view the other results on demand. A device is also provided that automatically spreads clicks on a "title search result" across the other search result targets, so that frequent clicks do not overload and paralyze the target server. Beyond the capabilities of existing search engines, the invention can also search network services of all kinds, such as file sharing, FTP services, and P2P services, for "multimedia", "documents", "software", "software or hardware source code or design files", "data or databases", and "information".

[0126] A web browser that cooperates with a statistics server on the network converts all of the user's operations into a score for the page and sends it back to the statistics server as a measure of the attention the page receives, which can then serve as a ranking tool for the search engine.

[0127] With the adaptive website content and style method, users can be served as follows:

[0128] 1. Monday through Friday, 9:00 to 18:00, is working time; people in a working state need a concise, relatively rigorous style and content that relates to work as far as possible.

[0129] 2. Monday through Friday from 18:00 to 9:00 the next morning, and all day Saturday and Sunday, is leisure time; people in a leisure state need a lively, bustling, casual style and content.

[0130] 3. People in a workplace need a concise, relatively rigorous style and content that relates to work as far as possible.

[0131] 4. People at home or in leisure venues need a lively, bustling, casual style and content.

[0132] 5. People in other environments or states need a style and content suited to their environment and state at the time.

[0133] Brief Description of the Drawings

[0134] FIG. 1 is a system operating structure diagram of the same-source information site search engine aggregation display method;

[0135] FIG. 2 is an internal structure diagram of the same-source information processing module;

[0136] FIG. 3 is a flowchart of the same-source web page processing module;

[0137] FIG. 4 is a flowchart of the same-source multimedia processing module;

[0138] FIG. 5 is a flowchart of the same-source picture processing module;

[0139] FIG. 6 is a flowchart of the same-source document processing module;

[0140] FIG. 7 is a flowchart of the same-source software processing module;

[0141] FIG. 8 is a flowchart of the same-source data or database processing module;

[0142] FIG. 9 is a flowchart of the same-source GIS information processing module;

[0143] FIG. 10 is a flowchart of the same-value network service processing module;

[0144] FIG. 11 is a flowchart of the same-value commercial information processing module;

[0145] FIG. 12 is a system structure diagram for obtaining web page user attention;

[0146] FIG. 13 shows an existing conventional search engine website system without content and style adaptation;

[0147] FIG. 14 shows the search engine website system of the invention with content and style adaptation.

Detailed Description

[0148] The invention is further described below with reference to the drawings.

[0149] FIG. 1 is a system operating structure diagram of the same-source information site search engine aggregation display method.

Step 1: The inquirer accesses the search engine through a Web browser or application software and enters the keywords to be searched.

Step 2: The search engine finds all target sites that meet the conditions as the "original search results".

Step 3: The "same-source information processing module" looks up the account information of purchasers of the right to "become the title search result" and, combined with other decision rules, selects from the "original search results" the item to serve as the "title search result". The module may be deployed in any of three ways: A. embedded in the search engine; B. placed between the "search engine" and the "search engine search result Web server"; C. placed as a preprocessing module between the "search engine" and the searched sites.

Step 4: The search engine Web server or application server shows only the selected "title search result" to the inquirer as the search result, along with a "button" (a hyperlink or any software interface control) meaning "expand to view details or other information".

Step 5: Only when the inquirer wishes to expand a particular "title search result" and presses its corresponding "button" does the search engine show the "original search results" found in Step 2.
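The aggregation in Steps 2 through 5 can be sketched as grouping followed by selection. The sketch makes two loud assumptions: `group_of` is a hypothetical function that maps a result to its same-source group, and `priority` is a hypothetical function that folds the purchaser account information and the other decision rules into a single number. Neither is defined by the patent in this form.

```python
from dataclasses import dataclass, field


@dataclass
class AggregatedResult:
    title: str                                   # the "title search result"
    others: list = field(default_factory=list)   # shown only after "expand"


def aggregate(raw_results, group_of, priority):
    """Collapse the original search results into one entry per group."""
    groups = {}
    for url in raw_results:
        # Dictionaries keep insertion order, so groups appear in the
        # order their first member was found.
        groups.setdefault(group_of(url), []).append(url)

    aggregated = []
    for urls in groups.values():
        title = max(urls, key=priority)  # highest priority becomes the title
        others = [u for u in urls if u != title]
        aggregated.append(AggregatedResult(title, others))
    return aggregated
```

Only `title` would be rendered initially (Step 4); `others` holds what the "expand" button reveals (Step 5).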

[0150] FIG. 2 is an internal structure diagram of the same-source information processing module. The "same-source information processing module" is defined as follows. 1) Its main purpose is to determine whether, among the set of information nodes found for the search keywords, several nodes are merely duplicate sites of one or more common information sources (such sites have the same search value or use value to the inquirer and usually need not all be shown directly), to aggregate those duplicates into a single search result sent to the inquirer, and to present the other results only when the inquirer asks for other sites of equal value. 2) Unlike existing search engines, which concentrate on web page search, the module handles not only "HTML web pages" but also network services of all kinds for "multimedia", "documents", "software", "software or hardware source code or design files", "data or databases", and "information", for example file sharing, FTP services, and P2P services.

[0151] The "same-source information processing module" has a modular structure: each of its modules can be developed and deployed step by step as needed, the structure is extensible, and each module can further improve the accuracy of its automatic decisions. The modules include:

[0152] 1. "Information type determination module": determines the type of each piece of information and forwards information of the same type to the processing module for that type, such as the modules below.

[0153] 2. "Same-source web page processing module": identifies and processes found web pages that come from the same source and have the same value to the inquirer, for example HTML, ASP, JSP, and PHP pages and BBS forum content.

[0154] 3. "Same-source multimedia processing module": identifies and processes found multimedia files or network services that come from the same source and have the same value to the inquirer, for example audio and video files such as .MP3, .AVI, .WMV, .MPEG, .WAV, and .RM, as well as access ports of video services based on streaming media technology.

[0155] 4. "Same-source picture processing module": identifies and processes found pictures that come from the same source or have the same content and have the same value to the inquirer, for example .GIF, .JPG, .BMP, and .PNG files.

[0156] 5. "Same-source document processing module": identifies and processes found document files or network services in various formats that come from the same source, have the same or related content, and have the same value to the inquirer, for example .DOC, .TXT, .PDF, .XLS, and .PPT files.

[0157] 6. "Same-source software processing module": identifies and processes found application software installers that are the same software by the same author; they may be installers of the same or different versions, for the same or different operating systems.

[0158] 7. "Same-source data or database processing module": identifies and processes found data files or database files in known formats that come from the same source or have the same content and have the same value to the inquirer, for example .DAT, .XLS, .MDF, and .DBF files.

[0159] 8. "Same-source GIS information processing module": identifies and processes found digital map files or services that come from the same source or have the same content and have the same value to the inquirer.

[0160] 9. "Same-value network service processing module": identifies and processes found network services that come from the same source or have the same content and have the same value to the inquirer, for example FTP download services for the same file, IPTV services relaying the same television station, or mail services that each offer 1 GB of capacity.

[0161] 10. "Same-value commercial information processing module": identifies and processes found advertising content, published over the network for commercial products or services, that comes from the same source or has the same content, lies in the same geographic or administrative area, and has the same value to the inquirer, for example egg sales offered in the same neighborhood, haircut services offered in the same neighborhood, or telephone services usable in the same city.

"Information type determination module"

[0162] The "information type determination module" classifies collected information by type and sends it to the corresponding information processing module.

[0163] The information it processes comes in three main forms:

[0164] (1) Web page form: the information comes from the content of a website's pages, which may also contain hyperlinks to specific file types, for example "http://www.008.org.cn/up/the_quiet_american.mp3".

[0165] (2) Network service form: network service entry points offered by servers of all kinds, for example FTP file download services, seed services of P2P (Peer to Peer) software (such as BT download and eMule download), and news server services. A network service entry point can be learned in two ways:

[0166] A. Network services discoverable on web pages: entry points that can be learned by parsing page content.

[0167] B. Entry points or content submitted directly to this search engine by the network service provider.

[0168] (3) Data or database form: the search engine offers an information entry service directly to the network, and network users submit their own information, which is stored as data files or in a database. When the search engine is queried, information is extracted from this store to satisfy the inquirer.

[0169] The type of "web page form" information is determined as follows:

[0170] The web page itself can be output directly as a "web page" to the "same-source web page processing module" for processing. In addition, the "information type determination module" can follow the hyperlink syntax of the page language (for example HTML, Java, JSP, ASP, ASPX, or PHP) to parse out directly the file type each hyperlink points to, and can distinguish the information type according to the file type; see the following table for details.

[0171]

Figure CN101025737BD00191

[0172] Examples:

[0173] 1. A page contains the hyperlink "http://xxx/xxx/song.mp3": its target can be judged to be "multimedia" type information.

[0174] 2. A page contains the hyperlink "http://xxx/xxx/song.rar": after the target file is fetched and decompressed, it is found to contain only "song.mp3", so the target can still be judged to be "multimedia" type information.

[0175] 3. A page contains the hyperlink "http://xxx/xxx/song.rar": after the target file is fetched and decompressed, the number of files and directories it contains and the name and size of each file are found to match the installation disc of some known software, so it can be judged to be "software" type information.
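Examples 1 and 2 can be sketched as an extension lookup. The mapping table in the patent is an image that did not survive extraction, so the table below is an assumption assembled from the file extensions named in the module descriptions; `.xls` is listed there under both documents and data and is placed under documents here arbitrarily. The archive case takes a list of member names, because unpacking `.rar` files is outside the Python standard library. Example 3 (matching against a known installation disc) is not covered.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Hypothetical extension table; the patent's own table is not recoverable.
EXTENSION_TYPES = {
    ".mp3": "multimedia", ".avi": "multimedia", ".wmv": "multimedia",
    ".mpeg": "multimedia", ".wav": "multimedia", ".rm": "multimedia",
    ".gif": "picture", ".jpg": "picture", ".bmp": "picture", ".png": "picture",
    ".doc": "document", ".txt": "document", ".pdf": "document",
    ".xls": "document", ".ppt": "document",
    ".dat": "data", ".mdf": "data", ".dbf": "data",
    ".rar": "archive", ".zip": "archive",
}


def classify_link(url):
    """Classify a hyperlink target (or bare file name) by its extension."""
    suffix = PurePosixPath(urlparse(url).path).suffix.lower()
    return EXTENSION_TYPES.get(suffix, "unknown")


def classify_archive(member_names):
    """Classify an unpacked archive: one shared member type, else unknown."""
    types = {classify_link(name) for name in member_names}
    return types.pop() if len(types) == 1 else "unknown"
```

A link classified as "archive" signals that the file must be fetched and unpacked before `classify_archive` can be applied to its members.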

[0176] The type of "network service form" information is determined as follows:

[0177] Step 1: Access the service as an ordinary user to obtain its content.

[0178] Step 2: Classify the obtained content according to the following table.

[0179]

Figure CN101025737BD00201

[0180] Step 3: If what is obtained is a compressed file, its contents must be extracted and then classified according to Step 2.

[0181] The type of "data or database form" information is determined as follows:

[0182] Step 1: Access the data file or database to obtain its content.

[0183] Step 2: If the information obtained from the data file or database is a file, go directly to Step 4.

[0184] Step 3: If the information obtained from the data file or database is the location where a file is stored, that location must be accessed to obtain the target file.

[0185] Step 4: Classify the obtained content according to the following table.

[0186]

Figure CN101025737BD00202

[0187] Step 5: If what is obtained is a compressed file, its contents must be extracted and then classified according to Step 4.

"Same-source web page processing module"

[0188] FIG. 3 is a flowchart of the "same-source web page processing module". Its main function is to present the pages found for the search keywords that share the same main content as a single "title search result", while a button meaning "expand" shows all found pages with that main content. To maximize system performance, the following techniques are used:

[0189] Web page publishing: a "search result page publisher" publishes search results in advance to the "search engine search result Web server", which answers directly any search that has been made before, avoiding the heavy computation of generating dynamic pages from the database on each request.

[0190] The "same-source information processing module" stores its results by category in the "non-same-source web page result database" and the "same-source web page result database", and the "search result page publisher" periodically publishes them to the "search engine search result Web server", which avoids repeated computation and shortens waiting time.

[0191] The processing flow of the "same-source information processing module" is as follows:

[0192] Step 1: When the "search engine search part" receives the keywords to be searched, the "decider for search results already published on the Web server" first determines whether the keywords have recently been searched by someone else. If so, and the results are already published on the "search engine search result Web server", the search results are returned directly (see the "M1" mark in the figure). In those results, pages from the same source are already aggregated into a single search result; clicking the "same-source pages" button shows another results page on the "search engine search result Web server" containing all search results, which completes the query.

[0193] Step 2: If the decider determines that the keywords have not recently been searched by anyone else and no corresponding results are published on the "search engine search result Web server", then:

[0194] The "web page searcher" is started to search the "non-same-source web page result database" and the "same-source web page result database" for the addresses of pages matching the keywords, and the content of those pages is fetched.

[0195] If the "web page searcher" finds no matching page address in either database, the result "no matching web pages" is returned to the inquirer, and the keywords are added to the next round of updates of the two databases. If the update finds matching page addresses, each is stored in the "non-same-source web page result database" or the "same-source web page result database" according to whether it has same-source pages, so that the next search for the same keywords finds results.

[0196] Step 3: The "web page content separator" decomposes the found page content and hyperlink targets into categories such as multimedia, pictures, text, and hyperlinks.

[0197] Step 4: Each content decider produces its decision result:

[0198] A. The "multimedia content decider" produces the target page's "degree of identical multimedia files SMS" (Same Media Score). Multimedia is defined to include Flash content; playback or file services for video and audio files; playback or file services for real-time information such as IPTV, live satellite broadcast, audio and video monitoring, performances, and human-operator answering; and other multimedia services.

[0199] B. The "picture content decider" produces the target page's "degree of identical pictures SPS" (Same Photo Score).

[0200] C. The "text content decider" produces the target page's "degree of identical text STS" (Same Text Score).

[0201] D. The "link content decider" produces the target page's "degree of identical hyperlinks SHS" (Same Hyperlinks Score).

[0202] Step 5: Obtain from the "same-source web page decision rule base" the "multimedia decision weight SMP", "picture decision weight SPP", "text decision weight STP", and "link decision weight SHP", and multiply each by the corresponding value produced in Step 4: SMS, SPS, STS, and SHS respectively.

[0203] Step 6: Add the products from Step 5 to obtain the page's "same-source degree SSS" (Same Source Score): SSS = (SMS*SMP) + (SPS*SPP) + (STS*STP) + (SHS*SHP).

[0204] Step 7: Determine whether the page's "same-source degree SSS" exceeds the threshold. If it does, the page is judged to be a "same-source page" of the other pages; if it does not, the page is judged to be a "non-same-source page".

[0205] Step 8: The "non-same-source pages" from Step 7 are stored by the "non-same-source web page processing module" in the "non-same-source web page result database"; the "same-source pages" from Step 7 are stored by the "same-source web page processing module" in the "same-source web page result database".

[0206] Step 9: The "search result page publisher" generates static search result pages from the contents of the "same-source web page result database" and the "non-same-source web page result database" and publishes them to the "search engine search result Web server", which presents them to the user through the browser (see the "M2" mark in the figure).

[0207] As an alternative implementation of Step 9, the results may be presented to the user through the browser directly by a "dynamic page Web server" (see the "M3" mark in the figure).
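Steps 5 through 7 above reduce to a weighted sum followed by a threshold test, sketched below. The symbols SMS, SPS, STS, SHS and SMP, SPP, STP, SHP come from the text; the function names, the dictionary layout of the weights, and all numeric values in the usage note are assumptions, since the patent gives no concrete weights or threshold.

```python
def same_source_score(sms, sps, sts, shs, weights):
    """SSS = (SMS*SMP) + (SPS*SPP) + (STS*STP) + (SHS*SHP)."""
    return (sms * weights["SMP"]
            + sps * weights["SPP"]
            + sts * weights["STP"]
            + shs * weights["SHP"])


def is_same_source(sss, threshold):
    # Step 7: only a score strictly above the threshold counts as same-source.
    return sss > threshold
```

With hypothetical weights `{"SMP": 2, "SPP": 1, "STP": 4, "SHP": 1}` and degrees SMS = 0.5, SPS = 0.5, STS = 1.0, SHS = 0.0, the score is 1.0 + 0.5 + 4.0 + 0.0 = 5.5, which exceeds a threshold of 5 but not a threshold of 5.5.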

[0208] The "web page content classifier" can be implemented in software, parsing the type of each piece of content directly from the syntax used on the page, such as HTML, ASP/ASPX, PHP, and JSP syntax.

[0209] "Same-source multimedia processing module"

[0210] FIG. 4 is a flowchart of the "same-source multimedia processing module". For multimedia files or services that match the search conditions, the module always delivers them to the inquirer as hyperlinks in an HTML page. To maximize system performance, the following techniques are used:

[0211] Web page publishing: a "search result page publisher" publishes search results in advance to the "search engine search result Web server", which answers directly any search that has been made before, avoiding the heavy computation of generating dynamic pages from the database on each request.

[0212] The "same-source information processing module" stores its results by category in the "non-same-source multimedia index database" and the "same-source multimedia index database", and the "search result page publisher" periodically publishes them to the "search engine search result Web server", which avoids repeated computation and shortens waiting time.

[0213] The processing flow of the "same-source multimedia processing module" is as follows:

[0214] Step 1: Receive the inquirer's search keywords and determine in software, from the keyword content and keyword syntax, that the target is a multimedia file or service (for example, "MP3" in the keywords indicates that .MP3 files are wanted rather than web pages containing that text).

[0215] Step 2: Determine whether the content being searched for is already published on the Web server. If the target is already published on the "search engine search result Web server", the search results are returned directly (see the "M1" mark in the figure). In those results, the access interfaces of multimedia from the same source that match the search conditions are already aggregated into a single "title search result"; clicking the "same-source files" button shows another page on the "search engine search result Web server" containing all search results, so that the inquirer can see everything that matches, which completes the search. If the target is not published on the "search engine search result Web server", continue from Step 3.

[0216] Step 3: Return the result "no matching multimedia" to the inquirer.

[0217] Step 4: Add the search keywords to the next round of updates of the "same-source multimedia index database" and the "non-same-source multimedia index database", and start the update process of the two databases periodically.

[0218] Step 5: The update process for the "same-source multimedia index database" and the "non-same-source multimedia index database":

[0219] A. The "multimedia searcher" searches webpages for newly appearing multimedia files or service entries, and software follows each entry to obtain the file or service.

[0220] B. The "multimedia content decider" determines whether the newly found multimedia content "belongs to the same content as an entry in the current 'same-source multimedia index database'". If "yes", it is added as a new element to the corresponding category of the "same-source multimedia index database". If "no", the "multimedia content decider" determines whether it "belongs to the same content as an entry in the current 'non-same-source multimedia index database'".

[0221] C. If "yes": "create a new category for the current multimedia item and the same-source multimedia items already stored in the 'non-same-source multimedia index database', and move them all into the 'same-source multimedia index database'". If "no": "create a new category for the current multimedia item and store it in the 'non-same-source multimedia index database'".

[0222] Step 6: The "search result webpage publisher" dynamically generates static search result webpages from the contents of the "same-source webpage result database" and the "non-same-source webpage result database" and publishes them to the "search engine search result Web server". The pages are then presented through the browser to inquirers who come to search (see the "M2" mark in the figure).

[0223] As an alternative implementation of Step 6, the results can also be presented to the inquiring user directly through the browser by the "dynamic webpage Web server" (see the "M3" mark in the figure).
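
As a non-limiting illustration, the handling of a query in Steps 2 to 4 above might be sketched in Python as follows. The class `SearchFrontEnd`, its fields, the in-memory dictionary standing in for the "search engine search result Web server", and the lower-casing of keywords are all assumptions made for this sketch; the description does not specify them.

```python
from dataclasses import dataclass, field


@dataclass
class SearchFrontEnd:
    """Hypothetical front end for Steps 2 to 4; every name here is illustrative."""

    # keyword -> pre-published result page (stands in for the Web server)
    published: dict = field(default_factory=dict)
    # keywords queued for the next round of index updates
    pending_keywords: set = field(default_factory=set)

    def handle_query(self, keyword: str) -> str:
        key = keyword.strip().lower()  # normalization is an assumption
        page = self.published.get(key)
        if page is not None:
            # Step 2: the page was published earlier, so return it directly.
            return page
        # Step 4: queue the keyword for the next periodic update round.
        self.pending_keywords.add(key)
        # Step 3: report that nothing qualifies yet.
        return "No qualifying multimedia found"


front_end = SearchFrontEnd(
    published={"song.mp3": "<html>title search result for song.mp3</html>"}
)
hit = front_end.handle_query("Song.MP3")
miss = front_end.handle_query("other.mp3")
```

In this sketch a miss costs only a dictionary lookup and a set insertion, which reflects the stated aim of avoiding on-request computation against the database.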

[0224] "Same-source picture processing module"

[0225] FIG. 5 is a flowchart of the same-source picture processing module. For picture files or links that match the search condition, the "same-source picture processing module" provides them to the inquirer as hyperlinks in an HTML webpage. To maximize the performance of the system, the following techniques are used:

[0226] Webpage publishing is used: the "search result webpage publisher" publishes search results in advance to the "search engine search result Web server", which responds directly to search requests that have been queried before. This avoids the heavy computation of generating dynamic webpages from the database on each request.

[0227] The "same-source information processing module" stores its processing results by category in the "non-same-source picture index database" and the "same-source picture index database", and the "search result webpage publisher" periodically publishes them to the "search engine search result Web server". This avoids repeated computation and reduces the time spent waiting for computation. The processing flow of the "same-source picture processing module" is as follows:

[0228]-[0229] Step 1: On receiving the inquirer's search keywords, software determines from the keyword content and keyword syntax that the target is a picture file or link (for example, a keyword containing ".JPG" indicates that the inquirer wants a .JPG file rather than a webpage containing that text).

[0230] Step 2: Determine "Has the content being searched for already been published on the Web server?" If the search target has already been published on the "search engine search result Web server", the search result is returned directly (see the "M1" mark in the figure). In that result, the access interfaces of the pictures that match the search condition and share the same source have already been aggregated into a single "title search result". After clicking the "same-source files" button, the inquirer sees another webpage on the "search engine search result Web server" that contains all of the search results, so every result matching the query can be viewed, and the search process is complete. If the search target has not been published on the "search engine search result Web server", continue from Step 3.

[0231] Step 3: Return the result "no qualifying pictures" to the inquirer.

[0232] Step 4: Add the search keyword to the task list for the next round of updating the "same-source picture index database" and the "non-same-source picture index database", and periodically start the update process for the two databases.

[0233] Step 5: The update process for the "same-source picture index database" and the "non-same-source picture index database":

[0234] A. The "picture searcher" searches webpages for newly appearing picture files or link entries, and software follows each entry to obtain the file or service.

[0235] B. The "picture content decider" determines whether the newly found picture content "belongs to the same content as an entry in the current 'same-source picture index database'". If "yes", it is added as a new element to the corresponding category of the "same-source picture index database". If "no", the "picture content decider" determines whether it "belongs to the same content as an entry in the current 'non-same-source picture index database'".

[0236] C. If "yes": "create a new category for the current picture and the same-source pictures already stored in the 'non-same-source picture index database', and move them all into the 'same-source picture index database'". If "no": "create a new category for the current picture and store it in the 'non-same-source picture index database'".

[0237] Step 6: The "search result webpage publisher" dynamically generates static search result webpages from the contents of the "same-source webpage result database" and the "non-same-source webpage result database" and publishes them to the "search engine search result Web server". The pages are then presented through the browser to inquirers who come to search (see the "M2" mark in the figure).

[0238] As an alternative implementation of Step 6, the results can also be presented to the inquiring user directly through the browser by the "dynamic webpage Web server" (see the "M3" mark in the figure).
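
As a non-limiting illustration, the placement logic of Step 5 (items B and C) might be sketched in Python as follows. The function `update_indexes`, the list-of-lists representation of the two index databases, and the `fingerprint` field used by the stand-in for the "picture content decider" are assumptions made for this sketch; the description does not state how sameness of content is decided.

```python
def update_indexes(item, same_source_db, non_same_source_db, is_same_content):
    """Place a newly found item according to Step 5, items B and C.

    same_source_db: list of categories, each a list of items with one source.
    non_same_source_db: list of single-item categories.
    is_same_content: hypothetical stand-in for the "content decider".
    """
    # Item B, first check: does the item match an existing same-source category?
    for category in same_source_db:
        if is_same_content(item, category[0]):
            category.append(item)
            return "added to existing same-source category"
    # Item B, second check, and item C: does it match a non-same-source entry?
    for i, category in enumerate(non_same_source_db):
        if is_same_content(item, category[0]):
            # Move the stored entry plus the new item into a new same-source category.
            same_source_db.append(non_same_source_db.pop(i) + [item])
            return "promoted to new same-source category"
    # Item C, "no" branch: store the item alone as a new non-same-source category.
    non_same_source_db.append([item])
    return "stored as new non-same-source category"


def same_fingerprint(a, b):
    # Assumed decider: two items are the same content if fingerprints match.
    return a["fingerprint"] == b["fingerprint"]


same_db, non_same_db = [], []
r1 = update_indexes({"url": "http://a.example/1.jpg", "fingerprint": "x"},
                    same_db, non_same_db, same_fingerprint)
r2 = update_indexes({"url": "http://b.example/1.jpg", "fingerprint": "x"},
                    same_db, non_same_db, same_fingerprint)
r3 = update_indexes({"url": "http://c.example/1.jpg", "fingerprint": "x"},
                    same_db, non_same_db, same_fingerprint)
```

Under these assumptions the first copy of a picture is stored as non-same-source, the second copy promotes both into a new same-source category, and later copies join that category.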

[0239] "Same-source document processing module"

[0240] FIG. 6 is a flowchart of the same-source document processing module. The "same-source document processing module" supports common document formats such as ".Txt", ".Doc", ".PPT", ".PDF", and ".XLS". For document files or links that match the search condition, the "same-source document processing module" provides them to the inquirer as hyperlinks in an HTML webpage. To maximize the performance of the system, the following techniques are used:

[0241] Webpage publishing is used: the "search result webpage publisher" publishes search results in advance to the "search engine search result Web server", which responds directly to search requests that have been queried before. This avoids the heavy computation of generating dynamic webpages from the database on each request.

[0242] The "same-source information processing module" stores its processing results by category in the "non-same-source document index database" and the "same-source document index database", and the "search result webpage publisher" periodically publishes them to the "search engine search result Web server". This avoids repeated computation and reduces the time spent waiting for computation. The processing flow of the "same-source document processing module" is as follows:

[0243] Step 1: On receiving the inquirer's search keywords, software determines from the keyword content and keyword syntax that the target is a document file or link (for example, a keyword containing ".PDF" indicates that the inquirer wants a .PDF file rather than a webpage containing that text).

[0244] Step 2: Determine "Has the content being searched for already been published on the Web server?" If the search target has already been published on the "search engine search result Web server", the search result is returned directly (see the "M1" mark in the figure). In that result, the access interfaces of the documents that match the search condition and share the same source have already been aggregated into a single "title search result". After clicking the "same-source files" button, the inquirer sees another webpage on the "search engine search result Web server" that contains all of the search results, so every result matching the query can be viewed, and the search process is complete. If the search target has not been published on the "search engine search result Web server", continue from Step 3.

[0245] Step 3: Return the result "no qualifying documents" to the inquirer.

[0246] Step 4: Add the search keyword to the task list for the next round of updating the "same-source document index database" and the "non-same-source document index database", and periodically start the update process for the two databases.

[0247] Step 5: The update process for the "same-source document index database" and the "non-same-source document index database":

[0248] A. The "document searcher" searches webpages for newly appearing document files or link entries, and software follows each entry to obtain the file or service.

[0249] B. The "text content decider" and the "picture content decider" determine whether the newly found document content "belongs to the same content as an entry in the current 'same-source document index database'". If "yes", it is added as a new element to the corresponding category of the "same-source document index database". If "no", the "document content decider" determines whether it "belongs to the same content as an entry in the current 'non-same-source document index database'".

[0250] C. If "yes": "create a new category for the current document and the same-source documents already stored in the 'non-same-source document index database', and move them all into the 'same-source document index database'". If "no": "create a new category for the current document and store it in the 'non-same-source document index database'".

[0251] Step 6: The "search result webpage publisher" dynamically generates static search result webpages from the contents of the "same-source webpage result database" and the "non-same-source webpage result database" and publishes them to the "search engine search result Web server". The pages are then presented through the browser to inquirers who come to search (see the "M2" mark in the figure).

[0252] As an alternative implementation of Step 6, the results can also be presented to the inquiring user directly through the browser by the "dynamic webpage Web server" (see the "M3" mark in the figure).
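
As a non-limiting illustration, the keyword-based determination of Step 1 might be sketched in Python as follows. The lookup table `EXTENSION_CATEGORIES` and the function `detect_target_type` are assumptions made for this sketch. The extensions come from the examples given in the description, but the description does not specify the actual rules applied to keyword content and syntax.

```python
import re

# Extension-to-category table. The entries come from examples in the text;
# the table itself and its structure are assumptions for illustration.
EXTENSION_CATEGORIES = {
    "mp3": "multimedia",
    "jpg": "picture",
    "txt": "document",
    "doc": "document",
    "ppt": "document",
    "pdf": "document",
    "xls": "document",
}


def detect_target_type(keywords: str) -> str:
    """Return the category the inquirer is looking for, or "webpage" by default."""
    # Split the query into alphanumeric tokens, so ".PDF" yields the token "PDF".
    for token in re.findall(r"[A-Za-z0-9]+", keywords):
        category = EXTENSION_CATEGORIES.get(token.lower())
        if category:
            return category
    return "webpage"


doc_query = detect_target_type("annual report .PDF")
media_query = detect_target_type("Beethoven MP3")
plain_query = detect_target_type("weather forecast")
```

A query with no recognized format token falls back to an ordinary webpage search in this sketch.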

[0253] "Same-source software processing module"

[0254] FIG. 7 is a flowchart of the same-source software processing module. For software files or links that match the search condition, the "same-source software processing module" provides them to the inquirer as hyperlinks in an HTML webpage. To maximize the performance of the system, the following techniques are used:

[0255] Webpage publishing is used: the "search result webpage publisher" publishes search results in advance to the "search engine search result Web server", which responds directly to search requests that have been queried before. This avoids the heavy computation of generating dynamic webpages from the database on each request.

[0256] The "same-source information processing module" stores its processing results by category in the "non-same-source software index database" and the "same-source software index database", and the "search result webpage publisher" periodically publishes them to the "search engine search result Web server". This avoids repeated computation and reduces the time spent waiting for computation. The processing flow of the "same-source software processing module" is as follows:

[0257] Step 1: On receiving the inquirer's search keywords, software determines from the keyword content and keyword syntax that the target is a software file or link (for example, a keyword containing ".EXE" indicates that the inquirer wants an .EXE file rather than a webpage containing that text).

[0258] Step 2: Determine "Has the content being searched for already been published on the Web server?" If the search target has already been published on the "search engine search result Web server", the search result is returned directly (see the "M1" mark in the figure). In that result, the access interfaces of the software items that match the search condition and share the same source have already been aggregated into a single "title search result". After clicking the "same-source files" button, the inquirer sees another webpage on the "search engine search result Web server" that contains all of the search results, so every result matching the query can be viewed, and the search process is complete. If the search target has not been published on the "search engine search result Web server", continue from Step 3.

[0259] Step 3: Return the result "no qualifying software" to the inquirer.

[0260] Step 4: Add the search keyword to the task list for the next round of updating the "same-source software index database" and the "non-same-source software index database", and periodically start the update process for the two databases.

[0261] Step 5: The update process for the "same-source software index database" and the "non-same-source software index database":

[0262] A. The "software searcher" searches webpages for newly appearing software files or link entries, and software follows each entry to obtain the file or service.

[0263] B. The "software content decider" determines whether the newly found software content "belongs to the same content as an entry in the current 'same-source software index database'". If "yes", it is added as a new element to the corresponding category of the "same-source software index database". If "no", the "software content decider" determines whether it "belongs to the same content as an entry in the current 'non-same-source software index database'".

[0264] C. If "yes": "create a new category for the current software item and the same-source software items already stored in the 'non-same-source software index database', and move them all into the 'same-source software index database'". If "no": "create a new category for the current software item and store it in the 'non-same-source software index database'".

[0265] Step 6: The "search result webpage publisher" dynamically generates static search result webpages from the contents of the "same-source webpage result database" and the "non-same-source webpage result database" and publishes them to the "search engine search result Web server". The pages are then presented through the browser to inquirers who come to search (see the "M2" mark in the figure).

[0266] As an alternative implementation of Step 6, the results can also be presented to the inquiring user directly through the browser by the "dynamic webpage Web server" (see the "M3" mark in the figure).
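
As a non-limiting illustration, the advance publication of static result pages in Step 6 might be sketched in Python as follows. The function `publish_static_pages`, the file naming scheme, the HTML layout, and the use of the first URL in the group as the "title search result" are assumptions made for this sketch; how the title result is selected is outside its scope.

```python
import html
import tempfile
from pathlib import Path


def publish_static_pages(keyword, same_source_group, out_dir):
    """Write two static pages for one keyword (illustrative layout only).

    same_source_group: list of URLs sharing one source. The first URL is
    used here as the "title search result", which is an assumption.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    # Build a file-system-safe name from the keyword.
    slug = "".join(c if c.isalnum() else "_" for c in keyword.lower())
    title_page = out / f"{slug}.html"
    all_page = out / f"{slug}_all.html"

    # Page listing every same-source result, reached via the button or link.
    items = "\n".join(
        f'<li><a href="{html.escape(url)}">{html.escape(url)}</a></li>'
        for url in same_source_group
    )
    all_page.write_text(
        f"<html><body><ul>\n{items}\n</ul></body></html>", encoding="utf-8"
    )

    # Page showing only the aggregated title result plus a link to the full list.
    title_url = same_source_group[0]
    title_page.write_text(
        "<html><body><p>"
        f'<a href="{html.escape(title_url)}">{html.escape(title_url)}</a> '
        f'<a href="{all_page.name}">same-source files ({len(same_source_group)})</a>'
        "</p></body></html>",
        encoding="utf-8",
    )
    return title_page, all_page


with tempfile.TemporaryDirectory() as tmp:
    title_page, all_page = publish_static_pages(
        "setup.exe",
        ["http://a.example/setup.exe", "http://b.example/setup.exe"],
        tmp,
    )
    title_html = title_page.read_text(encoding="utf-8")
    all_html = all_page.read_text(encoding="utf-8")
```

Because both pages are written ahead of time, a Web server in this sketch only needs to serve files when a previously seen keyword is queried again.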

[0267] "Same-source data or database processing module"

[0268] FIG. 8 is a flowchart of the same-source data or database processing module. For data files or links that match the search condition, the "same-source data processing module" provides them to the inquirer as hyperlinks in an HTML webpage. To maximize the performance of the system, the following techniques are used:

[0269] Webpage publishing is used: the "search result webpage publisher" publishes search results in advance to the "search engine search result Web server", which responds directly to search requests that have been queried before. This avoids the heavy computation of generating dynamic webpages from the database on each request.

[0270] The "same-source information processing module" stores its processing results by category in the "non-same-source data index database" and the "same-source data index database", and the "search result webpage publisher" periodically publishes them to the "search engine search result Web server". This avoids repeated computation and reduces the time spent waiting for computation. The processing flow of the "same-source data processing module" is as follows:

[0271] Step 1: On receiving the inquirer's search keywords, software determines from the keyword content and keyword syntax that the target is a data file or link (for example, a keyword containing ".DBF" indicates that the inquirer wants a .DBF file rather than a webpage containing that text).

[0272] Step 2: Determine "Has the content being searched for already been published on the Web server?" If the search target has already been published on the "search engine search result Web server", the search result is returned directly (see the "M1" mark in the figure). In that result, the access interfaces of the data items that match the search condition and share the same source have already been aggregated into a single "title search result". After clicking the "same-source files" button, the inquirer sees another webpage on the "search engine search result Web server" that contains all of the search results, so every result matching the query can be viewed, and the search process is complete. If the search target has not been published on the "search engine search result Web server", continue from Step 3.

[0273] Step 3: Return the result "no qualifying data" to the inquirer.

[0274] Step 4: Add the search keyword to the task list for the next round of updating the "same-source data index database" and the "non-same-source data index database", and periodically start the update process for the two databases.

[0275] Step 5: The update process for the "same-source data index database" and the "non-same-source data index database":

[0276] A. The "data searcher" searches webpages for newly appearing data files or link entries, and software follows each entry to obtain the file or service.

[0277] B. The "data content decider" determines whether the newly found data content "belongs to the same content as an entry in the current 'same-source data index database'". If "yes", it is added as a new element to the corresponding category of the "same-source data index database". If "no", the "data content decider" determines whether it "belongs to the same content as an entry in the current 'non-same-source data index database'".

[0278] C. If "yes": "create a new category for the current data item and the same-source data items already stored in the 'non-same-source data index database', and move them all into the 'same-source data index database'". If "no": "create a new category for the current data item and store it in the 'non-same-source data index database'".

[0279] Step 6: The "search result webpage publisher" dynamically generates static search result webpages from the contents of the "same-source webpage result database" and the "non-same-source webpage result database" and publishes them to the "search engine search result Web server". The pages are then presented through the browser to inquirers who come to search (see the "M2" mark in the figure).

[0280] As an alternative implementation of Step 6, the results can also be presented to the inquiring user directly through the browser by the "dynamic webpage Web server" (see the "M3" mark in the figure).

[0281] "Same-source GIS information processing module"

[0282] FIG. 9 is a flowchart of the "same-source GIS information processing module". For GIS information files or links that match the search condition, the "same-source GIS information processing module" provides them to the inquirer as hyperlinks in an HTML webpage. To maximize the performance of the system, the following techniques are used:

[0283] Webpage publishing is used: the "search result webpage publisher" publishes search results in advance to the "search engine search result Web server", which responds directly to search requests that have been queried before. This avoids the heavy computation of generating dynamic webpages from the database on each request.

[0284] The "same-source information processing module" stores its processing results by category in the "non-same-source GIS information index database" and the "same-source GIS information index database", and the "search result webpage publisher" periodically publishes them to the "search engine search result Web server". This avoids repeated computation and reduces the time spent waiting for computation. The processing flow of the "same-source GIS information processing module" is as follows:

[0285] Step 1: On receiving the inquirer's search keywords, software determines from the keyword content and keyword syntax that the target is a GIS information file or link (for example, a keyword containing ".JPG" indicates that the inquirer wants a .JPG file rather than a webpage containing that text).

[0286] Step 2: Determine "Has the content being searched for already been published on the Web server?" If the search target has already been published on the "search engine search result Web server", the search result is returned directly (see the "M1" mark in the figure). In that result, the access interfaces of the GIS information items that match the search condition and share the same source have already been aggregated into a single "title search result". After clicking the "same-source files" button, the inquirer sees another webpage on the "search engine search result Web server" that contains all of the search results, so every result matching the query can be viewed, and the search process is complete. If the search target has not been published on the "search engine search result Web server", continue from Step 3.

[0287] Step 3: Return the result "no qualifying GIS information" to the inquirer.

[0288] Step 4: Add the search keyword to the task list for the next round of updating the "same-source GIS information index database" and the "non-same-source GIS information index database", and periodically start the update process for the two databases.

[0289] Step 5: The update process for the "same-source GIS information index database" and the "non-same-source GIS information index database":

[0290] A. The "GIS information searcher" searches webpages for newly appearing GIS information files or link entries, and software follows each entry to obtain the file or service.

[0291] B. The "GIS information content decider" determines whether the newly found GIS information content "belongs to the same content as an entry in the current 'same-source GIS information index database'". If "yes", it is added as a new element to the corresponding category of the "same-source GIS information index database". If "no", the "GIS information content decider" determines whether it "belongs to the same content as an entry in the current 'non-same-source GIS information index database'".

[0292] C. If "yes": "create a new category for the current GIS information item and the same-source GIS information items already stored in the 'non-same-source GIS information index database', and move them all into the 'same-source GIS information index database'". If "no": "create a new category for the current GIS information item and store it in the 'non-same-source GIS information index database'".

[0293] Step 6: The "search result webpage publisher" dynamically generates static search result webpages from the contents of the "same-source webpage result database" and the "non-same-source webpage result database" and publishes them to the "search engine search result Web server". The pages are then presented through the browser to inquirers who come to search (see the "M2" mark in the figure).

[0294] As an alternative implementation of Step 6, the results can also be presented to the inquiring user directly through the browser by the "dynamic webpage Web server" (see the "M3" mark in the figure).

[0295] "Same-value network service processing module"

[0296] FIG. 10 is a flowchart of the "same-value network service processing module". For network services that match the search condition, the "same-value network service processing module" provides them to the inquirer as hyperlinks in an HTML webpage. To maximize the performance of the system, the following techniques are used:

[0297] Webpage publishing is used: the "search result webpage publisher" publishes search results in advance to the "search engine search result Web server", which responds directly to search requests that have been queried before. This avoids the heavy computation of generating dynamic webpages from the database on each request.

[0298] The "same-value information processing module" stores its processing results by category in the "non-same-value network service index database" and the "same-value network service index database", and the "search result webpage publisher" periodically publishes them to the "search engine search result Web server". This avoids repeated computation and reduces the time spent waiting for computation. The processing flow of the "same-value network service processing module" is as follows:

[0299] Step 1: On receiving the inquirer's search keywords, software determines from the keyword content and keyword syntax that the target is a network service file or link (for example, a keyword containing ".JPG" indicates that the inquirer wants a .JPG file rather than a webpage containing that text).

[0300] Step 2: Determine "Has the content being searched for already been published on the Web server?" If the search target has already been published on the "search engine search result Web server", the search result is returned directly (see the "M1" mark in the figure). In that result, the access interfaces of the network services that match the search condition and share the same source have already been aggregated into a single "title search result". After clicking the "same-value files" button, the inquirer sees another webpage on the "search engine search result Web server" that contains all of the search results, so every result matching the query can be viewed, and the search process is complete. If the search target has not been published on the "search engine search result Web server", continue from Step 3.

[0301] 第3步:返回查询者“没有符合条件网络服务”的结果。 [0301] Step 3: Return the inquirer "not qualified network service" result.

[0302] 第4步:将该搜索关键词加入到下一轮更新“同价值网络服务索引数据库”和“非同价值网络服务索引数据库”的任务中,并定期启动两个数据库的更新过程。 [0302] Step 4: adding the search keywords to next task of updating "with the value network service index database" and "exceptional value network service index database", and periodically start the update process two databases.

[0303] 第5步:“同价值网络服务索引数据库”和“非同价值网络服务索引数据库”的更新过程: [0303] Step 5: "same-value network service index database" and "exceptional value network service index database" update process:

[0304] A.由“网络服务搜索器”搜索网页新出现的网络服务文件或链接入口,通过软件进入该入口获取该文件或服务。 [0304] A. from "Network Service Finder" search the web emerging network service entrance file or link, to obtain the files or services through the entrance into the software.

[0305] B.由“网络服务内容判决器”判断新找到的网络服务内容“与当前“同价值网络服务索引数据库”的内容属于同一内容吗?”如果“是”则将它作为一个新的元素归入“同价值网络服务索引数据库”的该类别;如果“否”则由“网络服务内容判决器”判断它“与当前非同价值网络服务索引数据库”的内容属于同一内容吗? [0305] B. from "Network service content decision device" found new network services "and the current" same-value network service index database "contents belong to the same content?" If "yes," then it is as a new elements classified as "same-value network service index database" in the category; if "No" by the "network services decision device" content that "with the current exceptional value network service index database" belong to the same content? "

[0306] C.如果“是”则:“为当前的网络服务和与之同价值的并已经存贮在'非同价值网络服务索引数据库'中的网络服务,新建一个类别并全部转移到'同价值网络服务索引数据库' ”;如果“否”则“为当前的网络服务新建一个类别,并存入'非同价值网络服务索引数据库,”;。 [0306] C. If "Yes": "for the current network and services with the same value having been stored in the network services 'extraordinary value network service index database is creating a new category and all transferred to the' same-value network service index database "; if" No "then" New for the current network services category, and credited 'exceptional value network service index database ";.

[0307] 第6步:由“搜索结果网页发布器”根据“同价值网页结果数据库”和“非同价值网页结果数据库”的内容动态生成搜索结果的静态网页,发布到“搜索引擎搜索结果Web服务器”,再通过浏览器呈现给前来搜索的查询者(见图“M2”标记)。 [0307] Step 6: the static pages' search results pages publisher "dynamically generating search results based on content" with the value of the database "and" extraordinary value database ", publishing the" Web search engine search results server ", and then presented to the inquirer (see Fig come to search through the browser" M2 "mark).

[0308] 作为第6步的另一种实现方法,也可以通过“动态网页Web服务器”直接通过浏览器呈现给查询用户。 [0308] As another method to achieve step 6, you can also "dynamic webpage Web server" presented directly through the browser to query the user. (见图“M3”标记)。 (See "M3" tag).
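The query-time part of the flow above (Steps 2 to 4) and the periodic publishing round (Steps 5 and 6) can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the in-memory dictionaries `published` and `pending_keywords`, and the functions `handle_query` and `publish_round`, are hypothetical stand-ins for the result Web server, the update task list, and the page publisher.

```python
from typing import Dict, List, Optional, Set

# Hypothetical stand-ins: pre-published result lists keyed by keyword, and
# the keywords queued for the next index update round (Step 4).
published: Dict[str, List[str]] = {}
pending_keywords: Set[str] = set()


def handle_query(keyword: str) -> Optional[List[str]]:
    """Return pre-published results, or None after queuing the keyword."""
    if keyword in published:           # Step 2: already published?
        return published[keyword]
    pending_keywords.add(keyword)      # Step 4: schedule for the next update
    return None                        # Step 3: "no matching network service"


def publish_round(index: Dict[str, List[str]]) -> None:
    """Steps 5-6 in miniature: publish results for queued keywords."""
    for keyword in list(pending_keywords):
        if keyword in index:
            published[keyword] = index[keyword]
            pending_keywords.discard(keyword)
```

The design point the sketch tries to show is that a query never triggers index computation; an unseen keyword only costs a set insertion, and the expensive work happens in the periodic round.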

[0309] "Same-value commercial information processing module"

[0310] FIG. 11 is a flowchart of the "same-value commercial information processing module". Commercial information that matches the search criteria is provided to the inquirer as hyperlinks in HTML pages. To maximize the performance of the system, the following techniques are used:

[0311] Web page publishing technology is used: the "search result page publisher" publishes search results in advance to the "search engine search result Web server", which answers previously issued search requests directly. This avoids the heavy computation of generating dynamic pages from the database on every request.

[0312] The "same-value information processing module" places its results, by category, in the "non-same-value commercial information index database" and the "same-value commercial information index database". The "search result page publisher" periodically publishes them to the "search engine search result Web server", which avoids repeated computation and shortens the wait for results. The "same-value commercial information processing module" proceeds as follows:

[0313] Step 1: Receive the inquirer's search keywords and determine by software, from the keyword content and keyword syntax, that the target is a commercial information file or link (for example, a keyword containing ".JPG" indicates that a .JPG file is wanted rather than a web page containing that text).

[0314] Step 2: Determine "Has the content being searched for already been published on the Web server?" If the search target has already been published on the "search engine search result Web server", return the search results directly (see the "M1" mark in the figure). In these results, the access interfaces of the commercial information that matches the search criteria and shares the same source have already been aggregated into a single "title search result". After clicking the "same-value files" button, the inquirer can open another page on the "search engine search result Web server" that contains all search results, and can therefore see every result that matches the query; this completes the search. If the search target has not been published on the "search engine search result Web server", continue from Step 3.

[0315] Step 3: Return the result "no matching commercial information" to the inquirer.

[0316] Step 4: Add the search keyword to the task list for the next round of updates to the "same-value commercial information index database" and the "non-same-value commercial information index database", and periodically start the update process for both databases.

[0317] Step 5: Update process for the "same-value commercial information index database" and the "non-same-value commercial information index database":

[0318] A. The "commercial information searcher" searches web pages for newly appeared commercial information files or link entries, and software follows each entry to obtain the file or service.

[0319] B. The "commercial information content decision device" determines whether the newly found commercial information content "belongs to the same content as an entry in the current 'same-value commercial information index database'". If "yes", it is added as a new element to that category of the "same-value commercial information index database". If "no", the "commercial information content decision device" determines whether it "belongs to the same content as an entry in the current 'non-same-value commercial information index database'".

[0320] C. If "yes": "create a new category for the current commercial information and the same-value commercial information already stored in the 'non-same-value commercial information index database', and move them all to the 'same-value commercial information index database'". If "no": "create a new category for the current commercial information and store it in the 'non-same-value commercial information index database'".

[0321] Step 6: The "search result page publisher" generates static search result pages from the contents of the "same-value page result database" and the "non-same-value page result database", publishes them to the "search engine search result Web server", and the pages are presented through the browser to inquirers who come to search (see the "M2" mark in the figure).

[0322] As an alternative implementation of Step 6, the results can also be presented to the inquirer directly through the browser by a "dynamic page Web server" (see the "M3" mark in the figure).

[0323] The "same-value commercial information processing module" is characterized in that it can automatically determine, from the characteristics of the goods or services and the distribution of suppliers and inquirers, whether multiple commercial information targets have the same use value to the inquirer. This determination serves as the basis for aggregating them into a single "title search result" and for ranking the query results.

[0324] The content decision devices can be shared among the various "same-source (same-value) information processing modules".

[0325] Implementation of the "content decision device"

[0326] Implementation of the "multimedia content decision device":

[0327] 1. Input: multimedia files from multiple sources (a playback service is first recorded into a file, or the media file information is obtained from the playback server).

[0328] 2. Processing: compare how closely the multimedia contents match.

[0329] 3. Return: the degree to which the input multimedia items share the same content, SameMediaPower.

[0330] Specific method:

[0331] Step 1: Receive the "judged objects": multimedia from multiple sources can be received. Record the number of judged objects: InputQuantity.

[0332] Step 2: Look up, in the table below, the attributes of the "judged objects" that can take part in the comparison, and record the number of "judged objects" that have the same value for the current attribute: SameQuantity (for example, if 3 of 5 judged objects have the same value for an attribute, then SameQuantity = 3 for that attribute).

[0333] Step 3: Read the "weight" of the current attribute in the judgment (from the table below): Power.

[0334] Step 4: Calculate how closely all "judged objects" match on the current attribute: PSame = SameQuantity * Power.

[0335] Step 5: Return to Step 1 and perform Steps 1 to 4 for the next attribute to obtain its PSame, until the PSame values of all attributes have been obtained.

[0336] Step 6: Calculate and return the same-content degree of the "judged objects": SameMediaPower = (sum of all PSame values) / InputQuantity.
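Steps 1 to 6 can be sketched in Python as below. This is a hedged sketch, not the patent's implementation: the attribute names and weights in the usage note are hypothetical because the patent's weight tables are images that are not reproduced here, and two details are assumptions made explicit in the comments: SameQuantity is taken as the size of the largest group of objects sharing one non-null value, and a value held by only one object counts as no match.

```python
from collections import Counter
from typing import Dict, List, Optional

Attrs = Dict[str, Optional[str]]


def same_quantity(objects: List[Attrs], attr: str) -> int:
    """Size of the largest group of objects sharing one non-null value."""
    # Null values never count as equal, following the notes to the tables.
    values = [o.get(attr) for o in objects if o.get(attr) is not None]
    if not values:
        return 0
    top = Counter(values).most_common(1)[0][1]
    return top if top > 1 else 0  # assumption: a lone value is no match


def same_power(objects: List[Attrs], weights: Dict[str, float]) -> float:
    """(sum over attributes of SameQuantity * Power) / InputQuantity."""
    if not objects:
        raise ValueError("need at least one judged object")
    total = sum(same_quantity(objects, a) * w for a, w in weights.items())
    return total / len(objects)
```

For example, with five hypothetical video records where three share a `duration` value and two share a non-null `codec` value, and hypothetical weights `{"duration": 0.6, "codec": 0.4}`, the result is (3 * 0.6 + 2 * 0.4) / 5, about 0.52. The same function applies unchanged to the picture and software decision devices, which use the same formula with different attribute tables.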

[0337] Attributes compared for video files or playback services:

[0338]

Figure CN101025737BD00311

[0339]

Figure CN101025737BD00321

[0340] Notes:

[0341] 1. The invention lies in the method of using "weight" values to express how much each attribute counts in the comparison, not merely in the specific values listed in the table. The weights in the table are only typical values; changing them to suit actual needs still falls within the scope of the invention.

[0342] 2. In practice some attribute values may be Null. During the calculation, attributes whose value is Null must not be treated as equal.

[0343] Attributes compared for audio files:

[0344]

Figure CN101025737BD00331

[0345] Notes:

[0346] 1. The invention lies in the method of using "weight" values to express how much each attribute counts in the comparison, not merely in the specific values listed in the table. The weights in the table are only typical values; changing them to suit actual needs still falls within the scope of the invention.

[0347] 2. In practice some attribute values may be Null. During the calculation, attributes whose value is Null must not be treated as equal.

[0348] Attributes compared for Flash files:

[0349]

Figure CN101025737BD00341

[0350] Notes:

[0351] 1. The invention lies in the method of using "weight" values to express how much each attribute counts in the comparison, not merely in the specific values listed in the table. The weights in the table are only typical values; changing them to suit actual needs still falls within the scope of the invention.

[0352] 2. In practice some attribute values may be Null. During the calculation, attributes whose value is Null must not be treated as equal.

[0353] Implementation of the "picture content decision device"

[0354] 1. Input: pictures from multiple sources.

[0355] 2. Processing: compare how closely the picture contents match.

[0356] 3. Return: the degree to which the input pictures share the same content, SamePicPower.

[0357] Specific method:

[0358] Step 1: Receive the "judged objects": pictures from multiple sources can be received. Record the number of judged objects: InputQuantity.

[0359] Step 2: Look up, in the table below, the attributes of the "judged objects" that can take part in the comparison, and record the number of "judged objects" that have the same value for the current attribute: SameQuantity (for example, if 3 of 5 judged objects have the same value for an attribute, then SameQuantity = 3 for that attribute).

[0360] Step 3: Read the "weight" of the current attribute in the judgment (from the table below): Power.

[0361] Step 4: Calculate how closely all "judged objects" match on the current attribute: PSame = SameQuantity * Power.

[0362] Step 5: Return to Step 1 and perform Steps 1 to 4 for the next attribute to obtain its PSame, until the PSame values of all attributes have been obtained.

[0363] Step 6: Calculate and return the same-content degree of the "judged objects": SamePicPower = (sum of all PSame values) / InputQuantity.

[0364] The judgment is based on the various attributes of the pictures and on the degree of similarity assessed by image recognition software.

[0365]

Figure CN101025737BD00351

[0366]

Figure CN101025737BD00361

[0367] Notes:

[0368] 1. The invention lies in the method of using "weight" values to express how much each attribute counts in the comparison, not merely in the specific values listed in the table. The weights in the table are only typical values; changing them to suit actual needs still falls within the scope of the invention.

[0369] 2. In practice some attribute values may be Null. During the calculation, attributes whose value is Null must not be treated as equal.

[0370] Implementation of the "text content decision device"

[0371] The "text content decision device" can be implemented in software:

[0372] 1. Input: text from multiple sources, taken as the "judged objects".

[0373] 2. Processing: compare how closely the text contents match.

[0374] 3. Return: the degree of consistency among the "judged objects", SameTextPower.

[0375] Method:

[0376] Step 1: Among the multiple inputs,

[0377] find the total length of the portions of the text content that share the same words or sentences: SameLenth.

[0378] Step 2: Among the multiple input texts, find the length of the shortest one: MinLenth.

[0379] Step 3: Return the text similarity value: SameTextPower = SameLenth / MinLenth

[0380] Among the texts found in this way, the longest is usually a copy of the article that is split into fewer pages or that contains many advertisements and external hyperlinks, while the shortest is usually a copy of the same article that is split into more pages or that contains the fewest advertisements and external hyperlinks.
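A rough sketch of the SameTextPower ratio follows. The patent does not say how shared words or sentences are located, so this sketch swaps in a simple bag-of-words comparison as an assumption: texts are split on whitespace, SameLenth is the character length of the words common to every input (counted with multiplicity), and MinLenth is the character length of the words of the shortest input. The function name `same_text_power` is illustrative.

```python
from collections import Counter
from functools import reduce
from typing import List


def same_text_power(texts: List[str]) -> float:
    """SameLenth / MinLenth, measured in characters of whitespace-split words."""
    bags = [Counter(t.split()) for t in texts]
    if not bags or any(not b for b in bags):
        return 0.0
    # Multiset intersection: words present in every input, with multiplicity.
    shared = reduce(lambda a, b: a & b, bags)
    same_len = sum(len(w) * n for w, n in shared.items())
    min_len = min(sum(len(w) * n for w, n in b.items()) for b in bags)
    return same_len / min_len
```

Because both lengths are measured in the same unit and the shared words are a subset of every input, the ratio stays between 0 and 1. A production comparison would more likely align sentences or longer runs of text than isolated words.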

[0381] Implementation of the "link content decision device"

[0382] The "link content decision device" can be implemented in software. It compares whether the hyperlinks contained on multiple web pages share common features.

[0383] 1. Input: multiple groups of hyperlink URL addresses (each group is usually all the hyperlinks obtained from one web page).

[0384] 2. Processing: calculate how closely the hyperlink URL addresses of the groups match.

[0385] 3. Return: the number of hyperlinks that the groups have in common.

[0386] Method:

[0387] Step 1: Receive the "judged objects": multiple groups of hyperlink URL addresses.

[0388] Step 2: Compute the similarity of the "judged objects": SameURLPower = the number of URL addresses that appear in every group of hyperlinks.

[0389] Step 3: Return SameURLPower.
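Counting the URLs that appear in every group is a set intersection. The following is a minimal sketch under the assumption that URLs are compared as exact strings (no normalization of case, trailing slashes, or query parameters); the function name `same_url_power` is illustrative.

```python
from typing import Iterable


def same_url_power(groups: Iterable[Iterable[str]]) -> int:
    """Number of URL addresses that occur in every group of hyperlinks."""
    sets = [set(group) for group in groups]
    if not sets:
        return 0
    return len(set.intersection(*sets))
```

Note that the returned value is a raw count rather than a ratio, as in the text above, so any threshold applied to it would depend on how many links the pages typically carry.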

[0390] Implementation of the "software content decision device"

[0391] The "software content decision device" compares whether multiple input software items are the same software.

[0392] 1. Input: software from multiple sources.

[0393] 2. Processing: compare how closely the software contents match.

[0394] 3. Return: the match value of the software contents.

[0395] Specific method:

[0396] Step 1: Receive the "judged objects": multiple input files or directories. Record the number of judged objects: InputQuantity.

[0397] Step 2: Look up, in the table below, the attributes of the "judged objects" that can be compared, and record the number of "judged objects" that have the same value for the current attribute: SameQuantity (for example, if 3 of 5 judged objects have the same value for an attribute, then SameQuantity = 3 for that attribute).

[0398] Step 3: Read the "weight" of the current attribute in the judgment (from the table below): Power.

[0399] Step 4: Calculate how closely all "judged objects" match on the current attribute: PSame = SameQuantity * Power.

[0400] Step 5: Return to Step 1 and perform Steps 1 to 4 for the next attribute to obtain its PSame, until the PSame values of all attributes have been obtained.

[0401] Step 6: Calculate and return the match value of the "judged objects": SameSoftPower = (sum of all PSame values) / InputQuantity.

[0402]

Figure CN101025737BD00381

[0403]

Figure CN101025737BD00391

[0404] Notes:

[0405] 1. The invention lies in the method of using "weight" values to express how much each attribute counts in the comparison, not merely in the specific values listed in the table. The weights in the table are only typical values; changing them to suit actual needs still falls within the scope of the invention.

[0406] 2. In practice some attribute values may be Null. During the calculation, attributes whose value is Null must not be treated as equal.

[0407] Implementation of the "data or database content decision device"

[0408] The device compares, record by record, whether the data records in different database files are equal, and returns whether the consistency value SameDBPower of the compared databases exceeds a threshold.

[0409] SameDBPower = (number of records with the same field names and equal values) / (the smallest number of records holding that field among the compared databases).

[0410] SameDBPower reflects the proportion of records with identical content relative to the database with the fewest records. SameDBPower ranges from 0 to 1.

[0411] Implementation of the "data or database content decision device"

[0412] For data files, the following steps can be used:

[0413] Step 1: Among the data files taking part in the comparison, pick one at random as the "comparison reference".

[0414] Step 2: Make a coarse comparison of the other files against the "comparison reference": file length, file checksum, title, subject, version, author, category, keywords, comments and other file attribute information.

[0415] Step 3: If they agree, the files are judged "coarsely consistent". This result can be used directly as the output of the "data or database content decision device".

[0416] Step 4: If a further comparison is needed, perform Step 5 on the input files that were judged "coarsely consistent".

[0417] Step 5: Fine comparison: compare the file attribute information and every byte of the files one by one. Files that agree on all features can be judged "completely identical", which is the output of the "data or database content decision device".
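The coarse-then-fine comparison for data files can be sketched with the Python standard library. Assumptions in this sketch: only file length and checksum stand in for the coarse attributes (title, author and similar metadata depend on the file format), SHA-256 is an arbitrary choice of checksum, and the function names are illustrative.

```python
import filecmp
import hashlib
import os


def checksum(path: str) -> str:
    """SHA-256 of the file contents (the checksum algorithm is an assumption)."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def coarse_match(reference: str, other: str) -> bool:
    """Steps 2-3: compare cheap attributes (here only length and checksum)."""
    return (os.path.getsize(reference) == os.path.getsize(other)
            and checksum(reference) == checksum(other))


def fine_match(reference: str, other: str) -> bool:
    """Step 5: byte-by-byte comparison of the full contents."""
    return filecmp.cmp(reference, other, shallow=False)
```

The length check short-circuits before any file is read, so files of different sizes are rejected without computing a checksum.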

[0418] For database files, the following steps can be used:

[0419] Step 1: Determine from the file name suffix and file attributes whether the input database files use the same database format.

[0420] Step 2: For files in the same database format, perform Step 3; for files in different database formats, go directly to Step 4.

[0421] Step 3: Coarse comparison of same-format databases: file length, file checksum, title, subject, version, author, category, keywords, comments and other file attribute information. If these features do not agree completely, output "inconsistent" as the result; for database files that agree completely, perform Step 4.

[0422] Step 4: Fine comparison of databases (this step allows different kinds of database files to take part in the content comparison). Extract the "database tables" one by one according to the format of each database file and determine whether the "database table" structures agree. If they do not, output "inconsistent"; for database files whose structures agree, perform Step 5.

[0423] Step 5: Compare the content of every record of the database files one by one. Each time records with identical content are found, add 1 to the counter SameRecNum (the number of records with the same field names and equal values).

[0424] Step 6: Calculate the database consistency value: SameDBPower = SameRecNum / (the smallest number of records holding that field among the compared databases). (SameDBPower reflects the proportion of records with identical content relative to the database with the fewest records; SameDBPower ranges from 0 to 1.)

[0425] Step 7: Determine whether SameDBPower exceeds the threshold. If it does, output "consistent" as the result; otherwise output "inconsistent".
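Steps 5 to 7 can be sketched as follows, assuming each database has already been read into a list of records, with each record a dictionary of field name to hashable value. The threshold of 0.9 is a hypothetical default since the text gives no value for this device, and the function names are illustrative.

```python
from typing import Dict, Hashable, List

Record = Dict[str, Hashable]


def same_db_power(databases: List[List[Record]]) -> float:
    """SameRecNum / smallest record count among the compared databases."""
    if not databases or any(not db for db in databases):
        return 0.0
    # A record matches only if field names and values are all equal.
    key_sets = [{frozenset(rec.items()) for rec in db} for db in databases]
    same_rec_num = len(set.intersection(*key_sets))
    min_records = min(len(db) for db in databases)
    return same_rec_num / min_records


def databases_match(databases: List[List[Record]], threshold: float = 0.9) -> bool:
    """Step 7: 'consistent' only if SameDBPower exceeds the threshold."""
    return same_db_power(databases) > threshold
```

Dividing by the smallest record count means that a database that is a strict subset of another still scores 1.0, which matches the stated range of 0 to 1.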

[0426] "GIS information content decision device"

[0427] The "GIS information content decision device" can be implemented in software:

[0428] 1. Input: digital maps from multiple sources, taken as the "judged objects".

[0429] 2. Processing: compare how closely the coverage areas of the digital maps match.

[0430] 3. Return: the degree of consistency among the "judged objects", SameMapPower (ranging from 0 to 1).

[0431] Method:

[0432] Step 1: Open each digital map file taking part in the comparison according to its digital map format.

[0433] Step 2: Find the longitude and latitude of the northwest and southeast corners of each digital map (another pair of diagonal corners may also be used).

[0434] Step 3: Compare the differences in longitude and latitude between the northwest and southeast corners of the compared digital maps, and calculate the consistency value of the map coverage areas, SameMapPower:

[0435] Suppose "Map 1" and "Map 2" take part in the comparison.

[0436] Then:

[0437] SameMapPower = (area of the region where the two maps overlap) / (area of the smaller of the two maps).

[0438] Step 4: Return the SameMapPower value.

[0439] Step 5: Determine whether SameMapPower exceeds the threshold (for example, threshold = 0.8). If it does, the maps are judged to be the same map; otherwise they are judged to be different maps.
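The overlap ratio can be sketched by treating each map's coverage as a rectangle in longitude and latitude. This is a simplification and an assumption: areas are computed in plain degree units, ignoring map projection and maps that cross the 180° meridian. The `MapExtent` class and function names are illustrative; the 0.8 threshold is the example value from Step 5.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MapExtent:
    west: float   # longitude of the northwest corner
    north: float  # latitude of the northwest corner
    east: float   # longitude of the southeast corner
    south: float  # latitude of the southeast corner

    def area(self) -> float:
        """Rectangle area in square degrees; zero if the extent is empty."""
        return max(0.0, self.east - self.west) * max(0.0, self.north - self.south)


def same_map_power(a: MapExtent, b: MapExtent) -> float:
    """Overlap area divided by the area of the smaller map."""
    overlap = MapExtent(max(a.west, b.west), min(a.north, b.north),
                        min(a.east, b.east), max(a.south, b.south))
    smallest = min(a.area(), b.area())
    return overlap.area() / smallest if smallest > 0 else 0.0


def is_same_map(a: MapExtent, b: MapExtent, threshold: float = 0.8) -> bool:
    """Step 5: same map only if SameMapPower exceeds the threshold."""
    return same_map_power(a, b) > threshold
```

Because the denominator is the smaller map, a detailed city map that lies entirely inside a national map scores 1.0; whether that should count as "the same map" is a consequence of the formula as stated, not of this sketch.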

[0440] "Network service content decision device"

[0441] FTP service content decision of the "network service content decision device":

[0442] Step 1: Log in to each service taking part in the comparison using the FTP protocol and obtain the files it holds.

[0443] Step 2: After obtaining the files of the FTP services, first determine from the file name suffixes whether the file types agree. If they do not, return "inconsistent" as the output; if the file types agree, perform Step 3.

[0444] Step 3: Depending on the file type, use the "multimedia content decision device", "picture content decision device", "text content decision device", "software content decision device", "data or database content decision device" or "GIS information content decision device" to decide whether the file contents agree, and return its result.

[0445] Content decision for mailbox services offered by Email websites:

[0446] Information about the mailbox services offered by Email websites is obtained mainly by software that searches the pages of each website and parses from the page tags the mailbox size, the fees, whether the POP protocol is supported, and similar information.

[0447] Step 1: Divide mailbox sizes into tiers (for example: 10 MB to 25 MB, 25 MB to 100 MB, 100 MB to 300 MB, 300 MB to 1 GB, 1 GB to 100 GB, and so on), then determine whether the compared mailboxes fall in the same tier. If "no", return "inconsistent"; if "yes", perform Step 2.

[0448] Step 2: Compare whether the "fees" agree. If "no", return "inconsistent"; if "yes", perform Step 3.

[0449] Step 3: Compare whether the POP protocol support agrees. If "no", return "inconsistent"; if "yes", return "consistent".
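The three mailbox checks can be sketched as a short chain of early returns. Several details are assumptions not stated in the text: tiers are half-open ranges (lower bound inclusive), 1 GB is taken as 1024 MB, fees are reduced to a free/paid flag, and a size outside every tier is treated as not matching. The `Mailbox` class and function names are illustrative.

```python
from dataclasses import dataclass
from typing import Optional

GB = 1024  # megabytes per gigabyte (assumption)

# Size tiers in MB, taken from the example in Step 1.
TIER_BOUNDS = [(10, 25), (25, 100), (100, 300), (300, 1 * GB), (1 * GB, 100 * GB)]


@dataclass
class Mailbox:
    size_mb: float
    is_free: bool        # simplification of the "fees" attribute
    supports_pop: bool


def size_tier(size_mb: float) -> Optional[int]:
    """Index of the tier containing size_mb, or None if outside all tiers."""
    for i, (low, high) in enumerate(TIER_BOUNDS):
        if low <= size_mb < high:
            return i
    return None


def mailboxes_match(a: Mailbox, b: Mailbox) -> bool:
    """Steps 1-3: same size tier, same fees, same POP support."""
    tier_a, tier_b = size_tier(a.size_mb), size_tier(b.size_mb)
    if tier_a is None or tier_a != tier_b:
        return False
    if a.is_free != b.is_free:
        return False
    return a.supports_pop == b.supports_pop
```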

[0450] "Commercial information content decision device"

[0451] The device decides whether product or service sale information published on web pages is the same, and whether it falls within the same natural geographic area, the same administrative area, or the same distance range.

[0452] Step 1: Compare whether the commercial information items concern the same product or service. If "no", return "inconsistent"; if "yes", go to Step 2.

[0453] Step 2: Determine whether the compared commercial information is location-sensitive (for example, everyday consumer goods and services that must be delivered on site, such as ice cream or private tutoring, are location-sensitive). If "no", return the result "consistent"; if "yes", perform Step 3.

[0454] Step 3: Determine whether the providers of the compared commercial information are in the same city or region. If "no", return the result "inconsistent"; if "yes", return the result "consistent".
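The three-step decision can be sketched as below. The `Offer` fields are hypothetical: the text does not say how products are identified or how location sensitivity is detected, so both are reduced to precomputed values here, and "same city or region" is an exact string comparison.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    product: str               # normalized product or service name (assumed)
    city: str                  # provider's city or region
    location_sensitive: bool   # e.g. ice cream, private tutoring


def offers_match(a: Offer, b: Offer) -> bool:
    """True if the two offers have the same value to the inquirer."""
    if a.product != b.product:                              # Step 1
        return False
    if not (a.location_sensitive or b.location_sensitive):  # Step 2
        return True
    return a.city == b.city                                 # Step 3
```

Under this rule two listings for the same book from different cities are aggregated, while two tutoring listings are aggregated only when the providers share a city.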

[0455] “获取网页用户关注度子系统” [0455] "Get page user attention Subsystem"

[0456] Figure 12 is a structural diagram of the web page user attention acquisition subsystem. The search engine works cooperatively with its companion web browser (or with any third-party browser compatible with the communication protocol between the search engine and its companion browser): the web browser collects the user's degree of attention to each web page and reports it to the search engine, which uses it as the basis for ranking search results or for selecting the "title search result". The method and apparatus can also stand apart from the search engine as an independent Web query system that provides a "web page popularity ranking", which may be offered as a paid service or exchanged for other benefits.

[0457] The system consists of two main parts: the "PageFocus network server" and the "PageFocus web browser".

[0458] Structure of the "PageFocus network server"

[0459] Through the "PageFocus web browser", the "PageFocus network server" obtains the degree of attention that users worldwide pay to each web page and builds a database of each page's "attention score PageFocus", which serves as the measure of the page's popularity.

[0460] The "PageFocus network server" consists of the following:

[0461] (1) "PageFocus browser ID registration server": assigns a globally unique ID number to each "PageFocus web browser" in use on the network.

[0462] (2) "PageFocusAccServer web page attention statistics server": receives the "attention score PageFocus" for one or more web pages contained in the "PageFocus data packets" sent by "PageFocus web browsers" running worldwide. The ID number is used to distinguish different browsing users.

[0463] (3) "PageFocus browser online upgrade server": provides an online upgrade service to "PageFocus web browsers" worldwide.

[0464] (4) "Data encryption and decryption module": transfers encrypted data between the "PageFocus network server" and the "PageFocus web browser" to prevent attacks and information theft.

[0465] Structure of the "PageFocus web browser"

[0466] The "PageFocus web browser" reports over the network to the "PageFocus network server" the current user's degree of attention to a given web page.

[0467] The "PageFocus web browser" consists of the following:

[0468] (1) "Attention score PageFocus calculation module": based on the user's operations in the "PageFocus web browser", calculates the user's degree of attention to a web page and forms a "PageFocus data packet" that is reported to the "PageFocusAccServer web page attention statistics server".

[0469] (2) "PageFocus browser ID registration module": communicates with the "PageFocus browser ID registration server" to obtain a globally unique ID, which serves as the basis for distinguishing different users.

[0470] (3) "PageFocus browser online upgrade module": communicates with the "PageFocus browser online upgrade server" to keep the "PageFocus browser" on the user's computer at the latest version.

[0471] The apparatus comprises the "PageFocus web browser" created by the present invention, the "PageFocus browser ID registration server", and the "web page scoring server". The specific implementation is as follows:

[0472] Step 1: Develop a special "PageFocus web browser". Each browser either has a globally unique ID number at installation or, when in use, actively seeks out the "PageFocus browser ID registration server" on the network to obtain a globally unique ID number.

[0473] Step 2: The "PageFocus web browser" has all the functions of a conventional web browser (for example, Microsoft's IE browser).

[0474] Step 3: The "PageFocus web browser" also converts the user's operations on the browser and on web pages into the page's "attention score PageFocus" according to the weights listed in the table below, forms a "PageFocus data packet", and transmits it in encrypted form over a network protocol to the search engine's "PageFocusAccServer web page attention statistics server".

[0475] Step 4: On receiving a "PageFocus data packet" from any "PageFocus web browser" worldwide, the "PageFocusAccServer web page attention statistics server" adds the "attention score PageFocus" contained in the packet to the total of the corresponding web page.

[0476] Step 5: The "PageFocusAccServer web page attention statistics server" holds the "attention score PageFocus" of every web page worldwide. This information can be processed in various ways to serve as the basis on which the search engine ranks web pages, as the basis on which the search engine selects the "title search result" among search results with the same content, or as a service published directly in the form of a "web page popularity ranking".
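Steps 4 and 5 describe a simple accumulate-then-rank server. A minimal sketch follows, assuming a hypothetical packet layout (a dictionary with a `browser_id` and a list of `(url, score)` records) and omitting the decryption stage; the actual packet fields appear only in a table that is reproduced as an image.

```python
from collections import defaultdict


class AttentionStatsServer:
    """Illustrative stand-in for the PageFocusAccServer statistics server."""

    def __init__(self):
        # Accumulated PageFocus score per page URL.
        self.scores = defaultdict(float)

    def receive(self, packet):
        """Step 4: add each score in an (already decrypted) packet to its page."""
        for url, score in packet["records"]:
            self.scores[url] += score

    def ranking(self):
        """Step 5: pages ordered from highest to lowest accumulated score."""
        return sorted(self.scores.items(), key=lambda item: item[1], reverse=True)
```

The same `ranking()` output could feed either the search result ordering or a standalone popularity list, which are the two uses the text names.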

[0477] Method by which the "PageFocus web browser" calculates the "attention score PageFocus":

[0478] Because the "PageFocus web browser" has all the functions of an ordinary browser, it can record the user's operations while the browser is in use according to the table below, score the web page's "attention score PageFocus" according to the "weight" of each operation, and, when the page is completely closed in the browser, form a score record of the page's "attention score PageFocus" and send it in the form of a "PageFocus data packet" to the

[0479] "PageFocusAccServer web page attention statistics server".

[0480]

Figure CN101025737BD00431

[0481]

Figure CN101025737BD00441

[0482]

Figure CN101025737BD00451

[0483]

Figure CN101025737BD00461

[...] page element, the web page to which that element's hyperlink points receives a PageFocus score = the current page's "current PageFocus score" * weight

[0484] Notes:

[0485] 1. Although this scoring standard may misjudge individual cases, statistical accuracy can be obtained from the large volume of operations across the network.

[0486] 2. The specific "weight" values listed in the table are only typical values. The invention lies in scoring pages through the browser; any other change to the "weight items" or the "weights" falls within the scope of the invention.

[0487] 3. Letting users vote on web pages rests on full trust in the public-spiritedness of Internet users; hence each "weight" is applied to the overall score by multiplication rather than by addition.

[0488] 4. Because each web page may accumulate a very large PageFocus score, which could overflow software variables, the "PageFocusAccServer web page attention statistics server" may record scores as logarithms or in scientific notation.
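One possible reading of the logarithm option in note 4 is to store only the logarithm of each page's running total and update it without ever materializing the raw sum. The helper below is a sketch under that assumption; the name `log_add` and the choice of the natural logarithm are illustrative and not taken from the specification.

```python
import math


def log_add(log_total, score):
    """Return log(exp(log_total) + score), keeping the total in log form.

    log_total is the natural log of the running sum (-inf for an empty sum);
    score is a new positive PageFocus contribution.
    """
    if score <= 0:
        # Non-positive contributions are ignored in this sketch.
        return log_total
    log_score = math.log(score)
    if log_total == -math.inf:
        return log_score
    hi, lo = max(log_total, log_score), min(log_total, log_score)
    # log(e^hi + e^lo) = hi + log(1 + e^(lo - hi)); lo - hi <= 0, so exp cannot overflow.
    return hi + math.log1p(math.exp(lo - hi))
```

The stored value grows only logarithmically with the total, so a fixed-width variable stays in range even for very heavily visited pages; ranking by the logarithm gives the same order as ranking by the raw total.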

[0489] 5. As alternatives within this method, besides forming a "PageFocus data packet" when the page is completely closed in the browser, any other rule may be used to decide when the packet is formed, for example at timed intervals or when the accumulated score reaches a certain value. All of these approaches fall within the scope of the invention.

[0490] 6. Detailed calculation of the "reading speed per line of text" in the table:

[0491] A. Mouse wheel scrolling: text reading speed = (display area width / font width) * number of text lines per scroll / scroll time interval.

[0492] B. Keyboard paging: text reading speed = (display area width / font width) * number of text lines per page turn / page-turn time interval.

[0493] C. Window scroll bar scrolling: text reading speed = (display area width / font width) * number of text lines per scroll / scroll time interval.
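All three cases in note 6 share one formula and differ only in which user action supplies the line count and the time interval. A minimal sketch, assuming widths are measured in pixels and the interval in seconds (the units are not stated in the text), so that the result is in characters per second:

```python
def reading_speed(display_width, font_width, lines_per_action, interval_seconds):
    """Text reading speed = (display width / font width) * lines per action / interval."""
    if font_width <= 0 or interval_seconds <= 0:
        raise ValueError("font_width and interval_seconds must be positive")
    chars_per_line = display_width / font_width
    return chars_per_line * lines_per_action / interval_seconds
```

For instance, an 800-pixel-wide display area with 10-pixel characters holds 80 characters per line; scrolling 3 lines every 2 seconds gives 80 * 3 / 2 = 120 characters per second.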

[0494] Method of forming the "PageFocus data packet"

[0495] Contents of the "PageFocus data packet":

[0496]

Figure CN101025737BD00471

[0497] Note: Each "PageFocus data packet" may contain score records for multiple web pages. Each page score record may also carry additional attributes; for efficiency, the table lists only the most important ones, and adding other attributes to the table also falls within the scope of the invention.

Choosing when to send the "PageFocus data packet":

[0498] To reduce the bandwidth consumed by sending "PageFocus data packets" and the load placed on the server, one of the following schemes may be adopted:

[0499] Send the "PageFocus data packet" when a web page is completely closed in the browser.

[0500] Send the "PageFocus data packet" when the browser itself is completely closed.

[0501] The browser keeps "PageFocus data packets" as files on the local computer and sends them once they have accumulated to a specific number, a specific length, or a specific time period.
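The third scheme above amounts to client-side batching. The sketch below buffers records in memory rather than in local files and flushes when either a record count or an age limit is reached; the class name, the thresholds, and the `send` callback are hypothetical and chosen only to illustrate the idea.

```python
class PacketBuffer:
    """Accumulate page score records and send them in batches."""

    def __init__(self, send, max_records=100, max_age_seconds=3600.0):
        self.send = send                  # callable that transmits a list of records
        self.max_records = max_records
        self.max_age_seconds = max_age_seconds
        self.records = []
        self.first_at = None              # time at which the oldest buffered record arrived

    def add(self, url, score, now):
        """Buffer one record; 'now' is the current time in seconds."""
        if self.first_at is None:
            self.first_at = now
        self.records.append((url, score))
        too_many = len(self.records) >= self.max_records
        too_old = now - self.first_at >= self.max_age_seconds
        if too_many or too_old:
            self.flush()

    def flush(self):
        """Send everything buffered so far, for example when the browser closes."""
        if self.records:
            self.send(list(self.records))
            self.records.clear()
        self.first_at = None
```

Calling `flush()` from a browser-shutdown hook would cover the second scheme as well, so one buffer can serve both policies.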

[0502] "Title search result" selection algorithm

[0503] This algorithm determines how to select, from the original search results, the "same-source search result" that will serve as the "title search result". It must address the following issues:

[0504] 1. Judge the content quality of web pages from network users' behavior and from the page content, and display high-quality pages first.

[0505] 2. Prevent a search result from receiving excessive click traffic because it has become the "title search result", which could slow the website down or even crash it.

[0506] 3. Prevent a search result from receiving excessive click traffic because it has become the "title search result", which would slow its service response and degrade the visitor's experience.

[0507] 4. Make becoming the "title search result" a right that can be offered to websites that want it; those websites can purchase this right.

[0508] 5. Every original result among the "same-source search results" has a chance, with some probability, of becoming the "title search result".

[0509] The "title search result" selection method is characterized in that, when the "title search result" is selected from among the "same-source search results", three factors are considered together: "search result content quality", "weighting value", and "service response delay". That is, results with high content quality, results that carry a weighting, and results with good network service are displayed first. The same principle is followed when ordering all of the "same-source search results", and the "weighting value" can be purchased from the operator of the system of the present invention. The "title search result" selection is implemented as follows:

[0510] Step 1: Calculate the probability weight PWn with which each "same-source search result" becomes the "title search result" (where the search result is the n-th one):

[0511] PWn = TP * PageFocus / (RespDelay - K)

[0512] Note 1: When (RespDelay - K) is less than or equal to zero, (RespDelay - K) shall be taken as 1.

[0513] Note 2: The variables in the formula have the following meanings:

[0514] A. PageFocus, the web page attention value: the "PageFocus value" of the search result, obtained by the "method and apparatus for obtaining web page user attention" of the present invention.

[0515] B. RespDelay, the web service response delay: the response delay of the search result when it serves the searcher's access. (The access experience depends on the website's response delay: the slower the response, the worse the experience.)

[0516] C. K, the service response constant: a definable constant, for which 50 milliseconds (ms) is recommended. A service response delay below K goes unnoticed and does not affect the user experience, so it can be ignored.

[0517] D. TP, the title search result right: a weighting. Anyone can obtain the "TP title search result right" from the operator of the system of the present invention through various exchange terms.

[0518] E. As other implementations of this formula, the following forms may also be used:

[0519] a. PWn = (TP + PageFocus) / (RespDelay - K)

[0520] b. PWn = (TP + PageFocus) / RespDelay / K

[0521] c. PWn = TP * PageFocus / RespDelay / K

[0522] Step 2: Sum the probability weights PWn of all the original "same-source search results" to obtain PWall, the total probability weight.

[0523] Step 3: Calculate the probability that each "same-source search result" becomes the "title search result": Pn = PWn / PWall.

[0524] Step 4: Each time a searcher makes a query, dynamically and randomly select the "title search result" according to the probabilities Pn and present it to the searcher.
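Steps 1 to 4 describe weighted random selection. The sketch below uses the primary formula PWn = TP * PageFocus / (RespDelay - K) with the recommended K of 50 ms and the clamp from Note 1; representing each result as a dictionary with `tp`, `page_focus`, and `resp_delay_ms` keys is an assumption made for illustration.

```python
import random

K_MS = 50.0  # service response constant recommended in the text


def probability_weight(tp, page_focus, resp_delay_ms, k_ms=K_MS):
    """Step 1: PWn = TP * PageFocus / (RespDelay - K), with the Note 1 clamp."""
    denominator = resp_delay_ms - k_ms
    if denominator <= 0:
        denominator = 1.0
    return tp * page_focus / denominator


def selection_probabilities(results):
    """Steps 2 and 3: normalize the weights so that they sum to 1."""
    weights = [
        probability_weight(r["tp"], r["page_focus"], r["resp_delay_ms"])
        for r in results
    ]
    total = sum(weights)  # PWall
    if total <= 0:
        raise ValueError("at least one result must have a positive weight")
    return [w / total for w in weights]


def pick_title_result(results, rng=random):
    """Step 4: draw one result at random according to the probabilities Pn."""
    probabilities = selection_probabilities(results)
    return rng.choices(results, weights=probabilities, k=1)[0]
```

Because the choice is random on every query, a fast site with a purchased weighting is shown most often, yet every same-source result keeps a nonzero chance of being the title result, which also spreads click traffic across sites.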

[0525] Apparatus and method for adaptive website content style

[0526] The invention uses whatever information is available that helps determine the user's environment and state, so that users in different work or leisure states see different styles when visiting the same page URL, without any operation, registration, settings, or Cookie configuration on their part. This includes:

[0527] 1. Use the user's IP address to determine the country or region the user is in, then combine this with the website's own time to calculate the local administrative time at the visitor's location. From that local time, it can be judged whether the visitor is in a work state or a leisure state.

[0528] 2. From the user's IP address, the attribute of that IP address can be looked up: home or workplace. Style and content suited to that environment are then provided.

[0529] 3. From the user's IP address, the user's geographic location can be determined, so that when the user queries commercial information, the suppliers nearest to the user can automatically be listed first.

[0530] For example:

[0531] At the same moment, different users visiting a web page at the same URL within this website see different content:

[0532] A. Users in a work state and environment see a sober, concise page with no leisure or entertainment information.

[0533] B. Users in a leisure state and environment see a lively page that may contain leisure and entertainment information and personal consumer advertising.

[0534] The invention may be applied, in part or in whole, to website systems other than search engines; all such uses fall within the scope of the invention.

[0535] At present, large websites use server clusters, and even set up local service subsystems by region, to distribute user access and handle heavy traffic. An important characteristic of current server clusters, however, is that every cluster member provides exactly the same content. As shown in Figure 13, the "website server cluster entrance" device assigns visiting users, without regard to any of their characteristics, directly to one of the cluster member servers, all of which hold the same content.

[0536] As shown in Figure 14, the apparatus of the present invention modifies part of that structure. After the "website server cluster entrance" receives a visiting user, it judges from the IP address and other user attribute information sent with the visit whether the user is in a work state, and provides information services of a different style and content accordingly.

[0537] Method for automatically determining the user's state and providing an appropriate web page style and content

[0538] Step 1: First divide the server cluster into two categories, "work style" and "personal and leisure style". Whether pages are static or dynamic, two styles are generated automatically whenever the same content is updated to the two categories of servers, so that users in different work or leisure states see different styles when visiting the same page URL.

[0539] Step 2: After the "website server cluster entrance" receives a user's first request for a page of this website, it first obtains the user's IP address from the access protocol (or from the IP-layer protocol).

[0540] Step 3: Look up the IP address in the "IP address attribute database" to determine whether it is a "workplace IP address" or a "private or leisure-venue IP address". If it is a "workplace IP address", proceed to Step 4; if it is a "private or leisure-venue IP address", proceed to Step 5.

[0541] Step 4: Obtain the geographic location of the "workplace IP address" and the administrative time of that geographic region. If the region to which the IP address belongs is currently within working hours (Monday to Friday, 8:00-20:00), assign the visit to a "work style server" in the server cluster, which provides page services suited to workplace use; otherwise proceed to Step 5.

[0542] Step 5: Assign the visit to a "personal and leisure style server" in the server cluster, which provides page services suited to personal and leisure use.
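Steps 3 to 5 reduce to a two-condition routing decision at the cluster entrance. The sketch below assumes that the IP attribute lookup and the conversion to the visitor's local time have already been done upstream; the attribute string `"workplace"`, the returned group names, and the exclusive 20:00 upper bound are illustrative assumptions.

```python
from datetime import datetime


def in_working_hours(local_time):
    """Monday to Friday, 8:00 up to (but not including) 20:00, in the visitor's local time."""
    is_weekday = local_time.weekday() < 5   # Monday is 0, Friday is 4
    return is_weekday and 8 <= local_time.hour < 20


def choose_server_group(ip_attribute, local_time):
    """Route a first visit to the work-style or the personal-and-leisure-style servers."""
    # Steps 3 and 4: a workplace address during working hours gets the work style.
    if ip_attribute == "workplace" and in_working_hours(local_time):
        return "work_style"
    # Step 5: every other case gets the personal and leisure style.
    return "personal_leisure_style"
```

A workplace address outside working hours therefore falls through to the leisure style, matching the "otherwise proceed to Step 5" branch of Step 4.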


Claims (13)

1. A search engine aggregation display method for same-source information sites, comprising the following steps: (1) a querier accesses the search engine through a Web browser or application software and enters the keywords to be queried; (2) the search engine finds all target sites that meet the conditions as the original search results; (3) a "same-source information processing module" queries the account information of purchasers of the right to "become the title search result" and, in combination with judgment rules, selects from the original search results the object to be used as the "title search result"; (4) the search engine's Web server or application server presents only the selected "title search result" to the querier as the search result, and provides for the "title search result" a button meaning "expand to view details"; (5) the querier may further press the button corresponding to the "title search result", whereupon the search engine presents the original search results found in step (2).
2. The search engine aggregation display method for same-source information sites according to claim 1, characterized in that the processing flow of the "same-source information processing module" comprises the following steps: (1) an information type judgment module judges the type of the information received by the web page searcher, wherein the "same-source information processing module" includes the information type judgment module; (2) the information type judgment module sends information of the same type together to the information processing module of the corresponding type; (3) the search information processed by the information processing module of the corresponding type is filed into the "non-same-source result information base" or the "same-source result information base"; (4) the "non-same-source result information base" or the "same-source result information base" is published to the Web server; wherein the "same-source information processing module" is composed of a plurality of "same-source information processing modules for the corresponding information types" and comprises a "same-source web page processing module", a "same-source multimedia processing module", a "same-source picture processing module", a "same-source document processing module", a "same-source software processing module", a "same-source data or database processing module", a "same-source GIS information processing module", a "same-value network service processing module", and a "same-value commercial information processing module".
3. The search engine aggregation display method for same-source information sites according to claim 2, characterized in that the "same-source web page processing module" processes web page information in the following steps: (1) when the search part of the search engine receives the keywords to be queried, a decider for the search results already published on the Web server first judges whether the keywords have recently been queried by someone else; if they have, and the results are already published on the search engine search result Web server, the search results are returned directly; in these results, web pages with the same source have already been aggregated into a single search result, and after clicking the "same-source web pages" button the querier can see, on the search engine search result Web server, a search result page containing all of the search results, which completes the query process; (2) if, when the search part of the search engine receives the keywords to be queried, the decider for the search results already published on the Web server judges that the keywords have not recently been queried by anyone else and no corresponding query results are published on the search engine search result Web server, then: A. the "web page searcher" is started to search the "non-same-source web page result database" and the "same-source web page result database" for web page addresses matching the search keywords, and the content of those web pages is obtained; B. if the "web page searcher" finds no matching web page address in the "non-same-source web page result database" or the "same-source web page result database", the result "no matching web pages" is returned to the querier, and the search keywords are added to the task for the next round of updating the "non-same-source web page result database" and the "same-source web page result database"; if a matching web page address is found during the update, it is placed in the "non-same-source web page result database" or the "same-source web page result database" according to whether it has same-source web pages, so that anyone who later searches for the same keywords can find the result; (3) a "web page content separator" breaks the content and hyperlink targets of the web pages found down into the types multimedia, pictures, text, and hyperlinks; (4) the content deciders each produce a decision result: A. the "multimedia content decider" produces the "same multimedia file score SMS (Same Media Score)" for the target web page; B. the "picture content decider" produces the "same picture score SPS (Same Photo Score)" for the target web page; C. the "text content decider" produces the "same text score STS (Same Text Score)" for the target web page; D. the "link content decider" produces the "same hyperlinks score SHS (Same Hyperlinks Score)" for the target web page; (5) the "multimedia decision weight SMP", "picture decision weight SPP", "text decision weight STP", and "link decision weight SHP" are obtained from the "same-source web page decision rule base" and multiplied respectively by the "same multimedia file score SMS", "same picture score SPS", "same text score STS", and "same hyperlinks score SHS" generated in step (4); (6) the products obtained in step (5) are added together to obtain the web page's "same-source score SSS (Same Source Score)", where SSS = (SMS * SMP) + (SPS * SPP) + (STS * STP) + (SHS * SHP); (7) it is judged whether the web page's "same-source score SSS" exceeds a threshold; if it does, the web page is determined to be a "same-source web page" of other web pages, and if it does not, the web page is determined to be a "non-same-source web page"; (8) the "non-same-source web pages" produced in step (7) are stored by the "non-same-source web page processing module" in the "non-same-source web page result database", and the "same-source web pages" produced in step (7) are stored by the "same-source web page processing module" in the "same-source web page result database"; (9) a "search result web page publisher" dynamically generates static web pages of search results from the contents of the "same-source web page result database" and the "non-same-source web page result database" and publishes them to the "search engine search result Web server", from which they are presented to the querying user through the browser; (10) as an alternative implementation of step (9), the results may also be presented to the querying user directly through the browser by a "dynamic web page Web server".
4. The same-source information site search engine aggregation display method according to claim 2, characterized in that the processing flow of the "same-source information processing module" further comprises the following steps:
(1) receiving the inquirer's search keywords and determining by software, from the keyword content and keyword syntax, the file or service being sought;
(2) judging "Has the content to be searched already been published on the Web server?"; if the content has already been published on the "search engine search results Web server", the search result is returned directly; in that result, the access interfaces of multimedia items that satisfy the search conditions and have the same source have already been aggregated into a single "title search result"; after the "same-source files" button is clicked, another web page containing all search results can be viewed on the "search engine search results Web server", so that the inquirer can see every search result that satisfies the query, which completes the search process; if the search target has not been published on the "search engine search results Web server", processing continues from step (3);
(3) informing the inquirer that there are no matching results;
(4) adding the search keywords to the next round of the task that updates the "same-source information index database" and the "non-same-source information index database", and periodically starting the update process of the two databases;
(5) the update process of the "same-source information index database" and the "non-same-source information index database": A. a web page searcher searches web pages for newly appearing target files or service entries, and software follows each entry to obtain the file or service; B. a "content decider" judges whether the newly found information "belongs to the same content as the content of the current 'same-source information index database'"; if "yes", it is placed as a new element in the corresponding category of the "same-source information index database"; if "no", the "content decider" judges whether it "belongs to the same content as the content of the current 'non-same-source information index database'"; C. if "yes", then "a new category is created for the current information and for the same-source information already stored in the 'non-same-source information index database', and all of it is transferred to the 'same-source information index database'"; if "no", then "a new category is created for the current information and stored in the 'non-same-source information index database'";
(6) a "search results web page publisher" dynamically generates static web pages of search results from the contents of the "same-source web page results database" and the "non-same-source web page results database" and publishes them to the "search engine search results Web server", from which they are presented through the browser to inquirers;
(7) as an alternative implementation of step (6), the results may instead be presented to the querying user directly through the browser by a "dynamic web page Web server".
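The index update of step (5) can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `update_indexes`, the list-of-lists representation of the two databases, and the `is_same_content` callback (standing in for the "content decider") are all assumptions made for the example.

```python
def update_indexes(item, same_db, non_same_db, is_same_content):
    """Classify a newly crawled item (hypothetical sketch of step (5) B and C).

    same_db and non_same_db are lists of categories; each category is a list
    of items. is_same_content(a, b) plays the role of the "content decider".
    """
    # B: the item matches an existing same-source category, so it joins it.
    for category in same_db:
        if is_same_content(item, category[0]):
            category.append(item)
            return "joined_same_source"
    # C ("yes"): the item matches something stored as non-same-source, so a
    # new category holding both is moved to the same-source database.
    for i, category in enumerate(non_same_db):
        if is_same_content(item, category[0]):
            same_db.append(category + [item])
            del non_same_db[i]
            return "promoted"
    # C ("no"): nothing matches, so the item starts a new non-same-source category.
    non_same_db.append([item])
    return "new_non_same_source"
```

Comparing only against the first member of each category is a simplification; a real decider could compare against every member or against a category signature.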
5. The same-source information site search engine aggregation display method according to claim 4, characterized in that, when the "same-source information processing module" processes documents, the update process of the "same-source information index database" and the "non-same-source information index database" is:
(1) a "document searcher" searches web pages for newly appearing document files or link entries, and software follows each entry to obtain the file or service;
(2) a "text content decider" and an "image content decider" judge whether the newly found document content "belongs to the same content as the content of the current 'same-source information index database'"; if "yes", it is placed as a new element in the corresponding category of the "same-source information index database"; if "no", the "document content decider" judges whether it "belongs to the same content as the content of the current 'non-same-source information index database'";
(3) if "yes", then "a new category is created for the current document and for the same-source documents already stored in the 'non-same-source information index database', and all of them are transferred to the 'same-source information index database'"; if "no", then "a new category is created for the current document and stored in the 'non-same-source information index database'".
6. The same-source information site search engine aggregation display method according to any one of claims 3, 4 or 5, characterized in that the processing flow of the content decider comprises the following steps:
(1) receiving the "objects to be judged": multimedia from multiple sources may be received, and the number of objects to be judged is recorded as InputQuantity;
(2) looking up a predetermined attribute of the "objects to be judged" that can take part in the comparison, and recording as SameQuantity the number of "objects to be judged" that have the same value for the current attribute;
(3) inputting the "weight" value Power that the current attribute carries in the judgment;
(4) computing the degree of agreement of all "objects to be judged" on the current attribute: PSame = SameQuantity × Power;
(5) returning to (1) and performing (1) to (4) for the next "attribute" to obtain that attribute's PSame, until the PSame values of all attributes have been obtained;
(6) computing and returning the same-content degree value of the "objects to be judged": SameMediaPower = (arithmetic sum of all PSame values) / InputQuantity.
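As a rough illustration of the steps in claim 6, the sketch below computes SameMediaPower for a list of objects represented as dictionaries. Two points are assumptions of this example rather than statements of the claim: PSame is taken to be the product SameQuantity × Power, and SameQuantity is taken to be the size of the largest group of objects sharing one value for the attribute. The function name `same_media_power` and the dictionary layout are hypothetical.

```python
from collections import Counter

def same_media_power(objects, weights):
    """Return SameMediaPower for objects (dicts) given per-attribute weights.

    weights maps attribute name -> Power. Attribute values must be hashable.
    """
    input_quantity = len(objects)            # step (1): InputQuantity
    if input_quantity == 0:
        return 0.0
    total = 0.0
    for attr, power in weights.items():      # step (5): loop over attributes
        values = [obj[attr] for obj in objects if attr in obj]
        # step (2): assumed reading of SameQuantity, the largest group
        # of objects that share one value for this attribute
        same_quantity = max(Counter(values).values()) if values else 0
        total += same_quantity * power       # step (4): PSame (assumed product)
    return total / input_quantity            # step (6)
```

For example, three objects of which two share a title (weight 1.0) and two share a size (weight 2.0) give (2 × 1.0 + 2 × 2.0) / 3 = 2.0.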
7. The same-source information site search engine aggregation display method according to any one of claims 3, 4 or 5, characterized in that, when the content decider is a text content decider, its processing flow comprises the following steps:
(1) finding the total length SameLenth of the parts of the input text contents that have the same words or sentences;
(2) finding the length MinLenth of the shortest of the multiple input text contents;
(3) returning the text similarity value SameTextPower = SameLenth / MinLenth.
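One possible reading of claim 7 is sketched below at word level. The claim does not say how shared parts are located or how length is measured, so the choices here are assumptions: texts are split on whitespace, shared words are found by multiset intersection, and both SameLenth and MinLenth are counted in word characters (whitespace excluded). Text without whitespace word boundaries, such as Chinese, would need a tokenizer instead of `str.split`.

```python
from collections import Counter
from functools import reduce

def same_text_power(texts):
    """Return SameLenth / MinLenth for a list of strings (hypothetical sketch)."""
    word_bags = [Counter(t.split()) for t in texts]
    if not word_bags:
        return 0.0
    # Words present in every text, with the smallest count seen in any text.
    common = reduce(lambda a, b: a & b, word_bags)
    same_lenth = sum(len(word) * n for word, n in common.items())
    # Length of the shortest input, measured the same way as same_lenth.
    min_lenth = min(sum(len(word) * n for word, n in bag.items())
                    for bag in word_bags)
    return same_lenth / min_lenth if min_lenth else 0.0
```

With "the quick brown fox" and "the lazy brown dog", the shared words "the" and "brown" contribute 8 characters and the shorter text has 15, giving 8/15.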
8. The same-source information site search engine aggregation display method according to any one of claims 3 or 4, characterized in that, when the content decider is a link content decider, its processing flow comprises the following steps:
(1) receiving the "objects to be judged": the URL addresses of multiple hyperlinks;
(2) computing the similarity of the "objects to be judged": SameURLPower = the number of URL addresses that appear in every hyperlink;
(3) returning SameURLPower.
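The count in claim 8 amounts to a set intersection if each object to be judged is treated as a collection of URL addresses. That representation is an assumption of this sketch, as is the function name `same_url_power`; the claim itself does not fix a data layout.

```python
def same_url_power(link_sets):
    """Count the URLs that occur in every one of the given URL collections."""
    sets = [set(urls) for urls in link_sets]
    if not sets:
        return 0
    # URLs common to all objects being judged.
    return len(set.intersection(*sets))
```

URLs are compared as exact strings here; normalizing scheme, host case, or trailing slashes before comparison would be a natural refinement.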
9. The same-source information site search engine aggregation display method according to claim 4, characterized in that, when the content decider is a commercial information content decider, its processing flow comprises the following steps:
(1) comparing whether the commercial information items under comparison concern the same product or service; if "no", returning "inconsistent"; if "yes", proceeding to step (2);
(2) judging whether the commercial information under comparison is location-sensitive; if "no", returning the result "consistent"; if "yes", proceeding to step (3);
(3) judging whether the providers of the commercial information under comparison are in the same city or region; if "no", returning the result "inconsistent"; if "yes", returning the result "consistent".
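The three-step decision in claim 9 can be written as a short function. The `Offer` record and its field names are hypothetical, and treating a pair as location-sensitive when either item is flagged is an assumption of this sketch.

```python
from dataclasses import dataclass

@dataclass
class Offer:
    product: str              # product or service identifier (hypothetical field)
    location_sensitive: bool  # whether the offer depends on where the provider is
    city: str                 # provider's city or region

def commercial_same(a: Offer, b: Offer) -> bool:
    """Return True for "consistent", False for "inconsistent"."""
    # Step (1): different product or service means inconsistent.
    if a.product != b.product:
        return False
    # Step (2): not location-sensitive means consistent.
    if not (a.location_sensitive or b.location_sensitive):
        return True
    # Step (3): location-sensitive offers match only within the same city or region.
    return a.city == b.city
```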
10. The same-source information site search engine aggregation display method according to claim 1, characterized in that the "title search result" is selected as follows:
(1) computing, for each "same-source search result", the probability weight PWn of becoming the "title search result": PWn = TP × PageFocus / (RespDelay − K), where
n: the search result is the n-th result;
when (RespDelay − K) is less than or equal to zero, (RespDelay − K) shall take the value 1;
PageFocus: the web page attention value;
RespDelay: the web page service response delay;
K: the service response constant; service delays smaller than this value will not be noticed;
TP: the title search result weighting power;
(2) summing the probability weights PWn of all original "same-source search results" to obtain the total probability weight PWall;
(3) computing the probability that each "same-source search result" becomes the "title search result": Pn = PWn / PWall;
(4) according to the probabilities Pn, dynamically and randomly selecting the "title search result" as searchers visit, and presenting it to the searcher.
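The selection procedure of claim 10 is a weighted random draw. The following sketch, under the assumption that each same-source result is a dictionary with the hypothetical keys `tp`, `page_focus` and `resp_delay`, computes PWn, normalizes to Pn, and samples with the standard library's `random.choices`. The uniform fallback when PWall is zero is an addition of this example, not something the claim specifies.

```python
import random

def probability_weight(tp, page_focus, resp_delay, k):
    """PWn = TP * PageFocus / (RespDelay - K), with the denominator clamped to 1."""
    denom = resp_delay - k
    if denom <= 0:
        denom = 1
    return tp * page_focus / denom

def title_probabilities(results, k):
    """Return Pn = PWn / PWall for every same-source result."""
    weights = [probability_weight(r["tp"], r["page_focus"], r["resp_delay"], k)
               for r in results]
    pw_all = sum(weights)
    if pw_all == 0:
        # Assumed fallback: treat all results as equally likely.
        return [1 / len(results)] * len(results)
    return [w / pw_all for w in weights]

def pick_title(results, k, rng=random):
    """Randomly pick the "title search result" according to Pn."""
    probs = title_probabilities(results, k)
    return rng.choices(results, weights=probs, k=1)[0]
```

Because the draw is repeated per visit, a result with Pn = 0.75 would be shown as the title for roughly three out of four visitors rather than always winning outright.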
11. The same-source information site search engine aggregation display method according to claim 10, characterized in that the probability weight PWn of the "title search result" may alternatively be computed as:
a. PWn = (TP + PageFocus) / (RespDelay − K); or
b. PWn = (TP + PageFocus) / RespDelay / K; or
c. PWn = TP × PageFocus / RespDelay / K.
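For comparison, the three alternative weights of claim 11 can be written side by side. The function names are hypothetical. Applying the clamp from claim 10 to variant a is an assumption; variants b and c are left unguarded, so RespDelay and K must be non-zero.

```python
def pw_variant_a(tp, page_focus, resp_delay, k):
    """a. PWn = (TP + PageFocus) / (RespDelay - K)."""
    denom = resp_delay - k
    if denom <= 0:
        denom = 1  # assumption: reuse the clamp stated in claim 10
    return (tp + page_focus) / denom

def pw_variant_b(tp, page_focus, resp_delay, k):
    """b. PWn = (TP + PageFocus) / RespDelay / K."""
    return (tp + page_focus) / resp_delay / k

def pw_variant_c(tp, page_focus, resp_delay, k):
    """c. PWn = TP * PageFocus / RespDelay / K."""
    return tp * page_focus / resp_delay / k
```

The additive variants let a purchased weighting power TP raise a result even when PageFocus is zero, whereas the multiplicative forms give a result with zero attention no chance of selection.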
12. The same-source information site search engine aggregation display method according to claim 1, characterized in that the "same-source information processing module": (1) may be embedded in the search engine; (2) may be placed between the "search engine" and the "search engine search results Web server"; (3) may also be placed, as a preprocessing module, between the "search engine" and the sites being searched.
13. The same-source information site search engine aggregation display method according to claim 1, wherein the button whose meaning is "expand to view details" may be a hyperlink or any of various software interface controls.
CN 200610007905 2006-02-22 2006-02-22 Attention degree based same source information search engine aggregation display method CN101025737B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 200610007905 CN101025737B (en) 2006-02-22 2006-02-22 Attention degree based same source information search engine aggregation display method

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
CN 200610007905 CN101025737B (en) 2006-02-22 2006-02-22 Attention degree based same source information search engine aggregation display method
PCT/CN2007/000370 WO2007095834A1 (en) 2006-02-22 2007-02-02 Composite display method and system for search engine of same resource information based on degree of attention
US12/279,949 US8176029B2 (en) 2006-02-22 2007-02-02 Composite display method and system for search engine of same resource information based on degree of attention

Publications (2)

Publication Number Publication Date
CN101025737A CN101025737A (en) 2007-08-29
CN101025737B true CN101025737B (en) 2011-08-17

Family

ID=38436934

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 200610007905 CN101025737B (en) 2006-02-22 2006-02-22 Attention degree based same source information search engine aggregation display method

Country Status (3)

Country Link
US (1) US8176029B2 (en)
CN (1) CN101025737B (en)
WO (1) WO2007095834A1 (en)

Families Citing this family (46)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8166041B2 (en) * 2008-06-13 2012-04-24 Microsoft Corporation Search index format optimizations
CA2639438A1 (en) * 2008-09-08 2010-03-08 Semanti Inc. Semantically associated computer search index, and uses therefore
CN102043705A (en) * 2009-10-19 2011-05-04 阿里巴巴集团控股有限公司 Statistical method and apparatus for input behavior
KR101750049B1 (en) 2009-11-13 2017-06-22 삼성전자주식회사 Method and apparatus for adaptive streaming
US20110119268A1 (en) * 2009-11-13 2011-05-19 Rajaram Shyam Sundar Method and system for segmenting query urls
KR101777347B1 (en) 2009-11-13 2017-09-11 삼성전자주식회사 Method and apparatus for adaptive streaming based on segmentation
KR101786051B1 (en) 2009-11-13 2017-10-16 삼성전자 주식회사 Method and apparatus for data providing and receiving
KR101737084B1 (en) 2009-12-07 2017-05-17 삼성전자주식회사 Method and apparatus for streaming by inserting another content to main content
KR101777348B1 (en) 2010-02-23 2017-09-11 삼성전자주식회사 Method and apparatus for transmitting and receiving of data
US8972418B2 (en) * 2010-04-07 2015-03-03 Microsoft Technology Licensing, Llc Dynamic generation of relevant items
CN101853300B (en) * 2010-05-26 2013-01-30 中国科学技术大学 Method and system for identifying and evaluating video downloading service website
KR101837687B1 (en) * 2010-06-04 2018-03-12 삼성전자주식회사 Method and apparatus for adaptive streaming based on plurality of elements determining quality of content
CN101854399A (en) * 2010-06-09 2010-10-06 宇龙计算机通信科技(深圳)有限公司 Method and device for aggregating network data
US9858342B2 (en) 2011-03-28 2018-01-02 Doat Media Ltd. Method and system for searching for applications respective of a connectivity mode of a user device
US9069443B2 (en) 2010-06-11 2015-06-30 Doat Media Ltd. Method for dynamically displaying a personalized home screen on a user device
US9323844B2 (en) 2010-06-11 2016-04-26 Doat Media Ltd. System and methods thereof for enhancing a user's search experience
CN102375823B (en) * 2010-08-13 2014-11-05 腾讯科技(深圳)有限公司 Searching result gathering display method and system
US9152726B2 (en) 2010-12-01 2015-10-06 Microsoft Technology Licensing, Llc Real-time personalized recommendation of location-related entities
US20130054591A1 (en) * 2011-03-03 2013-02-28 Brightedge Technologies, Inc. Search engine optimization recommendations based on social signals
CN103064852A (en) * 2011-10-20 2013-04-24 阿里巴巴集团控股有限公司 Website statistical information processing method and website statistical information processing system
US9633122B2 (en) * 2011-10-20 2017-04-25 Aol Inc. Systems and methods for web site customization based on time-of-day
US9804754B2 (en) * 2012-03-28 2017-10-31 Terry Crawford Method and system for providing segment-based viewing of recorded sessions
CN102663048B (en) * 2012-03-29 2017-04-12 天津奇思科技有限公司 Method and device for providing search result
CN103365555A (en) * 2012-03-31 2013-10-23 国际商业机器公司 Data processing method and system and data collecting method and system
CN103389984B (en) * 2012-05-08 2018-03-23 百度在线网络技术(北京)有限公司 A kind of method and apparatus for being used to provide collection relevant information in search result
CN102880706A (en) * 2012-07-16 2013-01-16 刘二中 Method for processing link information input by search engine terminal user
CN102789508A (en) * 2012-07-27 2012-11-21 吴建辉 Distributed practical condition search engine and chat system on basis of geographical position
KR101974867B1 (en) * 2012-08-24 2019-08-23 삼성전자주식회사 Apparatas and method fof auto storage of url to calculate contents of stay value in a electronic device
CN103024055B (en) * 2012-12-18 2016-06-15 百度在线网络技术(北京)有限公司 For the Webpage compression method of mobile terminal, system and cloud server
CN103020276A (en) * 2012-12-27 2013-04-03 新浪网技术(中国)有限公司 Method and device for searching social contact objects
US9386071B2 (en) * 2013-01-15 2016-07-05 Allon Caidar System for communicating media to users over a network
CN104166659B (en) * 2013-05-20 2019-03-08 百度在线网络技术(北京)有限公司 A kind of map datum sentences the method and system of weight
US9471693B2 (en) * 2013-05-29 2016-10-18 Microsoft Technology Licensing, Llc Location awareness using local semantic scoring
CN103399957A (en) * 2013-08-21 2013-11-20 百度在线网络技术(北京)有限公司 Searching method, system and engine as well as client
CN104424261B (en) * 2013-08-29 2018-10-02 腾讯科技(深圳)有限公司 Information displaying method based on electronic map and device
CN103533399A (en) * 2013-09-30 2014-01-22 深圳创维-Rgb电子有限公司 Video-information display method and device
CN103646078B (en) * 2013-12-11 2017-01-25 北京启明星辰信息安全技术有限公司 Method and device for realizing internet propaganda monitoring target evaluations
US20150193804A1 (en) * 2014-01-09 2015-07-09 Microsoft Corporation Incentive mechanisms for user interaction and content consumption
JP6114707B2 (en) * 2014-02-28 2017-04-12 富士フイルム株式会社 Product search device, product search system, server system, and product search method
CN104036003B (en) * 2014-06-16 2018-12-14 北京奇虎科技有限公司 search result integration method and device
CN104504069A (en) * 2014-12-22 2015-04-08 北京奇虎科技有限公司 Building method and device for file index
CN105574061A (en) * 2015-05-24 2016-05-11 刘晓建 Method for filtering user generated content by network information acquisition tool
US9703689B2 (en) * 2015-11-04 2017-07-11 International Business Machines Corporation Defect detection using test cases generated from test models
CN107959665A (en) * 2016-10-18 2018-04-24 北京视联动力国际信息技术有限公司 A kind of communication means and communication system
CN106713353A (en) * 2017-01-23 2017-05-24 浙江省测绘科学技术研究院 Intelligent seamless aggregation method and system for geographic information service
CN107169147A (en) * 2017-06-20 2017-09-15 广州阿里巴巴文学信息技术有限公司 Data processing method, device and electronic equipment

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1254136A (en) 1998-11-12 2000-05-24 英业达股份有限公司 Method for inquiring about index multi-media header data and its device
CN1639710A (en) 2002-02-28 2005-07-13 皇家飞利浦电子股份有限公司 Displaying search results
CN1728134A (en) 2004-07-30 2006-02-01 国际商业机器公司 Multi-language network information search method and system based on supertext

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2240663C (en) * 1995-12-30 2004-06-08 Timeline, Inc. Data retrieval method and apparatus with multiple source capability
JP4706143B2 (en) * 2001-08-02 2011-06-22 ソニー株式会社 Information providing method and apparatus
US20050105513A1 (en) * 2002-10-27 2005-05-19 Alan Sullivan Systems and methods for direction of communication traffic
JP3933617B2 (en) * 2003-09-22 2007-06-20 株式会社日立情報システムズ Shared information search method, shared information search program, and information sharing system
US7231405B2 (en) * 2004-05-08 2007-06-12 Doug Norman, Interchange Corp. Method and apparatus of indexing web pages of a web site for geographical searchine based on user location
US7617200B2 (en) * 2006-01-31 2009-11-10 Northwestern University Displaying context-sensitive ranked search results

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
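JP Laid-Open 2003-44484A, 2003.02.14
JP Laid-Open 2005-99890A, 2005.04.14
Full text.
崔建海 et al. Personalized information retrieval technology in the Web environment. 现代图书情报技术, 2005, (128), 45-49.
赵银春 et al. User interest mining based on combining Web browsing content and behavior. 计算机工程, 2005, 31(12), 93-94, 198.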

Also Published As

Publication number Publication date
WO2007095834A1 (en) 2007-08-30
US8176029B2 (en) 2012-05-08
US20090094213A1 (en) 2009-04-09
CN101025737A (en) 2007-08-29

Similar Documents

Publication Publication Date Title
Bowman et al. Harvest: A scalable, customizable discovery and access system
Shokouhi et al. Federated search
Ostermaier et al. A real-time search engine for the web of things
CA2770868C (en) Objective and subjective ranking of comments
US7240049B2 (en) Systems and methods for search query processing using trend analysis
US7454417B2 (en) Methods and systems for improving a search ranking using population information
CN1882943B (en) Systems and methods for search processing using superunits
US8346753B2 (en) System and method for searching for internet-accessible content
US8370332B2 (en) Blending mobile search results
US9324112B2 (en) Ranking authors in social media systems
KR101171405B1 (en) Personalization of placed content ordering in search results
CA2786708C (en) Scalable topical aggregation of data feeds
US7533090B2 (en) System and method for rating electronic documents
US8312022B2 (en) Search engine optimization
JP4977624B2 (en) Matching and ranking of sponsored search listings that incorporate web search technology and web content
CN101351798B (en) Dynamic search with implicit user intention mining
US8468143B1 (en) System and method for directing questions to consultants through profile matching
Jansen et al. Web searcher interaction with the Dogpile. com metasearch engine
US7596571B2 (en) Ecosystem method of aggregation and search and related techniques
JP4950041B2 (en) Query log analysis for use in managing category-specific electronic content
US8589373B2 (en) System and method for improved searching on the internet or similar networks and especially improved MetaNews and/or improved automatically generated newspapers
US8055675B2 (en) System and method for context based query augmentation
US8645416B2 (en) Searching content in distributed computing networks
KR101084841B1 (en) Dynamic pricing models for digital content
US9710555B2 (en) User profile stitching

Legal Events

Date Code Title Description
C06 Publication
C10 Entry into substantive examination
C14 Grant of patent or utility model
EXPY Termination of patent right or utility model