CN100533434C - Method and apparatus for detecting invalid clicks on the internet search engine - Google Patents

Publication number
CN100533434C
Authority
CN
Grant status
Grant
Patent type
Prior art keywords
method
apparatus
detecting
invalid
engine
Prior art date
Application number
CN 200480007418
Other languages
Chinese (zh)
Other versions
CN1761961A (en)
Inventor
姜锡昊
李宇晟
河定秀
Original Assignee
Nhn株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Grant date

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30861 Retrieval from the Internet, e.g. browsers
    • G06F17/30876 Retrieval from the Internet, e.g. browsers by using information identifiers, e.g. encoding URL in specific indicia, browsing history
    • G06F17/30887 URL specific, e.g. using aliases, detecting broken or misspelled links
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30861 Retrieval from the Internet, e.g. browsers
    • G06F17/30864 Retrieval from the Internet, e.g. browsers by querying, e.g. search engines or meta-search engines, crawling techniques, push systems
    • G06F17/30867 Retrieval from the Internet, e.g. browsers by querying, e.g. search engines or meta-search engines, crawling techniques, push systems with filtering and personalisation

Abstract

The present invention relates to an Internet search engine server. More specifically, it relates to a method and apparatus for detecting invalid clicks on search items included in a search result web page provided by an Internet search engine server. The invention provides a method for detecting invalid clicks in an Internet search engine, comprising the steps of: generating a search result web page in response to a search request from a searcher; obtaining a page identifier corresponding to the generated web page; receiving from the searcher a click on a search item included in the search result web page; obtaining a site identifier corresponding to the clicked search item; and determining that the click is invalid if the page identifier and site identifier match the page identifier and site identifier associated with another click within a predetermined period. The invention thus provides a method and apparatus for detecting invalid clicks that detects various attempts to improperly increase the click count of a search item and handles such attempts immediately.

Description

Method and apparatus for detecting invalid clicks on an Internet search engine

Technical Field

The present invention relates to an Internet search engine server. More specifically, it relates to a method and apparatus for detecting invalid clicks on search items included in a search result web page provided by an Internet search engine server. The invention further relates to a method and apparatus for detecting invalid clicks that can detect various attempts to unfairly increase the click count of a search item and can respond to such attempts immediately.

Background Art

As the Internet has come into ever wider use, the number of information sources accessible via the Internet, such as web pages, has grown in arithmetic progression. To find information among this mass of sources, a searcher accesses an Internet search engine server such as NAVER, Yahoo, or Lycos and requests a search. The Internet search service provider generates a search result web page containing search items that carry information related to the search word entered by the searcher, and then provides the generated page to the searcher. For example, when a searcher accesses the NAVER search engine server and enters the search word "Digital Camera", the search result web page is as shown in FIG. 2. Each item included in the search result web page is associated with a URL (Uniform Resource Locator).

Because the number of search items related to a single search word can be enormous, how and in what order those items are displayed on the search result web page is a very important question for an Internet search service provider. The provider determines the listing order of search items by combining several criteria. One widely used criterion is the number of user clicks on a particular search item. For example, if a search item receives many user clicks, it is displayed in a relatively high position on the search result web page. Even when the provider determines the listing order by combining multiple parameters, if one of those parameters is the user click count, a search item with a high click count is displayed in a relatively high position on the search result web page.

Moreover, the higher an item is displayed on the search result web page generated by the Internet search server, the more likely users are to click it and visit the corresponding web page. A web information provider operating a web server therefore wants the search item associated with his or her own site to be displayed at the top of the search result web page. For this reason, a web information provider may deliberately access the Internet search server and click the search item for his or her own web page many times. Sometimes the provider uses a dedicated program to click that search item continuously. Because such unfair clicks do not reflect genuine user search behavior, the Internet search service provider must detect these invalid clicks.

The prior art also includes services in which the web information provider associated with a search item is charged according to the number of clicks on each search item in the search result web page. The Internet search service provider Overture Services, Inc. (USA) offers such a service: when a searcher clicks a search item associated with a web information provider in the search result web page, that provider pays for each click. In this case, if a searcher deliberately clicks a particular search item many times, the web information provider associated with that search item must pay additional fees. Here too, therefore, it is necessary to detect invalid clicks, that is, clicks intended only to increase the click count without any actual search for the item.

Summary of the Invention

The present invention is provided to solve the above problems of the prior art. One object of the invention is to provide a method and apparatus for detecting invalid clicks on search items included in a search result web page provided by an Internet search engine server.

Another object of the invention is to provide a method and apparatus for detecting invalid clicks that can detect various attempts to improperly increase the click count of a search item and can respond to such attempts immediately.

A further object of the invention is to provide a method and apparatus for detecting invalid clicks in which the identifiers used for detection are difficult to imitate or forge.

To achieve the above objects and solve the above problems of the prior art, the present invention provides a method for detecting invalid clicks in an Internet search engine, comprising the steps of: generating a search result web page in response to a search request from a searcher; obtaining a page identifier corresponding to the generated web page; receiving from the searcher a click on a search item included in the search result web page; obtaining a site identifier corresponding to the clicked search item; and determining that the click is invalid if the page identifier and the site identifier match the page identifier and site identifier associated with another click within a predetermined period.
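The matching step above can be illustrated with a minimal Python sketch. It assumes clicks arrive in time order; the class name `DuplicateClickDetector`, its parameters, and the in-memory dictionary are hypothetical choices for illustration and are not taken from the patent.

```python
class DuplicateClickDetector:
    """Flags a click as invalid when the same (page_id, site_id) pair
    was already clicked within `window_seconds`.

    Hypothetical sketch: the patent does not prescribe this data structure.
    """

    def __init__(self, window_seconds: float):
        self.window_seconds = window_seconds
        # Maps (page_id, site_id) to the timestamp of the most recent click.
        self._last_seen = {}

    def is_invalid(self, page_id: str, site_id: str, timestamp: float) -> bool:
        key = (page_id, site_id)
        previous = self._last_seen.get(key)
        self._last_seen[key] = timestamp
        # Invalid only if an earlier click with the same identifiers
        # falls inside the predetermined period.
        return previous is not None and timestamp - previous <= self.window_seconds


detector = DuplicateClickDetector(window_seconds=60)
first = detector.is_invalid("page-1", "site-A", timestamp=0.0)
repeat = detector.is_invalid("page-1", "site-A", timestamp=10.0)
```

Because only the pairing of two identifiers matters, the same sketch would apply if the first key component were a session identifier, a client IP address, or a terminal identifier instead of a page identifier.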

According to one aspect of the invention, there is provided a method for detecting invalid clicks in an Internet search engine, comprising the steps of: generating a search result web page in response to a search request from a searcher; obtaining a session identifier contained in a session cookie file stored in the searcher's terminal; receiving from the searcher a click on a search item included in the search result web page; obtaining a site identifier corresponding to the clicked search item; and determining that the click is invalid if the session identifier and the site identifier match the session identifier and site identifier associated with another click within a predetermined period.

According to one aspect of the invention, there is provided a method for detecting invalid clicks in an Internet search engine, comprising the steps of: receiving from a searcher a click on a search item included in a search result web page; obtaining a client IP address corresponding to the searcher's terminal; obtaining a site identifier corresponding to the clicked search item; and determining that the click is invalid if the client IP address and the site identifier match the client IP address and site identifier associated with another click within a predetermined period.

According to one aspect of the invention, there is provided a method for detecting invalid clicks in an Internet search engine, comprising the steps of: generating a search result web page in response to a search request from a searcher; obtaining a terminal identifier corresponding to the searcher's terminal; generating a user cookie file containing the terminal identifier and storing the user cookie file in the searcher's terminal; receiving from the searcher a click on a search item included in the search result web page; obtaining a site identifier corresponding to the clicked search item; and determining that the click is invalid if the terminal identifier and the site identifier match the terminal identifier and site identifier associated with another click within a predetermined period.

According to another aspect of the invention, there is provided an apparatus for detecting invalid clicks, wherein, when a searcher clicks a search item included in a search result web page provided by an Internet search engine, at least one of the following is received: the IP address of the searcher's terminal, the network address to which the searcher's terminal belongs, the search word associated with the search result web page, information about the searcher's web browser, the click time associated with the click, cookie file information stored in the searcher's terminal, and URL information associated with the search item; and whether the click is invalid is determined on the basis of a predetermined criterion (reference) applied to the received information.

According to another aspect of the invention, there is provided an apparatus for detecting invalid clicks, comprising: (1) a log storage unit which, in response to a searcher clicking a search item included in a search result web page provided by an Internet search engine, stores a log relating to at least two of the following: the IP address of the searcher's terminal, the network address to which the searcher's terminal belongs, the search word associated with the search result web page, information about the searcher's web browser, the click time associated with the click, cookie file information stored in the searcher's terminal, and URL information associated with the search item; (2) an invalid click pattern storage unit which stores invalid click patterns relating to at least two of the following: the IP address of the searcher's terminal, the network address to which the searcher's terminal belongs, the search word associated with the search result web page, information about the searcher's web browser, the click time associated with the click, cookie file information stored in the searcher's terminal, and URL information associated with the search item; and (3) an invalid click decision unit which determines whether a click is an invalid click on the basis of the logs stored in the log storage unit and the invalid click patterns stored in the invalid click pattern storage unit.

According to another aspect of the invention, there is provided an apparatus for detecting invalid clicks, comprising: click counter means for counting, for the search items included in a search result web page provided by an Internet search engine, the number of searcher clicks on each search item within a predetermined period; average click count calculating means for calculating, within the predetermined period, the average click count of the search items belonging to the category to which a search item belongs; and decision means for determining whether the click count of each search item exceeds the average click count by a predetermined difference.
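The category-average comparison can be sketched as follows. This is an illustrative reading only: the function name `flag_by_category_average`, the dictionary inputs, and the use of a simple arithmetic mean are assumptions, not details given in the patent.

```python
from collections import defaultdict


def flag_by_category_average(clicks, category_of, threshold):
    """Return the search items whose click count exceeds the average
    click count of their category by more than `threshold`.

    clicks:      dict mapping search item -> clicks counted in the period
    category_of: dict mapping search item -> category name (hypothetical)
    """
    totals = defaultdict(int)
    sizes = defaultdict(int)
    for item, count in clicks.items():
        category = category_of[item]
        totals[category] += count
        sizes[category] += 1

    flagged = set()
    for item, count in clicks.items():
        category = category_of[item]
        average = totals[category] / sizes[category]
        if count - average > threshold:
            flagged.add(item)
    return flagged


clicks = {"a": 10, "b": 12, "c": 100, "d": 5}
category_of = {"a": "cameras", "b": "cameras", "c": "cameras", "d": "books"}
suspicious = flag_by_category_average(clicks, category_of, threshold=50)
```

In this example only item "c" stands out: the "cameras" average is about 40.7, so "c" exceeds it by roughly 59 clicks.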

According to another aspect of the invention, there is provided an apparatus for detecting invalid clicks, comprising: click counter means for counting, for the search items included in a search result web page provided by an Internet search engine, the number of searcher clicks on each search item within a predetermined period; average click count calculating means for calculating, within the predetermined period, the average click count of a predetermined first number of search items positioned above a search item and a predetermined second number of search items positioned below it in the search result web page; and decision means for determining whether the click count of each search item exceeds the average click count by a predetermined difference.
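One possible reading of this aspect compares each search item with its neighbors in the listing order. The sketch below follows that reading; the function name `flag_by_neighbor_average`, the list representation, and the choice to exclude the item itself from the average are all assumptions made for illustration.

```python
def flag_by_neighbor_average(ranked_clicks, above, below, threshold):
    """Return the positions whose click count exceeds the mean click count
    of the `above` items listed before them and the `below` items listed
    after them by more than `threshold`.

    ranked_clicks: click counts in display order (index 0 is the top item)
    """
    flagged = []
    for i, count in enumerate(ranked_clicks):
        neighbors = (
            ranked_clicks[max(0, i - above):i]
            + ranked_clicks[i + 1:i + 1 + below]
        )
        if not neighbors:
            continue  # a single-item page has nothing to compare against
        average = sum(neighbors) / len(neighbors)
        if count - average > threshold:
            flagged.append(i)
    return flagged


positions = flag_by_neighbor_average([50, 40, 200, 30, 20], above=2, below=2, threshold=100)
```

Here the third item (index 2) has 200 clicks while its four neighbors average 35, so it is the only position flagged.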

Invalid clicks are difficult to define precisely, and the scope of what counts as an invalid click should be defined differently depending on the embodiment and application. In general, however, an invalid click may refer to a click made only to increase the click count, without the purpose of an actual search.

Brief Description of the Drawings

FIG. 1 is a schematic diagram illustrating the network connection between a client terminal and an Internet search server that includes an apparatus for detecting invalid clicks according to the present invention.

FIG. 2 is a schematic diagram illustrating a search result web page generated by an Internet search engine.

FIG. 3 is a block diagram illustrating the structure of an apparatus for detecting invalid clicks according to an embodiment of the present invention.

FIG. 4 is a flowchart of a method for detecting invalid clicks according to an embodiment of the present invention.

FIG. 5 shows an example log file according to an embodiment of the present invention.

FIGS. 6a and 6b are a flowchart of a method for detecting invalid clicks according to an embodiment of the present invention.

FIG. 7 shows an example log file according to an embodiment of the present invention.

FIG. 8 is a flowchart of a method for generating a session identifier according to an embodiment of the present invention.

FIG. 9 is a flowchart of a method for detecting invalid clicks according to an embodiment of the present invention.

FIG. 10 shows an example log file according to an embodiment of the present invention.

FIG. 11 is a flowchart of a method for detecting invalid clicks according to an embodiment of the present invention.

FIG. 12 is a block diagram illustrating the structure of a general-purpose computer system that can be used to implement a search engine server and an apparatus for detecting invalid clicks according to the present invention.

Detailed Description

Preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings. FIG. 1 is a schematic diagram illustrating the network connection between a client terminal and an Internet search server that includes an apparatus for detecting invalid clicks according to the present invention.

A searcher attempting unfair clicks, referred to here as a cheater, accesses the Internet search server 104 through a client terminal 101 connected to the Internet 103. The cheater increases the click count by repeatedly clicking a search item in the search result web page provided by the Internet search server 104. For example, in FIG. 2, suppose that search item 202 is associated with http://www.invalidclick.com and that the cheater keeps clicking search item 202 so that it is displayed at the top of the search result web page.

The cookie file 102 is a special text file that the search engine server 104 or another website stores on the hard disk of the client terminal 101 when the client terminal 101 connects to it. In the HTTP protocol used to connect to websites, each request for a web page is independent of every other request. A web server therefore has no information about which pages it has previously sent to the client terminal 101 or what operations the client terminal 101 has previously performed. Cookie files are provided to relate these independently processed requests to one another. The cookie mechanism allows a web server to store user information on the user's computer. Several cookie files may be used to detect invalid clicks in the present invention, as described in detail later.

The log file 105 is a file for storing logs related to users' click patterns. The present invention uses several parameters to detect invalid clicks. After the parameters associated with each click have been stored in the log file, whether an incoming click is invalid is determined on the basis of predetermined rules and patterns.

Examples of log files according to embodiments of the present invention are shown in FIGS. 5, 7 and 10. FIG. 3 is a block diagram illustrating the structure of an apparatus for detecting invalid clicks according to an embodiment of the present invention.

The apparatus 301 for detecting invalid clicks according to an embodiment of the present invention includes a parameter input unit 304, a log storage unit 305, an invalid click pattern storage unit 306, an invalid click verification unit 307, an invalid click report unit 308, and an invalid click decision unit 309.

When a searcher clicks a search item included in a search result web page provided by an Internet search engine, several parameters 302 related to the click are input to the parameter input unit 304. These parameters are the basic information used to identify invalid clicks. They include the IP address of the searcher's terminal, the network address to which the searcher's terminal belongs, the search word associated with the search result web page, information about the searcher's web browser, the click time associated with the click, cookie file information stored in the searcher's terminal, URL information associated with the search item, and so on.

When a searcher requests a search from the Internet search engine server 104, a search request packet is transmitted from the client terminal 101 to the Internet search engine server 104. The search request packet is structured according to the HTTP protocol and is carried inside an Internet Protocol (IP) packet. Because the structure of an IP packet includes a source IP address field, the Internet search engine server 104 extracts the source IP address from the request packet associated with the click and thereby obtains the IP address of the searcher's terminal.

The leading part of the source IP address is the network address to which the searcher's terminal belongs. An IP address consists of 4 bytes. Its leading part is a network address that identifies the network to which the searcher's terminal belongs, and the remainder is an address that identifies the searcher's terminal within that network. The network address is therefore extracted from the source IP address. According to an embodiment of the present invention, the first 3 bytes of the IP address are treated as the network address and are obtained from the source IP address. For example, if the source IP address is 123.45.67.89, then 123.45.67 is extracted as the network address.
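The three-byte rule in this embodiment can be sketched in a few lines of Python. The function name `network_address` and the `prefix_bytes` parameter are hypothetical; the standard-library `ipaddress` module is used only to validate that the input is a well-formed IPv4 address.

```python
import ipaddress


def network_address(source_ip: str, prefix_bytes: int = 3) -> str:
    """Return the leading `prefix_bytes` bytes of an IPv4 address,
    which this embodiment treats as the network address."""
    # IPv4Address raises ValueError for malformed input.
    octets = str(ipaddress.IPv4Address(source_ip)).split(".")
    return ".".join(octets[:prefix_bytes])


prefix = network_address("123.45.67.89")
```

For the address used in the text, `prefix` is `"123.45.67"`.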

The search word associated with the search result web page is the value that the searcher entered into the Internet search server 104.

The information about the searcher's web browser is information on the web browser that is installed on the searcher's client terminal 101 and used to access the Internet search server 104. It includes the type of web browser, the version of the web browser, the product ID of the web browser, and so on. In particular, even when several searchers use web browsers of the same type and version, the product IDs of their browsers may differ. The product ID is therefore useful information for identifying a searcher's terminal.

Under the HTTP protocol used to connect to the web, some of the client's environment parameters are included in the HTTP packet transmitted to the web server. A program on the web server (the search engine program) can receive these environment parameters and use them to detect invalid clicks.

These environment parameters include the following information:

REMOTE_HOST: domain name of the connecting client

REMOTE_ADDR: IP address of the connecting client host

REMOTE_USER: name of the connecting user (shown when the web server is configured for user authentication)

REMOTE_USER: ID of the connecting user (shown when the web server is configured for user authentication)

HTTP_USER_AGENT: identifying information for the program run by the connecting user, generally the name of the browser

HTTP_ACCEPT_LANGUAGE: language used by the connecting user

HTTP_REFERER: name of the document that called the corresponding CGI program

REQUEST_METHOD: method used to transmit data to the server (GET, POST)

QUERY_STRING: parameter in which the transmitted data is stored when data is sent in GET mode

CONTENT_LENGTH: total length (in bytes) of the transmitted data when data is sent in POST mode

CONTENT_TYPE: MIME type of the data when data is sent in POST mode

AUTH_TYPE: parameter used to verify user authorization

SERVER_NAME: domain name of the current server

SERVER_SOFTWARE: name of the web server program currently installed on the server

SERVER_PROTOCOL: name and version of the network protocol currently used by the server

SERVER_PORT: port number currently used by the server (generally 80 for HTTP)

PATH_INFO: information on the current path of the called CGI program

PATH_TRANSLATED: information on the path, on the web server, of the resource currently requested over the network

SCRIPT_NAME: name of the CGI program currently being called

HTTP_ACCEPT: types of resources that can currently be received over HTTP

The click time associated with a searcher's click is the time at which the click input from the searcher is received. According to another embodiment of the present invention, other times related to the searcher's click may be used. For example, the time at which the searcher actually entered the click on the client may be used.
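A server-side program running under a CGI-style interface typically sees these environment parameters as process environment variables. The sketch below collects a few of them for a click record; the selection of keys in `CLICK_ENV_KEYS` and the function name `collect_click_parameters` are illustrative assumptions, and the patent does not specify which parameters are actually logged.

```python
import os

# Hypothetical subset of the environment parameters listed above.
CLICK_ENV_KEYS = ("REMOTE_ADDR", "HTTP_USER_AGENT", "HTTP_REFERER", "QUERY_STRING")


def collect_click_parameters(environ=None):
    """Return the selected environment parameters as a dict.

    `environ` defaults to the process environment, which is where a
    CGI program receives these values; missing keys become "".
    """
    environ = os.environ if environ is None else environ
    return {key: environ.get(key, "") for key in CLICK_ENV_KEYS}


sample = collect_click_parameters({
    "REMOTE_ADDR": "123.45.67.89",
    "HTTP_USER_AGENT": "ExampleBrowser/1.0",
    "QUERY_STRING": "query=Digital+Camera",
})
```

Passing a plain dictionary, as in the example, makes the function easy to exercise without a web server; the browser name and query string shown are made-up sample values.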

The cookie file information stored in the searcher's terminal is obtained by the Internet search server 104, which accesses the cookie file 102 stored in the client terminal 101. In the present invention, the cookie file 102 can be used for several purposes, which are described in detail with reference to other embodiments.

The URL information associated with the search item clicked by the searcher can be obtained by consulting the search database, because it is stored in a search database (not shown) associated with the search engine server 104. The URL information may be the domain name of a web server, or it may include the domain name, directory, and file name. For example, http://www.naver.com and http://www.naver.com/download are the same in terms of domain name, since both have the domain name www.naver.com, but they are different URLs. In this specification, embodiments that use the URL only down to the domain name are described for ease of explanation. However, the invention also covers all embodiments in which URLs with the same domain name but different directories are treated as different search items, because the URL is taken to include the domain name, directory, and file name in full. It should further be understood that, in the present invention, URL information encompasses all of the forms described in this specification.
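The two granularities of URL information described here (domain name only, or domain name plus directory and file name) can be illustrated with the standard-library `urllib.parse` module. The function name `site_identifier` and the `domain_only` flag are hypothetical, and how the patent's site identifier is actually derived is not specified in this passage.

```python
from urllib.parse import urlsplit


def site_identifier(url: str, domain_only: bool = True) -> str:
    """Derive a comparison key from a search item's URL.

    domain_only=True  -> compare by domain name alone
    domain_only=False -> compare by domain name plus path
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if domain_only:
        return host
    # Strip a trailing slash so "/download/" and "/download" compare equal.
    return host + parts.path.rstrip("/")


same_domain = site_identifier("http://www.naver.com") == site_identifier("http://www.naver.com/download")
same_url = (
    site_identifier("http://www.naver.com", domain_only=False)
    == site_identifier("http://www.naver.com/download", domain_only=False)
)
```

With the two URLs from the text, `same_domain` is true and `same_url` is false, which mirrors the distinction the paragraph draws.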

In addition to the parameters described above, other parameters usable for detecting invalid clicks may also be employed within the spirit of the present invention.

The parameters 302 of the kinds described above are input to the parameter input unit 304. These parameters are in turn stored in the log storage unit 305. Examples of logs stored in the log storage unit according to the present invention are shown in FIGS. 5, 7 and 10. In these figures, logs containing only some of the parameters are shown for purposes of explanation. However, according to another embodiment of the present invention, logs containing all or some of the parameters 302 may be stored in the log storage unit 305.

According to one embodiment of the present invention, the log storage unit 305 stores logs regarding at least two of the following: the IP address of the searcher's terminal, the network address to which the searcher's terminal belongs, the search word associated with the search result web page, information about the searcher's web browser, the click time associated with the click, the cookie file information stored in the searcher's terminal, and the URL information associated with the search item. According to a preferred embodiment of the present invention, the log storage unit 305 stores a log regarding at least one of the same items.

The invalid click pattern storage unit 306 stores invalid click patterns or rules relating to a combination of at least two of the following: the IP address of the searcher's terminal, the network address to which the searcher's terminal belongs, the search word associated with the search result web page, information about the searcher's web browser, the click time associated with the click, the cookie file information stored in the searcher's terminal, and the URL information associated with the search item. For example, the invalid click pattern storage unit 306 may store a rule or pattern under which click inputs made within 10 minutes share the same searcher-terminal IP address and the same search-item URL information. The rules and the like used to determine invalid clicks may be stored in the invalid click pattern storage unit 306 as a file written in a predetermined language according to predetermined rules. Alternatively, such a rule or pattern may be stored in the form of a program that determines whether a click is invalid.

The invalid click decision unit 309 determines whether a searcher's click is invalid, based on the logs stored in the log storage unit 305 and the invalid click patterns stored in the invalid click pattern storage unit 306.

The invalid click report unit 308 reports to the administrator 303 of the Internet search engine those clicks which, among the clicks determined to be invalid by the invalid click decision unit 309, meet a predetermined criterion. According to one embodiment of the present invention, the invalid click report unit 308 reports to the administrator all clicks determined to be invalid by the invalid click decision unit 309. In this case, the predetermined criterion covers every click that the invalid click decision unit 309 has determined to be invalid. According to another embodiment of the present invention, each rule or pattern stored in the invalid click pattern storage unit 306 carries a field indicating whether cases matching that rule or pattern are to be reported to the administrator 303. In this case, the invalid click report unit 308 reports to the administrator 303 only the cases that match a rule of which the administrator 303 must be notified.

The invalid click verification unit 307 allows the administrator 303 to change a click that the invalid click decision unit 309 has determined to be invalid into a valid click. Because the invalid click verification unit 307 can restore clicks that were mistakenly judged invalid, invalid clicks can be determined more accurately.

FIG. 4 is a flowchart of a method for detecting invalid clicks according to an embodiment of the present invention.

The Internet search server 104 receives a search request from a searcher (step 401). When the searcher accesses the Internet search server 104 and enters a search word, the search word is transmitted to the Internet search server 104 as a search request packet.

The Internet search server 104 generates a search result web page in response to the search request (step 402). For example, as shown in FIG. 2, a search result web page containing a plurality of search items corresponding to the search word entered by the searcher is provided to the searcher.

A page identifier corresponding to the generated search result web page is acquired (step 403). A page identifier is generated each time a search result web page is generated. The page identifier is an identifier used to identify a search result web page. Therefore, if the same searcher repeatedly enters the same search word in the search window of the Internet search server 104, a new page identifier is assigned each time. Likewise, if the searcher clicks "reload" in the web browser displaying the search result web page, the Internet search server 104 assigns a new page identifier to the search result web page, because a search request packet is transmitted from the client terminal 101 to the Internet search server 104. It is therefore possible for different page identifiers to be assigned to search result web pages that look identical at first glance. However, whenever a new search request is received from the client terminal 101, the search result web page is regenerated at that time, so a search result web page different from the previous one may be provided.

In step 404, the Internet search server 104 receives from the searcher a click on a search item included in the search result web page. To receive the click, the hyperlink for the search item is arranged to connect first to the Internet search server 104, which performs the necessary processing and then lets the client terminal access the website corresponding to the search item. For example, where http://www.naver.com/abc^http://www.invalidclick.com/ is prepared as the hyperlink of the search item corresponding to "http://www.invalidclick.com/", a searcher who clicks the search item is first directed to the search server at http://www.naver.com. The search server then lets the client terminal access http://www.invalidclick.com according to the URL in the latter part of the hyperlink.

The Internet search server 104 acquires a site identifier corresponding to the clicked search item (step 405). The site identifier is an identifier used to identify a search item, and it is generated based on the URL information corresponding to the search item. According to another embodiment of the present invention, the original URL information corresponding to the search item is used as the site identifier. The URL information that serves as the basis for generating the site identifier may be the domain name of a web server, or information including the domain name, directory, and file name. For example, http://www.naver.com and http://www.naver.com/download are the same from the standpoint of the domain name, since both are www.naver.com, but they are not the same from the standpoint of the URL. In the present invention, an embodiment that uses the URL only up to the domain name has been described for convenience of explanation. However, the present invention covers all embodiments in which URLs that share a domain name but have different directories are regarded as different search items, because the URLs include not only the domain name but also the directory and file name. It should further be understood that, in the present invention, URL information covers all of the embodiments described in this specification.
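The two readings of URL information described above (domain name only, or domain name plus directory and file name) can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `site_identifier` and the `domain_only` flag are hypothetical, and using the normalized URL text directly as the identifier is an assumption.

```python
from urllib.parse import urlsplit


def site_identifier(url: str, domain_only: bool = True) -> str:
    """Derive an illustrative site identifier from a search item's URL.

    domain_only=True  -> compare URLs up to the domain name only.
    domain_only=False -> the directory and file name also count.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if domain_only:
        return host
    # Keep the path so that different directories yield different identifiers.
    return host + parts.path.rstrip("/")


# Same identifier when only the domain name is considered.
print(site_identifier("http://www.naver.com"))
print(site_identifier("http://www.naver.com/download"))
# Different identifiers when the full URL is considered.
print(site_identifier("http://www.naver.com/download", domain_only=False))
```

With `domain_only=True` both example URLs collapse to the same identifier, while the full-URL variant keeps them apart, matching the two families of embodiments described in the text.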

In step 406, the apparatus for detecting invalid clicks determines that the click is invalid if its page identifier and site identifier match the page identifier and site identifier associated with another click made within a predetermined period.
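The comparison in step 406 can be sketched as a scan over chronologically ordered click logs. This is a sketch under stated assumptions: the `Click` record, the function name, and the 600-second window are hypothetical, and the choice to flag only the repeats (leaving the first click of a matching group unflagged) is one possible policy rather than something the source fixes.

```python
from typing import NamedTuple


class Click(NamedTuple):
    time: float    # click time in seconds
    page_id: str   # identifier of the search result web page
    site_id: str   # identifier of the clicked search item


def flag_repeated_clicks(clicks, window=600.0):
    """Return one boolean per click, True when an earlier click with the
    same (page_id, site_id) pair occurred within `window` seconds.

    `clicks` is assumed to be in chronological order.
    """
    last_time = {}  # (page_id, site_id) -> time of the most recent click
    flags = []
    for c in clicks:
        key = (c.page_id, c.site_id)
        prev = last_time.get(key)
        flags.append(prev is not None and c.time - prev <= window)
        last_time[key] = c.time
    return flags


log = [
    Click(0, "nCe249sisnO", "siteA"),
    Click(4, "nCe249sisnO", "siteA"),   # repeat on the same page
    Click(9, "nCe249sisnO", "siteB"),   # different search item
    Click(12, "nCe249sisnO", "siteA"),  # repeat on the same page
    Click(20, "newPageId", "siteA"),    # after "reload": new page identifier
]
print(flag_repeated_clicks(log))
```

Because a reload yields a new page identifier, the last click is not flagged, which is the limitation that the session-based embodiment addresses.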

FIG. 5 shows an example log file according to an embodiment of the present invention. The embodiment of FIG. 4 will be described with reference to FIG. 5.

According to the present invention, each time a click on a search item is received from a user, a page identifier 509 and a site identifier 510 are stored in the log file 500. Reference numerals 501 to 508 denote the logs stored for the individual click inputs.

A cheater accesses the Internet search server 104 and requests a search. The Internet search server 104 generates a search result web page and generates the page identifier "nCe249sisnO" corresponding to that search result web page. The cheater repeatedly clicks a specific search item included in the search result web page. Even when a specific search item in an already generated search result web page is clicked repeatedly, the page identifier is not regenerated. The page identifier therefore keeps the same value.

Accordingly, among the click input logs within the predetermined period, log 501, log 502 and log 504, which have the same page identifier and the same site identifier, are determined to be invalid clicks. According to one embodiment of the present invention, when one of the matching logs is determined to be an invalid click, the remaining matching logs are also determined to be invalid clicks.

The cheater may refresh the search result web page by clicking "reload" in the web browser. In this case, the page identifier is newly assigned, and the log carrying the new page identifier is log 505. The case where the cheater thereafter clicks the same search item corresponds to log 506.

Therefore, according to this embodiment, if the cheater clicks "reload" and then clicks the same search item (the case of log 506), the click is not determined to be an invalid click. A method that also determines such a "reload" case to be an invalid click is described in the following embodiment with reference to FIG. 6.

FIGS. 6a and 6b are a flowchart of a method for detecting invalid clicks according to an embodiment of the present invention.

The Internet search server 104 receives a search request from a searcher (step 601). The Internet search server 104 generates a search result web page in response to the search request (step 602).

The apparatus for determining invalid clicks determines whether a session cookie file is stored in the client terminal 101 that requested the search (step 603). Steps 603 to 611 are performed in order to obtain a session identifier.

If it is determined that no session cookie file is stored in the client terminal 101, the apparatus for determining invalid clicks generates a new session identifier (step 604). In step 605, a session cookie file containing the session identifier is stored in the client terminal 101. The update time of the session identifier is also stored in the session cookie file (step 609).

If it is determined in step 603 that a session cookie file is stored in the client terminal 101, the apparatus for determining invalid clicks determines whether the last update time of the session identifier contained in the session cookie file is within a predetermined period (step 606).

If the result of the determination in step 606 is that the last update time of the session identifier contained in the session cookie file is within the predetermined period, the apparatus for determining invalid clicks extracts the session identifier contained in the session cookie file (step 607).

If the result of the determination in step 606 is that the last update time of the session identifier contained in the session cookie file is not within the predetermined period, the apparatus for determining invalid clicks generates a new session identifier (step 608). The session identifier contained in the session cookie file is replaced with the newly created session identifier (step 610). The update time of the session identifier is stored in the session cookie file (step 611).
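Steps 603 to 611 amount to a small decision procedure over the session cookie. The sketch below is illustrative only: the dictionary layout of the cookie, the function name `resolve_session`, the 5-second period, and the use of `secrets.token_hex` as an identifier generator are assumptions, not details from the source.

```python
import secrets


def resolve_session(cookie, now, period=5.0):
    """Return (session_id, cookie) for the current request.

    `cookie` is None when no session cookie file is stored on the client,
    otherwise a dict with keys 'session_id' and 'updated' (seconds).
    """
    if cookie is not None and now - cookie["updated"] <= period:
        # Steps 606-607: the stored identifier is still fresh, so reuse it.
        # The stored update time is left unchanged in this embodiment.
        return cookie["session_id"], cookie
    # Steps 604-605 and 609, or 608 and 610-611: issue a new identifier
    # and record its update time in the cookie.
    new_cookie = {"session_id": secrets.token_hex(8), "updated": now}
    return new_cookie["session_id"], new_cookie


sid1, cookie = resolve_session(None, now=14.0)    # first request, no cookie
sid2, cookie = resolve_session(cookie, now=18.0)  # 4 s later: same session
sid3, cookie = resolve_session(cookie, now=31.0)  # 17 s later: new session
print(sid1 == sid2, sid2 == sid3)
```

Note that in this variant the update time is only written when a new identifier is issued, so the period is measured from the creation of the session rather than from the most recent click.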

The Internet search server 104 receives from the searcher a click on a search item included in the search result web page (step 612).

The Internet search server 104 acquires a site identifier corresponding to the clicked search item (step 613).

If the session identifier and the site identifier match the session identifier and site identifier associated with another click made within a predetermined period, the apparatus for detecting invalid clicks determines that the click is an invalid click (step 614).

FIG. 7 shows an example log file according to an embodiment of the present invention.

In this embodiment, each time a click on a search item is received from a user, a click time 710, a session identifier update time 711, a session identifier 712 and a site identifier 713 are stored in the log file 700. Reference numerals 701 to 708 denote the logs stored for the individual click inputs.

A cheater accesses the Internet search server 104 and makes a search request. The Internet search server 104 generates a search result web page. The Internet search server 104 then receives a click on a search item included in the search result web page.

The Internet search server 104 determines whether a session cookie file is stored in the client terminal 101. If it is determined that no session cookie file is stored in the client terminal 101, the Internet search server 104 generates a new session identifier and stores a session cookie file containing the session identifier and its update time in the client terminal 101. In this embodiment, the session identifier "xigw9492" and the update time "10:50:14" are recorded. In addition, the click time, the update time, the session identifier and the site identifier corresponding to the search item are stored in the log file 700 as log 701. Where the session cookie file is created for the first time, it is created at the moment the click occurs and the session identifier is generated. The click time and the session identifier update time are therefore the same.

The cheater clicks the same search item in the same search result page. The Internet search server 104 determines whether a session cookie file is stored in the client terminal 101. Because the session cookie file created above is already stored in the client terminal 101, the Internet search server 104 accesses the session cookie file stored in the client terminal 101. The session cookie file holds a session identifier and the last update time of the session identifier. In this embodiment, the session identifier "xigw9492" and the update time "10:50:14" are stored in the session cookie file.

The Internet search server 104 determines whether the click time of the search item click from the searcher is within a predetermined period starting from the last update time associated with the session identifier. In this embodiment, the click time of the second click is "10:50:18". If the predetermined period is 5 seconds, the click time "10:50:18" falls within the predetermined period starting from the last update time "10:50:14". In this case, the session identifier stored in the session cookie file is used as the current session identifier, and the session identifier in the session cookie file is not updated. A log such as log 702 is therefore recorded.

Accordingly, log 702 is determined to be an invalid click, because it has the same session identifier and site identifier as log 701.

Log 704 corresponds to a case where the cheater requests "reload". When the cheater requests "reload", the session cookie file stored in the client terminal 101 is consulted, and the session identifier is not updated because the last update time stored in the session cookie file is within the predetermined period. A log such as log 704 is therefore recorded. Because it matches log 701, log 704 is determined to be an invalid click. That is, according to this embodiment, it is possible to detect the case where a cheater clicks the same search item shortly after clicking "reload".

Log 705 corresponds to a case where a click on the same search item is received from a searcher different from the one associated with log 701, log 702 and log 704. In this case, because a new session identifier is assigned, the click is not determined to be an invalid click.

Log 709 corresponds to a case where the same searcher as in log 701 clicks the same search item after a considerable time. In this case, because the click is received only after a considerable time has passed, it is not determined to be an invalid click.

According to this embodiment, where a cheater clicks the same search item after the predetermined period has passed, a new session identifier is generated, so the click is not determined to be an invalid click.

Likewise, according to another embodiment of the present invention, the invalid click decision may treat as a possible invalid click any click made within a predetermined period starting from the last click time on the same search item. This will be described briefly.

When a click is received from a searcher, it is determined whether a session cookie file is stored in the terminal. If it is determined that a session cookie file is stored in the terminal, it is determined whether the click time of the search item click from the searcher is within a predetermined period starting from the last click time associated with the session identifier.

If the click time of the search item click is within the predetermined period, the session identifier contained in the session cookie file is acquired, and the last click time is updated to the click time of the search item click.

If the click time of the search item click is not within the predetermined period, a new session identifier is generated to replace the session identifier contained in the session cookie file. In addition, the last click time is updated to the click time of the search item click.

For example, in FIG. 7, where there are several clicks on the same search item from the same client terminal, if a click made more than 5 seconds after the last click is regarded as valid, the click associated with log 704 is determined to be valid, because it was made at "10:50:31", 13 seconds after the previous last click time "10:50:18".
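The variant that measures the period from the last click time, rather than from the session identifier's update time, can be sketched as follows. The cookie layout, the function name `resolve_session_sliding`, and the identifier generator are hypothetical; the 5-second period and the click times 10:50:14, 10:50:18 and 10:50:31 (expressed here as seconds 14, 18 and 31) come from the FIG. 7 example.

```python
import secrets


def resolve_session_sliding(cookie, click_time, period=5.0):
    """Return (session_id, cookie); the last click time is refreshed on
    every click, so the period slides forward with each click."""
    if cookie is not None and click_time - cookie["last_click"] <= period:
        session_id = cookie["session_id"]   # within the period: reuse
    else:
        session_id = secrets.token_hex(8)   # too late or no cookie: new id
    # In both branches the last click time is updated to this click.
    return session_id, {"session_id": session_id, "last_click": click_time}


cookie = None
ids = []
for t in (14.0, 18.0, 31.0):
    sid, cookie = resolve_session_sliding(cookie, t)
    ids.append(sid)

# The click at 18 s shares the session of the click at 14 s; the click at
# 31 s arrives 13 s after the previous one and receives a new session.
print(ids[0] == ids[1], ids[1] == ids[2])
```

A click that reuses the session identifier will match an earlier log on session and site identifier, while a click that receives a fresh identifier will not, which reproduces the valid outcome for the 10:50:31 click.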

According to a preferred embodiment of the present invention, the time reference is chosen according to the purpose for which invalid clicks are being detected.

FIG. 8 is a flowchart of a method for generating a session identifier according to an embodiment of the present invention. A session identifier must be uniquely assigned so that it can be distinguished from other session identifiers, and it must be difficult to imitate or forge. If session identifiers were merely unique, a cheater could generate a session identifier himself and store it in the session cookie, or could fraudulently inflate the click count with a program that repeatedly clicks a search item while changing the session identifier.

Source data 801 is the basic data used to generate the session identifier 805. The source data may be current time information, the search word, the product ID of the searcher's web browser, and the like. The source data may also be a randomly selected number. A hash function 802 is applied to the source data 801 to produce an encoded string 803. A checksum is then appended to the encoded string 803 to produce the session identifier 805. The checksum serves to prevent a cheater from forging a session identifier.
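The hash-then-checksum construction of FIG. 8 can be illustrated with Python's standard library. This is a sketch under assumptions, not the scheme the patent specifies: the source names neither the hash function nor the checksum, so SHA-256 and a truncated keyed HMAC are swapped in here, and `SECRET_KEY`, the field widths, and both function names are hypothetical.

```python
import hashlib
import hmac

SECRET_KEY = b"server-side-secret"  # hypothetical key known only to the server


def make_identifier(source_data: str) -> str:
    """Hash the source data, then append a checksum over the hash."""
    # Encoded string: first 16 hex characters of a SHA-256 digest.
    encoded = hashlib.sha256(source_data.encode("utf-8")).hexdigest()[:16]
    # Checksum: 8 hex characters of a keyed digest over the encoded string.
    checksum = hmac.new(SECRET_KEY, encoded.encode("ascii"),
                        hashlib.sha256).hexdigest()[:8]
    return encoded + checksum


def is_wellformed(identifier: str) -> bool:
    """Recompute the checksum and compare it with the appended one."""
    encoded, checksum = identifier[:16], identifier[16:]
    expected = hmac.new(SECRET_KEY, encoded.encode("ascii"),
                        hashlib.sha256).hexdigest()[:8]
    return hmac.compare_digest(checksum, expected)


sid = make_identifier("10:50:14|flowers|browser-product-id")
print(len(sid), is_wellformed(sid))
```

A keyed checksum is chosen in this sketch because an unkeyed one could be recomputed by anyone who learns the format; whether the original design uses a key is not stated in the source.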

The method for generating a session identifier according to this embodiment can also be applied to generate the page identifier, the site identifier, the terminal identifier described later, and the like.

FIG. 9 is a flowchart of a method for detecting invalid clicks according to an embodiment of the present invention.

The Internet search server 104 receives from a searcher a click on a search item included in the search result web page (step 901). The Internet search server 104 acquires the client IP address of the searcher's terminal 101 (step 902). The client IP address can be extracted from the source IP address field of the received IP packet.

The Internet search server 104 acquires a site identifier corresponding to the clicked search item (step 903).

In step 904, the apparatus for detecting invalid clicks determines that the click is invalid if its client IP address and site identifier match the client IP address and site identifier associated with another click made within a predetermined period.

FIG. 10 shows an example log file according to an embodiment of the present invention.

In this embodiment, each time a click on a search item is received from a user, a click time 1010, a client IP address 1011 and a site identifier 1012 are stored in the log file 1000. Reference numerals 1001 to 1009 denote the logs stored for the individual click inputs.

When the same client terminal repeatedly clicks the same search item, the clicks are very likely to be invalid if they are repeated within a predetermined period. It is often the case, however, that the user of the same client terminal clicks the same search item again after a considerable time. In other words, users tend to return to websites that interest them. If a user repeatedly accesses a website within a short time, on the other hand, it is hard to regard the access as an ordinary click, and such a case is determined to be an invalid click. For example, if the time criterion is 5 minutes, log 1002, log 1004 and log 1005, which have the same client IP address and the same site identifier as log 1001, are determined to be invalid clicks. The click associated with log 1009, which was made about 20 minutes later, is determined to be a valid click.

When invalid clicks are determined on the basis of the client IP address, some caution is required. Where client terminals use a proxy server or an IP gateway, there is a risk that a click is determined to be invalid merely because another client terminal behind the same proxy server or gateway clicked the same search item. This embodiment is therefore preferably combined with an embodiment that uses other parameters, such as the session identifier.

Conversely, there are cases where the client IP addresses of the client terminals clicking the same search item differ while their network addresses are the same. This corresponds to a case where several people, or a program, keep fraudulently clicking the same search item while changing the source IP address. In this case, if the network addresses of the client terminals clicking the same search item are the same and other conditions are also satisfied (for example, the number of clicks exceeds the average number of clicks within the category to which the search item belongs), the click may be determined to be an invalid click.
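Comparing network addresses instead of full client IP addresses can be sketched with Python's `ipaddress` module. The source does not say how the network address is derived, so the fixed /24 prefix below is purely an assumption, as are the function names and the documentation-range example addresses.

```python
import ipaddress


def network_address(ip: str, prefix_len: int = 24) -> str:
    """Return the network address of `ip` for an assumed prefix length."""
    # strict=False lets a host address be given and masks the host bits.
    net = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
    return str(net.network_address)


def same_network(ip_a: str, ip_b: str, prefix_len: int = 24) -> bool:
    """True when two client IP addresses share a network address."""
    return network_address(ip_a, prefix_len) == network_address(ip_b, prefix_len)


# Two different client IP addresses on the same assumed /24 network.
print(same_network("192.0.2.17", "192.0.2.200"))
# A client on another network.
print(same_network("192.0.2.17", "198.51.100.5"))
```

As the text notes, a shared network address alone is weak evidence, so a match here would only be one input to a rule that also checks a further condition such as an above-average click count.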

FIG. 11 is a flowchart of a method for detecting invalid clicks according to an embodiment of the present invention.

The Internet search server 104 receives a search request from a searcher (step 1101) and generates a search result web page (step 1102).

因特网搜索服务器104确定包括终端标识符在内的用户cookie文件是否被存储在终端中(步骤1103)。 The Internet search server 104 determines whether the terminal identifier comprises a user cookie file is stored in the terminal (step 1103).

由于步骤1103中的确定结果,如果包括终端标识符在内的用户cookie 文件没有被存储在终端中,则因特网搜索服务器104产生一个终端标识符(步骤1104)。 Since the result of the determination in step 1103, if the terminal identifier comprises a user cookie file is not stored in the terminal, the Internet search server 104 generates a terminal identifier (step 1104).

因特网搜索服务器104产生包括终端标识符在内的用户cookie文件并把它存储在搜索器终端中(步骤1105)。 The Internet search server 104 generates the user cookie file including the terminal identifier and stores it in the searcher's terminal (step 1105).

由于步骤1103中的确定结果,如果包括终端标识符在内的用户cookie 文件被存储在终端中,则因特网搜索服务器104从用户cookie文件中提取终端标识符(步骤1106)。 Since the result of the determination in step 1103, if the terminal identifier comprises a user cookie file is stored in the terminal, the Internet search server 104 extracts the terminal identifier from the user cookie file (step 1106).

因特网搜索服务器104从搜索器接收包括在搜索结果网页内的搜索项的点击(步骤1107),然后获取一个对应于被点击搜索项的站点标识符(步骤1108)。 The Internet search server 104 receives a click from a searcher comprises (step 1107) search item in the search results page, and obtaining a corresponding to the clicked search item site identifier (step 1108).

Finally, in step 1109, the device for determining invalid clicks determines that the click is invalid if the terminal identifier and the site identifier match the terminal identifier and site identifier associated with another click within a predetermined period.
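The flow of steps 1103 to 1109 might be sketched as follows. This is a minimal in-memory illustration, not the patented implementation: the class name `InvalidClickDetector`, the cookie key `terminal_id`, the use of a random UUID as terminal identifier, and the 300-second window are all assumptions, and a plain dictionary stands in for the cookie file and the log file.

```python
import time
import uuid

WINDOW_SECONDS = 300  # assumed length of the "predetermined period"


class InvalidClickDetector:
    def __init__(self, window: float = WINDOW_SECONDS):
        self.window = window
        # (terminal_id, site_id) -> time of the most recent click
        self.last_click = {}

    def terminal_id_from_cookies(self, cookies: dict) -> str:
        """Steps 1103-1106: reuse the stored terminal identifier,
        or generate one and store it in the cookie."""
        if "terminal_id" not in cookies:
            cookies["terminal_id"] = uuid.uuid4().hex
        return cookies["terminal_id"]

    def register_click(self, cookies: dict, site_id: str, now=None) -> bool:
        """Steps 1107-1109: return True when the click is judged invalid."""
        now = time.time() if now is None else now
        key = (self.terminal_id_from_cookies(cookies), site_id)
        previous = self.last_click.get(key)
        self.last_click[key] = now
        return previous is not None and now - previous <= self.window
```

In this sketch a click is invalid only when the same terminal identifier and site identifier were seen within the window; a click on a different site, or from a terminal with a different cookie, is treated as valid.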

According to this embodiment, a client terminal can be distinguished by its terminal identifier even when it uses a proxy server or an IP gateway. Thus, clicks coming from different clients can be identified correctly even if the different client terminals share a proxy server or IP gateway.

In another embodiment of the present invention, for a search item included in a search result web page provided by an Internet search engine, if the number of searcher clicks on the search item within a predetermined period exceeds the average number of clicks of the search items belonging to the same category, the clicks are regarded as invalid and are reported to the administrator.

The apparatus for detecting invalid clicks according to this embodiment includes: click counter means for counting, for each search item included in a search result web page provided by an Internet search engine, the number of searcher clicks within a predetermined period; average click calculation means for calculating the average number of clicks, within the predetermined period, of the search items belonging to the category to which the search item belongs; and decision means for determining whether the click count of each search item exceeds the average click count by a predetermined difference. If the click count of a search item exceeds the average click count by the predetermined difference, this fact is reported to the administrator via the invalid click report unit 308.
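The three means described above (click counting, category averaging, and the decision step) can be illustrated with a short Python sketch. The function name, the dictionary-based inputs, and the `margin` parameter standing for the "predetermined difference" are assumptions made for this example.

```python
from statistics import mean


def items_exceeding_category_average(clicks_by_item: dict,
                                     category_by_item: dict,
                                     margin: float) -> list:
    """Return the search items whose click count within the period exceeds
    the average of their category by more than `margin`."""
    # Group the per-item click counts by category.
    counts_by_category = {}
    for item, count in clicks_by_item.items():
        counts_by_category.setdefault(category_by_item[item], []).append(count)
    # Average click count of each category.
    averages = {cat: mean(counts) for cat, counts in counts_by_category.items()}
    # Decision step: keep the items that exceed their category average by the margin.
    return sorted(item for item, count in clicks_by_item.items()
                  if count - averages[category_by_item[item]] > margin)
```

The returned list corresponds to what would be handed to the reporting unit; how the report reaches the administrator is outside the scope of this sketch.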

According to another embodiment of the present invention, for a search item included in a search result web page provided by an Internet search engine, the number of searcher clicks on the search item within a predetermined period is compared with the average number of clicks, within the same period, of a predetermined first number of search items positioned above it and a predetermined second number of search items positioned below it in the search result web page. For example, the click count of a particular search item is compared with the click counts, over the same period, of the two search items immediately above it and the two search items immediately below it. If the comparison shows that the particular search item received more than 5 times as many clicks as the surrounding search items, the clicks are highly likely to be invalid and are likewise reported to the administrator.
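A sketch of this neighbour comparison is shown below, assuming the click counts are available as a list ordered by position in the search result page. The function name and the default values (two items above, two below, factor 5) follow the example in the text; treating an item with no neighbours as not suspicious is an assumption.

```python
def exceeds_neighbours(clicks: list, index: int, above: int = 2,
                       below: int = 2, factor: float = 5.0) -> bool:
    """Compare the click count at `index` with the average click count of
    the `above` items listed before it and the `below` items listed after it."""
    neighbours = (clicks[max(0, index - above):index]
                  + clicks[index + 1:index + 1 + below])
    if not neighbours:
        return False  # nothing to compare against
    average = sum(neighbours) / len(neighbours)
    return clicks[index] > factor * average
```

For items near the top or bottom of the page, the sketch simply averages over whichever neighbours exist.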

Various methods for determining invalid clicks have been described above. Each of these methods may be used independently or in combination with the others.

For example, a rule may be stored in the invalid click pattern storage unit 306 stating that a click is invalid when the client IP address, page identifier, and site identifier corresponding to a search item are repeated within 5 minutes of the last click on that search item.
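One way to represent such a stored rule is as data, so that new patterns can be added without changing the detection code. The following Python sketch is hypothetical: the `ClickRule` and `Click` types, their field names, and the `RULES` list standing in for the invalid click pattern storage unit are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClickRule:
    match_on: tuple        # names of the identifiers that must all repeat
    window_seconds: float  # period counted from the earlier click


@dataclass(frozen=True)
class Click:
    clicked_at: float
    client_ip: str
    page_id: str
    site_id: str


def violates(rule: ClickRule, click: Click, history: list) -> bool:
    """True when an earlier click repeats every identifier named in the rule
    within the rule's time window."""
    key = tuple(getattr(click, name) for name in rule.match_on)
    return any(
        0 <= click.clicked_at - old.clicked_at <= rule.window_seconds
        and tuple(getattr(old, name) for name in rule.match_on) == key
        for old in history
    )


# The rule from the example: same client IP, page identifier, and site
# identifier repeated within 5 minutes.
RULES = [ClickRule(match_on=("client_ip", "page_id", "site_id"),
                   window_seconds=300)]
```

Combining methods then amounts to checking a click against every stored rule, for instance with `any(violates(rule, click, history) for rule in RULES)`.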

In the present description, the Internet search server and the apparatus for identifying unfair clicks have been described together as a single unit. It should be noted, however, that according to another embodiment of the present invention they may be implemented separately according to their functions and may be managed by different administrators.

Furthermore, elements shown and described as separate elements in the present invention may be physically implemented in a single system or in separate systems.

In addition, although several embodiments have been described, it will be apparent to those skilled in the art that portions of these embodiments, as well as other embodiments, also fall within the spirit of the present invention.

Embodiments of the present invention also relate to computer-readable media that include program instructions for performing various computer-implemented operations. The media may also include, alone or in combination with the program instructions, data files, data structures, tables, and the like. The media and program instructions may be specially designed and constructed for the purposes of the present invention, or they may be of a kind well known and available to those skilled in the computer software arts. Examples of computer-readable media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROM disks; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as read-only memory (ROM) and random access memory (RAM). The media may also be transmission media such as optical or metallic lines, waveguides, and the like, including a carrier wave that transmits signals specifying the program instructions, data structures, and the like. Examples of program instructions include both machine code, such as that produced by a compiler, and files containing higher-level code that can be executed by the computer using an interpreter.

FIG. 12 is a block diagram illustrating the structure of a general-purpose computer system that can be used to implement the search engine server and the apparatus for detecting invalid clicks according to the present invention.

The computer system includes any number of processors 1240 (also referred to as central processing units, or CPUs) that are coupled to storage devices including primary storage 1260 (typically a random access memory, or "RAM") and primary storage 1270 (typically a read-only memory, or "ROM"). As is well known in the art, the ROM 1270 transfers data and instructions uni-directionally to the CPU, while the RAM 1260 is typically used to transfer data and instructions in a bi-directional manner. Both primary storage devices may include any suitable type of the computer-readable media described above. A mass storage device 1210 is also coupled bi-directionally to the CPU 1240; it provides additional data storage capacity and may include any of the computer-readable media described above. The mass storage device 1210 may be used to store programs, data, and the like, and is typically a secondary storage medium, such as a hard disk, that is slower than primary storage. A specific mass storage device such as an optical disc 1220 may also pass data uni-directionally to the CPU. The processor 1240 is also coupled to an interface 1230 that includes one or more input/output devices such as video monitors, track balls, mice, keyboards, microphones, touch-sensitive displays, transducer card readers, magnetic or paper tape readers, tablets, styluses, voice or handwriting recognizers, or other well-known input devices, including, of course, other computers.

Finally, the processor 1240 may optionally be coupled to a computer or telecommunications network using a network connection, as shown generally at 1250. With such a network connection, it is contemplated that the CPU may receive information from the network, or output information to the network, in the course of performing the method steps described above. The devices and materials described above will be familiar to those skilled in the computer hardware and software arts.

The hardware elements described above may be configured (usually temporarily) to act as one or more software modules for performing the operations of the present invention.

Industrial Applicability

According to the present invention as described above, a method and an apparatus are provided for detecting invalid clicks on search items included in a search result web page provided by an Internet search engine server.

According to the present invention, a method and an apparatus for detecting invalid clicks are provided that can detect various attempts to improperly inflate the click count of a search item and respond to such attempts promptly. That is, when an unfair click attempt following a new pattern is discovered, the pattern or rule is stored in the invalid click pattern storage unit according to the present invention, so that unfair click attempts following the new pattern can be handled immediately.

Furthermore, according to the present invention, a method and an apparatus for detecting invalid clicks are provided that can prevent the several identifiers used to detect invalid clicks from being counterfeited or forged.
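The text states that the identifiers can be protected against forgery, and the claims state that they include a checksum, but the checksum algorithm is not specified. The sketch below substitutes a keyed HMAC-SHA256 digest for the checksum as one plausible way to achieve this; the key, the token layout `identifier.digest`, and the function names are assumptions for illustration only.

```python
import hashlib
import hmac

# Hypothetical secret known only to the search server.
SECRET_KEY = b"replace-with-a-server-side-secret"


def sign_identifier(identifier: str, key: bytes = SECRET_KEY) -> str:
    """Append a truncated keyed digest to the identifier."""
    digest = hmac.new(key, identifier.encode("utf-8"),
                      hashlib.sha256).hexdigest()[:16]
    return f"{identifier}.{digest}"


def verify_identifier(token: str, key: bytes = SECRET_KEY):
    """Return the identifier if its digest is valid, otherwise None."""
    identifier, sep, digest = token.rpartition(".")
    if not sep:
        return None  # no digest part present
    expected = hmac.new(key, identifier.encode("utf-8"),
                        hashlib.sha256).hexdigest()[:16]
    if hmac.compare_digest(digest.encode("utf-8"), expected.encode("utf-8")):
        return identifier
    return None
```

A client that alters the identifier without knowing the key cannot produce a matching digest, so a forged page, site, session, or terminal identifier would be rejected before the click is logged.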

Although the present invention has been described with reference to the embodiments illustrated in the drawings, it is not limited to them, since it will be apparent to those skilled in the art that various substitutions, modifications, and changes may be made. The scope of the present invention is defined by the appended claims. All changes or modifications made within the meaning and scope of the claims, or their equivalents, should be regarded as falling within the scope of the present invention.

Claims (14)

1. A method for detecting invalid clicks in an Internet search engine, comprising the steps of: storing a page identifier and a site identifier in a log file according to click time; generating a search result web page in response to a search request from a searcher; obtaining a page identifier corresponding to the generated web page; receiving from the searcher a click on a search item included in the search result web page; obtaining a site identifier corresponding to the clicked search item; and consulting the log file and determining that the click is invalid if the page identifier and the site identifier match the page identifier and site identifier associated with another click within a predetermined period.
2. The method of claim 1, wherein the page identifier and the site identifier include a checksum.
3. A method for detecting invalid clicks in an Internet search engine, comprising the steps of: storing a session identifier and a site identifier in a log file according to click time; generating a search result web page in response to a search request from a searcher; obtaining a session identifier included in a session cookie file stored in the searcher's terminal; receiving from the searcher a click on a search item included in the search result web page; obtaining a site identifier corresponding to the clicked search item; and consulting the log file and determining that the click is invalid if the session identifier and the site identifier match the session identifier and site identifier associated with another click within a predetermined period.
4. The method of claim 3, wherein the step of obtaining the session identifier included in the session cookie file stored in the searcher's terminal comprises the steps of: determining whether a session cookie file is stored in the terminal; and, if it is determined that no session cookie file is stored in the terminal, generating a new session identifier and then storing a session cookie file including the generated session identifier in the terminal.
5. The method of claim 4, further comprising the steps of: if it is determined that the session cookie file is stored in the terminal, determining whether the last update time of the session identifier included in the session cookie file is within a predetermined period; and, if it is determined that the last update time is within the predetermined period, obtaining the session identifier included in the session cookie file.
6. The method of claim 5, further comprising the steps of: if it is determined that the last update time is not within the predetermined period, updating the session identifier included in the session cookie file by generating a new session identifier; and storing the update time of the session identifier in the session cookie file.
7. The method of claim 4, further comprising the steps of: if it is determined that the session cookie file is stored in the terminal, determining whether the click time of the search item from the searcher is within a predetermined period after the last click time associated with the session identifier; if it is determined that the click time of the search item is within the predetermined period after the last click time, obtaining the session identifier included in the session cookie file; and updating the last click time with the click time of the search item.
8. The method of claim 7, further comprising the steps of: if it is determined that the click time of the search item is not within the predetermined period after the last click time, updating the session identifier included in the session cookie file by generating a new session identifier; and updating the last click time with the click time of the search item.
9. The method of any one of claims 3 to 8, wherein the session identifier and the site identifier include a checksum.
10. A method for detecting invalid clicks in an Internet search engine, comprising the steps of: storing a client IP address and a site identifier in a log file according to click time; receiving from a searcher a click on a search item included in a search result web page; obtaining a client IP address corresponding to the searcher's terminal; obtaining a site identifier corresponding to the clicked search item; and consulting the log file and determining that the click is invalid if the client IP address and the site identifier match the client IP address and site identifier associated with another click within a predetermined period.
11. The method of claim 10, wherein the site identifier is generated with a checksum included therein.
12. A method for detecting invalid clicks in an Internet search engine, comprising the steps of: storing a terminal identifier and a site identifier in a log file according to click time; generating a search result web page in response to a search request from a searcher; obtaining a terminal identifier corresponding to the searcher's terminal; generating a user cookie file including the terminal identifier and then storing the user cookie file in the searcher's terminal; receiving from the searcher a click on a search item included in the search result web page; obtaining a site identifier corresponding to the clicked search item; and consulting the log file and determining that the click is invalid if the terminal identifier and the site identifier match the terminal identifier and site identifier associated with another click within a predetermined period.
13. The method of claim 12, further comprising the steps of: determining whether a cookie file including a terminal identifier is stored in the terminal; and, if it is determined that a user cookie file including a terminal identifier is stored in the terminal, receiving the terminal identifier from the user cookie file.
14. The method of claim 12 or 13, wherein the terminal identifier and the site identifier include a checksum.
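The session identifier handling recited in claims 4, 7, and 8 can be illustrated with the following Python sketch. It is not part of the claims: a dictionary stands in for the session cookie file, the keys `session_id` and `last_click_time` are hypothetical, and the 30-minute timeout is an assumed value for the predetermined period.

```python
import uuid

SESSION_TIMEOUT = 1800.0  # assumed "predetermined period", in seconds


def session_id_for_click(cookie: dict, click_time: float,
                         timeout: float = SESSION_TIMEOUT) -> str:
    """Return the session identifier to log for a click.

    A new identifier is generated when the cookie holds none (claim 4) or
    when the click falls outside the period after the last click (claim 8);
    otherwise the stored identifier is reused (claim 7). The last click
    time is updated in every case.
    """
    last = cookie.get("last_click_time")
    expired = last is None or click_time - last > timeout
    if "session_id" not in cookie or expired:
        cookie["session_id"] = uuid.uuid4().hex
    cookie["last_click_time"] = click_time
    return cookie["session_id"]
```

Because the last click time slides forward with each click, a session in this sketch stays alive as long as consecutive clicks are no further apart than the timeout.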
CN 200480007418 2003-03-19 2004-02-27 Method and apparatus for detecting invalid clicks on the internet search engine CN100533434C (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
KR10-2003-0017233 2003-03-19
KR20030017233A KR100619178B1 (en) 2003-03-19 2003-03-19 Method and apparatus for detecting invalid clicks on the internet search engine

Publications (2)

Publication Number Publication Date
CN1761961A (en) 2006-04-19
CN100533434C (en) 2009-08-26

Family

ID=36707372

Family Applications (2)

Application Number Title Priority Date Filing Date
CN 200810161032 CN101388035A (en) 2003-03-19 2004-02-27 Method and device for detecting invalid click on internet search engine server
CN 200480007418 CN100533434C (en) 2003-03-19 2004-02-27 Method and apparatus for detecting invalid clicks on the internet search engine

Family Applications Before (1)

Application Number Title Priority Date Filing Date
CN 200810161032 CN101388035A (en) 2003-03-19 2004-02-27 Method and device for detecting invalid click on internet search engine server

Country Status (4)

Country Link
JP (1) JP4358188B2 (en)
KR (1) KR100619178B1 (en)
CN (2) CN101388035A (en)
WO (1) WO2004084097A1 (en)

Families Citing this family (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8706551B2 (en) * 2003-09-04 2014-04-22 Google Inc. Systems and methods for determining user actions
KR100786796B1 (en) * 2005-03-25 2007-12-18 주식회사 다음커뮤니케이션 Method and system for billing of internet advertising
US7933917B2 (en) * 2005-05-06 2011-04-26 Nhn Corporation Personalized search method and system for enabling the method
KR20060028463A (en) * 2006-03-09 2006-03-29 정성욱 Click tracking and management system for online advertisement service
KR100777659B1 (en) * 2006-04-10 2007-11-19 (주)소만사 Device of detecting invalid use of keyword advertisement
KR100777660B1 (en) * 2006-04-10 2007-11-19 (주)소만사 Method of detecting robot-based invalid use of keyword advertisement and computer-readable medium having thereon program performing function embodying the same
WO2008030670A1 (en) * 2006-09-08 2008-03-13 Microsoft Corporation Detecting and adjudicating click fraud
CN101075908B (en) * 2006-11-08 2011-04-20 腾讯科技(深圳)有限公司 Method and system for accounting network click numbers
KR100857148B1 (en) * 2007-04-26 2008-09-05 엔에이치엔(주) Method for processing invalid click and system for executing the method
KR100841348B1 (en) * 2007-08-16 2008-06-25 방용정 Non-cost internet advertisement system each time unfairness click of cost-per-click-view and method thereof
KR100902466B1 (en) * 2007-10-30 2009-06-11 엔에이치엔비즈니스플랫폼 주식회사 System and Method for Tracking a Keyword Search Abuser
KR100914600B1 (en) * 2007-11-14 2009-08-31 엔에이치엔(주) System and Method for Determining Invalid Clicks
KR101020949B1 (en) * 2008-11-18 2011-03-09 주식회사 데이타웨이브 시스템 Method and server for detecting unfair click of keyword advertisement
KR20110116562A (en) 2010-04-19 2011-10-26 서울대학교산학협력단 Method and system for detecting bot scum in massive multiplayer online role playing game
CN102289756A (en) * 2010-06-18 2011-12-21 百度在线网络技术(北京)有限公司 Click to check the validity of the method and system
KR101158464B1 (en) * 2010-11-26 2012-06-20 고려대학교 산학협력단 Method and apparatus for detecting bot process
CN103368857B (en) * 2012-03-26 2016-09-21 北大方正集团有限公司 A method and system for transmitting data information
CN102663062B (en) * 2012-03-30 2015-01-14 北京奇虎科技有限公司 Method and device for processing invalid links in search result
JP2014026528A (en) * 2012-07-27 2014-02-06 Nippon Telegr & Teleph Corp <Ntt> Effective click counter, method and program
US9692833B2 (en) 2013-07-26 2017-06-27 Empire Technology Development Llc Device and session identification
CN103475543A (en) * 2013-09-11 2013-12-25 北京思特奇信息技术股份有限公司 Abnormal system service call detection method and system
CN104331306B (en) * 2014-10-14 2017-05-10 北京齐尔布莱特科技有限公司 A content update method, apparatus and system
CN104580244B (en) * 2015-01-26 2018-03-13 百度在线网络技术(北京)有限公司 Defense method and apparatus for malicious clicks
KR101639752B1 (en) * 2015-02-13 2016-07-15 네이버 주식회사 System and method for aggregating view of contents using filter logic
CN105069061A (en) * 2015-07-28 2015-11-18 安一恒通(北京)科技有限公司 Method and system for loading webpage in historical browsing record, browser and server

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6269361B1 (en) 1999-05-28 2001-07-31 Goto.Com System and method for influencing a position on a search result list generated by a computer network search engine

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20020020584A (en) * 2000-09-09 2002-03-15 맹진기 Internet survey system and method and media for storing program source thereof

Also Published As

Publication number Publication date Type
JP4358188B2 (en) 2009-11-04 grant
WO2004084097A1 (en) 2004-09-30 application
CN1761961A (en) 2006-04-19 application
KR100619178B1 (en) 2006-09-05 grant
CN101388035A (en) 2009-03-18 application
KR20040082633A (en) 2004-09-30 application
JP2006520940A (en) 2006-09-14 application

Legal Events

Date Code Title Description
C06 Publication
C10 Request of examination as to substance
C14 Granted
C56 Change in the name or address of the patentee

Owner name: NHN BUSINESS PLATFORM CO., LTD.

Free format text: FORMER NAME: NHN CO., LTD.

ASS Succession or assignment of patent right

Owner name: NABAO CO., LTD.

Free format text: FORMER OWNER: NHN CORP.

Effective date: 20141114

C41 Transfer of the right of patent application or the patent right