CN104050178A - Internet monitoring anti-spamming method and device - Google Patents

Publication number
CN104050178A
Authority
CN
China
Prior art keywords
data
matching
field
monitoring
network behavior
Prior art date
Application number
CN201310079359.8A
Other languages
Chinese (zh)
Other versions
CN104050178B (en)
Inventor
欧阳佑
费浩峻
冯是聪
吴明辉
Original Assignee
北京思博途信息技术有限公司
Priority date
Filing date
Publication date
Application filed by 北京思博途信息技术有限公司
Priority to CN201310079359.8A
Publication of CN104050178A
Application granted
Publication of CN104050178B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/552 Detecting local intrusion or implementing counter-measures involving long-term monitoring or reporting
    • G06F21/554 Detecting local intrusion or implementing counter-measures involving event detection and direct action
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing packet switching networks
    • H04L43/08 Monitoring based on specific metrics

Abstract

The invention discloses an Internet monitoring anti-spamming (anti-cheating) method and device, relating to the technical field of computer networks. To recognize anomalies in the click count or page views of network activities more accurately, and to detect cheating in Internet activities, the method comprises the following steps: collecting data on a single network activity using multiple monitoring schemes to obtain multiple independent sets of monitoring data; matching the independent monitoring data, judging whether they belong to the same network behavior, and aggregating all monitoring data judged to belong to the same network behavior into a log record of that network behavior; and performing cheating-traffic analysis on the log record of each network behavior to obtain an analysis result. The method and device can be applied to anti-cheating monitoring of Internet behaviors, such as Internet ad placement, online surveys, and other types of network activity.

Description

An Internet monitoring anti-cheating method and apparatus

Technical Field

[0001] The present invention relates to the technical field of computer networks, and in particular to an Internet monitoring anti-cheating method and apparatus.

Background

[0002] In Internet advertising campaigns, the total volume of user actions in a campaign, such as total views and total clicks, is the basic measure of advertising effectiveness. Media and advertisers widely use these metrics to settle campaign fees. The maximum number of views and clicks that a media outlet can deliver also directly reflects its advertising capacity. In practice, some media may fabricate non-genuine traffic to inflate the impressions, clicks, and other metrics observed on the advertiser's side, in order to obtain extra revenue from the advertiser or to exaggerate their own advertising capacity. This fabricated traffic seriously harms advertisers' interests. For example, when an advertiser and a media outlet settle by number of impressions, the advertiser must spend extra budget on fake impressions that have no advertising effect at all.

[0003] Non-genuine traffic can be produced in many ways, for example: using viruses, Trojans, or other malicious means to compromise ordinary Internet-connected computers and control them to perform extra ad views and clicks; using scripts and software to simulate normal users visiting a website; or inserting hidden code, invisible in the browser, into a website to create extra traffic out of thin air. Against these cheating methods, existing anti-cheating methods identify abnormal traffic mainly by monitoring the context information present when a network behavior such as a view or click occurs. For example, if the same IP address produces extremely frequent views or clicks within a very short time, far exceeding a normal user's browsing frequency, that IP address can be judged as suspected of cheating. As another example, a common cheating method today is to show, in a lower-priced ad slot, a high-priced ad that should not be placed there, that is, to profit by disguising impressions of the low-priced slot as impressions of a high-priced slot. Against this method, the anti-cheating system monitors the URL (Uniform Resource Locator) at the time of the ad impression and compares it with the resource information of the ad positions purchased in the media plan.

[0004] However, once a cheater learns how a particular rule-based anti-cheating technique is implemented, the cheater can modify the cheating method accordingly so that the cheating is hard to identify. For example, when the cheater knows that the anti-cheating method detects cheating by URL comparison, the cheater can use technical means to disguise the URL observed by the anti-cheating system as a normal URL and so evade capture. The anti-cheating system must then adjust its own techniques before it can identify the cheater again. In practice, therefore, anti-cheaters and cheaters are locked in an adversarial game. Current anti-cheating systems mainly collect behavior data during the user's browsing through monitoring code, monitoring scripts, or client software, and then use this data for cheat detection. Common anti-cheating systems acquire data in a relatively fixed manner, and the data they collect is relatively uniform and limited. After prolonged use, such a method may be specifically targeted by cheaters, reducing its anti-cheating capability.

Summary of the Invention

[0005] To identify anomalies in the click count or view count of network activities more accurately and to detect cheating in Internet network activities, the present invention proposes an Internet monitoring anti-cheating method and apparatus.

[0006] To solve the above technical problem, the present invention provides an Internet monitoring anti-cheating method, comprising:

[0007] A. collecting data on a single network activity using multiple monitoring schemes, to obtain multiple independent sets of monitoring data;

[0008] B. matching the multiple independent sets of monitoring data, judging whether they belong to the same network behavior, and aggregating all monitoring data judged to belong to the same network behavior into a log record of that network behavior;

[0009] C. performing cheating-traffic analysis on the log record of each network behavior to obtain an analysis result.

[0010] Further, the multiple monitoring schemes include embedding code directly in the frame of the web page where the network behavior occurs, embedding code in a Flash animation or JavaScript script in the visited page, and installing a browser plug-in or client software on the user machine.

[0011] Further, the step of matching the multiple independent sets of monitoring data and judging whether they belong to the same network behavior comprises:

[0012] B1. dividing the fields of the monitoring data into one or more exact-match fields, or into one or more exact-match fields and one or more fuzzy-match fields;

[0013] B2. comparing the multiple independent sets of monitoring data pairwise, field by field;

[0014] when comparing exact-match fields, if one or more exact-match fields of the two independent sets of monitoring data differ, judging that the two sets do not belong to the same network behavior;

[0015] when comparing fuzzy-match fields, if the gap in one or more fuzzy-match fields of the two independent sets of monitoring data is greater than that field's fuzzy threshold, judging that the two sets do not belong to the same network behavior;

[0016] when all exact-match fields are identical and the gaps in all fuzzy-match fields are less than the fuzzy threshold, judging that the two independent sets of monitoring data belong to the same network behavior.

[0017] or

[0018] b1. dividing the fields of the monitoring data into one or more exact-match fields, or into one or more exact-match fields and one or more fuzzy-match fields;

[0019] b2. comparing the multiple independent sets of monitoring data pairwise;

[0020] when comparing exact-match fields, if the exact-match field of the two independent sets of monitoring data is identical, setting the match degree of that field to 1, and if it differs, setting the match degree of that field to 0;

[0021] when comparing fuzzy-match fields, setting the match degree of each field to a value from 0 to 1 according to the gap in that fuzzy-match field, and summing the match degrees of all fuzzy-match fields to obtain a total match degree;

[0022] when one or more exact-match fields of the two independent sets of monitoring data have a match degree of 0, judging that the two sets do not belong to the same network behavior;

[0023] when the total match degree of the fuzzy-match fields is less than the match threshold, judging that the two independent sets of monitoring data do not belong to the same network behavior;

[0024] when the match degrees of all exact-match fields of the two independent sets of monitoring data are 1 and the total match degree of the fuzzy-match fields is greater than the match threshold, judging that the two sets belong to the same network behavior.

[0025] Further, the exact-match fields include the identifier (ID) of the user machine on which the network behavior occurs; the fuzzy-match fields include the Uniform Resource Locator (URL), the time at which the network behavior occurs (Time), the IP address of the user machine that sends the network behavior (IP), the browser of the user machine on which the network behavior occurs (Browser), and the operating system of that user machine (OS).

[0026] Further, the cheating-traffic analysis comprises: identifying forged data by monitoring the degree of mismatch of the same monitored parameter across the multiple sets of monitoring data in the log record of the network behavior.
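The text does not define how the degree of mismatch is computed. As a minimal sketch, assuming it is the share of data sources whose value for a field disagrees with the most common value, the check could look like the following. The function name `mismatch_degree`, the dictionary layout, and the example URLs are hypothetical and not taken from the patent.

```python
from collections import Counter


def mismatch_degree(log_record, field):
    """Share of sources that disagree with the most common value of `field`.

    `log_record` is a list of dicts, one per monitoring source, all judged
    to describe the same network behavior. Sources that could not observe
    the field (missing key or None) are ignored. Both the metric and the
    data layout are illustrative assumptions.
    """
    values = [rec[field] for rec in log_record if rec.get(field) is not None]
    if len(values) < 2:
        return 0.0  # nothing to compare against
    most_common_count = Counter(values).most_common(1)[0][1]
    return 1.0 - most_common_count / len(values)


log_record = [
    {"source": "page_code", "url": "http://example.com/premium"},
    {"source": "flash_code", "url": None},  # low-privilege source, no URL
    {"source": "plugin", "url": "http://example.com/cheap"},
    {"source": "client", "url": "http://example.com/cheap"},
]
degree = mismatch_degree(log_record, "url")
```

A nonzero degree on a field such as the URL indicates that the sources disagree about the same behavior, which is the kind of inconsistency the paragraph above treats as a sign of forged data. How large a degree should trigger a flag is left open here.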

[0027] Further, the analysis result of step C includes the percentage of cheating traffic among all log records and the data source of the cheating traffic.

[0028] To solve the above technical problem, the present invention also provides an Internet monitoring anti-cheating apparatus, comprising: a plurality of data acquisition modules, a matching module, and an analysis module,

[0029] the data acquisition module being configured to collect data on a single network activity using a monitoring scheme, to obtain monitoring data;

[0030] the matching module being configured to match multiple independent sets of monitoring data, judge whether they belong to the same network behavior, and aggregate all monitoring data judged to belong to the same network behavior into one log record;

[0031] the analysis module being configured to perform cheating-traffic analysis on the log record to obtain an analysis result.

[0032] Further, the matching module comprises an exact matching module, a fuzzy matching module, and a judging module;

[0033] the exact matching module being configured to compare the exact-match fields of two independent sets of monitoring data and obtain an exact comparison result;

[0034] the fuzzy matching module being configured to compare the fuzzy-match fields of two independent sets of monitoring data and obtain a fuzzy comparison result;

[0035] the judging module being configured to judge, according to the exact comparison result and the fuzzy comparison result, whether the multiple independent sets of monitoring data belong to the same network behavior.

[0036] Further, the judging module judges on the following basis:

[0037] when one or more exact comparison results show a difference, judging that the two independent sets of monitoring data do not belong to the same network behavior;

[0038] when the gap in one or more fuzzy-match fields is greater than that field's fuzzy threshold, judging that the two independent sets of monitoring data do not belong to the same network behavior;

[0039] when all exact-match fields are identical and the gaps in all fuzzy-match fields are less than the fuzzy threshold, judging that the two independent sets of monitoring data belong to the same network behavior;

[0040] or,

[0041] when one or more exact comparison results show a difference, judging that the two independent sets of monitoring data do not belong to the same network behavior;

[0042] when the total match degree of the fuzzy comparison result is less than the match threshold, judging that the two independent sets of monitoring data do not belong to the same network behavior;

[0043] when all exact-match fields are identical and the total match degree of the fuzzy comparison result is greater than the match threshold, judging that the two independent sets of monitoring data belong to the same network behavior.

[0044] Compared with the prior art, the present invention collects data on a single network activity several times simultaneously, obtaining multiple independent sets of monitoring data. The independent monitoring data from the multiple data sources is matched and compared to obtain a set of log records for a single Internet network activity, and anomalies in network activity behavior are identified by comparing these records, so that network behaviors involving cheating are identified more precisely.

Brief Description of the Drawings

[0045] FIG. 1 is a schematic structural diagram of an Internet monitoring anti-cheating apparatus according to an embodiment of the present invention;

[0046] FIG. 2 is a flowchart of an Internet monitoring anti-cheating method according to an embodiment of the present invention;

[0047] FIG. 3 is a schematic structural diagram of the Internet monitoring anti-cheating process according to Embodiment 1 of the present invention.

Detailed Description

[0048] To make the objectives, technical solutions, and advantages of the present invention clearer, embodiments of the invention are described in detail below with reference to the drawings. Note that, where no conflict arises, the embodiments of this application and the features in them may be combined with one another arbitrarily.

[0049] An embodiment of the present invention proposes an Internet monitoring anti-cheating method, comprising:

[0050] A. collecting data on a single network activity using multiple monitoring schemes, to obtain multiple independent sets of monitoring data;

[0051] B. matching the multiple independent sets of monitoring data, judging whether they belong to the same network behavior, and aggregating all monitoring data judged to belong to the same network behavior into one log record;

[0052] C. performing cheating-traffic analysis on the log record to obtain an analysis result.
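Step B requires turning pairwise judgments into log records. The text does not say how pairs are combined, so the sketch below assumes that matches are merged transitively (if record 1 matches record 2 and record 2 matches record 3, all three form one log record) and uses a small union-find structure for that. The name `group_into_logs` and the record layout are hypothetical.

```python
def group_into_logs(records, same_behavior):
    """Group monitoring records into log records using pairwise matching.

    `same_behavior(a, b)` is any pairwise predicate, for example one of the
    matching schemes described in this document. Transitive merging is an
    assumption of this sketch, not something the source specifies.
    """
    parent = list(range(len(records)))

    def find(i):
        # Follow parent links to the group root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Compare every pair once and merge the groups of matching records.
    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            if same_behavior(records[i], records[j]):
                parent[find(i)] = find(j)

    groups = {}
    for i, rec in enumerate(records):
        groups.setdefault(find(i), []).append(rec)
    return list(groups.values())


records = [
    {"source": "page_code", "user_id": "u1", "time": 100.0},
    {"source": "plugin", "user_id": "u1", "time": 102.0},
    {"source": "page_code", "user_id": "u2", "time": 100.0},
]
logs = group_into_logs(
    records,
    lambda a, b: a["user_id"] == b["user_id"] and abs(a["time"] - b["time"]) <= 5.0,
)
```

The pairwise loop is quadratic in the number of records; a practical system would likely first bucket records by an exact-match field such as the user machine ID before comparing pairs.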

[0053] An embodiment of the present invention proposes an Internet monitoring anti-cheating apparatus, characterized in that it comprises: a plurality of data acquisition modules, a matching module, and an analysis module,

[0054] the data acquisition module being configured to collect data on a single network activity using a monitoring scheme, to obtain monitoring data, where the monitoring schemes of the multiple data acquisition modules may be the same or different;

[0055] the matching module being configured to match multiple independent sets of monitoring data, judge whether they belong to the same network behavior, and aggregate all monitoring data judged to belong to the same network behavior into one log record;

[0056] the analysis module being configured to perform cheating-traffic analysis on the log record to obtain an analysis result.

[0057] For each network behavior of an Internet user (such as a view or a click), multiple data acquisition modules are used to collect information about that behavior. Several monitoring schemes are currently available for monitoring network behavior, including embedding code in the frame of the web page where the behavior occurs, embedding code in a Flash animation or JavaScript script in the visited page, and installing a browser plug-in or client on the user machine. Different monitoring schemes differ in permissions and responsibilities, so the monitoring data that different data acquisition modules can collect also differs. Each monitoring event produces one set of monitoring data recording information about that network behavior. The monitoring data contains one or more information fields, such as the unified user machine ID, the behavior time, and the visited URL. After the monitoring data is obtained, it is transmitted over the network to a server for storage.
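One possible shape for a single set of monitoring data is sketched below. The field names are hypothetical; the fields themselves (user machine ID, time, URL, IP, browser, operating system) are the ones this document names, and the optional fields reflect that a given monitoring scheme may be unable to observe everything.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MonitoringRecord:
    """One set of monitoring data produced by one data acquisition module.

    Field names are illustrative assumptions. Fields a monitoring scheme
    cannot observe are left as None.
    """
    source: str                    # which acquisition module produced the record
    user_id: str                   # unified user machine ID
    time: float                    # behavior time, seconds since the epoch
    url: Optional[str] = None      # visited URL, may be unavailable
    ip: Optional[str] = None       # IP address of the user machine
    browser: Optional[str] = None  # browser of the user machine
    os: Optional[str] = None       # operating system of the user machine


# A low-privilege source that could not read the page URL.
flash_record = MonitoringRecord(source="flash_code", user_id="u1", time=1363000000.0)
```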

[0058] The present invention can be applied to anti-cheating monitoring of Internet network activities, such as anti-cheating monitoring of Internet ad placement and of online surveys, as well as anti-cheating monitoring of other types of network activity.

[0059] The anti-cheating method and apparatus of the present invention collect data using several different monitoring schemes simultaneously, and may also obtain further data from other monitoring data providers. It must be noted that the fields obtainable by different monitoring schemes are not identical. For example, the URL of the page the user is viewing may be unobtainable in a lower-privilege monitoring scheme (such as code embedded in Flash). In addition, the same field may differ between data sources. For example, when a user visits a web page, monitoring code located at different positions in the page may run at different times, so the behavior times recorded by different data acquisition modules may also differ.

[0060] Normally, a data provider needs to maintain a unique user machine ID for each Internet user in order to recognize multiple different network behaviors from the same user. To identify the association between data from different providers, a multi-source anti-cheating system must, in addition to each provider's own user machine ID, supply all providers with a unified user machine ID. This unified user machine ID can be implemented by having the data providers read user information from a fixed location in a cookie (a browser cookie or a Flash cookie). To ensure that the cookie IDs obtained by all providers are consistent, the unified cookie ID is allocated and managed centrally by a third-party server.

[0061] The unified cookie ID means that different data providers need not use the same set of servers for data storage; each provider may store its own monitoring data using an independent technical solution.
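The allocation logic on the third-party server can be pictured as "read the ID from a fixed cookie key, and assign one only if it is absent". The sketch below models the cookie store as a plain dictionary; the key name `unified_uid` and the use of a random UUID are assumptions, and real cookie handling over HTTP is outside its scope.

```python
import uuid


def unified_user_id(cookies, key="unified_uid"):
    """Return the unified user machine ID stored under a fixed cookie key.

    `cookies` stands in for the cookie store of one user machine. If no ID
    has been assigned yet, a new one is generated and stored, so every
    later reader (any data provider) sees the same value.
    """
    if key not in cookies:
        cookies[key] = uuid.uuid4().hex
    return cookies[key]


cookie_jar = {}
id_seen_by_provider_a = unified_user_id(cookie_jar)
id_seen_by_provider_b = unified_user_id(cookie_jar)
```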

[0062] After each data acquisition module has collected its own monitoring data, the separately stored monitoring data is gathered on a server for multi-source data aggregation:

[0063] Various technical solutions may be used to transmit data between the data acquisition modules and the aggregation server. One approach is for each data acquisition module to collect a certain amount of monitoring data and then transmit it in a batch to the aggregation server over the Internet or by other means. Another approach is for each data acquisition module, on obtaining any single monitoring record, to push that record to the aggregation server immediately.
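The two transmission approaches differ only in how many records accumulate before sending. A minimal sketch, assuming a hypothetical `send` callback that delivers a list of records to the aggregation server: a batch size greater than 1 gives the batched approach, and a batch size of 1 gives the immediate push.

```python
class BatchingCollector:
    """Buffers monitoring records and sends them in batches.

    `send` is a hypothetical callback that transmits a list of records to
    the aggregation server; the transport itself is not modeled here.
    """

    def __init__(self, send, batch_size=100):
        self._send = send
        self._batch_size = batch_size
        self._buffer = []

    def record(self, item):
        """Add one monitoring record and send once the batch is full."""
        self._buffer.append(item)
        if len(self._buffer) >= self._batch_size:
            self.flush()

    def flush(self):
        """Send whatever is buffered, for example at shutdown."""
        if self._buffer:
            self._send(list(self._buffer))
            self._buffer.clear()


sent_batches = []
collector = BatchingCollector(sent_batches.append, batch_size=2)
collector.record("rec-1")   # buffered, nothing sent yet
collector.record("rec-2")   # batch full, sent as one list
collector.record("rec-3")
collector.flush()           # sends the remaining record
```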

[0064] Given the huge data volume brought by multiple data sources, the data on the aggregation server may be stored in a distributed manner to address mass data storage. One feasible solution is to distribute storage by the time of the monitoring data: the monitoring data of all data sources within the same time period (for example, the same day) is transmitted to and stored on the same server, while monitoring data of different time periods is transmitted to and stored on different servers.
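Time-based distribution can be sketched as a function from a record's timestamp to a server. The example assumes one-day periods in UTC and a round-robin assignment of days to a fixed server list; the server names and the modulo rule are illustrative choices, not part of the patent.

```python
from datetime import datetime, timezone
from typing import Sequence


def storage_server(timestamp: float, servers: Sequence[str]) -> str:
    """Pick the server that stores all monitoring data for the record's day.

    Every record from the same UTC day maps to the same server regardless
    of its data source, so records of one behavior end up together.
    """
    day_number = datetime.fromtimestamp(timestamp, tz=timezone.utc).toordinal()
    return servers[day_number % len(servers)]


servers = ["store-0", "store-1", "store-2"]  # hypothetical server names
morning = storage_server(0.0, servers)        # 1970-01-01 00:00:00 UTC
evening = storage_server(86399.0, servers)    # 1970-01-01 23:59:59 UTC
next_day = storage_server(86400.0, servers)   # 1970-01-02 00:00:00 UTC
```

One consequence of this layout is that two records of the same behavior that straddle a period boundary land on different servers, so a matcher would need to look at adjacent periods as well.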

[0065] The monitoring data that the same network behavior produces in different data acquisition modules is matched, so as to restore as fully as possible all information related to that network behavior. Because differences in monitoring means cause the monitoring data of different data acquisition modules to disagree on the same field, one embodiment of the present invention divides the fields of the monitoring data into two kinds: exact-match fields and fuzzy-match fields. In other embodiments, exact-match fields are mandatory and fuzzy-match fields are optional; that is, the monitoring data always includes exact-match fields but does not necessarily include fuzzy-match fields.

[0066] An exact-match field is a field such that, if two monitoring records differ in it, the two records are considered not to describe the same network behavior. An example is the unified user machine ID: since all data acquisition modules read one unique unified user machine ID, when this ID does not match, the two monitoring records can be directly considered not to have been produced by the same network behavior. Besides the unified user machine ID, other fields may serve as exact-match fields in other embodiments.

[0067] A fuzzy-match field is a field in which two matched monitoring records need not be exactly identical, for example the time at which the network behavior occurred. Owing to the time consumed by page loading and to network transmission delays, the times recorded for the same network behavior by different data acquisition modules may not be exactly the same. This is because different code, scripts, and clients may be triggered at different moments between the page opening and the page finishing loading, so the times they record for the network behavior are not necessarily identical. For this reason, when matching monitoring data, the times recorded in two matching records are not required to be identical; the gap between the two times need only fall within a certain range. Besides the time of the network behavior, other fields may serve as fuzzy-match fields in other embodiments.

[0068] When comparing exact-match fields, if one or more exact-match fields of the two independent sets of monitoring data differ, the two sets are judged not to belong to the same network behavior;

[0069] when comparing fuzzy-match fields, if the gap in one or more fuzzy-match fields of the two independent sets of monitoring data is greater than the fuzzy threshold, the two sets are judged not to belong to the same network behavior;

[0070] when all exact-match fields are identical and the gaps in all fuzzy-match fields are less than the fuzzy threshold, the two independent sets of monitoring data are judged to belong to the same network behavior.
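The threshold-based scheme described in the three paragraphs above can be sketched as follows. The field names, the 5-second time threshold, and the dictionary layout are assumptions; the sketch also assumes both records carry every configured field, and it treats a gap exactly equal to the threshold as a match, a case the text leaves open.

```python
# Hypothetical configuration: the patent does not fix field names or thresholds.
EXACT_FIELDS = ("user_id",)
# Each fuzzy-match field maps to (gap function, fuzzy threshold for that field).
FUZZY_FIELDS = {
    "time": (lambda a, b: abs(a - b), 5.0),  # gap in seconds
}


def same_behavior(rec_a, rec_b, exact_fields=EXACT_FIELDS, fuzzy_fields=FUZZY_FIELDS):
    """Threshold-based pairwise match of two monitoring records (dicts)."""
    # Any differing exact-match field rules the pair out immediately.
    for name in exact_fields:
        if rec_a[name] != rec_b[name]:
            return False
    # Any fuzzy-match field whose gap exceeds its threshold rules the pair out.
    for name, (gap, threshold) in fuzzy_fields.items():
        if gap(rec_a[name], rec_b[name]) > threshold:
            return False
    return True


page = {"user_id": "u1", "time": 100.0}
plugin = {"user_id": "u1", "time": 103.0}      # 3 s later, within tolerance
late = {"user_id": "u1", "time": 110.0}        # 10 s later, beyond tolerance
other_user = {"user_id": "u2", "time": 100.0}  # different user machine ID
```

With these example records, `page` and `plugin` are judged to be the same behavior, while `late` and `other_user` are rejected by the fuzzy and the exact check respectively.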

[0071] Alternatively,

[0072] When comparing exact-match fields, if an exact-match field is identical in the two independent sets of monitoring data, the matching degree of that field is set to 1; if it differs, the matching degree of that field is set to 0;

[0073] When comparing fuzzy-match fields, the matching degree of each field is set to a value from 0 to 1 according to the difference in that field; the matching degrees of all fuzzy-match fields are then added to obtain the total matching degree;

[0074] When one or more exact-match fields of the two independent sets of monitoring data have a matching degree of 0, the two sets are judged not to belong to the same network behavior; that is, if any single exact-match field scores 0, the data are judged to be different network behaviors.

[0075] When the total matching degree of the fuzzy-match fields is less than the matching threshold, the two sets are judged not to belong to the same network behavior;

[0076] When the matching degree of every exact-match field is 1 and the total matching degree of the fuzzy-match fields is greater than the matching threshold, the two sets are judged to belong to the same network behavior.
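
The score-based alternative of paragraphs [0072] to [0076] can be sketched as below. The time scorer follows the 1/(seconds apart + 1) formula used later in this embodiment; the URL scorer, the field names, and the sample records are hypothetical stand-ins, since the scoring rules of Table 1 are not reproduced in the text.

```python
def time_score(t1, t2):
    """Matching degree of the time field: 1 / (seconds apart + 1)."""
    return 1 / (abs(t1 - t2) + 1)


def url_score(u1, u2):
    """Hypothetical URL scorer: 1.0 for identical URLs, otherwise 0.0."""
    return 1.0 if u1 == u2 else 0.0


def total_match_degree(a, b, exact_fields, fuzzy_scorers):
    """Return None if any exact-match field scores 0, else the summed fuzzy degree."""
    for f in exact_fields:
        degree = 1 if a[f] == b[f] else 0
        if degree == 0:
            return None
    return sum(scorer(a[f], b[f]) for f, scorer in fuzzy_scorers.items())


def is_same_behavior(a, b, exact_fields, fuzzy_scorers, match_threshold):
    total = total_match_degree(a, b, exact_fields, fuzzy_scorers)
    return total is not None and total > match_threshold


scorers = {"time": time_score, "url": url_score}
# Times are given as seconds for simplicity; all values are made up.
a = {"id": "u1", "time": 100, "url": "http://a.example/x"}
b = {"id": "u1", "time": 102, "url": "http://a.example/x"}
c = {"id": "u1", "time": 102, "url": "http://b.example/y"}
d = {"id": "u2", "time": 100, "url": "http://a.example/x"}

print(is_same_behavior(a, b, ["id"], scorers, 1))
print(is_same_behavior(a, c, ["id"], scorers, 1))
print(is_same_behavior(a, d, ["id"], scorers, 1))
```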

[0077] In practice, because of network conditions and restrictions such as the access permissions of a particular monitoring scheme, not every data acquisition module can obtain all of the information related to a network behavior. When information cannot be obtained, some fields in the monitoring data will be empty. Such empty fields can also be handled by fuzzy matching.

[0078] The exact-match fields include the identifier (ID) of the user machine on which the network behavior occurs. The fuzzy-match fields include the uniform resource locator (URL), the time at which the network behavior occurs (Time), the IP address of the user machine that sends the network behavior, the browser (Browser) of the user machine, and the operating system (OS) of the user machine.

[0079] The exact-match and fuzzy-match fields may include further parameters and indicators; those listed here are examples only.

[0080] A matching degree is computed for each of several different fields, and the final total matching degree determines whether the match succeeds. In this embodiment, the fields shown in Table 1 are used for matching:

[0081] Table 1

[0082]

[0083] The thresholds used by the above features, that is, the specific score values of the matching degrees, can be modified to suit the actual situation.

[0084] For any two monitoring records, the matching degrees under each of the above rules are added to give a total matching score. If the total score exceeds the preset matching threshold, the two records are regarded as a successfully matched pair. As a special case, if the unified user machine IDs of the two records differ, the total matching score is set directly to 0. Matched records must therefore have identical unified user machine IDs, so in this embodiment pairwise matching is needed only among logs that share the same unified user machine ID.
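
Because records with different unified user machine IDs can never match, the pairwise comparison can be limited to records that share an ID. A minimal sketch of that grouping step follows; the record layout and the key name `user_id` are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def candidate_pairs(records):
    """Yield only those record pairs that share the same unified user machine ID."""
    by_user = defaultdict(list)
    for rec in records:
        by_user[rec["user_id"]].append(rec)
    # Pairs are formed inside each ID group, never across groups.
    for group in by_user.values():
        yield from combinations(group, 2)


# Hypothetical records from three data acquisition modules.
records = [
    {"user_id": "u1", "module": "m1"},
    {"user_id": "u1", "module": "m2"},
    {"user_id": "u2", "module": "m1"},
    {"user_id": "u1", "module": "m3"},
]
pairs = list(candidate_pairs(records))
print(len(pairs))
```

With n records spread over many IDs, this avoids scoring most of the n(n-1)/2 possible pairs.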

[0085] After matching is complete, all monitoring data from all data acquisition modules that are judged to belong to the same network behavior are merged into a single log record. Log records can be stored on a server. Fields that must be identical across all data acquisition modules (for example, the unified user machine ID) are kept only once; the other fields are saved with a label identifying the corresponding data acquisition module.
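
One possible shape for the merged log record is sketched below. The `label.field` key convention, the module labels, and the sample values are assumptions; the text only requires that shared fields be stored once and the remaining fields be tagged with their data acquisition module.

```python
def merge_records(matched, shared_fields=("user_id",)):
    """Merge matched records into one log record.

    matched: list of (module_label, record) tuples judged to be the same behavior.
    """
    merged = {}
    for label, rec in matched:
        for field, value in rec.items():
            if field in shared_fields:
                # Identical across modules by construction, so keep one copy.
                merged.setdefault(field, value)
            else:
                # Tag every other field with the module that observed it.
                merged[f"{label}.{field}"] = value
    return merged


log_record = merge_records([
    ("m1", {"user_id": "u1", "time": 100}),
    ("m2", {"user_id": "u1", "time": 102, "url": "sports.bbb.com"}),
])
print(log_record)
```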

[0086] A log record matched across multiple data acquisition modules has two advantages. First, diverse monitoring schemes can better reconstruct the information related to a network behavior. Second, the same field has multiple monitoring results from different data acquisition modules that can be compared against each other. Because of this, a richer set of rules than is possible with a single-source log can be used to detect cheating.

[0087] Conventional cheating-analysis rules identify cheating traffic mainly by analyzing the frequency or periodicity of network behaviors from the same user. The multi-source anti-cheating method can additionally identify forged data by checking for fields that do not match.

[0088] The analysis result includes the percentage of cheating traffic among all log records and the data sources of the cheating traffic.

[0089] Cheating traffic analysis includes: identifying forged data by checking the degree of mismatch of the same monitored parameter across the multiple sets of monitoring data for one network behavior in the log record, and/or identifying forged data by monitoring the frequency or period with which the same network behavior occurs.

[0090] a. Identifying forged data by checking mismatched fields. For example, when an advertiser places an advertisement, its own monitoring system holds a description of the purchased ad slot. One form of cheating is to forge views of this ad slot on other, cheaper ad slots and to use technical means to block or alter the URL obtained by the advertiser's monitoring system. With multi-source logs, it is enough to find monitoring data from another data source that matches the record from the advertiser's monitoring system, and to compare the real URL in that data with the ad-slot description, in order to determine whether the media owner is deceiving the advertiser with a forged URL.

[0091] b. Using the rich set of fields from multiple data sources to design many different rules, and combining them, to overcome the limitations of single-source rules. For example, if a user shows only a few network behaviors in the data from most data sources but a large number in the data from one particular data source, this indicates that a cheater is inflating traffic against that data source's monitoring code.
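
The volume rule in item b can be illustrated with a simple comparison of per-source counts. This is a sketch, not the rule set of the embodiment: the function name, the median comparison, and the factor of 10 are all assumptions chosen for the example.

```python
from collections import Counter
from statistics import median


def inflated_sources(events, ratio=10):
    """Flag (user, source) pairs whose count far exceeds that user's other sources.

    events: list of (user_id, source) tuples, one per monitored network behavior.
    ratio: hypothetical factor over the median count of the user's other sources.
    """
    counts = Counter(events)
    per_user = {}
    for (user, source), n in counts.items():
        per_user.setdefault(user, {})[source] = n
    flagged = set()
    for user, by_source in per_user.items():
        for source, n in by_source.items():
            others = [m for s, m in by_source.items() if s != source]
            # A user seen in only one source gives nothing to compare against.
            if others and n > ratio * median(others):
                flagged.add((user, source))
    return flagged


events = (
    [("u1", "s1")] * 2 + [("u1", "s2")] * 3 + [("u1", "s3")] * 200
    + [("u2", "s1")] * 5 + [("u2", "s2")] * 5
)
print(inflated_sources(events))
```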

[0092] The cheating-traffic analysis results can also be used to analyze the monitoring data of each data acquisition module and to find data sources that a cheater may be targeting. For example, a cheater may use disguise techniques that cause many erroneous fields in a data source, or deny access so that many fields are left blank. Comparing the same field across the other data sources identifies these anomalies effectively. Appropriate technical measures can then be taken to counter these attempts to deceive the anti-cheating system, or the cheater can be contacted directly and asked to stop deceiving and restricting the monitoring system.
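
As an illustration of comparing one field across data sources, the sketch below computes how often a field is blank in each source and flags sources that stand out. The log layout (one dict per matched behavior, keyed by source), the function names, and the 0.5 margin are assumptions, not part of the described system.

```python
def blank_rates(log_records, field, sources):
    """Fraction of records per source in which `field` is missing or empty."""
    rates = {}
    for s in sources:
        seen = [rec[s] for rec in log_records if s in rec]
        if seen:
            blanks = sum(1 for r in seen if not r.get(field))
            rates[s] = blanks / len(seen)
    return rates


def suspicious_sources(rates, margin=0.5):
    """Sources whose blank rate exceeds the best source by more than `margin`."""
    if not rates:
        return []
    best = min(rates.values())
    return sorted(s for s, r in rates.items() if r - best > margin)


# Hypothetical matched logs: module m1 rarely sees the URL, module m2 always does.
logs = [
    {"m1": {"url": ""}, "m2": {"url": "a.example"}},
    {"m1": {"url": None}, "m2": {"url": "b.example"}},
    {"m1": {"url": "c.example"}, "m2": {"url": "c.example"}},
    {"m1": {}, "m2": {"url": "d.example"}},
]
rates = blank_rates(logs, "url", ["m1", "m2"])
print(rates, suspicious_sources(rates))
```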

[0093] Based on the above, the matching module in the anti-cheating device of this embodiment of the invention comprises an exact matching module, a fuzzy matching module, and a judgment module;

[0094] The exact matching module compares the exact-match fields of two independent sets of monitoring data and obtains an exact comparison result;

[0095] The fuzzy matching module compares the fuzzy-match fields of two independent sets of monitoring data and obtains a fuzzy comparison result;

[0096] The judgment module determines, from the exact comparison result and the fuzzy comparison result, whether the multiple independent sets of monitoring data belong to the same network behavior.

[0097] The judgment module decides as follows:

[0098] When one or more exact comparison results differ, the two independent sets of monitoring data are judged not to belong to the same network behavior;

[0099] When the difference in one or more fuzzy-match fields is greater than the fuzzy threshold, the two independent sets of monitoring data are judged not to belong to the same network behavior;

[0100] When all exact-match fields are identical and the differences in all fuzzy-match fields are less than the fuzzy threshold, the two independent sets of monitoring data are judged to belong to the same network behavior;

[0101] Alternatively,

[0102] When one or more exact comparison results differ, the two independent sets of monitoring data are judged not to belong to the same network behavior;

[0103] When the total matching degree of the fuzzy comparison result is less than the matching threshold, the two independent sets of monitoring data are judged not to belong to the same network behavior;

[0104] When all exact-match fields are identical and the total matching degree of the fuzzy comparison result is greater than the matching threshold, the two independent sets of monitoring data are judged to belong to the same network behavior.

[0105] Compared with existing anti-cheating methods and devices, the advantages of the invention are:

[0106] (1) The multi-source approach collects more information about network behaviors, so network behaviors that involve cheating can be identified more accurately.

[0107] (2) Using multiple data sources effectively prevents cheaters from targeting the system with forged data and similar means, so the results are more stable.

[0108] (3) By analyzing the historical data of each data source, the multi-source anti-cheating method and system can detect data sources that a cheater may be targeting, and the monitoring scheme can be improved to increase the data source's resistance to interference.

[0109] Internet advertising anti-cheating monitoring is described below as an example, with reference to Figures 1, 2, and 3:

[0110] Multiple Internet advertising monitoring systems store, record, and extract information about each network behavior of the user machine represented by each visiting user object (that is, a cookie);

[0111] For each network behavior of each visiting cookie, a monitoring system records one or more of the following: the cookie's unified unique identifier (ID), the visit time, the visited URL, the browsing behavior, and similar information. In actual operation, the fields recorded by different monitoring systems may differ.

[0112] The cookie user-machine information and/or browsing behavior recorded by a single data acquisition module is shown in Table 2.

[0113] Table 2

[0114]

[0115] The monitoring data from the multiple Internet advertising data acquisition modules are aggregated. A sample of the aggregated data is shown in Table 3.

[0116] Table 3

[0117]

[0119] The matching module matches any two monitoring records in the data to be matched and finds the multiple logs that belong to the same network behavior. The matching results in the various cases are illustrated below.

[0120] Table 4

[0121]

[0122] As shown in Table 4, the unified user machine IDs in the two monitoring records from different data acquisition modules do not agree, so the two records fail to match and belong to different network behaviors.

[0123] Table 5

[0124]

[0125] As shown in Table 5, the unified user machine IDs in the two monitoring records from different data acquisition modules agree, so the matching calculation for the fuzzy fields must be carried out.

[0126] According to the formula in Table 1, the matching degree of the time at which the network behavior was sent equals

[0127] 1/(number of seconds between the two times + 1) = 1/15993 ≈ 0;

[0128] The matching degree of the URL of the page where the network behavior occurred equals 0.2;

[0129] The matching degree of the browsing behavior equals 0.

[0130] The total matching degree of the three fuzzy fields is therefore 0.2.

[0131] In this embodiment the preset threshold for the total matching degree is 1, so the total matching degree between these two monitoring records is less than the threshold, and the two records do not belong to the same network behavior either.

[0132] Table 6

[0133]

[0134] As shown in Table 6, the unified user machine IDs in the two monitoring records from different data acquisition modules agree, so the total matching degree of the fuzzy fields is calculated next.

[0135] According to the formula in Table 1, the matching degree of the time at which the network behavior was sent equals

[0136] 1/(number of seconds between the two times + 1) = 1/3 ≈ 0.33;

[0137] The matching degree of the URL of the page where the network behavior occurred equals 0.5;

[0138] The matching degree of the browsing behavior equals 0.2.

[0139] The total matching degree of the fuzzy fields is therefore 1.03.

[0140] As stated above, the preset threshold for the total matching degree in this embodiment is 1, so the total matching degree between these two monitoring records is greater than the threshold, and the two records belong to the same network behavior.
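
The arithmetic of the Table 5 and Table 6 examples can be checked with a few lines of code. The gaps of 15992 and 2 seconds are inferred from the stated fractions 1/15993 and 1/3; the URL and browsing-behavior degrees are taken from the text as given, since the rules that produce them (Table 1) are not reproduced here.

```python
MATCH_THRESHOLD = 1  # preset threshold for the total matching degree in this embodiment


def time_match_degree(seconds_apart):
    """Matching degree of the time field: 1 / (seconds apart + 1)."""
    return 1 / (seconds_apart + 1)


# Table 5 example: time gap of 15992 s, URL degree 0.2, browsing-behavior degree 0.
total_5 = time_match_degree(15992) + 0.2 + 0
# Table 6 example: time gap of 2 s, URL degree 0.5, browsing-behavior degree 0.2.
total_6 = time_match_degree(2) + 0.5 + 0.2

print(round(total_5, 2), total_5 > MATCH_THRESHOLD)
print(round(total_6, 2), total_6 > MATCH_THRESHOLD)
```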

[0141] Through the matching process above, the multiple monitoring records of one network behavior can be matched together and used as the log record of that network behavior.

[0142] Cheating analysis examines the matched logs to identify cheating traffic. This embodiment uses the method of identifying forged data by checking mismatched fields as an example of how the logs are analyzed.

[0143] Table 7

[0144]

[0145] Table 7 shows two matched monitoring records from different data acquisition modules; in this example the client has configured ad campaign 1 to run on a women's channel.

[0146] Judged only from the monitoring data of data acquisition module 1, which did not capture the URL, this network behavior shows nothing abnormal. However, analysis together with the matched monitoring data from data acquisition module 2 reveals that the actual URL of this network behavior has the form sports.bbb.com. From this URL it can be inferred that the network behavior actually visited a page of a sports channel, whereas ad campaign 1 in data acquisition module 1 should have run on a women's channel. The network behavior can therefore be suspected of involving the cheating technique of disguising an exposure on ad slot A as an exposure on ad slot B.

[0147] In this embodiment, on data acquisition module 1, which holds the ad campaign information, the advertiser hid the URL of the exposure page, so that monitoring system could not detect this form of cheating. The multi-source anti-cheating method found the URL of this exposure from another source, data acquisition module 2, and successfully identified the cheating.
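
The cross-check in this example can be sketched as follows. Treating the leftmost host label as the channel name is a hypothetical rule adopted only for this illustration, as are the record layout, the campaign table, and the host `women.bbb.com`; a real system would match URLs against the ad-slot description in whatever form the advertiser supplies it.

```python
from urllib.parse import urlsplit


def channel_of(url):
    """Hypothetical rule: the leftmost host label names the channel."""
    # Prefix '//' so that a bare host such as 'sports.bbb.com' parses as a netloc.
    host = urlsplit(url if "//" in url else "//" + url).hostname or ""
    return host.split(".")[0]


def url_mismatch(log_record, campaign_channels):
    """True if any URL seen by any module points to a channel other than the booked one."""
    expected = campaign_channels[log_record["campaign"]]
    observed = [channel_of(u) for u in log_record["urls"].values() if u]
    return any(ch != expected for ch in observed)


campaign_channels = {"campaign 1": "women"}
# Module 1 has no URL; module 2 captured the real one.
suspect = {"campaign": "campaign 1",
           "urls": {"module 1": None, "module 2": "sports.bbb.com"}}
clean = {"campaign": "campaign 1",
         "urls": {"module 1": None, "module 2": "http://women.bbb.com/page"}}

print(url_mismatch(suspect, campaign_channels))
print(url_mismatch(clean, campaign_channels))
```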

[0148] In practice, cheating traffic can be analyzed in many ways. The example above shows that multi-source data supports not only the methods used with a single data source but also cross-validation across multiple data sets, which identifies cheating more accurately.

[0149] Cheating result feedback analyzes the results of each data acquisition module to find data sources that cheaters are targeting, so that those sources can be improved.

[0150] In this embodiment, data acquisition module 1 was restricted by the advertiser from obtaining the URL. If this situation is common on data acquisition module 1, feedback can be given to data acquisition module 1 and the advertiser can be asked to lift the restriction on data acquisition module 1, thereby improving the monitoring quality of data acquisition module 1.

[0151] The above embodiments only illustrate the technical solution of the invention and do not limit it; the invention has been described in detail only with reference to preferred embodiments. Those of ordinary skill in the art should understand that the technical solution of the invention may be modified or equivalently substituted without departing from its spirit and scope, and all such modifications fall within the scope of the claims of the invention.

Claims (10)

1. An Internet monitoring anti-cheating method, characterized by comprising: A. collecting data on a single network activity using multiple monitoring schemes to obtain multiple independent sets of monitoring data; B. matching the multiple independent sets of monitoring data, determining whether they belong to the same network behavior, and aggregating all monitoring data judged to belong to the same network behavior into the log record of that network behavior; C. performing cheating traffic analysis on the log record of each network behavior to obtain an analysis result.
2. The method of claim 1, characterized in that the multiple monitoring schemes include embedding code directly in the frame of the web page where the network behavior occurs, embedding code in a Flash animation or JavaScript script in the visited page, and installing a browser plug-in or client software on the user machine.
3. The method of claim 1, characterized in that the step of matching the multiple independent sets of monitoring data and determining whether they belong to the same network behavior comprises: B1. dividing the fields of the monitoring data into one or more exact-match fields, or into one or more exact-match fields and one or more fuzzy-match fields; B2. comparing the multiple independent sets of monitoring data pairwise, field by field; when comparing exact-match fields, if one or more exact-match fields differ between two independent sets of monitoring data, judging that the two sets do not belong to the same network behavior; when comparing fuzzy-match fields, if the difference in one or more fuzzy-match fields between two independent sets of monitoring data is greater than that field's fuzzy threshold, judging that the two sets do not belong to the same network behavior; when all exact-match fields are identical and the differences in all fuzzy-match fields are less than the fuzzy threshold, judging that the two sets belong to the same network behavior.
4. The method of claim 1, characterized in that the step of matching the multiple independent sets of monitoring data and determining whether they belong to the same network behavior comprises: b1. dividing the fields of the monitoring data into one or more exact-match fields, or into one or more exact-match fields and one or more fuzzy-match fields; b2. comparing the multiple independent sets of monitoring data pairwise; when comparing exact-match fields, if an exact-match field is identical in two independent sets of monitoring data, setting the matching degree of that field to 1, and if it differs, setting the matching degree of that field to 0; when comparing fuzzy-match fields, setting the matching degree of each field to a value from 0 to 1 according to the difference in that field, and adding the matching degrees of all fuzzy-match fields to obtain the total matching degree; when one or more exact-match fields of the two independent sets of monitoring data have a matching degree of 0, judging that the two sets do not belong to the same network behavior; when the total matching degree of the fuzzy-match fields is less than the matching threshold, judging that the two sets do not belong to the same network behavior; when the matching degree of every exact-match field is 1 and the total matching degree of the fuzzy-match fields is greater than the matching threshold, judging that the two sets belong to the same network behavior.
5. The method of claim 3 or 4, characterized in that the exact-match fields include the identifier (ID) of the user machine on which the network behavior occurs, and the fuzzy-match fields include the uniform resource locator (URL), the time at which the network behavior occurs (Time), the IP address of the user machine that sends the network behavior, the browser (Browser) of the user machine on which the network behavior occurs, and the operating system (OS) of the user machine on which the network behavior occurs.
6. The method of claim 1, characterized in that the cheating traffic analysis comprises: identifying forged data by checking the degree of mismatch of the same monitored parameter across the multiple sets of monitoring data in the network behavior log record.
7. The method of claim 1, characterized in that the analysis result of step C includes the percentage of cheating traffic among all log records and the data sources of the cheating traffic.
8. An Internet monitoring anti-cheating device, characterized by comprising multiple data acquisition modules, a matching module, and an analysis module; each data acquisition module collects data on a single network activity using a monitoring scheme to obtain monitoring data; the matching module matches multiple independent sets of monitoring data, determines whether they belong to the same network behavior, and aggregates all monitoring data judged to belong to the same network behavior into one log record; the analysis module performs cheating traffic analysis on the log record to obtain an analysis result.
9. The device of claim 8, characterized in that the matching module comprises an exact matching module, a fuzzy matching module, and a judgment module; the exact matching module compares the exact-match fields of two independent sets of monitoring data and obtains an exact comparison result; the fuzzy matching module compares the fuzzy-match fields of two independent sets of monitoring data and obtains a fuzzy comparison result; the judgment module determines, from the exact comparison result and the fuzzy comparison result, whether the multiple independent sets of monitoring data belong to the same network behavior.
10. The device of claim 9, characterized in that the judgment module decides as follows: when one or more exact comparison results differ, the two independent sets of monitoring data are judged not to belong to the same network behavior; when the difference in one or more fuzzy-match fields is greater than that field's fuzzy threshold, the two sets are judged not to belong to the same network behavior; when all exact-match fields are identical and the differences in all fuzzy-match fields are less than the fuzzy threshold, the two sets are judged to belong to the same network behavior; or, when one or more exact comparison results differ, the two sets are judged not to belong to the same network behavior; when the total matching degree of the fuzzy comparison result is less than the matching threshold, the two sets are judged not to belong to the same network behavior; when all exact-match fields are identical and the total matching degree of the fuzzy comparison result is greater than the matching threshold, the two sets are judged to belong to the same network behavior.
CN201310079359.8A 2013-03-13 2013-03-13 An Internet monitoring anti-cheating method and apparatus CN104050178B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310079359.8A CN104050178B (en) 2013-03-13 2013-03-13 An Internet monitoring anti-cheating method and apparatus

Publications (2)

Publication Number Publication Date
CN104050178A true CN104050178A (en) 2014-09-17
CN104050178B CN104050178B (en) 2017-09-22

Family

ID=51503029

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310079359.8A CN104050178B (en) 2013-03-13 2013-03-13 An Internet monitoring anti-cheating method and apparatus

Country Status (1)

Country Link
CN (1) CN104050178B (en)

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105183873A (en) * 2015-09-18 2015-12-23 北京博雅立方科技有限公司 Malicious clicking behavior detection method and device
CN105279674A (en) * 2015-10-13 2016-01-27 精硕世纪科技(北京)有限公司 Method and device for determining cheating behaviors of mobile advertisement delivering device
CN105653944A (en) * 2015-12-25 2016-06-08 北京奇虎科技有限公司 Detection method and device of cheating behaviors
CN105718462A (en) * 2014-12-02 2016-06-29 阿里巴巴集团控股有限公司 Cheating detection method and apparatus for application operation
CN105975379A (en) * 2016-05-25 2016-09-28 北京比邻弘科科技有限公司 False mobile device recognition method and system
CN106250761A (en) * 2016-07-28 2016-12-21 广州爱九游信息技术有限公司 Equipment, device and method for identifying web automation tool
CN106301980A (en) * 2015-05-28 2017-01-04 腾讯科技(深圳)有限公司 Method and device for detecting network flow generating tool
CN106294529A (en) * 2015-06-29 2017-01-04 阿里巴巴集团控股有限公司 Method and device for identifying abnormal operation of user
CN106355431A (en) * 2016-08-18 2017-01-25 晶赞广告(上海)有限公司 Detection method, device and terminal for cheating traffic
CN106372959A (en) * 2016-08-22 2017-02-01 广州图灵科技有限公司 Internet-based user access behavior digital marketing system and method
CN106384252A (en) * 2016-09-26 2017-02-08 广州艾媒数聚信息咨询股份有限公司 Mobile advertisement monitoring-based page visit theft-prevention method and system
CN106469383A (en) * 2015-08-14 2017-03-01 北京国双科技有限公司 Advertising quality detection method and device
CN107241347A (en) * 2017-07-10 2017-10-10 上海精数信息科技有限公司 Method and device for analyzing quality of advertising traffic

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101093510A (en) * 2007-07-25 2007-12-26 北京搜狗科技发展有限公司 Anti cheating method and system for aiming at cheat on web page
US20080162202A1 (en) * 2006-12-29 2008-07-03 Richendra Khanna Detecting inappropriate activity by analysis of user interactions
CN101393629A (en) * 2007-09-20 2009-03-25 阿里巴巴集团控股有限公司 Implementing method and apparatus for network advertisement effect monitoring
CN102724182A (en) * 2012-05-30 2012-10-10 北京像素软件科技股份有限公司 Recognition method of abnormal client side
CN103390027A (en) * 2013-06-25 2013-11-13 亿赞普(北京)科技有限公司 Internet advertisement anti-spamming method and system

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105718462A (en) * 2014-12-02 2016-06-29 阿里巴巴集团控股有限公司 Cheating detection method and apparatus for application operation
CN106301980A (en) * 2015-05-28 2017-01-04 腾讯科技(深圳)有限公司 Method and device for detecting network flow generating tool
CN106294529A (en) * 2015-06-29 2017-01-04 阿里巴巴集团控股有限公司 Method and device for identifying abnormal operation of user
CN106469383A (en) * 2015-08-14 2017-03-01 北京国双科技有限公司 Advertising quality detection method and device
CN105183873A (en) * 2015-09-18 2015-12-23 北京博雅立方科技有限公司 Malicious clicking behavior detection method and device
CN105279674A (en) * 2015-10-13 2016-01-27 精硕世纪科技(北京)有限公司 Method and device for determining cheating behaviors of mobile advertisement delivering device
CN105653944A (en) * 2015-12-25 2016-06-08 北京奇虎科技有限公司 Detection method and device of cheating behaviors
CN105653944B (en) * 2015-12-25 2018-06-12 北京奇虎科技有限公司 Method and apparatus for detecting cheating
CN105975379A (en) * 2016-05-25 2016-09-28 北京比邻弘科科技有限公司 False mobile device recognition method and system
CN106250761A (en) * 2016-07-28 2016-12-21 广州爱九游信息技术有限公司 Equipment, device and method for identifying web automation tool
CN106355431A (en) * 2016-08-18 2017-01-25 晶赞广告(上海)有限公司 Detection method, device and terminal for cheating traffic
CN106372959A (en) * 2016-08-22 2017-02-01 广州图灵科技有限公司 Internet-based user access behavior digital marketing system and method
CN106384252A (en) * 2016-09-26 2017-02-08 广州艾媒数聚信息咨询股份有限公司 Mobile advertisement monitoring-based page visit theft-prevention method and system
CN107241347A (en) * 2017-07-10 2017-10-10 上海精数信息科技有限公司 Method and device for analyzing quality of advertising traffic

Also Published As

Publication number Publication date
CN104050178B (en) 2017-09-22

Similar Documents

Publication Publication Date Title
US9894173B2 (en) System and method to determine the validity of an interaction on a network
Cooley et al. Grouping web page references into transactions for mining world wide web browsing patterns
AU2013204865B2 (en) Methods and apparatus to share online media impressions data
US9967603B2 (en) Video viewer targeting based on preference similarity
CA2460668C (en) Method and system for characterization of online behavior
US9344343B2 (en) Methods and apparatus to determine impressions using distributed demographic information
US20170201540A1 (en) Protecting a Server Computer by Detecting the Identity of a Browser on a Client Computer
US20070255821A1 (en) Real-time click fraud detecting and blocking system
Roesner et al. Detecting and defending against third-party tracking on the web
US9659105B2 (en) Methods and apparatus to track web browsing sessions
Mayer et al. Third-party web tracking: Policy and technology
Acar et al. FPDetective: dusting the web for fingerprinters
US7657626B1 (en) Click fraud detection
Dreze et al. Is Internet advertising ready for prime time?
US8661119B1 (en) Determining a number of users behind a set of one or more internet protocol (IP) addresses
Li et al. Knowing your enemy: understanding and detecting malicious web advertising
Manadhata et al. Measuring the attack surfaces of two FTP daemons
US9430778B2 (en) Authenticating users for accurate online audience measurement
JP5055133B2 (en) The method and device for exposing behavior data of the cross network users
US20080028446A1 (en) System and method of efficient e-mail link expiration
Mowery et al. Fingerprinting information in JavaScript implementations
CN102460416B (en) Domain traffic ranking
Gill et al. Best paper--Follow the money: understanding economics of online aggregation and advertising
US20090083417A1 (en) Method and apparatus for tracing users of online video web sites
US7853684B2 (en) System and method for processing web activity data

Legal Events

Date Code Title Description
C06 Publication
C10 Entry into substantive examination
CB02 Change of applicant information
GR01 Patent grant