CN102629933A - Method for identifying actual behavior of user to click and access website and system thereof - Google Patents

Method for identifying actual behavior of user to click and access website and system thereof

Info

Publication number
CN102629933A
Authority
CN
China
Prior art keywords
http request
behavior
user
advance
browser
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201210047328XA
Other languages
Chinese (zh)
Other versions
CN102629933B (en
Inventor
陈钊毅
袁伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sangfor Technologies Co Ltd
Original Assignee
Sangfor Network Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sangfor Network Technology Shenzhen Co Ltd filed Critical Sangfor Network Technology Shenzhen Co Ltd
Priority to CN201210047328.XA priority Critical patent/CN102629933B/en
Publication of CN102629933A publication Critical patent/CN102629933A/en
Application granted granted Critical
Publication of CN102629933B publication Critical patent/CN102629933B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Information Transfer Between Computers (AREA)

Abstract

The invention discloses a method and a system for identifying a user's actual behavior of clicking to access a website. The method comprises the following steps: detecting whether the type of the browser that initiated an HTTP request is a preset browser type; if the browser type is a preset type, determining whether the HTTP request contains a preset Accept field; and if the HTTP request contains the preset Accept field, marking the HTTP request as user website access behavior, and otherwise marking it as non-user behavior. With the method and system of the invention, the user's actual behavior of clicking to access a website can be identified accurately, which substantially reduces the identification interference caused by the many HTTP requests that the browser issues automatically while a website is being accessed, and makes it easier for network administrators to compile statistics.

Description

A method and system for identifying a user's actual click-to-access website behavior
Technical field
The present invention relates to the field of network data analysis, and in particular to a method and system for identifying a user's actual click-to-access website behavior.
Background art
A core function of internet behavior management is analyzing users' website access behavior. Existing internet behavior management products cannot accurately determine whether a given HTTP request was initiated by the user visiting a website or was initiated automatically by the browser while the user was visiting it. Internet behavior management products are not the only ones that need to analyze users' actual website clicks: application performance management products, website traffic statistics systems, and website performance analysis systems all need to distinguish a user's actual click behavior from the browser's automatic loading behavior, so the range of application scenarios is wide.
For example, when a browser is used to visit Tencent's website www.qq.com, the browser sends more than 120 HTTP requests. Only one of these requests is caused by the user visiting the site; the others are initiated automatically by the browser in order to download and display the site's images, advertisements, and other content.
For an internet behavior management product, identifying the HTTP requests produced by the user's website visits is essential to analyzing the user's website access behavior. However, no mature and effective technique for this exists at present. Existing detection schemes generally check whether the returned content type is text/html, or examine the Referer field and count occurrences. These methods are unscientific and prone to misjudgment, so many HTTP requests that do not result from a user click are also judged to be user clicks.
Summary of the invention
The technical problem to be solved by the present invention is the defect in the prior art in determining whether an HTTP request produced by website access results from the user's actual click behavior. To address it, the invention provides a method and system for identifying a user's actual click-to-access website behavior that can effectively identify the HTTP requests produced by user clicks.
The technical solution adopted by the present invention to solve this technical problem is as follows:
A method for identifying a user's actual click-to-access website behavior is provided, comprising the following steps:
detecting whether the type of the browser that initiated an HTTP request is a preset browser type;
if the browser type is a preset type, determining whether the HTTP request contains the preset Accept field;
if the HTTP request contains the preset Accept field, marking the HTTP request as user website access behavior, and otherwise marking it as non-user behavior.
In the method of the present invention, the preset browser type is any one of Firefox, Chrome, and Safari.
In the method of the present invention, the preset Accept field is "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8".
In the method of the present invention, if the Accept field in the HTTP request is any one of "*/*", "text/css,*/*;q=0.1", or "image/png,image/*;q=0.8,*/*;q=0.5", the HTTP request was initiated automatically by the Firefox browser and is marked as non-user behavior; if the Accept field in the HTTP request is "*/*" or "text/css,*/*;q=0.1", the HTTP request was initiated automatically by the Chrome or Safari browser and is marked as non-user behavior.
Another technical solution adopted by the present invention to solve this technical problem is as follows:
A system for identifying a user's actual click-to-access website behavior is provided, comprising:
a browser type detection module, configured to detect whether the type of the browser that initiated an HTTP request is a preset browser type;
an Accept field judgment module, configured to determine, when the browser type is a preset browser type, whether the HTTP request contains the preset Accept field;
a marking module, configured to mark the HTTP request as user website access behavior when the HTTP request contains the preset Accept field, and otherwise to mark it as non-user behavior.
In the system of the present invention, the preset browser type is any one of Firefox, Chrome, and Safari.
In the system of the present invention, the preset Accept field is "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8".
In the system of the present invention, the marking module is further configured to mark the HTTP request as non-user website access behavior when the Accept field in the HTTP request is any one of "*/*", "text/css,*/*;q=0.1", or "image/png,image/*;q=0.8,*/*;q=0.5".
The beneficial effect of the present invention is as follows: the invention detects whether the type of the browser that initiated an HTTP request is a preset browser type and determines whether the HTTP request contains the preset Accept field, and marks the HTTP request as user website access behavior or non-user behavior accordingly. The user's actual click-to-access behavior can thus be identified more accurately, and the identification interference caused by the large number of HTTP requests that the browser sends automatically during a website visit is greatly reduced. Network administrators can then see clearly which websites intranet users have visited and take appropriate network management measures.
Description of drawings
The present invention is further described below with reference to the accompanying drawings and embodiments. In the drawings:
Fig. 1 is a flow chart of the method for identifying a user's actual click-to-access website behavior according to an embodiment of the invention;
Fig. 2 is a schematic structural diagram of the system for identifying a user's actual click-to-access website behavior according to an embodiment of the invention.
Detailed description of the embodiments
To make the objectives, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the invention and are not intended to limit it.
The embodiment of the invention distinguishes actual user click behavior from the browser's automatic loading behavior mainly by examining the browser type and the Accept field of the HTTP requests made during a website visit. It can be applied in products such as internet behavior management products, application performance management products, website traffic statistics systems, and website performance analysis systems.
As shown in Fig. 1, the method for identifying a user's actual click-to-access website behavior according to the embodiment of the invention mainly comprises the following steps:
S101: Detect whether the type of the browser that initiated the HTTP request is a preset browser type. When a user visits a website (by entering the site URL in the browser's address bar or by clicking a link on a site), HTTP requests are produced; a given request may come from the user's actual click or from the browser's automatic loading behavior.
In one embodiment of the invention, the preset browser type is any one of Firefox, Chrome, and Safari. Determining from an HTTP request which browser initiated it is prior art and is not described further here.
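The text treats browser detection as prior art and does not specify it. One common approach is to inspect the User-Agent request header; the following is a minimal sketch under that assumption. The function name `detect_browser` and the token checks are illustrative and are not taken from the patent.

```python
from typing import Optional


def detect_browser(user_agent: str) -> Optional[str]:
    """Return 'Firefox', 'Chrome', 'Safari', or None for any other client.

    Assumption: the browser is inferred from product tokens in the
    User-Agent header. This is a heuristic, not the patent's own method.
    """
    ua = user_agent or ""
    if "Firefox/" in ua:
        return "Firefox"
    # Order matters: Chrome's User-Agent string also contains "Safari/",
    # so Chrome has to be tested before Safari.
    if "Chrome/" in ua:
        return "Chrome"
    if "Safari/" in ua:
        return "Safari"
    return None
```

Other browsers built on the same engine as Chrome may also carry a `Chrome/` token, so a production detector would likely need additional checks.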
S102: If the browser type is a preset type, determine whether the HTTP request contains the preset Accept field. If the browser is not Firefox, Chrome, or Safari, other methods must be used to determine whether the HTTP request results from the user's actual click-to-access website behavior.
One embodiment of the invention mainly targets the Firefox, Chrome, and Safari browsers, for which the preset Accept field is "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8". For these three browsers, the Accept field used in HTTP requests produced by user clicks is fixed, as shown in Table 1 below:
Table 1: Accept field in HTTP requests produced by user clicks
Browser    Accept field content in the HTTP request
Firefox    text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Chrome     text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Safari     text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
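The comparison against the Table 1 value can be sketched as a string match. The example below assumes that whitespace and letter case should be ignored when comparing, because the same header may be written with or without spaces after commas; that normalization is an assumption of this sketch and is not stated in the patent. The names `PRESET_ACCEPT`, `normalize_accept`, and `has_preset_accept` are hypothetical.

```python
# Value from Table 1 (identical for Firefox, Chrome, and Safari).
PRESET_ACCEPT = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"


def normalize_accept(value: str) -> str:
    """Remove all whitespace and lowercase the header value (an assumption)."""
    return "".join(value.split()).lower()


def has_preset_accept(accept_header: str) -> bool:
    """True if the Accept header equals the Table 1 value after normalization."""
    return normalize_accept(accept_header) == normalize_accept(PRESET_ACCEPT)
```

An exact byte-for-byte comparison would be stricter; which variant is appropriate depends on how the traffic is captured.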
S103: If the HTTP request contains the preset Accept field (see Table 1 above), mark the HTTP request as user website access behavior, that is, the user's actual click-to-access website behavior.
S104: If the HTTP request does not contain the preset Accept field, mark the HTTP request as non-user behavior.
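Steps S101 to S104 can be sketched as a single function. This is an illustration under stated assumptions, not the patented implementation: the browser name is assumed to have been determined already, headers are passed as a plain mapping, and the labels "user" and "non-user" as well as the name `classify_request` are hypothetical.

```python
from typing import Mapping, Optional

PRESET_BROWSERS = ("Firefox", "Chrome", "Safari")
PRESET_ACCEPT = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"


def classify_request(browser: Optional[str], headers: Mapping[str, str]) -> Optional[str]:
    """Return 'user', 'non-user', or None when the browser is not a preset type."""
    # S101: only the preset browser types are handled by this method.
    if browser not in PRESET_BROWSERS:
        return None  # left to other identification methods

    # S102: look up the Accept header (header names are case-insensitive)
    # and strip whitespace so spacing differences do not matter.
    accept = ""
    for name, value in headers.items():
        if name.lower() == "accept":
            accept = "".join(value.split()).lower()

    # S103 / S104: mark according to whether the preset value is present.
    return "user" if accept == PRESET_ACCEPT else "non-user"
```

For example, `classify_request("Chrome", {"Accept": "*/*"})` yields `"non-user"`, while a request from a browser outside the preset list yields `None` so that a caller can fall back to another method.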
In the embodiment of the invention, if the Accept field in the HTTP request is any one of "*/*", "text/css,*/*;q=0.1", or "image/png,image/*;q=0.8,*/*;q=0.5", the HTTP request was initiated automatically by the Firefox browser and is marked as non-user behavior; if the Accept field in the HTTP request is "*/*" or "text/css,*/*;q=0.1", the HTTP request was initiated automatically by the Chrome or Safari browser and is marked as non-user behavior.
The details are shown in Table 2 below:
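Table 2: Accept field in HTTP requests initiated automatically by the browser
Browser    Accept field content in the HTTP request
Firefox    */*
Firefox    text/css,*/*;q=0.1
Firefox    image/png,image/*;q=0.8,*/*;q=0.5
Chrome     */*
Chrome     text/css,*/*;q=0.1
Safari     */*
Safari     text/css,*/*;q=0.1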
If the Accept field content of an HTTP request matches a correspondence in Table 2 above, the request is marked as non-user behavior. This greatly reduces the identification interference caused by the many HTTP requests that the browser sends automatically during a website visit. From the marked HTTP requests it is easy to compile statistics on users' actual website clicks, so network administrators can see clearly which websites intranet users have visited and take appropriate network management measures.
If the Accept field content of an HTTP request matches neither Table 1 nor Table 2, other identification methods must be used to continue identification. Verification shows that for the three browsers Firefox, Chrome, and Safari, the Accept field in automatically initiated requests is fixed in most cases and matches Table 2; in a few cases it is the same as the Accept field produced by a user click. With the above method, it can be determined for more than 80% of HTTP requests whether they result from the user's actual click-to-access website behavior.
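The three outcomes described above (a Table 1 match, a Table 2 match, or neither) can be sketched as a lookup with an explicit undecided result. Everything named here (`Verdict`, `verdict_for`, `decided_share`) is hypothetical, and the whitespace normalization is an assumption of the sketch. The helper `decided_share` only shows how the fraction of requests resolved by the two tables could be measured on a sample; it does not reproduce the 80% figure.

```python
from enum import Enum
from typing import Iterable, Tuple


class Verdict(Enum):
    USER = "user"
    NON_USER = "non-user"
    UNDECIDED = "undecided"  # needs another identification method


# Table 1: Accept value produced by a user click (same for all three browsers).
TABLE_1 = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"

# Table 2: Accept values of automatically initiated requests, per browser.
_COMMON_AUTO = {"*/*", "text/css,*/*;q=0.1"}
TABLE_2 = {
    "Firefox": _COMMON_AUTO | {"image/png,image/*;q=0.8,*/*;q=0.5"},
    "Chrome": set(_COMMON_AUTO),
    "Safari": set(_COMMON_AUTO),
}


def verdict_for(browser: str, accept_header: str) -> Verdict:
    """Classify one request from its browser name and Accept header."""
    accept = "".join(accept_header.split()).lower()
    if browser not in TABLE_2:
        return Verdict.UNDECIDED
    if accept == TABLE_1:
        return Verdict.USER
    if accept in TABLE_2[browser]:
        return Verdict.NON_USER
    return Verdict.UNDECIDED


def decided_share(requests: Iterable[Tuple[str, str]]) -> float:
    """Fraction of (browser, accept) pairs that Tables 1 and 2 can decide."""
    verdicts = [verdict_for(browser, accept) for browser, accept in requests]
    if not verdicts:
        return 0.0
    decided = sum(v is not Verdict.UNDECIDED for v in verdicts)
    return decided / len(verdicts)
```

Note that the image Accept value is listed in Table 2 only for Firefox, so the same header from Chrome or Safari stays undecided in this sketch.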
As shown in Fig. 2, the system for identifying a user's actual click-to-access website behavior according to the embodiment of the invention mainly comprises a browser type detection module 201, an Accept field judgment module 202, and a marking module 203, wherein:
the browser type detection module 201 is configured to detect whether the type of the browser that initiated an HTTP request is a preset browser type;
the Accept field judgment module 202 is configured to determine, when the browser type is a preset browser type, whether the HTTP request contains the preset Accept field;
the marking module 203 is configured to mark the HTTP request as user website access behavior when the HTTP request contains the preset Accept field, and otherwise to mark it as non-user behavior.
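The division into modules 201, 202, and 203 can be sketched as three small classes wired together. The class names, the `HttpRequest` container, and the `process` function are hypothetical; the sketch also assumes that the browser type is read from the User-Agent header and that header names arrive in their usual capitalization.

```python
from dataclasses import dataclass
from typing import Dict, Optional

PRESET_BROWSERS = ("Firefox", "Chrome", "Safari")
PRESET_ACCEPT = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"


@dataclass
class HttpRequest:
    headers: Dict[str, str]      # assumed keys: "User-Agent", "Accept"
    mark: Optional[str] = None   # filled in by the marking module


class BrowserTypeDetectionModule:
    """Module 201: is the request from one of the preset browser types?"""

    def is_preset_browser(self, request: HttpRequest) -> bool:
        user_agent = request.headers.get("User-Agent", "")
        return any(f"{name}/" in user_agent for name in PRESET_BROWSERS)


class AcceptFieldJudgmentModule:
    """Module 202: does the request carry the preset Accept field?"""

    def has_preset_accept(self, request: HttpRequest) -> bool:
        accept = request.headers.get("Accept", "")
        return "".join(accept.split()).lower() == PRESET_ACCEPT


class MarkingModule:
    """Module 203: record the result on the request."""

    def mark(self, request: HttpRequest, has_preset_accept: bool) -> None:
        request.mark = "user" if has_preset_accept else "non-user"


def process(request: HttpRequest) -> HttpRequest:
    """Run the three modules in order; unknown browsers stay unmarked."""
    detector = BrowserTypeDetectionModule()
    judge = AcceptFieldJudgmentModule()
    marker = MarkingModule()
    if detector.is_preset_browser(request):
        marker.mark(request, judge.has_preset_accept(request))
    return request
```

Keeping the modules separate means the browser list or the Accept rule can be replaced without touching the marking logic.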
In one embodiment of the invention, the preset browser type is any one of Firefox, Chrome, and Safari.
Correspondingly, the preset Accept field is "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8".
Further, in the embodiment of the invention, the marking module is also configured to mark the HTTP request as non-user website access behavior when the Accept field in the HTTP request is any one of "*/*", "text/css,*/*;q=0.1", or "image/png,image/*;q=0.8,*/*;q=0.5".
By examining the Accept field in the HTTP requests made when websites are visited with the Firefox, Chrome, and Safari browsers, the embodiment of the invention identifies whether an HTTP request was initiated by the user or automatically by the browser, and thereby identifies the user's actual click-to-access website behavior. The embodiment can identify the user's website access behavior more accurately and greatly reduces the identification interference caused by the large number of HTTP requests that the browser sends automatically during a website visit, so network administrators can see clearly which websites intranet users have visited and take appropriate network management measures.
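Once requests carry a mark, the statistics mentioned above reduce to counting the user-marked requests per client and site. The record layout (client address, host, mark), the function name `sites_visited`, and every host other than www.qq.com are assumptions made for this sketch.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def sites_visited(marked: Iterable[Tuple[str, str, str]]) -> Dict[str, Counter]:
    """Count user-initiated visits per client.

    Each record is (client_address, host, mark), where mark is 'user' or
    'non-user'. Requests marked 'non-user' are ignored.
    """
    per_client: Dict[str, Counter] = defaultdict(Counter)
    for client, host, mark in marked:
        if mark == "user":
            per_client[client][host] += 1
    return dict(per_client)


# Hypothetical sample: one page view of www.qq.com plus an automatic image load.
records = [
    ("10.0.0.5", "www.qq.com", "user"),
    ("10.0.0.5", "img.example.com", "non-user"),
    ("10.0.0.7", "www.example.org", "user"),
]
report = sites_visited(records)
```

Here `report` lists www.qq.com for client 10.0.0.5 and omits the image host, since that request was marked as non-user behavior.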
It should be understood that those of ordinary skill in the art can make improvements or variations based on the above description, and all such improvements and variations shall fall within the scope of protection of the appended claims of the present invention.

Claims (8)

1. A method for identifying a user's actual click-to-access website behavior, characterized in that it comprises the following steps:
detecting whether the type of the browser that initiated an HTTP request is a preset browser type;
if the browser type is a preset type, determining whether the HTTP request contains the preset Accept field;
if the HTTP request contains the preset Accept field, marking the HTTP request as user website access behavior, and otherwise marking it as non-user behavior.
2. The method according to claim 1, characterized in that the preset browser type is any one of Firefox, Chrome, and Safari.
3. The method according to claim 2, characterized in that the preset Accept field is "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8".
4. The method according to claim 2, characterized in that if the Accept field in the HTTP request is any one of "*/*", "text/css,*/*;q=0.1", or "image/png,image/*;q=0.8,*/*;q=0.5", the HTTP request was initiated automatically by the Firefox browser and is marked as non-user behavior; if the Accept field in the HTTP request is "*/*" or "text/css,*/*;q=0.1", the HTTP request was initiated automatically by the Chrome or Safari browser and is marked as non-user behavior.
5. A system for identifying a user's actual click-to-access website behavior, characterized in that it comprises:
a browser type detection module, configured to detect whether the type of the browser that initiated an HTTP request is a preset browser type;
an Accept field judgment module, configured to determine, when the browser type is a preset browser type, whether the HTTP request contains the preset Accept field;
a marking module, configured to mark the HTTP request as user website access behavior when the HTTP request contains the preset Accept field, and otherwise to mark it as non-user behavior.
6. The system according to claim 5, characterized in that the preset browser type is any one of Firefox, Chrome, and Safari.
7. The system according to claim 6, characterized in that the preset Accept field is "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8".
8. The system according to claim 6, characterized in that the marking module is further configured to mark the HTTP request as non-user website access behavior when the Accept field in the HTTP request is any one of "*/*", "text/css,*/*;q=0.1", or "image/png,image/*;q=0.8,*/*;q=0.5".
CN201210047328.XA 2012-02-28 2012-02-28 Method for identifying actual behavior of user to click and access website and system thereof Active CN102629933B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210047328.XA CN102629933B (en) 2012-02-28 2012-02-28 Method for identifying actual behavior of user to click and access website and system thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210047328.XA CN102629933B (en) 2012-02-28 2012-02-28 Method for identifying actual behavior of user to click and access website and system thereof

Publications (2)

Publication Number Publication Date
CN102629933A true CN102629933A (en) 2012-08-08
CN102629933B CN102629933B (en) 2015-05-06

Family

ID=46588092

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210047328.XA Active CN102629933B (en) 2012-02-28 2012-02-28 Method for identifying actual behavior of user to click and access website and system thereof

Country Status (1)

Country Link
CN (1) CN102629933B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105205134A (en) * 2015-09-15 2015-12-30 深信服网络科技(深圳)有限公司 Method and device for recognizing behavior of clicking to access website by user
CN105577764A (en) * 2015-12-16 2016-05-11 北京浩瀚深度信息技术股份有限公司 User clicking behavior identification method, server and system
CN105677657A (en) * 2014-11-19 2016-06-15 杭州华三通信技术有限公司 Recoding method and device for access behaviors of uniform resource locators
CN107526748A (en) * 2016-06-22 2017-12-29 华为技术有限公司 A kind of method and apparatus for identifying user and clicking on behavior

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101242307A (en) * 2008-02-01 2008-08-13 刘峰 Website access analysis system and method based on built-in code proxy log
CN101610268A (en) * 2009-07-16 2009-12-23 杭州华三通信技术有限公司 A kind of implementation method of keyword filtration and equipment
WO2011050368A1 (en) * 2009-10-23 2011-04-28 Moov Corporation Configurable and dynamic transformation of web content
CN102130952A (en) * 2011-03-16 2011-07-20 广州市动景计算机科技有限公司 Method and device for forwarding hyper text transport protocol (HPPT) request message of mobile terminal

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105677657A (en) * 2014-11-19 2016-06-15 杭州华三通信技术有限公司 Recoding method and device for access behaviors of uniform resource locators
CN105205134A (en) * 2015-09-15 2015-12-30 深信服网络科技(深圳)有限公司 Method and device for recognizing behavior of clicking to access website by user
CN105205134B (en) * 2015-09-15 2019-04-05 深信服网络科技(深圳)有限公司 Identify that user clicks the method and device of access website behavior
CN105577764A (en) * 2015-12-16 2016-05-11 北京浩瀚深度信息技术股份有限公司 User clicking behavior identification method, server and system
CN105577764B (en) * 2015-12-16 2017-06-23 北京浩瀚深度信息技术股份有限公司 A kind of user clicks on Activity recognition method, server and system
CN107526748A (en) * 2016-06-22 2017-12-29 华为技术有限公司 A kind of method and apparatus for identifying user and clicking on behavior

Also Published As

Publication number Publication date
CN102629933B (en) 2015-05-06

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20200615

Address after: Nanshan District Xueyuan Road in Shenzhen city of Guangdong province 518000 No. 1001 Nanshan Chi Park building A1 layer

Patentee after: SANGFOR TECHNOLOGIES Inc.

Address before: 518000 Nanshan Science and Technology Pioneering service center, No. 1 Qilin Road, Guangdong, Shenzhen 418, 419,

Patentee before: Shenxin network technology (Shenzhen) Co.,Ltd.