CN103714182A - Association method and device for webpage request - Google Patents


Info

Publication number
CN103714182A
Authority
CN
China
Prior art keywords
web
page requests
request
field
page
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201410012342.5A
Other languages
Chinese (zh)
Inventor
徐翔
张广兴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
HUNAN CNSUNET TECHNOLOGY Co Ltd
Original Assignee
HUNAN CNSUNET TECHNOLOGY Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by HUNAN CNSUNET TECHNOLOGY Co Ltd filed Critical HUNAN CNSUNET TECHNOLOGY Co Ltd
Priority to CN201410012342.5A priority Critical patent/CN103714182A/en
Publication of CN103714182A publication Critical patent/CN103714182A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/957 Browsing optimisation, e.g. caching or content distillation
    • G06F16/9574 Browsing optimisation, e.g. caching or content distillation of access to content, e.g. by caching

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The invention discloses a method and a device for associating web page requests. The method comprises: obtaining a web page request to be associated; determining whether the referer field of the web page request to be associated is empty; if so, determining a target web page request according to the web page requests carried on the same TCP connection as the web page request to be associated; if not, determining the target web page request according to the referer field; and associating the web page request to be associated with the target web page request. Because the method checks whether the referer field is empty and, when it is, determines the target web page request from the web page requests on the same TCP connection, it solves the prior-art problem that association fails when the referer field is empty, which reduces association reliability.

Description

Method and device for associating web page requests
Technical field
The application relates to the technical field of Internet access, and in particular to a method and a device for associating web page requests.
Background
With the spread of the Internet, more and more users need to access web content in their daily work and life. A web page that a user visits is usually a large page in which several small pages are embedded, and a small page may in turn embed small pages of the next level. A large page that embeds small pages is regarded as a web page container object, and each small page is regarded as a web page embedded object.
When a user accesses a web page, the browser automatically generates web page requests after the user clicks in the browser and sends them to the server. These web page requests may include both container object requests and embedded object requests. The server returns the page object corresponding to each web page request, and the embedded objects among those page objects are placed into the container object to compose the web page the user visits. Before this composition, the association between each embedded object request and its container object must be established, that is, each embedded object request must be associated with the corresponding container object request. Only then can the page objects be combined, according to that association, into the web page finally presented to the user.
Existing methods for associating web page requests rely mainly on the referer field of the web page request. With such a scheme, association fails when the referer field of the web page request is empty, which reduces the reliability of request association.
Summary of the invention
In view of this, the application provides a method and a device for associating web page requests, to solve the prior-art problem that association based on the referer field of a web page request fails when the referer field is empty, which reduces the reliability of request association. The technical solution provided by the application is as follows:
A method for associating web page requests, comprising:
obtaining a web page request to be associated;
determining whether the referer field of the web page request to be associated is empty;
if so, determining a target web page request according to the web page requests carried on the same TCP connection as the web page request to be associated;
if not, determining the target web page request according to the referer field;
associating the web page request to be associated with the target web page request.
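The steps above can be sketched as a small dispatch routine. This is a minimal illustration, not the implementation described in the application: the `WebRequest` class, the `target` attribute, and the two resolver callbacks `by_tcp` and `by_referer` are hypothetical names introduced here, and treating a whitespace-only referer as empty is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class WebRequest:
    """Hypothetical record for one captured web page request."""
    url: str
    referer: Optional[str] = None            # None or "" means the referer field is empty
    target: Optional["WebRequest"] = None    # filled in once the request is associated


def referer_is_empty(req: WebRequest) -> bool:
    # Assumption: a missing or whitespace-only value counts as empty.
    return not (req.referer and req.referer.strip())


def associate(
    req: WebRequest,
    by_tcp: Callable[[WebRequest], Optional[WebRequest]],
    by_referer: Callable[[WebRequest], Optional[WebRequest]],
) -> Optional[WebRequest]:
    """Pick the resolver from the referer check, then record the association."""
    resolver = by_tcp if referer_is_empty(req) else by_referer
    req.target = resolver(req)
    return req.target
```

The two resolvers are passed in as callbacks so that the branch on the referer field stays separate from how each branch finds its target.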
In the above method, preferably, determining the target web page request according to the web page requests carried on the same TCP connection as the web page request to be associated comprises:
obtaining the TCP connection identifier of the web page request to be associated;
searching the received web page requests for a web page request whose TCP connection identifier is the same as that of the web page request to be associated and whose referer field is not empty;
when such a request is found, determining the target web page request according to the referer field of the web page request found;
when no such request is found, obtaining at least one container object request, searching those container object requests for the one whose generation time is closest to the generation time of the web page request to be associated, and determining the container object request found as the target web page request.
In the above method, preferably, determining the target web page request according to the referer field comprises:
obtaining the referer field of the web page request to be associated;
obtaining at least one web page request according to the referer field;
when exactly one web page request is obtained, determining the target web page request according to the referer field of the web page request found;
when several web page requests are obtained, obtaining the User-Agent field of the web page request to be associated and the User-Agent field of each of those web page requests;
determining as the target web page request the web page request whose User-Agent field is the same as the User-Agent field of the web page request to be associated.
In the above method, preferably, determining as the target web page request the web page request whose User-Agent field is the same as the User-Agent field of the web page request to be associated comprises:
obtaining at least one alternative web page request according to the User-Agent field;
when there is exactly one alternative web page request, determining it as the target web page request;
when there are several alternative web page requests, calculating the time difference between the generation time of the web page request to be associated and the generation time of each alternative web page request, determining the minimum of those time differences, and determining the target web page request among the alternative web page requests according to that minimum.
The application also provides a device for associating web page requests, comprising:
an acquiring unit, configured to obtain a web page request to be associated;
a judging unit, configured to determine whether the referer field of the web page request to be associated is empty, and to trigger the first result unit if so and the second result unit if not;
a first result unit, configured to determine a target web page request according to the web page requests carried on the same TCP connection as the web page request to be associated;
a second result unit, configured to determine the target web page request according to the referer field;
an associating unit, configured to associate the web page request to be associated with the target web page request.
In the above device, preferably, the first result unit comprises:
an obtaining subunit, configured to obtain the TCP connection identifier of the web page request to be associated;
a searching subunit, configured to search the received web page requests for a web page request whose TCP connection identifier is the same as that of the web page request to be associated and whose referer field is not empty, and to trigger the first result subunit when such a request is found and the second result subunit when it is not;
a first result subunit, configured to determine the target web page request according to the referer field of the web page request found;
a second result subunit, configured to obtain at least one container object request, to search those container object requests for the one whose generation time is closest to the generation time of the web page request to be associated, and to determine the container object request found as the target web page request.
In the above device, preferably, the second result unit comprises:
a first field obtaining subunit, configured to obtain the referer field of the web page request to be associated;
a request obtaining subunit, configured to obtain at least one web page request according to the referer field, and to trigger the first determining subunit when exactly one web page request is obtained and the second field obtaining subunit when several are obtained;
a first determining subunit, configured to determine the target web page request according to the referer field of the web page request found;
a second field obtaining subunit, configured to obtain the User-Agent field of the web page request to be associated and the User-Agent field of each of the web page requests, and to trigger the second determining subunit;
a second determining subunit, configured to determine as the target web page request the web page request whose User-Agent field is the same as the User-Agent field of the web page request to be associated.
In the above device, preferably, the second determining subunit comprises:
an alternative request obtaining subunit, configured to obtain at least one alternative web page request according to the User-Agent field, and to trigger the third determining subunit when there is exactly one alternative web page request and the fourth determining subunit when there are several;
a third determining subunit, configured to determine the alternative web page request as the target web page request;
a fourth determining subunit, configured to calculate the time difference between the generation time of the web page request to be associated and the generation time of each alternative web page request, to determine the minimum of those time differences, and to determine the target web page request among the alternative web page requests according to that minimum.
As the above technical solution shows, the application provides a method and a device for associating web page requests. The method obtains a web page request to be associated and determines whether its referer field is empty. If it is, the target web page request is determined according to the web page requests carried on the same TCP connection as the web page request to be associated; if it is not, the target web page request is determined according to the referer field. Finally, the web page request to be associated is associated with the target web page request. Because the application checks whether the referer field is empty before associating a web page request, and falls back to the web page requests on the same TCP connection when an empty referer field makes it impossible to determine the target web page request from that field, it effectively solves the prior-art problem that association fails when the referer field is empty, which reduces association reliability.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the application more clearly, the drawings used in describing the embodiments are briefly introduced below. The drawings described below show only some embodiments of the application; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of Embodiment 1 of a web page request association method provided by the application;
Fig. 2 is a partial flowchart of Embodiment 2 of a web page request association method provided by the application;
Fig. 3 is a partial flowchart of Embodiment 3 of a web page request association method provided by the application;
Fig. 4 is a partial flowchart of Embodiment 4 of a web page request association method provided by the application;
Fig. 5 is a structural diagram of Embodiment 5 of a web page request association device provided by the application;
Fig. 6 is a partial structural diagram of Embodiment 6 of a web page request association device provided by the application;
Fig. 7 is a partial structural diagram of Embodiment 7 of a web page request association device provided by the application;
Fig. 8 is a partial structural diagram of Embodiment 8 of a web page request association device provided by the application.
Detailed description
The technical solutions of the embodiments of the application are described clearly and completely below with reference to the drawings. The embodiments described are only some of the embodiments of the application, not all of them. All other embodiments that a person of ordinary skill in the art obtains from the embodiments of the application without creative effort fall within the scope of protection of the application.
Refer to Fig. 1, which shows a flowchart of Embodiment 1 of a web page request association method provided by the application. This embodiment may comprise:
Step 101: obtain a web page request to be associated.
When a user accesses a web page, the browser automatically generates web page requests after the user clicks in the browser and sends them to the server. The server returns a response packet for each web page request, and the response packet contains the web object corresponding to that web page request.
The web page requests include embedded object requests. Each embedded object request must be associated with the corresponding container object request; only then can the web object the server returns for the embedded object request be combined with the web object the server returns for the container object request to form the final web page the user wants to visit.
The embedded object requests among the web page requests are obtained. Each embedded object request is a web page request to be associated, and it is to be associated with the container object request that corresponds to it.
Step 102: determine whether the referer field of the web page request to be associated is empty; if so, perform step 103; if not, perform step 104.
The web page request to be associated obtained in step 101 is an HTTP-based web page request generated by the browser. Depending on certain settings, the generated web page request may contain a referer field, or its referer field may be empty.
The referer field points to a web page and indicates that this web page is associated with the web page request containing the referer field. For example, if the referer field of web page request H_1 is http://www.sina.com.cn/, the web page at http://www.sina.com.cn/ is associated with H_1; that is, H_1 was sent by the web page at http://www.sina.com.cn/.
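As an illustration of the check in step 102, the following sketch reads the referer field from the raw text of an HTTP request and reports it as empty when the header is absent or blank. The function name `extract_referer` and the choice to treat a blank value like a missing header are assumptions made for this example; the application does not specify how the field is parsed.

```python
from typing import Optional


def extract_referer(raw_request: str) -> Optional[str]:
    """Return the Referer header value, or None when it is absent or blank."""
    head = raw_request.split("\r\n\r\n", 1)[0]       # drop any message body
    for line in head.split("\r\n")[1:]:              # skip the request line
        name, sep, value = line.partition(":")
        # HTTP header names are case-insensitive.
        if sep and name.strip().lower() == "referer":
            return value.strip() or None
    return None
```

A request such as H_1 in the example above would yield `http://www.sina.com.cn/`, while a request with no such header yields `None` and would follow the branch to step 103.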
Step 103: determine the target web page request according to the web page requests carried on the same TCP connection as the web page request to be associated.
The web page requests sent by each client browser are carried over TCP (Transmission Control Protocol), a transport-layer communication protocol. The web page requests carried on one TCP connection are usually sent by the same page, so web page requests with the same TCP connection identifier can be taken to belong to the same page.
On this basis, the web page requests that share a TCP connection identifier with the web page request to be associated obtained in step 101 (an embedded object request) are usually sent by the same page. The target web page request (a container object request) of the web page request to be associated can therefore be determined from the web page requests on that TCP connection.
Specifically, the target web page request may be determined as follows. The web page requests carried on the same TCP connection as the web page request to be associated are obtained; there may be one or several. These requests are searched for one whose referer field is not empty; if one exists, the web page request that its referer field points to is determined as the target web page request.
Obtaining the web page requests and checking whether the referer field is empty may be performed at the same time. Alternatively, the web page requests on the same TCP connection may be found first, and the requests whose referer field is not empty then searched for among them. The first approach is preferred because it makes the determination more efficient.
Step 104: determine the target web page request according to the referer field.
When the referer field of the web page request to be associated is not empty, the web page request (a container object request) that the referer field points to is determined as the target web page request. For example, if the referer field of web page request H_1 is http://www.sina.com.cn/, the web page request (container object request) for http://www.sina.com.cn/ is determined as the target web page request.
While the target web page request is determined according to the referer field, the IP addresses of the web page request to be associated and of the target web page request are also used: the two must have the same IP address, which ensures that both requests were sent by the same client. In this way the web page request to be associated can still be associated correctly when several clients with different IP addresses access the same web page at the same time.
Step 105: associate the web page request to be associated with the target web page request.
The web page request to be associated is associated with the target web page request determined in step 103 or step 104. The association may be made by adding the same identifier to both the web page request to be associated and the target web page request; by obtaining the unique identifier that the target web page request already has and adding it to the web page request to be associated; or by building a mapping table between web page requests to be associated and target web page requests. The association is not limited to these three approaches: any prior-art way of establishing a correspondence between two web page requests falls within the scope of protection of the application.
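The three ways of recording an association listed above can be sketched as follows. All names here (`WebRequest`, `request_id`, `group_tag`, `parent_id`, and the three helper functions) are hypothetical and chosen for illustration only; the application leaves the data layout open.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Dict, Optional

_ids = count(1)  # source of unique request identifiers (an assumption of this sketch)


@dataclass
class WebRequest:
    url: str
    request_id: int = field(default_factory=lambda: next(_ids))
    group_tag: Optional[int] = None    # shared identifier added to both requests
    parent_id: Optional[int] = None    # copy of the target's existing identifier


def tag_both(pending: WebRequest, target: WebRequest, tag: int) -> None:
    """Approach 1: add the same identifier to both requests."""
    pending.group_tag = tag
    target.group_tag = tag


def copy_target_id(pending: WebRequest, target: WebRequest) -> None:
    """Approach 2: add the target's unique identifier to the pending request."""
    pending.parent_id = target.request_id


def record_in_table(table: Dict[int, int], pending: WebRequest, target: WebRequest) -> None:
    """Approach 3: keep a mapping table from pending request to target request."""
    table[pending.request_id] = target.request_id
```

The mapping table leaves the request records untouched, which may suit a passive monitor, while the two tagging approaches keep the association on the records themselves.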
As the above technical solution shows, this embodiment provides a web page request association method that obtains a web page request to be associated and determines whether its referer field is empty. If it is, the target web page request is determined according to the web page requests carried on the same TCP connection as the web page request to be associated; if it is not, the target web page request is determined according to the referer field. Finally, the web page request to be associated is associated with the target web page request.
Because this embodiment checks whether the referer field is empty before associating a web page request and, when it is empty, determines the target web page request from the web page requests on the same TCP connection, it effectively solves the prior-art problem that association fails when the referer field is empty, which reduces association reliability.
Refer to Fig. 2, which shows a partial flowchart of Embodiment 2 of a web page request association method provided by the application. Step 103 of Embodiment 1 may comprise:
Step 201: obtain the TCP connection identifier of the web page request to be associated.
A TCP connection identifier is determined by four parameters of the web page request: source IP, destination IP, source port and destination port. The packet of the web page request to be associated obtained in step 101 of Embodiment 1 is parsed to obtain its source IP, destination IP, source port and destination port.
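The four-parameter identifier can be represented as an immutable tuple so that two requests on the same connection compare equal. This is a sketch under assumptions: the `TcpConnId` type and the dictionary layout of the parsed packet passed to `conn_id_of` are invented for illustration.

```python
from typing import Mapping, NamedTuple, Union


class TcpConnId(NamedTuple):
    """Source IP, destination IP, source port and destination port of a request."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int


def conn_id_of(packet: Mapping[str, Union[str, int]]) -> TcpConnId:
    # Assumption: the packet has already been parsed into a mapping with these keys.
    return TcpConnId(
        str(packet["src_ip"]),
        str(packet["dst_ip"]),
        int(packet["src_port"]),
        int(packet["dst_port"]),
    )
```

Because a `NamedTuple` is hashable, the identifier can also serve as a dictionary key when received requests are grouped by connection.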
Step 202: search the received web page requests for a web page request whose TCP connection identifier is the same as that of the web page request to be associated and whose referer field is not empty. When such a request is found, go to step 203; when it is not, go to step 204.
Each received web page request is examined in turn: its TCP connection identifier is obtained and, at the same time, its referer field is checked for being non-empty. When a request's TCP connection identifier (source IP, destination IP, source port and destination port) is the same as that of the web page request to be associated and its referer field is not empty, the search ends and step 203 is entered. If all received web page requests have been searched and none has these properties, step 204 is entered.
Note that, during this search, comparing the TCP connection identifier with that of the web page request to be associated and checking whether the referer field is empty at the same time makes the search more efficient, so the association of the web page request to be associated becomes correspondingly more efficient.
Step 203: determine the target web page request according to the referer field of the web page request found.
The referer field of the web page request found in step 202 is obtained, and the web page request that this referer field points to is determined as the target web page request.
For example, the web page request to be associated has source IP address 192.168.200.55, destination IP address 218.30.13.36, source port 100 and destination port 80. The web page request found has source IP address 192.168.200.55, destination IP address 218.30.13.36, source port 100 and destination port 80, and its referer field is not empty and contains http://www.sina.com.cn/. The web page request corresponding to http://www.sina.com.cn/ is therefore determined as the target web page request.
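Steps 202 and 203 can be sketched as a single pass over the received requests that tests the connection identifier and the referer field together. The `WebRequest` class and the function name are hypothetical, and returning `None` to signal the fall-through to step 204 is a convention chosen for this example.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple

# (source IP, destination IP, source port, destination port)
ConnId = Tuple[str, str, int, int]


@dataclass
class WebRequest:
    url: str
    conn_id: ConnId
    referer: Optional[str] = None


def referer_from_same_connection(
    pending: WebRequest, received: Iterable[WebRequest]
) -> Optional[str]:
    """Return the referer of the first request on the same TCP connection
    whose referer is not empty, or None when no such request exists."""
    for req in received:
        if req is pending:
            continue
        # Both conditions are checked in the same pass, as the text recommends.
        if req.conn_id == pending.conn_id and req.referer:
            return req.referer   # search ends at the first match
    return None
```

With the values from the example above, a sibling request on connection (192.168.200.55, 218.30.13.36, 100, 80) whose referer is `http://www.sina.com.cn/` makes the function return that URL, which identifies the target web page request.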
Step 204: obtain at least one container object request, search those container object requests for the one whose generation time is closest to the generation time of the web page request to be associated, and determine the container object request found as the target web page request.
For each user, all container object requests sent by that user are maintained, together with the generation time of each. The container object request whose generation time is closest to that of the web page request to be associated is determined as the target web page request.
For example, the web page request to be associated has source IP address 192.168.200.55 and generation time 2013-11-19 11:31:29. The user at this IP address has sent container object requests A and B; A was generated at 2013-11-19 11:31:23 and B at 2013-11-19 11:31:28. The interval between the generation time of container object request B and that of the web page request to be associated is 1 second, which is the shortest interval, so container object request B is determined as the target web page request.
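The time-based fallback of step 204 amounts to a nearest-neighbour search over generation times. The sketch below reproduces the worked example; `ContainerRequest` and `nearest_container` are illustrative names, and returning `None` when the user has no recorded container object requests is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Optional


@dataclass
class ContainerRequest:
    name: str
    created: datetime   # generation time of the container object request


def nearest_container(
    pending_created: datetime, containers: Iterable[ContainerRequest]
) -> Optional[ContainerRequest]:
    """Return the container request generated closest in time to the pending request."""
    return min(
        containers,
        key=lambda c: abs((pending_created - c.created).total_seconds()),
        default=None,
    )


a = ContainerRequest("A", datetime(2013, 11, 19, 11, 31, 23))
b = ContainerRequest("B", datetime(2013, 11, 19, 11, 31, 28))
chosen = nearest_container(datetime(2013, 11, 19, 11, 31, 29), [a, b])
# chosen is B: its interval is 1 second, against 6 seconds for A.
```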
As the above technical solution shows, in the web page request association method of this embodiment, when the referer field of the web page request to be associated is empty, the target web page request is determined from the TCP connection of the web page request to be associated. The received web page requests are first searched for one whose TCP connection identifier is the same as that of the web page request to be associated and whose referer field is not empty. If one is found, the target web page request is determined according to its referer field; if not, a time-based mechanism is used and the container object request with the shortest interval to the web page request to be associated is determined as the target web page request. Using the referer field first and the time-based mechanism second improves association reliability.
In Embodiment 1, determining the target web page request according to the referer field may lead to association ambiguity: when several clients use the same IP address to access the same page, the referer field identifies more than one candidate target web page request. Refer to Fig. 3, which shows a partial flowchart of Embodiment 3 of a web page request association method provided by the application. Step 104 of Embodiment 1 may comprise:
Step 301: obtain the referer field of the web page request to be associated.
The HTTP request packet that carries the web page request is parsed to obtain the referer field in that packet.
Step 302: obtain at least one web page request according to the referer field; when exactly one web page request is obtained, perform step 303; when several are obtained, perform step 304.
The content of the referer field of a web page request is the URL of another web page request and indicates that the two requests are associated. Using the referer content obtained in step 301, and also checking the IP address of the client that sent the request, the received web page requests are searched for those whose URL is the one given by the referer field and whose IP address is the same as that of the web page request to be associated. There may be one such request or several. Several are found when multiple clients use the same IP address to access the same web page at the same time, which is common behind NAT.
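The lookup in step 302 can be sketched as a filter on two keys, the URL named by the referer field and the source IP address. The `WebRequest` class and `candidates_for` are hypothetical names, and exact string comparison of URLs is a simplifying assumption; the application does not say whether URLs are normalized before comparison.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class WebRequest:
    url: str
    src_ip: str
    referer: Optional[str] = None


def candidates_for(pending: WebRequest, received: Iterable[WebRequest]) -> List[WebRequest]:
    """Requests whose URL equals the pending request's referer and whose
    source IP equals the pending request's source IP."""
    if not pending.referer:
        return []
    return [
        r for r in received
        if r is not pending
        and r.url == pending.referer
        and r.src_ip == pending.src_ip
    ]
```

When the returned list has one element, step 303 applies; when it has several, for instance two clients behind the same NAT address loading the same page, the flow continues to step 304.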
Step 303: according to the referer field of the described web-page requests finding, determine target web request, the deterministic process of target end web-page requests.
Step 304: obtain respectively the User-Agent field of described web-page requests to be associated and the User-Agent field of described each web-page requests.
The described web-page requests obtaining in step 302 may be for a plurality of, for example, the referer field of the web-page requests H_1 getting is http://www.sina.com.cn/, and the URL that finds web-page requests H_2 and H_3 is described URL(http: //www.sina.com.cn/).
When described web-page requests is while being a plurality of, obtain respectively the User-Agent field of described web-page requests to be associated and each web-page requests.Wherein, described User-Agent field is a field in web-page requests, for recording the operating system of described web-page requests and the information such as version of browser sent.Inventor finds by research, and the described information comprising in User-Agent field has higher value, the browser version information especially comprising for distinguishing different clients.Different clients may be used same browser, but conventionally in different time points, carries out the renewal of browser version, and the information comprising in described User-Agent field will be different.
Step 305: determine the webpage request whose User-Agent field is identical to the User-Agent field of the webpage request to be associated as the target webpage request.
Based on the User-Agent fields obtained in step 304, the webpage request whose User-Agent field is identical to that of the webpage request to be associated is determined as the target webpage request.
As can be seen from the above technical solution, in the webpage request association method provided by this embodiment, when the target webpage request is being determined according to the referer field and multiple webpage requests are found because multiple clients use the same IP address to access webpages, the User-Agent field of the webpage requests is used to further determine the target webpage request. This effectively resolves the association ambiguity and improves the reliability of webpage request association.
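The User-Agent comparison of steps 304 and 305 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the function name `filter_by_user_agent`, the dictionary layout and the shortened User-Agent strings are made up for the example.

```python
from typing import Dict, List


def filter_by_user_agent(pending_ua: str,
                         candidates: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep only the candidates whose User-Agent equals the pending request's."""
    return [c for c in candidates if c.get("user_agent") == pending_ua]


# H_2 and H_3 share the same IP address and URL but were sent by different
# browser versions. The User-Agent strings are shortened, made-up samples.
h2 = {"name": "H_2", "user_agent": "Mozilla/5.0 (Windows NT 6.1) Chrome/30.0"}
h3 = {"name": "H_3", "user_agent": "Mozilla/5.0 (Windows NT 6.1) Chrome/31.0"}
h1_user_agent = "Mozilla/5.0 (Windows NT 6.1) Chrome/31.0"

targets = filter_by_user_agent(h1_user_agent, [h2, h3])
```

Here only H_3 remains, so it would be taken as the target webpage request. If more than one candidate remained, a further tie-break would be needed.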
On the basis of Embodiment 3 above, the User-Agent fields obtained for the webpage requests may also be identical. Refer to Fig. 4, which shows a partial flowchart of Embodiment 4 of the webpage request association method provided by the present application. Step 305 in Embodiment 3 may include:
Step 401: obtain at least one candidate webpage request according to the User-Agent field.
Step 402: when there is one candidate webpage request, determine that candidate webpage request as the target webpage request.
Step 403: when there are multiple candidate webpage requests, calculate the time difference between the generation time of the webpage request to be associated and the generation time of each candidate webpage request, and determine the minimum among these time differences; according to the minimum, determine the target webpage request among the candidate webpage requests.
When there are multiple candidate webpage requests, a first-come, first-associated mechanism may be adopted: the webpage request received first is determined as the target webpage request (container object request) of the webpage request to be associated (embedded object request). The determination calculates the interval between the time at which the webpage request to be associated was received and the time of each candidate webpage request, and determines the candidate webpage request with the shortest interval as the target webpage request.
As can be seen from the above technical solution, in the webpage request association method provided by this embodiment, when the User-Agent fields used by multiple clients are identical, the first-come, first-associated mechanism is used to determine the target webpage request, which further improves the reliability of the association method.
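The minimum-time-difference rule of step 403 can be sketched as follows. The function name `pick_closest`, the candidate names and the timestamps are assumptions made up for the example, and the use of an absolute difference is also an assumption, since the text does not say whether candidates generated after the request to be associated should be excluded.

```python
from datetime import datetime
from typing import List, Tuple


def pick_closest(pending_time: datetime,
                 candidates: List[Tuple[str, datetime]]) -> str:
    """Return the name of the candidate whose generation time is closest
    to the generation time of the request to be associated."""
    # One absolute time difference (in seconds) per candidate.
    diffs = [abs((pending_time - t).total_seconds()) for _, t in candidates]
    # Index of the smallest difference; a tie resolves to the earliest entry.
    best = diffs.index(min(diffs))
    return candidates[best][0]


# Made-up sample data: two candidates with identical User-Agent fields.
pending_time = datetime(2014, 1, 10, 9, 0, 5)
candidates = [
    ("C_1", datetime(2014, 1, 10, 9, 0, 0)),
    ("C_2", datetime(2014, 1, 10, 9, 0, 4)),
]
chosen = pick_closest(pending_time, candidates)
```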
Refer to Fig. 5, which shows a schematic structural diagram of Embodiment 5 of a webpage request association device provided by the present application. This embodiment may include: an acquiring unit 501, a judging unit 502, a first result unit 503, a second result unit 504 and an association unit 505.
The acquiring unit 501 is configured to obtain a webpage request to be associated.
When a user accesses a webpage, the browser automatically generates a webpage request after the user clicks in the browser. The webpage request is sent to a server, and the server returns a response packet corresponding to the webpage request; the response packet contains the web object corresponding to the webpage request.
Webpage requests include embedded object requests. An embedded object request must be associated with its corresponding container object request; only then can the web object that the server returns for the embedded object request be combined with the web object that the server returns for the container object request to form the final webpage that the user wants to access.
The acquiring unit 501 obtains the embedded object request among the webpage requests. This embedded object request is the webpage request to be associated, which is to be associated with its corresponding container object request.
The judging unit 502 is configured to judge whether the referer field of the webpage request to be associated is empty; if so, the first result unit 503 is triggered, and if not, the second result unit 504 is triggered.
The webpage request to be associated that the acquiring unit 501 obtains is a webpage request based on the HTTP protocol. It is generated by the browser and, depending on certain settings, the generated webpage request may contain a referer field, or its referer field may be empty.
The referer field points to a certain webpage, and indicates that this webpage is associated with the webpage request containing the referer field. For example, if the referer field of webpage request H_1 is http://www.sina.com.cn/, then the webpage that http://www.sina.com.cn/ points to is associated with H_1; that is, H_1 was sent from the webpage that http://www.sina.com.cn/ points to.
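As an illustration of the empty-referer check, the sketch below reads the Referer header from the text of an HTTP request. It is a simplified example under stated assumptions: the helper `get_header` and the sample request are made up, and a real capture tool would parse packets rather than a ready-made string.

```python
def get_header(raw_request: str, name: str) -> str:
    """Return the value of the named header, or "" if it is absent.
    Header names are compared case-insensitively."""
    head = raw_request.split("\r\n\r\n", 1)[0]
    for line in head.split("\r\n")[1:]:  # skip the request line
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == name.lower():
            return value.strip()
    return ""


# Made-up sample request for an embedded object of the page in the example.
raw = (
    "GET /logo.png HTTP/1.1\r\n"
    "Host: www.sina.com.cn\r\n"
    "Referer: http://www.sina.com.cn/\r\n"
    "\r\n"
)
referer = get_header(raw, "Referer")
referer_is_empty = referer == ""
```

A request without a Referer line yields an empty string, which corresponds to the branch that triggers the first result unit 503.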
The first result unit 503 is configured to determine the target webpage request according to the webpage requests that are on the same TCP connection as the webpage request to be associated.
The webpage requests sent by each client browser are transmitted under the control of the TCP protocol. TCP (Transmission Control Protocol) is a transport-layer communication protocol. The webpage requests transmitted on one TCP connection are usually sent by the same page, which means that webpage requests with the same TCP connection identifier usually belong to the same page.
Based on this principle, the webpage requests that are on the same TCP connection as the webpage request to be associated (embedded object request) obtained by the acquiring unit 501, that is, the webpage requests with the same TCP connection identifier, are usually sent by the same page. The target webpage request (container object request) of the webpage request to be associated can therefore be determined from the webpage requests on that TCP connection.
Specifically, the first result unit 503 may determine the target webpage request as follows. It obtains the webpage requests that are on the same TCP connection as the webpage request to be associated; there may be one or more. Among these, it searches for a webpage request whose referer field is not empty; if one exists, the webpage request that this referer field points to is determined as the target webpage request.
Of course, the first result unit 503 may obtain the webpage requests and check whether the referer field is empty at the same time, or it may first find the webpage requests on the same TCP connection as the webpage request to be associated and then search those for a webpage request whose referer field is not empty. Preferably, the first way is chosen, to improve the efficiency of the determination.
The second result unit 504 is configured to determine the target webpage request according to the referer field.
When the referer field of the webpage request to be associated is not empty, the second result unit 504 determines the webpage request (container object request) that the referer field points to as the target webpage request. For example, if the referer field of webpage request H_1 is http://www.sina.com.cn/, the webpage request (container object request) that http://www.sina.com.cn/ points to is determined as the target webpage request.
Of course, while determining the target webpage request according to the referer field, the second result unit 504 also uses the IP addresses of the webpage request to be associated and of the target webpage request: the two IP addresses must be identical. This ensures that the two webpage requests were sent by the same client, so that the webpage request to be associated can still be associated correctly when multiple clients using different IP addresses access the same webpage at the same time.
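The lookup by referer and client IP described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `Request` record, its field names and the sample IP addresses are assumptions introduced for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Request:
    """Hypothetical record of a captured webpage request."""
    client_ip: str
    url: str
    referer: str = ""


def find_by_referer(pending: Request, seen: List[Request]) -> List[Request]:
    """Return requests whose URL equals the pending request's referer
    and whose client IP equals the pending request's client IP."""
    return [
        r for r in seen
        if r.url == pending.referer and r.client_ip == pending.client_ip
    ]


# Made-up sample data: two clients with different IPs request the same page.
seen = [
    Request("10.0.0.1", "http://www.sina.com.cn/"),
    Request("10.0.0.2", "http://www.sina.com.cn/"),
]
h1 = Request("10.0.0.1", "http://www.sina.com.cn/logo.png",
             referer="http://www.sina.com.cn/")
matches = find_by_referer(h1, seen)
```

If two clients shared one IP address, for example behind NAT, `matches` would contain more than one request, and a further criterion would be needed to choose among them.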
The association unit 505 is configured to associate the webpage request to be associated with the target webpage request.
The association unit 505 associates the webpage request to be associated with the target webpage request determined by the first result unit 503 or the second result unit 504. The association may be made by adding the same identifier to the webpage request to be associated and the target webpage request; by obtaining the unique identifier that the target webpage request already has and adding it to the webpage request to be associated; or by building a mapping table between the webpage request to be associated and the target webpage request. Of course, the association is not limited to these three ways; any prior-art way of establishing a correspondence between two webpage requests falls within the protection scope of the present application.
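Two of the association ways listed above, a shared identifier and a mapping table, can be sketched as follows. The function names, the `page_id` key and the use of URLs as table keys are assumptions made up for the example; the first function also creates an identifier when the target has none, which blends the first two ways described in the text.

```python
import uuid
from typing import Dict


def associate_by_identifier(pending: Dict[str, str], target: Dict[str, str]) -> None:
    """Give the pending request the same identifier as the target,
    creating an identifier first if the target has none yet."""
    if "page_id" not in target:
        target["page_id"] = uuid.uuid4().hex
    pending["page_id"] = target["page_id"]


def associate_by_table(table: Dict[str, str], pending_key: str, target_key: str) -> None:
    """Record the pending request -> target request correspondence in a table."""
    table[pending_key] = target_key


container = {"url": "http://www.sina.com.cn/"}
embedded = {"url": "http://www.sina.com.cn/logo.png"}
associate_by_identifier(embedded, container)

mapping: Dict[str, str] = {}
associate_by_table(mapping, embedded["url"], container["url"])
```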
As can be seen from the above technical solution, this embodiment provides a webpage request association device. The device obtains a webpage request to be associated and judges whether its referer field is empty; if so, it determines the target webpage request according to the webpage requests on the same TCP connection as the webpage request to be associated; if not, it determines the target webpage request according to the referer field. Finally, it associates the webpage request to be associated with the target webpage request.
Before associating a webpage request, this embodiment judges whether the referer field of the webpage request is empty. If it is empty, the target webpage request is determined according to the webpage requests on the same TCP connection as the webpage request to be associated. This effectively solves the prior-art problem that association fails when the referer field is empty, which reduces the reliability of association.
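The overall branching of this embodiment can be sketched as a small dispatcher. The names `determine_target`, `by_tcp` and `by_referer` are assumptions for illustration, and the two resolvers are stubs standing in for the first and second result units.

```python
from typing import Callable, Dict, Optional

Request = Dict[str, str]
Resolver = Callable[[Request], Optional[Request]]


def determine_target(pending: Request,
                     by_tcp: Resolver,
                     by_referer: Resolver) -> Optional[Request]:
    """Choose the resolver according to whether the referer field is empty."""
    if not pending.get("referer"):
        return by_tcp(pending)      # empty referer: use the TCP connection
    return by_referer(pending)      # non-empty referer: use the referer field


# Stub resolvers with made-up return values.
container = {"url": "http://www.sina.com.cn/"}
tcp_result = {"url": "http://www.sina.com.cn/via-tcp"}


def by_tcp(req: Request) -> Optional[Request]:
    return tcp_result


def by_referer(req: Request) -> Optional[Request]:
    return container if req["referer"] == container["url"] else None


with_referer = determine_target(
    {"url": "http://www.sina.com.cn/logo.png", "referer": "http://www.sina.com.cn/"},
    by_tcp, by_referer)
without_referer = determine_target(
    {"url": "http://www.sina.com.cn/logo.png", "referer": ""},
    by_tcp, by_referer)
```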
Refer to Fig. 6, which shows a partial schematic structural diagram of Embodiment 6 of a webpage request association device provided by the present application. The first result unit 503 in Embodiment 5 may include: an obtaining subunit 601, a searching subunit 602, a first result subunit 603 and a second result subunit 604. Specifically:
The obtaining subunit 601 is configured to obtain the TCP connection identifier of the webpage request to be associated.
A TCP connection identifier is determined by four parameters: the source IP, destination IP, source port and destination port of the webpage request. The obtaining subunit 601 parses the packet of the webpage request to be associated obtained by the acquiring unit 501 in Embodiment 5, and obtains its source IP, destination IP, source port and destination port.
The searching subunit 602 is configured to search, among the received webpage requests, for a webpage request whose TCP connection identifier is identical to that of the webpage request to be associated and whose referer field is not empty; when one is found, the first result subunit 603 is triggered, and when none is found, the second result subunit 604 is triggered.
As each webpage request is received, the searching subunit 602 obtains its TCP connection identifier and, at the same time, judges whether its referer field is non-empty. When the TCP connection identifier (source IP, destination IP, source port and destination port) of a webpage request is identical to that of the webpage request to be associated and its referer field is not empty, the search ends and the first result subunit 603 is triggered. If all received webpage requests have been searched and none has these features, the second result subunit 604 is triggered.
It should be noted that, in the above search, checking whether the TCP connection identifier is identical to that of the webpage request to be associated and checking whether the referer field is empty are performed at the same time, which makes the search more efficient and correspondingly improves the efficiency of associating the webpage request to be associated.
The first result subunit 603 is configured to determine the target webpage request according to the referer field of the found webpage request.
The determination is as follows: the first result subunit 603 obtains the referer field of the webpage request found by the searching subunit 602, and determines the webpage request that this referer field points to as the target webpage request.
For example, the source IP address of the webpage request to be associated is 192.168.200.55, its destination IP address is 218.30.13.36, its source port number is 100 and its destination port number is 80. The webpage request found by the searching subunit 602 has source IP address 192.168.200.55, destination IP address 218.30.13.36, source port number 100 and destination port number 80, and its referer field is not empty and contains http://www.sina.com.cn/. The first result subunit 603 determines the webpage request corresponding to http://www.sina.com.cn/ as the target webpage request.
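The search over received requests can be sketched with the addresses and ports from the example above. The `ConnId` tuple and the function `referer_on_same_connection` are assumptions introduced for illustration; the extra entries in `received` (port 101 and http://example.com/) are made-up filler.

```python
from typing import List, NamedTuple, Optional, Tuple


class ConnId(NamedTuple):
    """TCP connection identifier: the four parameters named in the text."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int


def referer_on_same_connection(pending_conn: ConnId,
                               received: List[Tuple[ConnId, str]]) -> Optional[str]:
    """Return the referer of the first received request that is on the same
    TCP connection and has a non-empty referer, or None if there is none."""
    for conn, referer in received:
        # Both conditions are checked in the same pass over the requests.
        if conn == pending_conn and referer:
            return referer
    return None


pending_conn = ConnId("192.168.200.55", "218.30.13.36", 100, 80)
received = [
    (ConnId("192.168.200.55", "218.30.13.36", 101, 80), "http://example.com/"),
    (ConnId("192.168.200.55", "218.30.13.36", 100, 80), ""),
    (ConnId("192.168.200.55", "218.30.13.36", 100, 80), "http://www.sina.com.cn/"),
]
target_url = referer_on_same_connection(pending_conn, received)
```

A return value of `None` corresponds to the case in which no matching request is found and a fallback is needed.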
The second result subunit 604 is configured to obtain at least one container object request, search among the container object requests for the one whose generation time has the shortest interval to the generation time of the webpage request to be associated, and determine the found container object request as the target webpage request.
For each user, all container object requests sent by that user are maintained, and the generation time of each container object request is recorded. The second result subunit 604 determines the container object request whose generation time has the shortest interval to the generation time of the webpage request to be associated as the target webpage request.
For example, the source IP address of the webpage request to be associated is 192.168.200.55 and its generation time is 2013-11-19 11:31:29. The user at this IP address has sent container object requests A and B; the generation time of A is 2013-11-19 11:31:23 and the generation time of B is 2013-11-19 11:31:28. The second result subunit 604 finds that the interval between the generation time of container object request B and that of the webpage request to be associated is 1 second, which is the shortest interval, and determines container object request B as the target webpage request.
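The worked example above can be reproduced with a small per-user registry of container object requests. The registry layout and the function names `record_container` and `nearest_container` are assumptions for illustration; the IP address and timestamps are the ones from the example.

```python
from collections import defaultdict
from datetime import datetime
from typing import DefaultDict, List, Optional, Tuple

# Per-user list of (container request name, generation time), keyed by source IP.
containers: DefaultDict[str, List[Tuple[str, datetime]]] = defaultdict(list)


def record_container(user_ip: str, name: str, created: datetime) -> None:
    """Remember a container object request sent by the given user."""
    containers[user_ip].append((name, created))


def nearest_container(user_ip: str, pending_created: datetime) -> Optional[str]:
    """Return the user's container request closest in generation time."""
    entries = containers.get(user_ip)
    if not entries:
        return None
    name, _ = min(entries,
                  key=lambda e: abs((pending_created - e[1]).total_seconds()))
    return name


record_container("192.168.200.55", "A", datetime(2013, 11, 19, 11, 31, 23))
record_container("192.168.200.55", "B", datetime(2013, 11, 19, 11, 31, 28))
target = nearest_container("192.168.200.55", datetime(2013, 11, 19, 11, 31, 29))
```

With these values the intervals are 6 seconds for A and 1 second for B, so B is selected.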
As can be seen from the above technical solution, in the webpage request association device provided by this embodiment, when the judging unit 502 judges that the referer field of the webpage request to be associated is empty, the target webpage request is determined within the TCP connection of the webpage request to be associated. The searching subunit 602 searches the received webpage requests for one whose TCP connection identifier is identical to that of the webpage request to be associated and whose referer field is not empty. If one is found, the first result subunit 603 determines the target webpage request according to its referer field; if not, the second result subunit 604 uses the time mechanism and determines the container object request with the shortest interval to the webpage request to be associated as the target webpage request. This completes the determination of the target webpage request. Using the referer field of webpage requests first and the time mechanism second improves the reliability of association.
In Embodiment 5, when the second result unit 504 determines the target webpage request according to the referer field, association ambiguity may arise: multiple clients use the same IP address to access the same page, so the referer field yields multiple target webpage requests. Refer to Fig. 7, which shows a partial schematic structural diagram of Embodiment 7 of the webpage request association device provided by the present application. The second result unit 504 in Embodiment 5 may include: a first field obtaining subunit 701, a request obtaining subunit 702, a first determination subunit 703, a second field obtaining subunit 704 and a second determination subunit 705. Specifically:
The first field obtaining subunit 701 is configured to obtain the referer field of the webpage request to be associated.
The first field obtaining subunit 701 parses the HTTP request packet that carries the webpage request and obtains the referer field in that packet.
The request obtaining subunit 702 is configured to obtain at least one webpage request according to the referer field.
The content of the referer field in a webpage request is the URL of another webpage request, and indicates that the two requests are associated. Using the referer field content obtained by the first field obtaining subunit 701, and also checking the IP address of the client that sent the request, the request obtaining subunit 702 searches for webpage requests whose URL is the one given in the referer field and whose IP address is the same as that of the webpage request to be associated. One such request may be found, or several. Several are found when multiple clients use the same IP address to access the same webpage at the same time, a situation that often arises under NAT.
The first determination subunit 703 is configured to determine, when one webpage request is found, the target webpage request according to the referer field of the found webpage request.
The second field obtaining subunit 704 is configured to obtain, when multiple webpage requests are found, the User-Agent field of the webpage request to be associated and the User-Agent field of each found webpage request.
The request obtaining subunit 702 may obtain multiple webpage requests. For example, the referer field of the obtained webpage request H_1 is http://www.sina.com.cn/, and the URLs of the found webpage requests H_2 and H_3 are both that URL (http://www.sina.com.cn/).
When multiple webpage requests are found, the second field obtaining subunit 704 obtains the User-Agent field of the webpage request to be associated and of each found webpage request. The User-Agent field is a field in a webpage request that records information such as the operating system and the browser version of the client that sent the request. The inventors found through research that this information is valuable for distinguishing different clients, especially the browser version information. Different clients may use the same browser, but they usually update the browser version at different points in time, so the information in their User-Agent fields will differ.
The second determination subunit 705 is configured to determine the webpage request whose User-Agent field is identical to the User-Agent field of the webpage request to be associated as the target webpage request.
Based on the User-Agent fields obtained by the second field obtaining subunit 704, the second determination subunit 705 determines the webpage request whose User-Agent field is identical to that of the webpage request to be associated as the target webpage request.
As can be seen from the above technical solution, in the webpage request association device provided by this embodiment, when the target webpage request is being determined according to the referer field and multiple webpage requests are found because multiple clients use the same IP address to access webpages, the second determination subunit 705 uses the User-Agent field of the webpage requests to further determine the target webpage request. This effectively resolves the association ambiguity and improves the reliability of webpage request association.
On the basis of Embodiment 7 above, the User-Agent fields that the second field obtaining subunit 704 obtains for the webpage requests may also be identical. Refer to Fig. 8, which shows a partial schematic structural diagram of Embodiment 8 of a webpage request association device provided by the present application. The second determination subunit 705 in Embodiment 7 may include: a candidate request obtaining subunit 801, a third determination subunit 802 and a fourth determination subunit 803. Specifically:
The candidate request obtaining subunit 801 is configured to obtain at least one candidate webpage request according to the User-Agent field; when there is one candidate webpage request, the third determination subunit 802 is triggered, and when there are multiple candidate webpage requests, the fourth determination subunit 803 is triggered.
The third determination subunit 802 is configured to determine the candidate webpage request as the target webpage request.
The fourth determination subunit 803 is configured to calculate the time difference between the generation time of the webpage request to be associated and the generation time of each candidate webpage request, and determine the minimum among these time differences; according to the minimum, the target webpage request is determined among the candidate webpage requests.
When the candidate request obtaining subunit 801 determines multiple candidate webpage requests, the fourth determination subunit 803 may adopt a first-come, first-associated mechanism: the webpage request received first is determined as the target webpage request (container object request) of the webpage request to be associated (embedded object request). The fourth determination subunit 803 calculates the interval between the time at which the webpage request to be associated was received and the time of each candidate webpage request, and determines the candidate webpage request with the shortest interval as the target webpage request.
As can be seen from the above technical solution, in the webpage request association device provided by this embodiment, when the User-Agent fields used by multiple clients are identical, the fourth determination subunit 803 uses the first-come, first-associated mechanism to determine the target webpage request, which further improves the reliability of the association.
It should be noted that the embodiments in this specification are described progressively; each embodiment focuses on its differences from the other embodiments, and for the parts that are the same or similar the embodiments may be referred to one another.
The method and device for associating webpage requests provided by the present invention have been described in detail above. The above description of the disclosed embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present invention. Therefore, the present invention is not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (8)

1. A method for associating webpage requests, characterized by comprising:
Obtaining a webpage request to be associated;
Judging whether the referer field of the webpage request to be associated is empty;
If so, determining a target webpage request according to the webpage requests that are on the same TCP connection as the webpage request to be associated;
If not, determining the target webpage request according to the referer field;
Associating the webpage request to be associated with the target webpage request.
2. The method according to claim 1, characterized in that determining the target webpage request according to the webpage requests that are on the same TCP connection as the webpage request to be associated comprises:
Obtaining the TCP connection identifier of the webpage request to be associated;
Searching, among the received webpage requests, for a webpage request whose TCP connection identifier is identical to that of the webpage request to be associated and whose referer field is not empty;
When one is found, determining the target webpage request according to the referer field of the found webpage request;
When none is found, obtaining at least one container object request, searching among the container object requests for the one whose generation time has the shortest interval to the generation time of the webpage request to be associated, and determining the found container object request as the target webpage request.
3. The method according to claim 1, characterized in that determining the target webpage request according to the referer field comprises:
Obtaining the referer field of the webpage request to be associated;
Obtaining at least one webpage request according to the referer field;
When one webpage request is obtained, determining the target webpage request according to the referer field of the found webpage request;
When multiple webpage requests are obtained, obtaining the User-Agent field of the webpage request to be associated and the User-Agent field of each of the webpage requests;
Determining the webpage request whose User-Agent field is identical to the User-Agent field of the webpage request to be associated as the target webpage request.
4. The method according to claim 3, characterized in that determining the webpage request whose User-Agent field is identical to the User-Agent field of the webpage request to be associated as the target webpage request comprises:
Obtaining at least one candidate webpage request according to the User-Agent field;
When there is one candidate webpage request, determining the candidate webpage request as the target webpage request;
When there are multiple candidate webpage requests, calculating the time difference between the generation time of the webpage request to be associated and the generation time of each candidate webpage request, and determining the minimum among these time differences; according to the minimum, determining the target webpage request among the candidate webpage requests.
5. A device for associating webpage requests, characterized by comprising:
An acquiring unit, configured to obtain a webpage request to be associated;
A judging unit, configured to judge whether the referer field of the webpage request to be associated is empty; if so, the first result unit is triggered, and if not, the second result unit is triggered;
A first result unit, configured to determine a target webpage request according to the webpage requests that are on the same TCP connection as the webpage request to be associated;
A second result unit, configured to determine the target webpage request according to the referer field;
An association unit, configured to associate the webpage request to be associated with the target webpage request.
6. The device according to claim 5, characterized in that the first result unit comprises:
An obtaining subunit, configured to obtain the TCP connection identifier of the webpage request to be associated;
A searching subunit, configured to search, among the received webpage requests, for a webpage request whose TCP connection identifier is identical to that of the webpage request to be associated and whose referer field is not empty; when one is found, the first result subunit is triggered, and when none is found, the second result subunit is triggered;
A first result subunit, configured to determine the target webpage request according to the referer field of the found webpage request;
A second result subunit, configured to obtain at least one container object request, search among the container object requests for the one whose generation time has the shortest interval to the generation time of the webpage request to be associated, and determine the found container object request as the target webpage request.
7. The device according to claim 5, characterized in that the second result unit comprises:
A first field obtaining subunit, configured to obtain the referer field of the webpage request to be associated;
A request obtaining subunit, configured to obtain at least one webpage request according to the referer field; when one webpage request is obtained, the first determination subunit is triggered; when multiple webpage requests are obtained, the second field obtaining subunit is triggered;
A first determination subunit, configured to determine the target webpage request according to the referer field of the found webpage request;
A second field obtaining subunit, configured to obtain the User-Agent field of the webpage request to be associated and the User-Agent field of each of the webpage requests, and to trigger the second determination subunit;
A second determination subunit, configured to determine the webpage request whose User-Agent field is identical to the User-Agent field of the webpage request to be associated as the target webpage request.
8. The device according to claim 7, characterized in that the second determination subunit comprises:
A candidate request obtaining subunit, configured to obtain at least one candidate webpage request according to the User-Agent field; when there is one candidate webpage request, the third determination subunit is triggered, and when there are multiple candidate webpage requests, the fourth determination subunit is triggered;
A third determination subunit, configured to determine the candidate webpage request as the target webpage request;
A fourth determination subunit, configured to calculate the time difference between the generation time of the webpage request to be associated and the generation time of each candidate webpage request, and determine the minimum among these time differences; according to the minimum, the target webpage request is determined among the candidate webpage requests.
CN201410012342.5A 2014-01-10 2014-01-10 Association method and device for webpage request Pending CN103714182A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410012342.5A CN103714182A (en) 2014-01-10 2014-01-10 Association method and device for webpage request

Publications (1)

Publication Number Publication Date
CN103714182A true CN103714182A (en) 2014-04-09

Family

ID=50407157

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410012342.5A Pending CN103714182A (en) 2014-01-10 2014-01-10 Association method and device for webpage request

Country Status (1)

Country Link
CN (1) CN103714182A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113360814A (en) * 2020-03-03 2021-09-07 腾讯科技(深圳)有限公司 Webpage request processing method and device and computer equipment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070288571A1 (en) * 2006-06-07 2007-12-13 Nokia Siemens Networks Gmbh & Co. Kg Method and device for the production and distribution of messages directed at a multitude of recipients in a communications network
CN101136834A (en) * 2007-10-19 2008-03-05 杭州华三通信技术有限公司 SSL VPN based link rewriting method and apparatus
CN101227390A (en) * 2008-01-22 2008-07-23 中兴通讯股份有限公司 Method for implementing priority level for generating order of mapping item for network address conversion
CN101287013A (en) * 2008-05-30 2008-10-15 杭州华三通信技术有限公司 Method for updating Webpage and Web proxy device
CN103036746A (en) * 2012-12-21 2013-04-10 中国科学院计算技术研究所 Passive measurement method and passive measurement system of web page responding time based on network intermediate point

Similar Documents

Publication Publication Date Title
CN101431539B (en) Domain name resolution method, system and apparatus
US8489724B2 (en) CNAME-based round-trip time measurement in a content delivery network
US6377961B1 (en) Method for displaying internet search results
CN103891247A (en) Method and system for domain name system based discovery of devices and objects
CN105187396A (en) Method and device for identifying web crawler
CN110430188B (en) Rapid URL filtering method and device
WO2001090946A3 (en) Method and apparatus for utilizing user feedback to improve signifier mapping
CN105635064B (en) CSRF attack detection method and device
CA2589365A1 (en) Predictive information retrieval
CN110855636B (en) DNS hijacking detection method and device
CN102546854A (en) Domain name analysis method for building hyper text transport protocol (HTTP) connection for domain name and server
CN108111547B (en) Domain name health monitoring method and system
CN108768982B (en) Phishing website detection method and device, computing equipment and computer storage medium
CN103685611A (en) Network access processing method and device
CN104468860A (en) Method and device for recognizing risk of domain name resolution server
CN103873602A (en) Network resource naming method and generating device
CN104253796B (en) Quick area's recognition methods based on network address binding region layer level in domain name system
CN104636368A (en) Data retrieval method and device and server
CN103684823A (en) Weblog recording method, network access path determining method and related devices
CN103729458B (en) Method and device for distinguishing webpage requests
CN111614792B (en) Transparent transmission method, system, server, electronic device and storage medium
CN103714182A (en) Association method and device for webpage request
CN109413022A (en) A kind of method and apparatus based on user behavior detection HTTP FLOOD attack
CN105530329B (en) A kind of novel domain name resolution service method and apparatus for supporting name to search for
CN105515882A (en) Website security detection method and website security detection device

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20140409