CN104731949A - Method and device for recognizing webpage skipping - Google Patents

Info

Publication number
CN104731949A
Authority
CN
China
Prior art keywords
redirect
webpage
scripted code
attribute
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510150008.0A
Other languages
Chinese (zh)
Other versions
CN104731949B (en)
Inventor
付通敏
魏少俊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Qihoo Technology Co Ltd
Original Assignee
Beijing Qihoo Technology Co Ltd
Qizhi Software Beijing Co Ltd
Application filed by Beijing Qihoo Technology Co Ltd and Qizhi Software Beijing Co Ltd
Priority to CN201510150008.0A (patent CN104731949B)
Publication of CN104731949A
Application granted
Publication of CN104731949B
Legal status: Active

Landscapes

  • Document Processing Apparatus (AREA)

Abstract

The invention provides a method and device for recognizing webpage skipping (redirects). The method comprises: simulating the objects in a browser that handle webpage skipping, to obtain simulated objects for handling webpage skipping; extracting script code from a webpage to be recognized; invoking the simulated objects to execute the script code; and determining, according to the result of executing the script code, whether the webpage to be recognized skips. This technical scheme is many times more efficient than invoking a browser to execute the script code in a webpage, because only the objects related to skipping are simulated, so computing resources are not wasted on code irrelevant to skipping.

Description

Method and apparatus for identifying webpage redirects
Technical field
The present invention relates to the field of Internet technology, and in particular to a method and apparatus for identifying webpage redirects.
Background
When a search engine spider crawls webpages, it must handle pages that redirect. One form of redirect is the JavaScript redirect: JavaScript code is embedded in the HTML returned by the web server, and the page redirects when that code runs in the browser. Because JavaScript is very flexible, JavaScript redirects have no clearly distinguishing features, and current search engine spiders do not identify them well.
At present, search engine spiders identify JavaScript redirects mainly by regular-expression matching: common forms of JavaScript redirect are written as regular expressions, which are matched against the page after it has been downloaded; if a match is found, the page is considered to have redirected and the redirect target is extracted. This approach has two shortcomings: first, it is difficult to write regular expressions that cover all cases; second, it cannot be applied when the JavaScript code involves variables or computation.
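The second shortcoming can be illustrated with a small sketch. The pattern `REDIRECT_PATTERN` and the helper `matchRedirect` are hypothetical and are not taken from any real spider; they only show how a pattern written for literal URLs returns a wrong target once the URL is assembled from a variable.

```javascript
// Hypothetical pattern for the literal form: location.href = "<url>"
const REDIRECT_PATTERN = /location\.href\s*=\s*["']([^"']+)["']/;

// Return the captured target, or null when the pattern does not match.
function matchRedirect(script) {
  const m = REDIRECT_PATTERN.exec(script);
  return m ? m[1] : null;
}

// Literal target: the pattern captures the full URL.
const literalScript = 'window.location.href = "http://example.com/a";';

// Computed target: the URL is built from a variable at run time.
const computedScript =
  'var host = "example.com"; window.location.href = "http://" + host + "/b";';

const literalResult = matchRedirect(literalScript);   // "http://example.com/a"
const computedResult = matchRedirect(computedScript); // "http://" (truncated, wrong)
```

In the computed case the pattern still matches, but it only sees the first string literal, so the extracted target is incomplete. Recovering the real target requires evaluating the expression.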
Another obvious approach is to invoke a browser to execute the JavaScript code, so that the spider identifies redirects exactly as a user's browser would. The drawback is that this consumes a large amount of computing resources: it is feasible for a small number of webpages but essentially infeasible for a large number.
Summary of the invention
In view of the above problems, the present invention is proposed to provide a method and apparatus for identifying webpage redirects that overcome the above problems or at least partially solve them.
According to one aspect of the present invention, a method for identifying webpage redirects is provided, comprising: simulating the objects in a browser that handle webpage redirects, to obtain simulated objects for handling webpage redirects; extracting script code from a webpage to be identified, and invoking the simulated objects to execute the script code; and determining, according to the result of executing the script code, whether the webpage to be identified redirects.
Optionally, in the foregoing method, simulating the objects in the browser that handle webpage redirects to obtain the simulated objects further comprises: simulating only those methods and/or attributes of the objects in the browser that handle webpage redirects, to obtain a simulated context environment in which the script code is executed.
Optionally, in the foregoing method, the simulated context environment comprises an attribute for recording the page to be redirected to; and determining, according to the result of executing the script code, whether the webpage to be identified redirects further comprises: setting the value of the attribute for recording the page to be redirected to according to the URL of the webpage; after the script code is executed, checking whether the value of that attribute has changed; and, when the value has changed, determining that a redirect to the page indicated by the changed value is required.
Optionally, in the foregoing method, the simulated context environment comprises a method for generating a page file, wherein the presence of a predetermined redirect tag in the page file indicates that the page corresponding to the page file is the page to be redirected to; and the method further comprises: when the value of the attribute for recording the page to be redirected to has not changed, checking whether the redirect tag is present in the page file; and, when the redirect tag is present in the page file, determining that a redirect to the page corresponding to the page file is required.
Optionally, in the foregoing method, invoking the simulated objects to execute the script code specifically comprises: dividing the script code into a plurality of segments and executing each segment in turn; and catching any exception raised when each segment is executed.
Optionally, in the foregoing method, dividing the script code into a plurality of segments and executing each segment in turn specifically comprises: dividing the script code into the plurality of segments in units of the statements and/or statement blocks of the script code.
Optionally, the foregoing method further comprises, before dividing the script code into the plurality of segments in units of statements and/or statement blocks: searching the script code for a first symbol and identifying the statements in the script code according to the first symbol found; and/or searching the script code for a second symbol and identifying the statement blocks in the script code according to the second symbol found.
According to another aspect of the present invention, an apparatus for identifying webpage redirects is provided, comprising: a simulation module, configured to simulate the objects in a browser that handle webpage redirects, to obtain simulated objects for handling webpage redirects; a script code execution module, configured to extract script code from a webpage to be identified and to invoke the simulated objects to execute the script code; and a redirect determination module, configured to determine, according to the result of executing the script code, whether the webpage to be identified redirects.
Optionally, in the foregoing apparatus, the simulation module simulates only those methods and/or attributes of the objects in the browser that handle webpage redirects, to obtain a simulated context environment in which the script code is executed.
Optionally, in the foregoing apparatus, the attributes of the simulated objects comprise an attribute for recording the page to be redirected to; the apparatus further comprises: an attribute setting module, configured to set the value of the attribute for recording the page to be redirected to according to the URL of the webpage; and a change checking module, configured to check, after the script code is executed, whether the value of that attribute has changed; and the redirect determination module determines, when the value has changed, that a redirect to the page indicated by the changed value is required.
Optionally, in the foregoing apparatus, the simulated context environment comprises a method for generating a page file, wherein the presence of a predetermined redirect tag in the page file indicates that the page corresponding to the page file is the page to be redirected to; the apparatus further comprises: a tag checking module, configured to check, when the value of the attribute for recording the page to be redirected to has not changed, whether the redirect tag is present in the page file; and the redirect determination module determines, when the redirect tag is present in the page file, that a redirect to the page corresponding to the page file is required.
Optionally, in the foregoing apparatus, the script code execution module divides the script code into a plurality of segments and executes each segment in turn; and the apparatus further comprises: an exception catching module, configured to catch any exception raised when each segment is executed.
Optionally, in the foregoing apparatus, the script code execution module divides the script code into the plurality of segments in units of the statements and/or statement blocks of the script code.
Optionally, the foregoing apparatus further comprises: a statement identification module, configured to search the script code for a first symbol and to identify the statements in the script code according to the first symbol found; and/or a statement block identification module, configured to search the script code for a second symbol and to identify the statement blocks in the script code according to the second symbol found.
From the above technical solutions, it can be seen that the method and apparatus for identifying webpage redirects of the present invention have at least the following advantages:
In the technical solution of the present invention, webpage redirects are identified by executing the script code (for example, JavaScript) in a webpage; however, rather than invoking a browser to execute that script code, the solution simulates only the browser objects that handle webpage redirects. The solution is many times more efficient than invoking a browser to execute the script code, because only the objects relevant to redirects are simulated and no computing resources are wasted on code unrelated to redirects.
The above is only an overview of the technical solution of the present invention. So that the technical means of the invention can be understood more clearly and implemented according to the contents of the specification, and so that the above and other objects, features, and advantages of the invention become more apparent, specific embodiments of the invention are set out below.
Brief description of the drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings serve only to illustrate the preferred embodiments and are not to be considered limiting of the present invention. Throughout the drawings, the same reference symbols denote the same parts. In the drawings:
Fig. 1 shows a flowchart of a method for identifying webpage redirects according to an embodiment of the present invention;
Fig. 2 shows a flowchart of a method for identifying webpage redirects according to an embodiment of the present invention;
Fig. 3 shows a flowchart of a method for identifying webpage redirects according to an embodiment of the present invention;
Fig. 4 shows a flowchart of a method for identifying webpage redirects according to an embodiment of the present invention;
Fig. 5 shows a block diagram of an apparatus for identifying webpage redirects according to an embodiment of the present invention;
Fig. 6 shows a block diagram of an apparatus for identifying webpage redirects according to an embodiment of the present invention;
Fig. 7 shows a block diagram of an apparatus for identifying webpage redirects according to an embodiment of the present invention.
Detailed description
Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the disclosure, it should be understood that the disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure will be understood more thoroughly and so that its scope can be fully conveyed to those skilled in the art.
As shown in Fig. 1, an embodiment of the present invention provides a method for identifying webpage redirects, comprising:
Step 110: simulate the objects in a browser that handle webpage redirects, to obtain simulated objects for handling webpage redirects. This embodiment is described in terms of identifying JavaScript redirects. First, a number of JavaScript objects written in advance are used to simulate some of the attributes and methods of the browser's BOM and DOM objects that handle JavaScript redirects. Specifically, this embodiment may simulate the browser's window, document, location, navigator, and history objects, which contain a great many methods and attributes for handling webpage redirects; other objects need not be simulated.
Step 120: extract script code from the webpage to be identified, and invoke the simulated objects to execute it. In this embodiment, the JavaScript code in the webpage is extracted, and a JavaScript execution engine (for example, Google V8 or Node.js) is used to execute the previously written simulated objects in a JavaScript environment; once this is done, the context environment contains the simulated window, document, location, navigator, and history objects.
Step 130: determine, according to the result of executing the script code, whether the webpage to be identified redirects. In this embodiment, the JavaScript code in the webpage is executed in the simulated context environment, and the result indicates whether a redirect occurs and, if so, the URL redirected to.
In the technical solution of this embodiment, because only a limited set of browser objects is simulated, the JavaScript code in the webpage can be executed while everything else in the webpage is ignored; as a result, identifying JavaScript redirects consumes few resources and is highly efficient.
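Steps 110 to 130 can be sketched in Node.js using its built-in `vm` module. This is a minimal illustration under stated assumptions, not the patented implementation: the function names `extractScripts`, `createSimulatedContext`, and `detectRedirect` are hypothetical, the script extraction is a simplified regular expression that handles only inline scripts, and only assignments to `location.href` are detected.

```javascript
const vm = require("vm");

// Step 120 (first half): collect inline <script> bodies.
// Assumption: a simple regex is enough; external scripts are ignored.
function extractScripts(html) {
  const scripts = [];
  const re = /<script\b[^>]*>([\s\S]*?)<\/script>/gi;
  let m;
  while ((m = re.exec(html)) !== null) scripts.push(m[1]);
  return scripts;
}

// Step 110: build plain objects standing in for window, document,
// location, navigator, and history. Everything else is left out.
function createSimulatedContext(pageUrl) {
  const location = { href: pageUrl };
  const window = {
    location,
    document: { location },
    navigator: { userAgent: "" },
    history: { length: 1 },
  };
  window.window = window; // scripts may refer to window explicitly
  return vm.createContext(window);
}

// Steps 120 and 130: run each script in the simulated context, then
// compare location.href with the original URL.
function detectRedirect(html, pageUrl) {
  const context = createSimulatedContext(pageUrl);
  for (const code of extractScripts(html)) {
    try {
      vm.runInContext(code, context, { timeout: 1000 });
    } catch (err) {
      // A failing script must not abort the whole check.
    }
  }
  const target = context.location.href;
  return target !== pageUrl
    ? { redirected: true, url: target }
    : { redirected: false, url: null };
}

const html =
  '<html><script>var h = "example.org"; ' +
  'window.location.href = "http://" + h + "/new";</script></html>';
const result = detectRedirect(html, "http://example.com/old");
// result.url is "http://example.org/new"
```

Because the script is actually evaluated, a target built from a variable is resolved correctly, without loading a full browser.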
As shown in Fig. 2, an embodiment of the present invention provides a method for identifying webpage redirects, comprising:
Step 210: simulate only those methods and/or attributes of the objects in the browser that handle webpage redirects, to obtain a simulated context environment. This embodiment is again described in terms of JavaScript redirects. Because the objects to be simulated have a great many methods and attributes, only the methods and attributes related to redirects need be simulated. For example, the most important of the aforementioned objects is the location object, which should be simulated fairly completely: in particular, its various get and set methods for handling webpage redirects are simulated so that the object can be manipulated in the JavaScript environment. For the document object, only the write() and writeln() methods need be simulated, and methods such as getElementById() and getElementsByTagName() simply return true.
Step 220: extract script code from the webpage to be identified and execute it in the simulated context environment.
Step 230: determine, according to the result of executing the script code, whether the webpage to be identified redirects. In the technical solution of this embodiment, only a limited set of methods and attributes of each object is simulated, so the methods and attributes in the resulting context environment are targeted specifically at the JavaScript code; as a result, identifying JavaScript redirects consumes few resources and is highly efficient.
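The selective simulation of step 210 might look like the following sketch. The factory names `createLocation` and `createDocument` and the `_html` accessor on the document stub are assumptions made for illustration; the source specifies only that location's get and set methods, document's write() and writeln(), and the true-returning lookup methods are simulated.

```javascript
// Simulated location: get/set access to href plus the methods that
// change the current URL. All names here are illustrative.
function createLocation(initialUrl) {
  let current = initialUrl;
  return {
    get href() { return current; },
    set href(value) { current = String(value); },
    assign(url) { this.href = url; },
    replace(url) { this.href = url; },
    reload() {},
    toString() { return current; },
  };
}

// Simulated document: write() and writeln() accumulate generated HTML;
// the lookup methods are stubs that simply return true.
function createDocument(location) {
  let html = "";
  return {
    location,
    write(...parts) { html += parts.join(""); },
    writeln(...parts) { html += parts.join("") + "\n"; },
    getElementById() { return true; },
    getElementsByTagName() { return true; },
    get _html() { return html; }, // hypothetical accessor for inspection
  };
}

const simLocation = createLocation("http://example.com/");
const simDocument = createDocument(simLocation);

simLocation.replace("http://example.com/next");
simDocument.write("<p>", "hi", "</p>");
// simLocation.href is now "http://example.com/next"
// simDocument._html is now "<p>hi</p>"
```

Returning true from the lookup stubs keeps simple existence checks working, but a script that dereferences the returned value may throw; the exception handling described for the segmented execution embodiment tolerates such failures.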
As shown in Fig. 3, an embodiment of the present invention provides a method for identifying webpage redirects, comprising:
Step 310: simulate only those methods and/or attributes of the objects in the browser that handle webpage redirects, to obtain a simulated context environment. The simulated context environment comprises an attribute for recording the page to be redirected to, for example the window.location.href attribute of the aforementioned window object. The simulated context environment also comprises a method for generating a page file, where the presence of a predetermined redirect tag in the page file indicates that the page corresponding to the page file is the page to be redirected to; an example is the window.write() function of the aforementioned window object, which generates the page file window._html.
Step 320: extract script code from the webpage to be identified and execute it in the simulated context environment.
Step 330: set the value of the attribute for recording the page to be redirected to according to the URL of the webpage. In terms of the example above, the value of window.location.href is set to the URL of the current webpage.
Step 340: after the script code is executed, check whether the value of the attribute for recording the page to be redirected to has changed. In terms of the example above, this means checking whether the value of window.location.href is still the same as the original URL.
Step 350: when the value of the attribute for recording the page to be redirected to has changed, determine that a redirect to the page indicated by the changed value is required. In terms of the example above, if the value of window.location.href has changed, a redirect has occurred, and the current value of window.location.href is the URL redirected to.
Step 360: when the value of the attribute for recording the page to be redirected to has not changed, check whether a redirect tag is present in the page file. In terms of the example above, window._html is the HTML code generated by the window.write() function, and it must be checked for a <meta> tag redirect.
Step 370: when a redirect tag is present in the page file, determine that a redirect to the page corresponding to the page file is required.
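The decision flow of steps 330 to 370 can be sketched as follows, again with Node's `vm` module. The names `createWindow` and `judgeRedirect` and the two regular expressions are assumptions. The source only says to check for a <meta> redirect tag; treating a `http-equiv="refresh"` tag as that tag and extracting its `url=` value is an illustrative choice.

```javascript
const vm = require("vm");

// Simulated window: location.href records the page to be redirected to,
// and write() accumulates the generated page file in _html.
function createWindow(pageUrl) {
  const window = {
    location: { href: pageUrl }, // step 330: start from the page's own URL
    _html: "",
    write(text) { window._html += String(text); },
  };
  window.window = window;
  window.document = { write: window.write, location: window.location };
  return window;
}

// Assumed form of the redirect tag: <meta http-equiv="refresh" ...>
const META_REFRESH = /<meta\b[^>]*http-equiv\s*=\s*["']?refresh["']?[^>]*>/i;
const META_URL = /url\s*=\s*["']?([^"'>\s;]+)/i;

function judgeRedirect(script, pageUrl) {
  const window = createWindow(pageUrl);
  try {
    vm.runInContext(script, vm.createContext(window), { timeout: 1000 });
  } catch (err) {
    // Errors in the page script are ignored; the checks below still run.
  }
  // Steps 340 and 350: has the recorded URL changed?
  if (window.location.href !== pageUrl) {
    return { redirected: true, via: "location", url: window.location.href };
  }
  // Steps 360 and 370: otherwise look for a redirect tag in the page file.
  const tag = META_REFRESH.exec(window._html);
  if (tag) {
    const url = META_URL.exec(tag[0]);
    return { redirected: true, via: "meta", url: url ? url[1] : null };
  }
  return { redirected: false, via: null, url: null };
}

const page = "http://example.com/a";
const byLocation = judgeRedirect(
  'var p = "/b"; window.location.href = "http://example.com" + p;', page);
const metaScript =
  `document.write('<meta http-equiv="refresh" content="0; url=http://example.com/m">');`;
const byMeta = judgeRedirect(metaScript, page);
const none = judgeRedirect("var x = 1 + 1;", page);
```

Checking the recorded URL first and the generated markup second follows the order of steps 340 to 370; the `via` field is added here only to show which branch fired.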
As shown in Fig. 4, an embodiment of the present invention provides a method for identifying webpage redirects in which, compared with the embodiment corresponding to Fig. 1, step 120 comprises:
Step 121: divide the script code into a plurality of segments and execute each segment in turn.
Further, in one implementation of this embodiment, the script code is divided into the plurality of segments in units of the statements and/or statement blocks of the script code.
Further, in another implementation of this embodiment, the script code is searched for a first symbol, and the statements in the script code are identified according to the first symbol found; and/or the script code is searched for a second symbol, and the statement blocks in the script code are identified according to the second symbol found. For JavaScript code, statements are normally delimited by the semicolon, which serves as the first symbol, and code blocks are delimited by braces, which serve as the second symbol.
Step 122: catch any exception raised when each segment is executed.
In the technical solution of this embodiment, each code segment is executed in turn in the JavaScript context environment described above, and each execution should be wrapped in try...catch to catch exceptions, so that non-essential exceptions do not cause the whole program to exit.
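Steps 121 and 122 can be sketched as a naive scanner followed by a per-segment try...catch. The names `splitIntoSegments` and `runSegments` are hypothetical, and skipping over string literals is an addition not described in the source. The scanner is deliberately simple: constructs such as `if ... else` or `for (;;)` are split in the wrong place, and the resulting syntax errors are simply caught like any other exception.

```javascript
const vm = require("vm");

// Split code at top-level semicolons (first symbol) and at the closing
// brace of a top-level block (second symbol). String literals are skipped
// so that symbols inside them do not split a statement.
function splitIntoSegments(code) {
  const segments = [];
  let start = 0;
  let depth = 0;
  let quote = null;
  for (let i = 0; i < code.length; i++) {
    const ch = code[i];
    if (quote) {
      if (ch === "\\") i++; // skip the escaped character
      else if (ch === quote) quote = null;
      continue;
    }
    if (ch === '"' || ch === "'" || ch === "`") { quote = ch; continue; }
    if (ch === "{") depth++;
    else if (ch === "}") depth--;
    const endsStatement = ch === ";" && depth === 0;
    const endsBlock = ch === "}" && depth === 0;
    if (endsStatement || endsBlock) {
      const segment = code.slice(start, i + 1).trim();
      if (segment) segments.push(segment);
      start = i + 1;
    }
  }
  const rest = code.slice(start).trim();
  if (rest) segments.push(rest);
  return segments;
}

// Execute each segment in the same context, catching exceptions per
// segment so that one failure does not stop the remaining segments.
function runSegments(code, sandbox) {
  const context = vm.createContext(sandbox);
  const errors = [];
  for (const segment of splitIntoSegments(code)) {
    try {
      vm.runInContext(segment, context, { timeout: 1000 });
    } catch (err) {
      errors.push({ segment, message: String(err && err.message) });
    }
  }
  return errors;
}

const sandbox = { location: { href: "http://example.com/a" } };
const script = 'undefinedFunction(); location.href = "http://example.com/b";';
const errors = runSegments(script, sandbox);
// The first segment throws and is recorded; the second still runs.
```

Had the script been executed as a whole, the call to the undefined function would have aborted it before the assignment to `location.href`, and the redirect would have been missed.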
As shown in Fig. 5, an embodiment of the present invention provides an apparatus for identifying webpage redirects, comprising:
A simulation module 510, configured to simulate the objects in a browser that handle webpage redirects, to obtain simulated objects for handling webpage redirects. This embodiment is described in terms of identifying JavaScript redirects. First, a number of JavaScript objects written in advance are used to simulate some of the attributes and methods of the browser's BOM and DOM objects that handle JavaScript redirects. Specifically, this embodiment may simulate the browser's window, document, location, navigator, and history objects, which contain a great many methods and attributes for handling webpage redirects; other objects need not be simulated.
A script code execution module 520, configured to extract script code from the webpage to be identified and to invoke the simulated objects to execute it. In this embodiment, the JavaScript code in the webpage is extracted, and a JavaScript execution engine (for example, Google V8 or Node.js) is used to execute the previously written simulated objects in a JavaScript environment; once this is done, the context environment contains the simulated window, document, location, navigator, and history objects.
A redirect determination module 530, configured to determine, according to the result of executing the script code, whether the webpage to be identified redirects. In this embodiment, the JavaScript code in the webpage is executed in the simulated context environment, and the result indicates whether a redirect occurs and, if so, the URL redirected to.
In the technical solution of this embodiment, because only a limited set of browser objects is simulated, the JavaScript code in the webpage can be executed while everything else in the webpage is ignored; as a result, identifying JavaScript redirects consumes few resources and is highly efficient.
An embodiment of the present invention provides an apparatus for identifying webpage redirects, comprising:
A simulation module 510, which simulates only those methods and/or attributes of the objects in the browser that handle webpage redirects, to obtain a simulated context environment. This embodiment is again described in terms of JavaScript redirects. Because the objects to be simulated have a great many methods and attributes, only the methods and attributes related to redirects need be simulated. For example, the most important of the aforementioned objects is the location object, which should be simulated fairly completely: in particular, its various get and set methods for handling webpage redirects are simulated so that the object can be manipulated in the JavaScript environment. For the document object, only the write() and writeln() methods need be simulated, and methods such as getElementById() and getElementsByTagName() simply return true.
A script code execution module 520, which extracts script code from the webpage to be identified and executes it in the simulated context environment.
A redirect determination module 530, configured to determine, according to the result of executing the script code, whether the webpage to be identified redirects. In the technical solution of this embodiment, only a limited set of methods and attributes of each object is simulated, so the methods and attributes in the resulting context environment are targeted specifically at the JavaScript code; as a result, identifying JavaScript redirects consumes few resources and is highly efficient.
As shown in Fig. 6, an embodiment of the present invention provides an apparatus for identifying webpage redirects, comprising:
A simulation module 610, which simulates only those methods and/or attributes of the objects in the browser that handle webpage redirects, to obtain a simulated context environment. The simulated context environment comprises an attribute for recording the page to be redirected to, for example the window.location.href attribute of the aforementioned window object. The simulated context environment also comprises a method for generating a page file, where the presence of a predetermined redirect tag in the page file indicates that the page corresponding to the page file is the page to be redirected to; an example is the window.write() function of the aforementioned window object, which generates the page file window._html.
A script code execution module 620, which extracts script code from the webpage to be identified and executes it in the simulated context environment.
An attribute setting module 630, configured to set the value of the attribute for recording the page to be redirected to according to the URL of the webpage. In terms of the example above, the value of window.location.href is set to the URL of the current webpage.
A change checking module 640, configured to check, after the script code is executed, whether the value of the attribute for recording the page to be redirected to has changed. In terms of the example above, this means checking whether the value of window.location.href is still the same as the original URL.
A redirect determination module 650, configured to determine, when the value of the attribute for recording the page to be redirected to has changed, that a redirect to the page indicated by the changed value is required. In terms of the example above, if the value of window.location.href has changed, a redirect has occurred, and the current value of window.location.href is the URL redirected to.
A tag checking module 660, configured to check, when the value of the attribute for recording the page to be redirected to has not changed, whether a redirect tag is present in the page file. In terms of the example above, window._html is the HTML code generated by the window.write() function, and it must be checked for a <meta> tag redirect.
When the redirect tag is present in the page file, the redirect determination module 650 determines that a redirect to the page corresponding to the page file is required.
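One way to picture how the modules of Fig. 6 fit together is a single class whose methods correspond to modules 610 to 660. This is only a sketch under stated assumptions: the class name `RedirectRecognizer` and its method names are hypothetical, and the script is run with `new Function`, which passes the simulated objects as parameters but does not isolate the script from the real global scope, so it is not suitable for untrusted pages.

```javascript
class RedirectRecognizer {
  constructor(pageUrl) {
    this.pageUrl = pageUrl;
    this.window = this.simulate(); // simulation module 610
    this.setAttribute();           // attribute setting module 630
  }

  // Simulation module 610: only location.href and write() are provided.
  simulate() {
    const window = { location: { href: "" }, _html: "" };
    window.write = (text) => { window._html += String(text); };
    window.document = { write: window.write, location: window.location };
    window.window = window;
    return window;
  }

  // Attribute setting module 630: record the current page URL.
  setAttribute() {
    this.window.location.href = this.pageUrl;
  }

  // Script code execution module 620: run the script against the
  // simulated objects, ignoring any exception it raises.
  execute(code) {
    const w = this.window;
    try {
      new Function("window", "document", "location", code)(
        w, w.document, w.location);
    } catch (err) {
      // Non-essential errors do not stop the check.
    }
  }

  // Change checking module 640.
  hrefChanged() {
    return this.window.location.href !== this.pageUrl;
  }

  // Tag checking module 660 (assumed tag form: meta refresh).
  hasRedirectTag() {
    return /<meta\b[^>]*http-equiv\s*=\s*["']?refresh/i.test(this.window._html);
  }

  // Redirect determination module 650.
  judge(code) {
    this.execute(code);
    if (this.hrefChanged()) {
      return { redirected: true, url: this.window.location.href };
    }
    if (this.hasRedirectTag()) return { redirected: true, url: null };
    return { redirected: false, url: null };
  }
}

const viaHref = new RedirectRecognizer("http://example.com/a")
  .judge('location.href = "http://example.com/c";');
const viaTag = new RedirectRecognizer("http://example.com/a")
  .judge(`window.write('<meta http-equiv="refresh" content="0;url=/d">');`);
```

In the tag case the sketch reports only that a redirect is required and leaves `url` as null, since the source states that the redirect goes to the page corresponding to the page file without describing how a URL is extracted.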
As shown in Fig. 7, an embodiment of the present invention provides an apparatus for identifying webpage redirects in which, compared with the embodiment corresponding to Fig. 5, the script code execution module 520 divides the script code into a plurality of segments and executes each segment in turn.
Further, in one implementation of this embodiment, the script code execution module 520 divides the script code into the plurality of segments in units of the statements and/or statement blocks of the script code.
Further, in another implementation of this embodiment, the apparatus also comprises: a statement identification module 540, which searches the script code for a first symbol and identifies the statements in the script code according to the first symbol found; and/or a statement block identification module 550, which searches the script code for a second symbol and identifies the statement blocks in the script code according to the second symbol found. For JavaScript code, statements are normally delimited by the semicolon, which serves as the first symbol, and code blocks are delimited by braces, which serve as the second symbol.
The apparatus also comprises: an exception catching module 560, configured to catch any exception raised when each segment is executed.
In the technical solution of this embodiment, each code segment is executed in turn in the JavaScript context environment described above, and each execution should be wrapped in try...catch to catch exceptions, so that non-essential exceptions do not cause the whole program to exit.
Intrinsic not relevant to any certain computer, virtual system or miscellaneous equipment with display at this algorithm provided.Various general-purpose system also can with use based on together with this teaching.According to description above, the structure constructed required by this type systematic is apparent.In addition, the present invention is not also for any certain programmed language.It should be understood that and various programming language can be utilized to realize content of the present invention described here, and the description done language-specific is above to disclose preferred forms of the present invention.
In instructions provided herein, describe a large amount of detail.But can understand, embodiments of the invention can be put into practice when not having these details.In some instances, be not shown specifically known method, structure and technology, so that not fuzzy understanding of this description.
Similarly, be to be understood that, in order to simplify the disclosure and to help to understand in each inventive aspect one or more, in the description above to exemplary embodiment of the present invention, each feature of the present invention is grouped together in single embodiment, figure or the description to it sometimes.But, the method for the disclosure should be construed to the following intention of reflection: namely the present invention for required protection requires feature more more than the feature clearly recorded in each claim.Or rather, as claims below reflect, all features of disclosed single embodiment before inventive aspect is to be less than.Therefore, the claims following embodiment are incorporated to this embodiment thus clearly, and wherein each claim itself is as independent embodiment of the present invention.
Those skilled in the art are appreciated that and adaptively can change the module in the equipment in embodiment and they are arranged in one or more equipment different from this embodiment.Module in embodiment or unit or assembly can be combined into a module or unit or assembly, and multiple submodule or subelement or sub-component can be put them in addition.Except at least some in such feature and/or process or unit be mutually repel except, any combination can be adopted to combine all processes of all features disclosed in this instructions (comprising adjoint claim, summary and accompanying drawing) and so disclosed any method or equipment or unit.Unless expressly stated otherwise, each feature disclosed in this instructions (comprising adjoint claim, summary and accompanying drawing) can by providing identical, alternative features that is equivalent or similar object replaces.
In addition, those skilled in the art will appreciate that although some embodiments described herein include certain features that are included in other embodiments and not others, combinations of features of different embodiments are meant to fall within the scope of the invention and to form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.
The component embodiments of the invention may be implemented in hardware, in software modules running on one or more processors, or in a combination of the two. Those skilled in the art will understand that a microprocessor or a digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of the method and device for recognizing webpage redirects according to embodiments of the invention. The invention may also be implemented as a device or apparatus program (for example, a computer program and a computer program product) for performing part or all of the method described herein. Such a program implementing the invention may be stored on a computer-readable medium, or may take the form of one or more signals. Such signals may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
It should be noted that the above embodiments illustrate rather than limit the invention, and that those skilled in the art can design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference sign placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third and so on does not indicate any order; these words may be interpreted as names.

Claims (10)

1. A method for recognizing a webpage redirect, comprising:
simulating an object in a browser that is used for handling webpage redirects, to obtain a simulated object for handling webpage redirects;
extracting script code from a webpage to be recognized, and calling the simulated object to execute the script code;
determining, according to the result of executing the script code, whether the webpage to be recognized performs a redirect.
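The extraction step can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the helper name `extractInlineScripts` and the sample page are hypothetical, and a regular expression stands in for a real HTML parser, so external scripts, comments and malformed markup are not handled.

```typescript
// Hypothetical helper: collect the bodies of inline <script> elements.
// Assumption: a regular expression is good enough for a sketch; a real
// implementation would more likely use an HTML parser.
function extractInlineScripts(html: string): string[] {
  const pattern = /<script\b[^>]*>([\s\S]*?)<\/script>/gi;
  const scripts: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(html)) !== null) {
    const body = (match[1] ?? "").trim();
    // Elements that only reference an external file have an empty body.
    if (body.length > 0) {
      scripts.push(body);
    }
  }
  return scripts;
}

// Hypothetical sample page with one inline script and one external script.
const samplePage: string = [
  "<html><head>",
  '<script type="text/javascript">location.href = "http://example.com/next";</script>',
  '<script src="lib.js"></script>',
  "</head><body>Hello</body></html>",
].join("\n");

const extracted: string[] = extractInlineScripts(samplePage);
```

Only the inline script body ends up in `extracted`; the element that merely points at `lib.js` contributes nothing to execute.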
2. The method according to claim 1, wherein simulating the object in the browser that is used for handling webpage redirects, to obtain the simulated object for handling webpage redirects, further comprises:
simulating only the methods and/or attributes of the object in the browser that are used for handling webpage redirects, to obtain a simulated context environment, the script code being executed in the simulated context environment.
3. The method according to any one of claims 1-2, wherein the simulated context environment comprises an attribute for recording the page to be redirected to; and determining, according to the result of executing the script code, whether the webpage to be recognized performs a redirect further comprises:
setting, according to the URL of the webpage, the value of the attribute for recording the page to be redirected to;
after the script code is executed, checking whether the value of the attribute for recording the page to be redirected to has changed;
when the value of the attribute for recording the page to be redirected to has changed, determining that a redirect to the page indicated by the changed value is required.
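The set-execute-compare sequence in this claim can be illustrated with a small sketch. Everything named here is an assumption made for illustration: the claim does not name the attribute, so a `location.href` property is assumed, and `buildContext`, `currentTarget` and `detectRedirect` are hypothetical helpers. The script is run through the standard `Function` constructor with a `with` block, which resolves identifiers against the simulated context first; this is not a security sandbox.

```typescript
// Hypothetical shapes: only what redirect handling needs is simulated.
interface SimulatedLocation {
  href: string;
}

interface SimulatedContext {
  // A script may write location.href or replace location with a string.
  location: SimulatedLocation | string;
  window?: SimulatedContext;
}

function buildContext(pageUrl: string): SimulatedContext {
  // Step 1: initialise the recording attribute from the page's own URL.
  const ctx: SimulatedContext = { location: { href: pageUrl } };
  ctx.window = ctx; // window.location and location share the same slot
  return ctx;
}

function currentTarget(ctx: SimulatedContext): string {
  return typeof ctx.location === "string" ? ctx.location : ctx.location.href;
}

function detectRedirect(pageUrl: string, scriptCode: string): string | null {
  const ctx = buildContext(pageUrl);
  // Step 2: execute the script against the simulated context (sloppy mode,
  // so the `with` statement is allowed inside the generated function).
  const run = new Function("ctx", "with (ctx) {\n" + scriptCode + "\n}");
  try {
    run(ctx);
  } catch {
    // Objects that are not simulated throw; the attribute is still checked.
  }
  // Step 3: a changed value means the page redirects to that value.
  const target = currentTarget(ctx);
  return target !== pageUrl ? target : null;
}
```

For example, `detectRedirect("http://example.com/", 'location.href = "http://example.com/next";')` yields the new address, while a script that never touches `location` yields `null`.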
4. The method according to any one of claims 1-3, wherein the simulated context environment comprises a method for generating a page file, and the presence of a predetermined redirect tag in the page file indicates that the page corresponding to the page file is the page to be redirected to; the method further comprises:
when the value of the attribute for recording the page to be redirected to has not changed, checking whether the redirect tag exists in the page file;
when the redirect tag exists in the page file, determining that a redirect to the page corresponding to the page file is required.
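A sketch of this fallback check follows, under assumptions that the claim itself does not fix: the page-generating method is taken to be a simulated `document.write`, and the predetermined redirect tag is taken to be `<meta http-equiv="refresh" content="N;url=TARGET">`. The names `makeDocument`, `findMetaRefreshTarget` and `detectRedirectWithTagCheck` are hypothetical, and only writes to `location.href` are tracked in this simplified version.

```typescript
interface SimulatedDocument {
  written: string[];
  write(markup: string): void;
}

// Simulated document whose write() only records the generated markup.
function makeDocument(): SimulatedDocument {
  const written: string[] = [];
  return {
    written,
    write: (markup: string): void => {
      written.push(String(markup));
    },
  };
}

// Assumed redirect tag: <meta http-equiv="refresh" content="N;url=TARGET">.
function findMetaRefreshTarget(pageFile: string): string | null {
  const metaTags = pageFile.match(/<meta\b[^>]*>/gi) ?? [];
  for (const tag of metaTags) {
    if (!/http-equiv\s*=\s*["']?refresh/i.test(tag)) {
      continue;
    }
    const url = /url\s*=\s*['"]?([^'">\s;]+)/i.exec(tag);
    if (url !== null) {
      return url[1] ?? null;
    }
  }
  return null;
}

function detectRedirectWithTagCheck(pageUrl: string, scriptCode: string): string | null {
  const simulatedLocation = { href: pageUrl };
  const simulatedDocument = makeDocument();
  const ctx = { location: simulatedLocation, document: simulatedDocument };
  const run = new Function("ctx", "with (ctx) {\n" + scriptCode + "\n}");
  try {
    run(ctx);
  } catch {
    // Calls to objects that are not simulated throw and are ignored here.
  }
  // First check: did the recording attribute change?
  if (simulatedLocation.href !== pageUrl) {
    return simulatedLocation.href;
  }
  // Fallback: look for the redirect tag in the generated page file.
  return findMetaRefreshTarget(simulatedDocument.written.join(""));
}
```

The tag check runs only when the attribute is unchanged, mirroring the order of the two conditions in the claim.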
5. The method according to any one of claims 1-4, wherein calling the simulated object to execute the script code specifically comprises:
dividing the script code into a plurality of segments and executing each segment in turn;
catching exceptions after each segment is executed.
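Because only redirect-related objects are simulated, ordinary page code is likely to throw when it touches anything else. Running the pieces one at a time and catching exceptions per piece lets a later redirect statement still execute. The sketch below assumes the script has already been divided; `runSegments`, `RunReport` and the `initMenu` call are hypothetical, and each piece runs in its own generated function, so local `var` declarations are not shared between pieces.

```typescript
interface RunReport {
  target: string | null; // changed location.href, if any
  errors: string[];      // names of the exceptions that were caught
}

function runSegments(pageUrl: string, segments: string[]): RunReport {
  const ctx: { location: { href: string } } = { location: { href: pageUrl } };
  const errors: string[] = [];
  for (const segment of segments) {
    try {
      // Building the function inside the try block also catches syntax
      // errors caused by an imperfect split.
      const run = new Function("ctx", "with (ctx) {\n" + segment + "\n}");
      run(ctx);
    } catch (err) {
      errors.push(err instanceof Error ? err.name : String(err));
    }
  }
  const href = ctx.location.href;
  return { target: href !== pageUrl ? href : null, errors };
}

// initMenu is not simulated, so the first piece throws a ReferenceError;
// the second piece still runs and records the redirect target.
const segmentReport: RunReport = runSegments("http://example.com/", [
  "initMenu();",
  'location.href = "http://example.com/next";',
]);
```

Had the two statements been executed as a single unit, the exception from the first would have prevented the second from ever being reached.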
6. The method according to any one of claims 1-5, wherein dividing the script code into a plurality of segments and executing each segment in turn specifically comprises:
dividing the script code into the plurality of segments in units of the statements and/or statement blocks in the script code.
7. The method according to any one of claims 1-6, further comprising, before dividing the script code into the plurality of segments in units of the statements and/or statement blocks in the script code:
searching the script code for a first symbol, and identifying the statements in the script code according to the first symbol found; and/or
searching the script code for a second symbol, and identifying the statement blocks in the script code according to the second symbol found.
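One way to picture the symbol-based division is the scanner below. The claims do not say which symbols are used, so this sketch assumes the first symbol is the semicolon that ends a statement and the second symbol is the pair of braces that delimits a statement block. `splitScript` is a hypothetical helper; it skips string literals and parenthesised headers but does not understand comments, regular-expression literals, template literals, or `else` branches that follow a closing brace.

```typescript
// Hypothetical splitter: statements end at ";" (assumed first symbol) and
// statement blocks end at the "}" that closes the outermost "{" (assumed
// second symbol).
function splitScript(code: string): string[] {
  const segments: string[] = [];
  let current = "";
  let braceDepth = 0;
  let parenDepth = 0;
  let quote: string | null = null;

  const flush = (): void => {
    const text = current.trim();
    if (text.length > 0) {
      segments.push(text);
    }
    current = "";
  };

  for (let i = 0; i < code.length; i++) {
    const ch = code.charAt(i);
    current += ch;
    if (quote !== null) {
      if (ch === "\\") {
        // Copy the escaped character so it cannot close the string.
        i++;
        current += code.charAt(i);
      } else if (ch === quote) {
        quote = null;
      }
      continue;
    }
    if (ch === '"' || ch === "'") {
      quote = ch;
    } else if (ch === "(") {
      parenDepth++;
    } else if (ch === ")") {
      parenDepth = Math.max(0, parenDepth - 1);
    } else if (ch === "{") {
      braceDepth++;
    } else if (ch === "}") {
      braceDepth = Math.max(0, braceDepth - 1);
      if (braceDepth === 0 && parenDepth === 0) {
        flush(); // end of a statement block
      }
    } else if (ch === ";" && braceDepth === 0 && parenDepth === 0) {
      flush(); // end of a top-level statement
    }
  }
  flush();
  return segments;
}

const splitResult: string[] = splitScript(
  'initMenu(); if (ok) { a(); b(); } location.href = "http://example.com/a;b";'
);
```

Here the semicolons inside the braces and inside the quoted URL do not start new pieces, so `splitResult` holds three entries: the call, the whole `if` block, and the assignment.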
8. A device for recognizing a webpage redirect, comprising:
a simulation module, configured to simulate an object in a browser that is used for handling webpage redirects, to obtain a simulated object for handling webpage redirects;
a script code execution module, configured to extract script code from a webpage to be recognized, and to call the simulated object to execute the script code;
a redirect determination module, configured to determine, according to the result of executing the script code, whether the webpage to be recognized performs a redirect.
9. The device according to claim 8, wherein
the simulation module simulates only the methods and/or attributes of the object in the browser that are used for handling webpage redirects, to obtain a simulated context environment, the script code being executed in the simulated context environment.
10. The device according to any one of claims 8-9, wherein the attributes of the simulated object comprise an attribute for recording the page to be redirected to; the device further comprises:
an attribute setting module, configured to set, according to the URL of the webpage, the value of the attribute for recording the page to be redirected to;
a change checking module, configured to check, after the script code is executed, whether the value of the attribute for recording the page to be redirected to has changed;
the redirect determination module determines, when the value of the attribute for recording the page to be redirected to has changed, that a redirect to the page indicated by the changed value is required.
CN201510150008.0A 2015-03-31 2015-03-31 Method and device for recognizing webpage skipping Active CN104731949B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510150008.0A CN104731949B (en) 2015-03-31 2015-03-31 Method and device for recognizing webpage skipping

Publications (2)

Publication Number Publication Date
CN104731949A (en) 2015-06-24
CN104731949B (en) 2017-05-03

Family

ID=53455836

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510150008.0A Active CN104731949B (en) 2015-03-31 2015-03-31 Method and device for recognizing webpage skipping

Country Status (1)

Country Link
CN (1) CN104731949B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030033606A1 (en) * 2001-08-07 2003-02-13 Puente David S. Streaming media publishing system and method
CN102436564A (en) * 2011-12-30 2012-05-02 奇智软件(北京)有限公司 Method and device for identifying falsified webpage
CN102819451A (en) * 2011-06-09 2012-12-12 深圳市财付通科技有限公司 Method and system for calling browser plug-in

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108280102A (en) * 2017-02-08 2018-07-13 广州市动景计算机科技有限公司 Internet behavior recording method, device and user terminal
CN108280102B (en) * 2017-02-08 2020-12-08 阿里巴巴(中国)有限公司 Internet surfing behavior recording method and device and user terminal

Also Published As

Publication number Publication date
CN104731949B (en) 2017-05-03

Similar Documents

Publication Publication Date Title
US9983984B2 (en) Automated modularization of graphical user interface test cases
CN101964025B (en) XSS detection method and equipment
CN105335404B (en) Page info loading method and device
CN109829096B (en) Data acquisition method and device, electronic equipment and storage medium
CN103632100A (en) Method and device for detecting website bugs
US20110320415A1 (en) Piecemeal list prefetch
CN105095067A (en) User interface element object identification and automatic test method and apparatus
CN104462583A (en) Browser device for advertisement blocking processing and mobile terminal
CN104168250B (en) Business Process Control method and device based on CGI frames
CN103647678A (en) Method and device for online verification of website vulnerabilities
CN105430002A (en) Vulnerability detection method and device
CN106844486A (en) Crawl the method and device of dynamic web page
CN104036019A (en) Method and device for opening webpage links
CN103455758A (en) Method and device for identifying malicious website
CN102867144A (en) Method and device for detecting and removing computer viruses
CN106371987A (en) Test method and device
CN104899217A (en) Method and apparatus for implementing customized function
CN103631906A (en) Method and device for recognizing page number identification in webpage URL
CN102917053B (en) A kind of method, apparatus and system for judging webpage urlrewriting
CN103838865A (en) Method and device for mining timeliness seed page
CN104731949A (en) Method and device for recognizing webpage skipping
CN103955548B (en) A kind of webpage rendering intent and device
CN105426500A (en) Extraction method and device of link dynamically generated by webpage scripts
CN105204907A (en) Browser starting method and device
CN115809193A (en) Front-end reverse abnormal data robustness detection method, device and storage medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220725

Address after: Room 801, 8th floor, No. 104, floors 1-19, building 2, yard 6, Jiuxianqiao Road, Chaoyang District, Beijing 100015

Patentee after: BEIJING QIHOO TECHNOLOGY Co.,Ltd.

Address before: 100088 room 112, block D, 28 new street, new street, Xicheng District, Beijing (Desheng Park)

Patentee before: BEIJING QIHOO TECHNOLOGY Co.,Ltd.

Patentee before: Qizhi software (Beijing) Co.,Ltd.

TR01 Transfer of patent right