CN103034711B - Form recognition method and device - Google Patents

Form recognition method and device

Info

Publication number
CN103034711B
Authority
CN
China
Prior art keywords
webpage
attribute
data
web page
code
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201210529911.4A
Other languages
Chinese (zh)
Other versions
CN103034711A (en)
Inventor
蔡磊
张骏
万振
傅盛
徐鸣
王昆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhuhai Baohaowan Technology Co Ltd
Original Assignee
Beijing Kingsoft Internet Security Software Co Ltd
Conew Network Technology Beijing Co Ltd
Beijing Cheetah Mobile Technology Co Ltd
Beijing Cheetah Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Kingsoft Internet Security Software Co Ltd, Conew Network Technology Beijing Co Ltd, Beijing Cheetah Mobile Technology Co Ltd, Beijing Cheetah Network Technology Co Ltd
Priority to CN201210529911.4A
Publication of CN103034711A
Application granted
Publication of CN103034711B

Landscapes

  • Information Transfer Between Computers (AREA)

Abstract

The invention discloses a form recognition method and a form recognition device. The form recognition method comprises the following steps: receiving an access instruction; loading the web page corresponding to the access instruction; scanning the web page code of the loaded web page; judging whether the scanned web page code includes an element whose attribute is a first preset attribute; judging whether the scanned web page code includes an element whose attribute is a second preset attribute; and, if the scanned web page code includes both an element whose attribute is the first preset attribute and an element whose attribute is the second preset attribute, determining that the loaded web page is a form web page. The method and the device solve the problem of the low form recognition rate in the prior art and thereby improve the form recognition rate.

Description

Form recognition method and apparatus
Technical field
The present invention relates to the field of data processing, and in particular to a form recognition method and apparatus.
Background
A dual-core browser is a browser with two kernels, a Trident kernel and a Webkit kernel. Trident is the kernel used by Internet Explorer (IE). IE has the highest adoption rate in China, and many websites, such as online banking and online payment sites, only consider compatibility with IE and do not conform to the World Wide Web Consortium (W3C) standards. Webkit has the most complete support for the W3C standards and is also fast. Combining the compatibility of the Trident kernel with the speed of the Webkit kernel, a dual-core browser meets the needs of different users.
In the prior art, form recognition in a dual-core browser with a Trident kernel and a Webkit kernel identifies the HTML (Hypertext Markup Language) form in a web page as follows: the user fills in the form on the page and clicks the submit button; after the submit event has executed, its result is used to judge whether the form was submitted successfully. If it was, the form data is stored in a database, which can hold multiple fields of the form, and the stored record is treated as one successful form. As this description shows, the prior art has to evaluate multiple fields of the successfully submitted form, and the form is recognized only when all of those fields satisfy the conditions. Recognizing a form through multiple fields lowers the form recognition rate. In addition, when the user later fills in a form, the multiple fields in the database must each be matched before it can be judged whether the form being filled in is the form of the current web page, and the form can be filled in normally only when it is. This makes the operation inconvenient and degrades the user experience.
No effective solution has yet been proposed for the low form recognition rate in the related art.
Summary of the invention
The main object of the present invention is to provide a form recognition method and apparatus that solve the problem of the low form recognition rate in the prior art.
To achieve this object, according to one aspect of the invention, a form recognition method is provided, including: receiving an access instruction; loading the web page corresponding to the access instruction; scanning the web page code of the loaded web page; judging whether the scanned web page code includes an element whose attribute is a first preset attribute, where the element corresponding to the first preset attribute is a password element; judging whether the scanned web page code includes an element whose attribute is a second preset attribute, where the element corresponding to the second preset attribute is a user name element; and, if the scanned web page code includes both an element whose attribute is the first preset attribute and an element whose attribute is the second preset attribute, determining that the loaded web page is a form web page.
Further, scanning the web page code of the loaded web page includes: obtaining the type of the kernel that produced the access instruction; if the kernel type is Trident, injecting preset script code into the web page code in order to scan the web page code; and if the kernel type is Webkit, scanning the input controls in the DOM tree of the web page code.
Further, after determining that the loaded web page is a form web page, the form recognition method also includes: judging whether a trigger instruction has been received, where the trigger instruction is used to submit the form web page; and, if the trigger instruction has been received, determining that the form web page is a valid form.
Further, when the kernel that produced the access instruction is a Trident kernel, judging whether the trigger instruction has been received includes: obtaining the element in the web page code whose attribute is a third preset attribute, as a first element, where the element corresponding to the third preset attribute is the submit event; copying the first element to obtain a second element; covering the first element with the second element; and judging whether the second element has been executed and, if it has, determining that the trigger instruction has been received.
Further, when the kernel that produced the access instruction is a Webkit kernel, judging whether the trigger instruction has been received includes: obtaining the element in the web page code whose attribute is the third preset attribute, as a first element, where the element corresponding to the third preset attribute is the submit event; and judging whether the first element has been executed and, if it has, determining that the trigger instruction has been received.
Further, after determining that the loaded web page is a form web page and before judging whether the trigger instruction has been received, the form recognition method also includes: obtaining the element whose attribute is the first preset attribute, as the password element; obtaining the element whose attribute is the second preset attribute, as the user name element; querying a preset database to judge whether password data and user name data have already been saved in it, where the password data is the data corresponding to the password element and the user name data is the data corresponding to the user name element; and, if both the password data and the user name data have been saved in the preset database, adding the password data to the password element of the loaded web page and adding the user name data to the user name element of the loaded web page.
Further, after determining that the loaded web page is a form web page and before judging whether the trigger instruction has been received, the form recognition method also includes: obtaining the element whose attribute is the first preset attribute, as the password element; obtaining the element whose attribute is the second preset attribute, as the user name element; querying the preset database to judge whether password data and user name data have already been saved in it, where the password data is the data corresponding to the password element and the user name data is the data corresponding to the user name element; if the user name data has been saved in the preset database but the password data has not, adding the user name data to the user name element of the loaded web page and receiving the password data entered by the user; and, if neither the user name data nor the password data has been saved in the preset database, receiving the password data and the user name data entered by the user.
Further, after judging that the trigger instruction has been received, the form recognition method also includes: displaying a preset pop-up window containing prompt content, where the prompt content asks the user to choose whether to save the password data and the user name data, or whether to save the password data; receiving a selection instruction from the user; and, when the selection instruction indicates that the user chose to save, saving the password data and the user name data to the preset database, or saving the password data to the preset database.
To achieve this object, according to another aspect of the invention, a form recognition device is provided for performing any of the form recognition methods provided above.
To achieve this object, according to another aspect of the invention, a form recognition device is provided, including: a receiving unit for receiving an access instruction; a loading unit for loading the web page corresponding to the access instruction; a scanning unit for scanning the web page code of the loaded web page; a first judging unit for judging whether the scanned web page code includes an element whose attribute is a first preset attribute, where the element corresponding to the first preset attribute is a password element; a second judging unit for judging whether the scanned web page code includes an element whose attribute is a second preset attribute, where the element corresponding to the second preset attribute is a user name element; and a determining unit for determining that the loaded web page is a form web page if the scanned web page code includes both an element whose attribute is the first preset attribute and an element whose attribute is the second preset attribute.
Further, the scanning unit includes: a first obtaining subunit for obtaining the type of the kernel that produced the access instruction; a first scanning subunit for injecting preset script code into the web page code in order to scan the web page code when the kernel type is Trident; and a second scanning subunit for scanning the input controls in the DOM tree of the web page code when the kernel type is Webkit.
The present invention receives an access instruction; loads the web page corresponding to the access instruction; scans the web page code of the loaded web page; judges whether the scanned web page code includes an element whose attribute is a first preset attribute, where the element corresponding to the first preset attribute is a password element; judges whether the scanned web page code includes an element whose attribute is a second preset attribute, where the element corresponding to the second preset attribute is a user name element; and, if the scanned web page code includes both, determines that the loaded web page is a form web page. Scanning the web page code of the page that the user's access loads makes it possible to monitor the web page code, and hence the attributes of each element in it, so that it can be detected quickly whether the loaded web page contains elements that satisfy the preset attributes (that is, the password element and the user name element are detected quickly). Because only the user name field and the password field of the web page are monitored, scanning the web page code of the loaded web page is enough to recognize whether it is a form web page. Compared with the prior-art recognition method, which has to evaluate multiple fields of the successfully submitted form, this greatly reduces the complexity of form recognition, solves the problem of the low form recognition rate in the prior art, and thereby improves the form recognition rate.
Brief description of the drawings
The accompanying drawings, which form part of this application, provide a further understanding of the present invention. The illustrative embodiments of the invention and their description explain the invention and do not unduly limit it. In the drawings:
Fig. 1 is a flowchart of the form recognition method according to an embodiment of the present invention;
Fig. 2 is a flowchart of how the form recognition method according to an embodiment of the present invention scans for the password element and the user name element in a Webkit kernel browser;
Fig. 3 is a flowchart of how the form recognition method according to an embodiment of the present invention judges whether a trigger instruction has been received in a Trident kernel browser;
Fig. 4 shows the form recognition method according to an embodiment of the present invention as applied to a Trident kernel browser;
Fig. 5 shows the form recognition method according to an embodiment of the present invention as applied to a Webkit kernel browser; and
Fig. 6 is a schematic diagram of the form recognition device according to an embodiment of the present invention.
Detailed description
It should be noted that, where they do not conflict, the embodiments in this application and the features of those embodiments may be combined with one another. The present invention is described in detail below with reference to the drawings and in conjunction with the embodiments.
An embodiment of the present invention provides a form recognition method, which is described in detail below:
Fig. 1 is a flowchart of the form recognition method according to an embodiment of the present invention. As shown in Fig. 1, the method includes steps S101 to S106:
S101: receive an access instruction from the user. Specifically, when the user wants to visit a website, the user can type the address of the website, or enter it by following a link, to open the web page; at that point the user's access instruction is received.
S102: load the web page corresponding to the access instruction, that is, load the HTML at the address corresponding to the access instruction to obtain the web page.
S103: scan the web page code of the loaded web page.
S104: judge whether the scanned web page code includes an element whose attribute is the first preset attribute, where the element corresponding to the first preset attribute is a password element. Specifically, the web page code of the loaded web page is scanned to detect whether an element with the first preset attribute is found. In this embodiment the first preset attribute can be defined as the attribute type="password": if an input element with the attribute type="password" is found in the web page code of the loaded web page, the scanned web page code is determined to include an element whose attribute is the first preset attribute.
S105: judge whether the scanned web page code includes an element whose attribute is the second preset attribute, where the element corresponding to the second preset attribute is a user name element. Specifically, the scan detects whether an element with the second preset attribute is found. In this embodiment the second preset attribute can be defined as being the element closest to the password element that also has the attribute type="text": if an input element with the attribute type="text" is found in the web page code of the loaded web page, the scanned web page code is determined to include an element whose attribute is the second preset attribute.
S106: if the scanned web page code includes both an element whose attribute is the first preset attribute and an element whose attribute is the second preset attribute, determine that the web page loaded in step S102 is a form web page. That is, when the web page code contains both a password element and a user name element, the loaded web page can be determined to be a form web page, which completes the recognition of the form.
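Steps S103 to S106 can be illustrated with a minimal sketch. It assumes the page is available as an HTML string and uses Python's standard `html.parser`; the names `InputScanner`, `is_form_page`, `LOGIN_PAGE` and `SEARCH_PAGE` are hypothetical and do not come from the patent, which performs the scan inside the browser kernel.

```python
from html.parser import HTMLParser


class InputScanner(HTMLParser):
    """Collects the type attribute of every <input> element in document order."""

    def __init__(self):
        super().__init__()
        self.input_types = []

    def handle_starttag(self, tag, attrs):
        if tag == "input":
            # A missing or valueless type attribute is recorded as "".
            self.input_types.append((dict(attrs).get("type") or "").lower())


def is_form_page(html: str) -> bool:
    """Return True when the page has both a password input and a text input."""
    scanner = InputScanner()
    scanner.feed(html)                                   # S103: scan the page code
    has_password = "password" in scanner.input_types     # S104: first preset attribute
    has_username = "text" in scanner.input_types         # S105: second preset attribute
    return has_password and has_username                 # S106: both present


# Hypothetical sample pages used only for illustration.
LOGIN_PAGE = '<form><input type="text" name="user"><input type="password" name="pw"></form>'
SEARCH_PAGE = '<form><input type="text" name="q"></form>'
```

With these sample pages, `is_form_page(LOGIN_PAGE)` is true and `is_form_page(SEARCH_PAGE)` is false, because the search page has no password input. The sketch checks only for the presence of the two input types; the "closest to the password element" condition from S105 is left out here.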
The form recognition method of this embodiment scans the web page code of the page that the user's access loads. This makes it possible to monitor the web page code, and hence the attributes of each element in it, so that it can be detected quickly whether the loaded web page contains elements that satisfy the preset attributes (that is, the password element and the user name element are detected quickly). Because only the user name field and the password field of the web page are monitored, scanning the web page code of the loaded web page is enough to recognize whether it is a form web page. Compared with the prior-art recognition method, which has to evaluate multiple fields of the successfully submitted form, this greatly reduces the complexity of form recognition, solves the problem of the low form recognition rate in the prior art, and thereby improves the form recognition rate.
Specifically, the user may issue the access instruction through browser kernels of different types, and step S103 of the recognition method provided by this embodiment is handled differently for each kernel.
When the web page code of the loaded web page is scanned, the type of the kernel that produced the access instruction is obtained first, and a scan method is then chosen according to that type. When the kernel type is Trident, preset JavaScript code (JS script code) is injected into the web page code once the web page corresponding to the access instruction has finished loading, and the JS script code then scans and monitors the HTML of the loaded web page. When the kernel type is Webkit, the input controls in the DOM tree of the web page code can be scanned directly. The DOM tree here refers to the HTML Document Object Model (HTML DOM), the document object model adapted and optimized specifically for HTML/XHTML.
Fig. 2 shows the flow for scanning the input controls in the DOM tree to determine the password element and the user name element. When the scan finds generic input controls in the DOM tree, the input element whose attribute is type="password" is taken directly as the password input box, and the closest editable input element above the password input box in the DOM tree whose attribute is type="text" is taken as the user name input box. When the input controls found in the DOM tree are not generic ones, and the web page has the automatic form-filling function enabled, the user name element and the password element are located by a compound condition built from the web page's OriginURL, the element id, the element name attribute and other elements.
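The generic-control case of Fig. 2 can be sketched as follows. This is a sketch under stated assumptions: document order stands in for "above the password input box in the DOM tree", an input counts as editable when it carries neither `readonly` nor `disabled`, and the names `LoginFieldLocator` and `locate_login_fields` are hypothetical.

```python
from html.parser import HTMLParser
from typing import Optional, Tuple


class LoginFieldLocator(HTMLParser):
    """Records the attributes of every <input> element in document order."""

    def __init__(self):
        super().__init__()
        self.inputs = []

    def handle_starttag(self, tag, attrs):
        if tag == "input":
            self.inputs.append(dict(attrs))


def _is_editable_text(attrs: dict) -> bool:
    # Assumption: "editable" means no readonly or disabled attribute.
    is_text = (attrs.get("type") or "").lower() == "text"
    locked = "readonly" in attrs or "disabled" in attrs
    return is_text and not locked


def locate_login_fields(html: str) -> Optional[Tuple[Optional[str], Optional[str]]]:
    """Return (user name field name, password field name), or None if not found."""
    locator = LoginFieldLocator()
    locator.feed(html)
    for index, attrs in enumerate(locator.inputs):
        if (attrs.get("type") or "").lower() != "password":
            continue
        # Walk backwards from the password box to the nearest editable text input.
        for candidate in reversed(locator.inputs[:index]):
            if _is_editable_text(candidate):
                return candidate.get("name"), attrs.get("name")
    return None


# Hypothetical page: a search box, the user name box, a read-only hint, the password box.
PAGE = (
    '<input type="text" name="search">'
    '<input type="text" name="user">'
    '<input type="text" name="hint" readonly>'
    '<input type="password" name="pw">'
)
```

For `PAGE`, the locator skips the read-only hint and the earlier search box and pairs `user` with `pw`. A real DOM-based implementation could measure closeness by tree position or layout instead of document order.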
After the loaded web page has been determined to be a form web page, that is, after the form has been recognized, the recognition method of this embodiment also includes judging whether a trigger instruction issued by the user has been received, where the trigger instruction is used to submit the form web page. Once the trigger instruction is judged to have been received, the form web page is determined to be a valid form. In other words, after the user submits the loaded form web page, the browser determines that the submitted form is a valid form, so that the form can be recognized more quickly and accurately when the form web page is loaded again. Specifically, when the user submits a form web page that includes a password element and a user name element, the corresponding submit event in the web page code is triggered, so judging whether that submit event has been triggered determines whether the user's trigger instruction has been received.
If the kernel that produced the access instruction is a Trident kernel, the flow for judging whether the trigger instruction has been received is shown in Fig. 3. First, the JS script code scans the web page code of the loaded web page to obtain the element whose attribute is the third preset attribute (hereinafter the first element). When the submit button in the web page code follows the generic rule, the third preset attribute can be defined as the attribute type="submit", so obtaining the first element means obtaining the element in the web page code whose attribute is type="submit", that is, finding the submit button in the loaded web page. When the submit button in the web page code does not follow the generic rule, the submit button is located by a condition built from the element, its attributes and other elements.
The first element is then copied to obtain the second element, and the first element is covered with the second element. That is, the clone method in the preset JavaScript code clones the submit button, the cloned button is placed in front of the original submit button, and a time-based naming scheme gives the cloned button a complex, unique id.
Finally, it is judged whether the second element has been executed. To submit the password element and the user name element, the user has to trigger the submit event, and at that point the second element is executed. Judging whether the second element has been executed therefore determines whether the user's trigger instruction has been received: when the second element is executed, the trigger instruction is determined to have been received. After the cloned submit button has been handled, the event bound to the original submit button is still executed in its original order. If that event returns true, the form event in the web page code continues to execute; if it returns false, execution stops.
Terms such as "first element" and "second element" are used only to distinguish different elements and do not limit the order of the elements. Similar expressions below are likewise used only for distinction and do not limit order.
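The cover-and-chain behaviour described for the Trident kernel can be modelled outside a browser. The sketch below is only a model, not the injected JavaScript: `Button`, `SubmitMonitor` and `cover` are hypothetical names, the original button's event is represented by a callable returning true or false, and `time.time_ns()` stands in for the time-based naming scheme.

```python
import time
from typing import Callable


class Button:
    """Hypothetical stand-in for a DOM submit button and its click handler."""

    def __init__(self, element_id: str, on_click: Callable[[], bool]):
        self.element_id = element_id
        self.on_click = on_click


class SubmitMonitor:
    """Covers a submit button with a clone so that a submission can be observed."""

    def __init__(self, submit_form: Callable[[], None]):
        self.submit_form = submit_form   # the form event of the page
        self.triggered = False           # set once the clone has been executed

    def cover(self, original: Button) -> Button:
        def on_clone_click() -> bool:
            self.triggered = True            # second element executed: trigger received
            allowed = original.on_click()    # the original button's event still runs
            if allowed:
                self.submit_form()           # true: continue with the form event
            return allowed                   # false: stop here

        # Time-based id so the clone gets a unique, hard-to-collide name.
        unique_id = f"clone_{original.element_id}_{time.time_ns()}"
        return Button(unique_id, on_clone_click)


# Illustrative run: the original handler allows the submission.
events = []
monitor = SubmitMonitor(lambda: events.append("form submitted"))
clone = monitor.cover(Button("login", lambda: True))
clone.on_click()
```

After the run, `monitor.triggered` is true and `events` holds one entry. If the original handler returned false, `triggered` would still be set but the form event would not run, which matches the "return value is false, then stop" rule above.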
If the kernel that produced the access instruction is a Webkit kernel, the input controls in the DOM tree of the web page code are scanned to obtain the element in the web page code of the loaded web page whose attribute is the third preset attribute, as the first element; that is, the submit button in the web page of the Webkit kernel browser is obtained. Whether the user's trigger instruction has been received is then judged by whether the first element has been executed: if the first element has been executed, the trigger instruction is determined to have been received.
Further, the form recognition method of this embodiment also includes processing steps for filling in the form and for saving the form. Fig. 4 shows the whole form recognition method when the browser kernel is Trident, and Fig. 5 shows it when the browser kernel is Webkit. As Fig. 4 and Fig. 5 show, the steps for filling in and saving the form are the same for both kernels.
Specifically, the form is filled in after the loaded web page has been determined to be a form web page and before it is judged whether the trigger instruction has been received. Once the web page code has been judged to include both an element whose attribute is the first preset attribute and an element whose attribute is the second preset attribute, and the user clicks the user name login box, the Renderer process is triggered to capture those two elements, obtaining the password element and the user name element. The Renderer process then sends an IPC request to the main process, so that the preset database of the main process is queried to judge whether the password data corresponding to the password element and the user name data corresponding to the user name element have already been saved in it. The preset database is the database that stores form information; saving form information in it allows form data to be shared between the two kernels of the dual-core browser. Finally, when the password data and the user name data are judged to have been saved in the preset database, the main process sends an IPC request to the Renderer process, which selects the best-matching saved user name data and password data, adds the selected password data to the password element of the loaded web page, and adds the user name data to the user name element of the loaded web page.
The Renderer process selects and matches the user name data and password data as follows. It first judges whether a saved form exists under the current URL. If one does, the password data and user name data of that form are added preferentially to the corresponding elements of the web page, which is an exact match. If no saved form exists under the current URL, the forms saved under the main domain of the current URL are looked up, and the password data and user name data of a form saved under that main domain are added to the corresponding elements of the web page, which is a fuzzy match. For example, if form A is saved under the URL "a.xxx.com" and form B is saved under the URL "b.xxx.com", then when the user opens "a.xxx.com" the user name data and password data of both form A and form B are candidates, but those of form A are preferred as the best match. If no form is saved under the URL "a.xxx.com" and form B is saved under the URL "b.xxx.com", then when the user opens "a.xxx.com" the user name data and password data of form B are the ones selected.
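The exact-then-fuzzy matching rule can be sketched as a lookup over saved forms keyed by host name. The sketch assumes that the "main domain" is simply the last two labels of the host, which is a naive rule that does not handle suffixes such as "co.uk"; `main_domain`, `best_match` and the sample credentials are hypothetical.

```python
from typing import Dict, Optional, Tuple

Credentials = Tuple[str, str]  # (user name data, password data)


def main_domain(host: str) -> str:
    """Naive main-domain rule (an assumption): keep the last two host labels."""
    return ".".join(host.split(".")[-2:])


def best_match(host: str, saved: Dict[str, Credentials]) -> Optional[Credentials]:
    """Pick the saved form for a host: exact match first, then same main domain."""
    # Exact match: a form saved under the current URL wins.
    if host in saved:
        return saved[host]
    # Fuzzy match: fall back to a form saved under the same main domain.
    target = main_domain(host)
    for saved_host, credentials in saved.items():
        if main_domain(saved_host) == target:
            return credentials
    return None


# The example from the text: form A under a.xxx.com, form B under b.xxx.com.
FORM_A = ("alice", "password-a")
FORM_B = ("bob", "password-b")
```

With both forms saved, opening "a.xxx.com" selects form A; with only form B saved, opening "a.xxx.com" falls back to form B; a host under a different main domain yields no match.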
Saving the form is handled as follows. First, after the loaded web page has been determined to be a form web page and before it is judged whether the trigger instruction has been received, when the user clicks the user name login box the Renderer process is triggered to capture the element whose attribute is the first preset attribute and the element whose attribute is the second preset attribute, obtaining the password element and the user name element. The Renderer process sends an IPC request to the main process, so that the preset database of the main process is queried to judge whether the password data corresponding to the password element and the user name data corresponding to the user name element have already been saved in it. When the user name data has been saved in the preset database but the password data has not, the main process sends an IPC request to the Renderer process, which selects the best-matching saved user name data and adds it to the user name element of the loaded web page, and the password data entered by the user is received. When neither the user name data nor the password data has been saved in the preset database, the user name data and password data entered by the user are received directly. This completes the form-filling step.
Second, after the trigger instruction has been received, that is, after the user triggers the login, the Renderer process sends an IPC request to the main process, and the main process triggers a pop-up prompt that asks the user whether to save the password data and the user name data, or whether to save the password data.
Finally, the user's selection instruction is received, and when the selection instruction indicates that the user chose to save, the password data and the user name data are saved to the preset database, or the password data is saved to the preset database.
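The three database cases in the filling and saving steps can be summarized as a small decision function. This is a sketch under assumptions: `FillPlan` and `plan_fill` are hypothetical names, saved data is modelled as optional strings, and a saved password without a saved user name (a case the text does not describe) is treated as nothing saved.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class FillPlan:
    fill_username: Optional[str]      # value to auto-fill into the user name element
    fill_password: Optional[str]      # value to auto-fill into the password element
    ask_user_for: Tuple[str, ...]     # fields the user still has to type
    offer_to_save: Tuple[str, ...]    # fields the pop-up offers to save after submit


def plan_fill(saved_username: Optional[str], saved_password: Optional[str]) -> FillPlan:
    """Decide what to auto-fill, what to request, and what to offer to save."""
    if saved_username is not None and saved_password is not None:
        # Both saved: fill both elements, nothing left to ask or save.
        return FillPlan(saved_username, saved_password, (), ())
    if saved_username is not None:
        # Only the user name is saved: fill it, receive and offer to save the password.
        return FillPlan(saved_username, None, ("password",), ("password",))
    # Nothing usable saved: receive both from the user and offer to save both.
    both = ("username", "password")
    return FillPlan(None, None, both, both)
```

For example, `plan_fill("alice", None)` fills the user name, asks only for the password, and has the pop-up offer to save only the password, which corresponds to the "or save the password data" branch in the text.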
An embodiment of the present invention also provides a form recognition device, which is described in detail below:
Fig. 6 is a schematic diagram of the form recognition device according to an embodiment of the present invention. As shown in Fig. 6, the form recognition device of this embodiment includes a receiving unit 10, a loading unit 20, a scanning unit 30, a first judging unit 40, a second judging unit 50 and a determining unit 60.
Receive unit 10 to be used for receiving access instruction, specifically, when user wants to conduct interviews some websites, can input or link and input the network address of this website to carry out opening webpage, now, receive unit 10 to be received realizing the reception to user's access instruction by the network address of the network address that user is inputted or link input;
HTML in the network address corresponding with access instruction, for loading the webpage corresponding with access instruction, specifically, is loaded, obtains the webpage corresponding with access instruction by loading unit 20;
Scanning element 30 is for being scanned the web page code of the webpage loaded;
First judging unit 40 is for judging whether include, in the web page code scanned, the element that attribute is the first preset attribute, wherein, the element that first preset attribute is corresponding is cryptographic element, specifically, mainly by the web page code of the webpage loaded is scanned, whether detection scanning process can scan the element that attribute is the first preset attribute, in embodiments of the present invention, first preset attribute can be defined as attribute type=" password ", if scanning the input element containing attribute type=" password " in the web page code of the webpage loaded, then determine that the web page code scanned includes the element that attribute is the first preset attribute;
Second judging unit 50 is for judging whether include, in the web page code scanned, the element that attribute is the second preset attribute, wherein, the element that second preset attribute is corresponding is user name element, specifically, mainly by whether detection scanning process can scan the element that attribute is the second preset attribute, in embodiments of the present invention, it is closest that second preset attribute can be defined as distance cryptographic element, and meet attribute type=" text ", if scanning the input element containing attribute type=" text " in the web page code of the webpage loaded, then determine that the web page code scanned includes the element that attribute is the second preset attribute;
If determining, unit 60 is for judging that the web page code scanned includes the element that attribute is the first preset attribute, and also include the element that attribute is the second preset attribute, then determine that the webpage loaded in loading unit 20 is list webpage, i.e., not only cryptographic element is comprised but also comprise user name element in judging web page code, one list webpage when i.e. can determine that the webpage of loading, namely realize the identification to list.
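The two checks and the final determination can be illustrated with a short Python sketch built on the standard library's `html.parser`. It is a minimal illustration, not the device's implementation: the names `InputCollector` and `find_login_fields` are hypothetical, and "closest to the password field" is assumed here to mean closest in document order among `input` elements, since the text does not define the distance measure.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple


class InputCollector(HTMLParser):
    """Record the type of every <input> element in document order."""

    def __init__(self) -> None:
        super().__init__()
        self.input_types: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "input":
            # An absent or empty type attribute defaults to "text" in HTML.
            type_value = dict(attrs).get("type") or "text"
            self.input_types.append(type_value.lower())


def find_login_fields(html: str) -> Optional[Tuple[int, int]]:
    """Return (username_index, password_index) among the page's inputs,
    or None if the page is not recognized as a form page."""
    collector = InputCollector()
    collector.feed(html)
    types = collector.input_types

    # First check: an input with type="password" must exist.
    if "password" not in types:
        return None
    password_index = types.index("password")

    # Second check: a type="text" input must exist; take the nearest one.
    text_indexes = [i for i, t in enumerate(types) if t == "text"]
    if not text_indexes:
        return None
    username_index = min(text_indexes, key=lambda i: abs(i - password_index))
    return username_index, password_index
```

A page is treated as a form page exactly when `find_login_fields` returns a pair rather than `None`; a page with only a search box, or only a password box, is rejected.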
The form recognition device of this embodiment of the present invention scans the web page code of the web page that is loaded when the user accesses it, thereby monitoring the web page code and, in turn, the attributes of each element in it, so as to detect quickly whether the code contains elements that match the preset attributes (that is, to detect the password element and the username element quickly). Because only the username field and the password field of the web page are monitored, scanning the web page code of the loaded web page is enough to recognize whether that page is a form page. Compared with prior-art recognition methods, which must judge multiple fields of the form after it has been submitted successfully, this greatly reduces the complexity of form recognition, solves the problem of the low form recognition rate in the prior art, and thus improves the form recognition rate.
Specifically, users may issue access instructions through browsers with different kernel types, and the scanning unit 30 scans the web page code of the loaded web page in a different way depending on the browser kernel.
When the web page code of the loaded web page is scanned, the first acquiring subunit in the scanning unit 30 first obtains the kernel type that generated the access instruction. When the obtained kernel type is the Trident kernel, the first scanning subunit in the scanning unit 30 injects a preset script code into the web page code to scan the web page code. When the obtained kernel type is the Webkit kernel, the second scanning subunit in the scanning unit 30 scans the input controls in the DOM tree of the web page code.
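The kernel-dependent choice of scan method amounts to a small dispatch on the kernel type. The Python sketch below is illustrative only: the enum `Kernel`, the function `choose_scan_strategy` and the strategy labels are hypothetical names, and the actual script injection and DOM traversal are not shown.

```python
from enum import Enum


class Kernel(Enum):
    TRIDENT = "Trident"
    WEBKIT = "Webkit"


# Hypothetical labels for the two scan methods named in the text.
SCAN_STRATEGIES = {
    Kernel.TRIDENT: "inject_preset_script",    # first scanning subunit
    Kernel.WEBKIT: "scan_dom_input_controls",  # second scanning subunit
}


def choose_scan_strategy(kernel: Kernel) -> str:
    """Pick the scan method for the kernel that issued the access instruction."""
    try:
        return SCAN_STRATEGIES[kernel]
    except KeyError:
        raise ValueError(f"unsupported kernel type: {kernel!r}")
```

Keeping the mapping in one table means that supporting another kernel would only require a new entry and its scan routine.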
Further, after receiving a trigger instruction for submitting the form page, the form recognition device of this embodiment of the present invention can determine that the form page is a valid form, so that the form can be recognized more quickly and accurately when the form page is loaded again. The method for judging whether a trigger instruction has been received is described in detail in the form recognition method provided by the above embodiment of the present invention and is not repeated here.
In addition, the form recognition device of this embodiment of the present invention can also save and fill in the recognized form. The way in which the device saves and fills in the form is the same as the saving and filling steps of the form recognition method provided by the above embodiment of the present invention and is likewise not repeated here.
As can be seen from the above description, by quickly detecting whether the web page code includes a password element and a username element, the present invention greatly reduces the complexity of form recognition and improves the form recognition rate. At the same time, by saving the form data, it allows form data to be shared between the two kernels of a dual-kernel browser, which improves the applicability of the form.
It should be noted that the steps shown in the flowcharts of the accompanying drawings can be executed in a computer system, for example as a set of computer-executable instructions, and that although a logical order is shown in the flowcharts, in some cases the steps shown or described can be executed in an order different from the one given here.
Obviously, those skilled in the art should understand that the modules or steps of the present invention described above can be implemented with general-purpose computing devices. They can be concentrated on a single computing device or distributed over a network formed by multiple computing devices. Optionally, they can be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; alternatively, they can each be made into a separate integrated circuit module, or several of the modules or steps can be made into a single integrated circuit module. The present invention is therefore not limited to any specific combination of hardware and software.
The above are only preferred embodiments of the present invention and are not intended to limit it. For those skilled in the art, the present invention may have various modifications and variations. Any modification, equivalent substitution or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (8)

1. A form recognition method, characterized by comprising:
receiving an access instruction;
loading the web page corresponding to the access instruction;
scanning the web page code of the loaded web page;
judging whether the scanned web page code includes an element whose attribute is a first preset attribute, wherein the element corresponding to the first preset attribute is a password element;
judging whether the scanned web page code includes an element whose attribute is a second preset attribute, wherein the element corresponding to the second preset attribute is a username element; and
if it is judged that the scanned web page code includes an element whose attribute is the first preset attribute and also includes an element whose attribute is the second preset attribute, determining that the loaded web page is a form page,
wherein, after the loaded web page is determined to be a form page, the form recognition method further comprises: judging whether a trigger instruction is received, wherein the trigger instruction is used to submit the form page; and if it is judged that the trigger instruction is received, determining that the form page is a valid form,
wherein, when the kernel type that generated the access instruction is the Trident kernel, judging whether a trigger instruction is received comprises: obtaining the element in the web page code whose attribute is a third preset attribute to obtain a first element, wherein the element corresponding to the third preset attribute is a submit event; copying the first element to obtain a second element; covering the first element with the second element; and judging whether the second element is executed, and if it is judged that the second element is executed, determining that the trigger instruction is received.
2. The form recognition method according to claim 1, characterized in that scanning the web page code of the loaded web page comprises:
obtaining the kernel type that generated the access instruction;
if the obtained kernel type is the Trident kernel, injecting a preset script code into the web page code to scan the web page code; and
if the obtained kernel type is the Webkit kernel, scanning the input controls in the DOM tree of the web page code.
3. The form recognition method according to claim 1, characterized in that, when the kernel type that generated the access instruction is the Webkit kernel, judging whether a trigger instruction is received comprises:
obtaining the element in the web page code whose attribute is the third preset attribute to obtain a first element, wherein the element corresponding to the third preset attribute is a submit event; and
judging whether the first element is executed, and if it is judged that the first element is executed, determining that the trigger instruction is received.
4. The form recognition method according to claim 1, characterized in that, after the loaded web page is determined to be a form page and before it is judged whether a trigger instruction is received, the form recognition method further comprises:
obtaining the element whose attribute is the first preset attribute to obtain the password element;
obtaining the element whose attribute is the second preset attribute to obtain the username element;
querying a preset database to judge whether password data and username data have both been saved in the preset database, wherein the password data is the data corresponding to the password element and the username data is the data corresponding to the username element; and
if it is judged that the password data and the username data have both been saved in the preset database, adding the password data to the password element of the loaded web page and adding the username data to the username element of the loaded web page.
5. The form recognition method according to claim 1, characterized in that, after the loaded web page is determined to be a form page and before it is judged whether a trigger instruction is received, the form recognition method further comprises:
obtaining the element whose attribute is the first preset attribute to obtain the password element;
obtaining the element whose attribute is the second preset attribute to obtain the username element;
querying a preset database to judge whether password data and username data have both been saved in the preset database, wherein the password data is the data corresponding to the password element and the username data is the data corresponding to the username element;
if it is judged that the username data has been saved in the preset database and the password data has not been saved in the preset database, adding the username data to the username element of the loaded web page and receiving the password data input by the user; and
if it is judged that neither the username data nor the password data has been saved in the preset database, receiving the password data and the username data input by the user.
6. The form recognition method according to claim 5, characterized in that, after it is judged that the trigger instruction is received, the form recognition method further comprises:
displaying a preset pop-up, wherein the preset pop-up contains prompt content, and the prompt content is used to prompt the user to choose whether to save the password data and the username data, or to prompt the user to choose whether to save the password data;
receiving a selection instruction from the user; and
when the selection instruction indicates that saving the password data and the username data is selected, saving the password data and the username data to the preset database, or saving the password data to the preset database.
7. A form recognition device, characterized by comprising:
a receiving unit, configured to receive an access instruction;
a loading unit, configured to load the web page corresponding to the access instruction;
a scanning unit, configured to scan the web page code of the loaded web page;
a first judging unit, configured to judge whether the scanned web page code includes an element whose attribute is a first preset attribute, wherein the element corresponding to the first preset attribute is a password element;
a second judging unit, configured to judge whether the scanned web page code includes an element whose attribute is a second preset attribute, wherein the element corresponding to the second preset attribute is a username element; and
a determining unit, configured to determine that the loaded web page is a form page if it is judged that the scanned web page code includes an element whose attribute is the first preset attribute and also includes an element whose attribute is the second preset attribute,
wherein the receiving unit is further configured to judge, after the loaded web page is determined to be a form page, whether a trigger instruction is received, wherein the trigger instruction is used to submit the form page; and if it is judged that the trigger instruction is received, to determine that the form page is a valid form,
wherein, when the kernel type that generated the access instruction is the Trident kernel, the receiving unit is specifically configured to: obtain the element in the web page code whose attribute is a third preset attribute to obtain a first element, wherein the element corresponding to the third preset attribute is a submit event; copy the first element to obtain a second element; cover the first element with the second element; and judge whether the second element is executed, and if it is judged that the second element is executed, determine that the trigger instruction is received.
8. The form recognition device according to claim 7, characterized in that the scanning unit comprises:
a first acquiring subunit, configured to obtain the kernel type that generated the access instruction;
a first scanning subunit, configured to inject a preset script code into the web page code to scan the web page code when the obtained kernel type is the Trident kernel; and
a second scanning subunit, configured to scan the input controls in the DOM tree of the web page code when the obtained kernel type is the Webkit kernel.
CN201210529911.4A 2012-12-10 2012-12-10 Form recognition method and device Active CN103034711B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210529911.4A CN103034711B (en) 2012-12-10 2012-12-10 Form recognition method and device

Publications (2)

Publication Number Publication Date
CN103034711A CN103034711A (en) 2013-04-10
CN103034711B true CN103034711B (en) 2016-08-03

Family

ID=48021605

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210529911.4A Active CN103034711B (en) 2012-12-10 2012-12-10 Form recognition method and device

Country Status (1)

Country Link
CN (1) CN103034711B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104571903A (en) * 2013-10-28 2015-04-29 腾讯科技(深圳)有限公司 Input box switching method and input box switching device
CN109246069B (en) * 2018-06-15 2020-10-16 华为技术有限公司 Webpage login method and device and readable storage medium
CN109460522A (en) * 2018-10-30 2019-03-12 北京网众共创科技有限公司 The acquisition methods and device of site information
CN114510930B (en) * 2022-03-31 2022-07-15 北京圣博润高新技术股份有限公司 Method, device, electronic equipment and medium for auditing operation document

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102663130A (en) * 2012-04-27 2012-09-12 华为技术有限公司 Method and device for submitting webpage data

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100037219A1 (en) * 2008-08-05 2010-02-11 International Buisness Machines Corporation Predictive logic for automatic web form completion
CN102651019B (en) * 2012-03-30 2013-12-04 北京奇虎科技有限公司 Method and device for parsing tagged file

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
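Automatic filling of deep web entry forms; Sun Chong; China Master's Theses Full-text Database, Information Science and Technology; 20070915; I139-230, pp. 25-26 *
A general information extraction model for Web forms; Zhang Zhong; China Master's Theses Full-text Database, Information Science and Technology; 20070815; I138-104, pp. 28-29 *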

Also Published As

Publication number Publication date
CN103034711A (en) 2013-04-10

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 100022 Beijing City, Chaoyang District Chaoyang Road No. 237 Fosun international center 12

Applicant after: BEIJING KINGSOFT INTERNET SECURITY SOFTWARE Co.,Ltd.

Applicant after: Beijing Cheetah Network Technology Co.,Ltd.

Applicant after: Beijing Cheetah Mobile Technology Co.,Ltd.

Applicant after: CONEW NETWORK TECHNOLOGY (BEIJING) Co.,Ltd.

Address before: 100022 Beijing City, Chaoyang District Chaoyang Road No. 237 Fosun international center 12

Applicant before: BEIJING KINGSOFT INTERNET SECURITY SOFTWARE Co.,Ltd.

Applicant before: BEIJING KINGSOFT NETWORK TECHNOLOGY Co.,Ltd.

Applicant before: SHELL INTERNET (BEIJING) SECURITY TECHNOLOGY Co.,Ltd.

Applicant before: CONEW NETWORK TECHNOLOGY (BEIJING) Co.,Ltd.

COR Change of bibliographic data
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20181224

Address after: Room 105-53967, No. 6 Baohua Road, Hengqin New District, Zhuhai City, Guangdong Province

Patentee after: Zhuhai Leopard Fun Technology Co.,Ltd.

Address before: 100022 the 12 level of Fuxing International Center, 237 Chaoyang North Road, Chaoyang District, Beijing.

Co-patentee before: Beijing Cheetah Network Technology Co.,Ltd.

Patentee before: BEIJING KINGSOFT INTERNET SECURITY SOFTWARE Co.,Ltd.

Co-patentee before: Beijing Cheetah Mobile Technology Co.,Ltd.

Co-patentee before: CONEW NETWORK TECHNOLOGY (BEIJING) Co.,Ltd.

TR01 Transfer of patent right