CN103186532A - Method and device for capturing key pictures in web page - Google Patents

Info

Publication number
CN103186532A
Authority
CN
China
Prior art keywords
picture
webpage
pictures
canonical
attribute
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201110443869XA
Other languages
Chinese (zh)
Other versions
CN103186532B (en)
Inventor
李晓明
刘臻
蒋有星
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Beijing Co Ltd
Original Assignee
Tencent Technology Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Beijing Co Ltd
Priority to CN201110443869.XA
Publication of CN103186532A
Application granted
Publication of CN103186532B
Legal status: Active
Anticipated expiration

Landscapes

  • Information Transfer Between Computers (AREA)

Abstract

The invention discloses a method and device for capturing key pictures in a web page. The method comprises the following steps: A, obtaining the Document Object Model (DOM) structure of the web page according to the web page address; B, locating the central node of the web page according to the DOM structure; C, regular-expression matching the pictures at the central node and its sibling nodes, filtering the matched pictures according to a preset filter condition, and outputting the pictures that meet the filter condition; and D, taking the pictures output in step C as the captured key pictures of the web page. The device comprises a corresponding DOM structure acquisition module, a node determination module, a regular-expression matching module, a filter and a key picture determination module. With the method and the device, the captured key pictures match the subject content of the web page more closely, the number of human-computer interactions is reduced, and operation is simplified.

Description

Method and device for capturing key pictures in a web page
Technical field
The present invention relates to the field of Internet information processing, and in particular to a method and device for capturing key pictures in a web page.
Background
Internet content sharing functions are now available. For example, some microblog platforms provide a share interface, and a third-party website can embed this share interface to share its web page content into the microblog system, which improves the user experience. The web page content shared through current share interfaces mainly comprises the link to the web page, a brief text summary, and pictures in the web page. The process is as follows: after the user clicks the share button, the share interface captures information such as the link address, subject content and pictures of the web page, and shares this information into a target system, for example a microblog. With the share interface, a user can share a web page that he or she likes or finds valuable with followers, audience or friends in the microblog system, which increases traffic to that web page. Such share interfaces are now widely used on third-party websites.
When an existing share interface shares pictures from a web page, several operations are required. First, all pictures in the web page are extracted and shown to the user, who manually clicks to select the key pictures among them. Second, after the user's selection instruction is received, the pictures to be shared are confirmed. Finally, only after the user clicks to confirm sharing are the pictures shared into the target system (such as a microblog platform).
A web page typically presents one or more items of subject content, and the pictures that illustrate (or supplement) this subject content are the key pictures, for example the pictures accompanying a news item on a news page.
However, the prior art has the following shortcomings when sharing pictures from a web page:
Key pictures in the web page cannot be captured intelligently; the number of human-computer interactions between the user and the network-side machine is excessive and the operation is complicated. Moreover, the selected pictures often match the subject content of the web page poorly and are not key pictures. In particular, when the web page contains a large number of pictures and icons, the key pictures cannot be found quickly and intelligently, and irrelevant pictures are often selected. Sharing pictures is therefore cumbersome for the user, and the wait during selection is long.
Summary of the invention
In view of this, the main purpose of the present invention is to provide a method and device for capturing key pictures in a web page, so as to improve how closely the captured key pictures match the subject content of the web page, reduce the number of human-computer interactions, and simplify operation.
The technical solution of the present invention is realized as follows:
A method for capturing key pictures in a web page, comprising:
A. obtaining the Document Object Model (DOM) structure of the web page according to the web page address;
B. locating the central node of the web page according to the DOM structure of the web page;
C. regular-expression matching the pictures at the central node and its sibling nodes, filtering the matched pictures according to a preset filter condition, and outputting the pictures that meet the filter condition;
D. taking the pictures output in step C as the captured key pictures of the web page.
A device for capturing key pictures in a web page, comprising:
a DOM structure acquisition module, used to obtain the DOM structure of the web page according to the web page address;
a node determination module, used to locate the central node of the web page according to the DOM structure of the web page and to input the central node to the regular-expression matching module;
a regular-expression matching module, used to regular-expression match the pictures at the input node and its sibling nodes and to output the matched pictures to the filter;
a filter, used to filter the input pictures according to a preset filter condition and to output the pictures that meet the filter condition;
a key picture determination module, used to take the pictures output by the filter as the captured key pictures of the web page.
Compared with the prior art, the present invention uses the DOM structure of the web page to locate its central node, regular-expression matches the pictures at the central node and its sibling nodes, filters them according to a preset filter condition, and takes the pictures that pass the filter as the key pictures of the web page. The central node and its sibling nodes match the subject content of the web page closely, and the pictures are further screened by the filter condition, so the captured key pictures match the subject content more closely. In addition, the capture steps can be carried out entirely by the computer, and the user only needs to trigger the flow manually once. This reduces the number of human-computer interactions, simplifies operation, and saves the corresponding computing and bandwidth resources.
Description of drawings
Fig. 1 is a flowchart of a method for capturing key pictures in a web page according to the present invention;
Fig. 2 is a schematic diagram of node weights in a web page DOM structure (also called a DOM tree);
Fig. 3 is a flowchart of a specific embodiment of the method of the present invention;
Fig. 4 is a schematic diagram of the composition of a device for capturing key pictures in a web page according to the present invention.
Detailed description
The present invention is described in further detail below with reference to the drawings and specific embodiments.
The DOM provides a platform- and language-independent way to access and modify the content and structure of a document. It is a common way to represent and process a HyperText Markup Language (HTML) or Extensible Markup Language (XML) document. Because current web pages are all based on HTML or XML documents, the present invention analyses the DOM structure of the web page to find the central node that best matches the subject content.
Fig. 1 is a flowchart of a method for capturing key pictures in a web page according to the present invention. Referring to Fig. 1, the method comprises:
Step 101: obtain the DOM structure of the web page according to the web page address.
The web page address is generally a Uniform Resource Locator (URL), which is a way of fully identifying the address of a web page or other resource on the Internet. In practice the present invention is often used together with web page sharing technology: when the user shares a web page, the share interface obtains the URL of that web page, and step 101 can use the URL obtained by the share interface. The DOM structure can be obtained using existing known techniques, which are not described further here.
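The patent leaves the DOM acquisition to known techniques. As one possible illustration, the following minimal sketch builds a simplified element tree from HTML text with Python's standard html.parser module. The names Node, DomBuilder, build_dom and fetch_dom are hypothetical helpers introduced here, not part of the patent, and text content is deliberately ignored.

```python
from html.parser import HTMLParser
from urllib.request import urlopen

# Elements that never have a closing tag in HTML.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class Node:
    """One element of the simplified DOM tree (hypothetical helper type)."""

    def __init__(self, tag, attrs=None, parent=None):
        self.tag = tag
        self.attrs = dict(attrs or [])
        self.parent = parent
        self.children = []


class DomBuilder(HTMLParser):
    """Builds a Node tree from HTML text; text content is ignored."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self._open = [self.root]  # stack of currently open elements

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs, parent=self._open[-1])
        self._open[-1].children.append(node)
        if tag not in VOID_TAGS:
            self._open.append(node)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        # Close back to the nearest matching open element, if there is one.
        for i in range(len(self._open) - 1, 0, -1):
            if self._open[i].tag == tag:
                del self._open[i:]
                break


def build_dom(html_text):
    """Parse HTML text and return the root Node of the tree."""
    builder = DomBuilder()
    builder.feed(html_text)
    builder.close()
    return builder.root


def fetch_dom(url):
    """Download the page at url and build its tree (needs network access)."""
    with urlopen(url) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return build_dom(resp.read().decode(charset, errors="replace"))
```

A production system would more likely rely on a browser's own DOM or a full HTML5 parser; this sketch only shows the shape of the data that the later steps operate on.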
Step 102: locate the central node of the web page according to the DOM structure of the web page.
The central node can be located according to the H tags (heading tags) in the DOM structure. The H tags indicate the weight of web page nodes: an H1 tag node has the highest weight, an H2 tag node the next highest, an H3 tag node the next, and so on. In this step, one or more central nodes can be located according to the H tags in descending order of weight; multiple nodes with H tags of the same weight level can be ordered according to their order in the page structure.
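The ordering described in step 102 can be sketched as follows. This is an illustration only: it assumes the page is available as well-formed XHTML so that the standard xml.etree.ElementTree module can parse it, and the function name locate_central_nodes is hypothetical.

```python
import re
import xml.etree.ElementTree as ET

HEADING = re.compile(r"^h([1-6])$")


def locate_central_nodes(root):
    """Return heading elements ordered by weight (h1 first), then page order."""
    # root.iter() walks the tree in document order.
    headings = [el for el in root.iter()
                if isinstance(el.tag, str) and HEADING.match(el.tag)]
    # sorted() is stable, so headings of the same level keep document order.
    return sorted(headings, key=lambda el: int(HEADING.match(el.tag).group(1)))
```

Because the sort is stable, two H2 nodes stay in the order in which they appear in the page, which corresponds to the tie-breaking rule stated in the text.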
Step 103: regular-expression match the pictures at the central node and its sibling nodes, filter the matched pictures according to a preset filter condition, and output the pictures that meet the filter condition.
Step 103 can be implemented in several ways, which are described in the embodiments below.
Step 104: take the pictures output in step 103 as the captured key pictures of the web page.
Fig. 2 is a schematic diagram of node weights in a web page DOM structure (also called a DOM tree). Referring to Fig. 2, the content of the H1 and H2 tag nodes (for reasons of space, only the H1 tag node is marked in Fig. 2) is generally the subject information of the web page (in line with W3C standards and SEO guidelines), and key pictures are usually near an H1 or H2 tag node. That is, the closer a node is to an H1 or H2 tag node, the higher the weight of its pictures, where the distance can be determined by the path length in the DOM structure.
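The text measures closeness by path length in the DOM structure without giving a formula. One plausible reading, shown here as a sketch, is the number of edges on the tree path between the heading node and the picture node, computed through their first common ancestor. The function name path_length is hypothetical, and the page is again assumed to be well-formed XHTML.

```python
import xml.etree.ElementTree as ET


def path_length(root, a, b):
    """Number of edges on the tree path between elements a and b."""
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent = {child: p for p in root.iter() for child in p}

    def ancestors(el):
        chain = [el]
        while chain[-1] in parent:
            chain.append(parent[chain[-1]])
        return chain  # el, its parent, ..., root

    up_a = ancestors(a)
    steps_from_a = {el: i for i, el in enumerate(up_a)}
    for j, el in enumerate(ancestors(b)):
        if el in steps_from_a:  # first common ancestor of a and b
            return steps_from_a[el] + j
    raise ValueError("elements are not in the same tree")
```

Under this reading, a picture that is a sibling of the H1 node has distance 2, while a picture nested in another branch of the page has a larger distance and therefore a lower weight.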
Fig. 3 is a flowchart of a specific embodiment of the method of the present invention. This embodiment is described using the web page with the DOM structure shown in Fig. 2 as an example. Referring to Fig. 2 and Fig. 3, the flow comprises:
Step 301: the share interface detects that the user has clicked the one-click share button, passes in the URL of the shared web page, obtains the page DOM according to this URL, and builds the DOM tree, thereby obtaining the DOM structure of the web page.
Step 302: locate the central nodes of the web page according to its DOM structure. This embodiment assumes that two central nodes have been located (an H1 node and an H2 node), of which the H1 node has the highest weight. A global array find_imgarr is defined to store the captured key pictures.
Step 303: regular-expression match the pictures at the central node (the H1 node) and its sibling nodes. Specifically: find the parent node of the H1 node (H1->parent), traverse the child nodes of H1->parent (that is, the H1 node and its siblings), regular-expression match the pictures at those nodes, and return the matched pictures in a picture array all_imgarr.
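Step 303 can be illustrated with the following sketch, which serializes the children of the central node's parent and applies a regular expression to the markup to collect picture addresses. The pattern IMG_SRC and the function match_pictures_around are assumptions made for illustration; the patent does not give the regular expression it uses.

```python
import re
import xml.etree.ElementTree as ET

# Assumed pattern: captures the src value of each <img> tag.
IMG_SRC = re.compile(r"<img\b[^>]*?\ssrc\s*=\s*[\"']([^\"']+)[\"']",
                     re.IGNORECASE)


def match_pictures_around(root, node):
    """Regex-match <img> sources in node and in all of its siblings."""
    parent = {child: p for p in root.iter() for child in p}
    # The children of the parent are the node itself plus its siblings.
    group = list(parent[node]) if node in parent else [node]
    markup = "".join(ET.tostring(el, encoding="unicode") for el in group)
    return IMG_SRC.findall(markup)  # plays the role of all_imgarr
```

Calling the same function with H1->parent as the node gives the wider search of step 308, since it then covers the parent node and the parent's siblings.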
Steps 304 to 307: determine whether the all_imgarr array is empty, that is, whether the previous step matched any pictures at the H1 node and its sibling nodes. If it is not empty, pass the pictures in all_imgarr to the filter, and return the pictures that meet the filter condition (that is, the pictures that pass the filter) in the find_imgarr array. Then determine whether find_imgarr is empty. If it is not empty, key pictures have been found: output find_imgarr as the key pictures of the shared web page and end the flow.
If all_imgarr is empty (no pictures were matched at the H1 node and its sibling nodes), or find_imgarr is empty (all pictures were filtered out), go to step 308.
Step 308: determine the parent node of the central node H1 according to the DOM structure of the web page, and regular-expression match the pictures at this parent node and its sibling nodes. Specifically, find the parent of the parent of H1 (H1->parent->parent), traverse the child nodes of H1->parent->parent (that is, the parent node of H1 and its siblings), regular-expression match the pictures at those nodes, and return the matched pictures in the array all_imgarr.
Steps 309 to 312: determine whether the all_imgarr array is empty, that is, whether the previous step matched any pictures at the parent node of H1 and its sibling nodes. If it is not empty, pass the pictures in all_imgarr to the filter, and return the pictures that meet the filter condition (that is, the pictures that pass the filter) in the find_imgarr array. Then determine whether find_imgarr is empty. If it is not empty, key pictures have been found: output find_imgarr as the key pictures of the shared web page and end the flow.
If all_imgarr is empty (no pictures were matched at the parent node of H1 and its sibling nodes), or find_imgarr is empty (all pictures were filtered out), go to step 313.
In this embodiment, when no key picture is found, the matching and filtering are applied to only two levels: the central node and its parent node. In other embodiments, if no pictures are matched at this parent node and its sibling nodes, or all matched pictures are filtered out, the parent node one level further up can be determined and the matching and filtering repeated. The parent node of the next level up can be determined in the same way, and so on; the number of levels can be preset as required.
Step 313: determine whether all central nodes have been matched (two central nodes, H1 and H2, are located in this embodiment, but more may be located), or whether the number of central nodes matched has reached a preset threshold (if too many central nodes are located, a threshold can be set beyond which the above steps are no longer performed). If so, go to step 315; otherwise go to step 314.
Step 314: determine the next central node, the H2 node, according to the DOM structure of the web page, and return to step 303 to perform steps 303 to 313 for this H2 node.
In the above steps, as soon as key pictures of the web page are found, they are returned to the share interface for the subsequent sharing operation, and the flow ends.
However, if all central nodes have been matched, or the number of central nodes matched has reached the preset threshold, and either no pictures were matched or all pictures were filtered out, the following steps are performed:
Step 315: regular-expression match the pictures in the whole DOM structure of the web page, and return the matched pictures in the array all_imgarr.
Steps 316 to 319: determine whether the all_imgarr array is empty, that is, whether the previous step matched any pictures. If it is not empty, pass the pictures in all_imgarr to the filter, and return the pictures that meet the filter condition (that is, the pictures that pass the filter) in the find_imgarr array. Then determine whether find_imgarr is empty. If it is not empty, key pictures have been found: output find_imgarr as the key pictures of the shared web page and end the flow.
If all_imgarr is empty (no pictures were matched in the whole DOM structure), or find_imgarr is empty (all pictures were filtered out), go to step 320.
Step 320: return a null value, that is, determine that no key picture of the web page has been captured.
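The control flow of steps 303 to 320 can be summarized in a short sketch. It is not the patented implementation: the matching, filtering and parent lookup are passed in as callables, and the parameter names (match_around, match_all, passes_filter, max_levels, max_centers) are hypothetical. Only all_imgarr and find_imgarr come from the text.

```python
def capture_key_pictures(central_nodes, parent_of, match_around, match_all,
                         passes_filter, max_levels=2, max_centers=None):
    """Return the first non-empty filtered picture list, or None."""
    # Optional threshold on how many central nodes are tried (step 313).
    if max_centers is not None:
        central_nodes = central_nodes[:max_centers]
    for center in central_nodes:
        node = center
        # Level 1 is the central node itself, level 2 its parent, and so on.
        for _ in range(max_levels):
            if node is None:
                break
            all_imgarr = match_around(node)
            find_imgarr = [p for p in all_imgarr if passes_filter(p)]
            if find_imgarr:
                return find_imgarr  # key pictures found, end of flow
            node = parent_of(node)
    # Steps 315 to 319: fall back to matching over the whole DOM structure.
    find_imgarr = [p for p in match_all() if passes_filter(p)]
    # Step 320: a null result means no key picture was captured.
    return find_imgarr or None
```

With max_levels=2 the sketch mirrors the two-level embodiment; a larger value corresponds to the other embodiments in which further ancestor levels are tried.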
In the above process, the filter filters the input pictures as follows:
First, format filtering selects the pictures that are in a specified format, mainly PNG and JPG in this embodiment.
Next, attribute filtering selects the pictures that meet specified height and width conditions. For example, the condition can be: the picture height is greater than 139px and both height and width are greater than 99px, or the picture width is greater than 139px and both height and width are greater than 99px.
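The two filtering stages can be sketched as follows, using the example thresholds from the text. The Picture type and the function names are hypothetical, picture dimensions are assumed to be already known, and treating the .jpeg extension as JPG is an assumption that goes beyond the text.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class Picture:
    """Hypothetical record for one matched picture."""
    src: str
    width: int   # pixels
    height: int  # pixels


# Assumption: ".jpeg" is accepted as a JPG file.
ALLOWED_EXTENSIONS = (".png", ".jpg", ".jpeg")


def passes_format(pic):
    """Format filter: keep PNG and JPG pictures, judged by file extension."""
    path = urlparse(pic.src).path.lower()  # ignores any query string
    return path.endswith(ALLOWED_EXTENSIONS)


def passes_size(pic):
    """Attribute filter: both sides over 99px and at least one over 139px."""
    both_over_99 = pic.width > 99 and pic.height > 99
    return both_over_99 and (pic.height > 139 or pic.width > 139)


def filter_pictures(pictures):
    """Apply the format filter, then the attribute filter."""
    return [p for p in pictures if passes_format(p) and passes_size(p)]
```

The size rule is meant to drop icons and decorative thumbnails, which are typically small in both dimensions.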
In another embodiment, the method of filtering pictures further comprises:
weighting the pictures selected by the format filtering and attribute filtering according to their alt and title attributes, and selecting the picture with the highest weight; then, according to the DOM structure of the web page, determining the pictures that are contiguous with the highest-weight picture (because the pictures accompanying the main text, that is, the key pictures, are generally contiguous), applying the format filtering and attribute filtering to these pictures again, and returning the pictures that pass the filtering together with the highest-weight picture in the find_imgarr array as the output of the filter.
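The text does not say how the alt and title attributes are turned into a weight. The sketch below assumes one simple scheme, which is not taken from the patent: the weight is the number of words shared between the page subject (for example the H1 text) and each of the two attributes. The names attribute_weight and pick_heaviest are hypothetical.

```python
def attribute_weight(alt, title, subject):
    """Assumed scoring: count subject words that appear in alt and in title."""
    subject_words = set(subject.lower().split())
    score = 0
    for text in (alt, title):
        score += len(subject_words & set(text.lower().split()))
    return score


def pick_heaviest(pictures, subject):
    """Return the picture with the highest weight, or None if there is none.

    pictures is a list of dicts with optional 'alt' and 'title' keys.
    On a tie, the picture that comes first in the list is returned.
    """
    if not pictures:
        return None
    return max(pictures,
               key=lambda p: attribute_weight(p.get("alt", ""),
                                              p.get("title", ""), subject))
```

Any other scoring that rewards descriptive alt and title text could be substituted without changing the surrounding flow.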
Alternatively, the method of filtering pictures can further comprise:
selecting the picture with the largest area from the pictures selected by the format filtering and attribute filtering; then, according to the DOM structure of the web page, determining the pictures that are contiguous with the largest-area picture, applying the format filtering and attribute filtering to these pictures again, and returning the pictures that pass the filtering together with the largest-area picture in the find_imgarr array as the output of the filter.
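The area-based variant can be sketched as below. How contiguity is determined from the DOM structure is not specified in the text, so this sketch assumes that contiguous pictures are those within a few positions of the selected picture in document order; the span parameter and the function name largest_with_neighbours are hypothetical.

```python
def largest_with_neighbours(page_pictures, candidates, passes_filter, span=2):
    """Pick the largest candidate and re-filter the pictures around it.

    page_pictures: all pictures of the page in document order, as dicts
                   with 'src', 'width' and 'height' keys.
    candidates:    the subset that already passed format and attribute
                   filtering; each must also appear in page_pictures.
    span:          assumed contiguity window, in positions on either side.
    """
    if not candidates:
        return []
    best = max(candidates, key=lambda p: p["width"] * p["height"])
    i = page_pictures.index(best)
    lo, hi = max(0, i - span), min(len(page_pictures), i + span + 1)
    # Keep the largest picture plus every neighbour that passes the filter.
    return [p for p in page_pictures[lo:hi]
            if p is best or passes_filter(p)]
```

The same neighbour step would apply to the weighting variant, with the highest-weight picture taking the place of the largest one.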
Based on the above method, the invention also discloses a device for capturing key pictures in a web page, which can carry out the above method. Fig. 4 is a schematic diagram of the composition of this device. Referring to Fig. 4, the capture device 400 comprises:
A DOM structure acquisition module 401, used to obtain the DOM structure of the web page according to the web page address.
A node determination module 402, used to locate the central node of the web page according to the DOM structure of the web page and to input the central node to the regular-expression matching module. In other embodiments, the node determination module 402 can also, according to the feedback from the regular-expression matching module 403 and the filter, determine the parent nodes of the central node at each level and input them to the regular-expression matching module 403, or determine the next central node and input it to the regular-expression matching module 403; the detailed process is as described in the method above.
A regular-expression matching module 403, used to regular-expression match the pictures at the input node and its sibling nodes and to output the matched pictures to the filter. The input nodes comprise the central node and its parent nodes at each level.
A filter 404, used to filter the input pictures according to a preset filter condition and to output the pictures that meet the filter condition.
A key picture determination module 405, used to take the pictures output by the filter as the captured key pictures of the web page.
The filter specifically comprises:
a format filtering module, used to select the pictures that are in a specified format (mainly PNG and JPG in this embodiment);
an attribute filtering module, used to select the pictures that meet specified height and width conditions.
In one specific embodiment, the filter further comprises:
a weighting selection module, used to weight the pictures selected by the format filtering and attribute filtering according to their alt and title attributes, select the picture with the highest weight, and input it to the first reselection module;
a first reselection module, used to determine, according to the DOM structure of the web page, the pictures that are contiguous with the highest-weight picture, input these pictures to the format filtering module and attribute filtering module for format filtering and attribute filtering again, and output the pictures that pass the filtering together with the highest-weight picture.
In another specific embodiment, the filter further comprises:
an area selection module, used to select the picture with the largest area from the pictures selected by the format filtering and attribute filtering, and input it to the second reselection module;
a second reselection module, used to determine, according to the DOM structure of the web page, the pictures that are contiguous with the largest-area picture, input these pictures to the format filtering module and attribute filtering module for format filtering and attribute filtering again, and output the pictures that pass the filtering together with the largest-area picture.
With the present invention, key pictures matching the subject content can be captured intelligently, and the whole process is triggered with one click. This not only improves how closely the captured key pictures match the subject content of the web page, but also reduces the number of human-computer operations when the user shares pictures, improves the user experience, and avoids the waste of computing and bandwidth resources caused by excessive human-computer operations.
The above are only preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (12)

1. A method for capturing key pictures in a web page, characterized by comprising:
A. obtaining the Document Object Model (DOM) structure of the web page according to the web page address;
B. locating the central node of the web page according to the DOM structure of the web page;
C. regular-expression matching the pictures at the central node and its sibling nodes, filtering the matched pictures according to a preset filter condition, and outputting the pictures that meet the filter condition;
D. taking the pictures output in step C as the captured key pictures of the web page.
2. The method according to claim 1, characterized in that, in step C, if no pictures are matched at the central node and its sibling nodes, or all pictures are filtered out by the filtering, the method further comprises: determining the parent node of the central node according to the DOM structure of the web page, regular-expression matching the pictures at this parent node and its sibling nodes, filtering the matched pictures at the parent node and its sibling nodes according to the preset filter condition, and outputting the pictures that meet the filter condition.
3. The method according to claim 2, characterized in that, in step C, if no pictures are matched at the parent node and its sibling nodes, or all matched pictures at the parent node and its sibling nodes are filtered out by the filtering, the method further comprises: determining the next central node according to the DOM structure of the web page, and re-executing step C.
4. The method according to claim 3, characterized in that, in step C, after all central nodes have been matched, or after the number of central nodes matched has reached a preset threshold, if no pictures have been matched or all pictures have been filtered out by the filtering, the method further comprises:
regular-expression matching the pictures in the whole DOM structure of the web page, filtering the matched pictures according to the preset filter condition, and outputting the pictures that meet the filter condition.
5. The method according to any one of claims 2 to 4, characterized in that the method of filtering pictures is specifically:
performing format filtering to select the pictures that are in a specified format;
performing attribute filtering to select the pictures that meet specified height and width conditions.
6. The method according to claim 5, characterized in that the method of filtering pictures further comprises:
weighting the pictures selected by the format filtering and attribute filtering according to their alt and title attributes, and selecting the picture with the highest weight;
determining, according to the DOM structure of the web page, the pictures that are contiguous with the highest-weight picture, applying the format filtering and attribute filtering to these pictures again, and outputting the pictures that pass the filtering together with the highest-weight picture.
7. The method according to claim 5, characterized in that the method of filtering pictures further comprises:
selecting the picture with the largest area from the pictures selected by the format filtering and attribute filtering;
determining, according to the DOM structure of the web page, the pictures that are contiguous with the largest-area picture, applying the format filtering and attribute filtering to these pictures again, and outputting the pictures that pass the filtering together with the largest-area picture.
8. The method according to claim 5, characterized in that the pictures in the specified format in the format filtering are JPG pictures and PNG pictures.
9. A device for capturing key pictures in a web page, characterized in that the device comprises:
a DOM structure acquisition module, used to obtain the DOM structure of the web page according to the web page address;
a node determination module, used to locate the central node of the web page according to the DOM structure of the web page and to input the central node to the regular-expression matching module;
a regular-expression matching module, used to regular-expression match the pictures at the input node and its sibling nodes and to output the matched pictures to the filter;
a filter, used to filter the input pictures according to a preset filter condition and to output the pictures that meet the filter condition;
a key picture determination module, used to take the pictures output by the filter as the captured key pictures of the web page.
10. The capture device according to claim 9, characterized in that the filter specifically comprises:
a format filtering module, used to select the pictures that are in a specified format;
an attribute filtering module, used to select the pictures that meet specified height and width conditions.
11. The capture device according to claim 10, characterized in that the filter further comprises:
a weighting selection module, used to weight the pictures selected by the format filtering and attribute filtering according to their alt and title attributes, select the picture with the highest weight, and input it to the first reselection module;
a first reselection module, used to determine, according to the DOM structure of the web page, the pictures that are contiguous with the highest-weight picture, input these pictures to the format filtering module and attribute filtering module for format filtering and attribute filtering again, and output the pictures that pass the filtering together with the highest-weight picture.
12. The capture device according to claim 10, characterized in that the filter further comprises:
an area selection module, used to select the picture with the largest area from the pictures selected by the format filtering and attribute filtering, and input it to the second reselection module;
a second reselection module, used to determine, according to the DOM structure of the web page, the pictures that are contiguous with the largest-area picture, input these pictures to the format filtering module and attribute filtering module for format filtering and attribute filtering again, and output the pictures that pass the filtering together with the largest-area picture.
CN201110443869.XA 2011-12-27 2011-12-27 Method and device for capturing key pictures in a web page Active CN103186532B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201110443869.XA CN103186532B (en) 2011-12-27 2011-12-27 Method and device for capturing key pictures in a web page

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201110443869.XA CN103186532B (en) 2011-12-27 2011-12-27 Method and device for capturing key pictures in web page

Publications (2)

Publication Number Publication Date
CN103186532A CN103186532A (en) 2013-07-03
CN103186532B CN103186532B (en) 2019-05-10

Family

ID=48677703

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201110443869.XA Active CN103186532B (en) 2011-12-27 2011-12-27 Method and device for capturing key pictures in web page

Country Status (1)

Country Link
CN (1) CN103186532B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103544271A (en) * 2013-10-18 2014-01-29 北京奇虎科技有限公司 Picture processing window loading method and device for browsers
CN114817639A (en) * 2022-05-18 2022-07-29 山东大学 Webpage graph convolution document ordering method and system based on comparison learning

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090307219A1 (en) * 2008-06-05 2009-12-10 Bennett James D Image search engine using image analysis and categorization
CN102270206A (en) * 2010-06-03 2011-12-07 北京迅捷英翔网络科技有限公司 Method and device for capturing valid web page contents
CN102270234A (en) * 2011-08-01 2011-12-07 北京航空航天大学 Image search method and search engine

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103544271A (en) * 2013-10-18 2014-01-29 北京奇虎科技有限公司 Picture processing window loading method and device for browsers
CN103544271B (en) * 2013-10-18 2017-03-15 北京奇虎科技有限公司 Picture processing window loading method and device for browsers
CN114817639A (en) * 2022-05-18 2022-07-29 山东大学 Webpage graph convolution document ordering method and system based on comparison learning
CN114817639B (en) * 2022-05-18 2024-05-10 山东大学 Webpage graph convolution document ordering method and system based on contrastive learning

Also Published As

Publication number Publication date
CN103186532B (en) 2019-05-10

Similar Documents

Publication Publication Date Title
EP3400540B1 (en) Database operation using metadata of data sources
CN101408877B (en) System and method for loading tree node
EP2321745B1 (en) Providing posts to discussion threads in response to a search query
CN103365924B Method, device and terminal for internet information search
CN111435344B (en) Big data-based drilling acceleration influence factor analysis model
CN101908071B (en) Method and device thereof for improving search efficiency of search engine
CN102760151B (en) Implementation method of open source software acquisition and searching system
CN105760397B (en) Internet of things ontology model processing method and device
DE102017111438A1 (en) API LEARNING
CN105243159A (en) Visual script editor-based distributed web crawler system
CN104516982A (en) Method and system for extracting Web information based on Nutch
CN102521232B (en) Distributed acquisition and processing system and method of internet metadata
CN102930059A (en) Method for designing focused crawler
KR102222287B1 (en) Web Crawler System for Collecting a Structured and Unstructured Data in Hidden URL
CN104077402A (en) Data processing method and data processing system
CN101984429A (en) Method and device for acquiring destination page, search engine and browser
CN106202467A Web crawler method with definable search focus for peer-to-peer networks
CN105302876A (en) Regular expression based URL filtering method
CN104391978A (en) Method and device for storing and processing web pages of browsers
CN104112015A (en) DOM (document object model) and XML (extensible markup language) path language based intelligent substation SCD (substation configuration description) file parsing method
CN102063454A (en) Method and equipment combining search and application
CN110309386B (en) Method and device for crawling web page
CN101894109A (en) Database building method and device
CN111949619A (en) Dynamic directory generation method, system, electronic device and storage medium
CN103186532A (en) Method and device for capturing key pictures in web page

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant