CN103440239B - Web page segmentation method and device based on functional area identification - Google Patents

Publication number
CN103440239B
Authority
CN
China
Prior art keywords
block
web page
sub
picture
tag
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201310176551.9A
Other languages
Chinese (zh)
Other versions
CN103440239A (en)
Inventor
郭瑞
牛正雨
吴璞
吴一璞
李乐丁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201310176551.9A
Publication of CN103440239A
Application granted
Publication of CN103440239B
Legal status: Active
Anticipated expiration

Landscapes

  • Information Transfer Between Computers (AREA)
  • Document Processing Apparatus (AREA)

Abstract

A web page segmentation method and device based on functional area identification are disclosed. The method includes: generating a Document Object Model (DOM) tree for a web page, the DOM tree including the content to be displayed on the web page; extracting position information and size information of the DOM tree nodes; parsing the border attribute and the margin attribute from the Cascading Style Sheets (CSS) attributes; labeling the web page with a web page block labeling algorithm to mark functional and semantic areas, and marking the labeled blocks as granule candidates; scanning the remaining part of the web page for mixed image-text blocks according to the DOM tree structure, and marking the mixed image-text blocks found as granule candidates; scanning the remaining blocks and, if the border attribute and the margin attribute of a scanned block are not 0, marking that block as a granule candidate; and marking the remaining unmarked blocks in the DOM tree as granule candidates.

Description

Web page segmentation method and device based on functional area identification
Technical field
The present invention relates to a web page segmentation method and device, and in particular to a method and device that identify the functional areas of a web page and segment the web page accordingly.
Background
At present, when a web browser displays a web page, in most cases it lays out the page by parsing the HTML source code. A page shown in the browser can be divided according to functional or semantic differences, which makes it possible to determine which part is the main content of the page. In addition, the number of users who browse web pages on mobile phones keeps growing, and the displayable area of a phone screen is small, so it is necessary to decide which content of a web page should be combined and presented.
Because web pages on the Internet are extremely complex and many of them do not follow the standards of the web page language, it is difficult to have a single criterion for deciding which blocks a web page should be cut into. A method is therefore needed that segments a web page at a suitable granularity, so that subsequent applications of the page structure and content are better targeted.
Summary of the invention
The present invention proposes a web page segmentation method and device based on functional area identification. The method divides a web page into a single flat layer of granules. Each granule is an independent functional, semantic or content area, and applications can use the page in units of granules. For example, a mobile browser can push content to the user granule by granule, or an importance score can be computed for each output granule and the web page content analyzed according to that score.
According to one aspect of an exemplary embodiment of the present invention, a web page segmentation method based on functional area identification is provided. The method includes: generating a Document Object Model (DOM) tree for a web page, the DOM tree including the content to be displayed on the web page; extracting position information and size information of the DOM tree nodes; parsing the border attribute and the margin attribute from the Cascading Style Sheets (CSS) attributes; labeling the web page with a web page block labeling algorithm to mark functional and semantic areas, and marking the labeled blocks as granule candidates; scanning the remaining part of the web page for mixed image-text blocks according to the DOM tree structure, and marking the mixed image-text blocks found as granule candidates; scanning the remaining blocks and, if the border attribute and the margin attribute of a scanned block are not 0, marking that block as a granule candidate; and marking the remaining unmarked blocks in the DOM tree as granule candidates.
The functional and semantic areas may include a navigation bar, breadcrumb, copyright notice, pagination bar and picture frame.
A mixed image-text block may be a block that meets the following conditions: the image size is greater than 5000 pixels; there are text nodes in addition to the image node; the number of links is less than or equal to 5; and the sibling blocks have the same DOM tree structure.
The web page segmentation method may further include: merging all of the granule candidates according to merge conditions, and erasing the candidate marks of the sub-blocks of each merged block.
If the sub-blocks are mixed image-text blocks, the sub-blocks to be merged do not contain blocks of two or more different functional and semantic types, and the main nodes of the sub-blocks do not contain more than two types of tags among TAG_UL, TAG_TABLE and TAG_FORM, the sub-blocks may be merged.
If the sub-blocks are plain text blocks, the sub-blocks to be merged do not contain blocks of two or more different functional and semantic types, and the main nodes of the sub-blocks do not contain more than two types of tags among TAG_UL, TAG_TABLE and TAG_FORM, the sub-blocks may be merged.
If the sub-blocks have the same structure in the DOM tree, the sub-blocks to be merged do not contain blocks of two or more different functional and semantic types, and the main nodes of the sub-blocks do not contain more than two types of tags among TAG_UL, TAG_TABLE and TAG_FORM, the sub-blocks may be merged.
According to another aspect of an exemplary embodiment of the present invention, a web page segmentation device based on functional area identification is provided. The device includes: a Document Object Model (DOM) tree generation unit, which generates a DOM tree for a web page, the DOM tree including the content to be displayed on the web page; a DOM information extraction unit, which extracts position information and size information of the DOM tree nodes; a Cascading Style Sheets (CSS) parsing unit, which parses the border attribute and the margin attribute from the CSS attributes; a labeling unit, which labels the web page to mark functional and semantic areas; a mixed image-text block scanning unit, which scans the remaining part of the web page for mixed image-text blocks according to the DOM tree structure; a remaining-block scanning unit, which scans the remaining blocks for blocks whose border attribute and margin attribute are not 0; and a granule candidate marking unit, which marks as granule candidates the blocks labeled by the labeling unit, the mixed image-text blocks found by the mixed image-text block scanning unit, the blocks found by the remaining-block scanning unit, and the remaining unmarked blocks in the DOM tree.
The web page segmentation device may further include a merging unit, which merges all of the granule candidates marked by the granule candidate marking unit according to merge conditions and erases the candidate marks of the sub-blocks of each merged block.
Brief description of the drawings
The above and other objects and features of the present invention will become clearer from the following description taken in conjunction with the accompanying drawings, in which:
Fig. 1 is a block diagram of a web page segmentation device based on functional area identification according to an exemplary embodiment of the present invention;
Fig. 2 to Fig. 5 are schematic diagrams of a web page segmentation example based on functional area identification according to an exemplary embodiment of the present invention;
Fig. 6 is a flow chart of a web page segmentation method based on functional area identification according to an exemplary embodiment of the present invention.
Detailed description
The following description with reference to the accompanying drawings is provided to assist in a comprehensive understanding of the exemplary embodiments of the present invention as defined by the claims and their equivalents. The description includes various specific details to assist in that understanding, but these details are to be regarded as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the present invention. In addition, descriptions of well-known functions and structures may be omitted for clarity and conciseness.
Fig. 1 is a block diagram of a web page segmentation device based on functional area identification according to an exemplary embodiment of the present invention.
With reference to Fig. 1, the web page segmentation device 100 includes a Document Object Model (DOM) tree generation unit 110, a DOM information extraction unit 120, a Cascading Style Sheets (CSS) parsing unit 130, a labeling unit 140, a mixed image-text block scanning unit 150, a remaining-block scanning unit 160 and a granule candidate marking unit 170.
The DOM tree generation unit 110 generates a DOM tree for a web page; the DOM tree includes the content to be displayed on the web page. The DOM information extraction unit 120 extracts position information and size information of the DOM tree nodes. The CSS parsing unit 130 parses the border attribute and the margin attribute from the CSS attributes. Here, the border is a boundary line in the page; if the border width is set to zero, the border edge coincides with another CSS edge, the padding edge. The margin is the area outside a node and indicates how much blank space should be kept around the node; if its width is set to zero, the margin edge coincides with the border edge. The labeling unit 140 labels the web page with a web page block labeling algorithm to mark functional and semantic areas, such as the navigation bar, breadcrumb, copyright notice, pagination bar and picture frame. The patent application with publication number CN102637172A provides a web page block labeling method and system. That method automatically generates training samples of labeled blocks with a machine learning algorithm and iterates automatically, combines them with manually specified training samples, derives classification rules and builds a classification model to label web page blocks. With that method, functional and semantic areas such as the navigation bar, breadcrumb, copyright notice, pagination bar and picture frame can be marked, so a detailed description is omitted here. The mixed image-text block scanning unit 150 scans the remaining part of the web page for mixed image-text blocks according to the DOM tree structure. The remaining-block scanning unit 160 scans the remaining blocks for blocks whose border attribute and margin attribute are not 0.
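The per-node information gathered by units 120 and 130 can be pictured as a small record attached to each DOM node. The following is a minimal sketch under stated assumptions: the `Block` class, its field names and the helper `has_nonzero_border_and_margin` are hypothetical and do not come from the patent, and the test reads "border attribute and margin attribute are not 0" literally, as requiring both values to be non-zero.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Block:
    """Hypothetical per-node record; field names are illustrative only."""
    tag: str
    x: int = 0              # position of the node's box, in pixels
    y: int = 0
    width: int = 0          # size of the node's box, in pixels
    height: int = 0
    border_width: int = 0   # CSS border width (0 means no visible border)
    margin: int = 0         # CSS margin (0 means no blank space outside the node)
    children: List["Block"] = field(default_factory=list)


def has_nonzero_border_and_margin(block: Block) -> bool:
    # Assumption: both attributes must be non-zero for the block to qualify.
    return block.border_width != 0 and block.margin != 0


sidebar = Block(tag="div", width=300, height=100, border_width=1, margin=8)
plain = Block(tag="div", width=300, height=100)
print(has_nonzero_border_and_margin(sidebar), has_nonzero_border_and_margin(plain))
```

If the intended reading is that either attribute being non-zero is enough, the `and` in the helper would become `or`; the text does not settle this.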
Mixed image-text layout is a common structure in web page design and may appear in various types of web pages (for example, "home page" or "list" pages). A mixed image-text block is a block that meets the following conditions: the image size is sufficiently large, for example greater than 5000 pixels; there are text nodes in addition to the image node; the number of links is less than or equal to 5; and the sibling blocks have the same DOM tree structure and are themselves mixed image-text blocks.
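The four conditions above can be sketched as a predicate. This is an illustration under stated assumptions rather than the patented implementation: `BlockSummary` and its fields are hypothetical, "image size" is assumed to mean area (width times height), the DOM tree structure is approximated by a tuple of tag names, and the requirement that siblings are themselves mixed image-text blocks is approximated by applying the three local conditions to each sibling.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple

MIN_IMAGE_AREA = 5000  # threshold quoted in the text, in pixels
MAX_LINKS = 5          # threshold quoted in the text


@dataclass(frozen=True)
class BlockSummary:
    """Hypothetical summary of one block; all field names are illustrative."""
    image_area: int              # assumed: width * height of the block's image
    text_node_count: int         # text nodes outside the image node
    link_count: int              # number of links inside the block
    structure: Tuple[str, ...]   # stand-in for the block's DOM tree structure


def meets_local_conditions(b: BlockSummary) -> bool:
    # Large enough image, some text besides the image, and few links.
    return (b.image_area > MIN_IMAGE_AREA
            and b.text_node_count > 0
            and b.link_count <= MAX_LINKS)


def is_image_text_block(block: BlockSummary, siblings: Sequence[BlockSummary]) -> bool:
    if not meets_local_conditions(block):
        return False
    # Siblings must share the structure and satisfy the same local conditions.
    # With no siblings this check passes vacuously (an assumption).
    return all(s.structure == block.structure and meets_local_conditions(s)
               for s in siblings)


card = BlockSummary(120 * 90, 2, 1, ("div", "img", "p"))
print(is_image_text_block(card, [card, card]))
```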
The granule candidate marking unit 170 marks the blocks labeled by the labeling unit 140 as granule candidates, marks the mixed image-text blocks found by the mixed image-text block scanning unit 150 as granule candidates, marks the blocks found by the remaining-block scanning unit 160 as granule candidates, and marks the blocks in the DOM tree that are still unmarked after the above marking operations as granule candidates.
Preferably, the web page segmentation device 100 also includes a merging unit 180. When blocks marked by the granule candidate marking unit 170 meet a merge condition, the merging unit 180 merges the blocks that meet the condition. The merge conditions are as follows:
1) the sub-blocks are mixed image-text blocks, the blocks to be merged do not contain blocks of two or more different functional and semantic types, and the main nodes of the sub-blocks do not contain more than two types of tags among TAG_UL, TAG_TABLE and TAG_FORM; or
2) the sub-blocks are plain text blocks, the blocks to be merged do not contain blocks of two or more different functional and semantic types, and the main nodes of the sub-blocks do not contain more than two types of tags among TAG_UL, TAG_TABLE and TAG_FORM; or
3) the sub-blocks have the same structure in the DOM tree, the blocks to be merged do not contain blocks of two or more different functional and semantic types, and the main nodes of the sub-blocks do not contain more than two types of tags among TAG_UL, TAG_TABLE and TAG_FORM.
Afterwards, the merging unit 180 erases the candidate marks of the sub-blocks of each merged block. The blocks that still carry a candidate mark form a segmentation of the current page at a suitable granularity.
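The three merge conditions share two guards (functional types and restricted tags) and differ only in the first clause. A minimal sketch of that logic follows. Everything named here is hypothetical: the `SubBlock` class, the `kind` labels, and the choice to evaluate the tag guard over the union of all sub-blocks' main-node tags. The tag limit is exposed as the parameter `max_tag_types` because the wording of that guard is ambiguous.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Sequence, Tuple

RESTRICTED_TAGS = frozenset({"TAG_UL", "TAG_TABLE", "TAG_FORM"})


@dataclass(frozen=True)
class SubBlock:
    """Hypothetical view of a candidate sub-block; names are illustrative."""
    kind: str                       # "image_text", "plain_text" or "other"
    functional_type: Optional[str]  # e.g. "navigation_bar"; None if unlabeled
    main_tags: FrozenSet[str]       # tags found on the block's main nodes
    structure: Tuple[str, ...]      # stand-in for the block's DOM structure


def can_merge(sub_blocks: Sequence[SubBlock], max_tag_types: int = 2) -> bool:
    if not sub_blocks:
        return False
    # Shared guard 1: no two or more different functional/semantic types.
    types = {b.functional_type for b in sub_blocks if b.functional_type is not None}
    if len(types) >= 2:
        return False
    # Shared guard 2 (assumed to apply to the union of main-node tags).
    tags = set()
    for b in sub_blocks:
        tags |= b.main_tags & RESTRICTED_TAGS
    if len(tags) > max_tag_types:
        return False
    # First clause of conditions 1), 2) and 3) respectively.
    all_image_text = all(b.kind == "image_text" for b in sub_blocks)
    all_plain_text = all(b.kind == "plain_text" for b in sub_blocks)
    same_structure = len({b.structure for b in sub_blocks}) == 1
    return all_image_text or all_plain_text or same_structure


p1 = SubBlock("plain_text", None, frozenset({"TAG_UL"}), ("p",))
p2 = SubBlock("plain_text", None, frozenset(), ("div", "p"))
print(can_merge([p1, p2]))
```

After a successful merge, the candidate marks on the individual sub-blocks would be cleared and the merged block would carry the mark instead, as the text describes.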
Fig. 2 to Fig. 5 are schematic diagrams of a web page segmentation example based on functional area identification according to an exemplary embodiment of the present invention.
An example of web page segmentation based on functional area identification according to an exemplary embodiment of the present invention is described in detail below with reference to Fig. 2 to Fig. 5.
Taking a Baidu Knows page as an example, visual information and CSS attributes are first added to the DOM tree and the page is laid out; the original web page is shown in Fig. 2.
The web page block labeling algorithm is applied to the original web page to label its blocks and identify the functional and semantic areas of the page, such as the image block, navigation bar, interaction block and breadcrumb outlined in red in Fig. 3.
According to the border attribute and the margin attribute in the CSS attributes, the remaining blocks are scanned for blocks that meet the condition, such as the parts outlined in blue in Fig. 4. Because this page has no mixed image-text structure, the identification of mixed image-text blocks is omitted.
Some blocks now remain in the upper right corner of the page. These blocks have no specific function, and their border attribute and margin attribute are 0, so the remaining blocks are merged with their neighbors into a single block, shown as the part outlined in green in Fig. 5.
At this point the web page segmentation is complete. The web page is finally divided into 8 blocks: 4 functional blocks, 3 blocks separated according to CSS attributes, and 1 block merged from the remaining blocks. These 8 blocks constitute a suitable segmentation granularity for this web page.
Fig. 6 is a flow chart of a web page segmentation method based on functional area identification according to an exemplary embodiment of the present invention.
With reference to Fig. 6, in step S601, a Document Object Model (DOM) tree is generated for the web page; the DOM tree includes the content to be displayed on the web page.
In step S602, position information and size information of the DOM tree nodes are extracted.
In step S603, the border attribute and the margin attribute are parsed from the CSS attributes.
In step S604, the web page is labeled with a web page block labeling algorithm to mark functional and semantic areas, and the labeled blocks are marked as granule candidates. The operation of labeling a web page with a web page block labeling algorithm is disclosed in the patent application with publication number CN102637172A, so a detailed description is omitted here.
In step S605, the remaining part of the web page is scanned for mixed image-text blocks according to the DOM tree structure, and the mixed image-text blocks found are marked as granule candidates. Here, a mixed image-text block is a block that meets the following conditions: the image size is sufficiently large, for example greater than 5000 pixels; there are text nodes in addition to the image node; the number of links is less than or equal to 5; and the sibling blocks have the same DOM tree structure and are themselves mixed image-text blocks.
In step S606, the remaining blocks are scanned; if the border attribute and the margin attribute of a scanned block are not 0, that block is marked as a granule candidate.
In step S607, the remaining unmarked blocks in the DOM tree are marked as granule candidates.
The web page segmentation method based on functional area identification according to an exemplary embodiment of the present invention may also include step S608. In step S608, all of the granule candidates marked in steps S604 to S607 are merged according to the merge conditions, and the candidate marks of the sub-blocks of each merged block are erased. The merge conditions are as described above.
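The order of the marking steps S604 to S607 matters: each later step only looks at blocks that no earlier step has marked. The sketch below illustrates that ordering on a flat list of blocks. It is a simplification under stated assumptions: `PageBlock`, its fields and the reason strings are hypothetical, the functional label and the image-text flag are assumed to have been computed already, and the real method walks a DOM tree rather than a list.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PageBlock:
    """Hypothetical block record; field names are illustrative only."""
    name: str
    functional_label: Optional[str] = None  # result of block labeling (S604), assumed given
    is_image_text: bool = False             # result of the image-text scan (S605), assumed given
    border_width: int = 0
    margin: int = 0
    candidate_reason: Optional[str] = None  # set once the block becomes a granule candidate


def mark_candidates(blocks: List[PageBlock]) -> List[PageBlock]:
    # S604: blocks with a functional or semantic label.
    for b in blocks:
        if b.functional_label is not None:
            b.candidate_reason = "functional"
    # S605: mixed image-text blocks among the still-unmarked blocks.
    for b in blocks:
        if b.candidate_reason is None and b.is_image_text:
            b.candidate_reason = "image_text"
    # S606: unmarked blocks whose border and margin are both non-zero (literal reading).
    for b in blocks:
        if b.candidate_reason is None and b.border_width != 0 and b.margin != 0:
            b.candidate_reason = "css"
    # S607: everything that is still unmarked.
    for b in blocks:
        if b.candidate_reason is None:
            b.candidate_reason = "remaining"
    return blocks


page = [
    PageBlock("nav", functional_label="navigation_bar", border_width=1, margin=4),
    PageBlock("card", is_image_text=True),
    PageBlock("sidebar", border_width=1, margin=10),
    PageBlock("misc"),
]
for block in mark_candidates(page):
    print(block.name, block.candidate_reason)
```

Step S608 would then run a merge pass over these candidates and clear the marks of any sub-blocks absorbed into a merged block.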
The present invention builds on the blocks of an underlying labeling system and uses high-level features such as the block labeling results to identify the functional areas of a page, thereby completing the segmentation of the web page. The invention segments the web page into regions with different functions and semantics and, on that basis, merges the remaining blocks, finally forming a single flat layer of blocks that do not intersect one another and that together cover the whole page, that is, a suitable segmentation granularity for the web page. Each granule is an independent functional, semantic or content area, and applications can use the page in units of granules. For example, a mobile browser can push content to the user granule by granule, or an importance score can be computed for each output granule and the web page content analyzed according to that score.
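The stated end result, a flat set of blocks that do not intersect and that cover the whole page, can be checked mechanically when blocks are treated as axis-aligned rectangles. The following sketch is an illustration only; the `Rect` type and the function names are hypothetical, and the patent does not describe such a check. It relies on the fact that rectangles lying inside the page with no pairwise overlap cover the page exactly when their areas sum to the page area.

```python
from itertools import combinations
from typing import NamedTuple, Sequence


class Rect(NamedTuple):
    x: int
    y: int
    w: int
    h: int


def overlaps(a: Rect, b: Rect) -> bool:
    # True when the interiors intersect; rectangles that only touch do not overlap.
    return (a.x < b.x + b.w and b.x < a.x + a.w
            and a.y < b.y + b.h and b.y < a.y + a.h)


def is_tiling(page: Rect, blocks: Sequence[Rect]) -> bool:
    inside = all(r.x >= page.x and r.y >= page.y
                 and r.x + r.w <= page.x + page.w
                 and r.y + r.h <= page.y + page.h
                 for r in blocks)
    disjoint = not any(overlaps(a, b) for a, b in combinations(blocks, 2))
    covers = sum(r.w * r.h for r in blocks) == page.w * page.h
    return inside and disjoint and covers


page = Rect(0, 0, 100, 100)
header = Rect(0, 0, 100, 20)
left = Rect(0, 20, 60, 80)
right = Rect(60, 20, 40, 80)
print(is_tiling(page, [header, left, right]))
```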
Although the present invention has been shown and described with reference to certain exemplary embodiments, those skilled in the art will understand that various changes in form and detail may be made to the invention without departing from the spirit and scope of the invention as defined by the claims and their equivalents.

Claims (14)

1. A web page segmentation method based on functional area identification, the method including:
generating a Document Object Model (DOM) tree for a web page, the DOM tree including the content to be displayed on the web page;
extracting position information and size information of the DOM tree nodes;
parsing the border attribute and the margin attribute from the Cascading Style Sheets (CSS) attributes;
labeling the web page with a web page block labeling algorithm to mark functional and semantic areas, and marking the labeled blocks as granule candidates;
scanning the remaining part of the web page for mixed image-text blocks according to the DOM tree structure, and marking the mixed image-text blocks found as granule candidates;
scanning the remaining blocks and, if the border attribute and the margin attribute of a scanned block are not 0, marking that block as a granule candidate;
marking the remaining unmarked blocks in the DOM tree as granule candidates.
2. The web page segmentation method according to claim 1, wherein the functional and semantic areas include a navigation bar, breadcrumb, copyright notice, pagination bar and picture frame.
3. The web page segmentation method according to claim 1, wherein a mixed image-text block is a block that meets the following conditions: the image size is greater than 5000 pixels; there are text nodes in addition to the image node; the number of links is less than or equal to 5; and the sibling blocks have the same DOM tree structure.
4. The web page segmentation method according to claim 1, further including: merging all of the granule candidates according to merge conditions, and erasing the candidate marks of the sub-blocks of each merged block.
5. The web page segmentation method according to claim 4, wherein, if the sub-blocks are mixed image-text blocks, the sub-blocks to be merged do not contain blocks of two or more different functional and semantic types, and the main nodes of the sub-blocks do not contain more than two types of tags among TAG_UL, TAG_TABLE and TAG_FORM, the sub-blocks are merged.
6. The web page segmentation method according to claim 4, wherein, if the sub-blocks are plain text blocks, the sub-blocks to be merged do not contain blocks of two or more different functional and semantic types, and the main nodes of the sub-blocks do not contain more than two types of tags among TAG_UL, TAG_TABLE and TAG_FORM, the sub-blocks are merged.
7. The web page segmentation method according to claim 4, wherein, if the sub-blocks have the same structure in the DOM tree, the sub-blocks to be merged do not contain blocks of two or more different functional and semantic types, and the main nodes of the sub-blocks do not contain more than two types of tags among TAG_UL, TAG_TABLE and TAG_FORM, the sub-blocks are merged.
8. A web page segmentation device based on functional area identification, the device including:
a Document Object Model (DOM) tree generation unit, which generates a DOM tree for a web page, the DOM tree including the content to be displayed on the web page;
a DOM information extraction unit, which extracts position information and size information of the DOM tree nodes;
a Cascading Style Sheets (CSS) parsing unit, which parses the border attribute and the margin attribute from the CSS attributes;
a labeling unit, which labels the web page to mark functional and semantic areas;
a mixed image-text block scanning unit, which scans the remaining part of the web page for mixed image-text blocks according to the DOM tree structure;
a remaining-block scanning unit, which scans the remaining blocks for blocks whose border attribute and margin attribute are not 0;
a granule candidate marking unit, which marks the blocks labeled by the labeling unit as granule candidates, marks the mixed image-text blocks found by the mixed image-text block scanning unit as granule candidates, marks the blocks found by the remaining-block scanning unit as granule candidates, and marks the remaining unmarked blocks in the DOM tree as granule candidates.
9. The web page segmentation device according to claim 8, wherein the functional and semantic areas include a navigation bar, breadcrumb, copyright notice, pagination bar and picture frame.
10. The web page segmentation device according to claim 8, wherein a mixed image-text block is a block that meets the following conditions: the image size is greater than 5000 pixels; there are text nodes in addition to the image node; the number of links is less than or equal to 5; and the sibling blocks have the same DOM tree structure.
11. The web page segmentation device according to claim 8, further including: a merging unit, which merges all of the granule candidates marked by the granule candidate marking unit according to merge conditions and erases the candidate marks of the sub-blocks of each merged block.
12. The web page segmentation device according to claim 11, wherein, if the sub-blocks are mixed image-text blocks, the sub-blocks to be merged do not contain blocks of two or more different functional and semantic types, and the main nodes of the sub-blocks do not contain more than two types of tags among TAG_UL, TAG_TABLE and TAG_FORM, the merging unit merges the sub-blocks.
13. The web page segmentation device according to claim 11, wherein, if the sub-blocks are plain text blocks, the sub-blocks to be merged do not contain blocks of two or more different functional and semantic types, and the main nodes of the sub-blocks do not contain more than two types of tags among TAG_UL, TAG_TABLE and TAG_FORM, the merging unit merges the sub-blocks.
14. The web page segmentation device according to claim 11, wherein, if the sub-blocks have the same structure in the DOM tree, the sub-blocks to be merged do not contain blocks of two or more different functional and semantic types, and the main nodes of the sub-blocks do not contain more than two types of tags among TAG_UL, TAG_TABLE and TAG_FORM, the merging unit merges the sub-blocks.
CN201310176551.9A 2013-05-14 2013-05-14 Web page segmentation method and device based on functional area identification Active CN103440239B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310176551.9A CN103440239B (en) 2013-05-14 2013-05-14 Web page segmentation method and device based on functional area identification

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310176551.9A CN103440239B (en) 2013-05-14 2013-05-14 Web page segmentation method and device based on functional area identification

Publications (2)

Publication Number Publication Date
CN103440239A CN103440239A (en) 2013-12-11
CN103440239B true CN103440239B (en) 2016-08-10

Family

ID=49693930

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310176551.9A Active CN103440239B (en) 2013-05-14 2013-05-14 Web page segmentation method and device based on functional area identification

Country Status (1)

Country Link
CN (1) CN103440239B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105677827B (en) * 2016-01-04 2019-03-29 百度在线网络技术(北京)有限公司 A kind of acquisition methods and device of list
CN109657208B (en) * 2017-10-10 2023-07-04 株式会社理光 Webpage similarity calculation method, device, equipment and computer readable storage medium
CN109492177B (en) * 2018-11-02 2019-12-17 中国搜索信息科技股份有限公司 web page blocking method based on web page semantic structure
CN111353115B (en) * 2018-12-24 2023-10-27 中移(杭州)信息技术有限公司 Method and device for generating snowplow map
CN112906559B (en) * 2021-02-10 2022-03-18 网易有道信息技术(北京)有限公司 Machine-implemented method for correcting formulas and related product
CN113806665A (en) * 2021-09-24 2021-12-17 刘秀萍 Webpage blocking method based on non-patterned Web data model
CN114186164B (en) * 2021-12-17 2023-06-09 北京大学 Method and system for determining and dividing boundary of webpage content block
CN118132794B (en) * 2024-05-07 2024-07-05 江西风向标智能科技有限公司 Multi-mode data partitioning method and system based on enterprise information semantic retrieval

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101251855A (en) * 2008-03-27 2008-08-27 腾讯科技(深圳)有限公司 Equipment, system and method for cleaning internet web page
CN102637172A (en) * 2011-02-10 2012-08-15 北京百度网讯科技有限公司 Webpage blocking marking method and system
CN102841920A (en) * 2012-06-30 2012-12-26 北京百度网讯科技有限公司 Method and device for extracting webpage frame information

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7171618B2 (en) * 2003-07-30 2007-01-30 Xerox Corporation Multi-versioned documents and method for creation and use thereof

Also Published As

Publication number Publication date
CN103440239A (en) 2013-12-11

Similar Documents

Publication Publication Date Title
CN103440239B (en) Web page segmentation method and device based on functional area identification
CN108228183B (en) Front-end interface code generation method and device, electronic equipment and storage medium
CN102253979B (en) Vision-based web page extracting method
CN105723358B (en) System and method for interactive website and the automatic conversion of application
US8593666B2 (en) Method and system for printing a web page
CN105446989B (en) Searching method and device, display device
CN103514234B (en) A kind of page info extracting method and device
WO2014127535A1 (en) Systems and methods for automated content generation
CN105874449A (en) Systems and methods for extracting and generating images for display content
CN103164443B (en) Method and device of picture merging
CN107423322A (en) Method and device for displaying label nesting hierarchy of webpage
CN106354697A (en) Transforming data into consumable content
Touya et al. Automatic derivation of on-demand tactile maps for visually impaired people: first experiments and research agenda
CN105094775B (en) Webpage generation method and device
CN107688557A (en) Composition method, composing system and terminal
CN104391786A (en) Webpage automatic test system and method thereof
Xu et al. Identifying semantic blocks in Web pages using Gestalt laws of grouping
CN110377777A (en) A kind of multiple mask method of picture based on deep learning and device
CN109656652A (en) Webpage graph making method, apparatus, computer equipment and storage medium
CN102982088B (en) It is a kind of for providing user the method for the feedback information on target pages
JP2022179507A (en) Automatic web content generation system
CN1786965B (en) Method for acquiring news web page text information
KR20140098908A (en) Template Selectable logo Making device.
JP2012203491A (en) Document processing device and document processing program
CN102708167B (en) Web-based semantic annotation system and Web-based semantic annotation method for high resolution SAR (synthetic aperture radar) image interpretation

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant