CN106910195A - Web page layout monitoring method and device - Google Patents

Web page layout monitoring method and device

Publication number
CN106910195A
Authority
CN
China
Prior art keywords
web page
interest
area
color
target web
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710047524.XA
Other languages
Chinese (zh)
Other versions
CN106910195B (en)
Inventor
刘楠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing QIYI Century Science and Technology Co Ltd
Original Assignee
Beijing QIYI Century Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing QIYI Century Science and Technology Co Ltd
Priority to CN201710047524.XA
Publication of CN106910195A
Application granted
Publication of CN106910195B
Legal status: Active
Anticipated expiration

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a web page layout monitoring method and device. The method comprises the following steps: extracting the background color of a normal web page, and segmenting the normal web page using the background color to generate a web page template with regions of interest; when the image size of the target web page differs from that of the normal web page, performing a forward and reverse bidirectional comparison of the target web page according to the web page template, and calculating the difference between the two within the regions of interest to obtain the state of the target web page; when the image size of the target web page is the same as that of the normal web page, performing a one-to-one forward comparison of the target web page according to the web page template, and calculating the difference between the two within the regions of interest to obtain the state of the target web page. The invention also discloses a web page layout monitoring device. The invention can replace manual work, monitor web pages automatically around the clock, and save considerable manpower and material resources.

Description

Web page layout monitoring method and device
Technical field
The present invention relates to the technical field of image processing, and more particularly to a web page layout monitoring method and device.
Background
In the Internet era, the network has become a mainstream medium, and large numbers of users browse web pages every day to obtain vast amounts of information. The quality of each website and web page has therefore become an important measure of user experience. Web page quality covers not only subjective criteria such as UI design and the ease of obtaining information, but also objective criteria such as page stability. If a fault in the page code causes display errors on the user's side, the user experience suffers badly. Web pages are now updated very quickly, and it cannot be guaranteed that no update will introduce display errors through human negligence or similar causes. Monitoring for such problems manually requires people to watch all pages continuously, 24 hours a day and 7 days a week, which is time-consuming and laborious.
Summary of the invention
The main object of the present invention is to provide a web page layout monitoring method and device, intended to solve the technical problem in the prior art that all web pages must be monitored manually and continuously, which is time-consuming and laborious.
To achieve the above object, the present invention provides a web page layout monitoring method comprising the following steps:
extracting the background color of a normal web page, and segmenting the normal web page using the background color to generate a web page template with regions of interest;
when the image size of the target web page differs from that of the normal web page, performing a forward and reverse bidirectional comparison of the target web page according to the web page template, and calculating the difference between the two within the regions of interest to obtain the state of the target web page;
when the image size of the target web page is the same as that of the normal web page, performing a one-to-one forward comparison of the target web page according to the web page template, and calculating the difference between the two within the regions of interest to obtain the state of the target web page.
Preferably, the step of extracting the background color of the normal web page, segmenting the normal web page using the background color, and generating a web page template with regions of interest comprises:
converting the input image of the normal web page to grayscale space or to any color space that separates luminance from color;
convolving a horizontal edge gradient operator and a vertical edge gradient operator with the image in the grayscale space or the luminance-separated color space to obtain a horizontal edge map Eh and a vertical edge map Ev;
projecting each row of the horizontal edge map Eh in the horizontal direction to obtain a horizontal histogram Hedge;
computing the color histogram Hcolor of the image pixels P(x, y) with reference to the histogram Hedge;
obtaining two background colors of the image, colorbg1 and colorbg2;
segmenting the image along a first direction using the background color colorbg1;
segmenting each resulting region along a second direction, and then segmenting each region obtained from the second-direction segmentation along the first direction again using the background color colorbg1, to obtain several regions of interest; wherein the first direction is horizontal and the second direction is vertical, or the first direction is vertical and the second direction is horizontal;
comparing each pixel P(x, y) in each region of interest with the background color colorbg1, setting the position to the statistical range if P(x, y) = colorbg1 and to the non-statistical range otherwise; and setting the pixel positions in non-interest regions to the non-statistical range, to obtain the final web page template with regions of interest.
Preferably, the step of extracting the background color of the normal web page, segmenting the normal web page using the background color, and generating a web page template with regions of interest further comprises:
judging whether every pixel P(x, y) in the first row of the text portion of the normal web page image satisfies P(x, y) = colorbg2; if so, checking the next row; otherwise recording the position as the first start position;
when every pixel P(x, y) in a subsequent row satisfies P(x, y) = colorbg2, recording that row as the first end position;
repeating the above steps to obtain a second start position and a second end position, and stopping the search once the second start position and the second end position have been obtained;
setting the rows from the first start position to the first end position and the rows from the second start position to the second end position as regions of interest.
Preferably, the step of performing, when the image size of the target web page differs from that of the normal web page, a forward and reverse bidirectional comparison of the target web page according to the web page template, and calculating the difference between the two within the regions of interest to obtain the state of the target web page, comprises:
if the height of the target web page is less than the height of the normal web page, calculating, at the position of the first region of interest of the target web page, the difference between the color of the target web page pixels and the template color colorbg1;
taking horizontal rows as the unit, calculating from top to bottom, for each region of interest in each row, the difference between the target web page pixel color and the template color colorbg1, and determining the problem type according to preset rules;
ending the forward comparison when it finds a problem, and starting the reverse comparison.
Preferably, the step of performing, when the image size of the target web page is the same as that of the normal web page, a one-to-one forward comparison of the target web page according to the web page template, and calculating the difference between the two within the regions of interest to obtain the state of the target web page, comprises:
taking horizontal rows as the unit, calculating, for each region of interest in each row, the difference between the color of the target web page pixels and the template color colorbg1, and determining the problem type according to preset rules;
if the comparison finds no problem, performing a second comparison on each region of interest in each horizontal row, and recording, for each region of interest, the difference between the color of the target web page pixels and the template color colorbg2.
The present invention also provides a web page layout monitoring device, comprising:
a normal web page template generation module, configured to extract the background color of a normal web page and to segment the normal web page using the background color to generate a web page template with regions of interest;
an unequal-size page comparison module, configured to perform, when the image size of the target web page differs from that of the normal web page, a forward and reverse bidirectional comparison of the target web page according to the web page template, and to calculate the difference between the two within the regions of interest to obtain the state of the target web page;
an equal-size page comparison module, configured to perform, when the image size of the target web page is the same as that of the normal web page, a one-to-one forward comparison of the target web page according to the web page template, and to calculate the difference between the two within the regions of interest to obtain the state of the target web page.
Preferably, the normal web page template generation module comprises:
a conversion unit, configured to convert the input image of the normal web page to grayscale space or to any color space that separates luminance from color;
an arithmetic unit, configured to convolve a horizontal edge gradient operator and a vertical edge gradient operator with the image in the grayscale space or the luminance-separated color space to obtain a horizontal edge map Eh and a vertical edge map Ev;
a histogram acquisition unit, configured to project each row of the horizontal edge map Eh in the horizontal direction to obtain a horizontal histogram Hedge;
a statistics unit, configured to compute the color histogram Hcolor of the image pixels P(x, y) with reference to the histogram Hedge;
a background color acquisition unit, configured to obtain two background colors of the image, colorbg1 and colorbg2;
a cutting unit, configured to segment the image along a first direction using the background color colorbg1;
a region-of-interest acquisition unit, configured to segment each resulting region along a second direction, and then to segment each region obtained from the second-direction segmentation along the first direction again using the background color colorbg1, to obtain several regions of interest; wherein the first direction is horizontal and the second direction is vertical, or the first direction is vertical and the second direction is horizontal;
a template value setting unit, configured to compare each pixel P(x, y) in each region of interest with the background color colorbg1, to set the position to the statistical range if P(x, y) = colorbg1 and to the non-statistical range otherwise, and to set the pixel positions in non-interest regions to the non-statistical range, to obtain the final web page template with regions of interest.
Preferably, the normal web page template generation module further comprises:
a caption judging unit, configured to judge whether every pixel P(x, y) in the first row of the text portion of the normal web page image satisfies P(x, y) = colorbg2; if so, to check the next row; otherwise to record the position as the first start position;
when every pixel P(x, y) in a subsequent row satisfies P(x, y) = colorbg2, to record that row as the first end position;
to repeat the above steps to obtain a second start position and a second end position, and to stop the search once the second start position and the second end position have been obtained;
and to set the rows from the first start position to the first end position and the rows from the second start position to the second end position as regions of interest.
Preferably, the unequal-size page comparison module is configured to:
if the height of the target web page is less than the height of the normal web page, calculate, at the position of the first region of interest of the target web page, the difference between the color of the target web page pixels and the template color colorbg1;
taking horizontal rows as the unit, calculate from top to bottom, for each region of interest in each row, the difference between the target web page pixel color and the template color colorbg1, and determine the problem type according to preset rules;
end the forward comparison when it finds a problem, and start the reverse comparison.
Preferably, the equal-size page comparison module is configured to:
taking horizontal rows as the unit, calculate, for each region of interest in each row, the difference between the color of the target web page pixels and the template color colorbg1, and determine the problem type according to preset rules;
if the comparison finds no problem, perform a second comparison on each region of interest in each horizontal row, and record, for each region of interest, the difference between the color of the target web page pixels and the template color colorbg2.
The web page layout monitoring method proposed by the present invention can generate the web page template automatically from the normal web page image alone, without any additional information being supplied manually. This strengthens the generality and automation of the algorithm: the image is analyzed and the template generated without manual intervention, and all types of web pages can be processed. The method can replace manual work, monitor web pages automatically 24 hours a day and 7 days a week, and save considerable manpower and material resources.
Brief description of the drawings
Fig. 1 is a schematic flowchart of one embodiment of the web page layout monitoring method of the present invention;
Fig. 2 is a schematic flowchart of the step of generating a web page template with regions of interest in the web page layout monitoring method of the present invention;
Fig. 3 is a schematic diagram of the form and effect of the horizontal histogram in the present invention;
Fig. 4 is a schematic diagram of a sample web page template with regions of interest and the corresponding normal web page;
Fig. 5 is a schematic flowchart of the unequal-size page comparison in one embodiment of the web page layout monitoring method of the present invention;
Fig. 6 is a schematic diagram of the process of the unequal-size page comparison in one embodiment of the web page layout monitoring method of the present invention;
Fig. 7 is a schematic flowchart of the equal-size page comparison in one embodiment of the web page layout monitoring method of the present invention;
Fig. 8 is a schematic diagram of the module structure of one embodiment of the web page layout monitoring device of the present invention;
Fig. 9 is a schematic structural diagram of the normal web page template generation module in one embodiment of the web page layout monitoring device of the present invention.
The realization of the object, the functional characteristics, and the advantages of the present invention will be further described with reference to the accompanying drawings in conjunction with the embodiments.
Detailed description of the embodiments
It should be understood that the specific embodiments described herein merely illustrate the present invention and are not intended to limit it.
The present invention provides a web page layout monitoring method. Referring to Fig. 1, in one embodiment the web page layout monitoring method comprises:
Step S10: extracting the background color of a normal web page, and segmenting the normal web page using the background color to generate a web page template with regions of interest;
In this embodiment, segmenting the normal web page using the background color can be understood as cutting the image of the normal web page into sections. A region of interest is a target region to be compared; it may be an image area or a text block, and there may be several of them.
Step S20: when the image size of the target web page differs from that of the normal web page, performing a forward and reverse bidirectional comparison of the target web page according to the web page template, and calculating the difference between the two within the regions of interest to obtain the state of the target web page.
The purpose of monitoring is to point out what is wrong with the web page and where. In this embodiment, after the web page template has been generated, each region of the target web page is compared with the regions of interest of the web page template. According to the comparison result, the target web page is classified into one of seven categories: normal, style disorder, full-width row missing, full-width row content missing, picture missing, picture not loaded, and text missing; the position of the problem in the target web page is also reported.
Step S30: when the image size of the target web page is the same as that of the normal web page, performing a one-to-one forward comparison of the target web page according to the web page template, and calculating the difference between the two within the regions of interest to obtain the state of the target web page.
The web page layout monitoring method proposed by the present invention can generate the web page template automatically from the normal web page image alone, without any additional information being supplied manually. This strengthens the generality and automation of the algorithm: the image is analyzed and the template generated without manual intervention, and all types of web pages can be processed. The method can replace manual work, monitor web pages automatically 24 hours a day and 7 days a week, and save considerable manpower and material resources.
Referring to Fig. 2, in a preferred embodiment of the present invention, the aforementioned step S10 may comprise:
Step S11: converting the input image of the normal web page to grayscale space or to any color space that separates luminance from color;
Specifically, the input image can be converted from the RGB color space to grayscale or to any color space that separates luminance from color (such as YUV, HSV, HSL, or LAB). The conversion formula for grayscale is: Gray = R*0.299 + G*0.587 + B*0.114. For a luminance-separated color space, taking HSL as an example, the conversion formula for the lightness L is: L = (max(R, G, B) + min(R, G, B)) / 2.
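The two conversion formulas above can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation; the function names and the representation of an image as rows of (R, G, B) tuples are assumptions made for the example.

```python
def to_gray(rgb):
    """Weighted grayscale value of one (R, G, B) pixel, using the weights given above."""
    r, g, b = rgb
    return r * 0.299 + g * 0.587 + b * 0.114


def to_lightness(rgb):
    """HSL lightness of one (R, G, B) pixel: mean of the largest and smallest channel."""
    return (max(rgb) + min(rgb)) / 2


def convert_image(image, pixel_fn=to_gray):
    """Apply a per-pixel conversion to an image stored as rows of (R, G, B) tuples."""
    return [[pixel_fn(pixel) for pixel in row] for row in image]
```

Passing `pixel_fn=to_lightness` switches the same loop to the HSL lightness channel.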
Step S12: convolving a horizontal edge gradient operator and a vertical edge gradient operator with the image in the grayscale space or the luminance-separated color space to obtain a horizontal edge map Eh and a vertical edge map Ev;
The horizontal and vertical edge gradient operators are convolved with the converted image to obtain the horizontal edge map Eh and the vertical edge map Ev. The Sobel operator is taken as an example of the horizontal and vertical edge gradient operators; other operators are equally applicable.
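A minimal pure-Python sketch of this step with the standard 3x3 Sobel kernels follows. It assumes that Eh should respond to horizontal edges (intensity changing from row to row) and Ev to vertical edges, which the text does not state explicitly; the names `SOBEL_H`, `SOBEL_V`, and `filter3x3` are hypothetical, and border pixels are simply left at zero.

```python
# Standard Sobel kernels. Assumption: Eh uses the kernel that responds to
# horizontal edges, Ev the one that responds to vertical edges.
SOBEL_H = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]
SOBEL_V = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]


def filter3x3(gray, kernel):
    """Absolute 3x3 filter response at every interior pixel; borders stay 0.

    The kernel is applied without flipping. For Sobel kernels a flip only
    changes the sign, so the absolute value equals the true convolution magnitude.
    """
    height, width = len(gray), len(gray[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    acc += kernel[dy + 1][dx + 1] * gray[y + dy][x + dx]
            out[y][x] = abs(acc)
    return out
```

Calling `filter3x3(gray, SOBEL_H)` and `filter3x3(gray, SOBEL_V)` on the converted image would then give candidate Eh and Ev maps.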
Step S13: projecting each row of the horizontal edge map Eh in the horizontal direction to obtain a horizontal histogram Hedge;
To exclude the influence of vertical edges, only horizontal edges are counted: for any point (x, y) on the image, an edge is counted only if Eh(x, y) > Th1 && Ev(x, y) < Th2.
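The row projection with the two thresholds can be sketched as below. The function name is hypothetical and the text gives no values for Th1 and Th2, so any numbers passed in are placeholders chosen by the caller.

```python
def horizontal_edge_histogram(eh, ev, th1, th2):
    """Hedge[y]: number of points in row y with Eh(x, y) > th1 and Ev(x, y) < th2."""
    return [
        sum(1 for e_h, e_v in zip(row_h, row_v) if e_h > th1 and e_v < th2)
        for row_h, row_v in zip(eh, ev)
    ]
```

A row with `Hedge[y] == 0` contains no qualifying horizontal edge points.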
Step S14: computing the color histogram Hcolor of the image pixels P(x, y) with reference to the histogram Hedge;
The color histogram Hcolor of the pixels P(x, y) in the image is computed; a pixel is included in the statistics of Hcolor if and only if Hedge[y] == 0 in the horizontal edge histogram. In this embodiment, the form and effect of the histogram are shown in Fig. 3.
Step S15: obtaining two background colors of the image, colorbg1 and colorbg2;
The two main background colors of the image, colorbg1 and colorbg2, are obtained by taking the color at which Hcolor reaches its maximum as colorbg1 and the color at which Hcolor reaches its second-largest value as colorbg2. Physically, these two main background colors are the background color of the page as a whole and the background color of the frames around pictures.
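Steps S14 and S15 can be sketched together with `collections.Counter`. The function name is hypothetical, pixels are assumed to be hashable values such as (R, G, B) tuples, and how ties between equally frequent colors are broken is not specified in the text.

```python
from collections import Counter


def background_colors(image, hedge):
    """Return (colorbg1, colorbg2): the two most frequent colors among pixels
    lying in rows whose horizontal edge count Hedge[y] is zero."""
    hcolor = Counter()
    for y, row in enumerate(image):
        if hedge[y] == 0:          # only edge-free rows contribute to Hcolor
            hcolor.update(row)
    ranked = [color for color, _ in hcolor.most_common(2)]
    ranked += [None] * (2 - len(ranked))   # pad if fewer than two colors were seen
    return ranked[0], ranked[1]
```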
Step S16: segmenting the image along a first direction using the background color colorbg1;
In the present invention, the first direction may be horizontal or vertical, and the second direction is correspondingly vertical or horizontal. The technical solution is described below taking the first direction as horizontal and the second direction as vertical. For example, the image is cut horizontally using the main background color colorbg1: if all pixels P(x, y) of the next row equal colorbg1 but the current row does not meet this condition, this position is taken as a segmentation start position; if all pixels P(x, y) of a row equal colorbg1 but the next row does not meet this condition, this position is taken as a segmentation end position. Step S16 thus yields the horizontal pre-segmentation positions of the image, dividing it into several horizontally segmented regions.
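The sketch below shows one reading of the row test described above: rows consisting entirely of colorbg1 are treated as separators, and each run of remaining rows becomes one horizontal region. The function name and the half-open (start, end) row ranges are assumptions made for the example.

```python
def split_rows(image, colorbg1):
    """Return half-open (start, end) row ranges separated by all-background rows."""
    bands, start = [], None
    for y, row in enumerate(image):
        is_background = all(pixel == colorbg1 for pixel in row)
        if not is_background and start is None:
            start = y                      # first content row after a separator
        elif is_background and start is not None:
            bands.append((start, y))       # separator row closes the current band
            start = None
    if start is not None:                  # content runs to the bottom of the image
        bands.append((start, len(image)))
    return bands
```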
Step S17: segmenting each resulting region along a second direction, and then segmenting each region obtained from the second-direction segmentation along the first direction again using the background color colorbg1, to obtain several regions of interest; wherein the first direction is horizontal and the second direction is vertical, or the first direction is vertical and the second direction is horizontal;
For example, each horizontally segmented region is segmented in the vertical direction: if all pixels P(x, y) of the next column equal colorbg1 but the current column does not meet this condition, this position is taken as a segmentation start position; if all pixels P(x, y) of a column equal colorbg1 but the next column does not meet this condition, this position is taken as a segmentation end position. Step S17 thus yields the vertical pre-segmentation positions within each horizontally segmented region, dividing each such region into several vertically segmented regions.
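The vertical pass inside one horizontal band can be sketched in the same way, treating columns that are entirely colorbg1 within the band as separators. The function name, the (top, bottom) band tuple, and the (left, top, right, bottom) rectangle format are assumptions; the further first-direction pass mentioned in step S17 would be applied to each returned rectangle and is not shown.

```python
def split_columns(image, band, colorbg1):
    """Split one horizontal band (top, bottom) into rectangles
    (left, top, right, bottom), using all-background columns as separators."""
    top, bottom = band
    width = len(image[0])
    regions, start = [], None
    for x in range(width):
        is_background = all(image[y][x] == colorbg1 for y in range(top, bottom))
        if not is_background and start is None:
            start = x
        elif is_background and start is not None:
            regions.append((start, top, x, bottom))
            start = None
    if start is not None:                  # content runs to the right edge
        regions.append((start, top, width, bottom))
    return regions
```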
Step S18: comparing each pixel P(x, y) in each region of interest with the background color colorbg1, setting the position to the statistical range if P(x, y) = colorbg1 and to the non-statistical range otherwise; and setting the pixel positions in non-interest regions to the non-statistical range, to obtain the final web page template with regions of interest.
Each pixel P(x, y) in each region of interest is compared with colorbg1. If P(x, y) = colorbg1, the position is set to the statistical range (specifically, by setting the template value of the pixel to 0); otherwise it is set to the non-statistical range (specifically, by setting the template value of the pixel to 255). Pixel positions in non-interest regions are set to the non-statistical range. This yields the final web page template with regions of interest.
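The template construction can be sketched as a mask with the values 0 and 255 named above. The function name and the (left, top, right, bottom) rectangle format for regions of interest are assumptions made for the example.

```python
def build_template(image, regions, colorbg1):
    """Template mask: 0 (statistical range) where a region-of-interest pixel
    equals colorbg1, 255 (non-statistical range) everywhere else."""
    height, width = len(image), len(image[0])
    template = [[255] * width for _ in range(height)]
    for left, top, right, bottom in regions:
        for y in range(top, bottom):
            for x in range(left, right):
                if image[y][x] == colorbg1:
                    template[y][x] = 0
    return template
```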
In one embodiment, the aforementioned step S10 may further comprise:
judging whether every pixel P(x, y) in the first row of the text portion of the normal web page image satisfies P(x, y) = colorbg2; if so, checking the next row; otherwise recording the position as the first start position;
when every pixel P(x, y) in a subsequent row satisfies P(x, y) = colorbg2, recording that row as the first end position; repeating the above steps to obtain a second start position and a second end position, and stopping the search once they have been obtained; and setting the rows from the first start position to the first end position and the rows from the second start position to the second end position as regions of interest. The positions found are the positions where the captions appear.
For each region of interest obtained by the above steps, the search for text regions starts from the bottom. For each row of pixels from the bottom up, if every pixel P(x, y) in the row satisfies P(x, y) = colorbg2, the next row is checked; otherwise the first start position is recorded. The search then continues upward until a row in which every pixel satisfies P(x, y) = colorbg2 is found, which is recorded as the first end position. The second start position and second end position are found in the same way, after which the search stops; the positions found are the positions where the captions appear. In this embodiment, a sample page template and the corresponding normal web page are shown in Fig. 4.
Referring to Fig. 5, in a preferred embodiment, the aforementioned step S20 comprises:
Step S21: if the height of the target web page is less than the height of the normal web page, calculating, at the position of the first region of interest of the target web page, the difference between the color of the target web page pixels and the template color colorbg1;
In this embodiment, if the height of the input target web page is greater than the height of the normal web page, or the widths of the two are unequal, style disorder is output and the algorithm ends. If the height of the input target web page is less than the height of the normal web page, the difference between the target web page pixel color and the template color colorbg1 is calculated at the position of the first region of interest of the page (the title bar of the web page). The difference is defined as the number of target web page pixels in the region of interest with P(x, y) = colorbg1. If the difference exceeds a certain threshold, the error type style disorder and the coordinates of this region are output and the algorithm ends; otherwise the method proceeds to step S22.
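The size check and the difference count can be sketched as follows. The sketch follows the definition of the difference exactly as stated, counting target pixels in the region equal to the given color; the text does not spell out how the template's statistical range enters the count, so that is left out here. The function names, the returned labels, and the rectangle format are assumptions.

```python
def size_check(target_size, normal_size):
    """Choose the comparison path from the (width, height) of the two page images."""
    (target_w, target_h), (normal_w, normal_h) = target_size, normal_size
    if target_w != normal_w or target_h > normal_h:
        return "style disorder"
    if target_h < normal_h:
        return "bidirectional comparison"
    return "one-to-one comparison"


def region_diff(target, region, color):
    """Count target pixels inside region (left, top, right, bottom) equal to color."""
    left, top, right, bottom = region
    bottom = min(bottom, len(target))      # the target may be shorter than the template
    return sum(
        1
        for y in range(top, bottom)
        for x in range(left, right)
        if target[y][x] == color
    )
```

A caller would compare `region_diff(target, first_region, colorbg1)` against its chosen threshold before moving on to step S22.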
Step S22: taking horizontal rows as the unit, calculating from top to bottom, for each region of interest in each row, the difference between the target web page pixel color and the template color colorbg1, and determining the problem type according to preset rules;
In this embodiment, the rules may be:
(1) diff > Thhigh: a full-width row content missing problem has occurred; record the region position.
(2) diff > Thmedian: an image missing problem has occurred; record the region position.
(3) diff > Thlow: record the number of regions in which this problem occurs.
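The three rules can be sketched as a cascade. This assumes the rules are checked in the order listed and that Thhigh > Thmedian > Thlow, neither of which the text states explicitly; the function name and the label used for rule (3) are hypothetical.

```python
def classify_region(diff, th_high, th_median, th_low):
    """Map one region's difference to a problem label, checking rules (1)-(3) in order."""
    if diff > th_high:
        return "full-width row content missing"   # rule (1)
    if diff > th_median:
        return "image missing"                    # rule (2)
    if diff > th_low:
        return "minor difference"                 # rule (3): only the count is recorded
    return None                                   # no problem detected
```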
Step S23: ending the forward comparison when it finds a problem, and starting the reverse comparison;
After the regions of each row have been compared, if problem (1) above occurs in every region of the row, the full-width row is deemed missing, the forward comparison ends, and the reverse comparison is performed to look for other problems; otherwise the method returns to step S22 until all rows have been compared. In the reverse comparison, comparison proceeds upward starting from the regions of interest in the last row of the template: for each region of interest in each row, the difference between the target web page pixel color and the template color colorbg1 is calculated in the same way as in step S22, and the problem that has occurred is determined according to the rules. After the regions of each row have been compared, if problem (1) above occurs in every region, the full-width row is deemed missing, the reverse comparison ends, and all errors found are output; otherwise the method returns to step S23 and continues comparing row by row. If problem (3) occurs in every row after comparison, style disorder is output. Finally, all error cases are output. In this embodiment, the comparison process and the output of error cases are shown in Fig. 6.
Referring to Fig. 7, in one embodiment, the aforementioned step S30 may comprise:
Step S31: taking horizontal rows as the unit, calculating, for each region of interest in each row, the difference between the color of the target web page pixels and the template color colorbg1, and determining the problem type according to preset rules;
For example, taking horizontal rows as the unit, the difference between the target web page pixel color and the template color colorbg1 is calculated for each region of interest in each row, and the problem that has occurred is determined according to the rules:
(1) diff > Thhigh: a full-width row content missing problem has occurred; record the region position.
(2) diff > Thmedian: an image missing problem has occurred; record the region position.
(3) diff > Thlow: record the number of regions in which this problem occurs.
If the problem described in (1) occurs in every region of a row, the error is recorded as full-width row missing. If not all regions of interest have been compared, step S31 continues until all have been compared. If, after the comparison is finished, every region of interest in every row shows a problem from (1)-(3), style disorder is output together with each error position; otherwise the method proceeds to step S32.
Step S32: if the comparison finds no problem, performing a second comparison on each region of interest in each horizontal row, and recording, for each region of interest, the difference between the color of the target web page pixels and the template color colorbg2.
If the above comparison finds no problem, a second comparison is performed on each region of interest in each horizontal row, recording for each region of interest the difference diff between the target web page pixel color and the template color colorbg2. If diff > Thhigh, a picture not loaded problem has occurred and the region position is recorded. For each recorded caption position, the difference diff between the target web page pixel color and the template color colorbg2 is likewise recorded; if diff > Thhigh, a caption missing problem has occurred and the region position is recorded. If no problem is found, the web page is reported as normal and the algorithm ends.
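The second comparison can be sketched as below, again following the difference definition used earlier (a count of target pixels equal to the template color, here colorbg2). The function name, the labels, and the representation of both picture regions and caption positions as (left, top, right, bottom) rectangles are assumptions; the text describes caption positions as row ranges.

```python
def second_pass(target, regions, caption_regions, colorbg2, th_high):
    """Return a list of (problem, region) pairs; an empty list means the page is normal."""
    def diff(region):
        # Number of target pixels in the rectangle that equal colorbg2.
        left, top, right, bottom = region
        return sum(
            1
            for y in range(top, bottom)
            for x in range(left, right)
            if target[y][x] == colorbg2
        )

    problems = [("picture not loaded", r) for r in regions if diff(r) > th_high]
    problems += [("caption missing", r) for r in caption_regions if diff(r) > th_high]
    return problems
```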
The present invention also provides a web page layout monitoring device for implementing the above method. The web page layout monitoring device of the present invention is implemented as a computer program; for its implementation process, refer to the embodiments shown in FIGS. 1 to 7 above. The function and principle of each module correspond to the steps of the foregoing embodiments and are not described again here. Referring to FIG. 8, in one embodiment, the web page layout monitoring device includes:
Normal web page template generation module 10, configured to extract the background color of a normal web page, segment the normal web page using the background color, and generate a web page template with regions of interest;
Unequal-size page comparison module 20, configured to, when the target web page and the normal web page have different image sizes, perform a forward and reverse bidirectional comparison on the target web page according to the web page template and calculate the difference between the two within the regions of interest, to obtain the state of the target web page;
Equal-size page comparison module 30, configured to, when the target web page and the normal web page have the same image size, perform a one-to-one forward comparison on the target web page according to the web page template and calculate the difference between the two within the regions of interest, to obtain the state of the target web page.
Referring to FIG. 9, in one embodiment, the normal web page template generation module 10 includes:
Conversion unit 11, configured to convert the input image of the normal web page into grayscale space or any color space that separates luminance from chrominance;
Operation unit 12, configured to convolve the grayscale space or luminance-chrominance separated space with a horizontal edge gradient operator and a vertical edge gradient operator, obtaining a horizontal edge map Eh and a vertical edge map Ev;
Histogram acquiring unit 13, configured to project each row of the horizontal edge map Eh in the horizontal direction, obtaining the horizontal histogram Hedge;
Statistics unit 14, configured to count the color histogram Hcolor of the image pixels P(x, y) in the histogram Hedge;
Background color acquiring unit 15, configured to obtain the two background colors colorbg1 and colorbg2 of the image;
Segmentation unit 16, configured to segment the image in a first direction using the background color colorbg1;
Region-of-interest acquiring unit 17, configured to segment each resulting region in a second direction and, for each region obtained from the second-direction segmentation, segment the image again in the first direction using the background color colorbg1, obtaining several regions of interest; wherein the first direction is the horizontal direction and the second direction is the vertical direction, or the first direction is the vertical direction and the second direction is the horizontal direction;
Template value setting unit 18, configured to compare every pixel P(x, y) in each region of interest with the background color colorbg1; if P(x, y) = colorbg1, the position is set as statistics range, otherwise as non-statistics range; pixel positions outside the regions of interest are set as non-statistics range, yielding the final web page template with regions of interest.
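Two of the units above lend themselves to a short sketch: the row projection of unit 13 and the template value setting of unit 18. The code below is an illustrative assumption, not the patented implementation: images are nested lists, an edge map is a grid of 0/1 values, regions of interest are `(top, left, bottom, right)` boxes with exclusive bottom and right bounds, and the function names are hypothetical.

```python
def row_projection(edge_map):
    """Sum the edge responses of every row (a simple stand-in for the histogram Hedge)."""
    return [sum(row) for row in edge_map]


def build_template_mask(image, rois, colorbg1):
    """Mark the statistics range (1) and the non-statistics range (0).

    A pixel belongs to the statistics range only if it lies inside a region
    of interest and equals the background color colorbg1.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    mask = [[0] * width for _ in range(height)]  # default: non-statistics range
    for top, left, bottom, right in rois:
        for y in range(top, bottom):
            for x in range(left, right):
                if image[y][x] == colorbg1:
                    mask[y][x] = 1
    return mask
```

Restricting later comparisons to positions where the mask is 1 means that only pixels expected to show the background color are measured, which is why a large difference there can be read as a layout problem.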
In one embodiment, the normal web page template generation module 10 may further include:
Caption judging unit 19, configured to judge whether every pixel P(x, y) in the first row of the text portion of the normal web page image satisfies P(x, y) = colorbg2; if so, check the next row; otherwise record P(x, y) as the first start position;
When every pixel P(x, y) in the next row satisfies P(x, y) = colorbg2, record it as the first end position;
Repeat the above steps to obtain a second start position and a second end position, and stop searching once the second start position and the second end position have been obtained;
Set the rows from the first start position to the first end position and the rows from the second start position to the second end position as regions of interest.
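The caption search can be sketched as a single top-to-bottom scan. This is one reading of the steps above, under stated assumptions: a row counts as background when all of its pixels equal colorbg2, the end position is the first background row after a start, and the search stops after two bands. The name `find_caption_bands` and the `max_bands` parameter are hypothetical.

```python
def find_caption_bands(image, colorbg2, max_bands=2):
    """Return up to max_bands (start_row, end_row) pairs of caption rows.

    start_row is the first row containing a non-background pixel;
    end_row is the first following row made up entirely of colorbg2.
    """
    bands = []
    start = None
    for y, row in enumerate(image):
        is_background = all(pixel == colorbg2 for pixel in row)
        if start is None and not is_background:
            start = y                      # a caption band begins here
        elif start is not None and is_background:
            bands.append((start, y))       # the band ends at this background row
            start = None
            if len(bands) == max_bands:
                break                      # stop once the second band is closed
    return bands
```

A band that is still open when the image ends is not reported in this sketch; whether such a band should be closed at the last row is left open by the text.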
In one embodiment, the unequal-size page comparison module 20 is configured to:
If the height of the target web page is less than the height of the normal web page, calculate, at the position of the first region of interest of the target web page, the difference between the color of the target web page pixels and the web page template color colorbg1;
Taking each horizontal row as a unit and proceeding from top to bottom, calculate, for every region of interest in each row, the difference between the target web page pixel color and the template color colorbg1, and determine the problem type according to preset rules;
Terminate the forward comparison when it finds a problem, and start the reverse comparison.
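The forward-then-reverse strategy can be sketched as follows. The text does not spell out how the reverse pass is aligned, so this sketch assumes that the forward pass aligns the two pages at the top and the reverse pass aligns them at the bottom; `bidirectional_compare` and the `row_differs` callback are hypothetical names.

```python
def bidirectional_compare(template_rows, target_rows, row_differs):
    """Count rows that match from the top, then rows that match from the bottom.

    row_differs(template_row, target_row) returns True when a problem is found.
    Returns (rows_matched_forward, rows_matched_backward).
    """
    limit = min(len(template_rows), len(target_rows))

    # Forward pass: top to bottom, stop at the first problem.
    forward = 0
    while forward < limit and not row_differs(template_rows[forward],
                                              target_rows[forward]):
        forward += 1

    # Reverse pass: bottom to top, never re-reading rows matched by the forward pass.
    backward = 0
    while (backward < limit - forward and
           not row_differs(template_rows[-1 - backward],
                           target_rows[-1 - backward])):
        backward += 1

    return forward, backward
```

Under this reading, the rows left unmatched between the two passes bound the part of the page where content appears to be missing or shifted.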
In one embodiment, the equal-size page comparison module 30 is configured to:
Taking each horizontal row as a unit, calculate, for every region of interest in each row, the difference between the color of the target web page pixels and the template color colorbg1, and determine the problem type according to preset rules;
If the comparison finds no problem, perform a second comparison on each region of interest in each horizontal row, recording, for each region of interest, the difference between the color of the target web page pixels and the template color colorbg2.
The above are only preferred embodiments of the present invention and do not limit its patent scope. Any equivalent structure or equivalent process transformation made using the specification and drawings of the present invention, or any direct or indirect application in other related technical fields, is likewise included within the scope of patent protection of the present invention.

Claims (10)

1. A web page layout monitoring method, characterized by comprising:
Extracting the background color of a normal web page, segmenting the normal web page using the background color, and generating a web page template with regions of interest;
When a target web page and the normal web page have different image sizes, performing a forward and reverse bidirectional comparison on the target web page according to the web page template, and calculating the difference between the two within the regions of interest, to obtain the state of the target web page;
When the target web page and the normal web page have the same image size, performing a one-to-one forward comparison on the target web page according to the web page template, and calculating the difference between the two within the regions of interest, to obtain the state of the target web page.
2. The web page layout monitoring method as claimed in claim 1, characterized in that the step of extracting the background color of the normal web page, segmenting the normal web page using the background color, and generating a web page template with regions of interest comprises:
Converting the input image of the normal web page into grayscale space or any color space that separates luminance from chrominance;
Convolving the grayscale space or luminance-chrominance separated space with a horizontal edge gradient operator and a vertical edge gradient operator, to obtain a horizontal edge map Eh and a vertical edge map Ev;
Projecting each row of the horizontal edge map Eh in the horizontal direction, to obtain the horizontal histogram Hedge;
Counting the color histogram Hcolor of the image pixels P(x, y) in the histogram Hedge;
Obtaining the two background colors colorbg1 and colorbg2 of the image;
Segmenting the image in a first direction using the background color colorbg1;
Segmenting each resulting region in a second direction and, for each region obtained from the second-direction segmentation, segmenting the image again in the first direction using the background color colorbg1, to obtain several regions of interest; wherein the first direction is the horizontal direction and the second direction is the vertical direction, or the first direction is the vertical direction and the second direction is the horizontal direction;
Comparing every pixel P(x, y) in each region of interest with the background color colorbg1; if P(x, y) = colorbg1, setting the position as statistics range, otherwise as non-statistics range; and setting pixel positions outside the regions of interest as non-statistics range, to obtain the final web page template with regions of interest.
3. The web page layout monitoring method as claimed in claim 2, characterized in that the step of extracting the background color of the normal web page, segmenting the normal web page using the background color, and generating a web page template with regions of interest further comprises:
Judging whether every pixel P(x, y) in the first row of the text portion of the normal web page image satisfies P(x, y) = colorbg2; if so, checking the next row; otherwise recording P(x, y) as the first start position;
When every pixel P(x, y) in the next row satisfies P(x, y) = colorbg2, recording it as the first end position;
Repeating the above steps to obtain a second start position and a second end position, and stopping the search once the second start position and the second end position have been obtained;
Setting the rows from the first start position to the first end position and the rows from the second start position to the second end position as regions of interest.
4. The web page layout monitoring method as claimed in claim 2, characterized in that the step of, when the target web page and the normal web page have different image sizes, performing a forward and reverse bidirectional comparison on the target web page according to the web page template and calculating the difference between the two within the regions of interest to obtain the state of the target web page comprises:
If the height of the target web page is less than the height of the normal web page, calculating, at the position of the first region of interest of the target web page, the difference between the color of the target web page pixels and the web page template color colorbg1;
Taking each horizontal row as a unit and proceeding from top to bottom, calculating, for every region of interest in each row, the difference between the target web page pixel color and the template color colorbg1, and determining the problem type according to preset rules;
Terminating the forward comparison when it finds a problem, and starting the reverse comparison.
5. The web page layout monitoring method as claimed in claim 2, characterized in that the step of, when the target web page and the normal web page have the same image size, performing a one-to-one forward comparison on the target web page according to the web page template and calculating the difference between the two within the regions of interest to obtain the state of the target web page comprises:
Taking each horizontal row as a unit, calculating, for every region of interest in each row, the difference between the color of the target web page pixels and the template color colorbg1, and determining the problem type according to preset rules;
If the comparison finds no problem, performing a second comparison on each region of interest in each horizontal row, and recording, for each region of interest, the difference between the color of the target web page pixels and the template color colorbg2.
6. A web page layout monitoring device, characterized in that the web page layout monitoring device comprises:
A normal web page template generation module, configured to extract the background color of a normal web page, segment the normal web page using the background color, and generate a web page template with regions of interest;
An unequal-size page comparison module, configured to, when a target web page and the normal web page have different image sizes, perform a forward and reverse bidirectional comparison on the target web page according to the web page template and calculate the difference between the two within the regions of interest, to obtain the state of the target web page;
An equal-size page comparison module, configured to, when the target web page and the normal web page have the same image size, perform a one-to-one forward comparison on the target web page according to the web page template and calculate the difference between the two within the regions of interest, to obtain the state of the target web page.
7. The web page layout monitoring device as claimed in claim 6, characterized in that the normal web page template generation module comprises:
A conversion unit, configured to convert the input image of the normal web page into grayscale space or any color space that separates luminance from chrominance;
An operation unit, configured to convolve the grayscale space or luminance-chrominance separated space with a horizontal edge gradient operator and a vertical edge gradient operator, to obtain a horizontal edge map Eh and a vertical edge map Ev;
A histogram acquiring unit, configured to project each row of the horizontal edge map Eh in the horizontal direction, to obtain the horizontal histogram Hedge;
A statistics unit, configured to count the color histogram Hcolor of the image pixels P(x, y) in the histogram Hedge;
A background color acquiring unit, configured to obtain the two background colors colorbg1 and colorbg2 of the image;
A segmentation unit, configured to segment the image in a first direction using the background color colorbg1;
A region-of-interest acquiring unit, configured to segment each resulting region in a second direction and, for each region obtained from the second-direction segmentation, segment the image again in the first direction using the background color colorbg1, to obtain several regions of interest; wherein the first direction is the horizontal direction and the second direction is the vertical direction, or the first direction is the vertical direction and the second direction is the horizontal direction;
A template value setting unit, configured to compare every pixel P(x, y) in each region of interest with the background color colorbg1; if P(x, y) = colorbg1, the position is set as statistics range, otherwise as non-statistics range; pixel positions outside the regions of interest are set as non-statistics range, to obtain the final web page template with regions of interest.
8. The web page layout monitoring device as claimed in claim 7, characterized in that the normal web page template generation module further comprises:
A caption judging unit, configured to judge whether every pixel P(x, y) in the first row of the text portion of the normal web page image satisfies P(x, y) = colorbg2; if so, check the next row; otherwise record P(x, y) as the first start position;
When every pixel P(x, y) in the next row satisfies P(x, y) = colorbg2, record it as the first end position;
Repeat the above steps to obtain a second start position and a second end position, and stop searching once the second start position and the second end position have been obtained;
Set the rows from the first start position to the first end position and the rows from the second start position to the second end position as regions of interest.
9. The web page layout monitoring device as claimed in claim 7, characterized in that the unequal-size page comparison module is configured to:
If the height of the target web page is less than the height of the normal web page, calculate, at the position of the first region of interest of the target web page, the difference between the color of the target web page pixels and the web page template color colorbg1;
Taking each horizontal row as a unit and proceeding from top to bottom, calculate, for every region of interest in each row, the difference between the target web page pixel color and the template color colorbg1, and determine the problem type according to preset rules;
Terminate the forward comparison when it finds a problem, and start the reverse comparison.
10. The web page layout monitoring device as claimed in claim 7, characterized in that the equal-size page comparison module is configured to:
Taking each horizontal row as a unit, calculate, for every region of interest in each row, the difference between the color of the target web page pixels and the template color colorbg1, and determine the problem type according to preset rules;
If the comparison finds no problem, perform a second comparison on each region of interest in each horizontal row, recording, for each region of interest, the difference between the color of the target web page pixels and the template color colorbg2.
CN201710047524.XA 2017-01-22 2017-01-22 Webpage layout monitoring method and device Active CN106910195B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710047524.XA CN106910195B (en) 2017-01-22 2017-01-22 Webpage layout monitoring method and device

Publications (2)

Publication Number Publication Date
CN106910195A true CN106910195A (en) 2017-06-30
CN106910195B CN106910195B (en) 2020-06-16

Family

ID=59206823

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710047524.XA Active CN106910195B (en) 2017-01-22 2017-01-22 Webpage layout monitoring method and device

Country Status (1)

Country Link
CN (1) CN106910195B (en)

Also Published As

Publication number Publication date
CN106910195B (en) 2020-06-16

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant