CN102708371A - Method for recognizing and automatically sequencing comic frames according to segmenting lines - Google Patents
- Publication number
- CN102708371A
- Authority
- CN
- China
- Prior art keywords
- coma
- caricature
- line
- cut
- frame
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Image Analysis (AREA)
Abstract
Description
Technical field
The invention relates to a method for recognizing and automatically ordering comic frames based on dividing lines, and belongs to the field of image data processing.
Background
Like e-books, electronic comics are convenient, inexpensive, and automatically ordered, and they also convey a sense of motion. They are steadily gaining popularity, especially among young readers. Reading electronic comics on a mobile phone or PDA, where pressing the up and down keys steps through the pictures, produces an animation-like effect that readers favor. Comics differ from e-books, however: because phone and PDA screens are small, and because the story must unfold in sequence on such a screen, each comic page has to be split into smaller pictures without impairing readability.
Many electronic comic services already exist, but most of them rely on manual work to identify and segment the comic frames on each page. Research on electronic comics, in China and abroad, is still limited. Masashi Yamada et al. proposed a method that orders comic frames, and the speech balloons within them, by angle; it focuses on the correctness of the ordering and does not address frames whose dividing lines are incomplete. Takamasa Tanaka, Kenji Shoji, and Fubito Toyama proposed representing comic frames with a tree structure and analyzing the page layout by detecting straight lines with a gradient density function. That method handles incomplete dividing lines, that is, lines interrupted by a speech balloon, but it cannot run in real time. Other aspects of comic image processing have also been studied. Syeda-Mahmood et al., for example, proposed a method for recognizing text in comic images so that the text in speech balloons can be extracted and read separately. This matters for electronic comics as well, but further research is needed before it can be applied in practice.
Summary of the invention
The technical problem addressed by the invention is to recognize comic frames by testing the characteristic properties of the dividing lines between them and, based on how frames are laid out on a page, to order them by storing them in a binary tree. The approach is simple to understand and efficient to execute.
To solve this problem, the invention provides a method for recognizing and automatically ordering comic frames based on dividing lines, characterized by the following definitions:
a. Comic frame: the smallest picture unit on a comic page that cannot be divided further;
b. Comic segment: a quadrilateral composed of at least two adjacent comic frames;
The method comprises the following steps:
A. Preprocess the given comic page:
a. Binarize the comic page to obtain a binary image;
b. Run contour detection on the binary image to obtain the foreground image of the comic page;
B. Run straight-line detection on the foreground image to obtain the straight lines it contains;
C. Define the quadrilateral structure obtained after line detection as a coma; a coma is either a comic frame or a comic segment. Segmentation starts from the largest coma. If the coma satisfies any one of the following conditions, it is judged indivisible, its picture is a comic frame, and step G is executed; otherwise step D is executed:
a. There is no straight line inside the coma;
b. The ratio of the area of the largest foreground image inside the coma to the area of the coma is less than a threshold m, where 0 < m < 0.25;
c. The ratio of the number of pixels in the largest foreground region inside the coma to the number of all pixels inside the coma is less than a threshold n, where 0 < n < 0.2;
d. The ratio of the length and width of the largest foreground region inside the coma to the corresponding length and width of the coma is less than a threshold p, where 0.08 < p < 0.2;
D. From the detected straight lines, select the dividing lines. A dividing line must satisfy both of the following conditions:
a. It intersects only two opposite sides of the coma and has no intersection with the other sides;
b. Its intersections with the coma lie on the coma's edge lines;
E. Screen the dividing lines one by one in descending order of weight, and select a dividing line that has no more than 3 intersections with other straight lines; this is the standard dividing line;
F. Within a width of HW pixels on each side of the standard dividing line, select the straight lines that intersect no other straight line, or at most two; from these, choose the one with the fewest intersections with the foreground image as the best position to split the coma; this is the best dividing line. If no line qualifies, return to step E and select the next qualifying dividing line according to the dividing-line weights;
G. Store the segmented comas in a binary tree according to the reading order of the comic;
H. After the comic page has been split for the first time along the best dividing line, keep splitting the resulting comic segments in the same way and storing the newly produced comas until no divisible coma remains; segmentation of the page is then complete.
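Steps D and E above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions that the patent does not state: the coma is treated as an axis-aligned rectangle `(left, top, right, bottom)`, each detected line is a segment with a `weight` field, and only proper crossings between segments are counted. The names `Line`, `spans_opposite_sides`, `intersects`, and `pick_standard_line` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Line:
    x1: float
    y1: float
    x2: float
    y2: float
    weight: float  # assumed to come from the line detector


def spans_opposite_sides(line, coma):
    """Step D: both endpoints lie on two opposite edges of the rectangle."""
    left, top, right, bottom = coma
    top_bottom = ({line.y1, line.y2} == {top, bottom}
                  and left < line.x1 < right and left < line.x2 < right)
    left_right = ({line.x1, line.x2} == {left, right}
                  and top < line.y1 < bottom and top < line.y2 < bottom)
    return top_bottom or left_right


def _orient(ax, ay, bx, by, cx, cy):
    """Sign of the cross product (b - a) x (c - a)."""
    v = (bx - ax) * (cy - ay) - (by - ay) * (cx - ax)
    return (v > 0) - (v < 0)


def intersects(a, b):
    """True if segments a and b properly cross (touching ends are ignored)."""
    d1 = _orient(a.x1, a.y1, a.x2, a.y2, b.x1, b.y1)
    d2 = _orient(a.x1, a.y1, a.x2, a.y2, b.x2, b.y2)
    d3 = _orient(b.x1, b.y1, b.x2, b.y2, a.x1, a.y1)
    d4 = _orient(b.x1, b.y1, b.x2, b.y2, a.x2, a.y2)
    return d1 * d2 < 0 and d3 * d4 < 0


def pick_standard_line(lines, coma, max_crossings=3):
    """Step E: highest-weight dividing line with at most 3 crossings."""
    candidates = [ln for ln in lines if spans_opposite_sides(ln, coma)]
    for cand in sorted(candidates, key=lambda ln: ln.weight, reverse=True):
        crossings = sum(intersects(cand, other)
                        for other in lines if other is not cand)
        if crossings <= max_crossings:
            return cand
    return None
```

A line with a high weight is still rejected when too many other lines cross it, which is how a busy stroke inside a drawing can be told apart from a panel border in this sketch.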
The comic page is binarized with a nonparametric, unsupervised threshold selection method for image segmentation: a threshold is chosen, pixel values below it are set to 0, and pixel values above it are set to 1.
Before line detection, the foreground image obtained from contour detection is thinned, so that every line wider than 1 pixel is reduced to a width of one pixel.
Whether a straight line lies inside a coma is decided as follows: if fewer than half of the line's pixels fall inside the coma, the line is considered to be outside the coma; otherwise it is considered to be inside.
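The half-of-the-pixels rule can be written as a short check. The sketch below assumes, for illustration only, that a line is given as a list of `(x, y)` pixel coordinates and the coma as an axis-aligned rectangle `(left, top, right, bottom)` with inclusive bounds; the function name is hypothetical.

```python
def line_inside_coma(line_pixels, coma):
    """Return True unless fewer than half of the line's pixels are in the coma."""
    if not line_pixels:
        return False  # an empty line is treated as not inside (assumption)
    left, top, right, bottom = coma
    inside = sum(1 for x, y in line_pixels
                 if left <= x <= right and top <= y <= bottom)
    # "fewer than half inside" means outside; exactly half counts as inside
    return inside * 2 >= len(line_pixels)
```

Comparing `inside * 2` with the total avoids a floating-point division and keeps the boundary case (exactly half) on the "inside" side, as the rule states.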
Line detection uses the kernel-based Hough transform, which also determines the weights of the dividing lines.
For comics read from right to left, comas are stored in the binary tree as follows:
Define a binary tree in which each coma is one node: the root is the whole comic page, internal nodes are comic segments, and leaves are comic frames. If the best dividing line splits a coma into left and right parts, the coma to the right of the line becomes the left subtree and the coma to the left of the line becomes the right subtree. If the best dividing line splits a coma into upper and lower parts, the coma above the line becomes the left subtree and the coma below the line becomes the right subtree. If a coma has no best dividing line, its image is a comic frame; it has no child nodes and is not split further. Storing every coma this way yields a binary tree whose leaves are exactly the comic frames, and the reading order of the frames is the left-to-right order of the leaves.
For comics read from left to right, comas are stored in the binary tree as follows:
Define a binary tree in which each coma is one node: the root is the whole comic page, internal nodes are comic segments, and leaves are comic frames. If the best dividing line splits a coma into left and right parts, the coma to the left of the line becomes the left subtree and the coma to the right of the line becomes the right subtree. If the best dividing line splits a coma into upper and lower parts, the coma above the line becomes the left subtree and the coma below the line becomes the right subtree. If a coma has no best dividing line, its image is a comic frame; it has no child nodes and is not split further. Storing every coma this way yields a binary tree whose leaves are exactly the comic frames, and the reading order of the frames is the left-to-right order of the leaves.
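The two storage rules differ only in which side of a left/right split becomes the left subtree, so they can share one recursive routine with a direction flag. The following is a minimal sketch, not the patent's implementation: comas are assumed to be axis-aligned rectangles, and `find_split` is a hypothetical callback standing in for steps C to F that returns `("h", y)` for an upper/lower split, `("v", x)` for a left/right split, or `None` for an indivisible coma.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

Rect = Tuple[int, int, int, int]  # left, top, right, bottom


@dataclass
class Node:
    rect: Rect
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build_tree(rect: Rect,
               find_split: Callable[[Rect], Optional[Tuple[str, int]]],
               right_to_left: bool = True) -> Node:
    """Recursively split a coma and store the parts in reading order."""
    node = Node(rect)
    split = find_split(rect)
    if split is None:            # no best dividing line: this is a comic frame
        return node
    kind, pos = split
    l, t, r, b = rect
    if kind == "h":              # upper part first, lower part second
        first, second = (l, t, r, pos), (l, pos, r, b)
    else:                        # left/right split: order depends on direction
        left_part, right_part = (l, t, pos, b), (pos, t, r, b)
        first, second = ((right_part, left_part) if right_to_left
                         else (left_part, right_part))
    node.left = build_tree(first, find_split, right_to_left)
    node.right = build_tree(second, find_split, right_to_left)
    return node


def frames_in_reading_order(node: Node):
    """Leaves from left to right are the frames in reading order."""
    if node.left is None:
        return [node.rect]
    return (frames_in_reading_order(node.left)
            + frames_in_reading_order(node.right))
```

For a quick trial, a dictionary's `get` method can serve as `find_split`, mapping each rectangle that should be split to its split description.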
In preprocessing, binarization, contour detection, and thinning precede line detection, which makes line detection faster. During segmentation, dividing lines are selected by weight and the best dividing line is then chosen, which improves the accuracy of dividing-line selection. A binary tree stores and orders every frame of a comic page, which is convenient, fast, and practical. The scheme as a whole is fast, accurate, and efficient.
Description of drawings
Figure 1 is a flowchart of the invention;
Figure 2(a) is a schematic diagram of comic segmentation;
Figure 2(b) is a schematic diagram of storing comic frames in a binary tree;
Figure 3 is a schematic diagram of selecting the best dividing line;
Figure 4 is a table of test result statistics.
Detailed description
The invention is further described below with reference to the drawings.
Figure 1 shows the flowchart of the invention, which comprises the following steps:
A. Preprocess the given comic page:
a. Binarize the comic page to obtain a binary image;
b. Run contour detection on the binary image to obtain the foreground image of the comic page;
B. Run straight-line detection on the foreground image to obtain the straight lines it contains;
C. Define the quadrilateral structure obtained after line detection as a coma; a coma is either a comic frame or a comic segment. Segmentation starts from the largest coma. If the coma satisfies any one of the following conditions, it is judged indivisible, its picture is a comic frame, and step G is executed; otherwise step D is executed:
a. There is no straight line inside the coma;
b. The ratio of the area of the largest foreground image inside the coma to the area of the coma is less than a threshold m, where 0 < m < 0.25;
c. The ratio of the number of pixels in the largest foreground region inside the coma to the number of all pixels inside the coma is less than a threshold n, where 0 < n < 0.2;
d. The ratio of the length and width of the largest foreground region inside the coma to the corresponding length and width of the coma is less than a threshold p, where 0.08 < p < 0.2;
D. From the detected straight lines, select the dividing lines. A dividing line must satisfy both of the following conditions:
a. It intersects only two opposite sides of the coma and has no intersection with the other sides;
b. Its intersections with the coma lie on the coma's edge lines;
E. Screen the dividing lines one by one in descending order of weight, and select a dividing line that has no more than 3 intersections with other straight lines; this is the standard dividing line;
F. Within a width of HW pixels on each side of the standard dividing line, select the straight lines that intersect no other straight line, or at most two; from these, choose the one with the fewest intersections with the foreground image as the best position to split the coma; this is the best dividing line. If no line qualifies, return to step E and select the next qualifying dividing line according to the dividing-line weights;
G. Store the segmented comas in a binary tree according to the reading order of the comic. The comas stored here include the comas being split, the comas produced by splitting, and the indivisible comas;
H. After the comic page has been split for the first time along the best dividing line, keep splitting the resulting comic segments in the same way and storing the newly produced comas until no divisible coma remains; segmentation of the page is then complete. The definitions are as follows. Comic frame: the smallest picture unit on a comic page that cannot be divided further. Comic segment: a quadrilateral composed of at least two adjacent comic frames.
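The stopping test in step C reduces to a disjunction of four conditions. The sketch below assumes the four measurements have already been computed elsewhere; the parameter names and the default threshold values are illustrative picks from inside the stated ranges, not values given in the text. How the two ratios of condition d (length and width) are combined into one number is left open in the text, so the sketch takes a single precomputed `extent_ratio`.

```python
def is_comic_frame(has_inner_line, area_ratio, pixel_ratio, extent_ratio,
                   m=0.2, n=0.15, p=0.1):
    """Step C: a coma is indivisible if any one of conditions a to d holds."""
    # The defaults are assumptions; only the admissible ranges come from step C.
    if not (0 < m < 0.25 and 0 < n < 0.2 and 0.08 < p < 0.2):
        raise ValueError("threshold outside the range given in step C")
    return (not has_inner_line          # a: no straight line inside the coma
            or area_ratio < m           # b: largest foreground area too small
            or pixel_ratio < n          # c: too few foreground pixels
            or extent_ratio < p)        # d: foreground extent too small
```

A coma that fails all four conditions goes on to steps D to F, where a dividing line is searched for.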
Storing the comic frame structure in a binary tree follows the same principle a person uses to work out the order of comic frames. One binary tree corresponds to one comic page. The root node stores the whole page. The page is first split into two parts, which are placed in the left and right subtrees of the root. The part in each subtree is split again, and so on until all splitting is complete. The final result, the fully segmented comic frames, is held in the leaf nodes of the binary tree.
As shown in Figure 2(a), the straight lines a, b, c, and d are four best dividing lines that split the comic into five comic frames (1), (2), (3), (4), and (5). The binary tree in Figure 2(b) stores these five frames as follows. The whole comic is the root, denoted A. A consists of two parts, the part above best dividing line a and the part below it, denoted B and C. Proceeding recursively, best dividing line c splits B into left and right parts E and D, and best dividing line d splits C into upper and lower parts F and G; F and G cannot be split further and are leaves. Finally, the remaining best dividing line splits E into the upper and lower leaves M and N (the original text labels this line c as well; since c has already been used to split B, b is presumably intended). The binary tree can be written in bracket form as (A(B(D,E(M,N)),C(F,G))).
We use a binary tree to store each frame in reading order, so that the comic is ordered directly. Taking the Japanese manga layout as an example (read from right to left and from top to bottom), each frame is stored in the binary tree by the following rules:
(a) If the best dividing line separates upper and lower parts: the part above the line becomes the left subtree, as with B, F, and M in Figure 2(b); the part below the line becomes the right subtree, as with C, G, and N in Figure 2(b).
(b) If the best dividing line separates left and right parts: the part to the right of the line becomes the left subtree, as with D in Figure 2(b); the part to the left of the line becomes the right subtree, as with E in Figure 2(b).
The leaves of a binary tree stored by these rules are the final segmentation result, already ordered from left to right. In Figure 2(b) the leaf nodes, read in order, are D, M, N, F, G, which matches the frame order (1)(2)(3)(4)(5). If the comic is read from left to right, as with Chinese domestic comics, the same method applies; only the storage positions of the parts on either side of a left/right best dividing line need to be swapped.
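The Figure 2(b) example can be checked mechanically. The sketch below encodes the tree as nested tuples `(name, left, right)`, an encoding chosen here purely for illustration, then reads the leaves from left to right and prints the bracket form.

```python
def leaf(name):
    """A comic frame: a node without children."""
    return (name, None, None)


# Tree of Figure 2(b) for right-to-left reading: D is the right-hand part
# of B, so it is stored as B's left child.
tree = ("A",
        ("B", leaf("D"), ("E", leaf("M"), leaf("N"))),
        ("C", leaf("F"), leaf("G")))


def leaves(node):
    """Leaf names from left to right, i.e. the reading order."""
    name, left, right = node
    if left is None:
        return [name]
    return leaves(left) + leaves(right)


def brackets(node):
    """Bracket notation of a subtree, without the outermost parentheses."""
    name, left, right = node
    if left is None:
        return name
    return f"{name}({brackets(left)},{brackets(right)})"


print(leaves(tree))                 # ['D', 'M', 'N', 'F', 'G']
print("(" + brackets(tree) + ")")   # (A(B(D,E(M,N)),C(F,G)))
```

The leaf order D, M, N, F, G corresponds to frames (1) to (5), as stated in the text.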
As shown in Figure 3, straight line x is the standard dividing line that was found. To avoid cutting through the content of the comic, however, the split is made at straight line y, in the blank area to the right of x; y is the best dividing line.
Choosing the position of the best dividing line: among the lines parallel to the current standard dividing line and within HW pixels to its left and right (or above and below it), discard every line that has more than 2 intersections with other straight lines. The remaining lines intersect no other line, or only one or two. From these, the line with the fewest intersections with foreground pixels is chosen as the best dividing line. A split-ratio threshold is specified for each of the three cases, where the split ratio is the ratio of a dividing line's actual length to its theoretical length. In Figure 3, lines m and n are not interrupted, so their actual length equals their theoretical length and their split ratio is 1. If a dividing line is interrupted in the middle, its split ratio is less than 1. The threshold for a candidate line that has no intersection with other lines is RATIO, the threshold for a candidate with exactly one intersection is RATIO1, and the threshold for a candidate with two intersections is RATIO2. If the candidates have no intersection with other lines, the best dividing line is the one with the fewest foreground intersections whose split ratio exceeds RATIO. If the candidates have exactly one intersection with other lines, the best dividing line is the one with the fewest foreground intersections whose split ratio exceeds RATIO1. If the candidates have two intersections with other lines, the best dividing line is the one with the fewest foreground intersections whose split ratio exceeds RATIO2. RATIO ranges from 0.45 to 0.55, RATIO1 from 0.7 to 0.8, and RATIO2 from 0.5 to 0.6. Once the position of the best dividing line is determined, the coma is split at that position. If no position meets these requirements, the procedure returns and selects the next standard dividing line. The selected dividing line and the rejected dividing lines are marked as used, and the list of available dividing lines is updated.
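The selection rule can be sketched as a filter followed by a minimum. Several details below are assumptions made for illustration: candidates are assumed to arrive with their measurements already computed, the three thresholds are mid-range picks from the stated intervals, the three cases are pooled before taking the minimum (the text does not say how ties across cases are resolved), and `hw` is a parameter because the text gives no value for HW. The names `Candidate` and `pick_best_line` are hypothetical.

```python
from dataclasses import dataclass

# Split-ratio threshold per number of crossings with other lines.
# Mid-range picks from 0.45-0.55, 0.7-0.8 and 0.5-0.6 (assumed values).
RATIO_BY_CROSSINGS = {0: 0.5, 1: 0.75, 2: 0.55}


@dataclass(frozen=True)
class Candidate:
    offset: int           # signed distance from the standard dividing line
    line_crossings: int   # intersections with other straight lines
    foreground_hits: int  # intersections with foreground pixels
    split_ratio: float    # actual length / theoretical length


def pick_best_line(candidates, hw):
    """Return the qualifying candidate with the fewest foreground hits."""
    ok = [c for c in candidates
          if abs(c.offset) <= hw
          and c.line_crossings in RATIO_BY_CROSSINGS      # at most 2 crossings
          and c.split_ratio > RATIO_BY_CROSSINGS[c.line_crossings]]
    # None signals "go back and try the next standard dividing line"
    return min(ok, key=lambda c: c.foreground_hits, default=None)
```

In the situation of Figure 3, the standard line x would appear as a candidate with several foreground hits, and a parallel line y in the blank area would win because it has none.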
The comic image is preprocessed as follows:
Binarization: all images used in the invention are binarized comic images. Binarization simplifies the image, reduces the amount of data, and brings out the contours of the objects of interest, which helps all later processing. To process and analyze a binary image, the grayscale image must first be binarized. The invention binarizes the image by extracting an optimal threshold from the gray-level histogram according to a fixed rule, namely the nonparametric, unsupervised threshold selection method for image segmentation proposed by Nobuyuki Otsu [9]. A threshold is chosen, pixel values below it are set to 0, and pixel values above it are set to 1; this yields the binarized picture. The method maximizes the separability of the resulting gray-level classes, and this criterion yields an optimal threshold. The procedure is very simple and uses only the zeroth-order and first-order cumulative moments of the gray-level histogram. Because binarization uses existing technology, the details are not repeated here.
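As a reminder of how Otsu's criterion works, here is a small pure-Python sketch. It is a textbook formulation rather than code from the patent: it scans all 256 thresholds, tracks the zeroth-order and first-order cumulative moments of the normalized histogram, and keeps the threshold that maximizes the between-class variance. The function names are hypothetical, and pixels equal to the threshold are mapped to 0, a boundary choice the text leaves open.

```python
def otsu_threshold(pixels):
    """Return the gray level in 0..255 that maximizes between-class variance."""
    hist = [0] * 256
    for v in pixels:
        hist[v] += 1
    total = sum(hist)
    mu_total = sum(i * h for i, h in enumerate(hist)) / total
    best_t, best_var = 0, -1.0
    omega = 0.0   # zeroth-order cumulative moment (class probability)
    mu = 0.0      # first-order cumulative moment
    for t in range(256):
        omega += hist[t] / total
        mu += t * hist[t] / total
        denom = omega * (1.0 - omega)
        if denom <= 0.0:
            continue  # one class is empty, variance undefined
        var_between = (mu_total * omega - mu) ** 2 / denom
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


def binarize(image):
    """image: list of rows of gray levels; returns rows of 0/1 values."""
    t = otsu_threshold(v for row in image for v in row)
    return [[1 if v > t else 0 for v in row] for row in image]
```

For an image containing only the gray levels 10 and 200, every threshold from 10 to 199 separates the two classes equally well, and the scan returns the first of them.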
Contour detection: image preprocessing often requires tracing the edges of objects, also called contour tracing. As the name suggests, contour tracing follows a boundary by finding its edge points in sequence. Selecting a dividing line requires counting the intersections between the line and the foreground image. The invention runs contour detection on the binarized comic image to find the foreground image, which plays an important role when those intersections are evaluated. More importantly, line detection is then run on the contours only. This avoids examining every point in the image: only the pixels on the contours are processed, which saves a large amount of computation in line detection. Because existing technology is used, it is not described in detail.
Image thinning: thinning is generally used as an image preprocessing technique whose purpose is to extract the skeleton of the source image, that is, to reduce every line wider than 1 pixel to a width of one pixel. Once the skeleton is formed, the image is easier to analyze, for example when extracting its features. The basic idea of thinning is layer-by-layer peeling: pixels are removed from the edges of a line inward, one layer at a time, until the line is one pixel wide. Thinning greatly reduces the amount of data in the original image while preserving the basic topology of its shapes, and so lays the groundwork for applications such as feature extraction. In the invention, thinning the image after contour extraction greatly reduces the number of intersections between the foreground and the dividing lines that must be counted. Because the intersection counts of all dividing lines shrink proportionally, the comparison between dividing lines is not affected.
Straight line detection: The technique used here is existing art, so only a brief introduction is given. The dividing lines in a comic page are the key to how the page is segmented. Segmentation first detects straight lines, then uses the characteristics of each line to decide whether it is a dividing line and how exactly the page should be split. If too few lines are detected, frames are missed (under-segmentation); if too many or inaccurate lines are detected, the segmentation stage has to do a large amount of judgment work, which easily leads to mis-segmentation. A suitable line detection method that correctly finds the dividing lines therefore makes segmentation much simpler and more accurate. The Hough transform is one of the basic methods in image processing for recognizing geometric shapes: it detects objects of known shape and is little affected by noise and by gaps in curves. Hough line detection is widely used because of its robustness to noise and erroneous data, but its computational cost is high, so it cannot run in real time on larger images. The present invention uses an improved Hough line detection method, the Kernel-based Hough transform (KHT) proposed by Leandro A. F. Fernandes and Manuel M. Oliveira. Because KHT first groups approximately collinear pixels into clusters and then selects the best-fitting lines among them, it costs far less than standard Hough line detection while achieving higher accuracy. In the present invention, line detection is performed after the image has been binarized, its contours detected, and the contours thinned, so line detection is faster still.
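To make the voting idea behind Hough line detection concrete, here is a minimal sketch of the classical (rho, theta) Hough transform. It is not the KHT used by the invention: KHT adds the clustering of approximately collinear pixels described above, which this sketch omits. The function name `hough_lines`, the one-degree angle step, and the `min_votes` threshold are assumptions chosen for illustration.

```python
import math
from collections import Counter


def hough_lines(points, min_votes=2):
    """Classical Hough voting for a list of (x, y) foreground pixels.

    Every pixel votes for each line rho = x*cos(theta) + y*sin(theta)
    that passes through it, with theta sampled in one-degree steps.
    Returns (rho, theta_degrees, votes) tuples, strongest first.
    """
    votes = Counter()
    for x, y in points:
        for deg in range(180):
            theta = math.radians(deg)
            rho = round(x * math.cos(theta) + y * math.sin(theta))
            votes[(rho, deg)] += 1
    return [(rho, deg, v) for (rho, deg), v in votes.most_common()
            if v >= min_votes]


# Hypothetical example: 100 pixels on the horizontal line y = 5.
points = [(x, 5) for x in range(100)]
lines = hough_lines(points, min_votes=50)
strongest = lines[0]
```

Each pixel casts 180 votes here, which illustrates why the cost grows quickly with the number of foreground pixels and why running detection on a thinned contour image keeps it manageable.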
For some irregular pages, the present invention provides a manually assisted method to make the segmentation more accurate. Manually added straight lines are handled by a priority scheme in which these manual lines are detected first.
Test results: The experimental results are shown in Figure 4. 184 pages were segmented successfully and 16 failed (2 over-segmented, 13 under-segmented, 1 mis-segmented), giving a success rate of 92%. The computer took 926 seconds, that is, 4.633 seconds per page, for an efficiency of 0.215 pages per second.
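The reported figures can be cross-checked with a few lines of arithmetic. This sketch assumes the test set is exactly the 184 successes plus the 16 failures (200 pages), which the text implies but does not state; the variable names are illustrative.

```python
# Counts as reported in the test results.
successes, failures = 184, 16
over_segmented, under_segmented, mis_segmented = 2, 13, 1
elapsed_seconds = 926

# Assumption: the test set is exactly successes + failures.
total_pages = successes + failures

success_rate = successes / total_pages            # fraction of pages segmented correctly
seconds_per_page = elapsed_seconds / total_pages  # average processing time per page
pages_per_second = total_pages / elapsed_seconds  # throughput

print(f"{success_rate:.0%}, {seconds_per_page:.3f} s/page, "
      f"{pages_per_second:.3f} pages/s")
```

Under that assumption the success rate is 92%, and the recomputed time and throughput (4.63 seconds per page, about 0.216 pages per second) agree with the reported 4.633 and 0.215 to within rounding.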
Of course, the present invention may have various other embodiments. Those skilled in the art can make corresponding changes and modifications according to the present invention without departing from its spirit and essence, but all such changes and modifications shall fall within the scope of protection of the appended claims.
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201210120164.9A CN102708371B (en) | 2012-04-23 | 2012-04-23 | Method for recognizing and automatically sequencing comic frames according to segmenting lines |
Publications (2)
Publication Number | Publication Date |
---|---|
CN102708371A (en) | 2012-10-03 |
CN102708371B (en) | 2014-04-30 |
Family ID: 46901114
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201210120164.9A, Expired - Fee Related, CN102708371B (en) | Method for recognizing and automatically sequencing comic frames according to segmenting lines | 2012-04-23 | 2012-04-23 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN102708371B (en) |
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101866418A (en) * | 2009-04-17 | 2010-10-20 | 株式会社理光 | Method and equipment for determining file reading sequences |
CN102184378A (en) * | 2011-04-27 | 2011-09-14 | 茂名职业技术学院 | Method for cutting portable data file (PDF) 417 standard two-dimensional bar code image |
Non-Patent Citations (2)
Title |
---|
TAKAMASA TANAKA ET AL: "Layout Analysis of Tree-Structured Scene Frames in Comic Images", IJCAI'07: Proceedings of the 20th International Joint Conference on Artificial Intelligence, 31 December 2007 (2007-12-31) * |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105208183A (en) * | 2014-06-27 | 2015-12-30 | 上海玄霆娱乐信息科技有限公司 | Method enabling electronic device to display cartoons |
CN105208183B (en) * | 2014-06-27 | 2018-11-02 | 上海玄霆娱乐信息科技有限公司 | Method enabling electronic device to display cartoons |
CN105574524A (en) * | 2015-12-11 | 2016-05-11 | 北京大学 | Cartoon image page identification method and system based on dialogue and storyboard united identification |
CN105574524B (en) * | 2015-12-11 | 2018-10-19 | 北京大学 | Cartoon image page identification method and system based on dialogue and storyboard united identification |
CN109902541A (en) * | 2017-12-10 | 2019-06-18 | 彼乐智慧科技(北京)有限公司 | A kind of method and system of image recognition |
CN109902541B (en) * | 2017-12-10 | 2020-12-15 | 彼乐智慧科技(北京)有限公司 | Image recognition method and system |
Also Published As
Publication number | Publication date |
---|---|
CN102708371B (en) | 2014-04-30 |
Legal Events
Code | Title | Description |
---|---|---|
C06 | Publication | |
PB01 | Publication | |
C10 | Entry into substantive examination | |
SE01 | Entry into force of request for substantive examination | |
C14 | Grant of patent or utility model | |
GR01 | Patent grant | |
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 2014-04-30; termination date: 2015-04-23 |
EXPY | Termination of patent right or utility model | |