CN105160300B - A text extraction method based on level set segmentation - Google Patents

A text extraction method based on level set segmentation

Info

Publication number
CN105160300B (granted from application CN201510474071.XA)
Authority
CN
China
Prior art keywords
text
image
region
area
level set
Prior art date
Legal status: Active (the status listed is an assumption, not a legal conclusion)
Application number
CN201510474071.XA
Other languages
Chinese (zh)
Other versions
CN105160300A (en)
Inventor
吕英俊
李敏花
柏猛
吕雪菲
Current Assignee
Shandong University of Science and Technology
Original Assignee
Shandong University of Science and Technology
Priority date
Filing date
Publication date
Application filed by Shandong University of Science and Technology filed Critical Shandong University of Science and Technology
Priority to CN201510474071.XA
Publication of CN105160300A
Application granted
Publication of CN105160300B
Legal status: Active
Anticipated expiration


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V 30/40 Document-oriented image-based pattern recognition
    • G06V 30/41 Analysis of document content
    • G06V 30/414 Extracting the geometrical structure, e.g. layout tree; Block segmentation, e.g. bounding boxes for graphics or text

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computer Graphics (AREA)
  • Geometry (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Character Input (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a text extraction method based on level set segmentation, which comprises: reading image data information and determining a boundary curve; converting the read image to grayscale; extracting grayscale feature values; dividing the image into two regions with a level set function according to the grayscale feature values; binarizing the two segmented regions; labeling the connected components of each binarized region; filtering the labeled connected components of the two regions; performing a polarity decision on the filtered regions to determine the text pixel region and the background pixel region; filtering the text region to remove background noise; and outputting the text extraction result. The invention can extract text from complex backgrounds and is also very accurate on images containing outline (hollow-stroke) characters, giving it good generality and practicality.

Description

A Text Extraction Method Based on Level Set Segmentation

Technical Field

The present invention relates to text extraction methods in the field of image processing, and in particular to a text extraction method based on level set segmentation.

Background Art

With the development of networks and computer technology, more and more information appears in multimedia form, such as images and video. Images and video often contain rich text that describes and interprets their content. Extracting and recognizing this text is important for image understanding, video content analysis, intelligent transportation, machine vision, intelligent control, and related applications. However, because the text usually lies on a complex background, general-purpose OCR systems have difficulty recognizing it. Detected text therefore needs a background-removal step, i.e. a text extraction step, before it is submitted to an OCR system. How to extract text from complex background images has thus become a key task when text is used as a cue for understanding image content.

Existing image text extraction techniques fall mainly into threshold-based, clustering-based, and statistical-model-based methods. Threshold-based methods exploit the color difference between text and background and separate the two with a threshold, chosen either globally or locally. Their performance depends on how well the threshold discriminates the text from the background, so they generally suit images with a fairly uniform background. Clustering-based methods typically use color information to divide a text block image into K classes, then merge the classes that satisfy a rule under a chosen clustering algorithm and threshold, gradually reducing the number of color classes; the text pixels end up in one class and the remaining classes are background. When the background contains components whose color matches or approximates the text color, however, those components are misassigned to the text class, leaving substantial residual background that degrades OCR. Statistical-model-based methods build a probability model over all pixels of the text block, set its parameters appropriately, and decide by the maximum likelihood rule whether each pixel is a text pixel; the model parameters generally must be learned statistically, which requires a large number of training samples.

The methods above use only low-level, local gray or color information of the image. When text or outline characters are extracted from complex background images, residual background often remains and the extraction quality is poor.

Summary of the Invention

The object of the present invention is to solve the above problems by providing a text extraction method based on level set segmentation. A level set function first divides the image into two regions; a polarity decision on the two regions then identifies the text region and the background region; finally, the text region is filtered to remove background noise. Because the method uses global information from the whole image, it can extract text from complex backgrounds and also works very well on outline characters, giving it good generality and practicality.

To achieve the above object, the present invention adopts the following technical solution:

A text extraction method based on level set segmentation, comprising:

Read the image data information and determine the boundary curve; convert the read image to grayscale; extract the grayscale feature values; divide the image into the region inside the boundary curve and the region outside the boundary curve with a level set function according to the grayscale feature values; binarize the two segmented regions; label the connected components of each binarized region; filter the labeled connected components of the two regions; perform a polarity decision on the filtered regions to determine the text pixel region and the background pixel region; filter the text region to remove background noise; and output the text extraction result.
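As a minimal sketch, the pipeline above can be organized as a composition of per-stage functions. All names below (`extract_text`, `segment`, `label_components`, and the others) are illustrative placeholders invented for this sketch, not identifiers from the patent; each callable stands for one stage of the method:

```python
import numpy as np

def extract_text(image, segment, label_components, filter_components,
                 decide_polarity, remove_residual_background):
    """Skeleton of the claimed pipeline; the five stage callables are
    hypothetical placeholders, one per stage of the method."""
    # Grayscale conversion; channel averaging is a simplification of the
    # unspecified conversion used by the patent.
    gray = image.mean(axis=2) if image.ndim == 3 else image.astype(float)
    inside, outside = segment(gray)          # level set split into two regions
    region_labels = []
    for mask in (inside, outside):           # per-region processing
        labels = label_components(mask)      # connected component labeling
        labels = filter_components(labels)   # border/size filtering
        region_labels.append(labels)
    # polarity decision: returns the label image of the text region
    text = decide_polarity(region_labels[0], region_labels[1])
    return remove_residual_background(gray, text)  # final background removal
```

Each stage can then be developed and tested independently before being plugged into the skeleton.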

The specific steps are as follows:

Step (1): Given an image u0(x, y), (x, y) ∈ Ω, where Ω is the image region, ω is an open subset of Ω, and C is the boundary curve of ω, read the image information;

Step (2): Convert the read image to grayscale;

Step (3): Extract the grayscale feature values of the image;

Step (4): Segment the image with the level set function into the region inside the boundary curve and the region outside the boundary curve;

Step (5): Judge whether the segmentation is finished; if so, go to step (6), otherwise return to step (4);

Step (6): Binarize the two segmented regions, i.e. represent the region inside the curve with black pixels and the region outside the curve with white pixels;

Step (7): Label the connected components of the two binarized regions respectively with the region growing method;

Step (8): Judge whether the connected component labeling is finished; if so, go to step (9), otherwise return to step (7);

Step (9): Filter the connected components in the two regions;

Step (10): Judge whether the connected component filtering of the two regions is finished; if so, go to step (11), otherwise return to step (9);

Step (11): Judge the polarity of the two filtered regions to determine which of them is the text region: compare the numbers of connected components in the two regions, take the region with more connected components as the text region and the region with fewer connected components as the background region;

Step (12): Further filter the determined text region to remove the residual background;

Step (13): Output the text extraction result.

In step (4), the energy function of the level set segmentation is:

F(c1, c2, φ) = μ ∫Ω δ(φ) |∇φ| dx dy + ν ∫Ω H(φ) dx dy + λ1 ∫Ω |u0(x, y) − c1|² H(φ) dx dy + λ2 ∫Ω |u0(x, y) − c2|² (1 − H(φ)) dx dy

where μ, ν, λ1, λ2 are all positive constants, c1 and c2 are respectively the mean gray levels of the image u0(x, y) inside and outside the boundary curve C, and H(z) and δ(z) denote the regularized Heaviside function and Dirac function:

H(z) = (1/2)(1 + (2/π) arctan(z/ε)),  δ(z) = dH(z)/dz = ε / (π(ε² + z²)).

The specific method of step (4) is:

Step (4-1): Replace the boundary curve C by a level set function φ(x, y): if the point (x, y) is inside the curve C, then φ(x, y) > 0; if the point (x, y) is outside the curve C, then φ(x, y) < 0; if the point (x, y) is on the curve C, then φ(x, y) = 0;

Step (4-2): Initialize the level set function: set k = 0 and take φ⁰(x, y), a constant-valued function, as the initial value of φ;

Step (4-3): To minimize the energy function of the level set, fix φ at its value φ^k of the k-th iteration and compute the values of c1^k and c2^k;

Step (4-4): To minimize the energy function of the level set, fix c1^k and c2^k and compute φ^(k+1), where φ^(k+1) denotes the value of φ at the (k+1)-th iteration;

Step (4-5): Judge whether the solution φ^(k+1) has stabilized; if not, set k = k + 1 and return to step (4-3) to continue the iteration, otherwise stop iterating and go to step (4-6);

Step (4-6): Output the level set function segmentation result.

The values of c1 and c2 at the k-th iteration in step (4-3) are computed as:

c1^k = ∫Ω u0(x, y) H(φ^k(x, y)) dx dy / ∫Ω H(φ^k(x, y)) dx dy

c2^k = ∫Ω u0(x, y) (1 − H(φ^k(x, y))) dx dy / ∫Ω (1 − H(φ^k(x, y))) dx dy

where u0(x, y) is a point of the given image and H is the regularized Heaviside function.

The specific method for computing φ^(k+1) is: using the c1^k and c2^k computed in step (4-3), first compute ∂φ/∂t according to the following equation, then integrate to obtain φ^(k+1):

∂φ/∂t = δ(φ)[ μ div(∇φ/|∇φ|) − ν − λ1 (u0 − c1)² + λ2 (u0 − c2)² ]

where div denotes the divergence operator, ∇ denotes the gradient operator, μ, ν, λ1, λ2 are all positive constants, and c1, c2 are respectively the mean gray levels of the image u0(x, y) inside and outside the boundary curve C.
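For illustration, the alternating iteration described above (compute the region means with φ fixed, then evolve φ with the means fixed) can be sketched in NumPy. This is a hedged sketch, not the patented implementation: the regularization width ε = 1, the time step, the parameter values, and the single-circle initialization are assumptions (the embodiment initializes with five circles), and the function names are invented:

```python
import numpy as np

def heaviside(z, eps=1.0):
    # Regularized Heaviside H(z); one common choice, assumed here.
    return 0.5 * (1.0 + (2.0 / np.pi) * np.arctan(z / eps))

def dirac(z, eps=1.0):
    # Derivative of the regularized Heaviside above.
    return eps / (np.pi * (eps ** 2 + z ** 2))

def curvature(phi):
    # div(grad(phi)/|grad(phi)|) via central differences.
    fy, fx = np.gradient(phi)
    norm = np.sqrt(fx ** 2 + fy ** 2) + 1e-8
    nyy, _ = np.gradient(fy / norm)
    _, nxx = np.gradient(fx / norm)
    return nxx + nyy

def chan_vese_step(u0, phi, mu=0.2, nu=0.0, lam1=1.0, lam2=1.0, dt=0.5):
    h = heaviside(phi)
    c1 = (u0 * h).sum() / (h.sum() + 1e-8)                # mean inside C
    c2 = (u0 * (1 - h)).sum() / ((1 - h).sum() + 1e-8)    # mean outside C
    dphi = dirac(phi) * (mu * curvature(phi) - nu
                         - lam1 * (u0 - c1) ** 2 + lam2 * (u0 - c2) ** 2)
    return phi + dt * dphi, c1, c2

def chan_vese(u0, iters=200):
    # Initialize phi as a signed distance to one circle (assumption;
    # any sign-changing initialization can be substituted).
    h, w = u0.shape
    yy, xx = np.mgrid[:h, :w]
    phi = min(h, w) / 4.0 - np.sqrt((yy - h / 2) ** 2 + (xx - w / 2) ** 2)
    for _ in range(iters):
        phi, c1, c2 = chan_vese_step(u0, phi)
    return phi > 0   # region inside the final boundary curve
```

On a bright object against a dark background, the sign of the returned mask separates the two regions that the subsequent steps binarize and label.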

In step (7), the connected components of the two binarized regions are labeled with the region growing method as follows:

Step (7-1): Search the pixels of the region from top to bottom and from left to right; if an unlabeled pixel is found, assign it a new label number;

Step (7-2): Perform an 8-neighborhood search starting from the newly labeled pixel; if an unlabeled pixel is found in its 8-neighborhood, assign the same label to it and continue the 8-neighborhood search from that newly labeled pixel;

Step (7-3): If no unlabeled pixel is found in the 8-neighborhood, end this search;

Step (7-4): Judge whether all pixels have been labeled; if so, go to step (7-5); if not, return to step (7-1) and label all unlabeled pixels of the region until every pixel is labeled;

Step (7-5): Take the pixels with the same label as one connected component.
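Steps (7-1) to (7-5) amount to a seeded flood fill. The patent's stated environment is Visual C++; the following NumPy version is only an illustrative sketch with an invented function name:

```python
import numpy as np
from collections import deque

def label_components(mask):
    """8-connected component labeling of a binary mask by region growing,
    following steps (7-1) to (7-5). Returns an int array: 0 = unmarked
    background, 1..n = component labels."""
    h, w = mask.shape
    labels = np.zeros((h, w), dtype=int)
    next_label = 0
    for y in range(h):              # (7-1) scan top-to-bottom, left-to-right
        for x in range(w):
            if mask[y, x] and labels[y, x] == 0:
                next_label += 1
                labels[y, x] = next_label
                queue = deque([(y, x)])
                while queue:        # (7-2) grow through the 8-neighborhood
                    cy, cx = queue.popleft()
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = cy + dy, cx + dx
                            if (0 <= ny < h and 0 <= nx < w
                                    and mask[ny, nx] and labels[ny, nx] == 0):
                                labels[ny, nx] = next_label
                                queue.append((ny, nx))
                # (7-3) queue empty: no unlabeled neighbor, search ends
    return labels                   # (7-5) equal labels = one component
```

The queue-based growth visits each pixel at most once, so the labeling runs in time linear in the image size.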

The method of filtering the connected components in step (9) is:

For each of the two regions, examine the position of every connected component and the number of pixels it contains; if a component is connected to the image border, or its pixel count is smaller than the set threshold, delete that component.
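A straightforward sketch of this filtering rule, operating on a label image as produced by the labeling stage. The threshold value below is an assumed example; the patent does not state a specific value:

```python
import numpy as np

def filter_components(labels, min_pixels=8):
    """Delete components that touch the image border or contain fewer
    than min_pixels pixels (threshold is illustrative). Deleted
    components are reset to label 0."""
    out = labels.copy()
    # labels that appear on any of the four image borders
    border = set(np.unique(labels[0, :])) | set(np.unique(labels[-1, :])) \
           | set(np.unique(labels[:, 0])) | set(np.unique(labels[:, -1]))
    for lab in np.unique(labels):
        if lab == 0:
            continue
        if lab in border or (labels == lab).sum() < min_pixels:
            out[out == lab] = 0
    return out
```

Border components are removed because text candidates detected inside a text block rarely touch the block boundary, while background regions typically do.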

In step (11), the polarity of the two filtered regions is determined as follows:

Step (11-1): After filtering, take the pixels with the same label in each region as one connected component;

Step (11-2): Count the connected components in the two regions; let the counts be n1 and n2 respectively;

Step (11-3): Compare n1 and n2; if n1 > n2, the region corresponding to n1 is the text region, otherwise the region corresponding to n2 is the text region.
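The polarity decision of steps (11-1) to (11-3) reduces to comparing component counts. A minimal sketch, assuming the filtered label images of the two regions (0 marks deleted or background pixels; the function name is invented):

```python
import numpy as np

def decide_polarity(labels_in, labels_out):
    """Step (11): the region whose filtered label image contains more
    connected components is taken as the text region."""
    n1 = len(np.unique(labels_in)) - (1 if 0 in labels_in else 0)
    n2 = len(np.unique(labels_out)) - (1 if 0 in labels_out else 0)
    # region with more components is the text region (characters tend to
    # break into many strokes, the background into few large blobs)
    return labels_in if n1 > n2 else labels_out
```
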

In step (12), the residual background of the determined text region is further removed as follows:

Compute the mean gray level of every connected component in the region and sort the means in ascending order. Compute the differences between adjacent means and compare each difference with the set threshold in turn; every difference larger than the threshold is taken as a segmentation position. After all differences have been examined, N segmentation positions are obtained. Among the resulting segments, the one whose components contain the most pixels is the text segment; its connected components are the text components, the positions of the text components form the text region, and the other regions of the image are the background.
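The gray-level grouping described above can be sketched as follows. The threshold is an assumed example value and the function name is invented; the input is the grayscale image together with the label image of the text region:

```python
import numpy as np

def remove_residual_background(gray, labels, thresh=30.0):
    """Step (12): sort component mean gray levels, split where adjacent
    means differ by more than thresh, and keep the segment whose
    components cover the most pixels as text."""
    ids = [l for l in np.unique(labels) if l != 0]
    if not ids:
        return labels
    means = np.array([gray[labels == l].mean() for l in ids])
    order = np.argsort(means)                   # ascending mean gray level
    ids = [ids[i] for i in order]
    means = means[order]
    # split the sorted components at large jumps of the mean gray level
    segments, current = [], [ids[0]]
    for i in range(1, len(ids)):
        if means[i] - means[i - 1] > thresh:
            segments.append(current)
            current = []
        current.append(ids[i])
    segments.append(current)
    # keep the segment whose components contain the most pixels
    best = max(segments, key=lambda seg: sum((labels == l).sum() for l in seg))
    out = labels.copy()
    for l in ids:
        if l not in best:
            out[out == l] = 0
    return out
```

The rationale is that text strokes share a fairly uniform gray level, so residual background components stand out as outliers in the sorted list of component means.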

The beneficial effects of the present invention are:

According to the characteristics of text in complex background images, the invention first segments the image with a level set function, then performs a polarity decision and background filtering on the segmented regions to obtain the text extraction result. Because the method uses global information of the text image, it can extract text from complex background images and is also very accurate on outline characters, avoiding the influence of residual background on the extraction result; it therefore has good generality and practicality. The results of the invention can be applied directly in image understanding, video content analysis, intelligent transportation, machine vision, intelligent control, and other fields, and have broad application prospects.

Brief Description of the Drawings

Fig. 1 is a flowchart of the text extraction method based on level set segmentation of the present invention.

Detailed Description

The present invention is further described below with reference to the accompanying drawing and an embodiment:

The basic hardware required to implement the system of the present invention is a computer with a 2.4 GHz CPU and 1 GB of memory; the required software environment is Visual C++.

A text segmentation method based on the level set active contour model, as shown in Fig. 1, proceeds as follows:

Step (1): Start; read the image;

Step (2): Convert the read image to grayscale;

Step (3): Extract the grayscale feature values of the image;

Step (4): Segment the image into two regions with the level set function;

Given an image u0(x, y), (x, y) ∈ Ω, where Ω is called the image region, ω is an open subset of Ω, and C is the boundary curve of ω, the curve C can be replaced by a level set function φ(x, y): if the point (x, y) is inside C, then φ(x, y) > 0; if (x, y) is outside C, then φ(x, y) < 0; and if (x, y) is on C, then φ(x, y) = 0.

The level set energy function can be expressed as:

F(c1, c2, φ) = μ ∫Ω δ(φ) |∇φ| dx dy + ν ∫Ω H(φ) dx dy + λ1 ∫Ω |u0 − c1|² H(φ) dx dy + λ2 ∫Ω |u0 − c2|² (1 − H(φ)) dx dy

where μ, ν, λ1, λ2 are positive constants, c1, c2 are the mean gray levels of the image u0(x, y) inside and outside the boundary curve C, and H(z) and δ(z) respectively denote the regularized Heaviside function and Dirac function.

Minimizing the energy function with φ fixed gives the estimates of c1 and c2:

c1 = ∫Ω u0 H(φ) dx dy / ∫Ω H(φ) dx dy    (4)

c2 = ∫Ω u0 (1 − H(φ)) dx dy / ∫Ω (1 − H(φ)) dx dy    (5)

Then, fixing c1 and c2 and minimizing the energy function gives the evolution equation:

∂φ/∂t = δ(φ)[ μ div(∇φ/|∇φ|) − ν − λ1 (u0 − c1)² + λ2 (u0 − c2)² ]    (6)

The specific implementation steps are:

Step (4-1): Initialize the level set function and set k = 0; in the present invention, five circles are chosen as the initialization curves of the level set;

Step (4-2): Compute c1^k and c2^k according to formulas (4) and (5);

Step (4-3): From the computed c1^k and c2^k, compute φ^(k+1) according to formula (6);

Step (4-4): Judge whether the solution has stabilized; if not, set k = k + 1, go to step (4-2) and continue the iteration; otherwise stop iterating and go to step (4-5);

Step (4-5): Output the level set segmentation result.

Step (5): Judge whether the segmentation is finished; if so, go to step (6); if not, return to step (4);

Step (6): Binarize the two segmented regions, i.e. represent the region inside the curve with black pixels and the region outside the curve with white pixels;

Step (7): Perform 8-connected component labeling of the two segmented regions with the region growing method;

The specific steps are:

Step (7-1): Search the pixels of the region from top to bottom and from left to right; if an unlabeled pixel is found, assign it a new label number;

Step (7-2): Perform an 8-neighborhood search starting from the newly labeled pixel; if an unlabeled pixel is found in its 8-neighborhood, assign the same label to it and continue the 8-neighborhood search from that newly labeled pixel;

Step (7-3): If no unlabeled pixel is found in the 8-neighborhood, end this search;

Step (7-4): Judge whether all pixels have been labeled; if so, go to step (7-5); if not, return to step (7-1) and label all unlabeled pixels of the region until every pixel is labeled;

Step (7-5): Take the pixels with the same label as one connected component.

Step (8): Judge whether the connected component labeling is finished; if so, go to step (9); if not, return to step (7);

Step (9): Filter the connected components in the two regions: examine the position of every connected component and the number of pixels it contains; if a component is connected to the image border, or its pixel count is smaller than a given threshold, delete that component.

Step (10): Judge whether the connected component filtering of the two regions is finished; if so, go to step (11); if not, return to step (9);

Step (11): Determine the polarity of the two filtered regions to judge which of them is the text region: compare the numbers of connected components in the two regions, take the region with more connected components as the text region and the region with fewer as the background region;

The specific steps are:

Step (11-1): After filtering, take the pixels with the same label in each region as one connected component;

Step (11-2): Count the connected components in the two regions; let the counts be n1 and n2 respectively;

Step (11-3): Compare n1 and n2; if n1 > n2, the region corresponding to n1 is the text region, otherwise the region corresponding to n2 is the text region.

Step (12): Further filter the determined text region to remove the residual background;

The specific steps are:

Step (12-1): Compute the mean gray level of each connected component in the region;

Step (12-2): Sort the component means in ascending order;

Step (12-3): Compute the difference between each mean and the adjacent mean that follows it;

Step (12-4): Compare each difference obtained in step (12-3) with the set threshold; if a difference is larger than the threshold, take it as a segmentation position;

Step (12-5): Judge whether all differences have been compared with the threshold; if so, go to step (12-6); if not, return to step (12-4);

Step (12-6): After the comparison, N segmentation positions are obtained in total, dividing the connected components into N + 1 segments;

Step (12-7): Count the pixels of the connected components corresponding to each of the N + 1 segments; the components of the segment with the most pixels are the text components, the region they correspond to is the text region, and the regions corresponding to the remaining segments are the background;

Step (12-8): Delete the background region.

Step (13): Output the text extraction result.

Although specific embodiments of the present invention have been described above with reference to the accompanying drawing, they do not limit the scope of protection of the invention. Those skilled in the art should understand that various modifications or variations that can be made without creative effort on the basis of the technical solution of the present invention still fall within the protection scope of the invention.

Claims (8)

1. A text extraction method based on level set segmentation, characterized by comprising the following steps:
reading image data information and determining a boundary curve; graying the read image; extracting grayscale feature values; dividing the image into a region inside the boundary curve and a region outside the boundary curve with a level set function according to the grayscale feature values; binarizing the two divided regions; labeling the connected components of the two binarized regions respectively; filtering the labeled connected components in the two regions; judging the polarity of the filtered regions to determine the text pixel region and the background pixel region; filtering the text region to remove background noise; and outputting a text extraction result;
the method comprises the following specific steps:
step (1): given an image u0(x, y), (x, y) ∈ Ω, where Ω is the image region, ω is an open subset of Ω, and C is the boundary curve of ω, reading the image information;
step (2): graying the read image;
step (3): extracting the grayscale feature values of the image;
step (4): dividing the image into the region inside the boundary curve and the region outside the boundary curve with the level set function;
step (5): judging whether the segmentation is finished; if so, entering step (6), otherwise returning to step (4);
step (6): binarizing the two divided regions, namely representing the region inside the curve with black pixels and the region outside the curve with white pixels;
step (7): labeling the connected components of the two binarized regions respectively with the region growing method;
step (8): judging whether the connected component labeling is finished; if so, entering step (9), otherwise returning to step (7);
step (9): filtering the connected components in the two regions;
step (10): judging whether the connected component filtering of the two regions is finished; if so, entering step (11), otherwise returning to step (9);
step (11): judging the polarity of the two filtered regions to determine which region is the text region: comparing the numbers of connected components in the two regions, taking the region with more connected components as the text region and the region with fewer connected components as the background region;
step (12): further filtering the determined text region to remove the residual background;
step (13): outputting a text extraction result;
in step (4), the energy function of the level set segmentation is:

F(c1, c2, φ) = μ ∫Ω δ(φ) |∇φ| dx dy + ν ∫Ω H(φ) dx dy + λ1 ∫Ω |u0(x, y) − c1|² H(φ) dx dy + λ2 ∫Ω |u0(x, y) − c2|² (1 − H(φ)) dx dy

where μ, ν, λ1, λ2 are all positive constants, c1 and c2 are respectively the mean gray levels of the image u0(x, y) inside and outside the boundary curve C, H(z) and δ(z) respectively denote the regularized Heaviside function and Dirac function, and φ(x, y), (x, y) ∈ Ω, denotes the level set function on the image region Ω.
2. The method for extracting text based on level set segmentation as claimed in claim 1, wherein the specific method of the step (4) is as follows:
step (4-1): representing the boundary curve C by a level set function φ(x, y): if the point (x, y) is inside the boundary curve C, then φ(x, y) > 0; if the point (x, y) is outside the boundary curve C, then φ(x, y) < 0; and if the point (x, y) is on the boundary curve C, then φ(x, y) = 0;
step (4-2): initializing the level set function: letting φ⁰ = φ₀ and k = 0, wherein φ₀ is a constant and is the initial value of the level set function φ;
step (4-3): minimizing the energy function E(c₁, c₂, φ): fixing the value φᵏ of the level set function at the k-th iteration and calculating the values of c₁ᵏ and c₂ᵏ, wherein c₁ᵏ is the mean gray level inside the boundary curve C at the k-th iteration, and c₂ᵏ is the mean gray level outside the boundary curve C at the k-th iteration;
step (4-4): minimizing the energy function E(c₁, c₂, φ): fixing c₁ᵏ and c₂ᵏ and calculating φᵏ⁺¹, wherein φᵏ⁺¹ denotes the value of the level set function φ at the (k+1)-th iteration;
step (4-5): judging whether φᵏ⁺¹ has converged; if not, letting k = k + 1 and returning to the step (4-3) to continue the iteration; otherwise, stopping the iteration and entering the step (4-6);
step (4-6): outputting the level set segmentation result.
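Steps (4-2) to (4-6) amount to alternating minimization: the region means are updated for a fixed level set, then one gradient-descent step is taken on the level set. A minimal numpy sketch of that loop (the function name, the step size `dt`, the tolerance `tol`, and all parameter defaults are illustrative assumptions, not claim language):

```python
import numpy as np

def chan_vese_segment(u0, phi0, mu=0.2, nu=0.0, lam1=1.0, lam2=1.0,
                      eps=1.0, dt=0.5, max_iter=200, tol=1e-3):
    """Alternate region means and gradient descent on phi until stationary."""
    phi = phi0.astype(float)
    for _ in range(max_iter):
        H = 0.5 * (1.0 + (2.0 / np.pi) * np.arctan(phi / eps))
        delta = (eps / np.pi) / (eps**2 + phi**2)
        # Step (4-3): mean gray levels inside / outside for fixed phi
        c1 = (u0 * H).sum() / max(H.sum(), 1e-9)
        c2 = (u0 * (1 - H)).sum() / max((1 - H).sum(), 1e-9)
        # Curvature term div(grad phi / |grad phi|)
        gy, gx = np.gradient(phi)
        mag = np.sqrt(gx**2 + gy**2) + 1e-9
        div = np.gradient(gx / mag, axis=1) + np.gradient(gy / mag, axis=0)
        # Step (4-4): one gradient-descent step on phi
        dphi = dt * delta * (mu * div - nu
                             - lam1 * (u0 - c1)**2 + lam2 * (u0 - c2)**2)
        phi = phi + dphi
        # Step (4-5): stop when phi is (numerically) stationary
        if np.abs(dphi).max() < tol:
            break
    return phi, c1, c2
```

On a two-region image, the sign of the returned `phi` separates the two gray-level populations, with `c1`/`c2` converging toward their means.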
3. The method for extracting text based on level set segmentation as claimed in claim 2, wherein the method of calculating the values of c₁ᵏ and c₂ᵏ at the k-th iteration in the step (4-3) is:

c_1^k = \frac{\int_\Omega u_0(x,y)\, H(\phi^k(x,y))\, dx\,dy}{\int_\Omega H(\phi^k(x,y))\, dx\,dy}, \qquad c_2^k = \frac{\int_\Omega u_0(x,y)\left(1 - H(\phi^k(x,y))\right) dx\,dy}{\int_\Omega \left(1 - H(\phi^k(x,y))\right) dx\,dy}

wherein u_0(x,y) is the given image, and H(z) is the regularized Heaviside function.
4. The method for extracting text based on level set segmentation as claimed in claim 2, wherein the specific method of calculating φᵏ⁺¹ in the step (4-4) is as follows:
using the values c₁ᵏ and c₂ᵏ calculated in the step (4-3), first forming the evolution equation of the level set function according to the following formula

\frac{\partial\phi}{\partial t} = \delta(\phi)\left[\mu\, \mathrm{div}\!\left(\frac{\nabla\phi}{|\nabla\phi|}\right) - \nu - \lambda_1\left(u_0 - c_1^k\right)^2 + \lambda_2\left(u_0 - c_2^k\right)^2\right]

and then integrating over one time step \Delta t to obtain φᵏ⁺¹:

\phi^{k+1} = \phi^k + \Delta t\, \delta(\phi^k)\left[\mu\, \mathrm{div}\!\left(\frac{\nabla\phi^k}{|\nabla\phi^k|}\right) - \nu - \lambda_1\left(u_0 - c_1^k\right)^2 + \lambda_2\left(u_0 - c_2^k\right)^2\right]

wherein div denotes the divergence operator, ∇ denotes the gradient operator, μ, ν, λ_1, λ_2 are all positive constants, and c₁ᵏ, c₂ᵏ are respectively the mean gray levels of the image u_0(x,y) inside and outside the boundary curve C at the k-th iteration.
5. The method for extracting text based on level set segmentation as claimed in claim 1, wherein the method of performing connected-element calibration on the two binarized regions by the region growing method in the step (7) is as follows:
step (7-1): searching the pixels in the region from top to bottom and from left to right; if a pixel has not been marked, assigning a new label to the pixel;
step (7-2): carrying out an 8-neighborhood search with the newly marked pixel as the starting point; if an unmarked pixel is found in the 8-neighborhood, assigning the same label to the found pixel, and continuing the 8-neighborhood search with this newly marked pixel as the starting point;
step (7-3): if no unmarked pixel is found in the 8-neighborhood, finishing the search;
step (7-4): judging whether all pixels have been marked; if so, entering the step (7-5); if not, returning to the step (7-1) to mark the remaining unmarked pixels in the region until all pixels have been marked;
step (7-5): taking the pixels with the same label as one connected element.
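Steps (7-1) to (7-5) describe region-growing connected-component labelling over the 8-neighborhood. A minimal pure-Python sketch (the function name and the explicit stack in place of repeated restarts are implementation choices, not claim language):

```python
def label_connected_elements(mask):
    """8-neighborhood connected-element labelling of a binary region:
    scan top-to-bottom, left-to-right; each unlabelled foreground pixel
    seeds a new label that is grown through its 8-neighborhood."""
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    next_label = 0
    for i in range(h):
        for j in range(w):
            if mask[i][j] and labels[i][j] == 0:
                next_label += 1                # step (7-1): assign new label
                labels[i][j] = next_label
                stack = [(i, j)]
                while stack:                   # steps (7-2)/(7-3): grow
                    y, x = stack.pop()
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < h and 0 <= nx < w
                                    and mask[ny][nx] and labels[ny][nx] == 0):
                                labels[ny][nx] = next_label
                                stack.append((ny, nx))
    # step (7-5): pixels sharing a label form one connected element
    return labels, next_label
```

Diagonally adjacent pixels end up in the same connected element, which matters for thin or slanted strokes.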
6. The method for extracting text based on level set segmentation as claimed in claim 1, wherein the method of filtering the connected elements in the step (9) is as follows:
judging, for each of the two regions, the position of each connected element and the number of pixels it contains; if a connected element is connected with the image boundary, or the number of pixels in the connected element is less than a set threshold, deleting the connected element.
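A compact sketch of this filter (the function name, the label-map input, and the return convention are illustrative assumptions): given a label map from the connected-element calibration, it removes elements that touch the image boundary or fall below a pixel-count threshold:

```python
def filter_connected_elements(labels, n_labels, min_pixels):
    """Delete connected elements that touch the image boundary or contain
    fewer than min_pixels pixels; return the cleaned label map and the
    set of kept labels."""
    h, w = len(labels), len(labels[0])
    sizes = {k: 0 for k in range(1, n_labels + 1)}
    touches = {k: False for k in range(1, n_labels + 1)}
    for i in range(h):
        for j in range(w):
            k = labels[i][j]
            if k:
                sizes[k] += 1
                if i in (0, h - 1) or j in (0, w - 1):
                    touches[k] = True          # element reaches the boundary
    kept = {k for k in sizes if not touches[k] and sizes[k] >= min_pixels}
    cleaned = [[labels[i][j] if labels[i][j] in kept else 0
                for j in range(w)] for i in range(h)]
    return cleaned, kept
```

Boundary-touching elements are usually background spilling in from outside the text area, and tiny elements are usually noise, so both are discarded before the polarity decision.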
7. The method for extracting text based on level set segmentation as claimed in claim 1, wherein in the step (11), the method of judging the polarity of the two filtered regions is as follows:
step (11-1): after filtering, taking the pixels with the same label in each of the two regions as one connected element;
step (11-2): respectively counting the numbers of connected elements in the two regions, denoted n₁ and n₂;
step (11-3): comparing n₁ and n₂: if n₁ > n₂, the region corresponding to n₁ is the text region; otherwise, the region corresponding to n₂ is the text region.
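The polarity decision of steps (11-1) to (11-3) reduces to counting distinct labels in each region's label map; text typically breaks into many stroke-sized elements while the background forms a few large blobs. A small illustrative sketch (names assumed, not from the claims):

```python
def judge_polarity(labels_a, labels_b):
    """Count connected elements (distinct non-zero labels) in each region;
    the region with more elements is taken as the text region.
    Returns ('a' or 'b', n1, n2)."""
    def count(labels):
        return len({v for row in labels for v in row if v != 0})
    na, nb = count(labels_a), count(labels_b)
    return ("a", na, nb) if na > nb else ("b", na, nb)
```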
8. The method for extracting text based on level set segmentation as claimed in claim 1, wherein in the step (12), the method of further removing the residual background from the determined text region is as follows:
counting the mean gray level of each connected element in the region; arranging the mean gray levels of the connected elements in ascending order; calculating the differences between adjacent mean gray levels; comparing each difference with a set threshold in turn, and taking the corresponding position as a segmentation position if the difference is larger than the set threshold; after all the differences have been judged, obtaining N segmentation positions; taking the segment containing the largest number of pixels as the text segment; taking the connected elements corresponding to the text segment as text connected elements, taking the positions corresponding to the text connected elements as the text region, and taking the other regions in the image as the background region.
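The gray-level splitting in this step can be sketched as follows (the `(mean_gray, n_pixels)` tuple representation and the names are illustrative assumptions): sort the per-element gray means, cut wherever an adjacent gap exceeds the threshold, and keep the segment with the largest total pixel count as the text elements:

```python
def split_text_elements(element_stats, gap_threshold):
    """element_stats: list of (mean_gray, n_pixels), one per connected
    element. Sort by mean gray, split at gaps larger than gap_threshold,
    and return the segment with the most pixels as the text elements."""
    stats = sorted(element_stats)                 # ascending mean gray
    segments, current = [], [stats[0]]
    for prev, cur in zip(stats, stats[1:]):
        if cur[0] - prev[0] > gap_threshold:      # a segmentation position
            segments.append(current)
            current = []
        current.append(cur)
    segments.append(current)
    # text segment = segment with the largest total pixel count
    return max(segments, key=lambda seg: sum(n for _, n in seg))
```

The rationale is that text strokes share a similar gray level, so they cluster into one tight band of means, while residual background elements fall into bands separated by large gaps.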
CN201510474071.XA 2015-08-05 2015-08-05 A kind of text abstracting method based on level-set segmentation Active CN105160300B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510474071.XA CN105160300B (en) 2015-08-05 2015-08-05 A kind of text abstracting method based on level-set segmentation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510474071.XA CN105160300B (en) 2015-08-05 2015-08-05 A kind of text abstracting method based on level-set segmentation

Publications (2)

Publication Number Publication Date
CN105160300A CN105160300A (en) 2015-12-16
CN105160300B true CN105160300B (en) 2018-08-21

Family

ID=54801152

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510474071.XA Active CN105160300B (en) 2015-08-05 2015-08-05 A kind of text abstracting method based on level-set segmentation

Country Status (1)

Country Link
CN (1) CN105160300B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109754443B (en) 2019-01-30 2021-04-20 京东方科技集团股份有限公司 Image data conversion method, device and storage medium
CN112001406B (en) * 2019-05-27 2023-09-08 杭州海康威视数字技术股份有限公司 Text region detection method and device
CN112749599B (en) * 2019-10-31 2024-12-06 北京金山云网络技术有限公司 Image enhancement method, device and server

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102147863A (en) * 2010-02-10 2011-08-10 中国科学院自动化研究所 Method for locating and recognizing letters in network animation
CN102332097A (en) * 2011-10-21 2012-01-25 中国科学院自动化研究所 A Segmentation Method of Complex Background Text Image Based on Graph Cut
CN103077391A (en) * 2012-12-30 2013-05-01 信帧电子技术(北京)有限公司 Automobile logo positioning method and device
CN104091332A (en) * 2014-07-01 2014-10-08 黄河科技学院 Method for optimizing multilayer image segmentation of multiclass color texture images based on variation model

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Research and Application of Paragraph Segmentation Technology for Document Images (文档图像段落分割技术研究与应用); Zhao Na; China Master's Theses Full-text Database (Electronic Journal), Information Science and Technology; 2011-04-30; I138-1156 *
Research on a License Plate Recognition System (车牌识别系统的研究); Gu Yubiao; China Master's Theses Full-text Database (Electronic Journal), Information Science and Technology; 2014-08-31; I138-1367 *

Also Published As

Publication number Publication date
CN105160300A (en) 2015-12-16

Similar Documents

Publication Publication Date Title
CN110163842B (en) Building crack detection method and device, computer equipment and storage medium
CN101334836B (en) License plate positioning method incorporating color, size and texture characteristic
CN104809481B (en) A kind of natural scene Method for text detection based on adaptive Color-based clustering
CN105205488B (en) Word area detection method based on Harris angle points and stroke width
CN111814722A (en) A form recognition method, device, electronic device and storage medium in an image
CN110659644B (en) Automatic stroke extraction method of calligraphy words
CN110838126A (en) Cell image segmentation method, device, computer equipment and storage medium
US10438083B1 (en) Method and system for processing candidate strings generated by an optical character recognition process
CN113158808A (en) Method, medium and equipment for Chinese ancient book character recognition, paragraph grouping and layout reconstruction
CN111091124B (en) Spine character recognition method
CN108133212A (en) A kind of quota invoice amount identifying system based on deep learning
CN108520278A (en) A Detection Method and Evaluation Method for Pavement Cracks Based on Random Forest
CN106780440A (en) Destruction circuit plate relic image automatic comparison recognition methods
CN101957919A (en) Character recognition method based on image local feature retrieval
CN113111868A (en) Character defect detection method, system, device and storage medium
CN112749673A (en) Method and device for intelligently extracting stock of oil storage tank based on remote sensing image
CN105160300B (en) A kind of text abstracting method based on level-set segmentation
CN111626302A (en) Method and system for cutting adhered text lines of ancient book document images of Ujin Tibetan
CN112766246A (en) Document title identification method, system, terminal and medium based on deep learning
US10217020B1 (en) Method and system for identifying multiple strings in an image based upon positions of model strings relative to one another
Gui et al. A fast caption detection method for low quality video images
S Deshmukh et al. A hybrid character segmentation approach for cursive unconstrained handwritten historical Modi script documents
CN111325199B (en) Text inclination angle detection method and device
CN108549889B (en) A Simple Method of Printed Number Recognition
CN115170507B (en) Grouting pipe surface defect detection method and system based on image data

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant