CN1941838A - File and picture binary coding method - Google Patents

Publication number
CN1941838A
Authority
CN
China
Prior art keywords
pixel
threshold
image
undetermined
global
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN 200510107630
Other languages
Chinese (zh)
Other versions
CN100479484C (en)
Inventor
郝瑛
欧文武
王刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Science Of Co
Ricoh Software Research Center Beijing Co Ltd
Original Assignee
Ricoh Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ricoh Co Ltd
Priority to CNB200510107630XA
Publication of CN1941838A
Application granted
Publication of CN100479484C
Expired - Fee Related
Anticipated expiration

Landscapes

  • Facsimile Image Signal Circuits (AREA)
  • Image Processing (AREA)

Abstract

The present invention provides an image processing method for binarizing a document image, comprising the following steps: a) in global thresholding, determining a global threshold for binarizing the image and, according to the global threshold, dividing the pixels of the document image into three classes: black, white, and undetermined pixels; b) determining an adaptive binarization threshold for each undetermined pixel and binarizing the undetermined pixel according to that adaptive binarization threshold.

Description

Document Image Binarization Method

Technical Field

The present invention relates to the field of image processing and, specifically, provides a technique for converting digital images obtained from scanners, fax machines, or digital cameras into binary images. Its fields of application are document image processing, document management, and document recognition.

Background

In contemporary society, documents are the primary information carrier. The present invention therefore improves the binarization of images, in particular document images composed of text, tables, lines, and pictures. Because the information in a document image is essentially binary, under ideal conditions it can be represented by a single foreground and a single background, for example white for the background and black for the useful information, i.e., the foreground. In practice, however, the foreground and background of an image usually vary because of the printing process, uneven reflection, the diversity of the document's own content, and a wide range of artistic effects. The purpose of document image binarization is to separate the useful information from the useless information and to represent the result as a binary image.

Image binarization is a necessary step in many applications. For example, US Patent 5,452,107 proposes a method that determines the binarization threshold from the density of a local region of the original image, including the average of the target pixel and its surrounding pixels. The drawback of this method is that a local region can provide only limited information.

Summary of the Invention

The object of the present invention is to provide a document image binarization method that solves the above problem of the prior art.

To achieve this object, the present invention provides an image processing method for binarizing a document image, comprising the following steps: a) in global thresholding, determining a global threshold for binarizing the image and, according to the global threshold, dividing the pixels of the document image into three classes: black, white, and undetermined pixels; b) determining an adaptive binarization threshold for each undetermined pixel and binarizing the undetermined pixel according to that adaptive binarization threshold.

The document image binarization method of the present invention combines global and local information and makes effective use of both the local information and the history information of the image. It can therefore provide binarized document images of higher quality.

Brief Description of the Drawings

The above and other objects and features of the present invention will become clearer from the following description taken in conjunction with the accompanying drawings, in which:

Fig. 1 is an overview flowchart of the image binarization method proposed by the present invention.

Fig. 2 is a detailed flowchart of the preprocessing module of the method shown in Fig. 1.

Fig. 3 is a detailed flowchart of the global thresholding module of the method shown in Fig. 1.

Fig. 4 shows an example histogram of a document image after global thresholding, with the three global thresholds T1, T2, and T3 obtained by the global thresholding method marked on it.

Fig. 5 is a detailed flowchart of the local thresholding module of the method shown in Fig. 1.

Fig. 6 is a detailed flowchart of the post-processing module of the method shown in Fig. 1.

Fig. 7 shows an example of binarizing an image with the method of the present invention.

Detailed Description

Embodiments of the present invention are described in detail below with reference to the drawings.

If the pixel values or color values of the background and of the useful information (the foreground) are consistent across the whole image, a single threshold is enough to obtain a high-quality binarized image. This approach is called global thresholding.

However, most document images in use today contain rich charts and artistic effects, and a single threshold tends to introduce noise or fail to preserve useful information. Binarizing different pixels, or pixels in different regions, with different thresholds is usually called local thresholding.

Fig. 1 is an overview flowchart of the image binarization method proposed by the present invention. The method combines global and local information.

Referring to Fig. 1, the input to the document image binarization method of the present invention is a paper or electronic document 10, which is converted into an electronic binarized image after passing through preprocessing module 11, global thresholding module 12, local thresholding module 13, and post-processing module 14.

If input document 10 is a paper document, an optical scanning device such as a scanner, fax machine, or digital camera is needed to convert it into a digital image that a computer can process. The digital image may be in BMP, JPEG, TIF, or a similar format.

Preprocessing module 11 applies to the image a series of operations, described below, whose results are used by the subsequent thresholding modules.

Global thresholding module 12 then determines two thresholds that divide the image pixels into white, black, and undetermined pixels. Undetermined pixels are those whose class cannot be decided from the histogram information during global thresholding; they may belong to artistic effects, charts, photographs, tables, or even text. Because global thresholding settles most of the image pixels, it significantly speeds up binarization. A further benefit is that, since the global thresholding stage does not distinguish foreground from background, it preserves reverse-color information in the document (for example, light text on a dark background).

In the present invention, local thresholding module 13 determines a binarization threshold for each undetermined pixel from local image features and historical threshold information. The local features include the mean and variance of a local region of the image, while the historical threshold information comes from neighboring pixels that have already been binarized. The use of historical threshold information is very important in the present invention, because it significantly improves the quality of the output binarized document image.

Finally, post-processing module 14 processes the image binarized by global thresholding module 12 and local thresholding module 13 in order to remove noise from it. Generally speaking, such noise falls into three classes: touching text strokes, broken text strokes, and isolated noise points. The post-processing method of the present invention removes most of the noise in the image without introducing new noise.

After the above processing, the useful information of input document 10 is represented as a binarized document image 15. This image can be used in many fields, such as further image analysis, detection of text color, document image compression, document layout analysis, and optical character recognition.

Each module in Fig. 1 is described in detail below with reference to Figs. 2 to 6.

Fig. 2 shows the flow of preprocessing module 11 in detail. The function of preprocessing module 11 is to smooth the image to remove noise and to provide the data needed by the subsequent global thresholding module 12. If the input is a paper document, module 101 first digitizes it to produce a digital image. If the image is in color, module 102 converts it to a grayscale image, or each channel is processed separately. Depending on the content and quality of the image, a histogram equalization module may be applied to the gray levels. The low-pass filter 104 that follows may be a linear filter such as a Gaussian filter, or a non-linear filter such as a mean filter.
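The grayscale conversion and smoothing steps can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names `to_gray` and `mean_filter`, the BT.601 luma weights, and the 3x3 default window are all assumptions chosen for the example.

```python
def to_gray(rgb):
    """Convert one (R, G, B) pixel to an 8-bit gray level.

    The BT.601 luma weights are an assumption; the text only says
    that a color image is converted to grayscale.
    """
    r, g, b = rgb
    return int(round(0.299 * r + 0.587 * g + 0.114 * b))


def mean_filter(img, radius=1):
    """Smooth a 2-D list of gray levels with a box average.

    The window is (2*radius + 1) pixels on a side and is clipped
    at the image border.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [img[j][i]
                    for j in range(max(0, y - radius), min(h, y + radius + 1))
                    for i in range(max(0, x - radius), min(w, x + radius + 1))]
            out[y][x] = sum(vals) // len(vals)
    return out
```

A Gaussian kernel could be substituted for the box average without changing the rest of the pipeline.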

The image is then divided into blocks. If the difference between the maximum and minimum pixel values within a block is smaller than a preset threshold, the block is considered uniform; it provides no meaningful information for determining the global thresholds, so image mask estimation module 105 masks it out and excludes it from consideration. The mask also works well when the useful information occupies only a small part of the image. Finally, the histogram of the image is computed under the image mask and serves as the input to global thresholding module 12. For speed, the image may also be down-sampled, with the resulting thresholds applied to the original image.
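The mask estimation and masked histogram can be sketched as below. The names `block_mask` and `masked_histogram`, the block size of 8, and the range threshold of 16 are hypothetical values picked for illustration; the text does not specify them.

```python
def block_mask(img, block=8, min_range=16):
    """Return a per-pixel mask that is True only inside non-uniform blocks.

    A block is treated as uniform, and masked out, when the difference
    between its maximum and minimum gray level is below min_range.
    """
    h, w = len(img), len(img[0])
    mask = [[False] * w for _ in range(h)]
    for by in range(0, h, block):
        for bx in range(0, w, block):
            ys = range(by, min(by + block, h))
            xs = range(bx, min(bx + block, w))
            vals = [img[y][x] for y in ys for x in xs]
            keep = (max(vals) - min(vals)) >= min_range
            for y in ys:
                for x in xs:
                    mask[y][x] = keep
    return mask


def masked_histogram(img, mask, levels=256):
    """Count gray levels only at pixels where the mask is True."""
    hist = [0] * levels
    for row, mask_row in zip(img, mask):
        for value, keep in zip(row, mask_row):
            if keep:
                hist[value] += 1
    return hist
```

Excluding uniform blocks keeps large blank areas from dominating the histogram, which is the effect the paragraph above attributes to the mask.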

Fig. 3 shows the flowchart of the global thresholding algorithm in detail. This module analyzes the histogram obtained from preprocessing module 11. First, module 111 selects an optimal threshold T1 between the minimum and maximum pixel gray levels; modules 112 and 113 then select thresholds T2 and T3 between the minimum and T1 and between T1 and the maximum, respectively. In one possible embodiment of the present invention, the Otsu algorithm, which is based on a linear discriminant criterion, is used to determine T1, T2, and T3 (this is a very widely used algorithm; see N. Otsu, "A threshold selection method from gray-level histograms," IEEE Trans. Syst., Man, Cybern., vol. SMC-9, no. 1, pp. 62-66, Jan. 1979). That is, T1, T2, and T3 are computed on the histogram with the Otsu algorithm, and the three thresholds satisfy T2≤T1≤T3. In module 114, a pixel whose gray level is less than T2 is classified as black and represented as 1; a pixel whose gray level is greater than T3 is classified as white and represented as 0. The remaining pixels are classified as undetermined. It is worth noting that, as printing technology has improved, a large number of documents have appeared that have rich backgrounds and carry their useful text in a single bright color. Therefore, to preserve reverse-color information, module 114 does not distinguish between foreground and background.
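The three-threshold scheme can be sketched as follows, assuming a 256-bin histogram with at least one non-empty bin. The helper names `otsu`, `global_thresholds`, and `classify`, and the use of -1 as the marker for undetermined pixels, are assumptions for illustration; the text only fixes 1 for black and 0 for white.

```python
BLACK, WHITE, PENDING = 1, 0, -1  # PENDING marker is hypothetical


def otsu(hist, lo, hi):
    """Otsu threshold restricted to gray levels lo..hi (inclusive).

    Returns the level t that maximizes the between-class variance when
    levels <= t form one class and levels > t form the other.
    """
    total = sum(hist[lo:hi + 1])
    if total == 0:
        return (lo + hi) // 2
    sum_all = sum(g * hist[g] for g in range(lo, hi + 1))
    w0, sum0 = 0, 0
    best_t, best_var = lo, -1.0
    for t in range(lo, hi):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def global_thresholds(hist):
    """Compute T1 on the full range, then T2 below T1 and T3 above it."""
    levels = [g for g, n in enumerate(hist) if n > 0]
    gmin, gmax = levels[0], levels[-1]
    t1 = otsu(hist, gmin, gmax)
    t2 = otsu(hist, gmin, t1)
    t3 = otsu(hist, t1, gmax)
    return t1, t2, t3


def classify(value, t2, t3):
    """Three-way split: black below T2, white above T3, else undetermined."""
    if value < t2:
        return BLACK
    if value > t3:
        return WHITE
    return PENDING
```

Because T2 is searched only at or below T1 and T3 only at or above it, the ordering T2≤T1≤T3 holds by construction in this sketch.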

Fig. 4 gives an example of global thresholding. The horizontal axis is the pixel gray level and the vertical axis is the number of times each gray level occurs in the whole image, i.e., the histogram. T1, T2, and T3 are the three global thresholds determined by the method above, of which T2 and T3 are used for global thresholding.

Histogram analysis alone cannot determine whether the pixels that fall between T2 and T3 (the undetermined pixels) contain useful information, so further information is needed.

Fig. 5 gives the flowchart of the locally adaptive thresholding module, which determines whether the pixels that fall between T2 and T3 (the undetermined pixels) contain useful information. The module examines the pixels of the image one by one. If the current pixel is black or white, it moves on to the next pixel. If the value of the current pixel lies between black and white, i.e., the pixel belongs to the undetermined class, the module determines a threshold for that pixel and binarizes it according to that threshold.

If the current pixel is the first undetermined pixel in its row, module 121 determines its threshold from the local features of the current pixel, namely the local mean and local variance, using the Sauvola algorithm (see J. Sauvola, M. Pietikäinen, "Adaptive document image binarization", Pattern Recognition, Vol. 33, pp. 225-236, 2000).
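For reference, the Sauvola threshold is commonly written as T = m * (1 + k * (s / R - 1)), where m and s are the local mean and standard deviation. The sketch below is a minimal illustration of that published formula; the function names, the window radius, and the defaults k = 0.5 and R = 128 (the values usually quoted for 8-bit images) are assumptions, not parameters stated in this patent.

```python
import math


def local_stats(img, y, x, radius=7):
    """Mean and standard deviation of the window centered on (y, x).

    The window is clipped at the image border.
    """
    h, w = len(img), len(img[0])
    vals = [img[j][i]
            for j in range(max(0, y - radius), min(h, y + radius + 1))
            for i in range(max(0, x - radius), min(w, x + radius + 1))]
    mean = sum(vals) / len(vals)
    var = sum((v - mean) ** 2 for v in vals) / len(vals)
    return mean, math.sqrt(var)


def sauvola_threshold(mean, std, k=0.5, R=128.0):
    """Sauvola threshold: lower in flat regions, close to the mean in busy ones."""
    return mean * (1.0 + k * (std / R - 1.0))
```

In a flat region (s near 0) the threshold drops well below the local mean, so background pixels are unlikely to be marked black.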

If the current pixel is not the first undetermined pixel in its row, historical threshold information, i.e., the threshold determined for the previous undetermined pixel, is added to the local features. Module 122 combines the local information and the historical threshold information in a specific way to determine the threshold for the current pixel; the particular coefficients can be chosen according to the application and the characteristics of the document. For an OCR application, for example, the coefficients can be optimized using the character extraction rate as the criterion. Once the threshold has been selected, the undetermined pixel is binarized to black if its gray level is less than the threshold, and to white otherwise.

In one possible embodiment of the present invention, the local information and the historical threshold information are combined by the following formula:

            T = m * (1 - k1 * (k2 * VAR + k3 * T_history) / R)

where T is the threshold for the undetermined pixel, m is the mean of a neighborhood centered on the undetermined pixel, VAR is the variance of that neighborhood, T_history is the historical threshold information, and k1, k2, k3, and R are linear coefficients.
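A row scan using this formula might look like the sketch below. It is only an illustration under stated assumptions: the names `combined_threshold` and `binarize_row`, the -1 marker for undetermined pixels, and the caller-supplied `first_threshold` function (standing in for module 121) are hypothetical, and the coefficient values must be tuned by the user since the text gives none.

```python
BLACK, WHITE, PENDING = 1, 0, -1  # PENDING marker is hypothetical


def combined_threshold(m, var, t_history, k1, k2, k3, R):
    """T = m * (1 - k1 * (k2 * VAR + k3 * T_history) / R), as in the text."""
    return m * (1.0 - k1 * (k2 * var + k3 * t_history) / R)


def binarize_row(gray, labels, means, variances, first_threshold, coeffs):
    """Resolve the PENDING entries of one row, scanning left to right.

    gray, labels, means, variances are equal-length lists for one row.
    first_threshold(mean, var) gives the threshold for the first pending
    pixel; later pending pixels reuse the previous pixel's threshold as
    the history term. coeffs is the tuple (k1, k2, k3, R).
    """
    out = list(labels)
    t_history = None
    for x, label in enumerate(labels):
        if label != PENDING:
            continue  # already decided by global thresholding
        if t_history is None:
            t = first_threshold(means[x], variances[x])
        else:
            t = combined_threshold(means[x], variances[x], t_history, *coeffs)
        out[x] = BLACK if gray[x] < t else WHITE
        t_history = t
    return out
```

Resetting `t_history` at the start of each row matches the description, in which the first undetermined pixel of a row is handled without history.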

Document images are usually composed of characters, lines, tables, photographs, charts, and so on, and each of these components has its own characteristics. In a binarized image, however, the most important information is the characters, the lines, the structure of the tables, and the characters inside them. As noted above, the noise in a binarized image falls into three classes: touching strokes, broken strokes, and isolated noise points or blobs. The purpose of post-processing is to separate touching strokes, join broken strokes, and remove isolated noise points without introducing new noise in the process.

Fig. 6 gives the flowchart of the post-processing module in detail. The basic idea is to analyze the image iteratively, with the result of each iteration deciding whether to continue. The input to post-processing is the image binarized by global and local thresholding. In each iteration, the number of pixels in the neighborhood of each pixel that have the same color as that pixel is counted; if the count is less than a threshold T4, the center pixel is inverted, otherwise it keeps its color. The success of this method depends on the neighborhood threshold and the neighborhood size. The post-processing of the present invention uses a relatively large neighborhood, and the neighborhood threshold is increased moderately according to the result of the previous iteration. If the number of pixels inverted in an iteration is less than a threshold T5, the noise in the image is already within an acceptable range and the iteration stops. This approach effectively limits the noise that is introduced.
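The iteration described above can be sketched as follows. The function name `despeckle`, the 5x5 neighborhood, the starting value and step for T4, the fixed T5, and the iteration cap are all illustrative assumptions; the text leaves these values and the exact update rule open.

```python
def despeckle(img, radius=2, t4=4, t4_step=1, t5=1, max_iter=10):
    """Iteratively invert pixels that have too few same-colored neighbors.

    img is a 2-D list of 0/1 values. Each pass counts, for every pixel,
    the same-colored pixels in its (2*radius+1)^2 window, excluding the
    pixel itself and clipping at the border. Pixels whose count is below
    t4 are inverted. The loop stops when fewer than t5 pixels changed or
    after max_iter passes; otherwise t4 grows by t4_step.
    """
    h, w = len(img), len(img[0])
    cur = [row[:] for row in img]
    for _ in range(max_iter):
        nxt = [row[:] for row in cur]
        flipped = 0
        for y in range(h):
            for x in range(w):
                same = -1  # start at -1 to exclude the center pixel
                for j in range(max(0, y - radius), min(h, y + radius + 1)):
                    for i in range(max(0, x - radius), min(w, x + radius + 1)):
                        if cur[j][i] == cur[y][x]:
                            same += 1
                if same < t4:
                    nxt[y][x] = 1 - cur[y][x]
                    flipped += 1
        cur = nxt
        if flipped < t5:
            break
        t4 += t4_step
    return cur
```

Each pass reads from the previous image and writes to a copy, so the decision for one pixel does not depend on the scan order within the pass.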

Through the processing described in Figs. 2 to 6, an input document is converted into a binary image.

Fig. 7 gives a concrete example of binarization, in which A is the original image, B is the result after global thresholding, and C is the result after local thresholding.

The present invention is not limited to the specific embodiments described above. It will be apparent to those of ordinary skill in the art that various combinations, changes, and modifications can be made without departing from the scope of protection defined by the appended claims.

For example, in one possible variation of the preprocessing module, modules 103, 104, and 105 of Fig. 2 may be removed or changed. If the pixel distributions of the foreground and background are fairly uniform, low-pass filtering is unnecessary.

In one possible variation of the global thresholding module, the global thresholding method used in modules 111, 112, and 113 of Fig. 3 may be changed, for example to a method based on information entropy or on moments, and the methods used to determine T1, T2, and T3 need not be the same.

In one possible variation of the local thresholding module, the method of Fig. 5 for determining the threshold of the first undetermined pixel in the current row may be changed. In addition, the historical threshold information in Fig. 5 may be taken from the threshold of the previous undetermined pixel in the same column as the current pixel. Furthermore, the linear coefficients used to combine the local features and the historical threshold information may be adjusted for the specific application.

Claims (9)

1. An image processing method for binarizing a document image, comprising the following steps:
a) in global thresholding, determining a global threshold for binarizing the image and, according to the global threshold, dividing the pixels of the document image into three classes: black, white, and undetermined pixels;
b) determining an adaptive binarization threshold for each undetermined pixel and binarizing the undetermined pixel according to the adaptive binarization threshold.
2. The image processing method according to claim 1, wherein step a) further comprises the following steps:
determining, by histogram analysis, a first global threshold (T1) between the pixel minimum and the pixel maximum;
determining, by histogram analysis, a second global threshold (T2) between the pixel minimum and the first global threshold (T1);
determining, by histogram analysis, a third global threshold (T3) between the second global threshold (T2) and the pixel maximum;
dividing the image pixels into three classes according to the second global threshold (T2) and the third global threshold (T3): pixels whose value is less than the second global threshold (T2) are black pixels, pixels whose value is greater than the third global threshold (T3) are white pixels, and pixels whose value lies between the second global threshold (T2) and the third global threshold (T3) are undetermined pixels.
3. The image processing method according to claim 1, characterized in that step b) further comprises the following steps:
determining the adaptive threshold for the first undetermined pixel of each row or each column using local features;
determining the adaptive threshold for each subsequent undetermined pixel by combining local features and historical threshold information in a specific way;
after the adaptive threshold has been selected, binarizing the undetermined pixel to black if its gray level is less than the adaptive threshold, and to white otherwise.
4. The image processing method according to claim 3, characterized in that:
the local features comprise the mean and variance of a local region of the image;
the historical threshold information is the threshold of the previous undetermined pixel in the current row or column.
5. The image processing method according to claim 1, further comprising, before the binarization, the step of:
preprocessing the image to provide data for the global thresholding.
6. The image processing method according to claim 5, wherein the preprocessing step further comprises the following steps:
low-pass filtering the document image to remove high-frequency noise;
determining an image mask according to the range of pixel value variation within each image block;
if necessary, down-sampling the image according to the image mask;
computing the histogram of the original image or of the down-sampled image according to the image mask.
7. The image processing method according to claim 6, characterized in that a Gaussian filter or a mean filter is used to low-pass filter the document image.
8. The image processing method according to claim 1, characterized by further comprising the following step:
d) post-processing the binarized image to remove noise.
9. The image processing method according to claim 8, characterized in that step d) may further comprise the following steps:
counting the pixels in the neighborhood of the current pixel that have the same color as the current pixel;
if the count is less than a fourth threshold (T4), inverting the current pixel;
if the number of pixels inverted in the current iteration is less than a fifth threshold (T5), or the iteration count has reached its maximum, stopping the iteration; otherwise recomputing the fourth threshold (T4) and the fifth threshold (T5) and continuing the iteration.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB200510107630XA CN100479484C (en) 2005-09-29 2005-09-29 File and picture binary coding method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CNB200510107630XA CN100479484C (en) 2005-09-29 2005-09-29 File and picture binary coding method

Publications (2)

Publication Number Publication Date
CN1941838A true CN1941838A (en) 2007-04-04
CN100479484C CN100479484C (en) 2009-04-15

Family

ID=37959585

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB200510107630XA Expired - Fee Related CN100479484C (en) 2005-09-29 2005-09-29 File and picture binary coding method

Country Status (1)

Country Link
CN (1) CN100479484C (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102289668A (en) * 2011-09-07 2011-12-21 谭洪舟 Binaryzation processing method of self-adaption word image based on pixel neighborhood feature
CN102496021A (en) * 2011-11-23 2012-06-13 南开大学 Wavelet transform-based thresholding method of image
CN106203251A (en) * 2015-05-29 2016-12-07 柯尼卡美能达美国研究所有限公司 File and picture binary coding method
CN106203251B (en) * 2015-05-29 2019-04-23 柯尼卡美能达美国研究所有限公司 File and picture binary coding method
CN106446896A (en) * 2015-08-04 2017-02-22 阿里巴巴集团控股有限公司 Character segmentation method and device and electronic equipment
US10552705B2 (en) 2015-08-04 2020-02-04 Alibaba Group Holding Limited Character segmentation method, apparatus and electronic device
CN106446896B (en) * 2015-08-04 2020-02-18 阿里巴巴集团控股有限公司 Character segmentation method and device and electronic equipment
CN106778761A (en) * 2016-12-23 2017-05-31 潘敏 A kind of processing method of vehicle transaction invoice
CN107610132A (en) * 2017-08-28 2018-01-19 西北民族大学 A kind of ancient books file and picture greasiness removal method
CN107609558A (en) * 2017-09-13 2018-01-19 北京元心科技有限公司 Character image processing method and processing device
CN109635823A (en) * 2018-12-07 2019-04-16 湖南中联重科智能技术有限公司 The method and apparatus and engineering machinery of elevator disorder cable for identification

Also Published As

Publication number Publication date
CN100479484C (en) 2009-04-15

Similar Documents

Publication Publication Date Title
CN1208943C (en) Method and apparatus for increasing digital image quality
Kasar et al. Font and background color independent text binarization
CN101042735A (en) Image binarization method and device
CN101710425B (en) Self-adaptive pre-segmentation method based on gray scale and gradient of image and gray scale statistic histogram
JPH07231388A (en) System and method for detection of photo region of digital image
CN1622589A (en) Image processing method and image processing apparatus
CN1578381A (en) Adaptive halftone scheme to preserve image smoothness and sharpness with region identification
CN1941838A (en) File and picture binary coding method
Amin et al. A binarization algorithm for historical arabic manuscript images using a neutrosophic approach
Nandy et al. An analytical study of different document image binarization methods
Saini Document image binarization techniques, developments and related issues: a review
CN1275191C (en) Method and appts. for expanding character zone in image
CN1797428A (en) Method and device for self-adaptive binary state of text, and storage medium
CN1694119A (en) A Method of Image Binarization
Cherala et al. Palm leaf manuscript/color document image enhancement by using improved adaptive binarization method
KR100537827B1 (en) Method for the Separation of text and Image in Scanned Documents using the Distribution of Edges
Wagdy et al. Border noise removal and clean up based on retinex theory
Gatos et al. Locating text in historical collection manuscripts
Boiangiu et al. Bitonal image creation for automatic content conversion
Xi et al. A novel binarization system for degraded document images
CN1310183C (en) Binary conversion method of character and image
Kasar et al. Specialized Text Binarization Technique for Camera based Document images
Chandra et al. Improved adaptive binarization technique for document image analysis.
Yoshida et al. A new binarization method for a sign board image with the blanket method
Yang OCR oriented binarization method of document image

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C41 Transfer of patent application or patent right or utility model
TR01 Transfer of patent right

Effective date of registration: 20090724

Address after: Tokyo, Japan

Co-patentee after: RICOH SOFTWARE RESEARCH CENTER (BEIJING) CO., LTD.

Patentee after: Science of the company

Address before: Tokyo, Japan

Patentee before: Ricoh Co., Ltd.

CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20090415

Termination date: 20190929