JP5337563B2

JP5337563B2 - Form recognition method and apparatus

Info

Publication number: JP5337563B2
Application number: JP2009093533A
Authority: JP
Inventors: 広新庄; 健永崎; 和樹中島; 博文木村
Original assignee: Hitachi Computer Peripherals Co Ltd
Current assignee: Hitachi Information and Telecommunication Engineering Ltd
Priority date: 2009-04-08
Filing date: 2009-04-08
Publication date: 2013-11-06
Anticipated expiration: 2029-04-08
Also published as: JP2010244372A

Abstract

PROBLEM TO BE SOLVED: To keep only the character pixels and drop out pixels of all other colors when the colors of the characters and ruled lines are not fixed in advance.

SOLUTION: The pixels of an input image are color-clustered, and the background color and the character color are determined from the number of pixels in each cluster. Pixels whose colors are close to the character color are kept to create a binary image of the characters. Because every color other than the character color is dropped out, dropout works regardless of the colors of noise such as ruled lines and regardless of the hues of false colors. For pixels whose colors are mixed where characters cross ruled lines, and for false-color pixels inside characters, whether to drop them out is decided in relation to the surrounding pixels.

COPYRIGHT: (C)2011, JPO&INPIT

Description

The present invention relates to image processing techniques for OCR (Optical Character Reader) devices and the like, and more particularly to a technique for dropping out background colors and similar components when the color to be removed is not fixed in advance.

To recognize characters on a form by OCR, only the character components must be extracted from the image read by an image scanner or similar device. Extracting the character components requires separating them from the background and from noise components such as ruled lines. Binarization is generally used to separate characters from the background. Binarization yields a binary image in which low-luminance character components are black and high-luminance background portions are white. When the character components and noise components such as ruled lines differ in color, dropout processing is applied before binarization. Dropout processing prevents a specific color from appearing in the image of a printed or typed form or document.

There are three typical conventional techniques for dropout processing.
(1) Optical dropout of a specified red or blue color
The noise component is specified as either reddish or bluish, and an image captured under a light source of the same color family as the noise component is used for character recognition. For example, if a form with black characters and red ruled lines is imaged under a red light source, the ruled lines do not appear in the image. The same applies to blue. Many existing hardware OCR products use this method.
(2) Dynamic selection of the optimal dropout color in RGB
A color image is captured, the channel among R, G, and B in which the noise component has the highest luminance is selected, and the image is binarized using the pixel gray values in that channel. Patent Document 1 is a conventional example of this method.
(3) Separation by density in a grayscale image
This method uses a grayscale image and no color information. A histogram of luminance values is taken from the grayscale image, and the components are separated on the assumption that the brightest region is background, the darkest region is characters, and the intermediate region is noise. Patent Document 2 is a conventional example of this method.

Patent Document 1: JP 2003-196592 A
Patent Document 2: JP 2004-94427 A
Patent Document 3: JP 2007-156764 A

An object of the image processing device of the present invention is to separate character components from other components using the color information and gray-level information in an image. A further object is to drop out correctly even when the combination of character color and noise color differs from image to image. The problems to be solved in realizing this processing are as follows.
(a) Character and noise components come in various colors, and the combination differs from image to image
When forms issued by different organizations are processed in one batch, there is no guarantee that the colors of the characters, or of noise components such as ruled lines, are the same. Components other than characters must therefore be dropped out while forms with different character and noise colors are mixed together. In this case the color to be dropped out cannot be specified in advance, so conventional method (1) cannot be applied.
(b) Colors that are difficult to separate in RGB
Conventional method (2) cannot be applied to forms containing colors that are difficult to separate using RGB information, such as brown and purple, or colors with high gray values.
(c) Multiple noise colors
A method that separates by RGB, such as conventional method (2), may fail to drop out noise when the noise components have multiple colors. For example, when both reddish and bluish noise components are present, only one of the two colors can be dropped out.
(d) High-density noise colors
Conventional methods (1), (2), and (3) all binarize based on pixel gray values. When a noise component is as dense as the character component, the two cannot be separated, so the noise component cannot be dropped out.
(e) Noise colors similar to the character color
When the character color and the noise color belong to the same color family, conventional methods (1) and (2), which rely on color differences, cannot drop out the noise component. Focusing on density makes dropout possible in some cases. With that approach, however, when noise of a color similar to the character color is mixed with noise of a different color but the same density as the characters, the latter cannot be dropped out.
(f) False color
Because of the characteristics of the scanner's image sensor, a color different from the original color (false color) may appear near the boundaries of characters, ruled lines, and the like. Conventional methods (1) and (2) perform dropout by targeting the color to be removed, so false-color portions, whose color differs from the targeted color, are not removed. In addition, when false color occurs inside a character, those pixels are dropped out.

In one example of the present invention, an image processing method that drops out pixels other than those constituting characters in a processed image includes a step of inputting the processed image, a step of color-clustering the pixels of the processed image in a first color space, and a step of identifying a background color and a character color based on the color clustering. In the identifying step, the background color can be identified, for example, from the pixels belonging to the cluster with the largest number of pixels or the cluster with the highest lightness. A cluster containing the character color is then selected from the clusters identified as not being background. For example, the character color can be identified from the pixels belonging to the cluster with the second largest number of pixels or the cluster with the lowest lightness.

Furthermore, each pixel of the processed image can be classified as a background candidate, a character candidate, a removal candidate, or the like, using not only the background color and character color identified above but also attributes of the pixel other than RGB (for example, hue and saturation).

Various color spaces such as RGB, HSV, HLS, HSB, CMY, and CMYK can be used as the first color space. When the hue and saturation of a pixel are used, the pixels of the image can be converted from the first color space to a second color space that includes dimensions other than RGB, such as hue and saturation (for example HSV, HLS, or HSB), before processing. Alternatively, the color space need not be converted; hue, saturation, and the like can be calculated from the values in the first color space each time they are needed.

According to one embodiment of the present invention, noise components can be dropped out automatically from forms whose character color and noise color are not fixed. Dropout is also possible when the noise is similar in color to the characters, when there are multiple noise colors, and when the noise is high in density.

Embodiments of the present invention will be described with reference to the drawings. The present invention is not limited by the following description.

FIG. 1 shows the configuration of a form recognition system according to an embodiment of the present invention.

The form recognition system includes an input device 10, an image input device 20, an image processing device 30, a dictionary 40, a display device 50, and an image database (DB) 60.

The input device 10 is a device such as a keyboard or mouse for inputting commands, code data, and the like to the image processing device 30.

The image input device 20 is a device such as a scanner for inputting a form to the image processing device 30 as image data.

The image processing device 30 is a computer that detects the reading region of a form input by the image input device 20 and performs dropout processing; it includes a CPU, a memory, and a storage device (not shown). The image processing device can also execute processing such as character recognition on the dropout image.

The dictionary 40 is a dictionary database that the image processing device 30 refers to when recognizing a form. Specifically, the dictionary 40 stores a character recognition dictionary referred to during character recognition, form information referred to when detecting the reading region of a form, and the like.

The display device 50 is a device such as a display that shows the result of form recognition by the image processing device 30.

The image DB 60 stores image data input to the image processing device 30 by the image input device 20. The image DB 60 may also store in advance image data that the image processing device 30 is to recognize.

The present invention may also be implemented on an ordinary computer by software that provides the same functions as the image processing device 30.

Before giving a specific example of the dropout method realized by the present invention, an outline of its processing result is described using the date stamp example shown in FIG. 2.
In FIG. 2(a), 2000 denotes the region to be processed, 2010 a stamp impression, 2020 a ruled line on the form, and 2030 noise. The dropout process generates the dropout image of FIG. 2(b) from the input image of FIG. 2(a). In the dropped-out image of FIG. 2(b), only the pixels (2040) of the stamp's color component remain in the region. Because FIG. 2 is a stamp example, the outline of the stamp has the same color as the characters, so the circular outline also remains.
To realize this processing, the present invention takes the following approach. First, the characters, the background, and the other colors are clustered using information in the RGB color space. Next, each color is judged to be the character color or not, and components of colors other than the character color are dropped out. This judgment uses information in the HSV color space as well as the RGB color space. Using HSV information makes it possible to separate even high-density noise components by hue.
Next, an outline of how the present invention addresses each problem is given. Details are described with reference to FIG. 3 and subsequent figures.
(a) Character and noise components come in various colors, and the combination differs from form to form
After the background color and the character color are estimated, binarization is performed with the character color as black and everything else as white. Because the character color is estimated for each document, dropout is possible even when the character color is not constant.
(b) Colors that are difficult to separate in RGB
To drop out colors such as brown and purple, which are hard to drop out by simple RGB separation, the colors are separated using not only the RGB color information but also the lightness (V) and hue (H) of the HSV color space.
(c) Multiple noise colors
Many conventional methods select a color to remove and drop it out, so some of them cope poorly when several colors must be removed. The present invention instead removes everything other than the character color. Even when several colors are to be dropped out, no per-color judgment is needed, and the processing does not depend on the number of colors to be removed.
(d) High-density noise colors
In conventional methods, high-density pixels tend to become black in binarization. Because the present invention focuses on the difference from the character color, high-density pixels can be dropped out by removing pixels whose hue differs from that of the character color.
(e) Noise colors similar to the character color
When the character color and the noise color belong to the same color family, dropout is achieved by treating the color with relatively lower lightness as the character color and the color with higher lightness as the noise color.
(f) False color
Because the character color is kept and all other colors are removed, false color rarely remains. When a pixel inside a character takes on a false color, it is judged to be character color or not by taking the colors of the surrounding pixels into account, which makes it possible to keep false-color pixels inside characters.

An embodiment of an image processing method and an image processing device to which the present invention is applied is described below.

FIG. 3 shows the processing flow of an image processing method to which the present invention is applied. The flow is executed by the image processing device (CPU) 30. It is usually realized as a program executed by the CPU; such a program can be stored on various recording media and is loaded into memory and executed by the CPU.

The processing region selection process (3000) selects, from the image, a region containing the characters to be read. It may select either a part of the image or the whole image.

The color clustering process (3010) color-clusters the pixels in the processing region. Color clustering maps each pixel of the image into a color space and then groups nearby colors into the same cluster according to a predetermined criterion.

FIG. 4 shows an example of color clustering. FIG. 4 illustrates mapping into the RGB color space, but other color spaces may be used. The RGB color space is a three-dimensional space whose axes are the R, G, and B values. The value on each axis usually ranges from 0 to 255. The origin (0, 0, 0) represents black, and (255, 255, 255) represents white. Each pixel in the processing region maps to one point in the RGB color space. Because the pixels of characters, ruled lines, and background never all have exactly the same color, each category forms a distribution centered on its representative color. In this way, the RGB color space allows clustering that focuses on red, green, and blue. In the example of FIG. 2, if the background is white, the characters (2010) are red, and the ruled lines (2020) are blue, then in FIG. 4, 4000 is the distribution of background pixels, 4010 is the distribution of character pixels, and 4020 is the distribution of ruled-line pixels. Other colors such as 4030 to 4050 may also exist as noise or false color. Color clustering partitions the space between clusters, each centered on the central color or the most frequent color of a distribution, using Voronoi partitioning, a threshold on Euclidean distance, or the like. For example, when three clusters are wanted, the pixels can be divided into three clusters centered on 4000, 4010, and 4020.
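The clustering step described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the helper names `pick_centers` and `assign_clusters`, the greedy seed selection, and the `min_gap` value are all assumptions made for the example. Seeds are taken from the most frequent colors, and each pixel is then assigned to its nearest seed by Euclidean distance, which corresponds to a Voronoi partition of RGB space.

```python
from collections import Counter
from math import dist


def pick_centers(pixels, k, min_gap=60.0):
    """Greedily pick up to k seed colors (hypothetical helper).

    Colors are visited from most to least frequent; a color becomes a seed
    only if it is farther than min_gap from every seed chosen so far.
    """
    centers = []
    for color, _count in Counter(pixels).most_common():
        if all(dist(color, c) > min_gap for c in centers):
            centers.append(color)
            if len(centers) == k:
                break
    return centers


def assign_clusters(pixels, centers):
    """Label each pixel with the index of its nearest center (Voronoi partition)."""
    return [
        min(range(len(centers)), key=lambda i: dist(p, centers[i]))
        for p in pixels
    ]


# Toy region: white background, red characters, blue ruled line, one off-white pixel.
pixels = (
    [(250, 250, 250)] * 6
    + [(200, 30, 30)] * 3
    + [(30, 30, 200)] * 2
    + [(245, 250, 250)]
)
centers = pick_centers(pixels, k=3)
labels = assign_clusters(pixels, centers)
```

In this toy region the off-white pixel falls into the same cluster as the white background because it is closer to that seed than to the red or blue seed.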

The background color selection process (3020) selects the background color from the result of color clustering. In the example of FIG. 4, cluster 4000 is selected, and then the color at the center of that cluster's distribution, or the color with the largest number of pixels (highest frequency), is taken as the background color. One criterion for selecting the cluster that contains the background color is to take the cluster with the largest number of pixels. The reason is that, within the processing region, the background generally occupies a larger area than the characters, ruled lines, or noise. Another criterion is to select the cluster with the highest lightness, because the paper color is lighter than the character color. The lightness of a cluster can be the average lightness of its pixels, or the lightness at the peak or center of its distribution. When lightness is the criterion, the HSV color space or the like may be used instead of the RGB color space. The HSV color space is described later with reference to FIG. 5.

The character color selection process (3030) selects the character color from the result of color clustering. In the example of FIG. 4, cluster 4010 is selected, and then the color at the center of that cluster's distribution, or the color with the largest number of pixels (highest frequency), is taken as the character color. One criterion for selecting the cluster that contains the character color is to take the cluster with the next largest number of pixels after the background cluster. As with the background color, the reason is the area occupied within the processing region. Alternatively, the cluster with the lowest lightness may be adopted.
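The pixel-count criterion for processes 3020 and 3030 can be sketched as below. The function names are hypothetical, and the sketch assumes that cluster labels have already been computed for each pixel; it uses only the "largest cluster is background, second largest is characters" rule and ignores the lightness-based alternative.

```python
from collections import Counter


def select_background_and_character(labels):
    """Return (background_label, character_label) ranked by pixel count."""
    ranked = [label for label, _count in Counter(labels).most_common()]
    if len(ranked) < 2:
        raise ValueError("need at least two clusters")
    return ranked[0], ranked[1]


def representative_color(pixels, labels, target):
    """Most frequent color among the pixels of the target cluster."""
    members = [p for p, lab in zip(pixels, labels) if lab == target]
    return Counter(members).most_common(1)[0][0]


# Toy data: cluster 0 is paper, cluster 1 is ink, cluster 2 is a ruled line.
pixels = (
    [(250, 250, 250)] * 5
    + [(240, 240, 240)]
    + [(200, 30, 30)] * 3
    + [(30, 30, 200)] * 2
)
labels = [0] * 6 + [1] * 3 + [2] * 2

bg_label, ch_label = select_background_and_character(labels)
bg_color = representative_color(pixels, labels, bg_label)
ch_color = representative_color(pixels, labels, ch_label)
```

Taking the most frequent color as the representative is one of the two options named in the text; the center of the distribution (for example the mean color) would be the other.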

When the background or character color is specified in advance, the cluster may be selected using the specified color information. For example, the background is often white or yellow and the characters are often black or blue, so this information may be used.

The processing regions used for color clustering in the background color selection process (3020) and the character color selection process (3030) may be the same or different. When they are the same, all pixels in the processing region specified in 3000 are simply used. When they differ, character color selection can, for example, be limited to a region where characters exist; this is usable when character positions can be determined in advance, as with characters on a form. Likewise, background color selection can be limited to a region where background is likely to exist.

The HSV conversion process (3040) converts the pixels to be processed from the RGB color space obtained from the input device to the HSV color space, for use in the pixel classification process (3050).
An outline of the HSV color space is given with reference to FIG. 5. The HSV color space is a model that expresses a color by hue (H), saturation (S), and lightness (V), and it can be visualized as a cone. Hue varies along the circumference of the cone. The vertical axis represents lightness and the horizontal axis represents saturation. RGB values can be converted to HSV values by mathematical formulas. The HSV color space is used in the present invention because color expressed in HSV is generally similar to the way humans perceive color in terms of hue and brightness. The conversion target is not limited to HSV; the HLS color space (hue (H), luminance (L), saturation (S)) or the HSB color space (hue (H), saturation (S), brightness (B)) may also be used. The CMY or CMYK color space, which are subtractive color representations used in printing, may be used as well.
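As a small illustration of the RGB-to-HSV conversion, the sketch below uses Python's standard `colorsys` module. The wrapper `rgb255_to_hsv` and the helper `hue_difference` are names introduced here for the example; the hue is scaled to degrees purely for readability, and the patent text does not prescribe any particular scaling.

```python
import colorsys


def rgb255_to_hsv(rgb):
    """Convert an (R, G, B) tuple with 0..255 channels to (H in degrees, S, V)."""
    r, g, b = (c / 255.0 for c in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)  # all three results are in 0..1
    return h * 360.0, s, v


def hue_difference(h1, h2):
    """Smallest angular difference between two hues in degrees (hue is circular)."""
    d = abs(h1 - h2) % 360.0
    return min(d, 360.0 - d)


red = rgb255_to_hsv((255, 0, 0))
blue = rgb255_to_hsv((0, 0, 255))
white = rgb255_to_hsv((255, 255, 255))
```

Comparing hues with a circular difference matters because reddish colors sit on both sides of the 0/360 degree boundary; a plain subtraction would treat 350 and 10 degrees as far apart.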

Using dimensions other than RGB in this way enables highly accurate pixel classification. In the above example the color space is converted in one batch before the next process, but it is also possible to skip the conversion and compute H, S, and so on from RGB each time they are needed.

The pixel classification process (3050) classifies each pixel in the region for the purpose of dropout.

An outline of this process is given with reference to FIG. 6. FIG. 6 is an enlarged view of the area around the digit "2" in FIG. 2. Each square represents a pixel. FIG. 6(a) is a part of the input image. It contains character pixels (such as 6000), background pixels (such as 6010), ruled-line pixels (such as 6020), and noise pixels. The noise includes noise that was present on the paper (such as 6030) and noise generated during scanning, such as false color (such as 6040). False color is a color different from the actual color, produced by the characteristics of the image sensor or the optical system. Because false color tends to occur where colors differ greatly, it often appears near the boundaries of ruled lines and characters. The image of FIG. 6(a) is input, and each pixel is classified as shown in FIG. 6(b). Pixels are first classified as character (such as 6050) or background (such as 6060). This determination can use the background color and character color selected in 3020 and 3030 of FIG. 3. A pixel that cannot be clearly assigned to either class is classified, according to its color and lightness, either as a character candidate (such as 6070), which mainly corresponds to false color, or as a removal candidate (such as 6080), which is a color that is neither character nor background and mainly corresponds to ruled lines and the like. Details of the pixel classification process are described later with reference to FIG. 7.

The grayscale image generation process (3060) generates a grayscale image in which the luminance value of each pixel is corrected so that, in the subsequent binarization process (3070), character pixels remain and other pixels are easily removed. For each pixel classified as character, background, character candidate, or removal candidate in the pixel classification process (3050), the correction changes the pixel's luminance value based on the classification results and lightness of the pixel and its surrounding pixels. The correction for each class is outlined below. Background pixels are converted to white (luminance value 255) so that they are reliably dropped out. The luminance values of character pixels are left unchanged. For character candidate and removal candidate pixels, a reliability is set from the classification results and luminance of the surrounding pixels. If the reliability is low, the pixel is converted to white; if it is at or above a first threshold and at or below a second threshold, the luminance value is raised (made brighter); if it is at or above the second threshold, the luminance value is left unchanged. Details of this process are described later with reference to FIGS. 8 to 11.
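The per-class luminance correction outlined above might look like the following sketch. Several things here are assumptions for illustration only: the reliability is taken as a precomputed number in 0..1 (how it is derived from neighboring pixels is described with later figures), the thresholds `t1` and `t2` are arbitrary, "raising the luminance" is implemented as moving halfway toward white, and a reliability exactly equal to `t2` is treated as falling in the middle band.

```python
def corrected_luminance(label, luminance, reliability=0.0, t1=0.3, t2=0.7):
    """Return the corrected luminance (0..255) for one classified pixel.

    label is one of "background", "character", "character_candidate",
    "removal_candidate". reliability, t1, and t2 are hypothetical inputs.
    """
    if label == "background":
        return 255                    # force to white so it always drops out
    if label == "character":
        return luminance              # keep character pixels as they are
    # Character candidates and removal candidates depend on reliability.
    if reliability < t1:
        return 255                    # low reliability: convert to white
    if reliability <= t2:
        return luminance + (255 - luminance) // 2  # middle band: brighten
    return luminance                  # high reliability: keep as is


samples = [
    corrected_luminance("background", 40),
    corrected_luminance("character", 40),
    corrected_luminance("character_candidate", 40, reliability=0.1),
    corrected_luminance("removal_candidate", 40, reliability=0.5),
    corrected_luminance("character_candidate", 40, reliability=0.9),
]
```

Brightening rather than deleting the middle band leaves the final keep-or-drop decision to the binarization threshold that follows.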

The binarization process (3070) binarizes the grayscale image generated by the grayscale image generation process (3060) into white and black. The result is a binary image in which character pixels remain and pixels of the background, ruled lines, noise, and the like are dropped out. Many binarization methods have been proposed, including methods that use a fixed threshold and methods that change the threshold dynamically. A representative method is Otsu's binarization method. The resulting binary image is used for processing such as character recognition.
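Otsu's method, named above as a representative binarization technique, chooses the threshold that maximizes the between-class variance of the luminance histogram. The following is a plain-Python sketch over a flat list of 0..255 luminance values; the function names and the convention that pixels at or below the threshold become black (0) are choices made for this example.

```python
def otsu_threshold(gray):
    """Return the threshold t (0..255) that maximizes between-class variance."""
    hist = [0] * 256
    for g in gray:
        hist[g] += 1
    total = len(gray)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels with luminance <= t
    sum0 = 0    # sum of luminance over those pixels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue
        if w1 == 0:
            break
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


def binarize(gray):
    """Map pixels at or below the Otsu threshold to black (0), others to white (255)."""
    t = otsu_threshold(gray)
    return [0 if g <= t else 255 for g in gray]


gray = [20] * 5 + [30] * 5 + [255] * 10   # dark strokes on a whitened background
threshold = otsu_threshold(gray)
binary = binarize(gray)
```

Because the preceding correction step has already pushed background and low-reliability pixels to 255, the histogram tends to be strongly bimodal, which is the situation Otsu's method handles well.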

The processing flow of the pixel classification (3050) in FIG. 3 is described in detail below with reference to FIG. 7.

First, in step 7000, the first pixel in the region is selected, and the following determinations are applied to it. Step 7010 determines whether the pixel is background. Because paper is usually white, the background color is assumed to be white, and if the color of the pixel is close to white, the pixel is determined to be background in step 7020. Instead of white, a color close to the background color selected in 3020 of FIG. 3 may also be treated as background. In this embodiment, the determination result is recorded in a flag. One way to decide whether a color is close to white (or to the background color) is to test whether the Euclidean distance in RGB space between white (255, 255, 255) and the pixel is at or below a reference. In the HSV color space, the pixel is judged white if its lightness is at or above a reference and its saturation is at or below a reference. If the condition of step 7010 is not satisfied, step 7030 is performed. Step 7030 identifies background pixels that are neither white nor the background color. The background is not necessarily white, but it is lighter than the characters, so if the lightness of the pixel is at or above a reference, the pixel is determined to be background in step 7020. As a measure of brightness, the luminance or lightness obtained from the RGB or HSV color space can be used. The processing up to this point identifies the background pixels.
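The two closeness tests mentioned for step 7010 can be sketched as follows. This is only an illustration: the function names and every threshold value (max_dist, min_v, max_s) are assumptions chosen for the example, since the text speaks only of "a reference".

```python
import colorsys
import math

WHITE = (255, 255, 255)


def is_near_white_rgb(rgb, max_dist=60.0):
    """RGB test: Euclidean distance to white is at or below a reference."""
    return math.dist(rgb, WHITE) <= max_dist


def is_near_white_hsv(rgb, min_v=0.85, max_s=0.15):
    """HSV test: lightness (V) at or above a reference and
    saturation (S) at or below a reference."""
    r, g, b = (c / 255.0 for c in rgb)
    _h, s, v = colorsys.rgb_to_hsv(r, g, b)
    return v >= min_v and s <= max_s
```

To test against a detected background color instead of white, the same distance test can be applied with that color in place of WHITE.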

Step 7040 determines whether the pixel is a character. If the color of the pixel is close to the character color obtained in step 3030 of FIG. 3, the pixel is determined to be a character in step 7050. One way to decide whether a color is close to the character color is to test whether the Euclidean distance in RGB space between the pixel color and the character color is at or below a reference.

If the condition of step 7040 is not satisfied, step 7060 branches on whether the character color obtained in step 3030 is achromatic (gray or black) or chromatic. The saturation (S) in the HSV color space can be used for this decision: if the saturation is low, the character color is close to black; if it is high, the character color is judged to be chromatic (colored).

If step 7060 finds that the character color is black, step 7070 determines whether the pixel is a character. If the saturation of the pixel is at or below a reference, the pixel is close to the low-saturation character color and is determined to be a character in step 7050. If the condition of step 7070 is not satisfied, the pixel is determined to be a character candidate in step 7080. Pixels determined to be character candidates undergo the luminance correction described later with reference to FIGS. 8 and 9.

If step 7060 finds that the character color is chromatic, step 7090 determines whether the pixel is a character: if the difference between the hue of the pixel and the hue of the character color is at or below a reference, the pixel is determined to be a character in step 7050. Judging by hue alone has the advantage of absorbing differences in density caused by faint or blurred print.

If the condition of step 7090 is not satisfied, the pixel has a color different from the characters and is regarded as noise such as a ruled line, so it is determined to be a removal candidate in step 7100. Pixels determined to be removal candidates undergo the luminance correction described later with reference to FIGS. 8 and 10.

After the pixel has passed through one of steps 7020, 7050, 7080, or 7100, step 7110 determines whether all pixels in the region have been processed. If not, the next pixel is selected in step 7120 and processing returns to step 7010. If all pixels have been processed, the pixel classification process ends.
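The decision tree of FIG. 7 can be summarized in a short sketch. It assumes that the background and character colors have already been chosen by the earlier clustering steps. The names Label, classify_pixel, and classify_region are hypothetical, and so are all threshold defaults and the use of a weighted luma as the brightness measure.

```python
import colorsys
import math
from enum import Enum


class Label(Enum):
    BACKGROUND = "background"
    CHARACTER = "character"
    CHAR_CANDIDATE = "character candidate"
    REMOVAL_CANDIDATE = "removal candidate"


def _hsv(rgb):
    return colorsys.rgb_to_hsv(*(c / 255.0 for c in rgb))


def _luma(rgb):
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def classify_pixel(rgb, char_rgb, bg_rgb=(255, 255, 255),
                   color_dist=60.0, bg_luma=200.0, sat_ref=0.25, hue_ref=0.08):
    # Steps 7010 and 7030: close to the background color, or simply bright.
    if math.dist(rgb, bg_rgb) <= color_dist or _luma(rgb) >= bg_luma:
        return Label.BACKGROUND
    # Step 7040: close to the character color.
    if math.dist(rgb, char_rgb) <= color_dist:
        return Label.CHARACTER
    h, s, _v = _hsv(rgb)
    char_h, char_s, _cv = _hsv(char_rgb)
    # Step 7060: is the character color achromatic or chromatic?
    if char_s <= sat_ref:
        # Step 7070: low saturation means close to a black or gray character.
        return Label.CHARACTER if s <= sat_ref else Label.CHAR_CANDIDATE
    # Step 7090: compare hues on the color circle (hue wraps around at 1.0).
    dh = abs(h - char_h)
    dh = min(dh, 1.0 - dh)
    return Label.CHARACTER if dh <= hue_ref else Label.REMOVAL_CANDIDATE


def classify_region(region, char_rgb):
    """Apply classify_pixel to every pixel of a 2-D list of RGB tuples."""
    return [[classify_pixel(p, char_rgb) for p in row] for row in region]
```

With a black character color, a faint gray pixel is kept as a character while a bluish pixel becomes a character candidate; with a blue character color, a faded blue is still a character because only the hue is compared, and a red ruled-line pixel becomes a removal candidate.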

In FIG. 7, the hue test of step 7090 looks only at the character color, because the goal is to drop out every color other than that of the characters. If the color of a noise component such as a ruled line can be detected, however, a comparison against the hue of that noise component may be added.

Also, FIG. 7 classifies the pixels into characters, background, character candidates, and removal candidates, but character candidates and removal candidates may be merged into a single class, in which case steps 7080 and 7100 become the same processing. The grayscale image generation described with reference to FIGS. 8 to 11 then treats character candidates and removal candidates identically.

The processing flow of the grayscale image generation process (3060) in FIG. 3 is described in detail below with reference to FIGS. 8 to 11.

FIG. 8 gives an overview of the grayscale image generation process. Because the classification produced in FIG. 7 is not necessarily correct, a reliability is calculated for the classification results and the grayscale image is generated using that reliability. First, in step 8000, a reliability is set for each pixel determined to be a character candidate; details are described later with reference to FIG. 9. Next, in step 8010, a reliability is set for each pixel determined to be a removal candidate; details are described later with reference to FIG. 10. Finally, in step 8020, the luminance values of the character-candidate and removal-candidate pixels are corrected according to their reliability to generate the grayscale image; details are described later with reference to FIG. 11.

FIG. 9 shows the processing flow, indicated as step 8000 in FIG. 8, that assigns a reliability to pixels determined to be character candidates. Its purpose is to remove or retain, as appropriate, the false colors that arise mainly near boundaries. The process checks whether there are character pixels around a character-candidate pixel and, if there are, raises the reliability of the candidate according to their number and luminance. The surrounding pixels are examined because a candidate with many character pixels around it is likely to be part of a character whose color was altered by false color or a similar effect; otherwise it can be judged to be a noise component whose color is close to the character color. Character-candidate pixels that receive a high reliability in FIG. 9 are more likely to be kept as character pixels by the binarization process (3070) of FIG. 3.

In FIG. 9, the first pixel in the region is selected in step 9000, and the following determinations are applied to it. Step 9010 determines whether the pixel is a character candidate. If it is, step 9020 determines whether any pixel classified as a character lies around it. The surroundings may be either the 8-neighborhood or the 4-neighborhood centered on the pixel. If character pixels are present, the reliability is set (step 9030) from the number of surrounding character pixels and from the luminance of the pixel and of those character pixels. One example of a reliability is the number of surrounding character pixels multiplied by a constant. Another is the reciprocal of the luminance difference between the pixel and the character pixels, multiplied by a constant. The luminance difference is used because the smaller the difference from a character pixel, the more likely the pixel is part of a character.

If the condition of step 9020 is not satisfied, or after step 9030, step 9040 determines whether all pixels in the region have been processed. If not, the next pixel is selected in step 9050 and processing returns to step 9010. If all pixels have been processed, the reliability setting process for character-candidate pixels ends.
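The loop of FIG. 9 might be sketched as below, using the first example given in the text (the number of surrounding character pixels times a constant) over the 8-neighborhood. The string labels "char", "bg", "char_cand", and "rem_cand", the function name, and the constant k are assumptions made for this illustration.

```python
NEIGHBORS_8 = [(-1, -1), (-1, 0), (-1, 1),
               (0, -1),           (0, 1),
               (1, -1),  (1, 0),  (1, 1)]


def set_char_candidate_reliability(labels, k=1.0):
    """Return a 2-D list of reliabilities for character-candidate pixels.

    labels is a 2-D list of class names. Pixels of other classes, and
    candidates with no character neighbor, keep a reliability of 0.
    """
    height, width = len(labels), len(labels[0])
    reliability = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if labels[y][x] != "char_cand":        # step 9010
                continue
            n_char = sum(                          # step 9020
                1 for dy, dx in NEIGHBORS_8
                if 0 <= y + dy < height and 0 <= x + dx < width
                and labels[y + dy][x + dx] == "char"
            )
            reliability[y][x] = k * n_char         # step 9030
    return reliability
```

The luminance-based variant mentioned in the text would replace the count with a constant divided by the luminance difference to the neighboring character pixels; it is omitted here because the text does not say how a zero difference is handled.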

FIG. 10 shows the processing flow, indicated as step 8010 in FIG. 8, that assigns a reliability to pixels determined to be removal candidates. Its purpose is to remove pixels of noise components, such as ruled lines, whose color is neither the character color nor the background color. The process checks whether there are character pixels or high-reliability character-candidate pixels around a removal-candidate pixel and, if there are, raises the reliability of the removal candidate according to their number and luminance. The surrounding pixels are examined because a removal candidate with many character pixels around it is likely to lie where a noise component such as a ruled line crosses a character. Character-candidate pixels that received a high reliability in FIG. 9 are included because, being adjacent to characters, they are likely to end up as characters.

In FIG. 10, the first pixel in the region is selected in step 10000, and the following determinations are applied to it. Step 10010 determines whether the pixel is a removal candidate. If it is, step 10020 determines whether any pixel classified as a character lies around it. If character pixels are present, the reliability is set (step 10030) from the number of surrounding character pixels and from the luminance of the pixel and of those character pixels. The reliability may be set in the same way as in step 9030 of FIG. 9 or in a different way. If the condition of step 10020 is not satisfied, the process checks (step 10040) whether any character candidate that received a high reliability in step 8000 of FIG. 8 lies around the pixel. If this condition is satisfied, processing proceeds to step 10030.

If the condition of step 10040 is not satisfied, or after step 10030, step 10050 determines whether all pixels in the region have been processed. If not, the next pixel is selected in step 10060 and processing returns to step 10010. If all pixels have been processed, the reliability setting process for removal-candidate pixels ends.
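A corresponding sketch for FIG. 10 follows. It takes the reliabilities already computed for character candidates as input. The label strings, the function name, the constant k, and the cutoff high that defines a "high-reliability" character candidate are assumptions. So is the choice to count high-reliability candidates in the fallback branch, which the text leaves open.

```python
NEIGHBORS_8 = [(-1, -1), (-1, 0), (-1, 1),
               (0, -1),           (0, 1),
               (1, -1),  (1, 0),  (1, 1)]


def set_removal_candidate_reliability(labels, char_cand_rel, high=2.0, k=1.0):
    """Return a 2-D list of reliabilities for removal-candidate pixels."""
    height, width = len(labels), len(labels[0])

    def neighbors(y, x):
        for dy, dx in NEIGHBORS_8:
            ny, nx = y + dy, x + dx
            if 0 <= ny < height and 0 <= nx < width:
                yield ny, nx

    reliability = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if labels[y][x] != "rem_cand":                     # step 10010
                continue
            n = sum(1 for ny, nx in neighbors(y, x)            # step 10020
                    if labels[ny][nx] == "char")
            if n == 0:
                # Step 10040: fall back to high-reliability character candidates.
                n = sum(1 for ny, nx in neighbors(y, x)
                        if labels[ny][nx] == "char_cand"
                        and char_cand_rel[ny][nx] >= high)
            reliability[y][x] = k * n                          # step 10030
    return reliability
```

A removal candidate with neither kind of neighbor keeps a reliability of 0 and is later turned white.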

FIG. 11 shows the flow, indicated as step 8020 in FIG. 8, of grayscale image generation based on the classification results and the reliability. The process corrects the luminance value of each pixel according to its class and reliability to generate the grayscale image: background pixels are set to white, and pixels that are not characters have their luminance raised. The aim is for the subsequent binarization to produce a binary image in which only the characters remain.

First, in step 11000, the first pixel in the region is selected, and the following determinations are applied to it. Step 11010 determines whether the pixel is a character. For a character pixel, the luminance value of the pixel is set unchanged in step 11010. If the condition of step 11010 is not satisfied, step 11020 determines whether the pixel is background. For a background pixel, the white luminance value (the maximum luminance value, 255) is set so that the pixel is reliably dropped out.

Pixels that do not satisfy the condition of step 11020 are character candidates or removal candidates. For these, step 11040 determines whether the reliability is 0. A reliability of 0 means that there are no character pixels around the pixel, so the pixel is judged to be noise and the white luminance value is set in step 11030. If the reliability is greater than 0, step 11050 compares it with a predetermined reference value. If the reliability is at or below the reference value, a raised luminance value is set in step 11060. Such a pixel is unlikely to be a character, but the possibility is not zero, so its luminance is raised to make it easier to drop out in binarization. Even with the raised value, the pixel may or may not be dropped out, depending on its own luminance and the surrounding luminance; the final decision is made by the subsequent binarization process (3070). One example of the calculation is to add a constant multiple of the reliability to the luminance value.

If the reliability exceeds the reference value in step 11050, the luminance value is set unchanged at 11010. Such a pixel has many adjacent character pixels and is therefore treated in the same way as a character pixel.
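The per-pixel rules of FIG. 11 can be collected into one function. In this sketch the label strings, the function name, the reference value ref, and the constant k are assumptions; the brightening formula follows the example given in the text (luminance plus a constant multiple of the reliability), clipped to 255.

```python
def generate_grayscale(luminance, labels, reliability, ref=2.0, k=20.0):
    """Return the corrected grayscale image as a 2-D list of 0-255 values."""
    out = []
    for y, row in enumerate(luminance):
        out_row = []
        for x, v in enumerate(row):
            label = labels[y][x]
            r = reliability[y][x]
            if label == "char":
                new = v                           # step 11010: keep as is
            elif label == "bg":
                new = 255                         # step 11030: white
            elif r == 0:
                new = 255                         # step 11040: isolated noise
            elif r <= ref:
                new = min(255, round(v + k * r))  # step 11060: brighten
            else:
                new = v                           # treated like a character
            out_row.append(new)
        out.append(out_row)
    return out
```

For example, a candidate with luminance 80 and reliability 1.0 becomes 100 with the default constants, whereas a candidate with reliability 5.0 keeps its value of 80.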

After the pixel has passed through one of steps 11010, 11030, or 11060, step 11070 determines whether all pixels in the region have been processed. If not, the next pixel is selected in step 11080 and processing returns to step 11010. If all pixels have been processed, the grayscale image generation process based on the classification results and the reliability ends.

The processing of FIGS. 9 to 11 is illustrated using the example of FIG. 6. Consider first the character-candidate pixels. Pixels such as 6090 and 6100, which have many character pixels around them, are regarded as characters and binarized to black, as shown in FIG. 6(c). Pixel 6110 has no character pixels around it and therefore becomes white in FIG. 6(c). Pixel 6070 has character pixels around it, but its luminance was high, so it becomes white in FIG. 6(c). Consider next the removal-candidate pixels. Pixel 6080 is a ruled-line pixel; its luminance or hue differs from the character color, so it becomes white in FIG. 6(c). Pixel 6030 has no character pixels around it and becomes white in FIG. 6(c). Pixels 6120 and 6130, on the other hand, are removal candidates but have many character candidates around them, so they become black in FIG. 6(c). In this way, even when other colors such as false colors and ruled lines are present, only pixels of the character color are retained and everything else is dropped out.

Next, a second embodiment of the image processing method and image processing apparatus to which the present invention is applied is described.

FIG. 12 is another diagram showing a flow of the dropout processing of the present invention. Processes bearing the same numbers as in FIG. 3 are the same as in FIG. 3. In this example, image correction 1 (12000) and image correction 2 (12010) are added before and after the binarization process (3070) of FIG. 3, respectively. Only one of the two may be performed.

Image correction 1 (12000) is image correction that operates on the grayscale image. One example is skew correction. The advantage of correcting skew at this point is that quantization errors are less likely to arise after correction in a grayscale image than in a binarized image. Specifically, when an image containing oblique lines or curves is corrected, jagged edges are less likely to appear along the lines. Bilinear or bicubic interpolation can be used for the skew correction. Skew correction requires that the skew be detected; this may be done within image correction 1 (12000) or obtained from other processing. Another example of processing performed in image correction 1 is noise removal, for example by smoothing based on the densities of adjacent pixels.

Image correction 2 (12010) is image correction that operates on the binary image. One example is skew correction. The advantage of correcting skew at this point is that the processing time is shorter than for a grayscale image. The skew detection needed for the correction may be done within image correction 2 (12010) or obtained from other processing. Because skew correction can be performed in either image correction 1 (12000) or image correction 2 (12010), it normally needs to be done in only one of them. Another example of processing performed in image correction 2 is noise removal, for example isolated-point removal.
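Isolated-point removal on the binary image, mentioned as an example for image correction 2, could look like the sketch below. The function name, the 0 = black and 255 = white convention, and the definition of "isolated" as a black pixel with no black 8-neighbor are assumptions for this illustration.

```python
def remove_isolated_points(binary):
    """Turn black pixels that have no black 8-neighbor white.

    binary is a 2-D list with 0 for black and 255 for white. The input is
    not modified; neighbors are always read from the original image.
    """
    height, width = len(binary), len(binary[0])
    out = [row[:] for row in binary]
    for y in range(height):
        for x in range(width):
            if binary[y][x] != 0:
                continue
            has_black_neighbor = any(
                binary[ny][nx] == 0
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (ny, nx) != (y, x)
            )
            if not has_black_neighbor:
                out[y][x] = 255
    return out
```

Reading from the original image and writing to a copy keeps the result independent of the scan order.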

Next, a third embodiment of the image processing method and image processing apparatus to which the present invention is applied is described.

FIG. 13 shows a processing flow that reads the characters on a form using the dropout processing of the present invention. The target color image is input in image input (13000), and a region containing the characters to be read is detected in reading area selection (13010). The region to be read can be found, for example, by detecting frames on the form or from predetermined coordinates. Next, dropout processing (13020) is applied to the region to be read; the method of FIG. 3 or FIG. 12 of the present invention can be used for this. A character region is then detected in the dropped-out image (13030), for example by a technique such as line extraction. Finally, character recognition (13040) is performed and the recognition result is output (13050).

Next, a fourth embodiment of the image processing method and image processing apparatus to which the present invention is applied is described.

FIG. 14 shows a processing flow that recognizes the date in a receipt stamp using the dropout processing of the present invention. Steps in FIG. 14 that bear the same numbers as in FIG. 13 are the same processing as in FIG. 13. The target color image is input in image input (13000), and the region of the receipt stamp is detected in receipt stamp detection (14000). Patent Document 3 gives one example of receipt stamp region detection. Next, dropout processing (14010) is applied to the receipt stamp region; the method of FIG. 3 or FIG. 12 of the present invention can be used for this, and skew correction is assumed to be performed as part of it. The date region is then detected in the dropped-out receipt stamp image (14020), for example by selecting the row of black-pixel blobs located near the vertical center. The date region is cut out of the dropped-out image and passed to date recognition. Date recognition (14030) recognizes the characters using OCR. In addition, various date notation formats are stored in advance as knowledge and matched against the character recognition result, so that the result is corrected and output as a date free of contradictions. If the character recognition result is consistent as a date, it is output (13050) and processing ends. If it is not, the date image is rotated 180 degrees (14050) and recognized again.
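The consistency check and the 180-degree retry can be sketched as follows. This is a simplified illustration under stated assumptions: the ocr argument is a hypothetical callable that maps an image (a 2-D list) to a string, only one date format is stored, and the sketch validates the OCR result without attempting the knowledge-based correction described in the text.

```python
import re
from datetime import date

# Stored date notations (assumption: a single "YYYY.MM.DD"-style pattern).
DATE_PATTERNS = [re.compile(r"^(\d{4})[./-](\d{1,2})[./-](\d{1,2})$")]


def rotate_180(image):
    """Rotate a 2-D list by 180 degrees."""
    return [row[::-1] for row in image[::-1]]


def parse_date(text):
    """Return a date if the text matches a stored notation and is a valid
    calendar date; otherwise return None."""
    for pattern in DATE_PATTERNS:
        match = pattern.match(text.strip())
        if match:
            try:
                return date(*map(int, match.groups()))
            except ValueError:
                return None
    return None


def recognize_date(image, ocr):
    """Run OCR; if the result is not a consistent date, rotate the image
    by 180 degrees and try once more."""
    result = parse_date(ocr(image))
    if result is None:
        result = parse_date(ocr(rotate_180(image)))
    return result
```

The retry covers stamps that were applied upside down, which is the case the 180-degree rotation step addresses.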

Configuration diagram of an image processing apparatus according to an embodiment of the present invention.
Flowchart outlining the dropout processing of the embodiment of the present invention.
Diagram showing an example of the dropout processing of the embodiment of the present invention.
Diagram showing an example of the color clustering processing of the embodiment of the present invention.
Diagram showing the HSV color space.
Diagram showing an example of the pixel classification and binarization of the embodiment of the present invention.
Flowchart outlining the pixel classification processing of the embodiment of the present invention.
Flowchart outlining the luminance correction processing of the embodiment of the present invention.
Flowchart outlining the reliability setting processing for character candidates in the luminance correction processing of the embodiment of the present invention.
Flowchart outlining the reliability setting processing for removal candidates in the luminance correction processing of the embodiment of the present invention.
Flowchart outlining the grayscale image generation processing, based on the classification results and reliability, in the luminance correction processing of the embodiment of the present invention.
Diagram showing another example of the dropout processing of the embodiment of the present invention.
Processing flow of form reading using the dropout processing of the present invention.
Processing flow of stamp date reading using the dropout processing of the present invention.

Claims

An image processing apparatus that drops out some of the pixels in an input image, comprising:
means for color-clustering the pixels in the image and selecting a background color and, from the colors other than the background color, a character color; and
means for classifying the pixels in a processing region subject to dropout,
wherein the luminance value of each pixel is corrected based on the classification result and on information about the surroundings of the pixel to generate a grayscale image,
a binary image is generated by binarizing the grayscale image,
the pixels of the image are classified into characters, background, character candidates, and removal candidates,
a pixel is determined to be background if its color is close to the background color or to white, or if its lightness is at or above a reference,
a pixel is determined to be a character if its color is close to the character color, and
for a pixel not covered by the above determinations,
when the character color is close to black, the pixel is determined to be a character if its saturation is less than a first reference and a character candidate if its saturation is greater than or equal to the first reference, and
when the character color is chromatic, the pixel is determined to be a character if the difference between its hue and the hue of the character color is less than a second reference and a removal candidate if the difference is greater than or equal to the second reference.

The image processing apparatus according to claim 1, wherein
for character-candidate and removal-candidate pixels, the reliability is raised if character pixels or character-candidate pixels are present around the pixel,
for character pixels, the luminance value of the input pixel is set,
for background pixels, the maximum luminance value is set,
for character-candidate and removal-candidate pixels, the pixel is determined to be background and the maximum luminance value is set if the reliability is less than a third reference, a luminance value higher than that of the input pixel is set if the reliability is greater than the third reference and less than a fourth reference, and the luminance value of the input pixel is set if the reliability is higher than the fourth reference, and
a grayscale image is generated using the luminance values thus set.