JP2000048120A - Method for extracting character area of gray level image and recording medium having recorded the program thereon


Info

Publication number
JP2000048120A
JP2000048120A (application JP10212258A / JP21225898A)
Authority
JP
Japan
Prior art keywords
character
area
image
partial
variance
Prior art date
Legal status
Granted
Application number
JP10212258A
Other languages
Japanese (ja)
Other versions
JP3416058B2 (en)
Inventor
Norifumi Katabuchi
Shuichi Ohara
Koichi Tanaka
Masashi Okudaira
Current Assignee
Nippon Telegraph and Telephone Corp
Original Assignee
Nippon Telegraph and Telephone Corp
Priority date
Filing date
Publication date
Application filed by Nippon Telegraph and Telephone Corp
Priority to JP21225898A
Publication of JP2000048120A
Application granted
Publication of JP3416058B2
Legal status: Expired - Fee Related


Landscapes

  • Facsimile Image Signal Circuits (AREA)
  • Image Analysis (AREA)
  • Character Input (AREA)

Abstract

PROBLEM TO BE SOLVED: To improve the precision of character reading and recognition by extracting and identifying, from the image to be processed, only the parts containing characters, so that image processing and character matching are performed only on those partial areas. SOLUTION: A variance quantity is computed from the density values of each divided partial area, and the area is judged to be a character block or a background block (S201). Each partial area is also rotated, a variance value is computed from the distribution of the projection values for each direction, and the area is again judged to be a character block or a background block (S202). From these two judgment results it is comprehensively judged whether the partial area is a character block or a background block, and the character areas are finalized (S203). Processing is then limited to the partial areas identified as containing characters, and only the character areas are extracted; alternatively, the character boundary shapes where the density changes steeply are extracted (S204). This enables character reading and recognition that is little affected by the density unevenness and noise patterns present in the background area of the original image.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]
[Technical Field of the Invention] The present invention is positioned as one of the character recognition techniques based on image processing. It relates to a method for extracting the character areas of a grayscale image, which separates character areas from background areas by focusing on the density distribution of each divided partial area and its directionality and extracts only the areas where characters exist, and to a recording medium on which a program for the method is recorded.

[0002] The present invention can be used as preprocessing (for example, automatic mask generation) for the recognition and verification of characters stamped on metal parts or castings whose surfaces are roughly finished.

[0003]
[Prior Art] In the manufacturing process of industrial products, reading and recognizing characters is an indispensable technology for automating product identification and classification. Conventional techniques for reading characters in grayscale images are described below.

[0004] A typical method is template matching: reference template images, one for each character type, are prepared in advance, and recognition is performed by identifying the reference image that best matches the input among them. This method achieves high accuracy when the character image to be processed is clear, but image differences caused by noise in the background prevent a match with the template image, so the recognition accuracy of the characters drops.

[0005] To solve this problem, methods that extract the character boundaries, where the density changes steeply, have conventionally been used, either by removing background noise through binarization or by applying various edge extraction techniques.

[0006]
[Problems to be Solved by the Invention] However, in the former binarization approach, it is difficult to determine a threshold for binarization unless there is sufficient contrast between the character portion and the background portion. The latter edge extraction approach has the problem that its differentiation produces images resembling character boundaries wherever the density changes steeply, even in background portions away from actual character boundaries. With either method, noise images other than the characters remain, and this background noise falls within the region compared against the template image during matching; this is the main cause of the reduced accuracy of the subsequent recognition and verification processing.

[0007] FIG. 8 shows examples of character image patterns of an object under inspection. FIG. 8(a) is an ideal character pattern, but actually acquired images, as in FIG. 8(b) and FIG. 8(c), suffer from density unevenness and background noise caused by subtle irregularities of the part surface and by the lighting conditions. Depending on the surface finish of the part and the stamping position, even line segments of the same character may appear partly brighter and partly darker than the background within one image, which makes it difficult to extract the character areas on the basis of density changes relative to the background.

[0008] Appearance inspection of industrial products requires highly accurate recognition of stamped characters even under such conditions, but the density distribution and outline of the character image tend to be unstable, and obtaining a clear image is not easy because of the influence of the noise patterns in the background.

[0009] In view of the above problems, an object of the present invention is to improve the accuracy of character reading and recognition by identifying and extracting, from a grayscale image containing characters, only the partial areas that contain the line segments constituting the characters, thereby improving the separation accuracy; partial areas containing no characters are removed in advance, and various kinds of image processing and character matching are then applied only to the limited partial areas that contain characters.

[0010]
[Means for Solving the Problems] To achieve the above object, the present invention divides the grayscale image to be processed into small areas of a fixed size and applies, to each partial area: a first judgment process that identifies character candidate partial areas by means of a variance quantity computed from the density values of the partial area; a second judgment process that identifies character candidate partial areas by means of the rotation angle dependence of the projection value distribution, which expresses the directionality of the density change, computed for each rotation direction of the partial area; and a total judgment process that finalizes the character areas from both the judgment result based on the first judgment process and the judgment result based on the second judgment process.

[0011] One feature of the present invention is that it focuses on the pictorial property that characters are composed of line segments and extracts the directionality of line segments in the image through the projection value calculation described above; in this respect, the way character areas are separated and extracted from the image to be processed differs greatly from that of the prior art.

[0012]
[Embodiments of the Invention] Embodiments of the present invention are described in detail below with reference to the drawings, taking the extraction of character areas as an example.

[0013] FIG. 1 shows one embodiment of a process that uses the character area extraction of a grayscale image according to the present invention to extract, from a grayscale image of characters stamped on an industrial part, only the areas where character line segments exist. This embodiment has first to fourth processing stages. In the following, a partial area containing a character line is called a character block, and a partial area regarded as uniform, ignoring the background and density unevenness, is called a background block.

[0014] In the first processing stage 201, the aforementioned variance quantity is computed from the density values of each divided partial area, and the partial area is judged to be a character block or a background block by comparing that value with 0.

[0015] In the second processing stage 202, each partial area is rotated, the aforementioned variance value is computed from the projection value distribution for each direction, and the partial area is judged to be a character block or a background block according to the magnitude of that value.

[0016] In the third processing stage 203, the logical AND or logical OR of the judgment result based on the first judgment process and the judgment result based on the second judgment process is taken, whether each partial area is a character area or a background area is judged comprehensively, and the character areas are finalized.

[0017] In the fourth processing stage 204, processing is limited to the partial areas identified as containing characters in the image to be processed, and only the character areas are extracted, for example by a binarization such as the discriminant analysis method. Alternatively, a so-called edge extraction technique is used to extract the character boundary shapes where the density changes steeply.

[0018] FIG. 2 shows the detailed processing procedure of this embodiment: 301 is the area division process; 302 is the variance quantity calculation and first block judgment process; 303 is the rotated image generation process; 304 is the projection distribution calculation and variance data calculation process; 305 is the second block judgment process; 306 is the total character area / background area judgment process; and 307 is the edge extraction process. Each process of the above embodiment is described concretely below.

[0019] First, the input grayscale image G(i, j) is divided, in the area division process 301, into partial areas of a predetermined size (for example, 16 × 16 pixels). FIG. 3 illustrates division modes. In the following description, the partial area that is m-th in the horizontal direction and n-th in the vertical direction of the image is denoted (m, n). The division need not be the equal division shown in FIG. 3(a); as shown in FIG. 3(b) and FIG. 3(c), the areas may overlap one another or be separated from one another.
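The equal-division case of FIG. 3(a) can be sketched in a few lines of plain Python; the function name and the dict-of-blocks layout are illustrative, not from the patent, and the image is assumed to be a 2D list of density values:

```python
def divide_into_blocks(image, block_h=16, block_w=16):
    """Divide a grayscale image (2D list of density values) into
    non-overlapping partial areas of block_h x block_w pixels.
    Returns a dict mapping the area number (m, n) to its sub-image,
    where m counts horizontally and n vertically, as in the patent.
    Any ragged remainder at the right/bottom edges is ignored here."""
    rows, cols = len(image), len(image[0])
    blocks = {}
    for n in range(rows // block_h):           # vertical area index n
        for m in range(cols // block_w):       # horizontal area index m
            blocks[(m, n)] = [row[m * block_w:(m + 1) * block_w]
                              for row in image[n * block_h:(n + 1) * block_h]]
    return blocks
```

Overlapping or separated divisions, as in FIG. 3(b) and 3(c), would only change the stride used in the two loops.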

[0020] In the variance quantity calculation and first block judgment process 302, the mean μ_mn and the standard deviation σ_mn of the density values of the pixels in each divided partial area (m, n) are computed, and a background separation line that discriminates character blocks from background blocks is set from the mean and the standard deviation. Using this separation line, it is judged whether the partial area belongs to the character blocks or the background blocks. That is, with c_1 denoting the slope of the separation line and c_2 its intercept, the variance quantity v_mn is given by v_mn = σ_mn − c_1 μ_mn − c_2, and the judgment rule is as follows.

[0021]
  • If v_mn ≥ 0, the partial area (m, n) is a character block.
  • If v_mn < 0, the partial area (m, n) is a background block.
FIG. 4 shows typical point sequence data obtained by plotting the mean density of each partial area on the horizontal axis and the standard deviation of the density on the vertical axis; in such a case, the character blocks and the background blocks can be discriminated by the background separation line shown in the figure.

[0022] The rotated image generation process 303 creates, for each partial area, images rotated by a predetermined angle step (for example, 10 degrees) using an affine transformation. This is preprocessing for obtaining the projection distributions, and for that purpose the 30-degree and 210-degree rotation directions have the same orientation; with a 10-degree step, for example, it therefore suffices to generate a total of 18 images, from the 0-degree direction up to the 170-degree direction.
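A minimal stand-in for the affine rotation, using inverse mapping with nearest-neighbour sampling; the patent does not specify the interpolation or the out-of-range fill policy, so the mean-density fill and the function name here are assumptions:

```python
import math

def rotate_block(block, angle_deg):
    """Rotate a square partial area by angle_deg about its centre
    (a simple stand-in for the affine transformation of process 303).
    Each output pixel is looked up at its inversely rotated source
    position; positions falling outside the block are filled with the
    block's mean density."""
    n = len(block)
    c = (n - 1) / 2.0
    fill = sum(sum(row) for row in block) / (n * n)
    th = math.radians(angle_deg)
    cos_t, sin_t = math.cos(th), math.sin(th)
    out = [[fill] * n for _ in range(n)]
    for y in range(n):
        for x in range(n):
            # inverse rotation: where did this output pixel come from?
            xs = cos_t * (x - c) + sin_t * (y - c) + c
            ys = -sin_t * (x - c) + cos_t * (y - c) + c
            xi, yi = round(xs), round(ys)
            if 0 <= xi < n and 0 <= yi < n:
                out[y][x] = block[yi][xi]
    return out

# Only 0..170 degrees are needed: adding 180 degrees repeats the orientation.
ANGLES = range(0, 180, 10)
```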

[0023] In the projection distribution calculation and variance data calculation process 304, in each partial area and for each rotation direction, the vertical projection value of each pixel column (the average value obtained by summing the density values of the pixels on each scanning line) is computed, and the variance of the projection value data in that direction is computed. That is, when the partial area is P pixels wide by Q pixels high and 180 degrees is divided into D directions, the variance d_k (k = 0, …, D−1) of the projection value data in rotation direction k is given by the following equation.

[0024]
(Equation 1)

    d_k = (1/P) Σ_{i=0}^{P−1} (p_i − p̄)²,   where p̄ = (1/P) Σ_{i=0}^{P−1} p_i

[0025] Here, p_i (i = 0, …, P−1) denotes the projection value of pixel column i. In the second block judgment process 305, the variance value var_1, or the difference variance var_2 (the difference between the maximum and the minimum of the variances), is computed from the variance data d_k (k = 0, …, D−1), and the presence or absence of character line directionality in each partial area is judged from the magnitude of that value. In other words, by comparing var_1 or var_2 with a predetermined threshold, it is judged whether the partial area is a character block containing a character to be detected or a background block. Here, the variance value var_1 and the difference variance var_2 are given by the following equations, respectively.

[0026]
(Equation 2)

    var_1 = (1/D) Σ_{k=0}^{D−1} (d_k − d̄)²,   where d̄ = (1/D) Σ_{k=0}^{D−1} d_k
    var_2 = max_k d_k − min_k d_k
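Equations 1 and 2 translate directly into code; in the sketch below, `projection_variance` computes d_k for one already-rotated block and `second_judgment_statistics` computes var_1 and var_2 from the list of per-direction variances (function names are illustrative, not from the patent):

```python
import statistics

def projection_variance(block):
    """d_k of Equation 1: variance of the vertical projection values,
    where p_i is the mean density of pixel column i."""
    p = [statistics.fmean(col) for col in zip(*block)]  # p_0 .. p_{P-1}
    return statistics.pvariance(p)

def second_judgment_statistics(d):
    """var_1 and var_2 of Equation 2, computed from the per-direction
    variance data d = [d_0, ..., d_{D-1}]."""
    var1 = statistics.pvariance(d)   # variance of the variances
    var2 = max(d) - min(d)           # difference variance
    return var1, var2
```

A block whose strokes line up with one rotation direction gives a large d_k at that angle and small values elsewhere, so both var_1 and var_2 become large; a uniform or isotropic block keeps both near zero.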

[0027] FIG. 5 conceptually illustrates the flow of the above processes 304 and 305 with D = 18. From the grayscale image shown at the upper right of the figure, a partial area original image as illustrated is extracted. The extracted partial area original image is then rotated in 10-degree steps (0 degrees, 10 degrees, …, 90 degrees, …, 160 degrees, 170 degrees), each rotated image is projected in the vertical direction to obtain a projection distribution, and the variance of the projection values d_0, d_1, …, d_9, …, d_16, d_17 is obtained for each projection distribution. Needless to say, these variances d_0, d_1, … are the variance data d_k (k = 0, …, D−1) described above.

[0028] The total character area / background area judgment process 306 takes the logical AND or the logical OR of the judgment result of the first block judgment process and the judgment result of the second block judgment process, and finally separates all partial areas into character block areas and background block areas.
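The AND/OR combination is a one-liner over per-block boolean maps; the dict layout (keyed by area number (m, n)) and the `mode` parameter are assumptions for illustration:

```python
def total_judgment(first_result, second_result, mode="and"):
    """Total judgment process (306): combine the per-block boolean maps
    of the first and second judgments by logical AND (strict) or
    logical OR (permissive). True means character block."""
    if mode == "and":
        return {k: first_result[k] and second_result[k] for k in first_result}
    return {k: first_result[k] or second_result[k] for k in first_result}
```

AND suppresses more background false positives; OR keeps weakly detected character blocks, a trade-off the patent leaves to the application.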

[0029] The final edge extraction process 307 is performed only on the partial areas determined to be character block areas by the total judgment process. Here, for example, the representative Sobel operator is applied to the original image to compute the differential intensity of the pixel density values, which is then thresholded to extract the edges forming the character boundaries. Other edge extraction techniques, such as the Prewitt operator or Canny's method of computing the zero-crossing points of the second derivative of the density profile, may also be used.
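A minimal pure-Python sketch of the Sobel step; the |Gx| + |Gy| magnitude approximation and the function name are assumptions (the patent only names the operator), and in the patent's flow this would be run only inside the blocks judged to be character blocks:

```python
def sobel_edges(image, threshold):
    """Edge extraction process (307) sketched with the Sobel operator:
    the gradient magnitude |Gx| + |Gy| is thresholded into a binary
    edge map. Border pixels are left at 0."""
    KX = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal gradient
    KY = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical gradient
    h, w = len(image), len(image[0])
    edges = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(KX[j][i] * image[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(KY[j][i] * image[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            edges[y][x] = 1 if abs(gx) + abs(gy) >= threshold else 0
    return edges
```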

[0030] An example of processing obtained by applying the present invention is shown below. FIG. 6(a) is an original image (500 × 120 pixels) of characters stamped on a casting; FIG. 6(b) superimposes on the original image the outer frames of all partial areas determined to be character blocks; and FIG. 6(c) is the result of extracting edges, limited to those character areas, by thresholding (threshold 25) the differential intensity image computed with the Sobel operator. Here, the partial area size was 12 × 12 pixels, and the second judgment process used the variance value (var_1) of the projection value variance data over the rotation angles. For comparison, FIG. 7(a) shows an image binarized by the discriminant analysis method widely used as a representative conventional technique, and FIG. 7(b) shows an edge image obtained by thresholding the differential intensity image computed with the Sobel operator. As is apparent from these figures, the method of the present invention limits the character areas and extracts clear character shapes.

[0031] The input image is divided into rectangular areas of a fixed size; in each partial area, while the rotation angle is varied, the vertical projection values are computed by summing density values, the projection values are compared across the rotation angles, and whether the partial area contains a line segment image is judged from the angle dependence of the projection values. By removing from the original image, on the basis of these per-area judgments, the areas other than those containing characters, only the partial areas containing characters can be separated and extracted, unaffected by the background patterns, density unevenness, and noise images in the character image to be processed.

[0032] Although the above describes the method for extracting character areas of a grayscale image, the method can also be held in the form of a program executable by a data processing apparatus. Accordingly, the present invention also covers a recording medium on which that program is recorded, as set forth in the claims of this specification.

[0033]
[Effects of the Invention] As described above, according to the present invention, only the partial areas containing characters are extracted and identified from the image to be processed while the shape information of the characters themselves is preserved, and image processing and character matching are performed only on those partial areas; this enables character reading and recognition that is little affected by the density unevenness and noise patterns present in the background area of the original image.

BRIEF DESCRIPTION OF THE DRAWINGS

[FIG. 1] A diagram showing an embodiment of extracting character areas from a grayscale image of stamped characters according to the present invention.

[FIG. 2] A flowchart showing an example of the detailed processing procedure of the embodiment.

[FIG. 3] A diagram showing examples of area division of an input image.

[FIG. 4] An example of typical point sequence data obtained for a character image by plotting the mean density of each partial area on the horizontal axis and the standard deviation of the density on the vertical axis.

[FIG. 5] A diagram showing the general flow of the processes 304 and 305, using 18 directions in 10-degree steps.

[FIG. 6] (a) is an original image of stamped characters; (b) is a diagram displaying the outer frames of the partial areas determined to be character blocks; (c) is the result of extracting edges by thresholding the differential intensity image computed with the Sobel operator.

[FIG. 7] (a) is a binary image obtained as a processing result of the conventionally widely used discriminant analysis method; (b) is the result of edge extraction using the Sobel operator, as in the process 307 described above.

[FIG. 8] A diagram showing variations of the target character image patterns.

[Explanation of Symbols]

201 First judgment process
202 Second judgment process
203 Total judgment process
204 Character area / boundary shape extraction process
301 Area division process
302 Variance quantity calculation and first block judgment process
303 Rotated image generation process
304 Projection distribution calculation and variance data calculation process
305 Second block judgment process
306 Total character area / background area judgment process
307 Edge extraction process

Continuation of front page:
(72) Inventor: Koichi Tanaka, 3-19-2 Nishi-Shinjuku, Shinjuku-ku, Tokyo, Nippon Telegraph and Telephone Corporation
(72) Inventor: Masashi Okudaira, 3-19-2 Nishi-Shinjuku, Shinjuku-ku, Tokyo, Nippon Telegraph and Telephone Corporation
F-terms (reference): 5B029 AA01 BB02 CC15 CC29; 5C077 LL01 MP06 PP22 PP25 PP27 PP43 PP46 PP47 PP48 PP58 PP68 PQ22; 5L096 AA06 BA17 FA32 FA33 FA44 GA19 GA22 GA23

Claims (6)

[Claims]
[Claim 1] A method for extracting character areas of a grayscale image, in which a grayscale image containing a character image to be processed is divided into partial areas, each partial area is judged to be either an area containing a character or a background area containing no character, and the background areas are removed from the entire image so that only the character areas are extracted, the method combining: a first judgment process that divides the grayscale image to be processed into partial areas and identifies character candidate partial areas on the basis of the mean value and the standard deviation computed from the density values of the pixels belonging to each partial area; and a second judgment process that identifies character candidate partial areas by focusing on the rotation angle dependence, with respect to the angle change, of the projection value variance for each rotation angle direction of each partial area.
[Claim 2] The method for extracting character areas of a grayscale image according to claim 1, comprising at least a total judgment process that finalizes the character areas by taking either of, or the logical AND or logical OR of, the judgment result based on the first judgment process and the judgment result based on the second judgment process.
3. The character-area extraction method for a grayscale image according to claim 1 or claim 2, wherein the second determination process comprises: a first step of generating, for each partial area, images rotated by a predetermined angle at a time and, at each rotation angle, obtaining projection values by accumulating the density values of the pixel columns in one direction and computing their averages; a second step of computing, in each partial area and for each rotation angle, the variance of the projection values of the pixel columns over their distribution in the other direction; a third step of computing, in each partial area, either the variance, with respect to the change in rotation angle, of the per-angle projection-value variances obtained in the second step, or the difference variance, i.e. the difference between the maximum and the minimum of those variances; and a fourth step of judging whether the partial area is a candidate partial area containing a character to be detected by comparing the variance or difference variance obtained in the third step against a threshold given in advance.
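The four steps of the second determination process above can be sketched as follows. The nearest-neighbour rotation, the 15-degree angle step, and the threshold value are all illustrative assumptions (the patent specifies neither the interpolation method nor concrete parameters):

```python
import numpy as np

def rotate_nn(region, angle_deg):
    """Nearest-neighbour rotation of a region about its centre (illustrative)."""
    h, w = region.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    yy, xx = np.mgrid[0:h, 0:w]
    a = np.deg2rad(angle_deg)
    # Inverse-map each output pixel back into the source region.
    sy = np.round(cy + (yy - cy) * np.cos(a) - (xx - cx) * np.sin(a)).astype(int)
    sx = np.round(cx + (yy - cy) * np.sin(a) + (xx - cx) * np.cos(a)).astype(int)
    out = np.full_like(region, region.mean())  # pad corners with the mean density
    ok = (sy >= 0) & (sy < h) & (sx >= 0) & (sx < w)
    out[yy[ok], xx[ok]] = region[sy[ok], sx[ok]]
    return out

def second_determination(region, angles=range(0, 180, 15), thresh=100.0):
    """Steps 1-4 of the claim: project per-column mean densities at each
    rotation angle, take the variance of each projection profile, then
    threshold the spread (max - min) of those variances across angles.
    The threshold value is illustrative."""
    proj_vars = []
    for ang in angles:
        rot = rotate_nn(region, ang)
        projection = rot.mean(axis=0)         # step 1: per-column mean density
        proj_vars.append(projection.var())    # step 2: variance of the profile
    spread = max(proj_vars) - min(proj_vars)  # step 3: "difference variance"
    return spread > thresh                    # step 4: character candidate?
```

The intuition is that character strokes produce a projection profile whose variance changes sharply with angle (peaking when strokes align with the projection direction), whereas textureless or isotropic background yields nearly the same variance at every angle.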
4. A recording medium on which is recorded, in a form executable by a data processing apparatus, a character-area extraction program for a grayscale image that divides a grayscale image containing the character image to be processed into partial areas, judges for each partial area whether it is an area containing characters or a background area containing none, and removes the background areas from the whole image so that only the character areas are extracted, the program combining: a first determination process that divides the grayscale image to be processed into partial areas and identifies character-candidate partial areas on the basis of the mean and the standard deviation computed over the density values of the pixels belonging to each partial area; and a second determination process that identifies character-candidate areas by examining, for each partial area, how the variance of the projection values in each rotation-angle direction depends on the angle change.
5. A recording medium on which is recorded, in a form executable by a data processing apparatus, a character-area extraction program for a grayscale image that divides a grayscale image containing the character image to be processed into partial areas, judges for each partial area whether it is an area containing characters or a background area containing none, and removes the background areas from the whole image so that only the character areas are extracted, the program combining: a first determination process that divides the grayscale image to be processed into partial areas and identifies character-candidate partial areas on the basis of the mean and the standard deviation computed over the density values of the pixels belonging to each partial area; a second determination process that identifies character-candidate areas by examining, for each partial area, how the variance of the projection values in each rotation-angle direction depends on the angle change; a comprehensive determination process that fixes the character areas by taking either the determination result based on the first determination process or the determination result based on the second determination process, or the logical AND or logical OR of the two; and a character-area/boundary-shape extraction process that extracts the character areas or the character boundary shapes on the basis of the result of the comprehensive determination process.
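The character-area/boundary-shape extraction named in claim 5 is not detailed in the claims. As one possible reading, once the comprehensive determination has produced a per-block character mask, the blocks on the rim of the accepted region can serve as a crude boundary shape. The 4-neighbour rule below is an assumption, not the patent's method:

```python
import numpy as np

def region_boundary(mask):
    """Given the final per-block character mask (2-D boolean array), return
    the boundary blocks: candidate blocks with at least one non-candidate
    4-neighbour. A minimal stand-in for the claimed character-area /
    boundary-shape extraction process."""
    padded = np.pad(mask, 1, constant_values=False)
    # A block is interior only if all four 4-neighbours are also candidates.
    interior = (padded[:-2, 1:-1] & padded[2:, 1:-1] &
                padded[1:-1, :-2] & padded[1:-1, 2:])
    return mask & ~interior
```

For a solid 3x3 block of candidates, this keeps the eight rim blocks and drops the single interior one.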
6. The recording medium according to claim 4 or claim 5, wherein the second determination process of the recorded character-area extraction program for a grayscale image comprises: a first step of generating, for each partial area, images rotated by a predetermined angle at a time and, at each rotation angle, obtaining projection values by accumulating the density values of the pixel columns in one direction and computing their averages; a second step of computing, in each partial area and for each rotation angle, the variance of the projection values of the pixel columns over their distribution in the other direction; a third step of computing, in each partial area, either the variance, with respect to the change in rotation angle, of the per-angle projection-value variances obtained in the second step, or the difference variance, i.e. the difference between the maximum and the minimum of those variances; and a fourth step of judging whether the partial area is a candidate partial area containing a character to be detected by comparing the variance or difference variance obtained in the third step against a threshold given in advance.
JP21225898A 1998-07-28 1998-07-28 Character extraction method of gray image and recording medium recording the program Expired - Fee Related JP3416058B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP21225898A JP3416058B2 (en) 1998-07-28 1998-07-28 Character extraction method of gray image and recording medium recording the program

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP21225898A JP3416058B2 (en) 1998-07-28 1998-07-28 Character extraction method of gray image and recording medium recording the program

Publications (2)

Publication Number Publication Date
JP2000048120A true JP2000048120A (en) 2000-02-18
JP3416058B2 JP3416058B2 (en) 2003-06-16

Family

ID=16619604

Family Applications (1)

Application Number Title Priority Date Filing Date
JP21225898A Expired - Fee Related JP3416058B2 (en) 1998-07-28 1998-07-28 Character extraction method of gray image and recording medium recording the program

Country Status (1)

Country Link
JP (1) JP3416058B2 (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7330600B2 (en) 2002-09-05 2008-02-12 Ricoh Company, Ltd. Image processing device estimating black character color and ground color according to character-area pixels classified into two classes
JP2009027715A (en) * 2007-07-20 2009-02-05 Sharp Corp Method and system for color correction of image data
KR100943595B1 (en) 2003-01-30 2010-02-24 삼성전자주식회사 Device and method for blurring decision of image
JP2010231686A (en) * 2009-03-30 2010-10-14 Fujitsu Ltd Device, method and program for extracting document area from image
JP2011135400A (en) * 2009-12-25 2011-07-07 Casio Computer Co Ltd Image processing apparatus and method, and program
CN108289193A (en) * 2017-11-15 2018-07-17 秦广民 Monitoring type radio data storage system
CN110596746A (en) * 2019-10-17 2019-12-20 中国测试技术研究院辐射研究所 Method for using an automatic test/calibration/verification device for a dose equivalent instrument
CN111401142A (en) * 2020-02-25 2020-07-10 杭州测质成科技有限公司 Aero-engine blade metal surface etching character recognition method based on deep learning
CN112364871A (en) * 2020-10-27 2021-02-12 重庆大学 Part code spraying character segmentation method based on improved projection algorithm
US11373388B2 (en) * 2017-07-24 2022-06-28 United States Postal Service Persistent feature based image rotation and candidate region of interest

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS62284485A (en) * 1986-06-02 1987-12-10 Komatsu Ltd Method for recognizing linear pattern
JPS63205783A (en) * 1987-02-23 1988-08-25 Canon Inc Picture tone discriminating device

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS62284485A (en) * 1986-06-02 1987-12-10 Komatsu Ltd Method for recognizing linear pattern
JPS63205783A (en) * 1987-02-23 1988-08-25 Canon Inc Picture tone discriminating device

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7330600B2 (en) 2002-09-05 2008-02-12 Ricoh Company, Ltd. Image processing device estimating black character color and ground color according to character-area pixels classified into two classes
KR100943595B1 (en) 2003-01-30 2010-02-24 삼성전자주식회사 Device and method for blurring decision of image
US8223395B2 (en) 2007-07-20 2012-07-17 Sharp Laboratories Of America, Inc. Methods and systems for refining text color in a digital image
JP2009027715A (en) * 2007-07-20 2009-02-05 Sharp Corp Method and system for color correction of image data
JP4649498B2 (en) * 2007-07-20 2011-03-09 シャープ株式会社 Color correction method and system for image data
JP2010231686A (en) * 2009-03-30 2010-10-14 Fujitsu Ltd Device, method and program for extracting document area from image
JP2011135400A (en) * 2009-12-25 2011-07-07 Casio Computer Co Ltd Image processing apparatus and method, and program
US8345119B2 (en) 2009-12-25 2013-01-01 Casio Computer Co., Ltd. Image pickup device, image processing method, and storage medium storing program
US8787675B2 (en) 2009-12-25 2014-07-22 Casio Computer Co., Ltd. Image pickup device, image processing method, and storage medium storing program
US11373388B2 (en) * 2017-07-24 2022-06-28 United States Postal Service Persistent feature based image rotation and candidate region of interest
CN108289193A (en) * 2017-11-15 2018-07-17 秦广民 Monitoring type radio data storage system
CN110596746A (en) * 2019-10-17 2019-12-20 中国测试技术研究院辐射研究所 Method for using an automatic test/calibration/verification device for a dose equivalent instrument
CN110596746B (en) * 2019-10-17 2024-03-01 中国测试技术研究院辐射研究所 Method for automatic testing/calibrating device using dose equivalent instrument
CN111401142A (en) * 2020-02-25 2020-07-10 杭州测质成科技有限公司 Aero-engine blade metal surface etching character recognition method based on deep learning
CN112364871A (en) * 2020-10-27 2021-02-12 重庆大学 Part code spraying character segmentation method based on improved projection algorithm

Also Published As

Publication number Publication date
JP3416058B2 (en) 2003-06-16

Similar Documents

Publication Publication Date Title
US11797886B2 (en) Image processing device, image processing method, and image processing program
CN114418957B (en) Global and local binary pattern image crack segmentation method based on robot vision
CN111179243A (en) Small-size chip crack detection method and system based on computer vision
CN115375676A (en) Stainless steel product quality detection method based on image recognition
CN107844683B (en) Method for calculating concentration of digital PCR (polymerase chain reaction) liquid drops
CN111474184A (en) AOI character defect detection method and device based on industrial machine vision
CN109472271B (en) Printed circuit board image contour extraction method and device
CN108921813B (en) Unmanned aerial vehicle detection bridge structure crack identification method based on machine vision
CN108985305B (en) Laser etching industrial detonator coded image positioning and correcting method
JP2002133426A (en) Ruled line extracting device for extracting ruled line from multiple image
CN115423771B (en) Quasi-dynamic laser anti-counterfeit label identification method based on feature non-uniformity
CN113221861A (en) Multi-lane line detection method, device and detection equipment
JP2002228423A (en) Tire detecting method and device
JP3416058B2 (en) Character extraction method of gray image and recording medium recording the program
CN114723677A (en) Image defect detection method, image defect detection device, image defect detection equipment and storage medium
CN112818983A (en) Method for judging character inversion by using picture acquaintance
CN115410191B (en) Text image recognition method, device, equipment and storage medium
JPH0256688A (en) Character segmenting device
Yang et al. A novel binarization approach for license plate
CN114241463A (en) Signature verification method and device, computer equipment and storage medium
CN108573253B (en) Method for generating binary image of characters of license plate
JPH0592646A (en) Method for inspecting printed matter
Long et al. An Efficient Method For Dark License Plate Detection
CN111028258A (en) Self-adaptive threshold value extraction method for large-scale gray level image
JPH0624014B2 (en) Gray image processing method

Legal Events

Date Code Title Description
FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20090404

Year of fee payment: 6


FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20100404

Year of fee payment: 7

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20110404

Year of fee payment: 8

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20120404

Year of fee payment: 9

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20130404

Year of fee payment: 10

LAPS Cancellation because of no payment of annual fees