JPH03269782A - Character area extracting method for character reader - Google Patents
Character area extracting method for character reader
- Publication number
- JPH03269782A (Japanese patent application JP2069965A)
- Authority
- JP
- Japan
- Prior art keywords
- character
- area
- character area
- value
- picture
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
Description
[Detailed Description of the Invention]

[Industrial Field of Application]

The present invention relates to a character area extraction method in a character reader that reads characters from a multivalued image, containing characters, captured by an imaging device such as a CCD camera.
[Prior Art]

In a conventional character reader, the character area is typically extracted as follows. First, filtering such as smoothing is applied to the image of Fig. 4(a) to remove local noise 20, as shown in Fig. 4(b). Next, as shown in Fig. 4(c), a histogram of the image density values is computed, and on the basis of this histogram the image is binarized at an appropriate density value, so that the background area and the character area are separated and the character area is extracted, as shown in Fig. 4(d).
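A minimal sketch of this conventional histogram-based binarization, in pure Python. Otsu's between-class-variance criterion is used here as one concrete way to pick "an appropriate density value"; the patent itself does not specify the selection rule, so this is an illustrative assumption:

```python
def otsu_threshold(pixels):
    """Pick the grey level that maximizes between-class variance
    over a 256-bin histogram (Otsu's method, pure-Python sketch)."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w_b = 0      # background pixel count so far
    sum_b = 0.0  # background intensity sum so far
    for t in range(256):
        w_b += hist[t]
        if w_b == 0:
            continue
        w_f = total - w_b
        if w_f == 0:
            break
        sum_b += t * hist[t]
        m_b = sum_b / w_b                # background mean
        m_f = (sum_all - sum_b) / w_f    # foreground mean
        var_between = w_b * w_f * (m_b - m_f) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

def binarize(pixels, t):
    """Map each pixel to 1 (above threshold) or 0 (at or below)."""
    return [1 if p > t else 0 for p in pixels]
```

Any such purely histogram-based split discards all spatial and texture information, which is exactly the weakness the invention targets.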
[Problems to Be Solved by the Invention]

In the conventional extraction method described above, however, even when the background image has a distinctive texture pattern as shown in Fig. 5(a), the texture information is not used: the method attempts to separate the background area from the character area using only the density-value histogram shown in Fig. 5(b). Consequently, noise appears inside the extracted character area, or the extracted characters are clipped, and an appropriate character area cannot be obtained. This problem remained unresolved.
The present invention has been made in view of this unresolved problem of the conventional example, and its object is to provide a character area extraction method for a character reader that can accurately extract a character area from a multivalued image consisting of a character area and a background area having a distinctive texture pattern.
[Means for Solving the Problems]

To achieve the above object, the character area extraction method for a character reader according to the present invention is applied to a character reader that reads characters by extracting a character area from a multivalued image consisting of a background area and a character area. For each point of the multivalued image, the method computes the trace value and the determinant value of a two-dimensional fractal matrix obtained from the covering numbers found when the lengths of orthogonal cut curves on the image surface formed by the image density values are measured with a unit length, and separates the background area from the character area on the basis of these computed values to extract the character area.
[Operation]

In the present invention, the character area is not extracted merely from the image density values. Instead, the trace value and the determinant value of the two-dimensional fractal matrix are computed, and the difference in texture pattern between the background area and the character area, as reflected in these two values, is used to extract the character area accurately.
[Embodiment]

An embodiment of the present invention is described below with reference to the drawings. Fig. 1 is a schematic configuration diagram showing this embodiment.
In the figure, reference numeral 1 denotes an imaging device, such as a CCD camera, that scans a document and outputs image data; the image data output from the imaging device 1 is input to a character reader 2. Based on the input image data, the character reader 2 computes the trace value and the determinant value of a two-dimensional fractal matrix and, using both values, extracts the character area by exploiting the difference in texture pattern between the background area and the character area.
Suppose now that a multivalued image 5, consisting of a background area 3 having a distinctive texture pattern and a character area 4 as shown in Fig. 2(a), is input from the imaging device 1 to the character reader 2. Taking an orthogonal x-y coordinate system in Fig. 2(a) and plotting the density value at each point of the image along the z axis yields a multivalued image whose curved image surface 6 follows the density values, as shown in Fig. 2(b).
In Fig. 2(b), consider the two cut curves 7x and 7y that pass through an arbitrary pixel P(x, y) of the multivalued image and intersect orthogonally on the image surface 6. Let X(u) and Y(u) be the covering numbers in the x-axis and y-axis directions obtained when the lengths of the cut curves 7x and 7y are each measured with a unit length u. This yields the vector-valued function V(u) of equation (1), where ᵗ denotes transposition:

V(u) = ᵗ(X(u), Y(u))   ... (1)

If there exists a 2 × 2 matrix H satisfying

V(u) = u^(-H) V(1)   ... (2)

then V is defined to be a two-dimensional fractal quantity, and H is defined to be the fractal matrix associated with V.
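The text does not fix a concrete measurement procedure for the covering numbers X(u) and Y(u). One plausible realization, sketched below, box-counts the graph of each one-dimensional cut profile with boxes of side u; the axis-aligned-box discretization and integer-valued profiles are assumptions of this illustration:

```python
def covering_count(profile, u):
    """Number of u-by-u boxes needed to cover the graph of a 1-D
    integer intensity profile: one realization of the covering number."""
    count = 0
    for start in range(0, len(profile) - 1, u):
        seg = profile[start:start + u + 1]
        rise = max(seg) - min(seg)
        count += max(1, -(-rise // u))  # ceil(rise / u), at least one box
    return count

def V(image, x, y, u):
    """V(u) = (X(u), Y(u)): covering counts of the two orthogonal cut
    curves -- the row and column profiles -- through pixel (x, y)."""
    row = image[y]
    col = [r[x] for r in image]
    return covering_count(row, u), covering_count(col, u)
```

A flat background needs one box per column while a rapidly oscillating texture needs many, so V(u) shrinks at different rates as u grows on the two regions; this scaling behaviour is what the fractal matrix H summarizes.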
Accordingly, to estimate the matrix H, substitute u = e^s into equation (2); this gives the two-dimensional vector W(s) of equation (3):

W(s) = V(e^s) = e^(-sH) V(1)   ... (3)

Since the right-hand side of equation (3) is the solution of the differential equation W' = -HW, the components h11, h12, h21, h22 of the matrix H can be obtained by calculation.
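The patent does not fix the numerical scheme for recovering h11, h12, h21, h22. One plausible sketch, under the assumption that W(s) has been sampled at several scales s_k: discretize W' = -HW with midpoint finite differences and solve the two resulting two-parameter least-squares problems, one per row of H:

```python
def _lstsq2(A, b):
    """Solve min ||A h - b|| for h = (h1, h2) via the normal equations."""
    s11 = sum(a[0] * a[0] for a in A)
    s12 = sum(a[0] * a[1] for a in A)
    s22 = sum(a[1] * a[1] for a in A)
    t1 = sum(a[0] * bi for a, bi in zip(A, b))
    t2 = sum(a[1] * bi for a, bi in zip(A, b))
    det = s11 * s22 - s12 * s12
    return (s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det

def estimate_H(s_vals, W_vals):
    """Estimate the 2x2 matrix H in W'(s) = -H W(s) from samples
    W(s_k), using midpoint differences and linear least squares."""
    rows, rhs1, rhs2 = [], [], []
    for k in range(len(s_vals) - 1):
        ds = s_vals[k + 1] - s_vals[k]
        mid = [(W_vals[k][i] + W_vals[k + 1][i]) / 2 for i in (0, 1)]
        dW = [(W_vals[k + 1][i] - W_vals[k][i]) / ds for i in (0, 1)]
        rows.append(mid)
        rhs1.append(-dW[0])  # row 1 of H: h11*w1 + h12*w2 = -w1'
        rhs2.append(-dW[1])  # row 2 of H: h21*w1 + h22*w2 = -w2'
    h11, h12 = _lstsq2(rows, rhs1)
    h21, h22 = _lstsq2(rows, rhs2)
    return [[h11, h12], [h21, h22]]
```

With W(s) = e^(-sH) V(1) sampled at small spacing, the estimate converges to H; the midpoint scheme keeps the discretization error second-order in the spacing.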
From the computed components h11, h12, h21 and h22, the trace value T and the determinant value D of the matrix H are calculated.
The above calculation is then performed for every pixel of the image. When the distribution of the resulting trace values T and determinant values D is plotted, as shown in Fig. 3, samples of the same category, corresponding to the background area 3 and the character area 4, form clusters and fall into two texture-pattern classes: a region 8 contained in the stable spiral (focus) region and a region 9 contained in the stable nodal region, the two being separated by the curve l given by D = T²/4.
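The curve D = T²/4 is the discriminant locus of a 2 × 2 matrix: for a point (T, D) above the parabola the eigenvalues of H are complex (the spiral/focus side), and on or below it they are real (the nodal side). A minimal classifier over the (T, D) plane follows; note that the patent's actual boundary 10 is chosen empirically from the clusters in Fig. 3, not fixed to this parabola:

```python
def trace_det(H):
    """Trace T and determinant D of a 2x2 matrix H."""
    T = H[0][0] + H[1][1]
    D = H[0][0] * H[1][1] - H[0][1] * H[1][0]
    return T, D

def texture_class(H):
    """Classify H by which side of the parabola D = T^2/4 its
    (trace, determinant) point falls on: complex vs real eigenvalues."""
    T, D = trace_det(H)
    return "spiral" if D > T * T / 4 else "node"
```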
Therefore, by determining from Fig. 3 an appropriate boundary 10 for separating the background area from the character area, the pixels belonging to the character area can be extracted accurately using this boundary 10.
Character recognition is then performed on the extracted character area, which makes accurate character reading possible.
In the embodiment above, the trace value and the determinant value of the fractal matrix are computed for every pixel of the image, but the invention is not limited to this. Depending on the image resolution and the character size, a rough character area may first be extracted by sampling, for example, every few pixels, after which the calculation is repeated at a finer granularity within that area. This variant has the advantage of speeding up the character reading process.
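The coarse-to-fine variant described above can be sketched as follows; `classify` stands for any per-pixel character/background test (such as the fractal-matrix test), and both this callback and the neighborhood-refinement policy are assumptions of the illustration:

```python
def coarse_then_fine(width, height, step, classify):
    """Run the (expensive) per-pixel test only every `step`-th pixel
    first, then re-run it densely in a neighborhood of each coarse hit."""
    coarse_hits = [(x, y)
                   for y in range(0, height, step)
                   for x in range(0, width, step)
                   if classify(x, y)]
    fine = set()
    for cx, cy in coarse_hits:
        for y in range(max(0, cy - step), min(height, cy + step + 1)):
            for x in range(max(0, cx - step), min(width, cx + step + 1)):
                if classify(x, y):
                    fine.add((x, y))
    return fine
```

A character detail that falls entirely between coarse samples is missed, so the step size must stay comparable to the stroke width -- the trade-off behind the patent's remark about resolution and character size.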
[Effects of the Invention]

As described above, according to the present invention, texture patterns are classified using the two-dimensional fractal matrix of the multivalued image, and the character area is extracted by separating the background area from the character area. Therefore, even for an image whose background shows a somewhat complicated distribution of density values, the character area can be extracted more accurately than with the conventional method, which relies on the image density values alone, and accurate character reading can be performed.
[Brief Description of the Drawings]

Fig. 1 is a block diagram showing an embodiment of the present invention; Figs. 2(a) and 2(b) are explanatory diagrams of the character extraction method of the present invention; Fig. 3 is a graph showing the relationship between the trace values and the determinant values; and Figs. 4(a) to 4(d) and Figs. 5(a) and 5(b) are explanatory diagrams of the conventional character extraction method.

In the figures, 1 is the imaging device, 2 the character reader, 3 the background area, 4 the character area, 5 the multivalued image, 6 the image surface, 7x and 7y the cut curves, and 10 the boundary.
Claims (1)
In a character reader that reads characters by extracting a character area from a multivalued image consisting of a background area and a character area, a character area extraction method characterized in that, for each point of the multivalued image, the trace value and the determinant value of a two-dimensional fractal matrix, obtained from the covering numbers found when the lengths of orthogonal cut curves on the image surface formed by the image density values are measured with a unit length, are computed, and the background area and the character area are separated on the basis of both computed values to extract the character area.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2069965A JPH03269782A (en) | 1990-03-20 | 1990-03-20 | Character area extracting method for character reader |
Publications (1)
Publication Number | Publication Date |
---|---|
JPH03269782A true JPH03269782A (en) | 1991-12-02 |
Family
ID=13417875
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
JP2069965A Pending JPH03269782A (en) | 1990-03-20 | 1990-03-20 | Character area extracting method for character reader |
Country Status (1)
Country | Link |
---|---|
JP (1) | JPH03269782A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6055335A (en) * | 1994-09-14 | 2000-04-25 | Kabushiki Kaisha Toshiba | Method and apparatus for image representation and/or reorientation |
US6275615B1 (en) | 1994-09-14 | 2001-08-14 | Kabushiki Kaisha Toshiba | Method and apparatus for image representation and/or reorientation |