JP2987877B2

JP2987877B2 - Character recognition method

Info

Publication number: JP2987877B2
Application number: JP2120219A
Authority: JP
Inventors: 良一湯下
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 1990-05-10
Filing date: 1990-05-10
Publication date: 1999-12-06
Anticipated expiration: 2014-12-06
Also published as: JPH0417087A

Description

[Detailed description of the invention]

[Field of industrial application]

The present invention relates to a character recognition method that recognizes characters from a scanned character pattern.

[Prior art]

In recent years there has been growing demand for using character recognition devices as input devices for computers and similar equipment, and a character recognition device that can efficiently produce stable recognition results has become indispensable for improving the performance of such systems.

One conventional character recognition method finds the outermost points, in predetermined directions, on the contour of the binary image that forms a character. From this information the contour is divided into convex segments, concave segments, and hole segments; a feature value is computed for each segment; and the character pattern is identified by matching these feature values against a dictionary prepared in advance.

[Problem to be solved by the invention]

As described above, the conventional method finds the outermost points of the contour in predetermined directions in order to divide the contour into concave, convex, and hole segments. This operation, however, yields no position information for the clusters of black pixels that make up the character pattern, that is, the islands. When separated characters with two or more islands, such as "i" and "j", were to be recognized, the island position information therefore had to be obtained separately.

[Means for solving the problem]

To solve the above problem, the present invention finds the points that make up the convex hull and uses them as the information for dividing the contour of the original image into concave and convex segments. For each of a plurality of predetermined reference points, it obtains the contact points between the character pattern and the tangents that enclose the pattern as seen from that reference point, together with data indicating the angles of those tangents. Features of the stored character pattern are then obtained from two things: whether the character pattern protrudes with respect to each line segment of the convex hull formed by connecting the contact points, and the data on the angle between each tangent and the reference line passing through the reference point.

In addition, the points that make up the convex hull are found by table lookup, so no arithmetic operations are required.

[Operation]

In the present invention, the operation that finds the points making up the convex hull yields the island position information at the same time.

[Embodiment]

The present invention is described below with reference to the accompanying drawings, which show one embodiment.

In Fig. 1, 1 is an image input unit that takes in the character to be recognized as a binarized original image and stores it in the image memory; 2 is the image memory that stores the input original image; 3 is a contour detection unit that reads the original image from the image memory and finds the contour, the sequence of pixels outlining the original image; 4 is a concavity/convexity information detection unit that finds, from their bearings as seen from a plurality of predetermined reference points, the contour pixels that make up the convex hull of the original image, and on that basis divides the contour into concave and convex sections; 5 is a feature extraction unit that derives shape features from the concavity/convexity information obtained by unit 4; 6 is a recognition processing unit that compares the features extracted by unit 5 with a dictionary prepared in advance and outputs the character with the closest feature values as the candidate character; 7 is the dictionary, which stores standard features computed in advance for the characters to be recognized; and 8 is an internal bus connecting units 1 through 7. 9 is a character segmentation unit that cuts out a single character from the character pattern stored in image memory 2.

The operation of the character recognition device of this embodiment, configured as above, is described below using the character "?" as the recognition target. Fig. 2 shows a flowchart of the overall processing.

The character "?" is input as a binary image by the image input unit and stored in the image memory as the original image. Fig. 3 shows the input original image.

The contour detection unit reads the original image and performs contour detection, which is described below.

First, black pixels in the original image that are joined under 8-connectivity are grouped together, and each connected component is assigned a distinct name (hereinafter, a label).
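
The labeling step can be illustrated with a minimal Python sketch. It assumes the binary image is a list of rows with 1 for black and 0 for white, and it uses integer labels 1, 2, ... in place of the names "a", "b"; the function name `label_components` and the breadth-first flood fill are illustrative choices, not details taken from the patent.

```python
from collections import deque

def label_components(image):
    """Label 8-connected black pixels (value 1) in a binary image.

    Returns a grid of the same shape holding 0 for white pixels and
    1, 2, ... for the connected components, plus the component count.
    """
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or labels[r][c] != 0:
                continue  # white pixel, or already labeled
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                # Visit all 8 neighbors (the center itself is already labeled).
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] == 1
                                and labels[ny][nx] == 0):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count
```

Because the scan runs in raster order, the component met first receives label 1, so for a "?" the hook would normally be labeled before the dot.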

Fig. 3 shows the result of labeling; "a" and "b" are the labels of the two connected components. After labeling, a contour is found for each label. The contour consists of the white pixels that have a black pixel carrying the label in question directly above, below, to the left, or to the right. Tracing starts at the white pixel to the left of the first black pixel found in a raster (television-style) scan from the top-left corner and proceeds counterclockwise, keeping the black pixels on the left.
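
As a rough illustration of this contour definition, the following Python sketch collects the contour pixels of one label and locates the tracing start point. It assumes a label grid with 0 for white pixels and a white margin of at least one pixel around the pattern; the names `contour_pixels` and `start_pixel` are hypothetical. The sketch returns the contour as an unordered set and leaves out the counterclockwise tracing order.

```python
def contour_pixels(labels, target):
    """Return the white pixels (row, col) that have a pixel labeled
    `target` directly above, below, left, or right."""
    rows, cols = len(labels), len(labels[0])
    result = set()
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] != 0:
                continue  # only white pixels can be contour pixels
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = r + dy, c + dx
                if (0 <= ny < rows and 0 <= nx < cols
                        and labels[ny][nx] == target):
                    result.add((r, c))
                    break
    return result

def start_pixel(labels, target):
    """Return the white pixel to the left of the first `target` pixel
    found in a raster scan from the top-left corner.

    Assumes a white margin, so the column index never drops below 0.
    Returns None if the label does not occur.
    """
    for r, row in enumerate(labels):
        for c, value in enumerate(row):
            if value == target:
                return (r, c - 1)
    return None
```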

Fig. 4 shows the detected contours. In Fig. 4, a1 to a18 form the contour of label a and b1 to b6 the contour of label b; the numbers give the order in which the pixels were found along each contour.

Next, the convex hull of the original image is found from the contours obtained above. The convex hull of an arbitrary set of pixels S is the smallest convex set containing S. Finding the convex hull makes it possible to detect the regions inside the hull that do not belong to S, that is, the concavities of S.

The method of finding the convex hull is as follows. The outline of the convex hull is obtained by drawing every tangent to the target figure; the points at which the tangents touch the figure (the contact points) are the pixels that make up the convex hull.

In the present invention, a plurality of reference points are placed outside the convex hull of the target figure, and the tangents to the figure's contour that pass through each reference point are found. The positions and number of reference points are chosen according to the size and the shape complexity of the patterns that may be recognized. A tangent through a reference point is found by locating its contact point on the contour. As shown in Fig. 5, let A be a straight line through the reference point and a point on the contour, and let the reference line be the horizontal line through the reference point. The two contour points at which the angle θ between A and the reference line is largest and smallest are the contact points. In Fig. 5 these are points B and C.
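
The angle criterion can be written directly as a short Python sketch. It assumes points are (x, y) pairs and that the reference point lies outside the figure so that all angles fall in a range where `math.atan2` has no wrap-around (for example, a corner reference point with the figure in one quadrant). The function name `tangent_contacts` is hypothetical, and this direct angle computation is the costly approach that the embodiment goes on to replace with a lookup table.

```python
import math

def tangent_contacts(reference, contour):
    """Return the two contour points whose lines to `reference` make the
    largest and the smallest angle with the horizontal reference line."""
    rx, ry = reference

    def angle(point):
        # Angle of the line from the reference point to this contour point.
        return math.atan2(point[1] - ry, point[0] - rx)

    return max(contour, key=angle), min(contour, key=angle)
```

For a reference point at (0, 0) and contour points (4, 1), (3, 3), (1, 4), and (2, 2), the function picks (1, 4) and (4, 1) as the two contact points.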

In this embodiment, as one example, the size of the character patterns to be recognized and the positions of the reference points are set as follows, and the convex hull is found on that basis.

The character pattern is at most 9 pixels high and 9 pixels wide, and the reference points are the four corners (top-left, top-right, bottom-left, and bottom-right) of an image area 10 pixels high and 10 pixels wide.

As explained above, the convex hull is found by comparing the angles that the straight lines through a reference point and the contour points make with the reference line.

However, computing the angle between two straight lines mathematically requires operations such as division, which increases processing time.

In this embodiment the processing is therefore simplified by computing in advance, for the entire image area, the bearing of every pixel from the reference point, and preparing an azimuth order matrix in which the pixels are ranked purely by the relative size of those bearings. Fig. 6 shows the azimuth order matrix for an area 10 pixels high and 10 pixels wide; with H as the reference point and HI as the reference line, the pixels are numbered in descending order of angle.
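
A table of this kind could be precomputed along the following lines. This Python sketch is an assumption-laden illustration, not a reproduction of Fig. 6: it places the reference pixel at the top-left corner (column 0, row 0), takes the top row as the reference line, gives pixels on the same ray from the reference point the same rank, and assigns 0 to the reference pixel itself. The function name `azimuth_order_matrix` is hypothetical. Exact fractions stand in for the angles so that ties are detected reliably.

```python
from fractions import Fraction

def azimuth_order_matrix(size=10):
    """Rank every pixel of a size x size area by its bearing from the
    reference pixel at the top-left corner.

    Rank 1 is the largest angle from the horizontal reference line;
    pixels on the same ray share a rank. The reference pixel gets 0.
    """
    def slope_key(x, y):
        # Exact stand-in for the angle: vertical rays sort above any slope.
        return (1, Fraction(0)) if x == 0 else (0, Fraction(y, x))

    keys = {(x, y): slope_key(x, y)
            for y in range(size) for x in range(size)
            if (x, y) != (0, 0)}
    # Largest angle first, so it receives rank 1.
    distinct = sorted(set(keys.values()), reverse=True)
    rank_of = {key: i + 1 for i, key in enumerate(distinct)}

    matrix = [[0] * size for _ in range(size)]
    for (x, y), key in keys.items():
        matrix[y][x] = rank_of[key]
    return matrix
```

The table is built once, offline; at recognition time only lookups and comparisons on its entries are needed.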

The contour of Fig. 4 is laid over this azimuth order matrix, and the matrix entry at each contour point's position is taken as that point's value (hereinafter, its matrix value). Within each contour, the pixel with the largest matrix value and the pixel with the smallest are taken as contact points. Because the reference points are the four points A, B, C, and D in Fig. 4, the contact points for A are found by aligning HIJK of Fig. 6 with ABCD of Fig. 4, and those for B, C, and D by aligning IJKH, JKHI, and KHIJ of Fig. 6, respectively, with ABCD of Fig. 4.
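
The lookup step might be sketched in Python as follows. The sketch assumes contour pixels are (row, column) pairs and models the realignment of HIJK with ABCD as successive 90-degree rotations of one precomputed matrix; the rotation direction and the corner order are assumptions, since they depend on how the corners are lettered in Figs. 4 and 6. All function names are hypothetical. Only table lookups and comparisons are performed on the contour.

```python
def rotate90(matrix):
    """Rotate a square matrix 90 degrees clockwise."""
    return [list(row) for row in zip(*matrix[::-1])]

def contact_points(matrix, contour):
    """Return the contour pixels with the largest and the smallest
    matrix value (ties resolve to the first pixel encountered)."""
    def value(pixel):
        return matrix[pixel[0]][pixel[1]]
    return max(contour, key=value), min(contour, key=value)

def contacts_for_all_corners(matrix, contour):
    """Find the contact points for four corner reference points by
    reusing one azimuth order matrix in four orientations."""
    results = []
    current = matrix
    for _ in range(4):
        results.append(contact_points(current, contour))
        current = rotate90(current)
    return results
```

The union of the contact points found for all reference points and all contours gives the candidate convex hull points.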

In Fig. 4, the pixels marked "○" are the pixels that make up the convex hull.

After the pixels that make up the convex hull (hereinafter, convex hull points) have been detected, each run of contour pixels between two consecutive convex hull points is judged to be concave or convex. The judgment uses the matrix values: if, for each reference point, the matrix values of the pixels between the two convex hull points are all smaller than the matrix values at the convex hull points that form the start and end of the run, the run is regarded as a convex section; if any is larger, it is regarded as a concave section.

Accordingly, the sections between a2 and a9 and between a11 and a14 are concave sections, and the remaining sections are convex.

The above operations divide the contour of the original image into concave and convex sections, so the shape feature extraction described below is performed next.

In this embodiment the following are obtained as shape features.

(1) Opening direction of a concave section: the direction, measured against the reference line, from the start point of the concave section toward its end point
(2) Area ratio between the concavity and the circumscribed rectangle
(3) Positional relationship of the islands

Feature (1) is the direction, measured against the reference line, in which the end point of a concave section lies as seen from its start point; it is easily obtained from the coordinates of the two points.
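
One way to compute feature (1) from the two coordinates is sketched below in Python. It assumes (x, y) points, a horizontal reference line, and a result expressed in degrees in the range [0, 360); the patent does not state the unit or any quantization, so these are illustrative choices, as is the name `opening_direction`.

```python
import math

def opening_direction(start, end):
    """Return the direction from `start` to `end`, in degrees within
    [0, 360), measured from the horizontal reference line."""
    dx = end[0] - start[0]
    dy = end[1] - start[1]
    return math.degrees(math.atan2(dy, dx)) % 360.0
```

With image coordinates, where y grows downward, the angle increases clockwise on screen; a dictionary built with the same convention is unaffected by that choice.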

Feature (2) is the ratio of the area of the closed region formed by joining the start and end points of a concave section to the area of the rectangle circumscribing the original image, that is, (area of the concavity) / (area of the circumscribed rectangle).
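
Feature (2) can be approximated with the shoelace formula, as in the Python sketch below. It assumes the concave section is given as an ordered list of (x, y) contour pixels from start point to end point, so that closing the polygon from the last point back to the first corresponds to the joining line, and it measures the circumscribed rectangle in whole pixels. Whether the patent counts pixels or uses a polygon area is not stated, so the polygon area here is an assumption; both function names are hypothetical.

```python
def polygon_area(points):
    """Area of a simple polygon by the shoelace formula; the polygon is
    closed from the last point back to the first."""
    total = 0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2

def concavity_area_ratio(section, black_pixels):
    """(area of the concavity) / (area of the circumscribed rectangle).

    `section` is the ordered concave run of contour pixels;
    `black_pixels` are the (x, y) pixels of the whole original image.
    """
    xs = [x for x, _ in black_pixels]
    ys = [y for _, y in black_pixels]
    # Bounding box measured in pixels, hence the +1 on each side.
    rect_area = (max(xs) - min(xs) + 1) * (max(ys) - min(ys) + 1)
    return polygon_area(section) / rect_area
```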

Feature (3) describes the positional relationship of the islands when the number of clusters of black pixels making up the character pattern (hereinafter, islands), that is, the number of labels found in the processing described above, is two or more. For each reference point, the average of the largest and smallest Fig. 6 matrix values along an island's contour is held as that island's position information.
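
The island position value reuses the same largest and smallest matrix values that identify the contact points, which is why it comes at no extra cost. A minimal Python sketch, assuming one azimuth order matrix per reference point and contour pixels given as (row, column) pairs (the name `island_position` is hypothetical):

```python
def island_position(matrices, contour):
    """For each reference point's matrix, average the largest and the
    smallest matrix value found along one island's contour."""
    position = []
    for matrix in matrices:
        values = [matrix[r][c] for r, c in contour]
        position.append((max(values) + min(values)) / 2)
    return position
```

Comparing these per-reference-point values between islands indicates, for example, which island lies above the other in characters such as "i", "j", or "?".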

Based on the shape features obtained above, the recognition unit compares them with the dictionary prepared in advance and outputs the candidate character.

[Effect of the invention]

As described above, the present invention extracts the shape features of a character pattern from the concavity/convexity information of the contour of the original image, which makes stable recognition results obtainable. In addition, the concavity/convexity information is obtained without arithmetic operations, and the island position information is obtained in the course of obtaining it, so there is no need to compute the island positions separately. High-speed processing is therefore possible.

[Brief description of the drawings]

Fig. 1 is a block diagram of a device using the character recognition method according to one embodiment of the present invention; Fig. 2 is a flowchart of the overall processing of the character recognition method; Fig. 3 shows the image of the character "?"; Fig. 4 shows the detected contours and convex hull; Fig. 5 is a conceptual diagram of the convex hull detection method; and Fig. 6 shows the azimuth order matrix used to find the convex hull.

1: image input unit, 2: image memory, 3: contour detection unit, 4: concavity/convexity information detection unit, 5: feature extraction unit, 6: recognition processing unit, 7: dictionary, 9: character segmentation unit

Continuation of front page

(58) Fields searched (Int. Cl.⁶, DB name): G06K 9/46, G06K 9/62

Claims

(57) [Claims]

1. A character recognition method characterized by: storing a character pattern to be recognized; obtaining, for each of a plurality of predetermined reference points, the contact point between the stored character pattern and a tangent that encloses the character pattern as viewed from that reference point, together with the angle of that tangent; determining, for each line segment of the convex hull formed by the plurality of contact points thus obtained, whether the character pattern protrudes with respect to that line segment; extracting features of the stored character pattern from the result of that determination and from data on the angle between each tangent and a reference line passing through the reference point; and identifying the character pattern to be recognized from the extracted features.