JPS5822479A - Character recognition device - Google Patents

Character recognition device

Info

Publication number
JPS5822479A
JPS5822479A
Authority
JP
Japan
Prior art keywords
pattern
stroke
points
character
input character
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
JP56121615A
Other languages
Japanese (ja)
Other versions
JPH026113B2 (en)
Inventor
Keiji Kobayashi
啓二 小林
Masataka Yamamoto
山本 勝敬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Computer Basic Technology Research Association Corp
Original Assignee
Computer Basic Technology Research Association Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Computer Basic Technology Research Association Corp filed Critical Computer Basic Technology Research Association Corp
Priority to JP56121615A priority Critical patent/JPS5822479A/en
Publication of JPS5822479A publication Critical patent/JPS5822479A/en
Publication of JPH026113B2 publication Critical patent/JPH026113B2/ja
Granted legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/16 Image preprocessing
    • G06V30/168 Smoothing or thinning of the pattern; Skeletonisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Character Discrimination (AREA)
  • Character Input (AREA)
  • Image Processing (AREA)

Abstract

PURPOSE: To recognize handwritten characters with high precision by extracting straight line segments (strokes) from an input character pattern and performing recognition using the midpoint patterns of those strokes. CONSTITUTION: The image signal of an input character on a slip 11, obtained by scanning with a scanning means 12, is used by a preprocessing means 13 to thin the input character pattern. A stroke extracting means 14 then finds feature points such as end points, branch points, and inflection points, and obtains the segments connecting those points. The means 14 further checks the direction of each segment at the feature points other than end points and joins the segments, thus extracting the strokes. A stroke midpoint pattern is then generated from the center positions of the strokes and sent to a determining means 15. The means 15 computes the similarity of this stroke midpoint pattern to that of each previously stored reference character to determine what the input character is.

Description

DETAILED DESCRIPTION OF THE INVENTION The present invention relates to a character recognition device for recognizing characters with many straight lines, and more particularly to a character recognition device for recognizing handwritten Chinese characters.

Conventionally, a pattern matching method has been used in devices that recognize Chinese characters, particularly devices that recognize printed Chinese characters. This method was effective for characters with a constant shape, such as printed kanji. However, as shown in FIG. 1, if the input character pattern 3 is even slightly tilted with respect to the reference character pattern 2 within the entry frame 1, the degree of similarity between the two becomes small. Furthermore, as shown in FIG. 2, when the input character pattern 5 has a different line width from the reference character pattern 4, the degree of similarity between the two also decreases.
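As a toy illustration of this brittleness (not from the patent; the grids and function names are invented for the example), even a one-pixel shift of a thin binary stroke can drive a pixel-overlap similarity from perfect to zero:

```python
# Sketch: pixel-overlap similarity of binary character patterns. This is a
# hypothetical example, not the patent's matching method.

def overlap_similarity(a, b):
    """Intersection-over-union of 'on' pixels in two equal-size binary grids."""
    inter = sum(x & y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    union = sum(x | y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    return inter / union if union else 1.0

def shift_right(grid, k):
    """Shift a binary grid k cells to the right, padding with zeros."""
    return [[0] * k + row[:len(row) - k] if k else row[:] for row in grid]

# A one-pixel-wide vertical stroke in an 8x8 grid.
ref = [[1 if c == 3 else 0 for c in range(8)] for r in range(8)]
print(overlap_similarity(ref, ref))                  # identical: 1.0
print(overlap_similarity(ref, shift_right(ref, 1)))  # shifted one cell: 0.0
```

A thick or tilted stroke degrades less abruptly, but the example shows why raw template matching is sensitive to exactly the variations handwriting exhibits.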

Therefore, the pattern matching method has the disadvantage that it is difficult to obtain a high recognition rate for characters such as handwritten Chinese characters in which each stroke is tilted with respect to the reference pattern or the line width is not constant.

In order to eliminate these drawbacks, this invention is characterized by extracting straight line segments (hereinafter referred to as strokes) from the input character pattern and performing recognition using the midpoint pattern of these strokes. The purpose is to provide a character recognition device with high reading accuracy. Hereinafter, the present invention will be explained in detail using the drawings.

FIG. 3 is a block diagram showing an embodiment of the apparatus of the present invention. First, the input characters on the form 11 are scanned by the scanning means 12, and the input character pattern is thinned by the pre-processing means 13 using the obtained image signal of the input characters.
Next, strokes are extracted from the thinned character pattern by the stroke extraction means 14, and a stroke midpoint pattern is created from the center position of each stroke and sent to the determination means 15.

The determination means 15 computes the similarity between this stroke midpoint pattern and the pre-stored stroke midpoint pattern of each reference character, and determines what the input character is.

FIG. 4 shows an example in which the kanji 有 has been thinned by the preprocessing means 13; the character portion 16 of the thinned input character pattern is indicated by "1".

FIG. 5 shows the result of extracting, from the character portion 16 of the thinned input character pattern, feature points such as end points, branch points, and inflection points by the stroke extraction means 14, together with the line segments (hereinafter called segments) connecting those feature points. The feature points 17₁, 17₂, ... are marked in the figure, and the segments 18 are labeled "A" through "N". For example, the segments connected to the feature point 17₁ are the segment 19 labeled "A", the segment 20 labeled "B", the segment 21 labeled "C", and the segment 22 labeled "D".

Furthermore, the stroke extraction means 14 extracts strokes by examining the direction of each segment at the feature points other than end points and joining segments. Stroke extraction is performed as follows.

First, consider a pair of segments (two segments together are called a segment pair) connected to a feature point Pᵢ. For the first segment, let the feature point Pᵢ be its end point and Pⱼ its other end, and obtain its direction vector from Pⱼ to Pᵢ. For the second segment, let the feature point Pᵢ be its start point and Pₖ its other end, and obtain its direction vector from Pᵢ to Pₖ. Letting θ be the angle between the direction vectors of the first and second segments at the feature point Pᵢ, the degree of agreement of the directions of the two segments is defined as cos θ. Among all the segments connected to the feature point Pᵢ, the segment pair whose degree of agreement is largest and is at or above a predetermined threshold is joined to form a stroke. This stroke-forming process is then repeated for the segments other than that segment pair.
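The joining rule can be sketched roughly as follows. This is a hypothetical reading of the procedure; the function names, the data layout (segments given by their far endpoints), and the threshold value are assumptions for illustration, not the patent's specification:

```python
import math

def direction(p_from, p_to):
    """Unit direction vector from p_from to p_to."""
    dx, dy = p_to[0] - p_from[0], p_to[1] - p_from[1]
    n = math.hypot(dx, dy)
    return (dx / n, dy / n)

def best_pair(pi, far_ends, threshold=0.9):
    """Among the segments meeting at feature point pi (each given by its far
    endpoint), find the pair whose directions agree most: maximize
    cos(theta) between (Pj -> Pi) and (Pi -> Pk), joining only when the
    agreement exceeds the threshold."""
    best, best_cos = None, threshold
    for j, pj in enumerate(far_ends):
        for k, pk in enumerate(far_ends):
            if j == k:
                continue
            u = direction(pj, pi)   # first segment, pointing into pi
            v = direction(pi, pk)   # second segment, pointing out of pi
            c = u[0] * v[0] + u[1] * v[1]  # cos(theta) of unit vectors
            if c > best_cos:
                best, best_cos = (j, k), c
    return best, best_cos

# Four segments meet at pi, with far endpoints left, right, up, down.
pi = (0.0, 0.0)
ends = [(-1.0, 0.0), (1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
pair, c = best_pair(pi, ends)
print(pair)  # (0, 1): the collinear left-right pair is joined
```

The collinear pair scores cos θ = 1 and is joined into one stroke; the perpendicular combinations score cos θ = 0 and fall below the threshold.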

FIG. 6 shows the strokes obtained by the stroke extraction means 14 using the above method. That is, the segments 19 and 22 of FIG. 5 are joined to extract the stroke 23 labeled "1", and the segments 20 and 21 are likewise joined to extract the stroke 24 labeled "2". Each extracted stroke is labeled with a stroke number.

Next, the stroke midpoint is obtained from each of these strokes. If the coordinates of the start point of a stroke are (Xs, Ys) and those of its end point are (Xt, Yt), the coordinates of the stroke midpoint are ((Xs + Xt)/2, (Ys + Yt)/2).

FIG. 7 is a diagram showing a stroke midpoint pattern 25 of the input character pattern extracted by the above processing, and the stroke midpoints are numbered in correspondence with the stroke numbers in FIG. 6.

FIG. 8 shows the stroke midpoint pattern 34 of the reference character, obtained by the same processing. The determination means 15 computes the similarity between the stroke midpoint pattern 25 of the input character and the stroke midpoint pattern 34 of the reference character stored in the determination means 15, and determines what the input character is.

Specifically, each of the stroke midpoints 35 to 42 of the reference character is first associated with the closest of the stroke midpoints 26 to 33 of the input character.

In this example, the stroke midpoints 26 to 33 are associated with the stroke midpoints 35 to 42 of the reference character.

Next, the distances between the associated pairs of points are summed, and the reciprocal of the sum is taken as the similarity of the input character to the reference character. Finally, the reference character with the largest similarity is determined to be the recognized character. Since recognition uses the stable stroke midpoints, higher recognition accuracy is obtained than with conventional methods.
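The decision procedure, nearest-midpoint association, summing the distances, taking the reciprocal as similarity, and choosing the best-scoring reference, could be sketched as follows. The labels, point sets, and handling of the zero-distance case are invented for illustration:

```python
import math

def similarity(input_pts, ref_pts):
    """Associate each reference midpoint with the nearest input midpoint,
    sum the pairwise distances, and return the reciprocal of the sum."""
    total = sum(min(math.dist(r, p) for p in input_pts) for r in ref_pts)
    # A zero sum means a perfect match; treat it as unbounded similarity.
    return float('inf') if total == 0 else 1.0 / total

def recognize(input_pts, references):
    """Return the label of the reference character with the largest similarity."""
    return max(references, key=lambda label: similarity(input_pts, references[label]))

# Hypothetical midpoint patterns for two reference characters.
refs = {
    'A': [(1, 1), (3, 1), (2, 3)],
    'B': [(0, 0), (4, 4), (4, 0)],
}
inp = [(1.1, 0.9), (3.0, 1.2), (2.1, 2.8)]
print(recognize(inp, refs))  # A
```

The input midpoints sit close to reference 'A', so its summed distance is small and its reciprocal similarity dominates.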

Although the above embodiment describes the case of recognizing handwritten kanji characters, the present invention is not limited to this and may be used to recognize characters with many straight line segments, such as handwritten katakana characters.

Furthermore, as the determination method of the determination means 15, a method was described in which the similarity is obtained from the distances between the stroke midpoint pattern of the input character and that of the reference character; however, the invention is not limited to this, and a method may instead be used in which a pre-weighted stroke midpoint pattern of the reference character is directly superimposed on the stroke midpoint pattern of the input character to obtain the similarity.

As explained above, according to this invention, recognition is performed using the stroke midpoint pattern extracted after thinning. The method is therefore stable against small variations in line-segment slope and in line width, and it is little affected by the joining or separation of character lines, which is a weakness of methods based on feature points such as end points and branch points, so handwritten characters can be recognized with high accuracy.

BRIEF DESCRIPTION OF THE DRAWINGS

FIGS. 1 and 2 are diagrams for explaining examples of variations in input character patterns; FIG. 3 is a block diagram showing an embodiment of the device of the present invention; FIG. 4 is a diagram showing an example of a thinned input character pattern; FIG. 5 is a diagram showing an example of an input character pattern from which segments have been extracted; FIG. 6 is a diagram showing an example of an input character pattern from which strokes have been extracted; FIG. 7 is a diagram showing an example of the stroke midpoint pattern of an input character; and FIG. 8 is a diagram showing an example of the stroke midpoint pattern of a reference character. In the figures, 11 is a form, 12 a scanning means, 13 a preprocessing means, 14 a stroke extraction means, and 15 a determination means. The same reference numerals in the figures denote the same or corresponding parts. Agent: 葛野信− (and one other).

Claims (1)

[Claims] A character recognition device for recognizing characters recorded on a recording medium such as a form, comprising: scanning means for scanning and photoelectrically converting the characters; preprocessing means for thinning the input character pattern obtained by the scanning means; stroke extraction means for extracting straight line segments from the thinned input character pattern; and determination means for determining a character using the midpoint pattern of the straight line segments obtained by the stroke extraction means; wherein the stroke extraction means extracts, from the input character pattern thinned by the preprocessing means, feature points such as end points, branch points, and inflection points, obtains the line segments connecting them, and extracts straight line segments by joining pairs of equally directed line segments connected at feature points other than end points; and a character is recognized using the similarity between the midpoint pattern of the straight line segments and the midpoint pattern of the straight line segments of a reference character pattern stored in advance in the determination means.
JP56121615A 1981-08-03 1981-08-03 Character recognition device Granted JPS5822479A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP56121615A JPS5822479A (en) 1981-08-03 1981-08-03 Character recognition device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP56121615A JPS5822479A (en) 1981-08-03 1981-08-03 Character recognition device

Publications (2)

Publication Number Publication Date
JPS5822479A true JPS5822479A (en) 1983-02-09
JPH026113B2 JPH026113B2 (en) 1990-02-07

Family

ID=14815633

Family Applications (1)

Application Number Title Priority Date Filing Date
JP56121615A Granted JPS5822479A (en) 1981-08-03 1981-08-03 Character recognition device

Country Status (1)

Country Link
JP (1) JPS5822479A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2010218128A (en) * 2009-03-16 2010-09-30 Ricoh Co Ltd Image processing apparatus and method, program, and recording medium


Also Published As

Publication number Publication date
JPH026113B2 (en) 1990-02-07

Similar Documents

Publication Publication Date Title
KR100322982B1 (en) Facial Image Processing Equipment
KR910010353A (en) Fingerprint Identification Method
Kiyko Recognition of objects in images of paper based line drawings
JPS5822479A (en) Character recognition device
US6208756B1 (en) Hand-written character recognition device with noise removal
JP2675303B2 (en) Character recognition method
KR102389066B1 (en) Face Image Generating Method for Recognizing Face
JP3140079B2 (en) Ruled line recognition method and table processing method
JP2000357231A (en) Image discrimination device
JPH02166583A (en) Character recognizing device
JPS58222384A (en) Discriminating system of font
JP3193573B2 (en) Character recognition device with brackets
JPS5812080A (en) Character recognizing device
JP2001060250A (en) Method and device for character recognition
JP2832035B2 (en) Character recognition device
JPH0830717A (en) Character recognition method and device therefor
JP2935331B2 (en) Figure recognition device
JP2974396B2 (en) Image processing method and apparatus
KR930000034B1 (en) Korean characters font dividing method using run length code
JP2962984B2 (en) Character recognition device
JPH03229386A (en) Character recognizing device
JPH0246988B2 (en)
JPH0353392A (en) Character recognizing device
JPS58201183A (en) Feature extracting method of handwritten character recognition
JPS6361382A (en) Character component removing method for linear image