JPS58225489A

JPS58225489A - Chinese character recognizing device

Info

Publication number: JPS58225489A
Application number: JP57108776A
Authority: JP
Inventors: Yoshihisa Fujii; 敬久藤井; Hiroshi Kamata; 洋鎌田
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1982-06-24
Filing date: 1982-06-24
Publication date: 1983-12-27
Also published as: JPH0253830B2

Abstract

PURPOSE: To obtain a sufficient recognition rate by collating each contour segment sequence obtained from the kanji (Chinese character) to be recognized with the corresponding contour segment sequence read out of a dictionary, proceeding along the sequence. CONSTITUTION: An input character is fed to a horizontal segment extracting circuit 12 and a vertical segment extracting circuit 13, which extract the respective segments. Their outputs pass through feature segment connecting circuits 14 and 15 and are stored in a left contour segment sequence buffer 16, a right contour segment sequence buffer 17, and a center contour segment sequence buffer 18, and in an upper contour segment sequence buffer 19, a lower contour segment sequence buffer 20, and a center contour segment sequence buffer 21. The buffer contents are then supplied to decision making circuits 23-28, where they are collated with the contour segment sequences from the dictionary. A decision making circuit 29 recognizes the kanji on the basis of the decision results of the decision making circuits 23-28.

Description

[Detailed Description of the Invention]

(A) Technical Field of the Invention

The present invention relates to a kanji recognition device, and in particular to a kanji recognition device that, when performing recognition processing on handwritten kanji for example, extracts a series of line segments for each of the left, right, upper, and lower parts of the character contour and collates each series, along the series, with the corresponding series stored in a dictionary with redundancy.

(B) Technical Background and Problems

Recognition of kanji, and of handwritten kanji in particular, is still at a rather difficult stage. One approach under consideration makes a global judgment by focusing on the background part of the kanji, but processing tends to become extremely complicated when one tries to capture the two-dimensional features of that background (or of the space inside the character). For this reason, attention turned to the use of contour line segments, which had previously been applied to handwritten katakana and the like.

(C) Object and Configuration of the Invention

The present invention aims to solve the above problem. The kanji recognition device of the present invention scans a kanji character to be recognized, extracts features, and collates them with the contents of a dictionary storing features corresponding to standard kanji characters, thereby determining the category of the kanji character to be recognized. The device is characterized as follows. Based at least on the left contour of the character obtained by a horizontal search directed leftward from the right, the right contour of the character obtained by a horizontal search directed rightward from the left, the upper contour of the character obtained by a search directed downward from above, and the lower contour of the character obtained by a search directed upward from below, the device extracts a series of left contour line segments along the left contour, a series of right contour line segments along the right contour, a series of upper contour line segments along the upper contour, and a series of lower contour line segments along the lower contour. The dictionary stores each contour line segment series with redundancy given to the series. Each contour line segment series obtained from the kanji character to be recognized is then collated, along the series, with the corresponding contour line segment series read out of the dictionary. The invention is described below with reference to the drawings.

(D) Embodiments of the Invention

Fig. 1 is an explanatory diagram illustrating the series of contour line segments according to the present invention, Fig. 2 is an explanatory diagram illustrating the series stored in the dictionary in the present invention, and Fig. 3 shows the configuration of an embodiment of the present invention.

When kanji 1 illustrated in Fig. 1(A), for example the handwritten kanji 資, is given, scanning 2 is performed in the direction of the arrow shown, and the following points are extracted on each scan:

i) point A, where the scan first reaches a black area from the left white area of the background;

ii) next, point α, where the scan passes from the black area to a white area;

iii) next, point B, where the scan passes from the white area to a black area;

iv) next, point b, where the scan passes from the black area to a white area;

v) finally, point n, where the scan passes from a black area to the right white area of the background.
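The per-scan extraction of points A, α, B, b, ..., n can be sketched as follows. This is a minimal illustration, not the circuit described in the patent: it assumes the character is available as a binary raster (0 for white, 1 for black), and the function name `scan_transitions` and the labels `enter` and `leave` are hypothetical.

```python
from typing import List, Tuple


def scan_transitions(row: List[int]) -> List[Tuple[int, str]]:
    """Return (x, kind) for every white/black change along one scan line.

    Assumption: row holds 0 for white and 1 for black, and the background
    outside the row is white. 'enter' marks a white-to-black change
    (points A, B, ...); 'leave' marks a black-to-white change
    (points α, b, ..., n).
    """
    transitions = []
    previous = 0  # background to the left of the row is white
    for x, pixel in enumerate(row):
        if pixel and not previous:
            transitions.append((x, "enter"))
        elif previous and not pixel:
            transitions.append((x, "leave"))
        previous = pixel
    if previous:  # a stroke touches the right edge of the row
        transitions.append((len(row), "leave"))
    return transitions


# One scan line crossing three strokes.
row = [0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 0]
points = scan_transitions(row)
```

In this sketch the first `enter` corresponds to point A and the last `leave` to point n; a `leave` position is the first white pixel after a black run.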

The points corresponding to point A on the successive scans are then linked, for example in the vertical direction, to extract left contour line segments such as L1, L2, ... shown in Fig. 1(B). If the horizontal positions of the points corresponding to point A on two vertically adjacent scans are separated by a threshold or more, the line segment is regarded as discontinuous at that place.
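The linking rule above can be sketched as follows, assuming the horizontal position of the point corresponding to A has already been found on each scan. The name `chain_segments`, the use of `None` for a scan that crosses no stroke, and the threshold value are assumptions made for illustration; the patent does not give a threshold.

```python
from typing import List, Optional, Tuple

Point = Tuple[int, int]  # (scan index y, horizontal position x)


def chain_segments(xs: List[Optional[int]], threshold: int) -> List[List[Point]]:
    """Group per-scan contour positions into vertically connected segments.

    xs[y] is the horizontal position of the contour point on scan y, or
    None when that scan crosses no stroke (an assumption of this sketch).
    A new segment starts when a scan has no point, or when two consecutive
    points are separated horizontally by `threshold` or more.
    """
    segments: List[List[Point]] = []
    current: List[Point] = []
    for y, x in enumerate(xs):
        if x is None:
            if current:
                segments.append(current)
            current = []
            continue
        if current and abs(x - current[-1][1]) >= threshold:
            segments.append(current)
            current = []
        current.append((y, x))
    if current:
        segments.append(current)
    return segments


left_points = [5, 5, 6, 12, 12, None, 3, 4]
segments = chain_segments(left_points, threshold=3)
```

With these sample values the jump from 6 to 12 and the empty scan each close a segment, giving three segments in the roles of L1, L2, and L3.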

In addition, for the start and end of each line segment, the case in which the end is blocked by a black area of the character (black circles in the figure) is distinguished from the case in which it is not blocked (white circles in the figure), and the ends are extracted accordingly.

In the same way, the points corresponding to point n shown in Fig. 1 are linked to extract right contour line segments such as R1, R2, ... shown in Fig. 1(B). For this extraction, a scan directed from the right side toward the left may of course be performed anew.

Furthermore, if necessary, in order to encode the white areas enclosed between strokes of the character, the midpoint between point α and point B shown in Fig. 1(A), the midpoint between point b and point N, and so on are extracted. These midpoints are linked in the vertical direction to extract what the present invention calls center contour line segments, and a series of them can be extracted.
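The midpoint extraction for one scan can be sketched as follows, again assuming a binary raster row (0 for white, 1 for black). The function name `gap_midpoints` is hypothetical, and the midpoint is taken between the two transition positions, which is one possible reading of "midpoint between point α and point B".

```python
from typing import List


def gap_midpoints(row: List[int]) -> List[float]:
    """Return the midpoint of each white run enclosed by black on both sides.

    Each midpoint lies halfway between a black-to-white transition (such
    as point α) and the following white-to-black transition (such as
    point B). White background on either side of the character is ignored.
    """
    midpoints = []
    leave_x = None  # position of the most recent black-to-white change
    previous = 0    # background to the left of the row is white
    for x, pixel in enumerate(row):
        if previous and not pixel:
            leave_x = x
        elif pixel and not previous and leave_x is not None:
            midpoints.append((leave_x + x) / 2)
        previous = pixel
    return midpoints


row = [0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 0]
centres = gap_midpoints(row)
```

Linking these midpoints from scan to scan, in the same way as the left contour points, would give the center contour line segments.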

From the left contour line segments L1, L2, ... and the right contour line segments R1, R2, ... extracted as shown in Fig. 1(B), a left contour line segment series 4 and a right contour line segment series 5 are formed in the direction of arrow 3, as shown in Fig. 2(A). Needless to say, these series represent the contour features of the kanji 1 to be recognized.

In the present invention, the contour line segment series obtained in this way is collated sequentially, in the direction of arrow 3, with the contour line segment series in the dictionary. Fig. 2(B) shows the left contour line segment series 6 in the dictionary corresponding to the character 資. In the figure, reference numerals 7 and 8 indicate that either is acceptable, 9 and 10 indicate elements that may be omitted, and 11 indicates the end point of the series. The redundancy given to series 6 in the dictionary, as at 7 and 8 or at 9 and 10, can be regarded as a way of coping with deformation of the character.
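Collation against a dictionary series with this kind of redundancy can be sketched as follows. Fig. 2(B) is not reproduced here, so the encoding is an assumption: each dictionary element lists the codes it accepts ("either is acceptable") and carries a flag saying whether it may be omitted, and reaching the end of the list plays the role of the series end point. The segment codes "V", "H", and "D" and the names `Element` and `matches` are hypothetical.

```python
from typing import FrozenSet, List, NamedTuple


class Element(NamedTuple):
    accepted: FrozenSet[str]  # any one of these codes is acceptable
    optional: bool = False    # True if the element may be omitted


def matches(observed: List[str], entry: List[Element]) -> bool:
    """Collate an observed series with a dictionary series, in order."""
    # reachable[i] is True when the first i observed codes can be
    # explained by the dictionary elements processed so far.
    reachable = [True] + [False] * len(observed)
    for element in entry:
        following = [False] * (len(observed) + 1)
        for i, ok in enumerate(reachable):
            if not ok:
                continue
            if element.optional:
                following[i] = True  # skip the omissible element
            if i < len(observed) and observed[i] in element.accepted:
                following[i + 1] = True  # consume one observed code
        reachable = following
    # Both series must end together.
    return reachable[len(observed)]


entry = [
    Element(frozenset({"V"})),
    Element(frozenset({"H", "D"})),            # either code is acceptable
    Element(frozenset({"D"}), optional=True),  # may be omitted
    Element(frozenset({"V"})),
]
ok_plain = matches(["V", "H", "V"], entry)
ok_variant = matches(["V", "D", "D", "V"], entry)
rejected = matches(["V", "V"], entry)
```

Processing the dictionary elements one at a time keeps the collation sequential along the series, in the spirit of the comparison in the direction of arrow 3.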

As explained above with particular reference to the left contour line segment series, four kinds of contour line segment series are extracted: (i) the left contour line segment series, (ii) the right contour line segment series, (iii) the upper contour line segment series, and (iv) the lower contour line segment series. In addition, two kinds of line segment series lying between character strokes are extracted: (v) the center contour line segment series obtained by horizontal scanning and (vi) the center contour line segment series obtained by vertical scanning. Collation of the kind described above is performed for each of them.
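The four outer contours can be located with one routine by treating the columns of the raster as scan lines for the vertical direction. The sketch below assumes a binary image given as a list of rows (0 for white, 1 for black) and records the outermost black pixel met from each side; the names `first_black`, `last_black`, and `contour_profiles` are hypothetical.

```python
from typing import Dict, List, Optional


def first_black(line: List[int]) -> Optional[int]:
    """Index of the first black (1) pixel met along a line, or None."""
    for index, pixel in enumerate(line):
        if pixel:
            return index
    return None


def last_black(line: List[int]) -> Optional[int]:
    """Index of the last black pixel, found by searching from the far end."""
    hit = first_black(line[::-1])
    return None if hit is None else len(line) - 1 - hit


def contour_profiles(image: List[List[int]]) -> Dict[str, List[Optional[int]]]:
    """Outermost black position per scan for the four outer contours."""
    columns = [list(column) for column in zip(*image)]
    return {
        "left": [first_black(row) for row in image],
        "right": [last_black(row) for row in image],
        "upper": [first_black(column) for column in columns],
        "lower": [last_black(column) for column in columns],
    }


image = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]
profiles = contour_profiles(image)
```

Each profile would then be split into line segments and encoded as a series before collation.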

Fig. 3 shows the configuration of an embodiment of the present invention.

In the figure, reference numeral 12 denotes a horizontal line segment extraction circuit; 13, a vertical line segment extraction circuit; 14 and 15, feature line segment series (line segment series) connecting circuits, each of which obtains a series such as the line segments L1, L2, ... shown in Fig. 1(B); 16, a left contour line segment series buffer; 17, a right contour line segment series buffer; 18, a center contour line segment series buffer; 19, an upper contour line segment series buffer; 20, a lower contour line segment series buffer; 21, a center contour line segment series buffer; 22, a dictionary; 23 to 28, collation determination circuits; and 29, a decision circuit.

The feature line segment connecting circuit 14 examines how the line segments L1, L2, ... and R1, R2, ... described with reference to Fig. 1(B) are connected, and stores the results in the corresponding buffers 16, 17, and 18. The feature line segment connecting circuit 15 handles vertical scanning and operates in the same way as circuit 14.

The contents of buffers 16 to 21 are collated by the collation determination circuits 23 to 28 with the respective line segment series read out of the dictionary 22. The collation can be regarded as proceeding sequentially in the direction of arrow 3 shown in Fig. 2(A).

The determination results from the collation determination circuits 23 to 28 are passed to the decision circuit 29, which examines them comprehensively and determines the category of the kanji character to be recognized.
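The text does not say how the decision circuit weighs the six determination results. Purely as an assumed illustration, the sketch below counts how many of the six series matched for each candidate category and accepts the category only when it is the unique best; the function name `decide`, the dictionary layout, and the voting rule are all hypothetical.

```python
from typing import Dict, Optional


def decide(results: Dict[str, Dict[str, bool]]) -> Optional[str]:
    """Pick the category whose collation results agree best.

    results[category][series_name] is the verdict of one collation
    circuit for that category. Returns None when there are no candidates
    or when the best score is shared by several categories.
    """
    scores = {category: sum(verdicts.values()) for category, verdicts in results.items()}
    if not scores:
        return None
    best = max(scores.values())
    winners = [category for category, score in scores.items() if score == best]
    return winners[0] if len(winners) == 1 else None


series_names = ["left", "right", "center_h", "upper", "lower", "center_v"]
results = {
    "資": {name: True for name in series_names},
    "貸": {name: name in ("left", "lower") for name in series_names},
}
category = decide(results)
```

A stricter rule, such as requiring all six series to match, would fit the same interface.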

(E) Effects of the Invention

As described above, according to the present invention, the contour line segment series of a character can be used for recognition of handwritten kanji and the like. Two-dimensional features of the character to be recognized are easy to extract, and a sufficient recognition rate can be obtained by preparing dictionary entries that carry some redundancy.

[Brief explanation of the drawing]

Fig. 1 is an explanatory diagram illustrating the series of contour line segments according to the present invention, Fig. 2 is an explanatory diagram illustrating the series stored in the dictionary in the present invention, and Fig. 3 shows the configuration of an embodiment of the present invention. In the figures, 1 denotes the kanji to be recognized; 2, a scanning line; 4, 5, and 6, contour line segment series; 12 and 13, line segment extraction circuits; 14 and 15, line segment series connecting circuits; 16 to 21, line segment series buffers; 22, a dictionary; 23 to 28, collation determination circuits; and 29, a decision circuit.

Patent applicant: Fujitsu Ltd. Agent: Patent Attorney Hiroshi Mori (and one other)

Claims

[Claims]

A kanji recognition device that scans a kanji character to be recognized, extracts features, and determines the category of the kanji character to be recognized by collating the features with the contents of a dictionary storing features corresponding to standard kanji characters, characterized in that: based at least on the left contour of the character obtained by a horizontal search directed leftward from the right, the right contour of the character obtained by a horizontal search directed rightward from the left, the upper contour of the character obtained by a search directed downward from above, and the lower contour of the character obtained by a search directed upward from below, the device extracts a series of left contour line segments along the left contour of the character, a series of right contour line segments along the right contour of the character, a series of upper contour line segments along the upper contour of the character, and a series of lower contour line segments along the lower contour of the character; each contour line segment series is stored in the dictionary with redundancy given to the series; and each contour line segment series obtained from the kanji character to be recognized is collated, along the series, with the corresponding contour line segment series read out of the dictionary.