JPH0646418B2

JPH0646418B2 - Feature extraction method

Info

Publication number: JPH0646418B2
Application number: JP62294057A
Authority: JP
Inventors: 隆博小川; 晃治伊東; 義征山下
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 1987-11-24
Filing date: 1987-11-24
Publication date: 1994-06-15
Anticipated expiration: 2009-06-15
Also published as: JPH01136287A

Description

DETAILED DESCRIPTION OF THE INVENTION (Field of Industrial Application) The present invention relates to a feature extraction method applied to character recognition devices and the like, and in particular to a feature extraction method that effectively and stably extracts the differences between patterns that differ only minutely.

In conventional recognition of characters and figures, many systems extract strokes from the character/figure pattern and recognize the character from the positions and lengths of the extracted strokes, the mutual relationships between strokes, and so on. The strokes are extracted either by (1) computing the curvature along the contour point series obtained by tracing the contour of the character/figure, splitting the contour series at points of large curvature, and combining the resulting segments, or by (2) thinning the character/figure pattern into a skeleton and tracing the connectivity of the skeleton pattern to detect points where the angle changes sharply. Geometric features are then extracted from the strokes obtained by (1) or (2) and used for identification. In addition, (3) Japanese Unexamined Patent Publication No. 60-230282 proposes a method that measures the distance from each side of the character rectangular frame to the character/figure pattern, uses those distances to extract the protrusions of the pattern facing each side of the frame, and performs character recognition using the information on the extracted protrusions.

(Problems to Be Solved by the Invention) However, in method (1) the amount of processing grows as the character/figure pattern becomes larger or more complex, which lowers the processing speed. Method (2) requires the pattern to be thinned, and thinning introduces distortion, spurious whiskers and similar defects that complicate the subsequent processing. In method (3) the extracted features become unstable when the shape of the input character is deformed, and the effect is especially pronounced between similar characters such as "水" (water) and "氷" (ice), or "大" (large) and "太" (thick).

The object of the present invention is to provide a feature extraction method that avoids the highly complex processing of conventional recognition methods, such as contour tracing and thinning. For each of the (M × N) regions obtained by dividing the character/figure pattern on the basis of its peripheral (projection) distributions, the method extracts information on the extreme points that satisfy predetermined conditions from the outer contour coordinate series representing the outline of the pattern. It is thereby stable against variations in character shape and effectively extracts the differences between patterns that differ only minutely.

(Means for Solving the Problems) The present invention concerns a feature extraction method that has a pattern register storing, as an original pattern, the pattern obtained by reading and quantizing a character/figure on a medium, and that extracts the features of the character/figure from that original pattern. To solve the problems of the prior art described above, it provides:
(a) means for detecting the character rectangular frame formed by the circumscribing sides of the original pattern stored in the pattern register;
(b) means for projecting the original pattern onto each of two desired axes to obtain the respective black bit count distributions;
(c) means for determining, on each axis, first the centroid coordinate of the black bit count distribution over the range bounded by the character rectangular frame, and then, over several repetitions, the centroid coordinates of the distribution over each of the sub-ranges into which the centroid coordinates found so far divide that range, thereby obtaining a centroid coordinate series for each axis, and for determining from those centroid coordinates the division coordinates that divide the interior of the character rectangular frame into (M × N) regions;
(d) means for scanning the interior of the character rectangular frame of the original pattern from each side of the frame in the direction perpendicular to that side, and detecting the coordinate series of the first change points from background to character;
(e) means for calculating the amount of change at each change point from the change point coordinate series;
(f) means for detecting, from those amounts of change, the extreme points at which the direction of increase or decrease reverses; and
(g) means for extracting, for each of the (M × N) regions, information on the extreme points that satisfy predetermined conditions.

(Operation) With the feature extraction method configured as above, the technical means operate as follows.

Means (a) scans the original pattern stored in the pattern register to detect the character rectangular frame, that is, the circumscribing frame of the character/figure, and outputs it to means (b), (c) and (d). Means (b) creates from the original pattern the black bit count distribution along each of the two desired axes (for example, the X and Y axes) and outputs the distributions to means (c). Means (c) receives the character rectangular frame from means (a) and the black bit count distributions from means (b), detects the centroid division coordinates of the original pattern on the two axes from these peripheral distributions, and outputs the division coordinates to means (g). Means (d) scans perpendicularly from each of the four sides (top, bottom, left and right) of the character rectangular frame, detects the coordinate of the first change point over the full extent of each side to form a change point coordinate series, and outputs it to means (e). Means (e) calculates the amount of change at each change point from the series supplied by means (d) and outputs it to means (f). Means (f) detects, from the amounts of change supplied by means (e), the extreme points at which the increase or decrease reverses, and outputs them to means (g). Means (g) receives the division coordinates from means (c) and the extreme point information from means (f), extracts for each of the (M × N) regions the extreme points that satisfy the predetermined conditions, and provides them as features for character/figure recognition. Differences between patterns that differ only minutely can therefore be extracted effectively and stably, which solves the problems of the prior art.

(Embodiment) An embodiment of the present invention is described in detail below with reference to the drawings.

FIG. 1 is a block diagram showing the configuration of a character recognition device to which the feature extraction method of the present invention is applied.

In FIG. 1, 1 is the optical input from a form and 14 is the character name output of the character recognition device of this embodiment. The device consists of a photoelectric conversion unit 2, a pattern register 3, a character frame detection unit 4, a character projection creation unit 5, a centroid detection unit 6, a character frame division point determination unit 7, an outer contour coordinate series detection unit 8, an extreme point coordinate series detection unit 9, an extreme point counting unit 10, an extreme point counting region table 11, an identification unit 12 and a dictionary memory 13.

The optical input 1 is fed to the photoelectric conversion unit 2. The photoelectric conversion unit 2 decomposes one prospective character area into 128 × 128 pixels and converts each pixel into a binary digital signal (hereinafter called the input character pattern); a character of average size is represented by an input character pattern of roughly 60 × 60 bits. The pattern register 3 stores the input character pattern in a form that allows the X and Y coordinates of every pixel in the prospective character area to be reproduced, and has a capacity of 128 × 128 bits to match that area.

In this embodiment the character frame detection unit 4 detects the circumscribing frame of the character, expressed by its left end coordinate X_l, right end coordinate X_r, top end coordinate Y_t and bottom end coordinate Y_b in the pattern register 3.
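
The frame detection just described can be sketched as follows. This is a minimal illustration only: it assumes the pattern register is held as a list of rows with `pattern[y][x] == 1` for a black bit, and the function name `detect_character_frame` is hypothetical, not taken from the patent.

```python
def detect_character_frame(pattern):
    """Return (x_l, x_r, y_t, y_b) enclosing all black bits, or None if blank.

    Assumption: pattern[y][x] == 1 marks a black bit, 0 a background bit.
    """
    # x positions of every black bit, and the rows containing any black bit.
    xs = [x for row in pattern for x, bit in enumerate(row) if bit]
    ys = [y for y, row in enumerate(pattern) if any(row)]
    if not xs:
        return None  # blank pattern: no circumscribing frame exists
    return min(xs), max(xs), min(ys), max(ys)


cross = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
print(detect_character_frame(cross))  # (1, 3, 1, 3)
```

Returning `None` for a blank pattern is a choice made for this sketch; the source does not say how an empty prospective character area is handled.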

The character projection creation unit 5 projects the input character pattern in the pattern register 3 onto two desired axes, in this embodiment the X and Y axes (the horizontal and vertical directions of the two-dimensional coordinates of the pattern register 3), and creates the black bit count distributions SX(x) and SY(y). These distributions are given by the following equations.

$$SX(x) = \sum_{y=Y_t}^{Y_b} P(x, y) \tag{1}$$

$$SY(y) = \sum_{x=X_l}^{X_r} P(x, y) \tag{2}$$

Here x and y are two-dimensional coordinates in the pattern register 3, each ranging from 0 to 127; Y_t and Y_b are the top and bottom end coordinates of the character frame along the Y axis; X_l and X_r are its left and right end coordinates along the X axis; and P(x, y) denotes a black or white bit, taking P(x, y) = 1 for a black bit (significant color) and P(x, y) = 0 for a white bit (background color).
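
As a rough illustration of the projection step, the sketch below counts black bits per column and per row inside the character frame. The list-of-rows layout (`pattern[y][x]`) and the name `black_bit_distributions` are assumptions made for this example.

```python
def black_bit_distributions(pattern, frame):
    """Return (sx, sy): black bit counts per x and per y inside the frame.

    Assumptions: pattern[y][x] is 1 for black, 0 for white, and
    frame = (x_l, x_r, y_t, y_b) with inclusive bounds.
    """
    x_l, x_r, y_t, y_b = frame
    sx = [0] * len(pattern[0])  # SX(x): projection onto the X axis
    sy = [0] * len(pattern)     # SY(y): projection onto the Y axis
    for y in range(y_t, y_b + 1):
        for x in range(x_l, x_r + 1):
            if pattern[y][x]:
                sx[x] += 1
                sy[y] += 1
    return sx, sy


cross = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
sx, sy = black_bit_distributions(cross, (1, 3, 1, 3))
print(sx)  # [0, 1, 3, 1, 0]
print(sy)  # [0, 1, 3, 1, 0]
```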

FIG. 2(b) and (c) show the black bit count distributions SX(x) and SY(y) for the input character patterns (FIG. 2(a)) of the kanji "水" and "氷", respectively.

The centroid detection unit 6 obtains the centroid coordinate series X(M_p) and Y(M_q) of the black bit count distributions SX(x) and SY(y) of the input character pattern. It takes as its ranges first the full extent of the character frame along each axis, X_l to X_r and Y_t to Y_b, and then each sub-range produced by the centroid coordinates detected in the preceding pass. Each centroid is obtained by dividing the sum of the first-order moments over the range by the sum of the black bits over that range. M_p and M_q are centroid coordinate numbers assigned in ascending order of coordinate value, with M_p = 1 to MX (MX being the number of centroids along the X axis) and M_q = 1 to MY (MY being the number of centroids along the Y axis). The following description covers the case of detecting seven centroid coordinates X(M_p). First, over the X-axis range of the character frame, X_l to X_r, the centroid coordinate X(4), which carries the middle centroid coordinate number 4, is obtained by dividing the first-order moment sum of the black bit count distribution SX(x) by the black bit sum over that range, as in equation (3).

$$X(4) = \frac{\sum_{x=X_l}^{X_r} x \cdot SX(x)}{\sum_{x=X_l}^{X_r} SX(x)} \tag{3}$$

Next, the two centroid coordinates X(2) and X(6) are obtained by equation (4) over the two ranges produced by the centroid coordinate X(4), namely X_l to X(4) and X(4) to X_r.

Then the four centroid coordinates X(1), X(3), X(5) and X(7) are obtained by equation (5) over the ranges produced by the centroid coordinates detected so far, X(2), X(4) and X(6), namely X_l to X(2), X(2) to X(4), X(4) to X(6) and X(6) to X_r.

The centroid coordinates Y(M_q) along the Y axis are detected in the same way. When the number of centroid coordinates MY to be detected is seven, the centroid coordinate Y(4) of the black bit distribution SY(y) is first detected over the character frame range Y_t to Y_b. The centroid coordinates Y(2) and Y(6) of SY(y) are then detected over the two ranges into which that centroid splits the character frame, Y_t to Y(4) and Y(4) to Y_b. Finally, the centroid coordinates Y(1), Y(3), Y(5) and Y(7) of SY(y) are detected over the ranges into which the centroid coordinates detected so far, Y(2), Y(4) and Y(6), divide the character frame along the Y axis, namely Y_t to Y(2), Y(2) to Y(4), Y(4) to Y(6) and Y(6) to Y_b. A total of seven centroid coordinates Y(1) to Y(7) are thus detected.
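
The repeated centroid splitting can be sketched as below. Several details are assumptions rather than statements from the source: sub-ranges share their boundary coordinate and are inclusive at both ends, the quotient is truncated to an integer, and an empty range falls back to its midpoint. The names `centroid` and `centroid_series` are hypothetical.

```python
def centroid(dist, lo, hi):
    """Centroid of dist over lo..hi inclusive: first moment / black bit sum."""
    mass = sum(dist[lo:hi + 1])
    if mass == 0:
        return (lo + hi) // 2  # assumption: midpoint when the range is empty
    moment = sum(i * dist[i] for i in range(lo, hi + 1))
    return moment // mass      # assumption: fractional part is truncated


def centroid_series(dist, lo, hi, levels=3):
    """Return 2**levels - 1 centroid coordinates in ascending order.

    With levels=3 this yields seven coordinates, found in the order
    (4), then (2) and (6), then (1), (3), (5) and (7).
    """
    bounds = [lo, hi]
    for _ in range(levels):
        # One centroid per current sub-range.
        new = [centroid(dist, a, b) for a, b in zip(bounds, bounds[1:])]
        # Interleave old boundaries and new centroids, keeping the order.
        merged = [bounds[0]]
        for c, b in zip(new, bounds[1:]):
            merged += [c, b]
        bounds = merged
    return bounds[1:-1]  # drop the frame ends, keep only the centroids


print(centroid_series([1] * 16, 0, 15))  # [1, 3, 5, 7, 9, 11, 13]
```

For a uniform distribution the centroids fall at evenly spaced positions; for a real character the spacing follows the density of black bits, which is what makes the division adapt to the character shape.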

FIG. 2(b) and (c) also illustrate, together with the black bit count distributions, the centroid coordinates X(1) to X(7) and Y(1) to Y(7) for the input character patterns (FIG. 2(a)) of the kanji "水" and "氷".

The character frame division point determination unit 7 takes the centroid coordinate series X(M_p) and Y(M_q) received from the centroid detection unit 6 as division coordinate candidates. It determines the division coordinate series DX(k_i) and DY(k_j) by associating the centroid coordinate numbers M_p and M_q with the division coordinate numbers k_i and k_j that divide the character frame of the input character pattern into NX by NY division unit regions.

In this embodiment the division unit regions can take four formats along the X axis, with the number of divisions NX = 4, 5, 6 or 8, and four formats along the Y axis, with the number of divisions NY = 4, 5, 6 or 8. With the X-axis division coordinate number k_i (k_i = 1 to NX − 1; NX = 4, 5, 6, 8) and the Y-axis division coordinate number k_j (k_j = 1 to NY − 1; NY = 4, 5, 6, 8), the unit determines the division coordinate series DX(k_i) and DY(k_j) that divide the character frame into NX by NY division unit regions. To associate the centroid coordinate numbers M_p and M_q with the division coordinate numbers k_i and k_j, a table such as the one shown in Table 1 is provided. The centroid coordinate numbers M_p and M_q corresponding to the numbers of divisions NX and NY are read from this table, and the centroid coordinates X(M_p) and Y(M_q) for those numbers are adopted as the division coordinates DX(k_i) and DY(k_j).

This table is for the case where the numbers of centroid coordinates MX and MY detected by the centroid detection unit 6 are both seven. In the general case a table can be built in the same way, by distributing the centroid coordinates across the divisions in each of the X and Y directions and, when surplus centroid coordinates remain, letting the regions take one additional centroid coordinate each, starting from the regions at both ends.

FIG. 3 shows, for the case where NX = NY = 5 is specified as the number of divisions along the X and Y axes, the correspondence between the division coordinate series DX(k_i), DY(k_j) and the centroid coordinate series X(M_p), Y(M_q), together with the division unit regions (k_i, k_j) defined by those division coordinate series.

The numbers of divisions NX and NY are chosen according to the complexity of the input character. If a character is rejected, NX and NY are changed and character recognition is attempted again.

As described above, the character frame division point determination unit 7 supports four division formats along the X axis (NX = 4, 5, 6, 8) and four along the Y axis (NY = 4, 5, 6, 8). To keep the explanation simple, the rest of this embodiment is described with NX = NY = 8. In this case the division coordinates DX(1), DX(2), DX(3), DX(4), DX(5), DX(6) and DX(7) are determined along the X axis in correspondence with the centroid coordinates X(1), X(2), X(3), X(4), X(5), X(6) and X(7), and the division coordinates DY(1), DY(2), DY(3), DY(4), DY(5), DY(6) and DY(7) are determined along the Y axis in correspondence with the centroid coordinates Y(1), Y(2), Y(3), Y(4), Y(5), Y(6) and Y(7).
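
Once the division coordinates are fixed, finding which division unit region a coordinate falls in is a simple lookup. The sketch below is illustrative only: it numbers regions from 0, and it assumes that a coordinate equal to a division coordinate belongs to the region on the higher side, neither of which the source specifies. The name `region_index` is hypothetical.

```python
from bisect import bisect_right


def region_index(coord, division_coords):
    """Index (0 .. len(division_coords)) of the region containing coord.

    division_coords is the ascending list DX(1..NX-1) or DY(1..NY-1).
    Assumption: a coordinate on a division line goes to the higher region.
    """
    return bisect_right(division_coords, coord)


dx = [10, 20, 30]            # three division coordinates give four regions
print(region_index(5, dx))   # 0
print(region_index(10, dx))  # 1
print(region_index(35, dx))  # 3
```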

In this embodiment the centroid coordinates and the division coordinates are associated by means of a table, but the same association can also be obtained by executing the computation shown in the flowchart of FIG. 4. All divisions in FIG. 4 truncate the fractional part.

In FIG. 4, step S1 obtains M_α, the number of centroids MX divided by the number of divisions NX, and steps S2 and S3 obtain the remainder R₁ of MX / NX and the value R₂ obtained by dividing R₁ by 2. Step S4 obtains the median k_α of the division numbers, and steps S5 and S6 set the division number k_i and the centroid number M_p to 0. In steps S7, S8 and S9, each time the division number k_i is increased by one, the previously set quotient R₂ is decreased by one and the centroid number M_p is increased by M_α. Step S10 checks whether the quotient R₂ is non-negative. As long as it is, step S11 increases the centroid number by one and step S12 associates that centroid number M_p with the division number k_i to determine the division coordinate DX(M_p). When the quotient R₂ is negative, step S13 determines whether the current division number k_i is greater than the median k_α; if it is, the centroid number is increased by one, and if it is not, the division coordinate DX(M_p) is determined from the centroid number set in step S9. Step S14 detects that the division number k_i has reached (NX − 1), and the procedure ends.

The outer contour coordinate series detection unit 8 works on the interior of the character frame. Over the full extent of each of the four sides of the frame (top, bottom, left and right), it scans in the direction perpendicular to the side, detects the coordinate of the first black point, and creates the outer contour coordinate series GT(x), GB(x), GL(y) and GR(y). Here x and y are two-dimensional coordinates in the pattern register 3, each ranging from 0 to 127. GT stores the y coordinates found in this way for the top side, GB the y coordinates for the bottom side, GL the x coordinates for the left side, and GR the x coordinates for the right side.
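
The four scans can be sketched as follows, again assuming a list-of-rows pattern with `pattern[y][x] == 1` for black. Storing `None` for a scan line that meets no black point is an assumption of this sketch; the source does not describe that case. The name `outer_contours` is hypothetical.

```python
def outer_contours(pattern, frame):
    """Return (gt, gb, gl, gr) as dicts keyed by x (gt, gb) or y (gl, gr).

    Each value is the coordinate of the first black point met when scanning
    inward from the top, bottom, left or right side of the frame.
    """
    x_l, x_r, y_t, y_b = frame
    gt, gb, gl, gr = {}, {}, {}, {}
    for x in range(x_l, x_r + 1):
        hits = [y for y in range(y_t, y_b + 1) if pattern[y][x]]
        gt[x] = hits[0] if hits else None   # first black point from the top
        gb[x] = hits[-1] if hits else None  # first black point from the bottom
    for y in range(y_t, y_b + 1):
        hits = [x for x in range(x_l, x_r + 1) if pattern[y][x]]
        gl[y] = hits[0] if hits else None   # first black point from the left
        gr[y] = hits[-1] if hits else None  # first black point from the right
    return gt, gb, gl, gr


cross = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
gt, gb, gl, gr = outer_contours(cross, (1, 3, 1, 3))
print(gt)  # {1: 2, 2: 1, 3: 2}
print(gr)  # {1: 2, 2: 3, 3: 2}
```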

FIG. 5 shows the values of the outer contour coordinate series from the top side for the input character pattern of the kanji "水", and FIG. 8 shows the corresponding values for the input character pattern of the kanji "氷".

The extreme point coordinate series detection unit 9 examines, over the ranges X_l to X_r and Y_t to Y_b, the increase and decrease of each outer contour coordinate series detected by the outer contour coordinate series detection unit 8, using equations (6) to (9). Ignoring the points where the value of the equation is zero, it detects the coordinate series of the extreme points at which the direction of increase or decrease changes: (TX(n₁), TY(n₁)), (BX(n₂), BY(n₂)), (LX(n₃), LY(n₃)) and (RX(n₄), RY(n₄)).

$$dGT(x) = GT(x+1) - GT(x), \quad X_l \le x < X_r \tag{6}$$

$$dGB(x) = GB(x+1) - GB(x), \quad X_l \le x < X_r \tag{7}$$

$$dGL(y) = GL(y+1) - GL(y), \quad Y_t \le y < Y_b \tag{8}$$

$$dGR(y) = GR(y+1) - GR(y), \quad Y_t \le y < Y_b \tag{9}$$

The start points (X_l, GT(X_l)), (X_l, GB(X_l)), (GL(Y_t), Y_t), (GR(Y_t), Y_t) and the end points (X_r, GT(X_r)), (X_r, GB(X_r)), (GL(Y_b), Y_b), (GR(Y_b), Y_b) are unconditionally treated as extreme points. Here n₁, n₂, n₃ and n₄ are the coordinate numbers of the extreme points for the respective outer contour coordinate series, with 2 ≤ n₁ ≤ Tn, 2 ≤ n₂ ≤ Bn, 2 ≤ n₃ ≤ Ln and 2 ≤ n₄ ≤ Rn. Tn is the total number of extreme points in the outer contour coordinate series GT for the top side, Bn the total in GB for the bottom side, Ln the total in GL for the left side, and Rn the total in GR for the right side. (TX(n₁), TY(n₁)) is the extreme point coordinate series in GT for the top side, (BX(n₂), BY(n₂)) the series in GB for the bottom side, (LX(n₃), LY(n₃)) the series in GL for the left side, and (RX(n₄), RY(n₄)) the series in GR for the right side.
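
A minimal sketch of the extreme point detection for one contour series is given below. It follows the rule of ignoring zero differences and always keeping the two end points. Where a plateau separates a rise from a fall, the sketch reports the position at which the new direction begins; that placement is an assumption, since the source does not say which plateau point is taken. The name `extreme_points` is hypothetical.

```python
def extreme_points(g, lo, hi):
    """Return [(position, value), ...] of extreme points of g over lo..hi.

    g maps a position to a contour coordinate (a list or a dict works).
    The end points are always included; interior points are included where
    the sign of the non-zero difference g[p+1] - g[p] reverses.
    """
    points = [(lo, g[lo])]        # start point: unconditionally extreme
    prev = 0                      # last non-zero difference seen so far
    for p in range(lo, hi):
        d = g[p + 1] - g[p]
        if d == 0:
            continue              # flat steps are ignored
        if prev != 0 and (d > 0) != (prev > 0):
            points.append((p, g[p]))  # direction of change reversed here
        prev = d
    if hi != lo:
        points.append((hi, g[hi]))    # end point: unconditionally extreme
    return points


print(extreme_points([5, 3, 3, 4, 6, 2, 2], 0, 6))
# [(0, 5), (2, 3), (4, 6), (6, 2)]
```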

FIG. 6 shows an example of the extreme point coordinate series from the top side for the input character pattern of the kanji "水", and FIG. 9 shows an example of the corresponding series for the input character pattern of the kanji "氷".

The extreme point counting unit 10 looks up three items in the extreme point counting region table 11. The first is SE, which indicates the type (top, bottom, left or right side) of the target extreme point coordinate series (PTX(pn₁), PTY(pn₁)), (PBX(pn₂), PBY(pn₂)), (PLX(pn₃), PLY(pn₃)), (PRX(pn₄), PRY(pn₄)). The second is the pair formed by the start division point number S and the end division point number E of the target region along the reference side from which that series was extracted (S ≤ E ≤ NX when SE indicates the top or bottom side, and S ≤ E ≤ NY when it indicates the left or right side). The third is the region-end division point number D of the target region in the direction perpendicular to the reference side, whose range likewise depends on whether SE indicates the top or bottom side or the left or right side. For the extreme point coordinate series selected by the type SE, the unit takes as its target the region enclosed by the division point that the character frame division point determination unit 7 determined for the start division point number S, the division point it determined for the end division point number E, the reference side, and the division point it determined for the region-end division point number D. Taking the direction toward the reference side as positive, it counts the extreme points in that region that are local maxima and local minima as the number of maxima n_max(i) and the number of minima n_min(i), respectively.

Note that in this embodiment the type SE = 1 denotes the pole coordinate series (TX(n₁), TY(n₁)) whose reference side is the top side, SE = 2 denotes the pole coordinate series (BX(n₂), BY(n₂)) whose reference side is the bottom side, SE = 3 denotes the pole coordinate series (LX(n₃), LY(n₃)) whose reference side is the left side, and SE = 4 denotes the pole coordinate series (RX(n₄), RY(n₄)) whose reference side is the right side. The index i is the number of each set when a type SE, a start division point number S, an end division point number E and an area end division point number D in the pole counting area table 11 are taken as one set; 1 ≦ i ≦ i_max, where i_max is the total number of sets in the pole counting area table 11. For S, E and D, the value zero denotes x_l or Y_t, and the value NX or NY denotes x_r or Y_b.
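One way to picture the pole counting area table 11 is as a list of (SE, S, E, D) records numbered i = 1 to i_max. The sketch below is only an illustration under stated assumptions: `AreaEntry`, `validate` and `enumerate_table` are hypothetical names, the lower bound 0 for S follows from zero denoting x_l or Y_t, and the range check on D (0 to NY for top/bottom, 0 to NX for left/right) is an assumption rather than something stated in the text.

```python
from dataclasses import dataclass
from typing import Iterator, Sequence, Tuple

# SE codes as defined in the embodiment.
SIDE_NAMES = {1: "top", 2: "bottom", 3: "left", 4: "right"}


@dataclass(frozen=True)
class AreaEntry:
    se: int  # type: which side is the reference side
    s: int   # start division point number along the reference side
    e: int   # end division point number along the reference side
    d: int   # area end division point number, perpendicular direction


def validate(entry: AreaEntry, nx: int, ny: int) -> None:
    """Raise ValueError if the entry violates the index ranges."""
    if entry.se not in SIDE_NAMES:
        raise ValueError("SE must be 1, 2, 3 or 4")
    # Top/bottom: S and E run over X (0..NX); left/right: over Y (0..NY).
    along, across = (nx, ny) if entry.se in (1, 2) else (ny, nx)
    if not 0 <= entry.s <= entry.e <= along:
        raise ValueError("S <= E <= N violated")
    if not 0 <= entry.d <= across:  # assumed range for D
        raise ValueError("D out of range")


def enumerate_table(table: Sequence[AreaEntry], nx: int,
                    ny: int) -> Iterator[Tuple[int, str, AreaEntry]]:
    """Yield (i, side name, entry) with i running from 1 to i_max."""
    for i, entry in enumerate(table, start=1):
        validate(entry, nx, ny)
        yield i, SIDE_NAMES[entry.se], entry
```

For instance, the area S = 1, E = 6, D = 3 used in FIGS. 7 and 10 for the top side would be the entry `AreaEntry(1, 1, 6, 3)`, which requires NX to be at least 6.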

FIG. 7 shows, for the kanji 「水」 (water) of FIG. 6, the maxima and minima seen from the top side together with the pole counting area defined by S, E and D. FIG. 10 shows the same for the kanji 「氷」 (ice) of FIG. 9.

The identification unit 12 compares the number of maxima and the number of minima counted by the pole counting unit 10 for each target area with the range of the number of maxima and the range of the number of minima for that area in the standard patterns prepared in advance and stored in the dictionary memory 13. It outputs to the character name output 14 the character name of each standard pattern whose counts are judged to fall within the ranges in all of the relevant areas in the dictionary memory 13.
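The range comparison performed by the identification unit can be sketched as below. The function name `identify`, the dictionary layout and the example ranges are all hypothetical and chosen only for illustration; the text does not specify how the dictionary memory 13 is organized or what ranges it holds.

```python
from typing import Dict, List, Sequence, Tuple

Range = Tuple[int, int]              # inclusive (low, high)
AreaRanges = Tuple[Range, Range]     # (range of n_max, range of n_min)


def identify(counts: Sequence[Tuple[int, int]],
             dictionary: Dict[str, Sequence[AreaRanges]]) -> List[str]:
    """Return the names whose ranges contain the counts in every area.

    counts[i] is (n_max, n_min) for the (i+1)-th table entry, and
    dictionary[name][i] holds the allowed ranges for the same entry.
    """
    matches = []
    for name, ranges in dictionary.items():
        if len(ranges) != len(counts):
            continue  # assumed: a pattern must cover the same areas
        ok = all(
            max_lo <= n_max <= max_hi and min_lo <= n_min <= min_hi
            for (n_max, n_min), ((max_lo, max_hi), (min_lo, min_hi))
            in zip(counts, ranges)
        )
        if ok:
            matches.append(name)
    return matches


# Hypothetical dictionary with a single counting area per character.
example_dictionary = {
    "水": [((1, 1), (0, 0))],
    "氷": [((2, 2), (1, 1))],
}
```

With these made-up ranges, `identify([(2, 1)], example_dictionary)` returns only "氷", because the single area holds two maxima and one minimum.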

The effectiveness of the number of maxima and the number of minima, which is one kind of feature information in the feature extraction method of the present invention, is described below.

For example, when the presence or absence of the dot at the upper left is to be detected from the top side in order to distinguish the kanji 「水」 (water) from the kanji 「氷」 (ice), variations in the two strokes on the right adversely affect the feature value. Restricting the right-hand horizontal extent of the pole counting area with a division point based on the peripheral distribution removes this adverse effect. Restricting the pole counting area in the vertical direction as well also removes the influence of breaks, faint strokes and the like outside the pole counting area, so that a more stable feature value is obtained.

Furthermore, adding the character frame division point determination unit 7 as in this embodiment reduces the number of areas subject to pole counting and therefore increases the processing speed.

(Effect of the Invention) As described in detail above, the present invention does not require the very complicated pattern processing, such as contour tracing and thinning, used for feature information extraction in conventional recognition methods. Instead, a character frame division point coordinate series for dividing the inside of the character rectangular frame into (M × N) areas is extracted from the barycentric coordinates of the projections of the character or figure pattern onto two axes, and information on the poles that satisfy predetermined conditions is extracted for each of those areas from the outline coordinate series representing the outer shape of the pattern and used as the feature. This provides a feature extraction method that is stable and effectively extracts the differences between patterns that differ only minutely.

[Brief description of drawings]

FIG. 1 is a functional block diagram showing an embodiment of a character recognition device to which the present invention is applied. FIG. 2 is a diagram showing the relationship between an example input character pattern, the barycentric coordinate series and the division coordinate series. FIG. 3 is a diagram showing the correspondence between the barycentric coordinate series and the division coordinate series. FIG. 4 is a flowchart of another method of determining the division coordinate series. FIG. 5 is an explanatory diagram of the method of detecting the outline coordinate series from the top side for the kanji 「水」 (water) as the input character pattern. FIG. 6 is an explanatory diagram of the method of detecting maxima and minima in the outline coordinate series of FIG. 5. FIG. 7 is an explanatory diagram of the method of counting the maxima and minima of FIG. 6 for the pole counting area S = 1, E = 6, D = 3. FIG. 8 is an explanatory diagram of the method of detecting the outline coordinate series from the top side for the kanji 「氷」 (ice) as the input character pattern. FIG. 9 is an explanatory diagram of the method of detecting maxima and minima in the outline coordinate series of FIG. 8. FIG. 10 is an explanatory diagram of the method of counting the maxima and minima of FIG. 9 for the pole counting area S = 1, E = 6, D = 3.

1 ... optical input; 2 ... photoelectric conversion unit; 3 ... pattern register; 4 ... character frame detection unit; 5 ... character projection creation unit; 6 ... center of gravity detection unit; 7 ... character frame division point determination unit; 8 ... external coordinate series detection unit; 9 ... pole coordinate series detection unit; 10 ... pole counting unit; 11 ... pole counting area table; 12 ... identification unit; 13 ... dictionary memory; 14 ... character name output

Claims

[Claims]

1. A feature extraction method which comprises a pattern register storing, as an original pattern, a pattern obtained by reading and quantizing a character or figure on a medium, and which extracts features of the character or figure on the basis of the original pattern, the feature extraction method comprising:
(a) means for detecting a character rectangular frame formed by the sides circumscribing the original pattern stored in the pattern register;
(b) means for projecting the original pattern onto two desired axes to obtain the respective black bit count distributions;
(c) means for first obtaining, for each axis, the barycentric coordinate of the black bit count distribution within the range limited by the character rectangular frame, then repeating a plurality of times the process of obtaining the barycentric coordinate of the black bit count distribution for each of the ranges into which the range limited by the character rectangular frame is divided by the barycentric coordinates detected so far, thereby obtaining the respective barycentric coordinate series, and determining, on the basis of the barycentric coordinates, the respective division coordinates for dividing the inside of the character rectangular frame into (M × N) areas;
(d) means for scanning the inside of the character rectangular frame of the original pattern from each side of the character rectangular frame in the direction perpendicular to that side and detecting the coordinate series of the first change points from the background portion to the character portion;
(e) means for calculating the amount of change at each change point on the basis of the change point coordinate series;
(f) means for detecting, from the amounts of change, poles at which the change switches between increasing and decreasing; and
(g) means for detecting the number of poles, among the detected poles, that lie in predetermined areas of the (M × N) areas.
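The repeated centroid computation of element (c) can be illustrated for a single axis as follows. This is a rough sketch under several assumptions: `hist` stands for the black bit count distribution along one axis, the names `centroid` and `centroid_series` are hypothetical, adjacent sub-ranges are taken to share their boundary coordinate, and centroids are rounded to the nearest integer. It produces only the barycentric coordinate series; how the division coordinates are then derived from that series is not covered here.

```python
from typing import List, Optional, Sequence


def centroid(hist: Sequence[int], lo: int, hi: int) -> Optional[int]:
    """Rounded centroid of hist over the inclusive range lo..hi.

    Returns None when the range contains no black bits.
    """
    total = sum(hist[lo:hi + 1])
    if total == 0:
        return None
    moment = sum(k * hist[k] for k in range(lo, hi + 1))
    return int(moment / total + 0.5)


def centroid_series(hist: Sequence[int], lo: int, hi: int,
                    depth: int) -> List[int]:
    """Split lo..hi at its centroid, then split each part again, depth times.

    The returned sorted list contains the frame ends and every centroid found.
    """
    cuts = [lo, hi]
    for _ in range(depth):
        new_cuts = []
        for a, b in zip(cuts, cuts[1:]):
            c = centroid(hist, a, b)
            # Keep only centroids strictly inside the sub-range so that
            # every split produces two non-empty parts.
            if c is not None and a < c < b:
                new_cuts.append(c)
        cuts = sorted(set(cuts + new_cuts))
    return cuts
```

For a uniform distribution of nine ones over coordinates 0 to 8, two repetitions give the cuts 0, 2, 4, 6 and 8; with real character projections the cuts shift toward the denser parts of the pattern.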