JPH0147835B2


Info

Publication number: JPH0147835B2
Application number: JP57004931A
Authority: JP
Inventors: Yoshuki Yamashita; Koichi Higuchi; Yoichi Yamada
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 1982-01-18
Filing date: 1982-01-18
Publication date: 1989-10-17
Also published as: JPS58123182A

Description

[Detailed description of the invention]

The present invention relates to a character recognition method. For each sub-pattern representing the strokes of an input character pattern in a specific direction, the sum of black bits in each division unit area of the character frame is normalized by the character line width and by the size of the character frame corresponding to that sub-pattern, which yields a feature element. A matrix that expresses the features of the character by these feature elements is called the sub-pattern feature matrix. A matrix whose elements are the sums of the sub-pattern feature elements belonging to the same division unit area is called the additive feature matrix. In a first stage, character candidates are selected using the additive feature matrix; in a second stage, the character name is determined using the sub-pattern feature matrix.

One known form of character recognition creates a feature matrix (hereinafter called the sub-pattern feature matrix for convenience of explanation) whose feature elements F(k, i, j) are the quantities given by the following formula, and recognizes a character by measuring its similarity to standard character masks expressed in the same format.

$$F(k,i,j) = \frac{B_k(i,j)}{\mathit{WL} \cdot \mathit{WP}_k}$$

WL is the character line width of the input character pattern. WPk stands for WPh, WPv, WPr, WPl, and so on: WPh is the size of the character frame in the horizontal direction, and WPv, WPr, and WPl are likewise the sizes of the character frame in the vertical, 45-degree right diagonal, and 45-degree left diagonal directions, respectively. Bk(i, j) stands for Bh(i, j), Bv(i, j), Br(i, j), Bl(i, j), and so on: Bh(i, j) is the sum of black bits in division unit area (i, j) of the horizontal sub-pattern HSP, where i is the number of the division unit area along the X axis of the character frame and j is the number of the division unit area along the Y axis. Likewise, Bv(i, j), Br(i, j), and Bl(i, j) are the sums of black bits in division unit area (i, j) of the vertical sub-pattern VSP, the 45-degree right diagonal sub-pattern RSP, and the 45-degree left diagonal sub-pattern LSP, respectively.

In this method, when the numbers of divisions NX and NY of the character frame along the X and Y axes are both 7 and four kinds of sub-pattern are used, one character is expressed by a comparatively large amount of information, namely 7 × 7 × 4 dimensions. Moreover, several division formats may be defined for the numbers of divisions NX and NY. As the number of categories (character types) to be recognized grows, the time required for discrimination becomes enormous and the processing speed falls.

The object of the present invention is to improve processing speed by adopting coarse classification information suited to the above method. This is achieved by adopting an additive feature matrix, in which the feature element of each division unit area is the sum of the feature elements of the corresponding division unit areas of the individual sub-patterns.

FIG. 1 shows an embodiment of a character recognition device according to the present invention. A detailed explanation follows, based on FIG. 1.

Reference numeral 1 denotes optical input from a form. The optical input 1 is fed to photoelectric conversion unit 2. Photoelectric conversion unit 2 decomposes one expected character area into 128 × 128 pixels and converts each pixel into a binary digital signal (hereinafter called the input character pattern). A character of average size is represented by an input character pattern of about 60 × 60 bits. Pattern register 3 stores the input character pattern in a form from which the X and Y coordinates of each pixel in the expected character area can be reproduced, and has a capacity of 128 × 128 bits corresponding to the expected character area.
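Returning to the feature element F(k, i, j) introduced at the start of this description, a small worked example may help fix the orders of magnitude. The numbers below are illustrative assumptions, not values from the patent: a division unit area of the horizontal sub-pattern holding 18 black bits, a line width WL of 3, and a horizontal frame size WPh of 60. The dimension counts are the ones the text gives for NX = NY = 7 with four sub-patterns, set against the size of the additive feature matrix.

```latex
F(h,i,j) = \frac{B_h(i,j)}{\mathit{WL} \cdot \mathit{WP}_h}
         = \frac{18}{3 \times 60} = 0.1,
\qquad
\dim F = 7 \times 7 \times 4 = 196,
\qquad
\dim \mathit{FA} = 7 \times 7 = 49 .
```

Under these assumptions, a comparison against one additive-feature-matrix mask touches about a quarter as many elements as a comparison against one full sub-pattern-feature-matrix mask.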
Character line width calculation unit 4 has a shift-register configuration like that of a well-known filter circuit. It receives the input character pattern, counts the number Q of positions at which all pixels in a 2 × 2 shift-register window are black bits and the sum A of black bits in the input character pattern, and calculates the line width WL by the following well-known formula.

$$\mathit{WL} = \frac{A}{A - Q}$$

Character frame detection unit 5 detects the circumscribed frame of the character, expressed by its left edge coordinate Xl, right edge coordinate Xr, top edge coordinate Yt, and bottom edge coordinate Yb in the pattern register, and also detects the size of the character frame. The horizontal size is detected as WPh = Xr − Xl + 1 and the vertical size as WPv = Yt − Yb + 1. The sizes in the 45-degree right diagonal and 45-degree left diagonal directions are detected as

$$\mathit{WP}_r = \mathit{WP}_l = \frac{\mathit{WP}_h + \mathit{WP}_v}{2}$$

The horizontal, vertical, right diagonal, and left diagonal sub-pattern extraction units 6 to 9 extract, from the input character pattern and the line width WL, the sub-patterns HSP, VSP, RSP, and LSP that represent strokes in the horizontal, vertical, right diagonal, and left diagonal directions. They do so by extracting runs of consecutive black bits that are sufficiently longer than the line width in the corresponding direction. For example, sub-pattern HSP is obtained by scanning the whole of pattern register 3 horizontally, detecting the number of consecutive black bits on each scan line, and extracting the black bits of runs whose length exceeds 2WL; the result is a horizontal sub-pattern made up of horizontal strokes. Likewise, the vertical, right diagonal, and left diagonal sub-patterns are extracted by scanning pattern register 3 in the vertical, right diagonal, and left diagonal directions, respectively.

An example of sub-patterns extracted in this way is shown in FIG. 2, in which ORG is the input character pattern, HSP is the horizontal sub-pattern, RSP is the right diagonal sub-pattern, and LSP is the left diagonal sub-pattern.

Character frame division determination unit 10 receives the numbers of divisions NX and NY specified by the division format, the character frame coordinates Xl, Xr, Yt, and Yb, and the input character pattern ORG. For the black-bit count distributions of ORG projected onto the X axis and the Y axis, it obtains for each axis a series of centroid coordinates whose number is sufficiently larger than the largest number of divisions that can be set. It then distributes these centroids almost evenly over the NX or NY divisions and selects (NX − 1) and (NY − 1) centroid coordinates, respectively, as the division coordinates.

For example, the X-axis centroid coordinate series X(Mi) (where Mi is the centroid number and i = 1 to 15) is detected as follows. The input character pattern is projected onto the X axis to obtain the black-bit count distribution. First, the centroid coordinate X(M8) is obtained over the character-frame range Xl to Xr on the X axis. Next, the step of dividing the range Xl to Xr by the centroid coordinates obtained so far and obtaining the centroid coordinate of each resulting range is repeated three times, which yields the other 14 centroid coordinates X(M1) to X(M7) and X(M9) to X(M15). The 15 centroid coordinates obtained in this way are the candidate points for the division coordinates, and the division coordinates are determined from the table below, prepared in advance, so that each division section of the X axis contains an almost equal number of centroids.
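The recursive centroid search described above for the X axis can be sketched as follows. This is a minimal illustration under stated assumptions, not the circuit of unit 10: the names `centroid`, `centroid_series`, `division_coordinates`, and `SELECTION_TABLE` are hypothetical, a column lying exactly on a boundary is counted in both neighbouring ranges, and an empty range falls back to its midpoint. The selection table itself is not reproduced in this text, so the sketch contains only the one entry the description states explicitly (five divisions use M3, M6, M9, M12).

```python
from typing import Dict, List, Sequence, Tuple


def centroid(hist: Sequence[int], lo: float, hi: float) -> float:
    """Centroid of the projected black-bit counts over coordinates lo <= x <= hi."""
    xs = [x for x in range(len(hist)) if lo <= x <= hi]
    total = sum(hist[x] for x in xs)
    if total == 0:
        return (lo + hi) / 2.0  # assumption: empty range falls back to its midpoint
    return sum(x * hist[x] for x in xs) / total


def centroid_series(hist: Sequence[int], xl: int, xr: int, passes: int = 4) -> List[float]:
    """Return 2**passes - 1 centroids in ascending order (15 for passes=4).

    Pass 1 finds the centroid of the whole frame range (M8 in the text);
    each later pass splits every range at the centroids found so far.
    """
    bounds = [float(xl), float(xr)]
    for _ in range(passes):
        new_points = [centroid(hist, a, b) for a, b in zip(bounds, bounds[1:])]
        bounds = sorted(bounds + new_points)
    return bounds[1:-1]  # drop the frame edges, keep M1..M15


# Hypothetical stand-in for the lost table: number of divisions -> centroid numbers.
# Only the 5-division entry is given in the description.
SELECTION_TABLE: Dict[int, Tuple[int, ...]] = {5: (3, 6, 9, 12)}


def division_coordinates(series: Sequence[float], n_div: int) -> List[float]:
    """Pick the (n_div - 1) division coordinates from the centroid series."""
    return [series[m - 1] for m in SELECTION_TABLE[n_div]]
```

For a perfectly uniform projection over coordinates 0 to 16, the series is simply 1 through 15 and the five-division split falls at 3, 6, 9, and 12; for a real character the centroids crowd toward the columns with more black bits, so the divisions become narrower where the character is dense.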

[Table]

For the Y axis, 15 centroid coordinates Y(M1) to Y(M15) are obtained in the same way, and the division coordinates are determined by referring to the above table according to the number of divisions. The division format is changed according to the input character pattern, or is changed to another division format once the pattern has been rejected; the details are omitted.

Here it is assumed that the division format specifies the numbers of divisions NX and NY, which are common to all sub-patterns, as 5 × 5. Correspondingly, X(M3), X(M6), X(M9), and X(M12) are determined as the division coordinates along the X axis, and Y(M3), Y(M6), Y(M9), and Y(M12) as the division coordinates along the Y axis.

Feature matrix extraction unit 11 receives the following: the X-axis division coordinates and end coordinates corresponding to the number of divisions of the sub-patterns, Xl, X(M3), X(M6), X(M9), X(M12), Xr; the Y-axis division coordinates and end coordinates, Yb, Y(M3), Y(M6), Y(M9), Y(M12), Yt; the sub-patterns HSP, VSP, RSP, and LSP; the character frame sizes WPh, WPv, WPr, and WPl corresponding to the sub-patterns; and the character line width WL. It divides each sub-pattern into the division unit areas determined by the division coordinates (5 × 5 areas in this case), counts the number of black bits Bk(i, j) in each division unit area, and normalizes that count by the line width WL and by the character frame size WPh, WPv, WPr, or WPl corresponding to the sub-pattern, as shown in the formulas below. In this way it extracts a feature element F(k, i, j) = Bk(i, j)/(WL·WPk) (where k = h, v, r, l; i = 1 to NX; j = 1 to NY) for each division unit area of each sub-pattern, and creates a feature matrix consisting, in this case, of 5 × 5 × 4 dimensions of feature elements.

$$
\begin{aligned}
F(h,i,j) &= \frac{B_h(i,j)}{\mathit{WL} \cdot \mathit{WP}_h} &
F(v,i,j) &= \frac{B_v(i,j)}{\mathit{WL} \cdot \mathit{WP}_v} \\
F(r,i,j) &= \frac{B_r(i,j)}{\mathit{WL} \cdot \mathit{WP}_r} &
F(l,i,j) &= \frac{B_l(i,j)}{\mathit{WL} \cdot \mathit{WP}_l}
\end{aligned}
$$

The black-bit sum Bk(i, j) for each division unit area (i, j) of each sub-pattern is calculated by reading the sub-pattern at the division coordinates corresponding to division unit area (i, j) and adding up the number of black bits. For example, the black-bit sum Bh(1, 1) corresponding to feature element F(h, 1, 1) is obtained by reading from horizontal sub-pattern extraction unit 6 the part of horizontal sub-pattern HSP that lies in the range determined by the X coordinates Xl and X(M3) and the Y coordinates Yt and Y(M3), which correspond to division unit area (1, 1), and counting the black bits.

Addition matrix calculation unit 12 receives the sub-pattern feature matrix F(k, i, j) extracted by feature matrix extraction unit 11. By executing the following operation, it creates an additive feature matrix whose feature element FA(i, j) at position (i, j) is the sum of the feature elements at the corresponding position (i, j) of the sub-pattern feature matrix F(k, i, j).

$$\mathit{FA}(i,j) = F(h,i,j) + F(v,i,j) + F(r,i,j) + F(l,i,j)$$

Coarse classification unit 13 is provided with a dictionary in which each standard character mask is expressed, in the same way as the additive feature matrix of the input character pattern, by an additive feature matrix whose feature element FSA(i, j) is the sum of the feature elements of the sub-patterns in the corresponding division unit area (i, j). By executing the following operation, the similarity DA between the additive feature matrix of the input character pattern and each standard character mask is measured, and a specific number (typically 30) of category names that rank highest in similarity (that is, have the smallest DA) are detected as candidate categories.

$$\mathit{DA} = \sqrt{\sum_{i=1}^{NX} \sum_{j=1}^{NY} \left\{ \mathit{FA}(i,j) - \mathit{FSA}(i,j) \right\}^{2}}$$

In this embodiment, three dictionaries are prepared corresponding to the number of division formats, and the similarity is measured by referring to the one dictionary specified by the division format; the details are omitted.

Identification unit 14 is provided with a dictionary in which each standard character mask is expressed as FS(k, i, j), a sub-pattern feature matrix in the same format as the sub-pattern feature matrix of the input character pattern. By executing a similarity operation (its formula is not reproduced in this text), the similarity between the standard character masks FS(k, i, j) of the candidate categories detected by coarse classification unit 13 and the sub-pattern feature matrix F(k, i, j) of the input character pattern is measured, and the category name of the most similar standard character mask is output to character code output terminal 15. In this embodiment, identification unit 14 is likewise provided with three dictionaries corresponding to the number of division formats, and the similarity is measured by referring to the one dictionary specified by the division format.

As explained above, in this embodiment coarse classification unit 13 narrows down the candidate categories before identification is performed. In addition, the feature matrix used by coarse classification unit 13 needs no special processing or computation to extract; it is calculated solely by simple addition of the output matrix of feature matrix extraction unit 11. This gives the advantage of fast, stable character recognition without losing the extracted features.

As described above, the present invention performs coarse classification using an addition matrix obtained by adding the feature matrices extracted from the sub-patterns, and then performs identification, so that a fast and stable character recognition device can be realized.
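The processing of units 11 to 14 described above can be sketched in a few functions. This is a hedged, minimal sketch rather than the patented implementation: all function names are hypothetical, bitmaps are plain nested lists indexed `[y][x]`, division boundaries are taken as half-open integer ranges, and the second-stage similarity is assumed to be a Euclidean distance of the same form as DA, because the description's formula for that stage is not reproduced in this text.

```python
import math
from typing import Dict, List, Sequence

Bitmap = List[List[int]]              # bitmap[y][x], 1 = black bit
Matrix = List[List[float]]            # matrix[i][j]
SubFeatures = Dict[str, Matrix]       # F[k][i][j], k in {"h", "v", "r", "l"}


def feature_matrix(subpatterns: Dict[str, Bitmap], x_bounds: Sequence[int],
                   y_bounds: Sequence[int], wl: float,
                   frame_sizes: Dict[str, float]) -> SubFeatures:
    """F(k,i,j) = Bk(i,j) / (WL * WPk) for every sub-pattern and unit area."""
    features: SubFeatures = {}
    for k, bitmap in subpatterns.items():
        features[k] = [
            [sum(bitmap[y][x]
                 for y in range(y_bounds[j], y_bounds[j + 1])
                 for x in range(x_bounds[i], x_bounds[i + 1]))
             / (wl * frame_sizes[k])
             for j in range(len(y_bounds) - 1)]
            for i in range(len(x_bounds) - 1)]
    return features


def additive_matrix(features: SubFeatures) -> Matrix:
    """FA(i,j) = sum over k of F(k,i,j)."""
    keys = list(features)
    nx, ny = len(features[keys[0]]), len(features[keys[0]][0])
    return [[sum(features[k][i][j] for k in keys) for j in range(ny)]
            for i in range(nx)]


def _euclid(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def _flat2(m: Matrix) -> List[float]:
    return [v for row in m for v in row]


def _flat3(f: SubFeatures) -> List[float]:
    return [v for k in sorted(f) for row in f[k] for v in row]


def coarse_candidates(fa: Matrix, dict_fsa: Dict[str, Matrix], n: int = 30) -> List[str]:
    """Stage 1: the n category names with the smallest DA."""
    ranked = sorted(dict_fsa, key=lambda name: _euclid(_flat2(fa), _flat2(dict_fsa[name])))
    return ranked[:n]


def identify(features: SubFeatures, dict_fs: Dict[str, SubFeatures],
             candidates: Sequence[str]) -> str:
    """Stage 2 (assumed Euclidean): closest full sub-pattern mask among the candidates."""
    return min(candidates, key=lambda name: _euclid(_flat3(features), _flat3(dict_fs[name])))
```

In use, `dict_fsa` would be built once by applying `additive_matrix` to every entry of `dict_fs`. Two masks that differ only in stroke direction can share the same additive matrix, so the first stage cannot tell them apart; the second stage, which compares the full sub-pattern feature matrices, resolves them while evaluating only the short candidate list.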

[Brief explanation of drawings]

FIG. 1 is a functional block diagram showing an embodiment of a character recognition device according to the present invention, and FIG. 2 is a diagram showing an input character pattern and its sub-patterns.

1: optical signal input; 2: photoelectric conversion unit; 3: pattern register; 4: line width calculation unit; 5: character frame detection unit; 6: horizontal sub-pattern extraction unit; 7: vertical sub-pattern extraction unit; 8: right diagonal sub-pattern extraction unit; 9: left diagonal sub-pattern extraction unit; 10: character frame division determination unit; 11: feature matrix extraction unit; 12: addition matrix calculation unit; 13: coarse classification unit; 14: identification unit; 15: character name output terminal.
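FIG. 2 is not reproduced in this text. As a rough illustration of the kind of sub-pattern it depicts, the following sketch computes the line width with the formula WL = A/(A − Q) given in the description of unit 4 and then extracts a horizontal sub-pattern by keeping only horizontal runs longer than 2WL, as described for unit 6. The function names and the nested-list bitmap layout are assumptions made for this example; the description specifies shift-register hardware, not software.

```python
from typing import List

Bitmap = List[List[int]]  # bitmap[y][x], 1 = black bit


def line_width(bitmap: Bitmap) -> float:
    """WL = A / (A - Q): A = black bits, Q = all-black 2x2 windows."""
    height, width = len(bitmap), len(bitmap[0])
    a = sum(sum(row) for row in bitmap)
    q = sum(1
            for y in range(height - 1)
            for x in range(width - 1)
            if bitmap[y][x] and bitmap[y][x + 1]
            and bitmap[y + 1][x] and bitmap[y + 1][x + 1])
    if a == 0:
        return 0.0  # assumption: an empty pattern is given width 0
    return a / (a - q)


def horizontal_subpattern(bitmap: Bitmap, wl: float) -> Bitmap:
    """Keep only the black bits that belong to horizontal runs longer than 2*WL."""
    out = [[0] * len(row) for row in bitmap]
    for y, row in enumerate(bitmap):
        x = 0
        while x < len(row):
            if not row[x]:
                x += 1
                continue
            start = x
            while x < len(row) and row[x]:
                x += 1  # advance to the end of the current black run
            if x - start > 2 * wl:
                for xx in range(start, x):
                    out[y][xx] = 1
    return out
```

For a "T" shape drawn with one-pixel strokes, this keeps the top bar and drops the stem. A vertical sub-pattern could be obtained the same way by transposing the bitmap before and after the call; the diagonal scans would need their own traversal and are not shown.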

Claims

[Claims]

1. A character recognition method in which a quantity obtained by normalizing the number of black bits in an arbitrary division unit area of a sub-pattern representing strokes in a specific direction of an input character pattern, by the character line width and by the size of the character frame corresponding to that stroke direction, is taken as a feature element; the feature elements are extracted from the input character pattern for each of a plurality of sub-patterns with different stroke directions and for each division unit area obtained by dividing the character frame, so as to create a sub-pattern feature matrix; and the input character pattern is recognized by referring to a dictionary in which standard character masks are described in the same format as the sub-pattern feature matrix, the method being characterized in that: a dictionary is separately prepared in which the standard character masks are described by an additive feature matrix whose feature element for each division unit area is the value obtained by adding the feature elements of all the sub-patterns corresponding to the same division unit area; an additive feature matrix of the input character pattern is created whose feature element for each division unit area is the value obtained by adding the feature elements of all the sub-patterns corresponding to the same division unit area; the similarity between the additive feature matrix of the input character pattern and each standard character mask in the dictionary of additive feature matrices is measured to detect a group of standard character names whose similarity is at least a certain value or within a certain rank; and the input character pattern is then recognized by measuring the similarity between the standard character masks corresponding to the detected standard character names and the sub-pattern feature matrix of the input character pattern.