JPH07110843A - Character recognizing device - Google Patents

Character recognizing device

Info

Publication number
JPH07110843A
Authority
JP
Japan
Prior art keywords
category
input
character
distance
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP5256103A
Other languages
Japanese (ja)
Inventor
Akimichi Tanaka
明通 田中
Osamu Nakamura
修 中村
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nippon Telegraph and Telephone Corp
Original Assignee
Nippon Telegraph and Telephone Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
1993-10-13
Filing date
1993-10-13
Publication date
1995-04-25
Application filed by Nippon Telegraph and Telephone Corp
Priority to JP5256103A
Publication of JPH07110843A

Landscapes

  • Character Discrimination (AREA)

Abstract

PURPOSE: To output, as character candidates, the character codes of the categories to which an input character most reliably belongs. CONSTITUTION: A distance calculation means 101 calculates the distance value between an input character pattern and the standard pattern of each category in a character dictionary 102. A reliability evaluation means 103 is a neural network that learns the relationship between distance values and reliability in advance; at recognition time it receives the distance values of the input character pattern and, based on what it has learned, outputs a reliability evaluation value for membership in each category. A teacher signal creation means 104 creates the teacher signal used to train the means 103, varying it according to the distance rank of the input category. A character candidate determination means 105 outputs, as character candidates, the character codes of the categories whose reliability evaluation value is at or above a predetermined threshold, each with its reliability evaluation value.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a character recognition device that compares an input character pattern with the standard pattern of each category stored in a character dictionary and outputs, as character candidates, the character codes of the categories to which the input character most reliably belongs. More particularly, it relates to a character recognition device that outputs each character code together with a reliability evaluation value for membership in the corresponding category.

[0002]

[Prior Art] In character recognition by pattern matching, the distance between the input character pattern and the standard pattern of each category in the character dictionary is usually calculated, and the categories are ranked as character candidates in ascending order of distance (see, for example, Shinichiro Hashimoto: Introduction to Character Recognition, pp. 32-36, Ohmsha (1982)). That is, the category with the smallest distance becomes the first-rank character candidate. However, a method based on such distance ranking has the problem that the reliability that a candidate is correct varies from case to case, even among first-rank candidates. The relationship between distance and reliability is complicated, and merely attaching the distance value to a candidate does not indicate its reliability.
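The conventional distance ranking can be sketched in a few lines of Python. The category labels and distance values here are made up for illustration and do not come from the patent.

```python
def rank_by_distance(distances):
    """Return category labels ordered from smallest to largest distance."""
    return sorted(distances, key=distances.get)

# Hypothetical distances from one input pattern to three standard patterns.
distances = {"A": 0.82, "B": 0.35, "C": 0.41}
candidates = rank_by_distance(distances)
print(candidates)  # the first element is the first-rank candidate
```

In this made-up example the first and second candidates are separated by a small margin, yet the ranking alone gives no indication of how far the first candidate can be trusted, which is the limitation the paragraph above describes.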

[0003] One conceivable way to solve this problem is to use a neural network to learn the relationship between distance values and reliability (see, for example, D. E. Rumelhart, J. L. McClelland and the PDP Research Group: Parallel Distributed Processing, The MIT Press (1986)). In this case, the input layer has as many units as there are standard patterns, and the distances to the standard patterns are input to it. The output layer has one unit per category, and each unit outputs a reliability evaluation value indicating how reliably the input belongs to its category. During learning, a teacher signal of 1.0 is given to the unit corresponding to the input category (the category to which the input character actually belongs) and 0.0 to all other units. The category with the smallest distance sometimes does not match the input category, and teaching the correct input category in these cases can be expected to improve the recognition rate. However, there are also many cases in which, for similar distance inputs, the category with the smallest distance is the input category. This method therefore tends to confuse the case in which the smallest-distance category does not match the input category with the case in which, for similar distance inputs, the smallest-distance category is the input category.
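The conflict can be made concrete with a small Python sketch of the 1.0/0.0 teacher signal. The two samples, their distance values, and the zero-based category indices are assumptions chosen for illustration (the text numbers categories from 1 to J).

```python
def one_hot_teacher(num_categories, input_category):
    """Conventional teacher signal: 1.0 for the input category, 0.0 elsewhere."""
    return [1.0 if j == input_category else 0.0 for j in range(num_categories)]

# Two hypothetical training samples with nearly identical distance vectors.
# Category 0 has the smallest distance in both, but only sample_a belongs to it.
sample_a = {"distances": [0.30, 0.34, 0.90], "input_category": 0}
sample_b = {"distances": [0.31, 0.33, 0.90], "input_category": 1}

target_a = one_hot_teacher(3, sample_a["input_category"])
target_b = one_hot_teacher(3, sample_b["input_category"])
print(target_a, target_b)
```

The network sees almost the same input twice but is asked to produce opposite outputs on units 0 and 1, so the two cases pull against each other during training.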

[0004]

[Problems to Be Solved by the Invention] The present invention has been made to solve the above problems. Its object is to provide a character recognition device that uses a neural network to learn the relationship between the distances from an input pattern to the standard patterns and the reliability of membership in each category, that also learns the cases in which the smallest-distance category does not match the input category without adversely affecting other cases, and that can therefore output highly reliable character candidates from distance value inputs.

[0005]

[Means for Solving the Problems] The character recognition device of the present invention comprises: a character dictionary that stores the standard pattern of each category; a distance calculation means that calculates the distance value between an input character pattern and the standard pattern of each category in the character dictionary; a reliability evaluation means that learns in advance, using a neural network, the relationship between reliability and the distance values between input character patterns and standard patterns, receives the distance values calculated by the distance calculation means, and outputs a reliability evaluation value indicating how reliably the input character pattern belongs to each category; a teacher signal creation means that, when the reliability evaluation means is trained, receives the distance values calculated by the distance calculation means and an indication of the input category, and creates the teacher signal for training the reliability evaluation means by assigning a constant value to every category other than the indicated input category and, to the input category itself, a value that varies with the distance rank of that category; and a character candidate determination means that receives the reliability evaluation values output by the reliability evaluation means and, for each category whose reliability evaluation value is at or above a predetermined threshold, outputs its character code as a character candidate together with that reliability evaluation value.

[0006]

[Operation] For an input character pattern, the distance calculation means calculates the distance to each standard pattern in the character dictionary. During learning, the reliability evaluation means is trained with these distance values and the teacher signal produced by the teacher signal creation means, which creates a teacher signal that depends on the distance rank of the input category. During recognition, when a new input character pattern is given, the distance calculation means calculates the distance values in the same way and passes them to the reliability evaluation means. Based on what it has learned, the reliability evaluation means outputs a reliability evaluation value for each category. These values are passed to the character candidate determination means, which, for the categories whose output is at or above a suitable threshold, outputs their character codes with evaluation values as character candidates in descending order of evaluation value.

[0007] In the present invention, the relationship between reliability and the distances from the input pattern to the standard patterns is learned with a neural network, so that highly reliable character candidates can be output from distance value inputs. In addition, the cases in which the smallest-distance category does not match the input category can be learned without adversely affecting other cases.

[0008]

[Embodiment] An embodiment of the present invention is described in detail below with reference to the drawings.

[0009] FIG. 1 is a block diagram showing an embodiment of the present invention. In the figure, the input character pattern is obtained, for example, by extracting character features such as PDC (peripheral direction contributivity) features from the input character image in the form of a vector (see, for example, Ogita, Naito, Masuda: Recognition of handwritten Kanji characters using peripheral direction contributivity features, Trans. IECE Japan (D), vol. J66-D, No. 10, pp. 1185-1192 (1983)). The character dictionary 102 is created in advance by collecting on the order of several hundred dictionary samples per category, extracting their PDC features, and taking the per-category mean as the standard pattern.
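Building the dictionary by per-category averaging might look like the following Python sketch. It assumes the features have already been extracted as equal-length lists of numbers; PDC feature extraction itself is not shown, and the function name and sample data are hypothetical.

```python
def build_dictionary(samples):
    """Average the feature vectors of each category to get its standard pattern.

    samples: iterable of (category, feature_vector) pairs.
    Returns a dict mapping category -> mean feature vector.
    """
    sums, counts = {}, {}
    for category, features in samples:
        if category not in sums:
            sums[category] = [0.0] * len(features)
            counts[category] = 0
        sums[category] = [s + f for s, f in zip(sums[category], features)]
        counts[category] += 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}

# Tiny made-up training set; a real dictionary would use hundreds of samples per category.
training = [("A", [1.0, 2.0]), ("A", [3.0, 4.0]), ("B", [0.0, 1.0])]
dictionary = build_dictionary(training)
print(dictionary)
```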

[0010] The distance calculation means 101 calculates the distance value between the input character pattern and the standard pattern of each category stored in the character dictionary 102. That is, when the input character pattern is written as

$$x = (x_1, x_2, \ldots, x_n) \quad \cdots (1)$$

and the standard pattern of category $j$ ($j = 1, 2, \ldots, J$) as

$$m_j = (m_{j1}, m_{j2}, \ldots, m_{jn}) \quad \cdots (2)$$

the distance $l_j$ between the input character pattern and the standard pattern of category $j$ is calculated, for example using the Euclidean distance, as in Equation 1.

[0011]

[Equation 1]

$$l_j = \sqrt{\sum_{k=1}^{n} (x_k - m_{jk})^2} \quad \cdots (3)$$
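As a minimal sketch, the Euclidean distance named in the text can be computed for every category as follows. The dictionary contents are invented for illustration, and the function names are not from the patent.

```python
import math

def euclidean_distance(x, m_j):
    """Distance l_j between input pattern x and standard pattern m_j."""
    return math.sqrt(sum((xk - mjk) ** 2 for xk, mjk in zip(x, m_j)))

def distances_to_dictionary(x, dictionary):
    """Map each category to the distance between x and its standard pattern."""
    return {category: euclidean_distance(x, m) for category, m in dictionary.items()}

# Hypothetical two-dimensional standard patterns.
dictionary = {"A": [0.0, 0.0], "B": [3.0, 4.0]}
l = distances_to_dictionary([0.0, 0.0], dictionary)
print(l)
```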

[0012] The reliability evaluation means 103 consists of a three-layer neural network such as that shown in FIG. 2. In FIG. 2, the input layer has as many units as there are standard patterns, and the output layer has one unit for each category. The input layer receives the distance values $l_j$ to the standard patterns obtained by the distance calculation means 101, together with an indication of which category is the input category. The output-layer unit $U_j$ corresponds to category $j$ and outputs a reliability evaluation value indicating how reliably the input character pattern belongs to category $j$.
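A forward pass through such a three-layer network might be sketched as below. The text does not give the hidden-layer size, the activation function, or the weight initialization, so the sigmoid units, the six hidden units, and the small random weights are all assumptions; the weights are untrained, so the printed scores carry no meaning beyond showing the data flow.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def init_layer(n_in, n_out, rng):
    """One row per unit: n_in weights followed by a bias, drawn from a small range."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_out)]

def layer_forward(weights, inputs):
    """Weighted sum plus bias for each unit, passed through a sigmoid."""
    return [sigmoid(sum(w * v for w, v in zip(row[:-1], inputs)) + row[-1])
            for row in weights]

def reliability(distances, hidden_w, output_w):
    """Distances in, one reliability evaluation value per category out."""
    return layer_forward(output_w, layer_forward(hidden_w, distances))

rng = random.Random(0)
J, H = 4, 6  # J categories (input and output units), H hidden units (assumed)
hidden_w = init_layer(J, H, rng)
output_w = init_layer(H, J, rng)
scores = reliability([0.3, 0.9, 0.5, 1.2], hidden_w, output_w)
print(scores)
```

Because each output unit uses a sigmoid in this sketch, every score lies strictly between 0 and 1, which makes a fixed threshold on the outputs straightforward to apply.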

[0013] The reliability evaluation means 103 operates as follows during learning and during recognition.

[0014] During learning, the distance values calculated by the distance calculation means 101 and the teacher signal created by the teacher signal creation means 104, described below, are input to the reliability evaluation means 103. One way to create the teacher signal is to give 1.0 to the unit corresponding to the input category and 0.0 to all other units. Training with this teacher signal does learn the relationship between distance values and reliability. As noted above, however, this teacher signal tends to cause confusion between the case in which the smallest-distance category does not match the input category and the case in which, for similar distance inputs, the smallest-distance category is the input category. Therefore, when the input category does not have the smallest distance, the teacher signal given to the input category is made smaller as the distance rank of the input category falls. That is, letting $y_i$ ($i = 1, 2, \ldots, J$) be the teacher signal given when the distance rank is $i$,

$$y_1 = 1 > y_2 > y_3 > \cdots \quad \cdots (4)$$

This reduces the effect of the confusion described above.

[0015] In the embodiment, the teacher signal creation means 104 creates the teacher signal $y_i$ given to unit $U_j$ according to the distance rank of the input category, as in Equation 2, from the output of the distance calculation means 101 and the indication of the input category (specified by the user).

[0016]

[Equation 2]

[0017] FIG. 3 shows the algorithm by which the teacher signal creation means 104 creates the teacher signal. In this way, when the input category is $j$ but its distance value $l_j$ is not the smallest, the teacher signal $y_i$ is made smaller than 1.0, which improves the effectiveness of learning.
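A rank-dependent teacher signal of this kind could be sketched as follows. Equation 2 and FIG. 3 are not reproduced in this text, so the geometric schedule $y_i = d^{\,i-1}$ with a hypothetical decay $d = 0.8$ is only one choice that satisfies $y_1 = 1 > y_2 > y_3 > \cdots$; it is not the patent's formula. The constant 0.0 for the other categories, the zero-based category indices, and the tie-breaking by lower index are likewise assumptions.

```python
def distance_rank(distances, category):
    """1-based rank of `category` when categories are sorted by ascending distance."""
    order = sorted(range(len(distances)), key=lambda j: distances[j])
    return order.index(category) + 1

def teacher_signal(distances, input_category, decay=0.8, other_value=0.0):
    """Teacher vector: decay**(rank-1) for the input category, a constant elsewhere.

    `decay` and `other_value` are hypothetical parameters, not values from the patent.
    """
    i = distance_rank(distances, input_category)
    y_i = decay ** (i - 1)
    return [y_i if j == input_category else other_value
            for j in range(len(distances))]

# Input category has the smallest distance: full-strength target.
print(teacher_signal([0.30, 0.34, 0.90], input_category=0))
# Input category is only second by distance: weakened target.
print(teacher_signal([0.31, 0.33, 0.90], input_category=1))
```

With this schedule, a sample whose true category ranks second still pulls the network toward the right answer, but less strongly than a sample whose true category ranks first, so it disturbs similar first-rank samples less.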

[0018] During recognition, when a new input character pattern is given, the distance calculation means 101 calculates the distances to the standard patterns in the same way as during learning, and these are input to the reliability evaluation means 103. Based on what it has learned, the reliability evaluation means 103 outputs a reliability evaluation value for each category.

[0019] From the output of the reliability evaluation means 103, the character candidate determination means 105 takes the categories whose output is at or above a predetermined threshold $a$, ranks them as character candidates in descending order of output value, and outputs their character codes with reliability evaluation values. If no category outputs a value at or above the threshold $a$, there is no character candidate and the input is rejected.
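The thresholding and reject behaviour can be illustrated with a short Python sketch. The character codes, scores, and threshold value are placeholders, and returning an empty list to signal a reject is an assumption about how a caller might represent it.

```python
def determine_candidates(scores, codes, threshold):
    """Return (code, score) pairs at or above threshold, best first.

    An empty list means no candidate qualified, i.e. the input is rejected.
    """
    kept = [(code, s) for code, s in zip(codes, scores) if s >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)

codes = ["A", "B", "C"]
accepted = determine_candidates([0.15, 0.92, 0.61], codes, threshold=0.5)
rejected = determine_candidates([0.15, 0.22, 0.31], codes, threshold=0.5)
print(accepted)
print(rejected)
```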

[0020]

[Effects of the Invention] As described above, the character recognition device of the present invention uses a neural network to learn the relationship between reliability of membership in each category and the distances from the input character pattern to the standard patterns, and also learns the cases in which the smallest-distance category does not match the input category without adversely affecting other cases. As a result, it can output, as character candidates, the character codes of the categories to which the input character most reliably belongs.

[Brief Description of the Drawings]

FIG. 1 is a block diagram showing an embodiment of the present invention.

FIG. 2 is a diagram showing an example configuration of the neural network used in the reliability evaluation means of FIG. 1.

FIG. 3 is a diagram showing an example of the teacher signal creation algorithm in the teacher signal creation means of FIG. 1.

[Explanation of Symbols]

101 distance calculation means
102 character dictionary
103 reliability evaluation means
104 teacher signal creation means
105 character candidate determination means

Claims (1)

[Claims]

1. A character recognition device comprising: a character dictionary that stores the standard pattern of each category; a distance calculation means that calculates the distance value between an input character pattern and the standard pattern of each category in the character dictionary; a reliability evaluation means that learns in advance, using a neural network, the relationship between reliability and the distance values between input character patterns and standard patterns, receives the distance values calculated by the distance calculation means, and outputs a reliability evaluation value indicating how reliably the input character pattern belongs to each category; a teacher signal creation means that, when the reliability evaluation means is trained, receives the distance values calculated by the distance calculation means and an indication of the input category, and creates the teacher signal for training the reliability evaluation means by assigning a constant value to every category other than the indicated input category and, to the input category itself, a value that varies with the distance rank of that category; and a character candidate determination means that receives the reliability evaluation values output by the reliability evaluation means and, for each category whose reliability evaluation value is at or above a predetermined threshold, outputs its character code as a character candidate together with that reliability evaluation value.
JP5256103A 1993-10-13 1993-10-13 Character recognizing device Pending JPH07110843A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP5256103A JPH07110843A (en) 1993-10-13 1993-10-13 Character recognizing device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP5256103A JPH07110843A (en) 1993-10-13 1993-10-13 Character recognizing device

Publications (1)

Publication Number Publication Date
JPH07110843A true JPH07110843A (en) 1995-04-25

Family

ID=17287935

Family Applications (1)

Application Number Title Priority Date Filing Date
JP5256103A Pending JPH07110843A (en) 1993-10-13 1993-10-13 Character recognizing device

Country Status (1)

Country Link
JP (1) JPH07110843A (en)

Similar Documents

Publication Publication Date Title
CA2152211C (en) System and method for automated interpretation of input expressions using novel a posteriori probability measures and optimally trained information processing networks
US5005205A (en) Handwriting recognition employing pairwise discriminant measures
CA2375355A1 (en) Character recognition system and method
Pomazan et al. Handwritten character recognition models based on convolutional neural networks
JPH06508464A (en) Cursive recognition method and device
Mozaffari et al. Recognition of isolated handwritten Farsi/Arabic alphanumeric using fractal codes
CN108345833A (en) The recognition methods of mathematical formulae and system and computer equipment
Hussien et al. Optical character recognition of arabic handwritten characters using neural network
JPH0765165A (en) Method and device for pattern recognition by neural network
Agazzi et al. Pseudo two-dimensional hidden Markov models for document recognition
Azizi et al. From static to dynamic ensemble of classifiers selection: Application to Arabic handwritten recognition
Anigbogu et al. Hidden Markov models in text recognition
Lazzerini et al. A fuzzy approach to 2D-shape recognition
Sahoo et al. Indian sign language recognition using soft computing techniques
Reddy et al. A three-dimensional neural network model for unconstrained handwritten numeral recognition: a new approach
Lazzerini et al. A linguistic fuzzy recogniser of off-line handwritten characters
JPH07110843A (en) Character recognizing device
Radhiah et al. Printed Arabic letter recognition based on image
Niharmine et al. Tifinagh handwritten character recognition using genetic algorithms
JPH01321591A (en) Character recognizing device
Shah et al. SnapSolve—A novel mathematics equation solver using deep learning
Bourbakis et al. Handwriting recognition using a reduced character method and neural nets
JPH08115387A (en) Pattern recognition device
Gotlur et al. Handwritten math equation solver using machine learning
JP2812391B2 (en) Pattern processing method