JPH11224309A

JPH11224309A - Character recognition method and device

Info

Publication number: JPH11224309A
Application number: JP10026907A
Authority: JP
Inventors: Yoshimasa Kimura; 義政木村; Minoru Mori; 稔森; Teruo Akiyama; 照雄秋山; Toru Wakahara; 徹若原; Nobuo Miyamoto; 信夫宮本; Kenji Ogura; 健司小倉
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1998-02-09
Filing date: 1998-02-09
Publication date: 1999-08-17

Abstract

PROBLEM TO BE SOLVED: To perform classification at high speed while keeping the broad classification rate at a fixed value, by successively changing initial values so as to minimize the amount of computation required for hierarchical broad classification. SOLUTION: In a broad classification part 4, a first-stage classification circuit 41 calculates the distance between n1 features (n0 > n1), specified by a broad classification constitution decision part 9 from among the n0 features of an input character pattern obtained from a feature extraction part 3, and the corresponding n1 features of each category stored in a broad classification dictionary 5, and narrows the categories to c1 in ascending order of distance. A second-stage classification circuit 42 calculates the distance between n2 features (n0 > n2 > n1), specified by the broad classification constitution decision part 9 from among the n0 features, and the corresponding n2 features of the c1 categories output from the first-stage classification circuit 41, and narrows the categories to c2 (c2 < c1) in ascending order of distance. Thereafter, hierarchical broad classification proceeds in the same way, reducing the number of candidates while increasing the number of features from stage to stage.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Field of the Invention] The present invention relates to a character recognition method and apparatus, and in particular to a character recognition method and apparatus capable of performing large classification at high speed.

[0002]

[Description of the Related Art] Conventional character recognition methods and apparatuses commonly recognize an input character pattern by matching it against a dictionary created for each category. When the number of character types to be recognized is on the order of several tens, as with digits and symbols, the computation time remains relatively short even if the matching operation is performed between the input character pattern and every category. When the number of character types is on the order of several thousand, as with kanji, the computation time is, even by a simple estimate, more than one hundred times that for digits and symbols.

[0003]

[Problems to Be Solved by the Invention] To shorten the computation time, a two-step procedure is therefore used: large classification is first performed to narrow the candidates down to a relatively small number, and identification is then performed on the narrowed candidates.

[0004] One known method of large classification, as seen in decision trees and combinational logic, performs logical operations between thresholds and pre-specified features among those extracted from the input character pattern, and takes the categories that satisfy the logic as the classification result. This method has the advantage of fast processing. However, because large classification is carried out by threshold logic on specific features, when the value of such a feature fluctuates due to deformation or the like and falls outside the threshold, the logic is no longer satisfied and misclassification results.

[0005] Solutions to this problem include relaxing the thresholds and increasing the number of logic conditions, and retaining a category as a candidate until the number of unsatisfied logic conditions reaches a predetermined value. These solutions, however, enlarge the logic and thus increase processing time. Moreover, they are not founded on a principle that can cope with sudden fluctuations in feature values, so stable classification remains difficult.

[0006] Another known method performs large classification by calculating a distance, similarity, or the like between a relatively large number of features obtained from the input pattern and a standard pattern. Because this method makes a comprehensive judgment using multiple features, even if a few feature values fluctuate, the combined value stays within the normal range of variation as long as the fluctuation of the many other feature values is small, so stable classification is possible. However, vector operations such as distance and similarity require more computation time than threshold logic.

[0007] As a way of reducing computation time, there is a hierarchical large classification method in which multiple classification circuits are connected in series to form a hierarchical structure. The first stage coarsely narrows the candidates using a small number of features; the next stage increases the number of features, performs a finer classification, and narrows the candidates further; this process is repeated until the desired number of candidates is obtained at the last stage. This method, however, has conflicting properties: if the number of features and the number of candidates at each stage are made large, the large classification rate is high but the computation time is long; conversely, if they are made small, the computation time is short but the large classification rate falls. Moreover, the classification rates of the stages are interrelated, and it is not easy to distribute the classification load properly among the stages.

[0008] As described above, complex relationships exist within and between the stages of this method. Consequently, no design method for the large classification unit has been established that finds the optimal number of features used and number of candidates retained at each stage so as to hold the large classification rate at a fixed value while minimizing computation time. In addition, depending on the application of the character recognition apparatus, requirements range from cases in which misreadings must be minimized even if recognition is slow to cases in which high-speed recognition is required even if misreadings are frequent, and it has been difficult to meet such requirements by adjusting the large classification unit.

[0009] The present invention has been made to solve these problems. Its object is to provide a character recognition method and apparatus that can classify at high speed while holding the large classification rate at a fixed value, and that can accommodate various requirements such as high-accuracy recognition or high-speed recognition.

[0010]

[Means for Solving the Problems] To achieve the above object, the principal feature of the present invention is that it comprises the following three means.

Hierarchical large classification means: classification circuits, each of which reduces the number of candidates to be classified by calculating a classification measure between all or part of the features of an input character pattern and all or part of the features of a standard dictionary, are arranged in a hierarchical structure; as processing proceeds from earlier stages to later stages, the number of candidates to be classified is reduced while the number of features used by the classification circuits is increased.

Large classification configuration determining means: the number of features and the number of candidates of each stage in the hierarchical large classification are set by first setting initial values of the number of features and the number of candidates of each stage, and then successively changing those initial values so as to minimize the amount of computation required for the hierarchical large classification under the constraint that the large classification rate is held at a predetermined fixed value.

Trade-off function setting means: in setting the number of features and the number of candidates of each stage, when a target value of the large classification rate is specified, that target value is used as the fixed value and the successive change is performed; when a target value of the large classification processing speed is specified, the fixed value is obtained from a previously obtained relationship between large classification processing speed and large classification rate and the successive change is performed; and either of these two kinds of specification can be selected.

[0011] The large classification configuration determining means of the present invention works on the principle of finding near-optimal numbers of features and candidates by successively changing the initially set number of features and number of candidates of each stage, so a large classification unit can be constructed easily even when complex relationships exist within and between stages. Furthermore, by selecting and specifying a desired large classification rate or processing speed through the trade-off function setting means, a large classification unit satisfying that performance can be constructed, which enables tuning to suit the application.

[0012]

[Embodiments of the Invention] Next, an embodiment of the present invention will be described with reference to the drawings. FIG. 1 is a block diagram of a character recognition apparatus according to an embodiment of the present invention.

[0013] The character recognition apparatus of this embodiment comprises a character pattern input unit 1, a preprocessing unit 2, a feature extraction unit 3, a large classification unit 4, a large classification dictionary 5, an identification unit 6, an identification dictionary 7, a recognition result memory unit 8, a large classification configuration determining unit 9, and a trade-off function setting unit 10.

[0014] The character pattern input unit 1 captures a character pattern through an input device such as a scanner or television camera, the preprocessing unit 2 performs normalization, noise removal, and the like, and the feature extraction unit 3 extracts the features used for recognition. The large classification unit 4 calculates, for each category, the distance between the features of the character pattern and the features stored in the large classification dictionary 5, and outputs the resulting candidates. The identification unit 6 calculates, for the candidates output from the large classification unit 4, the distance between the features of the character pattern and the features stored in the identification dictionary 7, and outputs the resulting candidates to the recognition result memory unit 8 as the final result.

[0015] Next, the operation of the large classification unit 4 of the present invention will be described with reference to FIG. 2. FIG. 2 is a functional block diagram of the large classification unit 4 according to an embodiment of the present invention. The unit comprises a first-stage classification circuit 41, a second-stage classification circuit 42, ..., a k-th stage classification circuit 43, ..., and a K-th stage classification circuit 44.

[0016] The first-stage classification circuit 41 calculates the distance between n_1 features (n_0 > n_1), specified by the large classification configuration determining unit 9 from among the n_0 features of the input character pattern obtained from the feature extraction unit 3, and the corresponding n_1 features of each category stored in the large classification dictionary 5, and narrows the categories to c_1 in ascending order of distance. The second-stage classification circuit 42 calculates the distance between n_2 features (n_0 > n_2 > n_1), specified by the large classification configuration determining unit 9 from among the n_0 features, and the corresponding n_2 features of the c_1 categories output from the first-stage classification circuit 41, and narrows the categories to c_2 (c_2 < c_1) in ascending order of distance.

[0017] Likewise, the k-th stage classification circuit 43 repeats the operation of obtaining c_k candidates using n_k features. In this processing, n_1 < n_2 < ... < n_k < ... < n_K and c_1 > c_2 > ... > c_k > ... > c_K, so this is a hierarchical large classification in which the number of candidates is reduced while the number of features is increased from stage to stage. The final K-th stage classification circuit 44 outputs c_K candidates.
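
The cascade described above can be sketched as follows. This is a minimal illustration, not the circuits of the embodiment: the function name `hierarchical_classify`, the use of squared Euclidean distance, and the choice of the first n_k features at stage k are all assumptions made for the example.

```python
import random


def hierarchical_classify(x, dictionary, feature_counts, candidate_counts):
    """Narrow the candidate categories stage by stage.

    x: input feature vector of length n_0.
    dictionary: one reference feature vector (length n_0) per category.
    feature_counts: n_1 < n_2 < ... < n_K (assumed: stage k uses the first n_k features).
    candidate_counts: c_1 > c_2 > ... > c_K.
    Returns the indices of the c_K surviving categories, nearest first.
    """
    candidates = list(range(len(dictionary)))
    partial = [0.0] * len(dictionary)  # accumulated squared distance per category
    prev_n = 0
    for n_k, c_k in zip(feature_counts, candidate_counts):
        for c in candidates:
            ref = dictionary[c]
            # Only the newly added features are evaluated; earlier ones are reused.
            partial[c] += sum((ref[i] - x[i]) ** 2 for i in range(prev_n, n_k))
        # Keep the c_k candidates with the smallest accumulated distance.
        candidates = sorted(candidates, key=lambda c: partial[c])[:c_k]
        prev_n = n_k
    return candidates


# Hypothetical data: 50 categories with 20 random features each.
rng = random.Random(0)
dictionary = [[rng.random() for _ in range(20)] for _ in range(50)]
x = list(dictionary[7])  # an input identical to category 7
result = hierarchical_classify(x, dictionary, [4, 10, 20], [20, 8, 3])
```

Accumulating the partial distance per category means each later stage pays only for the features added at that stage.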

[0018] Let c_0 be the number of categories to be recognized, and let Δt be the time required to calculate the distance between one feature obtained from the input character pattern and one feature stored in the dictionary. The distance calculation time T_0 without hierarchical large classification is then T_0 = n_0 · c_0 · Δt (1). Now assume that the n_k features used by the k-th stage classification circuit 43 include the n_{k-1} features of the preceding (k-1)-th stage classification circuit. The distance calculation time T required for hierarchical large classification is then

[0019]

[Equation 1]

[0020] The second term on the right-hand side is (n_k − n_{k-1}) · c_{k-1} rather than n_k · c_{k-1} because the distance calculation for the n_{k-1} features has already been completed in the preceding stage. The speedup ratio η of the distance calculation time is expressed as the ratio of T_0 to T, defined by the following equation.
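
The image for equation (2) is not reproduced in this text. One form consistent with the description (a first-stage term over all c_0 categories, followed by incremental terms (n_k − n_{k-1}) · c_{k-1}) is given below. It is a reconstruction under that reading, not the original figure.

```latex
T = \Bigl\{\, n_1 c_0 + \sum_{k=2}^{K} \bigl(n_k - n_{k-1}\bigr)\, c_{k-1} \Bigr\}\, \Delta t \tag{2}
```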

[0021] η = T_0 / T (3)
To see the reduction in processing achieved by hierarchical large classification, take as an example K = 3 with n_0 = 500, c_0 = 3,500, n_1 = 200, c_1 = 500, n_2 = 300, c_2 = 300, and n_3 = 500. Calculating with equations (1) to (3) gives η = 2.5, showing that distance calculation is sped up by a factor of 2.5.
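
The arithmetic of this example can be checked with a short script. It assumes that equation (2) has the form T = {n_1·c_0 + Σ (n_k − n_{k-1})·c_{k-1}}·Δt, which is inferred from the description because the equation image is not reproduced here; the function name `hierarchical_cost` is hypothetical. Under that assumed form the example values give T_0 = 1,750,000·Δt and T = 810,000·Δt, a ratio of about 2.16. The first-stage term alone (n_1·c_0 = 700,000) gives exactly 2.5, so the quoted figure may rest on a different form of equation (2) or on an approximation.

```python
def hierarchical_cost(feature_counts, input_candidate_counts):
    """Distance-calculation cost in units of Δt under the assumed equation (2).

    feature_counts: [n_1, ..., n_K]
    input_candidate_counts: [c_0, c_1, ..., c_{K-1}], the candidates entering each stage.
    """
    total, prev_n = 0, 0
    for n_k, c_prev in zip(feature_counts, input_candidate_counts):
        # Only the features added at this stage are computed for the incoming candidates.
        total += (n_k - prev_n) * c_prev
        prev_n = n_k
    return total


n0, c0 = 500, 3500
t0 = n0 * c0                                            # equation (1), in units of Δt
t = hierarchical_cost([200, 300, 500], [c0, 500, 300])  # assumed equation (2)
eta = t0 / t                                            # equation (3)
first_stage_only = t0 / (200 * c0)                      # ratio using the first term alone
```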

[0022] Next, the operation of the large classification configuration determining unit 9 will be described with reference to FIGS. 3, 4, and 5. FIG. 3 is a functional block diagram of the large classification configuration determining unit 9 according to an embodiment of the present invention. The unit comprises an initial value setting circuit 91, a computation amount calculation circuit 92, a large classification parameter setting circuit 93, and a classification rate table 94.

[0023] FIG. 4 is a processing flowchart of the large classification configuration determining unit 9. FIG. 5 shows an example of the classification rate table 94, which stores the classification rate r(i, j) (0 ≤ r(i, j) ≤ 1) obtained when the candidates are narrowed from c_0 to c_j (j = 1, 2, ..., J) using n_i features (i = 1, 2, ..., I). Each r(i, j) is a value obtained in advance by recognition experiments using a large amount of training data. In the figure, I is the number of distinct feature counts and J is the number of distinct candidate counts available to the classification at each stage; both are given in advance.

[0024] The initial value setting circuit 91 determines the initial values of the number of features used and the number of candidates retained at each stage from the large classification rate r_0 to be maintained, which is supplied by the trade-off function setting unit 10. First, K is determined (step 21), and the target classification rate r_t for each stage is obtained as r_t = (r_0)^(1/K) (4) (step 22). In this embodiment the classification-rate burden is shared equally among the stages, but it may be varied, for example lighter in earlier stages and heavier toward later stages. The computation amount calculation circuit 92 selects, from the entries r(i, j) of the classification rate table 94, K elements whose values are close to r_t, and labels them r_k^(1) (k = 1, 2, ..., K) in ascending order of j (step 23). FIG. 6 shows an example of the selection for K = 3, I = 5, J = 5: the circled entries r(4, 1), r(3, 3), and r(1, 5) are the selected r(i, j), and they are relabeled r_1^(1) = r(4, 1), r_2^(1) = r(3, 3), r_3^(1) = r(1, 5). The large classification rate r_p after passing through the K stages is obtained by

[0025]

[Equation 2]

[0026] In addition, the distance calculation time T_p is obtained by equation (2) using the numbers of features and candidates specified by r_k^(1) (step 24). Next, for each r_k^(1), one of the four elements adjacent to it above, below, left, and right in the classification rate table 94 is chosen at random and taken as r_k^(2).
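
The image for equation (5) is likewise not reproduced. Because the per-stage target in equation (4) is the K-th root of r_0, a natural reading is that the overall rate is the product of the per-stage rates. The form below is a reconstruction under that assumption.

```latex
r_p = \prod_{k=1}^{K} r_k^{(1)} \tag{5}
```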

[0027] FIG. 6 shows, enclosed in squares, an example of the four elements adjacent to r(3, 3) above, below, left, and right. Using r_k^(2), the classification rate r_u after passing through the K stages is obtained by equation (5), and the distance calculation time T_u is obtained by equation (2) using the numbers of features and candidates specified by r_k^(2) (step 25).

[0028] The large classification parameter setting circuit 93 takes δ as a predetermined tolerance and checks the following three conditions (step 26):
r_0 − δ ≤ r_u ≤ r_0 + δ (6)
r_u > r_p (7)
T_u < T_p (8)
When all three are satisfied, it provisionally registers the numbers of features and candidates corresponding to r_k^(2) as the large classification parameters and assigns the value of r_u to r_p (step 27).

[0029] When this processing ends, steps 25 to 27 are performed on values chosen at random from the four elements adjacent to r_k^(2). This processing is repeated a predetermined number of times (L times) (step 28). In this way, large classification parameters are selected that minimize the distance calculation time T while having a large classification rate close to the value r_0 to be maintained. After the L trials, the numbers of features and candidates remaining in the large classification parameter setting circuit 93 are output from the large classification configuration determining unit 9 (step 29) and supplied to the large classification unit 4 for its processing.
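
Steps 24 to 29 amount to a random neighbourhood search over cells of the classification rate table. The sketch below illustrates that idea under several assumptions that are not in the text: the overall rate is the product of the per-stage rates; the cost follows the form n_1·c_0 + Σ (n_k − n_{k-1})·c_{k-1}; T_p is updated together with r_p on acceptance; moves that break n_1 < ... < n_K or c_1 > ... > c_K are rejected; and all table values are synthetic. Function names are hypothetical.

```python
import random


def stage_cost(c0, n_vals, c_vals, config):
    """Cost in units of Δt for a configuration given as K table cells (i, j)."""
    total, prev_n, prev_c = 0, 0, c0
    for i, j in config:
        total += (n_vals[i] - prev_n) * prev_c
        prev_n, prev_c = n_vals[i], c_vals[j]
    return total


def overall_rate(table, config):
    """Assumed overall rate: product of the per-stage table entries."""
    rate = 1.0
    for i, j in config:
        rate *= table[i][j]
    return rate


def is_valid(n_vals, c_vals, config):
    """Feature counts must increase and candidate counts decrease across stages."""
    ns = [n_vals[i] for i, _ in config]
    cs = [c_vals[j] for _, j in config]
    return (all(a < b for a, b in zip(ns, ns[1:]))
            and all(a > b for a, b in zip(cs, cs[1:])))


def random_neighbor(config, n_rows, n_cols, rng):
    """Move every stage's cell to one of its four in-table neighbours."""
    moved = []
    for i, j in config:
        options = [(i + di, j + dj)
                   for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1))
                   if 0 <= i + di < n_rows and 0 <= j + dj < n_cols]
        moved.append(rng.choice(options))
    return moved


def search(table, n_vals, c_vals, c0, r0, delta, start, trials, seed=0):
    rng = random.Random(seed)
    best, current = list(start), list(start)
    r_p = overall_rate(table, best)
    t_p = stage_cost(c0, n_vals, c_vals, best)
    for _ in range(trials):
        trial = random_neighbor(current, len(n_vals), len(c_vals), rng)
        if not is_valid(n_vals, c_vals, trial):
            continue  # assumption: stay in place when a move breaks monotonicity
        current = trial
        r_u = overall_rate(table, current)
        t_u = stage_cost(c0, n_vals, c_vals, current)
        # Conditions (6), (7), and (8) from the text.
        if r0 - delta <= r_u <= r0 + delta and r_u > r_p and t_u < t_p:
            best, r_p, t_p = list(current), r_u, t_u
    return best, r_p, t_p


# Synthetic table for illustration only (I = J = 5).
n_vals = [100, 200, 300, 400, 500]
c_vals = [1000, 500, 300, 100, 50]
table = [[1 - 0.2 * (1 - n / 600) * (1 - c / 1200) for c in c_vals] for n in n_vals]
start = [(0, 0), (2, 1), (4, 2)]
best, r_best, t_best = search(table, n_vals, c_vals, 3500, 0.95, 0.05, start, 200)
```

Because a configuration is registered only when it improves both the rate and the cost, the registered result is never worse than the starting configuration on either measure.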

[0030] Next, the operation of the trade-off function setting unit 10 will be described with reference to FIGS. 7 and 8. FIG. 7 shows an example of the processing speed / large classification rate trade-off table, which consists of the large classification rates r_m corresponding to M levels of processing speed s_m (m = 1, 2, ..., M). The values s_m and r_m are obtained in advance using a large amount of training data. In general, s_1 < s_2 < ... < s_M and r_1 > r_2 > ... > r_M, reflecting the trade-off that the large classification rate falls as processing speed rises.

[0031] FIG. 8 is a processing flowchart of the trade-off function setting unit 10. In FIG. 8, the user inputs either a desired large classification rate r_d or a desired processing speed s_d (step 31).

[0032] When r_d is input, the r_m closest to r_d is selected from the M values of r_m stored in the trade-off table (step 32) and sent to the large classification configuration determining unit 9 (step 33). The large classification configuration determining unit 9 uses this r_m as the large classification rate r_0 to be maintained and determines the large classification parameters by its own processing (step 34).

[0033] When s_d is input, the s_m closest to s_d is selected from the M values of s_m stored in the trade-off table (step 35), the r_m corresponding to that s_m is retrieved (step 36), and it is sent to the large classification configuration determining unit 9 (step 33). The large classification configuration determining unit 9 uses this r_m as the large classification rate r_0 to be maintained and determines the large classification parameters by its own processing (step 34).
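
Steps 31 to 36 reduce to a nearest-entry lookup in the trade-off table. The sketch below assumes the table is held as a list of (s_m, r_m) pairs; the function name `select_target_rate` and the sample values are hypothetical.

```python
def select_target_rate(tradeoff_table, rate=None, speed=None):
    """Return the r_m to pass on as r_0, given either a desired rate or a desired speed.

    tradeoff_table: list of (s_m, r_m) pairs.
    Exactly one of `rate` (r_d) or `speed` (s_d) must be given.
    """
    if (rate is None) == (speed is None):
        raise ValueError("specify exactly one of rate or speed")
    if rate is not None:
        # Step 32: pick the entry whose r_m is closest to r_d.
        _, r_m = min(tradeoff_table, key=lambda entry: abs(entry[1] - rate))
    else:
        # Steps 35 and 36: pick the entry whose s_m is closest to s_d, then take its r_m.
        _, r_m = min(tradeoff_table, key=lambda entry: abs(entry[0] - speed))
    return r_m


# Made-up table: speed rises while the large classification rate falls.
tradeoff_table = [(10, 0.999), (20, 0.995), (40, 0.98), (80, 0.95)]
r0_from_rate = select_target_rate(tradeoff_table, rate=0.99)
r0_from_speed = select_target_rate(tradeoff_table, speed=35)
```

In both branches the value returned is a classification rate, which matches the text: the configuration determining unit always receives r_0, whichever quantity the user specified.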

[0034] By selecting r_d when accuracy matters most and s_d when speed matters most, the user can obtain a large classification unit that operates at a large classification rate or processing speed suited to the purpose. This amounts to a trade-off selection between classification performance and processing speed.

[0035] In this way, a large classification unit that achieves the target large classification rate is configured automatically. This makes it possible to meet various user requirements with respect to recognition rate and processing speed.

[0036] In this embodiment, the r_k^(2) selected from the classification rate table 94 of the large classification configuration determining unit 9 is chosen from the four elements adjacent to r_k^(1) above, below, left, and right. The choice is not limited to these four elements, however, and may be made from, for example, the eight elements surrounding r_k^(1).

[0037] In this embodiment, the trade-off function setting unit 10 has been described as accepting a large classification rate or processing speed as direct input. The same processing is possible with a scheme in which any one of the M levels of large classification rate or processing speed is specified.

[0038] The present invention has been described above in concrete terms on the basis of an embodiment. Needless to say, the present invention is not limited to that embodiment and can be modified in various ways without departing from its gist.

[0039]

[Effects of the Invention] As described above, the present invention employs hierarchical large classification means and has a function for constructing the large classification unit automatically. It therefore has the advantage that a system capable of high-speed recognition while maintaining a specified recognition rate, or a system that achieves a specified processing speed, can be constructed easily. In addition, the settings of the classification rate table and the trade-off table can be changed according to the quality of the input characters, so the invention can be adapted to a variety of applications.

[Brief description of the drawings]

FIG. 1 is a block diagram of a character recognition apparatus according to an embodiment of the present invention.

FIG. 2 is a block diagram of the large classification unit 4.

FIG. 3 is a block diagram of the large classification configuration determining unit 9.

FIG. 4 is a processing flowchart of the large classification configuration determining unit 9.

FIG. 5 is a diagram showing an example of the classification rate table 94.

FIG. 6 is a diagram showing an example of the r(i, j) selected in the classification rate table 94.

FIG. 7 is a diagram showing an example of the trade-off table.

FIG. 8 is a processing flowchart of the trade-off function setting unit 10.

[Explanation of symbols]

1 Character pattern input unit
2 Preprocessing unit
3 Feature extraction unit
4 Large classification unit
5 Large classification dictionary
6 Identification unit
7 Identification dictionary
8 Recognition result memory unit
9 Large classification configuration determining unit
10 Trade-off function setting unit

Continued from the front page
(72) Inventor: Toru Wakahara, c/o Nippon Telegraph and Telephone Corporation, 3-19-2 Nishi-Shinjuku, Shinjuku-ku, Tokyo
(72) Inventor: Nobuo Miyamoto, c/o Nippon Telegraph and Telephone Corporation, 3-19-2 Nishi-Shinjuku, Shinjuku-ku, Tokyo
(72) Inventor: Kenji Ogura, c/o Nippon Telegraph and Telephone Corporation, 3-19-2 Nishi-Shinjuku, Shinjuku-ku, Tokyo

Claims


1. A character recognition method for matching the features of an input character pattern against the features of a standard dictionary and outputting, as candidates, the categories to which the input character pattern belongs, characterized in that: classification circuits, each of which reduces the number of candidates to be classified by calculating a classification measure between all or part of the features of the input character pattern and all or part of the features of the standard dictionary, are arranged in a hierarchical structure, and a hierarchical large classification method is used in which, as processing proceeds from earlier stages to later stages, the number of candidates to be classified is reduced while the number of features used by the classification circuits is increased; the number of features and the number of candidates of each stage in the hierarchical large classification method are set by first setting initial values of the number of features and the number of candidates of each stage, and then successively changing the initial values so as to minimize the amount of computation required for the hierarchical large classification under the constraint that the large classification rate is held at a predetermined fixed value; when a target value of the large classification rate is specified, the number of features and the number of candidates of each stage are set by using that target value as the fixed value and applying the successive change; when a target value of the large classification processing speed is specified, they are set by obtaining the fixed value from a previously obtained relationship between large classification processing speed and large classification rate and applying the successive change; and a trade-off is provided in which either of these two kinds of specification can be selected.

2. A character recognition apparatus for matching the features of an input character pattern against the features of a standard dictionary and outputting, as candidates, the categories to which the input character pattern belongs, characterized by comprising: hierarchical large classification means in which classification circuits, each of which reduces the number of candidates to be classified by calculating a classification measure between all or part of the features of the input character pattern and all or part of the features of the standard dictionary, are arranged in a hierarchical structure, and in which, as processing proceeds from earlier stages to later stages, the number of candidates to be classified is reduced while the number of features used by the classification circuits is increased; large classification configuration determining means that sets the number of features and the number of candidates of each stage in the hierarchical large classification by first setting initial values of the number of features and the number of candidates of each stage, and then successively changing the initial values so as to minimize the amount of computation required for the hierarchical large classification under the constraint that the large classification rate is held at a predetermined fixed value; and trade-off function setting means by which, when a target value of the large classification rate is specified, that target value is used as the fixed value and the successive change is performed, and when a target value of the large classification processing speed is specified, the fixed value is obtained from a previously obtained relationship between large classification processing speed and large classification rate and the successive change is performed, either of these two kinds of specification being selectable.